In the last posts of this series we prepared the data set and had a look at range partitioning. In this post we look at another partitioning strategy: List partitioning. Instead of partitioning by a range (typically based on day, year, month) list partitioning is used to partition on an explicit list with key values that define the partitions.
Coming back to the materialized view which holds our initial data set:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 | postgres=# d mv_traffic_violations Materialized view "public.mv_traffic_violations" Column | Type | Collation | Nullable | Default -------------------------+------------------------+-----------+----------+--------- seqid | text | | | date_of_stop | date | | | time_of_stop | time without time zone | | | agency | text | | | subagency | text | | | description | text | | | location | text | | | latitude | numeric | | | longitude | numeric | | | accident | text | | | belts | boolean | | | personal_injury | boolean | | | property_damage | boolean | | | fatal | boolean | | | commercial_license | boolean | | | hazmat | boolean | | | commercial_vehicle | boolean | | | alcohol | boolean | | | workzone | boolean | | | state | text | | | vehicletype | text | | | year | smallint | | | make | text | | | model | text | | | color | text | | | violation_type | text | | | charge | text | | | article | text | | | contributed_to_accident | boolean | | | race | text | | | gender | text | | | driver_city | text | | | driver_state | text | | | dl_state | text | | | arrest_type | text | | | geolocation | point | | | council_districts | smallint | | | councils | smallint | | | communities | smallint | | | zip_codes | smallint | | | municipalities | smallint | | | |
The application might want to ask for all traffic violations that match a specific violation type:
1 2 3 4 5 6 7 8 | postgres=# select distinct violation_type from mv_traffic_violations; violation_type ---------------- Warning SERO Citation ESERO (4 rows ) |
Partitioning by that would result in 4 partitions (plus one default partition for any other potential violation type in the future). The definition of the table and it’s partitions is pretty much the same as for range partitioning but this time we go for list partitioning:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 | create table traffic_violations_p_list ( seqid text , date_of_stop date , time_of_stop time , agency text , subagency text , description text , location text , latitude numeric , longitude numeric , accident text , belts boolean , personal_injury boolean , property_damage boolean , fatal boolean , commercial_license boolean , hazmat boolean , commercial_vehicle boolean , alcohol boolean , workzone boolean , state text , vehicletype text , year smallint , make text , model text , color text , violation_type text , charge text , article text , contributed_to_accident boolean , race text , gender text , driver_city text , driver_state text , dl_state text , arrest_type text , geolocation point , council_districts smallint , councils smallint , communities smallint , zip_codes smallint , municipalities smallint ) partition by list (violation_type); |
psql will show the list partitioning strategy as well:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 | postgres=# d traffic_violations_p_list Partitioned table "public.traffic_violations_p_list" Column | Type | Collation | Nullable | Default -------------------------+------------------------+-----------+----------+--------- seqid | text | | | date_of_stop | date | | | time_of_stop | time without time zone | | | agency | text | | | subagency | text | | | description | text | | | location | text | | | latitude | numeric | | | longitude | numeric | | | accident | text | | | belts | boolean | | | personal_injury | boolean | | | property_damage | boolean | | | fatal | boolean | | | commercial_license | boolean | | | hazmat | boolean | | | commercial_vehicle | boolean | | | alcohol | boolean | | | workzone | boolean | | | state | text | | | vehicletype | text | | | year | smallint | | | make | text | | | model | text | | | color | text | | | violation_type | text | | | charge | text | | | article | text | | | contributed_to_accident | boolean | | | race | text | | | gender | text | | | driver_city | text | | | driver_state | text | | | dl_state | text | | | arrest_type | text | | | geolocation | point | | | council_districts | smallint | | | councils | smallint | | | communities | smallint | | | zip_codes | smallint | | | municipalities | smallint | | | Partition key : LIST (violation_type) Number of partitions: 0 |
Now we can create the partitions:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | create table traffic_violations_p_list_warning partition of traffic_violations_p_list for values in ( 'Warning' ); create table traffic_violations_p_list_sero partition of traffic_violations_p_list for values in ( 'SERO' ); create table traffic_violations_p_list_Citation partition of traffic_violations_p_list for values in ( 'Citation' ); create table traffic_violations_p_list_ESERO partition of traffic_violations_p_list for values in ( 'ESERO' ); create table traffic_violations_p_list_default partition of traffic_violations_p_list DEFAULT ; |
Again, psql will display the partitions nicely:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 | postgres=# d+ traffic_violations_p_list Partitioned table "public.traffic_violations_p_list" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description -------------------------+------------------------+-----------+----------+---------+----------+--------------+------------- seqid | text | | | | extended | | date_of_stop | date | | | | plain | | time_of_stop | time without time zone | | | | plain | | agency | text | | | | extended | | subagency | text | | | | extended | | description | text | | | | extended | | location | text | | | | extended | | latitude | numeric | | | | main | | longitude | numeric | | | | main | | accident | text | | | | extended | | belts | boolean | | | | plain | | personal_injury | boolean | | | | plain | | property_damage | boolean | | | | plain | | fatal | boolean | | | | plain | | commercial_license | boolean | | | | plain | | hazmat | boolean | | | | plain | | commercial_vehicle | boolean | | | | plain | | alcohol | boolean | | | | plain | | workzone | boolean | | | | plain | | state | text | | | | extended | | vehicletype | text | | | | extended | | year | smallint | | | | plain | | make | text | | | | extended | | model | text | | | | extended | | color | text | | | | extended | | violation_type | text | | | | extended | | charge | text | | | | extended | | article | text | | | | extended | | contributed_to_accident | boolean | | | | plain | | race | text | | | | extended | | gender | text | | | | extended | | driver_city | text | | | | extended | | driver_state | text | | | | extended | | dl_state | text | | | | extended | | arrest_type | text | | | | extended | | geolocation | point | | | | plain | | council_districts | smallint | | | | plain | | councils | smallint | | | | plain | | communities | smallint | | | | plain | | zip_codes | smallint | | | | plain | | municipalities | smallint | | | | plain | | Partition key : LIST (violation_type) Partitions: traffic_violations_p_list_citation FOR VALUES IN ( 'Citation' ), traffic_violations_p_list_esero FOR VALUES IN ( 'ESERO' ), traffic_violations_p_list_sero FOR VALUES IN ( 'SERO' ), traffic_violations_p_list_warning FOR VALUES IN ( 'Warning' ), traffic_violations_p_list_default DEFAULT |
Loading the data and checking the amount of data for each partition:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 | postgres=# insert into traffic_violations_p_list postgres-# select * from mv_traffic_violations; INSERT 0 1528078 postgres=# select count (*) from traffic_violations_p_list_warning; count -------- 759747 (1 row) postgres=# select count (*) from traffic_violations_p_list_sero; count ------- 899 (1 row) postgres=# select count (*) from traffic_violations_p_list_Citation; count -------- 693226 (1 row) postgres=# select count (*) from traffic_violations_p_list_ESERO; count ------- 74206 (1 row) postgres=# select count (*) from traffic_violations_p_list_default; count ------- 0 (1 row) |
Done. This is list partitioning. Instead of partitioning by a range, list partitioning is for partitioning by key values. In the next post we will have a look at hash partitioning.