The PostgreSQL mailing list archives are a wonderful place to get new ideas just by reading through random posts. This time I found an interesting idea on how to partition an existing table almost online without dumping and reloading the data in some way. For those who remember: Before PostgreSQL 10 the only way to partition a table in PostgreSQL was to use table inheritance. I haven’t used table inheritance (docs) since quite some while, but it turned out that this is still something which can be very useful.
As usual lets start with a simple table and some data:
postgres=# create table t ( a int primary key, b text );
CREATE TABLE
postgres=# insert into t select i, 'aaa' from generate_series(1,1000000) i;
INSERT 0 1000000
postgres=# insert into t select i+1000000, 'bbb' from generate_series(1,1000000) i;
INSERT 0 1000000
What options do we have if we want to partition this table by list for column b? We could create the new partition structure beside the current table, and then in one point in time load the existing data into the partitioned table (either by using insert into .. select * from, or by dumping and reloading). Another option would be to setup logical replication from the old table into the new partitioned table. The third option , what we’ll have a look now, is to use table inheritance as an intermediate step to partition the table almost online.
The first bits we need are the child tables which are supposed to hold that data for the values of ‘aaa’ and ‘bbb’:
create table t_aaa ( a int, b text check (b in ('aaa')) ) inherits ( t );
create table t_bbb ( a int, b text check (b in ('bbb')) ) inherits ( t );
The important point here is the check constraint (more on that later), and of course the inheritance. For now we’ve build the inheritance tree but the child tables do not contain any data. Before we load the child table we need to make sure, that new data arriving in the parent table gets routed to the correct child table by creating a trigger:
postgres=# CREATE OR REPLACE FUNCTION tmp_trigger()
RETURNS TRIGGER AS $$
BEGIN
IF ( NEW.b = 'aaa' )
THEN
INSERT INTO t_aaa VALUES (NEW.*);
ELSIF ( NEW.b = 'bbb' )
THEN
INSERT INTO t_bbb VALUES (NEW.*);
ELSE
RAISE EXCEPTION 'Value out of range!';
END IF;
RETURN NULL;
END;
$$
LANGUAGE plpgsql;
CREATE FUNCTION
postgres=# CREATE TRIGGER tmp_insert_trigger
BEFORE INSERT ON t
FOR EACH ROW EXECUTE PROCEDURE tmp_trigger();
CREATE TRIGGER
A quick test to check that the trigger works:
postgres=# insert into t values (-1,'aaa');
INSERT 0 0
postgres=# insert into t values (-2,'bbb');
INSERT 0 0
postgres=# select * from only t_aaa;
a | b
----+-----
-1 | aaa
(1 row)
postgres=# select * from only t_bbb;
a | b
----+-----
-2 | bbb
postgres=# select * from t where a in (-1,-2);
a | b
----+-----
-1 | aaa
-2 | bbb
(2 rows)
This confirms that:
- Data is routed into the correct child table
- Data is accessible over the parent
Having that confirmed we can delete the data from the parent table and insert into the child tables (maybe in multiple batches):
postgres=# with aa as ( delete from t where b = 'aaa' returning * ) insert into t_aaa select * from aa;
INSERT 0 1000000
postgres=# with bb as ( delete from t where b = 'bbb' returning * ) insert into t_bbb select * from bb;
INSERT 0 1000000
We do that until the parent table is empty and all data is in the child tables:
postgres=# select count(*) from only t;
count
-------
0
(1 row)
postgres=# select count(*) from only t_aaa;
count
---------
1000000
(1 row)
postgres=# select count(*) from only t_bbb;
count
---------
1000000
(1 row)
The final step is to destroy the inheritance and attach the child tables as new partitions to a newly created partitioned table, and finally rename the old one and the new one:
postgres=# begin;
BEGIN
postgres=*# create table tt ( a int, b text ) partition by list(b);
CREATE TABLE
postgres=*# alter table t_aaa no inherit t;
ALTER TABLE
postgres=*# alter table tt attach partition t_aaa for values in ('aaa');
ALTER TABLE
postgres=*# alter table t_bbb no inherit t;
ALTER TABLE
postgres=*# alter table tt attach partition t_bbb for values in ('bbb');
ALTER TABLE
postgres=*# alter table t rename to t_old;
ALTER TABLE
postgres=*# alter table tt rename to t;
ALTER TABLE
postgres=*# commit;
COMMIT
Because the check constraint matches the partition key, PostgreSQL can just attach the tables as new partitions without scanning the tables, so this is a very fast operation with a very short lock. Done:
postgres=# select * from only t_aaa;
Partitioned table "public.t"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
a | integer | | | | plain | | |
b | text | | | | extended | | |
Partition key: LIST (b)
Partitions: t_aaa FOR VALUES IN ('aaa'),
t_bbb FOR VALUES IN ('bbb')
But, pleas keep in mind:
- This is a really simple test case, of course this works well. In busy systems this might get more tricky
- Routing data with a trigger might introduce some performance degradation
- You somehow have to deal with data coming in while you go from the old to the new structure, or stop data from coming in during the last step