Short answer: Yes, it is. While working at a customer over the last few days, we wanted to run a parallel pg_dump of a 2TB database. We were surprised at how slow it was, and it was not immediately clear why. The answer is in the documentation: for parallel dumps you need to use the directory format, and that comes with: “This format is compressed by default and also supports parallel dumps.” Compression takes time, so the question was whether we could disable it, which is not clear from that statement: does “compressed by default” mean the output is always compressed and you cannot change that, or only that compression is the default and you can turn it off?
As always, let's set up a short test case:
postgres=# create table dmp1 as select a,a::varchar b,now() c from generate_series ( 1, 1000000) a;
SELECT 1000000
postgres=# create table dmp2 as select * from dmp1;
SELECT 1000000
postgres=# create table dmp3 as select * from dmp1;
SELECT 1000000
postgres=# create table dmp4 as select * from dmp1;
SELECT 1000000
postgres=# \d dmp*
                     Table "public.dmp1"
 Column |           Type           | Collation | Nullable | Default
--------+--------------------------+-----------+----------+---------
 a      | integer                  |           |          |
 b      | character varying        |           |          |
 c      | timestamp with time zone |           |          |
                     Table "public.dmp2"
 Column |           Type           | Collation | Nullable | Default
--------+--------------------------+-----------+----------+---------
 a      | integer                  |           |          |
 b      | character varying        |           |          |
 c      | timestamp with time zone |           |          |
                     Table "public.dmp3"
 Column |           Type           | Collation | Nullable | Default
--------+--------------------------+-----------+----------+---------
 a      | integer                  |           |          |
 b      | character varying        |           |          |
 c      | timestamp with time zone |           |          |
                     Table "public.dmp4"
 Column |           Type           | Collation | Nullable | Default
--------+--------------------------+-----------+----------+---------
 a      | integer                  |           |          |
 b      | character varying        |           |          |
 c      | timestamp with time zone |           |          |
We now have four tables, each containing 1’000’000 rows. Running pg_dump in parallel with the default settings looks like this:
postgres@pgbox:/home/postgres/ [PG10] mkdir /var/tmp/dmp
postgres@pgbox:/home/postgres/ [PG10] time pg_dump --format=d --jobs=4 --file=/var/tmp/dmp/ postgres
real    0m2.788s
user    0m2.459s
sys     0m0.597s
postgres@pgbox:/home/postgres/ [PG10] ls -la /var/tmp/dmp/
total 19528
drwxr-xr-x. 2 postgres postgres    4096 Mar  9 07:16 .
drwxrwxrwt. 4 root     root          51 Mar  9 07:15 ..
-rw-r--r--. 1 postgres postgres      25 Mar  9 07:16 3113.dat.gz
-rw-r--r--. 1 postgres postgres      25 Mar  9 07:16 3114.dat.gz
-rw-r--r--. 1 postgres postgres      25 Mar  9 07:16 3115.dat.gz
-rw-r--r--. 1 postgres postgres 4991138 Mar  9 07:16 3116.dat.gz
-rw-r--r--. 1 postgres postgres 4991138 Mar  9 07:16 3117.dat.gz
-rw-r--r--. 1 postgres postgres 4991138 Mar  9 07:16 3118.dat.gz
-rw-r--r--. 1 postgres postgres 4991138 Mar  9 07:16 3119.dat.gz
-rw-r--r--. 1 postgres postgres    5819 Mar  9 07:16 toc.dat
As stated in the documentation, the result is compressed. When speed is more important than size on disk, you can, however, disable compression:
postgres@pgbox:/home/postgres/ [PG10] rm -rf /var/tmp/dmp/*
postgres@pgbox:/home/postgres/ [PG10] time pg_dump --format=d --jobs=4 --file=/var/tmp/dmp/ --compress=0 postgres
real    0m5.357s
user    0m0.065s
sys     0m0.460s
postgres@pgbox:/home/postgres/ [PG10] ls -la /var/tmp/dmp/
total 171040
drwxr-xr-x. 2 postgres postgres     4096 Mar  9 07:18 .
drwxrwxrwt. 4 root     root           51 Mar  9 07:15 ..
-rw-r--r--. 1 postgres postgres        5 Mar  9 07:18 3113.dat
-rw-r--r--. 1 postgres postgres        5 Mar  9 07:18 3114.dat
-rw-r--r--. 1 postgres postgres        5 Mar  9 07:18 3115.dat
-rw-r--r--. 1 postgres postgres 43777797 Mar  9 07:18 3116.dat
-rw-r--r--. 1 postgres postgres 43777797 Mar  9 07:18 3117.dat
-rw-r--r--. 1 postgres postgres 43777797 Mar  9 07:18 3118.dat
-rw-r--r--. 1 postgres postgres 43777797 Mar  9 07:18 3119.dat
-rw-r--r--. 1 postgres postgres     5819 Mar  9 07:18 toc.dat
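To put a number on the size difference between the two listings, here is a small sketch that divides the uncompressed file size by the compressed one. The helper name size_ratio is made up for this post; the two sizes are the ones reported by ls for 3116.dat and 3116.dat.gz.

```shell
# Hypothetical helper: ratio of uncompressed to compressed size, two decimals.
size_ratio() {
  awk -v raw="$1" -v gz="$2" 'BEGIN { printf "%.2f\n", raw / gz }'
}

# Sizes of 3116.dat and 3116.dat.gz as shown by ls in the two runs.
size_ratio 43777797 4991138   # prints 8.77
```

So for this test data the uncompressed dump needs almost nine times the space on disk, and that is the amount of extra data the storage has to absorb when compression is switched off.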
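In my case the uncompressed dump was actually slower in wall-clock time (5.357s versus 2.788s), but this is because I do not really have fast disks on my little VM, and it had to write roughly 170MB instead of roughly 20MB. Note that the user CPU time dropped from 2.459s to 0.065s, which shows how much of the work went into compression. When you have a good storage solution, disabling compression should bring you more speed.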
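Whether compression helps or hurts depends on your CPUs and your storage, so it is worth measuring on your own system. The following is only a sketch, not something we ran at the customer: the function names dump_cmd and compare_levels, the scratch directories under /var/tmp and the choice of levels 0, 1 and 6 are all assumptions for illustration. It relies on the --compress option accepting levels from 0 to 9, and it assumes bash for the time keyword.

```shell
# Sketch only: compare a parallel directory-format dump at several
# compression levels. Names, paths and levels are illustrative assumptions.

# Print the pg_dump command line for a level, a target directory and a database.
dump_cmd() {
  printf 'pg_dump --format=d --jobs=4 --compress=%s --file=%s %s\n' "$1" "$2" "$3"
}

# Run one dump per level and report elapsed time and size on disk.
# Nothing is executed until this function is called explicitly.
compare_levels() {
  db="${1:-postgres}"
  for level in 0 1 6; do
    dir="/var/tmp/dmp_z${level}"
    rm -rf "$dir"                 # start clean; pg_dump creates the directory
    echo "== --compress=${level}"
    time $(dump_cmd "$level" "$dir" "$db")
    du -sh "$dir"
  done
}
```

Calling compare_levels postgres runs the three dumps one after another. Keeping the command line in its own function makes it easy to print and check the exact pg_dump call before pointing it at a large database.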