{"id":11011,"date":"2018-03-09T06:25:04","date_gmt":"2018-03-09T05:25:04","guid":{"rendered":"https:\/\/www.dbi-services.com\/blog\/parallel-pg_dump-is-slow-by-default\/"},"modified":"2018-03-09T06:25:04","modified_gmt":"2018-03-09T05:25:04","slug":"parallel-pg_dump-is-slow-by-default","status":"publish","type":"post","link":"https:\/\/www.dbi-services.com\/blog\/parallel-pg_dump-is-slow-by-default\/","title":{"rendered":"Parallel pg_dump is slow by default?"},"content":{"rendered":"<p>Short answer: Yes, it is. While working at a customer site over the last few days, we wanted to run a parallel pg_dump of a 2TB database. We were surprised at how slow it was, and it was not immediately clear why. The answer is in the <a href=\"https:\/\/www.postgresql.org\/docs\/10\/static\/app-pgdump.html\" target=\"_blank\" rel=\"noopener\">documentation<\/a>: parallel dumps require the directory format, and that format comes with this note: &#8220;This format is compressed by default and also supports parallel dumps.&#8221; 
Compression takes time, so the question was whether we could disable it. That was not clear from the statement: does &#8220;compressed by default&#8221; mean the output is always compressed and you cannot change that, or is compression merely the default, which you can override?<\/p>\n<p><!--more--><\/p>\n<p>As always, let&#8217;s set up a short test case:<\/p>\n<pre class=\"brush: sql; gutter: true; first-line: 1\">\npostgres=# create table dmp1 as \n           select a,a::varchar b,now() c \n             from generate_series ( 1, 1000000) a;\nSELECT 1000000\npostgres=# create table dmp2 as select * from dmp1;\nSELECT 1000000\npostgres=# create table dmp3 as select * from dmp1;\nSELECT 1000000\npostgres=# create table dmp4 as select * from dmp1;\nSELECT 1000000\npostgres=# \\d dmp*\n                        Table \"public.dmp1\"\n Column |           Type           | Collation | Nullable | Default \n--------+--------------------------+-----------+----------+---------\n a      | integer                  |           |          | \n b      | character varying        |           |          | \n c      | timestamp with time zone |           |          | \n\n                        Table \"public.dmp2\"\n Column |           Type           | Collation | Nullable | Default \n--------+--------------------------+-----------+----------+---------\n a      | integer                  |           |          | \n b      | character varying        |           |          | \n c      | timestamp with time zone |           |          | \n\n                        Table \"public.dmp3\"\n Column |           Type           | Collation | Nullable | Default \n--------+--------------------------+-----------+----------+---------\n a      | integer                  |           |          | \n b      | character varying        |           |          | \n c      | timestamp with time zone |           |          | \n\n                        Table \"public.dmp4\"\n Column |           Type           
| Collation | Nullable | Default \n--------+--------------------------+-----------+----------+---------\n a      | integer                  |           |          | \n b      | character varying        |           |          | \n c      | timestamp with time zone |           |          | \n\n<\/pre>\n<p>We have four tables, each containing 1&#8217;000&#8217;000 rows. Running pg_dump in parallel with the default settings looks like this:<\/p>\n<pre class=\"brush: bash; gutter: true; first-line: 1\">\npostgres@pgbox:\/home\/postgres\/ [PG10] mkdir \/var\/tmp\/dmp\npostgres@pgbox:\/home\/postgres\/ [PG10] time pg_dump --format=d --jobs=4 --file=\/var\/tmp\/dmp\/ postgres\n\nreal\t0m2.788s\nuser\t0m2.459s\nsys\t0m0.597s\npostgres@pgbox:\/home\/postgres\/ [PG10] ls -la \/var\/tmp\/dmp\/\ntotal 19528\ndrwxr-xr-x. 2 postgres postgres    4096 Mar  9 07:16 .\ndrwxrwxrwt. 4 root     root          51 Mar  9 07:15 ..\n-rw-r--r--. 1 postgres postgres      25 Mar  9 07:16 3113.dat.gz\n-rw-r--r--. 1 postgres postgres      25 Mar  9 07:16 3114.dat.gz\n-rw-r--r--. 1 postgres postgres      25 Mar  9 07:16 3115.dat.gz\n-rw-r--r--. 1 postgres postgres 4991138 Mar  9 07:16 3116.dat.gz\n-rw-r--r--. 1 postgres postgres 4991138 Mar  9 07:16 3117.dat.gz\n-rw-r--r--. 1 postgres postgres 4991138 Mar  9 07:16 3118.dat.gz\n-rw-r--r--. 1 postgres postgres 4991138 Mar  9 07:16 3119.dat.gz\n-rw-r--r--. 1 postgres postgres    5819 Mar  9 07:16 toc.dat\n<\/pre>\n<p>As stated in the documentation, the result is compressed. When speed is more important than the size on disk, you can, however, disable compression:<\/p>\n<pre class=\"brush: bash; gutter: true; first-line: 1\">\npostgres@pgbox:\/home\/postgres\/ [PG10] rm -rf \/var\/tmp\/dmp\/*\npostgres@pgbox:\/home\/postgres\/ [PG10] time pg_dump --format=d --jobs=4 --file=\/var\/tmp\/dmp\/ --compress=0 postgres\n\nreal\t0m5.357s\nuser\t0m0.065s\nsys\t0m0.460s\npostgres@pgbox:\/home\/postgres\/ [PG10] ls -la \/var\/tmp\/dmp\/\ntotal 171040\ndrwxr-xr-x. 
2 postgres postgres     4096 Mar  9 07:18 .\ndrwxrwxrwt. 4 root     root           51 Mar  9 07:15 ..\n-rw-r--r--. 1 postgres postgres        5 Mar  9 07:18 3113.dat\n-rw-r--r--. 1 postgres postgres        5 Mar  9 07:18 3114.dat\n-rw-r--r--. 1 postgres postgres        5 Mar  9 07:18 3115.dat\n-rw-r--r--. 1 postgres postgres 43777797 Mar  9 07:18 3116.dat\n-rw-r--r--. 1 postgres postgres 43777797 Mar  9 07:18 3117.dat\n-rw-r--r--. 1 postgres postgres 43777797 Mar  9 07:18 3118.dat\n-rw-r--r--. 1 postgres postgres 43777797 Mar  9 07:18 3119.dat\n-rw-r--r--. 1 postgres postgres     5819 Mar  9 07:18 toc.dat\n<\/pre>\n<p>In my case the uncompressed dump was slower than the compressed one (about 5.4s versus 2.8s elapsed, even though user CPU time dropped from 2.459s to 0.065s), but this is because the disks on my little VM are not really fast: the uncompressed dump writes roughly 175MB instead of roughly 20MB. When you have fast storage, disabling compression should give you more speed.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Short answer: Yes, it is. Being at a customer the last days we wanted to parallel pg_dump a 2TB database. We were quite surprised that it was quite slow and it was not immediately clear why it was. Well, the answer is in the documentation: When you go for parallel dumps you need to use [&hellip;]<\/p>\n","protected":false},"author":29,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[229],"tags":[77],"type_dbi":[],"class_list":["post-11011","post","type-post","status-publish","format-standard","hentry","category-database-administration-monitoring","tag-postgresql"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.2 (Yoast SEO v27.5) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Parallel pg_dump is slow by default? 
- dbi Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.dbi-services.com\/blog\/parallel-pg_dump-is-slow-by-default\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Parallel pg_dump is slow by default?\" \/>\n<meta property=\"og:description\" content=\"Short answer: Yes, it is. Being at a customer the last days we wanted to parallel pg_dump a 2TB database. We were quite surprised that it was quite slow and it was not immediately clear why it was. Well, the answer is in the documentation: When you go for parallel dumps you need to use [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.dbi-services.com\/blog\/parallel-pg_dump-is-slow-by-default\/\" \/>\n<meta property=\"og:site_name\" content=\"dbi Blog\" \/>\n<meta property=\"article:published_time\" content=\"2018-03-09T05:25:04+00:00\" \/>\n<meta name=\"author\" content=\"Daniel Westermann\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@westermanndanie\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Daniel Westermann\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/parallel-pg_dump-is-slow-by-default\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/parallel-pg_dump-is-slow-by-default\\\/\"},\"author\":{\"name\":\"Daniel Westermann\",\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/#\\\/schema\\\/person\\\/8d08e9bd996a89bd75c0286cbabf3c66\"},\"headline\":\"Parallel pg_dump is slow by default?\",\"datePublished\":\"2018-03-09T05:25:04+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/parallel-pg_dump-is-slow-by-default\\\/\"},\"wordCount\":224,\"commentCount\":0,\"keywords\":[\"PostgreSQL\"],\"articleSection\":[\"Database Administration &amp; Monitoring\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/parallel-pg_dump-is-slow-by-default\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/parallel-pg_dump-is-slow-by-default\\\/\",\"url\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/parallel-pg_dump-is-slow-by-default\\\/\",\"name\":\"Parallel pg_dump is slow by default? 
- dbi Blog\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/#website\"},\"datePublished\":\"2018-03-09T05:25:04+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/#\\\/schema\\\/person\\\/8d08e9bd996a89bd75c0286cbabf3c66\"},\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/parallel-pg_dump-is-slow-by-default\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/parallel-pg_dump-is-slow-by-default\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/parallel-pg_dump-is-slow-by-default\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Accueil\",\"item\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Parallel pg_dump is slow by default?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/\",\"name\":\"dbi Blog\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/#\\\/schema\\\/person\\\/8d08e9bd996a89bd75c0286cbabf3c66\",\"name\":\"Daniel 
Westermann\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/31350ceeecb1dd8986339a29bf040d4cd3cd087d410deccd8f55234466d6c317?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/31350ceeecb1dd8986339a29bf040d4cd3cd087d410deccd8f55234466d6c317?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/31350ceeecb1dd8986339a29bf040d4cd3cd087d410deccd8f55234466d6c317?s=96&d=mm&r=g\",\"caption\":\"Daniel Westermann\"},\"description\":\"Daniel Westermann is Principal Consultant and Technology Leader Open Infrastructure at dbi services. He has more than 15 years of experience in management, engineering and optimization of databases and infrastructures, especially on Oracle and PostgreSQL. Since the beginning of his career, he has specialized in Oracle Technologies and is Oracle Certified Professional 12c and Oracle Certified Expert RAC\\\/GridInfra. Over time, Daniel has become increasingly interested in open source technologies, becoming \u201cTechnology Leader Open Infrastructure\u201d and PostgreSQL expert. \u00a0Based on community or EnterpriseDB tools, he develops and installs complex high available solutions with PostgreSQL. He is also a certified PostgreSQL Plus 9.0 Professional and a Postgres Advanced Server 9.4 Professional. He is a regular speaker at PostgreSQL conferences in Switzerland and Europe. Today Daniel is also supporting our customers on AWS services such as AWS RDS, database migrations into the cloud, EC2 and automated infrastructure management with AWS SSM (System Manager). He is a certified AWS Solutions Architect Professional. Prior to dbi services, Daniel was Management System Engineer at LC SYSTEMS-Engineering AG in Basel. Before that, he worked as Oracle Developper &amp;\u00a0Project Manager at Delta Energy Solutions AG in Basel (today Powel AG). Daniel holds a diploma in Business Informatics (DHBW, Germany). 
His branch-related experience mainly covers the pharma industry, the financial sector, energy, lottery and telecommunications.\",\"sameAs\":[\"https:\\\/\\\/x.com\\\/westermanndanie\"],\"url\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/author\\\/daniel-westermann\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Parallel pg_dump is slow by default? - dbi Blog","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.dbi-services.com\/blog\/parallel-pg_dump-is-slow-by-default\/","og_locale":"en_US","og_type":"article","og_title":"Parallel pg_dump is slow by default?","og_description":"Short answer: Yes, it is. Being at a customer the last days we wanted to parallel pg_dump a 2TB database. We were quite surprised that it was quite slow and it was not immediately clear why it was. Well, the answer is in the documentation: When you go for parallel dumps you need to use [&hellip;]","og_url":"https:\/\/www.dbi-services.com\/blog\/parallel-pg_dump-is-slow-by-default\/","og_site_name":"dbi Blog","article_published_time":"2018-03-09T05:25:04+00:00","author":"Daniel Westermann","twitter_card":"summary_large_image","twitter_creator":"@westermanndanie","twitter_misc":{"Written by":"Daniel Westermann","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.dbi-services.com\/blog\/parallel-pg_dump-is-slow-by-default\/#article","isPartOf":{"@id":"https:\/\/www.dbi-services.com\/blog\/parallel-pg_dump-is-slow-by-default\/"},"author":{"name":"Daniel Westermann","@id":"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/8d08e9bd996a89bd75c0286cbabf3c66"},"headline":"Parallel pg_dump is slow by default?","datePublished":"2018-03-09T05:25:04+00:00","mainEntityOfPage":{"@id":"https:\/\/www.dbi-services.com\/blog\/parallel-pg_dump-is-slow-by-default\/"},"wordCount":224,"commentCount":0,"keywords":["PostgreSQL"],"articleSection":["Database Administration &amp; Monitoring"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.dbi-services.com\/blog\/parallel-pg_dump-is-slow-by-default\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.dbi-services.com\/blog\/parallel-pg_dump-is-slow-by-default\/","url":"https:\/\/www.dbi-services.com\/blog\/parallel-pg_dump-is-slow-by-default\/","name":"Parallel pg_dump is slow by default? 
- dbi Blog","isPartOf":{"@id":"https:\/\/www.dbi-services.com\/blog\/#website"},"datePublished":"2018-03-09T05:25:04+00:00","author":{"@id":"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/8d08e9bd996a89bd75c0286cbabf3c66"},"breadcrumb":{"@id":"https:\/\/www.dbi-services.com\/blog\/parallel-pg_dump-is-slow-by-default\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.dbi-services.com\/blog\/parallel-pg_dump-is-slow-by-default\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.dbi-services.com\/blog\/parallel-pg_dump-is-slow-by-default\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Accueil","item":"https:\/\/www.dbi-services.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Parallel pg_dump is slow by default?"}]},{"@type":"WebSite","@id":"https:\/\/www.dbi-services.com\/blog\/#website","url":"https:\/\/www.dbi-services.com\/blog\/","name":"dbi Blog","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.dbi-services.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/8d08e9bd996a89bd75c0286cbabf3c66","name":"Daniel Westermann","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/31350ceeecb1dd8986339a29bf040d4cd3cd087d410deccd8f55234466d6c317?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/31350ceeecb1dd8986339a29bf040d4cd3cd087d410deccd8f55234466d6c317?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/31350ceeecb1dd8986339a29bf040d4cd3cd087d410deccd8f55234466d6c317?s=96&d=mm&r=g","caption":"Daniel Westermann"},"description":"Daniel Westermann is Principal Consultant and Technology Leader Open Infrastructure at dbi services. 
He has more than 15 years of experience in management, engineering and optimization of databases and infrastructures, especially on Oracle and PostgreSQL. Since the beginning of his career, he has specialized in Oracle Technologies and is Oracle Certified Professional 12c and Oracle Certified Expert RAC\/GridInfra. Over time, Daniel has become increasingly interested in open source technologies, becoming \u201cTechnology Leader Open Infrastructure\u201d and PostgreSQL expert. \u00a0Based on community or EnterpriseDB tools, he develops and installs complex high available solutions with PostgreSQL. He is also a certified PostgreSQL Plus 9.0 Professional and a Postgres Advanced Server 9.4 Professional. He is a regular speaker at PostgreSQL conferences in Switzerland and Europe. Today Daniel is also supporting our customers on AWS services such as AWS RDS, database migrations into the cloud, EC2 and automated infrastructure management with AWS SSM (System Manager). He is a certified AWS Solutions Architect Professional. Prior to dbi services, Daniel was Management System Engineer at LC SYSTEMS-Engineering AG in Basel. Before that, he worked as Oracle Developper &amp;\u00a0Project Manager at Delta Energy Solutions AG in Basel (today Powel AG). Daniel holds a diploma in Business Informatics (DHBW, Germany). 
His branch-related experience mainly covers the pharma industry, the financial sector, energy, lottery and telecommunications.","sameAs":["https:\/\/x.com\/westermanndanie"],"url":"https:\/\/www.dbi-services.com\/blog\/author\/daniel-westermann\/"}]}},"_links":{"self":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts\/11011","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/users\/29"}],"replies":[{"embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/comments?post=11011"}],"version-history":[{"count":0,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts\/11011\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/media?parent=11011"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/categories?post=11011"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/tags?post=11011"},{"taxonomy":"type","embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/type_dbi?post=11011"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}