Vacuum in PostgreSQL is one of the most important points to consider when managing a PostgreSQL instance. Usually vacuum is running in the background and just gets the job done. But, as always, there are situations when you need to get a closer look at what is going on. PostgreSQL 9.6 (currently in Beta1) introduced a new view which allows to see the progress of the vacuum worker processes. Lets have a short look at it.
To have something available for vacuum to do lets generate some data:
(postgres@[local]:5432) [postgres] > create table t1 ( a int, b varchar(40), c timestamp ); CREATE TABLE insert into t1 ( a, b, c ) select aa, bb, cc from generate_series(1,10000000) aa , md5(aa::varchar) bb , now() cc; NSERT 0 10000000
Vacuuming this table right now does not make any sense as there are no tuples to cleanup. We’ll need to delete or update some data for this. Before doing this I’ll disable autovacuum so I can kick it off manually when I am ready with my data:
(postgres@[local]:5432) [postgres] > show autovacuum; autovacuum ------------ on (1 row) (postgres@[local]:5432) [postgres] > alter system set autovacuum='off'; ALTER SYSTEM (postgres@[local]:5432) [postgres] > select * from pg_reload_conf(); pg_reload_conf ---------------- t (1 row) (postgres@[local]:5432) [postgres] >
Having a look at the log file confirms that the autovacuum launcher process was shut down:
2016-05-17 07:33:02.648 CEST - 4 - 2703 - - @ LOG: received SIGHUP, reloading configuration files 2016-05-17 07:33:02.653 CEST - 5 - 2703 - - @ LOG: parameter "autovacuum" changed to "off" 2016-05-17 07:33:02.655 CEST - 2 - 2709 - - @ LOG: autovacuum launcher shutting down
Time to delete some data from the test table:
delete from t1 where mod(a,7) = 0 ; DELETE 1428571
This should result in some dead tuples which we can check by querying pg_stat_user_tables:
(postgres@[local]:5432) [postgres] > select n_tup_del , n_live_tup , n_dead_tup from pg_stat_user_tables where relname='t1'; n_tup_del | n_live_tup | n_dead_tup -----------+------------+------------ 1571428 | 8571429 | 1428571 (1 row)
The amount of dead tuples corresponds to the number of rows we deleted. Now we can start vacuum on the table and check the new pg_stat_progress_vacuum for what is going on in a seconds session.
Session 1:
postgres@[local]:5432) [postgres] > vacuum verbose t1;
Session 2:
(postgres@[local]:5432) [postgres] > x Expanded display is on. (postgres@[local]:5432) [postgres] > select * from pg_stat_progress_vacuum;
The result of the above command (executed twice with some seconds in between) is:
(postgres@[local]:5432) [postgres] > select * from pg_stat_progress_vacuum; -[ RECORD 1 ]------+-------------- pid | 3051 datid | 13322 datname | postgres relid | 16388 phase | scanning heap heap_blks_total | 93458 heap_blks_scanned | 1568 heap_blks_vacuumed | 1567 index_vacuum_count | 0 max_dead_tuples | 291 num_dead_tuples | 16 (postgres@[local]:5432) [postgres] > select * from pg_stat_progress_vacuum; -[ RECORD 1 ]------+-------------- pid | 3051 datid | 13322 datname | postgres relid | 16388 phase | scanning heap heap_blks_total | 93458 heap_blks_scanned | 90618 heap_blks_vacuumed | 90617 index_vacuum_count | 0 max_dead_tuples | 291 num_dead_tuples | 15
Pretty cool. Now we can estimate how long the operation will take or how much work is already done (when heap_blks_scanned comes close to heap_blks_total work should almost be done). According to the documentation and to the thread on hackers this is only the first view providing progress information…more will probably come in future releases.