Almost every PostgreSQL instance I come across is not configured to use huge pages, which is quite a surprise as they can give you a performance boost. Actually it is not the PostgreSQL instance you need to configure but the operating system, which has to provide them. PostgreSQL will use huge pages by default when they are configured and will fall back to normal pages otherwise. The parameter which controls that in PostgreSQL is huge_pages, which defaults to “try”, leading to the behavior just described: Try to get them, otherwise use normal pages. Let’s see how you can do that on Red Hat and CentOS. I’ll write another post about how you do that for Debian-based distributions shortly.
What you need to know is that Red Hat as well as CentOS come with tuned profiles by default. This means kernel parameters and other settings are managed dynamically through profiles and no longer by adjusting /etc/sysctl.conf (although that works as well). When you are in a virtualized environment (VirtualBox in my case) you will probably see something like this:
postgres@pgbox:/home/postgres/ [PG10] tuned-adm active
Current active profile: virtual-guest
The virtual-guest profile is maybe not the best choice for a database server as it comes with these settings (note especially vm.dirty_ratio and vm.swappiness):
postgres@pgbox:/home/postgres/ [PG10] cat /usr/lib/tuned/virtual-guest/tuned.conf | egrep -v "^$|^#"
[main]
summary=Optimize for running inside a virtual guest
include=throughput-performance
[sysctl]
vm.dirty_ratio = 30
vm.swappiness = 30
What we do at dbi services is to provide our own profile with settings that are better suited for a database server:
postgres@pgbox:/home/postgres/ [PG10] cat /etc/tuned/dbi-postgres/tuned.conf | egrep -v "^$|^#"
[main]
summary=dbi services tuned profile for PostgreSQL servers
[cpu]
governor=performance
energy_perf_bias=performance
min_perf_pct=100
[disk]
readahead=>4096
[sysctl]
vm.overcommit_memory=2
vm.swappiness=0
vm.dirty_ratio=2
vm.dirty_background_ratio=1
What does all this have to do with huge pages, you might think. Well, tuning profiles can also be used to configure them, and for us this is the preferred method because we can do it all in one file. But before we do that, let’s look at the PostgreSQL instance:
postgres=# select version();
                                                          version
----------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 10.0 build on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit
(1 row)

postgres=# show huge_pages;
 huge_pages
------------
 try
(1 row)
As said at the beginning of this post, the default behavior of PostgreSQL is to use huge pages if they are available. The question now is: How can you check if you have huge pages configured on the operating system level? The answer is in the virtual /proc/meminfo file:
postgres=# \! cat /proc/meminfo | grep -i huge
AnonHugePages:      6144 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
All “HugePages” statistics report zero, so this system definitely is not configured to provide huge pages to PostgreSQL. AnonHugePages belongs to Transparent Huge Pages, and it is a common recommendation to disable those on database servers. So we have two tasks to complete:
- Disable transparent huge pages
- Configure the system to provide enough huge pages for our PostgreSQL instance
For disabling transparent huge pages we just need to add the following lines to our tuning profile:
postgres@pgbox:/home/postgres/ [PG10] printf '[vm]\ntransparent_hugepages=never\n' | sudo tee -a /etc/tuned/dbi-postgres/tuned.conf > /dev/null
When transparent huge pages are enabled you can see that in the following file:
postgres@pgbox:/home/postgres/ [PG10] cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
Once we switch the profile to our own profile:
postgres@pgbox:/home/postgres/ [PG10] sudo tuned-adm profile dbi-postgres
postgres@pgbox:/home/postgres/ [PG10] sudo tuned-adm active
Current active profile: dbi-postgres
… you’ll notice that it is disabled from now on:
postgres@pgbox:/home/postgres/ [PG10] cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
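The active mode in that file is the one in square brackets. If you want to check it from a script, a small sketch like the following could pull it out. Note that thp_mode is a hypothetical helper name I made up for this example, not something tuned or the kernel provides:

```shell
# Hypothetical helper: print the bracketed (active) value from a line such as
# "always madvise [never]", read from standard input.
thp_mode() {
    sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# The two states shown above, fed in as sample input
echo "[always] madvise never" | thp_mode
echo "always madvise [never]" | thp_mode

# On a live system the input would come from the sysfs file instead:
#   thp_mode < /sys/kernel/mm/transparent_hugepage/enabled
```

The helper only parses text, so it can be tried on any machine before pointing it at the real file.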
Task one completed. For configuring the operating system to provide huge pages for our PostgreSQL instance we need to know how many huge pages we require. How do we do that? The procedure is described in the PostgreSQL documentation. Basically you start your instance and then check how many it would require. In my case, to get the PID of the postmaster process:
postgres@pgbox:/home/postgres/ [PG10] head -1 $PGDATA/postmaster.pid
1640
To get the VmPeak for that process:
postgres@pgbox:/home/postgres/ [PG10] grep ^VmPeak /proc/1640/status
VmPeak:   344340 kB
As the huge page size is 2MB on my system (which should be the default for most systems):
postgres@pgbox:/home/postgres/ [PG10] grep ^Hugepagesize /proc/meminfo
Hugepagesize:       2048 kB
… we will require at least 344340/2048 huge pages for this PostgreSQL instance:
postgres@pgbox:/home/postgres/ [PG10] echo "344340/2048" | bc
168
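One thing to keep in mind is that bc does integer division here and cuts off the remainder, so the result is slightly too low. A minimal sketch of the same calculation with rounding up could look like this (hugepages_needed is just a name I picked for the example):

```shell
# Hypothetical helper: number of huge pages needed to cover a given size,
# rounded up to the next whole page.
#   $1 = required size in kB (e.g. VmPeak)
#   $2 = huge page size in kB (e.g. Hugepagesize)
hugepages_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Values from the output above: VmPeak 344340 kB, Hugepagesize 2048 kB
hugepages_needed 344340 2048
```

With the numbers from this system that gives 169 instead of 168, which is the real minimum before adding any headroom.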
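Since bc truncates the result, we round up and add a little headroom, which gives 170 here. All we need to do is add vm.nr_hugepages to the “[sysctl]” section of our tuning profile:
postgres@pgbox:/home/postgres/ [PG10] grep nr_hugepages /etc/tuned/dbi-postgres/tuned.conf
vm.nr_hugepages=170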
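Put together, the profile would then look roughly like the sketch below. It only contains the settings shown in this post. To keep the example harmless it writes the file into a scratch directory created with mktemp (my assumption for this sketch), not into /etc/tuned:

```shell
# Scratch location so nothing under /etc is touched (assumption for this sketch)
profile_dir="$(mktemp -d)/dbi-postgres"
mkdir -p "$profile_dir"

# The complete profile: original settings plus the two huge page related additions
cat > "$profile_dir/tuned.conf" <<'EOF'
[main]
summary=dbi services tuned profile for PostgreSQL servers

[cpu]
governor=performance
energy_perf_bias=performance
min_perf_pct=100

[disk]
readahead=>4096

[sysctl]
vm.overcommit_memory=2
vm.swappiness=0
vm.dirty_ratio=2
vm.dirty_background_ratio=1
vm.nr_hugepages=170

[vm]
transparent_hugepages=never
EOF

cat "$profile_dir/tuned.conf"
```

The value of vm.nr_hugepages is specific to this instance, so it needs to be recalculated whenever shared_buffers or other memory settings change.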
Re-set the profile and we’re done:
postgres@pgbox:/home/postgres/ [PG10] sudo tuned-adm profile dbi-postgres
postgres@pgbox:/home/postgres/ [PG10] cat /proc/meminfo | grep -i huge
AnonHugePages:      4096 kB
HugePages_Total:     170
HugePages_Free:      170
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
This confirms that we now have 170 huge pages, all of which are free to be consumed. Now let’s configure PostgreSQL to only start when it can get the required amount of huge pages by switching the “huge_pages” parameter to “on” and restarting the instance:
postgres@pgbox:/home/postgres/ [PG10] psql -c "alter system set huge_pages=on" postgres
ALTER SYSTEM
Time: 0.719 ms
postgres@pgbox:/home/postgres/ [PG10] pg_ctl -D $PGDATA restart -m fast
waiting for server to shut down.... done
server stopped
waiting for server to start....2018-02-25 11:21:29.107 CET - 1 - 3170 - - @ LOG: listening on IPv4 address "0.0.0.0", port 5441
2018-02-25 11:21:29.107 CET - 2 - 3170 - - @ LOG: listening on IPv6 address "::", port 5441
2018-02-25 11:21:29.110 CET - 3 - 3170 - - @ LOG: listening on Unix socket "/tmp/.s.PGSQL.5441"
2018-02-25 11:21:29.118 CET - 4 - 3170 - - @ LOG: redirecting log output to logging collector process
2018-02-25 11:21:29.118 CET - 5 - 3170 - - @ HINT: Future log output will appear in directory "pg_log".
 done
server started
As the instance started, all should be fine, and we can confirm that by looking at the statistics in /proc/meminfo:
postgres@pgbox:/home/postgres/ [PG10] cat /proc/meminfo | grep -i huge
AnonHugePages:      4096 kB
HugePages_Total:     170
HugePages_Free:      162
HugePages_Rsvd:       64
HugePages_Surp:        0
Hugepagesize:       2048 kB
You might be surprised that only 8 huge pages (170 total minus 162 free) are actually in use right now. The 64 pages reported as HugePages_Rsvd are reserved for the instance but have not been touched yet. This will change as soon as you put some load on the system:
postgres=# create table t1 as select * from generate_series(1,1000000);
SELECT 1000000
postgres=# select count(*) from t1;
  count
---------
 1000000
(1 row)

postgres=# \! cat /proc/meminfo | grep -i huge
AnonHugePages:      4096 kB
HugePages_Total:     170
HugePages_Free:      153
HugePages_Rsvd:       55
HugePages_Surp:        0
Hugepagesize:       2048 kB
postgres=#
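If you want to track this over time, the number of huge pages actually in use is simply HugePages_Total minus HugePages_Free. A small sketch of that calculation follows. The function name hugepages_in_use is made up for this example, and the sample input is the output from above:

```shell
# Hypothetical helper: read /proc/meminfo style text from standard input and
# print how many huge pages are in use (HugePages_Total - HugePages_Free).
hugepages_in_use() {
    awk '/^HugePages_Total:/ { total = $2 }
         /^HugePages_Free:/  { free = $2 }
         END                 { print total - free }'
}

# Sample input taken from the output after the load test
printf 'HugePages_Total: 170\nHugePages_Free: 153\nHugePages_Rsvd: 55\n' | hugepages_in_use

# On a live system:
#   hugepages_in_use < /proc/meminfo
```

For the sample this prints 17, compared to the 8 pages in use right after the restart.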
Hope this helps …