This Blog is about some findings on Microsoft’s Azure service for PostgreSQL which I think needed to be noted.
Findings about the infrastructure
The Microsoft Azure Service for PostgreSQL is using Windows as OS Infrastructure, there are several points where it is possible to find out the used OS on PostgreSQL Level.
- select version(); => PostgreSQL 11.6, compiled by Visual C++ build 1800, 64-bi
- show dynamic_shared_memory_type; => windows
- show archive_command; => c:postgresbinxlogcopyxlogcopy.exe archive blob “%f” “%p”
So i think it is secured knowledge that the Azure Service for PostgreSQL is using Windows as OS.
Findings about performance
Microsoft offers 3 IOPS per GB Storage, in their own Quicktips which can be found here https://techcommunity.microsoft.com/t5/azure-database-support-blog/azure-database-for-postgresql-performance-quick-tips/ba-p/369125
Microsoft describes under Point B and C:
B. IOPS throughput
Please remember that the server has 3 IOPS per 1 GB of Storage. If your application requires higher IOPs, then it is recommended that you scale up you Azure Database for PostgreSQL server storage size to get more IOPS so that your application performance is not impacted by storage throttling.
C. IO waits
If IO waits are observed from PostgreSQL performance troubleshooting, then increasing the storage size should be considered for higher IO throughput. Check the wait queries using the portal.
Means a small database with higher performance requirements needs more storage than used by the data volume to get higher performance, and for example at 100GB 300 IOPS is not as fast as PostgreSQL can be.
The default collation on Azure is US Western 1252 which is not PostgreSQL default, so be careful by creating your databases and try to use UTF8 instead.
Not all extensions provided by the contrib package are available on Azure, this may have some consequences if your application requires one of them:
- adminpack administrative functions for PostgreSQL
- amcheck functions for verifying relation integrity
- autoinc functions for autoincrementing fields
- bloom bloom access method – signature file based index
- dict_xsyn text search dictionary template for extended synonym processing
- file_fdw foreign-data wrapper for flat file access
- hstore_plperl transform between hstore and plperl
- hstore_plperlu transform between hstore and plperlu
- insert_username functions for tracking who changed a table
- intagg integer aggregator and enumerator (obsolete)
- jsonb_plperl transform between jsonb and plperl
- jsonb_plperlu transform between jsonb and plperlu
- lo Large Object maintenance
- moddatetime functions for tracking last modification time
- pageinspect inspect the contents of database pages at a low level
- pg_freespacemap examine the free space map (FSM)
- pg_qualstats An extension collecting statistics about quals
- pg_stat_kcache Kernel statistics gathering
- pg_visibility examine the visibility map (VM) and page-level visibility info
- refint functions for implementing referential integrity (obsolete)
- repmgr Replication manager for PostgreSQL
- seg data type for representing line segments or floating-point intervals
- sslinfo information about SSL certificates
- tcn Triggered change notifications
- timetravel functions for implementing time travel
- tsm_system_rows TABLESAMPLE method which accepts number of rows as a limit
- tsm_system_time TABLESAMPLE method which accepts time in milliseconds as a limit
- xml2 XPath querying and XSLT
Findings about Security
This is a very important point for someone who has to fulfill security requirements by corporate security. On the configuration page it is possible to enabled a firewall rule to allow Azure services to connect, what does this mean in reality?
With select * from pg_hba_file_rules; it is possible to read out the pg_hba.conf used for the Azure Service for PostgreSQL, with switching on this firewall rule the pg_hba.conf contains more than 10900 Entries including 45 /16 Networks, means these 45 /16 Networks allowing to connect 45 x 65025 IP Adresses = 2.926 Million IP addresses allowed to connect with host and Password security only.
It is interesting that with enabled SSL enforcement the entries inside pg_hba.conf are still on host and not on hostssl, according to the PostgreSQL Documentation also non SSL connections are allowed: https://www.postgresql.org/docs/11/auth-pg-hba-conf.html
This record matches connection attempts made using TCP/IP. host records match either SSL or non-SSL connection attempts.
This record matches connection attempts made using TCP/IP, but only when the connection is made with SSL encryption.
Require the client to supply an unencrypted password for authentication. Since the password is sent in clear text over the network, this should not be used on untrusted networks.
With enabled firwall rule allow Azure services to connect we found on some Services that select * from pg_stat_ssl; shows no connection using SSL! Restarting the service solved the issue for the moment.
So in my mind there are some points needed to be discussed before starting with the Azure Service for PostgreSQL, but this needs to be done on any Cloud service before using it. According to some discussions on a project we had, some following up meetings about the future of the Microsoft Azure Service for PostgreSQL.
At first, Microsoft moves to a flexible server implementation based on Linux and get rid of the performance and security issuing connection gateway by using vnet integration, so the pg_hba nightmare should be solved with the new solution.
The new solution is vm based, one vm per instance, no multiple instance setup.
I have a few concerns, Microsoft is building the installation packages on their own, using the PostgreSQL build in replication (no repmgr, no patroni) and the Linux used is Ubuntu. If it is available for met for testing, I will follow up with testing if the issues we have found on the Windows based offering are gone now. But I see the way is going into the right direction.