<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Database Administration &amp; Monitoring Archives - dbi Blog</title>
	<atom:link href="https://www.dbi-services.com/blog/category/database-administration-monitoring/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.dbi-services.com/blog/category/database-administration-monitoring/</link>
	<description></description>
	<lastBuildDate>Fri, 26 Jun 2026 19:11:43 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	

<image>
	<url>https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2025/05/cropped-favicon_512x512px-min-32x32.png</url>
	<title>Database Administration &amp; Monitoring Archives - dbi Blog</title>
	<link>https://www.dbi-services.com/blog/category/database-administration-monitoring/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Highly Available, Load-Balanced PostgreSQL with Patroni, HAProxy, and Keepalived</title>
		<link>https://www.dbi-services.com/blog/highly-available-load-balanced-postgresql-with-patroni-haproxy-and-keepalived/</link>
					<comments>https://www.dbi-services.com/blog/highly-available-load-balanced-postgresql-with-patroni-haproxy-and-keepalived/#respond</comments>
		
		<dc:creator><![CDATA[Joan Frey]]></dc:creator>
		<pubDate>Fri, 26 Jun 2026 19:11:41 +0000</pubDate>
				<category><![CDATA[Database Administration & Monitoring]]></category>
		<category><![CDATA[Database management]]></category>
		<category><![CDATA[Operating systems]]></category>
		<category><![CDATA[HAProxy]]></category>
		<category><![CDATA[keepalived]]></category>
		<category><![CDATA[Load]]></category>
		<category><![CDATA[load balancer]]></category>
		<category><![CDATA[load balancing]]></category>
		<category><![CDATA[postgresql]]></category>
		<guid isPermaLink="false">https://www.dbi-services.com/blog/?p=45342</guid>

					<description><![CDATA[<p>Patroni runs your PostgreSQL cluster and handles failover, promoting a replica the moment the primary dies and recording the change in its distributed store (etcd, Consul, or ZooKeeper). That part works on its own. Your applications still need one stable address to connect to, and they need writes to reach the primary while reads spread [&#8230;]</p>
<p>The post <a href="https://www.dbi-services.com/blog/highly-available-load-balanced-postgresql-with-patroni-haproxy-and-keepalived/">Highly Available, Load-Balanced PostgreSQL with Patroni, HAProxy, and Keepalived</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">Patroni runs your PostgreSQL cluster and handles failover, promoting a replica the moment the primary dies and recording the change in its distributed store (etcd, Consul, or ZooKeeper). That part works on its own.</p>



<p class="wp-block-paragraph">Your applications still need one stable address to connect to, and they need writes to reach the primary while reads spread across replicas. HAProxy handles that routing, with a floating IP from Keepalived in front of it.</p>



<h1 id="h-how-the-tools-work-together" class="wp-block-heading">How the tools work together</h1>



<p class="wp-block-paragraph">Three components stand between your application and the database.</p>



<p class="wp-block-paragraph"><strong>Patroni</strong> manages replication and failover, and it runs an agent on every PostgreSQL node. Each agent exposes a small REST API (port 8008 by default) that reports that node&#8217;s role.</p>



<p class="wp-block-paragraph"><strong>HAProxy</strong> accepts client connections and forwards them to the right node. It asks Patroni&#8217;s REST API which node is the primary and which are replicas, then sends each connection to a matching node.</p>



<p class="wp-block-paragraph"><strong>Keepalived</strong> publishes a virtual IP that floats between your HAProxy hosts using VRRP. Your application connects to the VIP, so one HAProxy host going down doesn&#8217;t take the whole entry point with it.</p>



<p class="wp-block-paragraph">Your application talks to the VIP. Keepalived points the VIP at a live HAProxy. HAProxy forwards the connection to whichever PostgreSQL node Patroni reports as healthy for that role.</p>



<h1 id="h-the-health-check-method" class="wp-block-heading">The health-check method</h1>



<p class="wp-block-paragraph">HAProxy checks one port and routes to another.</p>



<p class="wp-block-paragraph">Patroni&#8217;s REST API returns an HTTP status that depends on the node&#8217;s role:</p>



<ul class="wp-block-list">
<li><code>GET /</code> returns <code>200</code> only on the leader (the primary). A non-leader node returns <code>503</code>.</li>



<li><code>GET /primary</code> is the explicit name for the same leader check.</li>



<li><code>GET /replica</code> returns <code>200</code> only on a running replica.</li>



<li><code>GET /read-only</code> returns <code>200</code> on the primary or a replica, any node that can serve a read.</li>
</ul>



<p class="wp-block-paragraph">In our case, HAProxy runs its health check against the API port (8008) and reads that status code, then forwards the SQL connection to the database port (5432). A node receives traffic only when its API answers <code>200</code> for the role that listener cares about. Point a listener&#8217;s check at <code>/</code> and it follows the primary. Point it at <code>/replica</code> and it follows the replicas. Patroni promotes a new leader, the status codes change, and HAProxy moves traffic to match within a couple of health-check cycles.</p>
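<p class="wp-block-paragraph">To see what HAProxy sees, you can probe the same endpoints by hand. The sketch below is only an illustration: the function names and the <code>PATRONI_PORT</code> variable are assumptions of this example, not part of Patroni or HAProxy, and it presumes that <code>curl</code> is installed and that the API answers over plain HTTP without authentication.</p>

```shell
#!/bin/sh
# Sketch: ask a Patroni node which role-specific endpoints answer 200.
# PATRONI_PORT and all function names are illustrative assumptions.
PATRONI_PORT="${PATRONI_PORT:-8008}"

# Print the HTTP status code returned by one Patroni endpoint.
# Arguments: host, path (for example /replica).
patroni_status() {
    curl -s -o /dev/null --max-time 2 -w '%{http_code}' "http://$1:${PATRONI_PORT}$2"
}

# Mirror "http-check expect status 200": only 200 counts as up.
status_to_state() {
    if [ "$1" = "200" ]; then echo UP; else echo DOWN; fi
}

# Usage: check_node 10.5.5.147
check_node() {
    for endpoint in / /primary /replica /read-only; do
        code="$(patroni_status "$1" "$endpoint")"
        printf '%-12s %s %s\n' "$endpoint" "$code" "$(status_to_state "$code")"
    done
}
```

<p class="wp-block-paragraph">Run against the current primary, <code>check_node</code> should report UP for <code>/</code>, <code>/primary</code> and <code>/read-only</code> and DOWN for <code>/replica</code>, in line with the list above.</p>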



<h1 id="h-a-first-and-simple-working-configuration" class="wp-block-heading">A first and simple working configuration</h1>



<p class="wp-block-paragraph">A two-node setup with <code>10.5.5.147</code> and <code>10.5.5.148</code> looks like this. One listener handles writes, the other handles reads.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
listen PG1
    bind *:5000
    option httpchk
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server postgresql_10.5.5.147_5432 10.5.5.147:5432 maxconn 100 check port 8008
    server postgresql_10.5.5.148_5432 10.5.5.148:5432 maxconn 100 check port 8008

listen PG1_ro
    bind *:5001
    option httpchk GET /replica
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server postgresql_10.5.5.147_5432 10.5.5.147:5432 maxconn 100 check port 8008
    server postgresql_10.5.5.148_5432 10.5.5.148:5432 maxconn 100 check port 8008
</pre></div>


<p class="wp-block-paragraph">This example uses the default ports, 5432 for PostgreSQL and 8008 for the Patroni API. Swap in whatever ports your deployment uses.</p>



<p class="wp-block-paragraph">Line by line:</p>



<ul class="wp-block-list">
<li><code>bind *:5000</code> and <code>bind *:5001</code> are the two addresses your applications connect to. Send writes to 5000 and reads to 5001.</li>



<li><code>option httpchk</code> (with no path) on the first listener checks Patroni&#8217;s root endpoint. Only the leader answers <code>200</code>, so HAProxy sends port 5000 traffic to the current primary.</li>



<li><code>option httpchk GET /replica</code> on the second listener checks the replica endpoint, so HAProxy sends port 5001 traffic to a replica.</li>



<li><code>http-check expect status 200</code> tells HAProxy that <code>200</code> means healthy and anything else means down.</li>



<li><code>inter 3s fall 3 rise 2</code> checks every 3 seconds, marks a server down after 3 failures, and brings it back after 2 successes.</li>



<li><code>on-marked-down shutdown-sessions</code> kills existing connections to a server the instant HAProxy marks it down, so clients reconnect and get rerouted instead of hanging on a dead node.</li>



<li><code>check port 8008</code> is the trick in action: health checks hit the Patroni API on 8008 while HAProxy forwards traffic to PostgreSQL on 5432.</li>



<li><code>maxconn 100</code> limits connections per server so you don&#8217;t exhaust PostgreSQL&#8217;s connection slots.</li>
</ul>



<p class="wp-block-paragraph">For a primary plus one or more replicas, this routes writes and reads to the right node and survives a failover.</p>
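<p class="wp-block-paragraph">The <code>inter 3s fall 3 rise 2</code> settings translate into rough reaction times. The arithmetic below is a back-of-the-envelope sketch that ignores check timeouts and network delay; the function name is made up for this example.</p>

```shell
#!/bin/sh
# Rough upper bound on how long HAProxy needs to change a server's state.
# Arguments: check interval in seconds, number of consecutive checks required.
seconds_to_flip() {
    echo $(( $1 * $2 ))
}

inter=3; fall=3; rise=2
echo "marked down after up to about $(seconds_to_flip "$inter" "$fall")s of failed checks"
echo "marked up after about $(seconds_to_flip "$inter" "$rise")s of passing checks"
```

<p class="wp-block-paragraph">With these values a dead node leaves the pool after roughly 9 seconds and a recovered node rejoins after roughly 6. Lowering <code>inter</code> or <code>fall</code> shortens the window at the price of more checks and a higher risk of flapping.</p>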



<h1 id="h-the-failure-mode-hiding-in-the-read-path" class="wp-block-heading">The failure mode hiding in the read path</h1>



<p class="wp-block-paragraph">Imagine a two-node cluster: one primary, one replica. The replica goes down. Maybe it crashed, maybe Patroni is mid-switchover and no standby exists for a few seconds.</p>



<p class="wp-block-paragraph">Your read traffic hits port 5001. That listener marks a server up only when <code>GET /replica</code> returns <code>200</code>, and right now no node is a replica. HAProxy has zero usable servers in the pool, so it refuses the connection. Read queries start failing.</p>



<p class="wp-block-paragraph">The primary is up the entire time, and it can serve those reads. Your config won&#8217;t send them there, because you told the read listener to look for replicas and nothing else. You&#8217;ve turned a degraded cluster that could still serve reads into a read outage. You feel this most on small clusters, and each failover passes through a window where the old primary becomes a replica and no standby is available yet. In the worst case, your only replica is down while one of your applications connects to port 5001, and every read fails.</p>



<h1 id="h-the-fix-fall-back-to-the-primary" class="wp-block-heading">The fix: fall back to the primary</h1>



<p class="wp-block-paragraph">Send reads to the primary when the read listener runs out of replicas, instead of dropping them.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
listen PG1
    bind *:5000
    option httpchk
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server postgresql_10.5.5.147_5432 10.5.5.147:5432 maxconn 100 check port 8008
    server postgresql_10.5.5.148_5432 10.5.5.148:5432 maxconn 100 check port 8008

listen PG1_ro
    bind *:5001
    option httpchk GET /replica
    http-check expect status 200
    use_backend PG1_ro_leader if { nbsrv(PG1_ro) eq 0 }
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server postgresql_10.5.5.147_5432 10.5.5.147:5432 maxconn 100 check port 8008
    server postgresql_10.5.5.148_5432 10.5.5.148:5432 maxconn 100 check port 8008

backend PG1_ro_leader
    option httpchk GET /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server postgresql_10.5.5.147_5432 10.5.5.147:5432 maxconn 100 check port 8008
    server postgresql_10.5.5.148_5432 10.5.5.148:5432 maxconn 100 check port 8008
</pre></div>


<p class="wp-block-paragraph">This new line carries the whole fix:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
use_backend PG1_ro_leader if { nbsrv(PG1_ro) eq 0 }
</pre></div>


<p class="wp-block-paragraph"><code>nbsrv(PG1_ro)</code> counts the usable servers in the <code>PG1_ro</code> pool, which here means the number of available replicas, since those servers pass the check only when <code>GET /replica</code> returns <code>200</code>. While at least one replica is up, the count stays above zero, the condition is false, and reads stay on the replicas. The moment the last replica drops, <code>nbsrv(PG1_ro)</code> hits zero, the condition fires, and HAProxy diverts reads to the <code>PG1_ro_leader</code> backend.</p>



<p class="wp-block-paragraph">That backend health-checks with <code>GET /primary</code>, so the only server it counts as up is the current primary. HAProxy sends reads to the primary until a replica returns, then shifts them back to the replica pool once a replica passes its check again.</p>
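<p class="wp-block-paragraph">The decision HAProxy takes on port 5001 can be modelled in a few lines of shell. This is a toy model with hypothetical function names, not HAProxy code: the real <code>nbsrv()</code> counts the servers HAProxy currently considers up, which also depends on the <code>fall</code> and <code>rise</code> counters rather than on a single check result.</p>

```shell
#!/bin/sh
# Toy model of the fallback rule. Each argument is the status code one
# server returned for GET /replica.

# Count the servers whose check returned 200 (a stand-in for nbsrv(PG1_ro)).
count_up() {
    n=0
    for code in "$@"; do
        if [ "$code" = "200" ]; then n=$((n + 1)); fi
    done
    echo "$n"
}

# Print the pool that would serve a new read connection.
choose_read_pool() {
    if [ "$(count_up "$@")" -eq 0 ]; then
        echo PG1_ro_leader
    else
        echo PG1_ro
    fi
}
```

<p class="wp-block-paragraph">For example, <code>choose_read_pool 503 200</code> prints <code>PG1_ro</code> (one replica is up), while <code>choose_read_pool 503 503</code> prints <code>PG1_ro_leader</code> (no replica left, so reads go to the primary).</p>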



<p class="wp-block-paragraph">Three names have to agree for this to work. The backend you define (<code>backend PG1_ro_leader</code>), the backend you route to (<code>use_backend PG1_ro_leader</code>), and the pool you count (<code>nbsrv(PG1_ro)</code>) all reference the real section names. Drop in a stale name from an earlier version and HAProxy either refuses to start or counts the wrong pool.</p>
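<p class="wp-block-paragraph">A quick way to catch a stale name is to compare the names the config references with the sections it defines. The helper below is a rough sketch under simple assumptions: one directive per line, section names without spaces, and only <code>use_backend</code> and <code>nbsrv()</code> references checked. The function name is hypothetical.</p>

```shell
#!/bin/sh
# Print every section name referenced by use_backend or nbsrv() that is not
# defined as a "listen" or "backend" section in the given config file.
undefined_refs() {
    cfg="$1"
    defined="$(awk '$1 == "listen" || $1 == "backend" { print $2 }' "$cfg")"
    referenced="$(
        awk '$1 == "use_backend" { print $2 }' "$cfg"
        sed -n 's/.*nbsrv(\([^)]*\)).*/\1/p' "$cfg"
    )"
    for name in $referenced; do
        found=no
        for section in $defined; do
            if [ "$name" = "$section" ]; then found=yes; fi
        done
        if [ "$found" = no ]; then echo "$name"; fi
    done
}
```

<p class="wp-block-paragraph">An empty result from <code>undefined_refs /etc/haproxy/haproxy.cfg</code> means the names line up (the path is an assumption, adjust it to your installation). HAProxy&#8217;s own syntax check, <code>haproxy -c -f &lt;file&gt;</code>, is a useful second opinion before a reload.</p>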



<h2 class="wp-block-heading">The /read-only shortcut and what it costs</h2>



<p class="wp-block-paragraph">Patroni offers <code>GET /read-only</code>, which returns <code>200</code> on the primary and the replicas alike. Point the read listener there and both the primary and the replicas serve reads, no fallback backend needed.</p>



<p class="wp-block-paragraph">The cost is read load on the primary even when your replicas are healthy and idle. The fallback approach keeps reads off the primary until the replicas are gone, then leans on it as a safety net. You protect the primary&#8217;s write capacity during normal operation and still keep reads alive during a replica outage.</p>



<p class="wp-block-paragraph">To keep lagging replicas out of the read pool, Patroni accepts a threshold on the replica check, for example <code>GET /replica?lag=10MB</code>, which fails any replica more than 10 MB behind. Pair that with the fallback and HAProxy drops the lagging replicas from rotation while reads still have somewhere to go.</p>
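<p class="wp-block-paragraph">Before putting a lag threshold into the HAProxy check, it can help to try the URL by hand. The snippet below is a small sketch with hypothetical helper names; it assumes <code>curl</code> is available and that the Patroni API listens on plain HTTP.</p>

```shell
#!/bin/sh
# Build the replica health-check URL with a lag threshold.
# Arguments: host, lag threshold (default 10MB), API port (default 8008).
replica_lag_url() {
    echo "http://$1:${3:-8008}/replica?lag=${2:-10MB}"
}

# Print the status code HAProxy would receive for that check.
replica_lag_status() {
    curl -s -o /dev/null --max-time 2 -w '%{http_code}' "$(replica_lag_url "$@")"
}
```

<p class="wp-block-paragraph">Calling <code>replica_lag_status 10.5.5.148</code> and then <code>replica_lag_status 10.5.5.148 1MB</code> shows whether a tighter threshold would push that replica out of the pool.</p>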



<h1 class="wp-block-heading">Keepalived: removing HAProxy as a single point of failure</h1>



<p class="wp-block-paragraph">One HAProxy host fronting the cluster moves the single point of failure up a layer. Run HAProxy on two hosts and let Keepalived float a virtual IP between them with VRRP. Your application connects to the VIP, and whichever HAProxy holds it answers.</p>



<p class="wp-block-paragraph">A minimal <code>keepalived.conf</code> on the primary HAProxy host:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
vrrp_script chk_haproxy {
    script &quot;killall -0 haproxy&quot;   # succeeds while the haproxy process is alive
    interval 2
    weight 2
}

vrrp_instance VI_1 {
    interface eth0
    state MASTER
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass changeme
    }
    virtual_ipaddress {
        10.0.0.1
    }
    track_script {
        chk_haproxy
    }
}
</pre></div>


<p class="wp-block-paragraph">The second HAProxy host runs the same file with <code>state BACKUP</code> and a lower <code>priority</code> (100). Both advertise over VRRP, and the higher priority holds the VIP. <code>chk_haproxy</code> runs every two seconds. If HAProxy dies on the active host, its priority drops and the backup takes the VIP, so an HAProxy crash on one host no longer takes the entry point down with it.</p>
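<p class="wp-block-paragraph">The priority arithmetic behind that handover is worth spelling out. With a positive <code>weight</code>, Keepalived adds the weight to a host&#8217;s priority while the tracked script succeeds. The sketch below models that rule with the values from the config above (101, 100 and weight 2); the function names are invented for the example.</p>

```shell
#!/bin/sh
# Effective VRRP priority: base priority plus the script weight while the
# check passes. Arguments: base priority, weight, script result (ok or fail).
effective_priority() {
    if [ "$3" = ok ]; then echo $(( $1 + $2 )); else echo "$1"; fi
}

# Which host holds the VIP, given the HAProxy check result on each host.
# Arguments: result on the MASTER host, result on the BACKUP host.
vip_holder() {
    m="$(effective_priority 101 2 "$1")"
    b="$(effective_priority 100 2 "$2")"
    if [ "$m" -gt "$b" ]; then echo master; else echo backup; fi
}
```

<p class="wp-block-paragraph">With HAProxy healthy on both hosts the priorities are 103 and 102, so the master keeps the VIP. When HAProxy dies on the master, it falls back to 101 against 102 and the backup takes over. The handover only works because the gap between the base priorities (1) is smaller than the weight (2).</p>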



<p class="wp-block-paragraph">Point your applications at <code>10.0.0.1:5000</code> for writes and <code>10.0.0.1:5001</code> for reads, using the VIP declared in <code>virtual_ipaddress</code>. Your applications never see which physical HAProxy does the work.</p>
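<p class="wp-block-paragraph">A simple end-to-end test is to connect through the VIP and ask PostgreSQL which role answered. The sketch below uses the VIP from the <code>keepalived.conf</code> above (<code>10.0.0.1</code>) and hypothetical helper names; it assumes <code>psql</code> is installed and that credentials come from the environment or a <code>.pgpass</code> file.</p>

```shell
#!/bin/sh
# VIP taken from the virtual_ipaddress block; adjust to your network.
VIP=10.0.0.1

# Build a libpq connection string. Arguments: port, database (default postgres).
conninfo() {
    echo "host=${VIP} port=$1 dbname=${2:-postgres} connect_timeout=3"
}

# Prints "f" when the session landed on the primary, "t" on a replica.
landed_on_replica() {
    psql -X -Atc 'select pg_is_in_recovery()' "$(conninfo "$@")"
}
```

<p class="wp-block-paragraph"><code>landed_on_replica 5000</code> should print <code>f</code>. <code>landed_on_replica 5001</code> should print <code>t</code> while a replica is healthy and <code>f</code> once the fallback to the primary is active.</p>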



<h1 class="wp-block-heading">Summary</h1>



<p class="wp-block-paragraph">Patroni keeps the cluster healthy and picks the leader. HAProxy turns Patroni&#8217;s REST API into routing, sending writes to the primary and reads to the replicas by health-checking the API port while forwarding to the database port. The naive read-only listener drops reads when the last replica goes down, even though the primary could serve them. Adding <code>use_backend ... if { nbsrv(...) eq 0 }</code> with a primary-checking backend closes that gap, and a lag threshold on the replica check keeps stale standbys out of rotation. Keepalived puts a floating VIP in front of two HAProxy instances so the proxy layer survives a host failure too.</p>



<p class="wp-block-paragraph">Writes reach the primary and reads spread across the replicas. Reads stay up as long as one node in the cluster is alive.</p>



<p class="wp-block-paragraph">Let me know if you find any improvements to this configuration 😀</p>
<p>The post <a href="https://www.dbi-services.com/blog/highly-available-load-balanced-postgresql-with-patroni-haproxy-and-keepalived/">Highly Available, Load-Balanced PostgreSQL with Patroni, HAProxy, and Keepalived</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.dbi-services.com/blog/highly-available-load-balanced-postgresql-with-patroni-haproxy-and-keepalived/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Upgrade RHEL from 9.6 to 10.1 (when running PostgreSQL/Patroni)</title>
		<link>https://www.dbi-services.com/blog/upgrade-rhel-from-9-6-to-10-1-when-running-postgresql-patroni/</link>
					<comments>https://www.dbi-services.com/blog/upgrade-rhel-from-9-6-to-10-1-when-running-postgresql-patroni/#respond</comments>
		
		<dc:creator><![CDATA[Joan Frey]]></dc:creator>
		<pubDate>Fri, 26 Jun 2026 10:39:40 +0000</pubDate>
				<category><![CDATA[Database Administration & Monitoring]]></category>
		<category><![CDATA[Operating systems]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[10]]></category>
		<category><![CDATA[leapp]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[operating system]]></category>
		<category><![CDATA[os]]></category>
		<category><![CDATA[RHEL]]></category>
		<category><![CDATA[upgrade]]></category>
		<guid isPermaLink="false">https://www.dbi-services.com/blog/?p=43285</guid>

					<description><![CDATA[<p>Upgrading from RHEL 9.6 to 10.1 is not just a routine update, it’s a major platform shift. When your server runs PostgreSQL compiled from source and a Patroni-managed cluster, the complexity increases significantly. System libraries change, Python environments break, ICU versions evolve, and your database binaries may no longer start after reboot. In this guide, [&#8230;]</p>
<p>The post <a href="https://www.dbi-services.com/blog/upgrade-rhel-from-9-6-to-10-1-when-running-postgresql-patroni/">Upgrade RHEL from 9.6 to 10.1 (when running PostgreSQL/Patroni)</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">Upgrading from RHEL 9.6 to 10.1 is not just a routine update, it’s a major platform shift. When your server runs PostgreSQL compiled from source and a Patroni-managed cluster, the complexity increases significantly. System libraries change, Python environments break, ICU versions evolve, and your database binaries may no longer start after reboot.</p>



<p class="wp-block-paragraph">In this guide, I walk through a real-world in-place upgrade using Leapp, covering preparation, resolving high-severity warnings, executing the upgrade, recompiling PostgreSQL, fixing collation mismatches, and restoring Patroni.</p>



<h2 class="wp-block-heading" id="h-i-preparation">I. Preparation</h2>



<p class="wp-block-paragraph">Before the upgrade, you must ensure the current OS is healthy and fully patched.</p>



<h3 class="wp-block-heading" id="h-1-pause-high-availability">1. Pause High Availability</h3>



<p class="wp-block-paragraph">Prevent Patroni from triggering a failover during the reboot cycles. If you run a standalone PostgreSQL instance without Patroni, stop it instead, either through its service or with <code>pg_ctl stop</code>.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
patronictl -c /etc/patroni/patroni.yml pause
systemctl stop patroni
</pre></div>


<h3 class="wp-block-heading" id="h-2-fix-subscription-amp-perform-full-update">2. Fix Subscription &amp; Perform Full Update</h3>



<p class="wp-block-paragraph">I&#8217;m using an old VM for this post. If, like me, you see 403 Forbidden errors on repositories such as codeready-builder, refresh your registration:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
&#x5B;root@patroni2 ~]# sudo dnf update -y
Updating Subscription Management repositories.

This system is registered with an entitlement server, but is not receiving updates. You can use subscription-manager to assign subscriptions.

Red Hat CodeReady Linux Builder for RHEL 9 x86_64 (RPMs)                                                                                                     761  B/s | 480  B     00:00
Errors during downloading metadata for repository &#039;codeready-builder-for-rhel-9-x86_64-rpms&#039;:
  - Status code: 403 for https://cdn.redhat.com/content/dist/rhel9/9/x86_64/codeready-builder/os/repodata/repomd.xml (IP: 23.206.57.92)
Error: Failed to download metadata for repo &#039;codeready-builder-for-rhel-9-x86_64-rpms&#039;: Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried

&#x5B;root@patroni2 ~]# sudo subscription-manager clean
&#x5B;root@patroni2 ~]# sudo subscription-manager register --force
&#x5B;root@patroni2 ~]# sudo subscription-manager attach --auto
&#x5B;root@patroni2 ~]# sudo subscription-manager refresh

&#x5B;root@patroni2 ~]# sudo dnf update -y
...
Complete!

&#x5B;root@patroni2 ~]# reboot
</pre></div>


<h2 class="wp-block-heading" id="h-ii-the-leapp-upgrade-to-10-1">II. The Leapp Upgrade to 10.1</h2>



<h3 class="wp-block-heading" id="h-1-install-amp-analyze">1. Install &amp; Analyze</h3>



<p class="wp-block-paragraph">In this first phase, we prepare the system for a major in-place upgrade using Leapp, the official upgrade framework for Red Hat-based distributions. We start by installing the package:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
&#x5B;root@patroni2 ~]# dnf install leapp-upgrade -y
...
Installed:
  leapp-0.20.0-1.el9.noarch                leapp-deps-0.20.0-1.el9.noarch          leapp-upgrade-el9toel10-0.23.0-1.el9.noarch       leapp-upgrade-el9toel10-deps-0.23.0-1.el9.noarch
  libdb-utils-5.3.28-57.el9_6.x86_64       python3-leapp-0.20.0-1.el9.noarch       systemd-container-252-55.el9_7.7.x86_64

Complete!
</pre></div>


<p class="wp-block-paragraph">Running <code>leapp preupgrade --target 10.1</code> does not perform the upgrade. Instead, Leapp performs a full system audit to determine whether the server is ready for RHEL 10.1. It checks:</p>



<ul class="wp-block-list">
<li>Installed packages and their compatibility</li>



<li>Deprecated or removed libraries</li>



<li>Kernel drivers that will not exist in RHEL 10</li>



<li>Bootloader configuration (GRUB2)</li>



<li>GPG key validity</li>



<li>Custom system-level modifications (like dynamic linker changes)</li>



<li>&#8230;</li>
</ul>



<p class="wp-block-paragraph">Think of this step as a dry-run with intelligence.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
&#x5B;root@patroni2 ~]# sudo leapp preupgrade --target 10.1

...

============================================================
                      REPORT OVERVIEW
============================================================

HIGH and MEDIUM severity reports:
    1. GRUB2 core will be automatically updated during the upgrade
    2. Detected customized configuration for dynamic linker.
    3. Leapp detected loaded kernel drivers which are no longer maintained in RHEL 10.
    4. Failed to read GPG keys from provided key files
    5. Berkeley DB (libdb) has been detected on your system

Reports summary:
    Errors:                      0
    Inhibitors:                  0
    HIGH severity reports:       4
    MEDIUM severity reports:     1
    LOW severity reports:        1
    INFO severity reports:       3

Before continuing, review the full report below for details about discovered problems and possible remediation instructions:
    A report has been generated at /var/log/leapp/leapp-report.txt
    A report has been generated at /var/log/leapp/leapp-report.json
</pre></div>


<p class="wp-block-paragraph">After running the pre-upgrade analysis, the next step is to carefully review:</p>



<pre class="wp-block-preformatted">/var/log/leapp/leapp-report.txt</pre>



<p class="wp-block-paragraph">What we are looking for first is simple:</p>



<ul class="wp-block-list">
<li>Errors: 0</li>



<li>Inhibitors: 0</li>
</ul>



<p class="wp-block-paragraph">If an Inhibitor is present, the upgrade will be blocked entirely.<br>In my case, there were no blockers, but I did have several high severity warnings.</p>



<p class="wp-block-paragraph">High severity does not mean the upgrade will fail.<br>It means: This could break something, review it carefully.</p>
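<p class="wp-block-paragraph">If you run the pre-upgrade check from a script, the two counters can be read automatically. The sketch below assumes you saved the console output of <code>leapp preupgrade</code> (the REPORT OVERVIEW shown above) to a file of your choosing; the function names are hypothetical and the parsing relies on the layout printed by this Leapp version.</p>

```shell
#!/bin/sh
# Print the number on the "Errors:" or "Inhibitors:" line of a saved overview.
# Arguments: label (Errors or Inhibitors), file holding the console output.
summary_count() {
    awk -v key="$1:" '$1 == key { print $2; exit }' "$2"
}

# Succeed only when both counters are exactly 0.
overview_is_clean() {
    errors="$(summary_count Errors "$1")"
    inhibitors="$(summary_count Inhibitors "$1")"
    if [ "$errors" = "0" ]; then
        if [ "$inhibitors" = "0" ]; then return 0; fi
    fi
    return 1
}
```

<p class="wp-block-paragraph">A clean result only says that nothing blocks the upgrade. The high-severity findings still need a human to read them.</p>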



<p class="wp-block-paragraph">Let’s look at one concrete example from my system.</p>



<h3 class="wp-block-heading" id="h-2-high-severity-example-dynamic-linker-customization">2. High Severity Example – Dynamic Linker Customization</h3>



<p class="wp-block-paragraph">Leapp detected that my system had a custom dynamic linker configuration:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
Risk Factor: high

Title: Detected customized configuration for dynamic linker.

Summary: Custom configurations to the dynamic linker could potentially impact the upgrade in a negative way. The custom configuration includes modifications to /etc/ld.so.conf, custom or modified drop in config files in the /etc/ld.so.conf.d directory and additional entries in the LD_LIBRARY_PATH or LD_PRELOAD variables. These modifications configure the dynamic linker to use different libraries that might not be provided by Red Hat products or might not be present during the whole upgrade process. The following custom configurations were detected by leapp:

- The following drop in config files were marked as custom:

    - /etc/ld.so.conf.d/postgres.conf

Remediation: &#x5B;hint] Remove or revert the custom dynamic linker configurations and apply the changes using the ldconfig command. In case of possible active software collections we suggest disabling them persistently.

Key: cc9bd972af70b7a27f66a37b11a00dcfcb73b1bc

----------------------------------------
</pre></div>


<h4 class="wp-block-heading" id="h-what-does-this-actually-mean">What does this actually mean?</h4>



<p class="wp-block-paragraph">The dynamic linker (ld.so) is responsible for loading shared libraries at runtime.</p>



<p class="wp-block-paragraph">By modifying:</p>



<ul class="wp-block-list">
<li>/etc/ld.so.conf</li>



<li>files in /etc/ld.so.conf.d/</li>



<li>LD_LIBRARY_PATH</li>



<li>LD_PRELOAD</li>
</ul>



<p class="wp-block-paragraph">we are telling the system to load non-standard or custom libraries. In PostgreSQL environments (especially with custom builds or extensions), this is common practice. However, during a major OS upgrade, these custom paths might:</p>



<ul class="wp-block-list">
<li>Point to libraries that do not exist in RHEL 10</li>



<li>Override new system libraries</li>



<li>Break dependency resolution mid-upgrade</li>
</ul>



<p class="wp-block-paragraph">Leapp flags this because it cannot guarantee consistency during the transition phase. In my case, it shouldn&#8217;t be an issue, because postgres.conf only contains a path pointing to the lib directories of my PostgreSQL installation, which will not change. We will still see how to clear the warning.</p>
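<p class="wp-block-paragraph">To check what a drop-in file really adds to the linker path, you can list its entries and see whether each directory exists. This is a small sketch with a hypothetical function name; it treats every non-comment line as a directory and does not follow <code>include</code> directives.</p>

```shell
#!/bin/sh
# Print each directory listed in an ld.so.conf.d drop-in file and whether
# it currently exists on disk. Argument: path to the drop-in file.
linker_dirs() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" | while read -r dir; do
        if [ -d "$dir" ]; then
            echo "present $dir"
        else
            echo "missing $dir"
        fi
    done
}
```

<p class="wp-block-paragraph">Running <code>linker_dirs /etc/ld.so.conf.d/postgres.conf</code> before the upgrade shows at a glance whether the file only points at your own PostgreSQL directories.</p>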



<h4 class="wp-block-heading" id="h-understanding-the-remediation">Understanding the Remediation</h4>



<p class="wp-block-paragraph">The report clearly suggests:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p class="wp-block-paragraph">Remove or revert the custom dynamic linker configurations and apply the changes using the ldconfig command.</p>
</blockquote>



<p class="wp-block-paragraph">In my case, the configuration was related to PostgreSQL, so temporarily removing it is safe for the upgrade preparation phase. Instead of deleting it permanently, I moved it aside:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
&#x5B;root@patroni2 ~]# sudo mv /etc/ld.so.conf.d/postgres.conf /tmp/postgres.conf.bak
&#x5B;root@patroni2 ~]# sudo ldconfig
</pre></div>


<p class="wp-block-paragraph"><code>ldconfig</code> rebuilds the system library cache, which is now based only on the standard paths.</p>



<h4 class="wp-block-heading" id="h-re-run-the-preupgrade-check">Re-Run the Preupgrade Check</h4>



<p class="wp-block-paragraph">After remediation, always re-run the preupgrade command and check the report again. If the fix was successful:</p>



<ul class="wp-block-list">
<li>The finding should disappear from the REPORT OVERVIEW</li>



<li>The issue should no longer appear in leapp-report.txt</li>
</ul>



<p class="wp-block-paragraph">This validation loop is important. We are progressively cleaning the system until it is fully compliant for upgrade.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
============================================================
                      REPORT OVERVIEW
============================================================

HIGH and MEDIUM severity reports:
    1. Leapp detected loaded kernel drivers which are no longer maintained in RHEL 10.
    2. GRUB2 core will be automatically updated during the upgrade
    3. Failed to read GPG keys from provided key files
    4. Berkeley DB (libdb) has been detected on your system

Reports summary:
    Errors:                      0
    Inhibitors:                  0
    HIGH severity reports:       3
    MEDIUM severity reports:     1
    LOW severity reports:        1
    INFO severity reports:       3

Before continuing, review the full report below for details about discovered problems and possible remediation instructions:
    A report has been generated at /var/log/leapp/leapp-report.txt
    A report has been generated at /var/log/leapp/leapp-report.json
</pre></div>


<p class="wp-block-paragraph">Only once the report is clean, or fully understood, should we proceed to the actual upgrade execution.</p>



<h3 class="wp-block-heading" id="h-2-execute-the-upgrade">3. Execute the upgrade</h3>



<p class="wp-block-paragraph">Once all errors and inhibitors are resolved, and high-severity findings have been reviewed or remediated, we are finally ready to perform the actual in-place upgrade. This is the moment where Leapp transitions from analysis mode to execution mode.</p>



<h4 class="wp-block-heading" id="h-about-the-repository-warning">About the Repository Warning</h4>



<p class="wp-block-paragraph">During the preupgrade phase, Leapp informed us that codeready-builder-&#8230; repositories are not officially supported during the upgrade process and are excluded by default.</p>



<p class="wp-block-paragraph">This is expected behavior as Leapp only enables a minimal, controlled set of repositories to ensure:</p>



<ul class="wp-block-list">
<li>Package consistency</li>



<li>Dependency resolution stability</li>



<li>Predictable upgrade paths</li>
</ul>



<p class="wp-block-paragraph">However, in PostgreSQL environments, some packages (extensions, development headers, libraries) may depend on CodeReady Builder. If a repository is truly required during the upgrade, we must explicitly enable it using:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
--enablerepo &lt;repoid&gt;
</pre></div>


<h4 class="wp-block-heading" id="h-running-the-upgrade">Running the Upgrade</h4>



<p class="wp-block-paragraph">Since I need CodeReady Builder for PostgreSQL dependencies, I ran:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
leapp upgrade --target 10.1 --enablerepo codeready-builder-for-rhel-10-x86_64-rpms
</pre></div>


<h4 class="wp-block-heading">What happens when we run this command?</h4>



<p class="wp-block-paragraph">At this stage, Leapp:</p>



<ul class="wp-block-list">
<li>Resolves and downloads required RHEL 10 packages</li>



<li>Builds a temporary upgrade environment</li>



<li>Prepares a special upgrade initramfs</li>



<li>Modifies the bootloader (GRUB) to boot into the upgrade environment on next reboot</li>
</ul>



<p class="wp-block-paragraph">The system is not upgraded immediately. The actual OS transition happens during the next boot.</p>



<p class="wp-block-paragraph">After the command completes, Leapp generates another report. Just like in the preupgrade phase, verify:</p>



<ul class="wp-block-list">
<li>Errors: 0</li>



<li>Inhibitors: 0</li>
</ul>



<p class="wp-block-paragraph">If everything looks clean, we can proceed.</p>
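


<p class="wp-block-paragraph">This check can also be scripted. The snippet below is a minimal sketch, not an official Leapp tool: it assumes the text report marks blocking entries with a line of the form <code>Risk Factor: high (inhibitor)</code>, so compare it with your own report first. The function name <code>count_inhibitors</code> is hypothetical.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
# Count report entries flagged as inhibitors.
# Assumption: flagged entries contain "Risk Factor: ... (inhibitor)".
count_inhibitors() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the function usable under "set -e".
    grep -c 'Risk Factor: .*(inhibitor)' "$1" || true
}
</pre></div>


<p class="wp-block-paragraph">For example, <code>count_inhibitors /var/log/leapp/leapp-report.txt</code> should print <code>0</code> before you reboot.</p>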



<h4 class="wp-block-heading" id="h-reboot-the-real-upgrade-begins">Reboot – The Real Upgrade Begins</h4>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
# reboot
</pre></div>


<p class="wp-block-paragraph">This is where the real upgrade starts. During boot:</p>



<ul class="wp-block-list">
<li>The system enters a temporary upgrade environment</li>



<li>Packages are replaced</li>



<li>Obsolete components are removed</li>



<li>Configuration files are migrated</li>



<li>The new RHEL 10 kernel is installed</li>
</ul>



<p class="wp-block-paragraph">This phase can take anywhere from several minutes to much longer, depending on your VM or server resources. My VM is modest, and the upgrade took around 30 minutes. Be patient: interrupting this process can leave the system in an inconsistent state.</p>



<h4 class="wp-block-heading" id="h-verifying-the-upgrade">Verifying the Upgrade</h4>



<p class="wp-block-paragraph">Once the server is back online, confirm the OS version:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
&#x5B;root@patroni2 ~]# cat /etc/os-release
NAME=&quot;Red Hat Enterprise Linux&quot;
VERSION=&quot;10.1 (Coughlan)&quot;
ID=&quot;rhel&quot;
ID_LIKE=&quot;centos fedora&quot;
VERSION_ID=&quot;10.1&quot;
PLATFORM_ID=&quot;platform:el10&quot;
PRETTY_NAME=&quot;Red Hat Enterprise Linux 10.1 (Coughlan)&quot;
ANSI_COLOR=&quot;0;31&quot;
LOGO=&quot;fedora-logo-icon&quot;
CPE_NAME=&quot;cpe:/o:redhat:enterprise_linux:10.1&quot;
HOME_URL=&quot;https://www.redhat.com/&quot;
VENDOR_NAME=&quot;Red Hat&quot;
VENDOR_URL=&quot;https://www.redhat.com/&quot;
DOCUMENTATION_URL=&quot;https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/10&quot;
BUG_REPORT_URL=&quot;https://issues.redhat.com/&quot;
REDHAT_BUGZILLA_PRODUCT=&quot;Red Hat Enterprise Linux 10&quot;
REDHAT_BUGZILLA_PRODUCT_VERSION=10.1
REDHAT_SUPPORT_PRODUCT=&quot;Red Hat Enterprise Linux&quot;
REDHAT_SUPPORT_PRODUCT_VERSION=&quot;10.1&quot;
</pre></div>



<p class="wp-block-paragraph">This confirms that we are now running RHEL 10.1.</p>



<h2 class="wp-block-heading" id="h-iii-post-upgrade-database-recovery">III. Post-Upgrade Database Recovery</h2>



<p class="wp-block-paragraph">Once you log in to RHEL 10.1, the PostgreSQL binaries under /u01 will fail to start because libicuuc.so.67, the ICU library from RHEL 9, is no longer present. Other shared libraries may be missing as well.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
14:45:55 postgres@patroni2:/home/postgres/ &#x5B;test-op-patroni] pgstart
/u01/app/postgres/product/17/db_6/bin/postgres: error while loading shared libraries: libicuuc.so.67: cannot open shared object file: No such file or directory
no data was returned by command &quot;&quot;/u01/app/postgres/product/17/db_6/bin/postgres&quot; -V&quot;
command not found
program &quot;postgres&quot; is needed by pg_ctl but was not found in the same directory as &quot;/u01/app/postgres/product/17/db_6/bin/pg_ctl&quot;
</pre></div>
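

<p class="wp-block-paragraph">To see every missing library at once instead of discovering them one error at a time, <code>ldd</code> can be used. This is a minimal sketch; the function names <code>filter_missing</code> and <code>missing_libs</code> are hypothetical helpers, not part of any tool.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
# Keep only the library names that ldd reports as "not found" (reads stdin).
filter_missing() {
    awk '/not found/ {print $1}'
}

# List the unresolved shared libraries of one binary.
missing_libs() {
    ldd "$1" | filter_missing
}
</pre></div>


<p class="wp-block-paragraph">For example, <code>missing_libs /u01/app/postgres/product/17/db_6/bin/postgres</code> prints one library name per line, and prints nothing once the binaries have been rebuilt.</p>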


<h3 class="wp-block-heading" id="h-1-recompile-postgresql">1. Recompile PostgreSQL</h3>



<p class="wp-block-paragraph">Since you installed from source, you must recompile PostgreSQL against the new RHEL 10 system libraries. Here is the command sequence I use to build it as the postgres user:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: plain; title: ; notranslate">
postgres@patroni2:/home/postgres/ &#x5B;dummy] MAJOR=&quot;17&quot;
postgres@patroni2:/home/postgres/ &#x5B;dummy] MINOR=&quot;6&quot;
postgres@patroni2:/home/postgres/ &#x5B;dummy] tar axf postgresql-${MAJOR}.${MINOR}.tar.gz
postgres@patroni2:/home/postgres/ &#x5B;dummy] mkdir build; cd $_
postgres@patroni2:/home/postgres/ &#x5B;dummy] export PGHOME=&quot;/u01/app/postgres/product/${MAJOR}/db_${MINOR}&quot;
postgres@patroni2:/home/postgres/ &#x5B;dummy] export SEGSIZE=2
postgres@patroni2:/home/postgres/ &#x5B;dummy] export BLOCKSIZE=8
postgres@patroni2:/home/postgres/ &#x5B;dummy] meson setup . ../postgresql-${MAJOR}.${MINOR}
postgres@patroni2:/home/postgres/ &#x5B;dummy] meson configure -Dprefix=${PGHOME} \
                  -Dbindir=${PGHOME}/bin \
                  -Ddatadir=${PGHOME}/share \
                  -Dincludedir=${PGHOME}/include \
                  -Dlibdir=${PGHOME}/lib \
                  -Dsysconfdir=${PGHOME}/etc \
                  -Dpgport=5432 \
                  -Dplperl=enabled \
                  -Dplpython=enabled \
                  -Dssl=openssl \
                  -Dpam=enabled \
                  -Dldap=enabled \
                  -Dlibxml=enabled \
                  -Dlibxslt=enabled \
                  -Dsegsize=${SEGSIZE} \
                  -Dblocksize=${BLOCKSIZE} \
                  -Dllvm=enabled \
                  -Duuid=ossp \
                  -Dzstd=enabled \
                  -Dlz4=enabled \
                  -Dgssapi=enabled \
                  -Dsystemd=enabled \
                  -Dicu=enabled \
                  -Dsystem_tzdata=/usr/share/zoneinfo \
                  -Dextra_version=&quot; dbi services build&quot;
postgres@patroni2:/home/postgres/ &#x5B;dummy] ninja
postgres@patroni2:/home/postgres/ &#x5B;dummy] ninja install
</pre></div>


<h3 class="wp-block-heading" id="h-2-restore-library-paths">2. Restore Library Paths</h3>



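<p class="wp-block-paragraph">As root, move the saved linker configuration back into place and refresh the linker cache:</p>



<p class="wp-block-paragraph"><code>[root@patroni2 ~]# mv /tmp/postgres.conf.bak /etc/ld.so.conf.d/postgres.conf</code><br><code>[root@patroni2 ~]# ldconfig</code></p>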



<h3 class="wp-block-heading" id="h-3-start-amp-fix-collation-mismatch">3. Start &amp; Fix Collation Mismatch</h3>



<p class="wp-block-paragraph">PostgreSQL will now start, but it warns about a collation version mismatch (2.34 vs. 2.39), because RHEL 10 ships a newer glibc than the one the databases were created with.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
15:09:55 postgres@patroni2:/home/postgres/build/ &#x5B;test-op-patroni] pgstart
waiting for server to start.... done
server started
15:10:28 postgres@patroni2:/home/postgres/build/ &#x5B;test-op-patroni] psql
WARNING:  database &quot;postgres&quot; has a collation version mismatch
DETAIL:  The database was created using collation version 2.34, but the operating system provides version 2.39.
HINT:  Rebuild all objects in this database that use the default collation and run ALTER DATABASE postgres REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
psql (17.6 dbi services build)
Type &quot;help&quot; for help.
</pre></div>


<p class="wp-block-paragraph">As the HINT suggests, rebuild the affected objects first and then record the new collation version. Inside Postgres, run for every database:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate">
REINDEX DATABASE postgres;

ALTER DATABASE postgres REFRESH COLLATION VERSION;
ALTER DATABASE

-- Repeat both statements in the other databases if applicable
</pre></div>
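

<p class="wp-block-paragraph">With many databases, these per-database steps can be scripted. The following is a minimal sketch under a few assumptions: it runs as the postgres user with <code>psql</code> in the PATH, database names contain no whitespace, and the function names <code>collation_fix_sql</code> and <code>fix_all_collations</code> are hypothetical. Because <code>REINDEX DATABASE</code> only works on the database you are connected to, the loop connects to each database in turn.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
# Print the statements for one database: rebuild first, then record the new version.
collation_fix_sql() {
    printf 'REINDEX DATABASE "%s";\n' "$1"
    printf 'ALTER DATABASE "%s" REFRESH COLLATION VERSION;\n' "$1"
}

# Run the statements in every database that accepts connections.
# Assumption: database names contain no whitespace.
fix_all_collations() {
    for db in $(psql -X -At -d postgres -c "SELECT datname FROM pg_database WHERE datallowconn"); do
        collation_fix_sql "$db" | psql -X -d "$db"
    done
}
</pre></div>


<p class="wp-block-paragraph">Running <code>collation_fix_sql postgres</code> on its own only prints the two statements, which is a convenient way to review them before calling <code>fix_all_collations</code>.</p>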


<p class="wp-block-paragraph">Your PostgreSQL is now starting properly and your server has been upgraded.</p>



<h3 class="wp-block-heading" id="h-4-in-case-of-a-patroni-cluster">4. In Case of a Patroni Cluster</h3>



<p class="wp-block-paragraph">Since the system Python version has changed with the OS upgrade, your old .local venv is invalid and must be recreated. Here is how I install Patroni as the postgres user:</p>



<p class="wp-block-paragraph">$ python3 -m venv .local<br>$ .local/bin/pip3 install --upgrade pip<br>$ .local/bin/pip3 install --upgrade setuptools<br>$ .local/bin/pip3 install wheel<br>$ .local/bin/pip3 install psycopg[binary]<br>$ .local/bin/pip3 install python-etcd<br>$ .local/bin/pip3 install patroni<br>$ .local/bin/patroni version</p>



<p class="wp-block-paragraph">Check whether the cluster sees the member again. If <code>patronictl list</code> shows no members, restarting the Patroni service is usually enough for the node to re-register with etcd.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
10:45:40 postgres@patroni2:/home/postgres/ &#x5B;test-op-patroni] patronictl list
+ Cluster: test-op-patroni (7565882985963789761) -+-----+------------+-----+
| Member | Host | Role | State | TL | Receive LSN | Lag | Replay LSN | Lag |
+--------+------+------+-------+----+-------------+-----+------------+-----+
+--------+------+------+-------+----+-------------+-----+------------+-----+
10:45:48 postgres@patroni2:/home/postgres/ &#x5B;test-op-patroni] sudo systemctl restart patroni
10:45:55 postgres@patroni2:/home/postgres/ &#x5B;test-op-patroni] patronictl list
+ Cluster: test-op-patroni (7565882985963789761) --+---------+----+-------------+-----+------------+-----+
| Member                 | Host           | Role   | State   | TL | Receive LSN | Lag | Replay LSN | Lag |
+------------------------+----------------+--------+---------+----+-------------+-----+------------+-----+
| patroni-tst-op-geapg02 | 192.168.56.142 | Leader | running | 11 |             |     |            |     |
+------------------------+----------------+--------+---------+----+-------------+-----+------------+-----+

</pre></div>


<h2 class="wp-block-heading" id="h-iv-conclusion">IV. Conclusion</h2>



<p class="wp-block-paragraph">Upgrading from RHEL 9.6 to 10.1 is a big move. It’s not just a simple update; it’s a total shift in the system&#8217;s foundation. Between hardware driver changes and library updates, you really have to pay attention to the details to keep your database running.</p>



<p class="wp-block-paragraph">RHEL 10.1 is a great, modern platform, but you can&#8217;t just click &#8220;update&#8221; and hope for the best. By planning the upgrade, rebuilding your binaries, and refreshing your database objects, you can make the jump without the drama. Take the pre-upgrade report seriously: it&#8217;s there for a reason!</p>



<p class="wp-block-paragraph">That said, for a production PostgreSQL cluster, especially one managed with Patroni and etcd, I would not recommend this in-place upgrade approach. Even if Leapp makes the process technically possible, you are still:</p>



<ul class="wp-block-list">
<li>Modifying the operating system in place</li>



<li>Replacing core libraries underneath a running database stack</li>



<li>Trusting automated dependency resolution during a major version jump</li>
</ul>



<p class="wp-block-paragraph">In production, risk reduction should always be the priority.</p>



<p class="wp-block-paragraph">Instead, I strongly recommend provisioning new VMs or physical servers, installing RHEL 10.1 from scratch, deploying PostgreSQL, Patroni, and etcd cleanly, rebuilding the cluster following best practices, and then migrating the data from the old environment to the new one using replication or another appropriate method.</p>



<p class="wp-block-paragraph">Sometimes the safest upgrade… is a new cluster.</p>
<p>The post <a href="https://www.dbi-services.com/blog/upgrade-rhel-from-9-6-to-10-1-when-running-postgresql-patroni/">Upgrade RHEL from 9.6 to 10.1 (when running PostgreSQL/Patroni)</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.dbi-services.com/blog/upgrade-rhel-from-9-6-to-10-1-when-running-postgresql-patroni/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Zabbix &#8211; Monitoring full cluster Patroni/PostgreSQL</title>
		<link>https://www.dbi-services.com/blog/zabbix-monitoring-full-cluster-patroni-postgresql/</link>
					<comments>https://www.dbi-services.com/blog/zabbix-monitoring-full-cluster-patroni-postgresql/#respond</comments>
		
		<dc:creator><![CDATA[Aurélien Py]]></dc:creator>
		<pubDate>Fri, 12 Jun 2026 14:06:13 +0000</pubDate>
				<category><![CDATA[Database Administration & Monitoring]]></category>
		<category><![CDATA[Monitoring]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[Zabbix]]></category>
		<category><![CDATA[Patroni]]></category>
		<category><![CDATA[postgresql]]></category>
		<guid isPermaLink="false">https://www.dbi-services.com/blog/?p=44284</guid>

					<description><![CDATA[<p>Introduction Monitoring a PostgreSQL environment involves much more than simply checking whether the database is running. In a modern high availability architecture, multiple components work together to keep the service available and stable. In this article, we will explain how to monitor a PostgreSQL cluster environment with Zabbix: The goal of this blog is to [&#8230;]</p>
<p>The post <a href="https://www.dbi-services.com/blog/zabbix-monitoring-full-cluster-patroni-postgresql/">Zabbix &#8211; Monitoring full cluster Patroni/PostgreSQL</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<h1 class="wp-block-heading" id="h-introduction">Introduction</h1>



<p class="wp-block-paragraph">Monitoring a PostgreSQL environment involves much more than simply checking whether the database is running. In a modern high availability architecture, multiple components work together to keep the service available and stable.</p>



<p class="wp-block-paragraph">In this article, we will explain how to monitor a PostgreSQL cluster environment with Zabbix:</p>



<ul class="wp-block-list">
<li>PostgreSQL</li>



<li>Patroni</li>



<li>HAProxy</li>



<li>ETCD</li>



<li>Keepalived</li>
</ul>



<p class="wp-block-paragraph">The goal of this blog is to provide a simple and clear overview that can be understood by both beginners and experienced administrators.</p>



<span id="more-44284"></span>



<h2 class="wp-block-heading" id="h-postgresql-high-availability-architecture">PostgreSQL High Availability Architecture</h2>



<p class="wp-block-paragraph">A typical PostgreSQL high availability environment often contains several components:</p>



<figure class="wp-block-table"><table class="has-fixed-layout"><thead><tr><th>Component</th><th>Role</th></tr></thead><tbody><tr><td>PostgreSQL</td><td>Database engine</td></tr><tr><td>Patroni</td><td>PostgreSQL cluster management and failover</td></tr><tr><td>ETCD</td><td>Distributed key-value store used by Patroni</td></tr><tr><td>HAProxy</td><td>Load balancing and traffic routing</td></tr><tr><td>Keepalived</td><td>Virtual IP management and failover</td></tr><tr><td>Zabbix</td><td>Monitoring and alerting platform</td></tr></tbody></table></figure>



<p class="wp-block-paragraph">In a PostgreSQL HA environment, monitoring only the database is not enough: a failure in HAProxy, ETCD, or Keepalived can directly impact cluster stability and application availability. Comprehensive monitoring ensures the entire ecosystem remains healthy and operational.</p>



<h2 class="wp-block-heading" id="h-monitoring-postgresql-with-zabbix">Monitoring PostgreSQL with Zabbix</h2>



<p class="wp-block-paragraph">Zabbix already provides an official PostgreSQL <a href="https://www.zabbix.com/integrations/postgresql#postgresql_agent2">template</a>.</p>



<p class="wp-block-paragraph">This template allows administrators to monitor:</p>



<ul class="wp-block-list">
<li>Database availability</li>



<li>Connections</li>



<li>Transactions</li>



<li>Cache hit ratio</li>



<li>Replication status</li>



<li>Locks</li>



<li>Deadlocks</li>



<li>Database size</li>



<li>Query statistics</li>



<li>WAL activity</li>



<li>Performance metrics</li>
</ul>



<p class="wp-block-paragraph">Therefore, using the official template is usually the fastest and easiest way to start monitoring PostgreSQL.</p>



<h2 class="wp-block-heading" id="h-etcd">ETCD</h2>



<p class="wp-block-paragraph">Zabbix already provides an official template for <a href="https://www.zabbix.com/integrations/etcd">ETCD</a> monitoring.</p>



<p class="wp-block-paragraph">The template uses HTTP Agent items to collect metrics directly from the <code>/metrics</code> endpoint exposed by ETCD.</p>



<p class="wp-block-paragraph">In addition, the template works without external scripts and uses Prometheus-style metrics collection.</p>



<p class="wp-block-paragraph">The official template already includes many useful triggers and metrics for monitoring ETCD health, performance, and cluster activity.</p>



<figure class="wp-block-image size-full is-resized"><img fetchpriority="high" decoding="async" width="269" height="300" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image.png" alt="" class="wp-image-44286" style="aspect-ratio:0.896661284441898;width:220px;height:auto" /></figure>



<h2 class="wp-block-heading" id="h-haproxy">HAProxy</h2>



<p class="wp-block-paragraph">HAProxy plays a critical role in a PostgreSQL high availability environment, ensuring that client connections are automatically routed to the correct PostgreSQL node and maintaining application connectivity during failovers.</p>



<p class="wp-block-paragraph">Zabbix already provides an official HAProxy monitoring template that can be used as a base for monitoring and customization: <a href="https://www.zabbix.com/integrations/haproxy">HAProxy by HTTP</a></p>



<p class="wp-block-paragraph">The number of active servers in HAProxy is also monitored by the template. However, the template does not create a trigger by default for this metric. This means that a backend could lose one or more active servers without generating any alert in Zabbix. For this reason, it is highly recommended to create custom triggers on this item to immediately detect backend availability issues and avoid unnoticed service degradation.</p>



<p class="wp-block-paragraph">To implement this properly, a new macro should be created in the template to define the minimum expected number of active servers. Then, a custom trigger must be added on the item prototype available in the Low-Level Discovery (LLD) section of the template. Consequently, Zabbix can automatically monitor and alert on all discovered HAProxy backends.</p>



<figure class="wp-block-image size-full"><img decoding="async" width="497" height="115" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-1.png" alt="" class="wp-image-44287" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-1.png 497w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-1-300x69.png 300w" sizes="(max-width: 497px) 100vw, 497px" /></figure>



<h2 class="wp-block-heading" id="h-patroni">Patroni</h2>



<p class="wp-block-paragraph">Zabbix does not directly include Patroni monitoring in the standard PostgreSQL templates.</p>



<p class="wp-block-paragraph">One approach is to use the Zabbix <a href="https://www.zabbix.com/integrations/systemd">Systemd</a> template, which allows monitoring of services running on the server. In this case, you only need to filter on the <code>patroni</code> service name to verify that the service is active.</p>



<p class="wp-block-paragraph">Another approach is to create a custom item in a template using a Zabbix agent item with the following key:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
net.tcp.service&#x5B;&quot;{$HAPROXY.STATS.SCHEME}&quot;,&quot;{$HAPROXY.STATS.HOST}&quot;,&quot;{$HAPROXY.STATS.PORT.PATRONI}&quot;]
</pre></div>


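<p class="wp-block-paragraph">This item runs an HTTP service check over TCP against the port defined in <code>{$HAPROXY.STATS.PORT.PATRONI}</code>, which is <code>5000</code> in this setup. Note that Patroni&#8217;s REST API listens on port <code>8008</code> by default, so set the macro to match your configuration. This allows Zabbix to verify that the Patroni service responds and remains reachable.</p>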
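


<p class="wp-block-paragraph">Before creating the item, the same reachability test can be tried by hand. The snippet below is a minimal sketch, not what the Zabbix agent runs internally: it assumes <code>curl</code> is installed and that the port you pass is the one where the Patroni REST API listens, and it uses the API&#8217;s <code>/health</code> endpoint. The function names, host, and port are placeholders.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
# Return 0 only when the HTTP status code is 200.
is_http_ok() {
    [ "$1" = "200" ]
}

# Print the HTTP status code of the health endpoint (000 when nothing answers).
# $1 = host, $2 = port (both are placeholders for your environment).
patroni_health_code() {
    curl -s -o /dev/null --max-time 3 -w '%{http_code}' "http://$1:$2/health"
}
</pre></div>


<p class="wp-block-paragraph">A manual test then looks like <code>if is_http_ok "$(patroni_health_code patroni-host 8008)"; then echo up; else echo down; fi</code>.</p>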



<p class="wp-block-paragraph">Combining both methods makes the monitoring more reliable: the service check catches a stopped Patroni process, while the TCP check catches a process that is running but no longer reachable.</p>



<h2 class="wp-block-heading" id="h-monitoring-keepalived">Monitoring Keepalived</h2>



<p class="wp-block-paragraph">Keepalived monitoring is not included by default in the standard Zabbix templates.<br>To properly monitor the VIP management layer, administrators should create custom items that check whether the Virtual IP (VIP) is present on one of the cluster nodes.</p>



<p class="wp-block-paragraph">A common approach is to create three different items:</p>



<ul class="wp-block-list">
<li>Two items using the <code>net.tcp.service</code> key to test TCP connectivity for:
<ul class="wp-block-list">
<li>the READ-ONLY access</li>



<li>the READ-WRITE access</li>
</ul>
</li>



<li>One additional item using a <code>system.run</code> command to directly verify the VIP status on the server.</li>
</ul>



<figure class="wp-block-image size-full"><img decoding="async" width="637" height="98" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-2.png" alt="" class="wp-image-44288" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-2.png 637w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-2-300x46.png 300w" sizes="(max-width: 637px) 100vw, 637px" /></figure>
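


<p class="wp-block-paragraph">For the <code>system.run</code> item, the command only has to tell whether the node currently holds the VIP. The following is a minimal sketch of such a check, not the exact command used in the item above: the function names and the address <code>192.0.2.10</code> are hypothetical, and the agent must be configured to allow <code>system.run</code> for the item to work.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
# Succeed when the address appears in "ip -o -4 addr show" output read from stdin.
# The trailing "/" stops 192.0.2.1 from matching 192.0.2.10.
vip_in_output() {
    grep -q -F " inet $1/"
}

# Print 1 when this node holds the VIP and 0 otherwise,
# so the item returns a numeric value that a trigger can compare.
vip_present() {
    if ip -o -4 addr show | vip_in_output "$1"; then
        echo 1
    else
        echo 0
    fi
}
</pre></div>


<p class="wp-block-paragraph">Calling <code>vip_present 192.0.2.10</code> on each node should print <code>1</code> on exactly one of them.</p>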



<p class="wp-block-paragraph">Together, these checks help ensure that:</p>



<ul class="wp-block-list">
<li>the VIP is correctly assigned</li>



<li>the PostgreSQL services are reachable</li>



<li>the failover mechanism is working properly</li>
</ul>



<p class="wp-block-paragraph">Finally, administrators should create the appropriate triggers for these items in order to immediately detect VIP failover or connectivity issues.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="380" height="191" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-3.png" alt="" class="wp-image-44289" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-3.png 380w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-3-300x151.png 300w" sizes="auto, (max-width: 380px) 100vw, 380px" /></figure>



<h2 class="wp-block-heading" id="h-conclusion">Conclusion</h2>



<p class="wp-block-paragraph">In conclusion, monitoring PostgreSQL alone is not enough in a high availability environment.</p>



<p class="wp-block-paragraph">A complete monitoring strategy must include:</p>



<ul class="wp-block-list">
<li>PostgreSQL</li>



<li>Patroni</li>



<li>HAProxy</li>



<li>ETCD</li>



<li>Keepalived</li>
</ul>



<p class="wp-block-paragraph">Using Zabbix with both official and custom templates provides a centralized and efficient monitoring solution.</p>



<p class="wp-block-paragraph">The PostgreSQL template available in Zabbix already offers excellent database monitoring capabilities.</p>



<p class="wp-block-paragraph">Additional templates for ETCD and HAProxy help extend monitoring to the full HA architecture and improve visibility across the entire platform.</p>



<p class="wp-block-paragraph">As a result, this approach helps administrators detect failures earlier, improve troubleshooting, and ensure better service availability.</p>



<p class="wp-block-paragraph">You can find more blog posts about Zabbix and PostgreSQL here: <a href="https://www.dbi-services.com/blog/">dbi Blog</a></p>
<p>The post <a href="https://www.dbi-services.com/blog/zabbix-monitoring-full-cluster-patroni-postgresql/">Zabbix &#8211; Monitoring full cluster Patroni/PostgreSQL</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.dbi-services.com/blog/zabbix-monitoring-full-cluster-patroni-postgresql/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Zabbix &#8211; Oracle backups monitoring</title>
		<link>https://www.dbi-services.com/blog/zabbix-oracle-backups-monitoring/</link>
					<comments>https://www.dbi-services.com/blog/zabbix-oracle-backups-monitoring/#respond</comments>
		
		<dc:creator><![CDATA[Aurélien Py]]></dc:creator>
		<pubDate>Tue, 26 May 2026 15:11:12 +0000</pubDate>
				<category><![CDATA[Database Administration & Monitoring]]></category>
		<category><![CDATA[Monitoring]]></category>
		<category><![CDATA[Zabbix]]></category>
		<category><![CDATA[Backup]]></category>
		<category><![CDATA[Oracle]]></category>
		<guid isPermaLink="false">https://www.dbi-services.com/blog/?p=42829</guid>

					<description><![CDATA[<p>If your Oracle database is running in ARCHIVELOG mode, monitoring your backups is essential. Without proper backup supervision, the Fast Recovery Area (FRA) can eventually become full, which may block archive log generation and, in the worst case, stop database activity. In this article, we will see how to monitor Oracle backups using Zabbix and the [&#8230;]</p>
<p>The post <a href="https://www.dbi-services.com/blog/zabbix-oracle-backups-monitoring/">Zabbix &#8211; Oracle backups monitoring</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">If your Oracle database is running in ARCHIVELOG mode, monitoring your backups is essential.<br>Without proper backup supervision, the Fast Recovery Area (FRA) can eventually become full, which may block archive log generation and, in the worst case, stop database activity.</p>



<span id="more-42829"></span>



<p class="wp-block-paragraph">In this article, we will see how to monitor Oracle backups using Zabbix and the Oracle plugin available with Zabbix Agent 2.</p>



<p class="wp-block-paragraph">This guide assumes that:</p>



<ul class="wp-block-list">
<li>ARCHIVELOG mode is already enabled, </li>



<li>RMAN backups are already configured, </li>



<li>and Zabbix Agent 2 is already installed on the Oracle server.</li>
</ul>



<h2 class="wp-block-heading" id="h-setup-zabbix-agent-2">Setup Zabbix agent 2</h2>



<p class="wp-block-paragraph">By default, the Oracle template provided with Zabbix does not include items dedicated to backup monitoring.<br>To add them, we will use the Oracle plugin feature called CustomQueries. The documentation for this plugin is available on the <a href="https://www.zabbix.com/documentation/current/en/manual/appendix/config/zabbix_agent2_plugins/oracle_plugin" id="https://www.zabbix.com/documentation/current/en/manual/appendix/config/zabbix_agent2_plugins/oracle_plugin">official website</a>.</p>



<p class="wp-block-paragraph">The approach is straightforward:</p>



<ul class="wp-block-list">
<li>create custom SQL queries, </li>



<li>store them locally on the database server, </li>



<li>and let Zabbix execute them periodically</li>
</ul>



<p class="wp-block-paragraph">First, create a directory that will contain your monitoring SQL scripts:</p>



<pre class="wp-block-code"><code>mkdir -p /etc/zabbix/oracle/sql 

chmod 755 -R /etc/zabbix/oracle/sql</code></pre>



<p class="wp-block-paragraph" id="h-create-a-sql-script-that-returns-the-timestamp-of-the-latest-level-0-backup">Create a SQL script that returns the timestamp of the latest level 0 backup.</p>



<pre class="wp-block-code"><code><code>vi /etc/zabbix/oracle/sql/last_inc0.sql</code></code></pre>



<pre class="wp-block-code"><code><code>SELECT
    CAST(((CAST(BS.COMPLETION_TIME AS DATE) - DATE '1970-01-01') * 86400) AS NUMERIC) AS LAST_INC0
FROM
    V$BACKUP_SET BS
WHERE
    BS.INCREMENTAL_LEVEL = 0
ORDER BY
    BS.COMPLETION_TIME DESC
FETCH FIRST 1 ROWS ONLY;</code></code></pre>



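<p class="wp-block-paragraph">This query retrieves the completion time of the latest RMAN level 0 backup and converts it into Unix epoch format, which makes preprocessing and trigger calculations much easier inside Zabbix. Keep in mind that <code>COMPLETION_TIME</code> carries no time zone, so the value is a true Unix epoch only when the database server clock runs in UTC; otherwise it is shifted by the local offset.</p>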



<p class="wp-block-paragraph">You can easily adapt this query to monitor:</p>



<ul class="wp-block-list">
<li>archive log backups,</li>



<li>level 1 incremental backups,</li>



<li>controlfile backups,</li>



<li>or even backup duration.</li>
</ul>



<p class="wp-block-paragraph" id="h-configure-the-oracle-plugin">Configure the Oracle Plugin</p>



<p class="wp-block-paragraph">Now edit the Oracle plugin configuration file:</p>



<pre class="wp-block-code"><code><code>vi /etc/zabbix/zabbix_agent2.d/plugins.d/oracle.conf</code></code></pre>



<pre class="wp-block-code"><code><code>Plugins.Oracle.CustomQueriesPath=/etc/zabbix/oracle/sql/</code></code></pre>



<p class="wp-block-paragraph">With this configuration, the Oracle plugin knows exactly where to find your custom SQL scripts.</p>



<p class="wp-block-paragraph">Afterward, restart the Zabbix Agent 2 service:</p>



<pre class="wp-block-code"><code>systemctl restart zabbix-agent2</code></pre>



<h2 class="wp-block-heading" id="h-setup-zabbix-template">Setup Zabbix template</h2>



<p class="wp-block-paragraph">In your Oracle template, create a new <strong>Zabbix agent item</strong> using the following key:</p>



<p class="wp-block-paragraph">oracle.custom.query[&quot;{$ORACLE.CONNSTRING}&quot;,&quot;{$ORACLE.USER}&quot;,&quot;{$ORACLE.PASSWORD}&quot;,&quot;{$ORACLE.SERVICE}&quot;,<strong>last_inc0</strong>]</p>



<p class="wp-block-paragraph">Make sure that:</p>



<ul class="wp-block-list">
<li>the query name matches your SQL file name (<code>last_inc0.sql</code>),</li>



<li>and the preprocessing steps are correctly configured.</li>
</ul>



<p class="wp-block-paragraph">Once the item becomes active, Zabbix immediately starts collecting the timestamp of the latest level 0 backup.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="786" height="286" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/02/Screenshot-2026-02-10-153119.png" alt="" class="wp-image-42831" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/02/Screenshot-2026-02-10-153119.png 786w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/02/Screenshot-2026-02-10-153119-300x109.png 300w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/02/Screenshot-2026-02-10-153119-768x279.png 768w" sizes="auto, (max-width: 786px) 100vw, 786px" /></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="813" height="202" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/02/Screenshot-2026-02-10-153316.png" alt="" class="wp-image-42832" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/02/Screenshot-2026-02-10-153316.png 813w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/02/Screenshot-2026-02-10-153316-300x75.png 300w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/02/Screenshot-2026-02-10-153316-768x191.png 768w" sizes="auto, (max-width: 813px) 100vw, 813px" /></figure>



<p class="wp-block-paragraph">You can now see the time of your last backup. All that remains is to create a trigger on this item, and you are done!</p>
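


<p class="wp-block-paragraph">The trigger logic boils down to comparing the age of the last level 0 backup with a threshold. Here is a minimal Python sketch of that comparison. It assumes the item value is a Unix timestamp in seconds once preprocessing is done, and that a 7-day limit suits your backup policy. Both are assumptions, and the function and parameter names are purely illustrative:</p>

```python
from datetime import datetime, timedelta, timezone


def backup_is_stale(last_backup_epoch, max_age=timedelta(days=7), now=None):
    """Return True when the last level 0 backup is older than max_age."""
    # 'now' is injectable so the check can be exercised with fixed dates.
    now = now or datetime.now(timezone.utc)
    last_backup = datetime.fromtimestamp(last_backup_epoch, tz=timezone.utc)
    return now - last_backup > max_age


# Fixed reference date so the example is reproducible.
reference = datetime(2026, 2, 10, 15, 0, tzinfo=timezone.utc)
recent = (reference - timedelta(days=2)).timestamp()
old = (reference - timedelta(days=9)).timestamp()

print(backup_is_stale(recent, now=reference))  # False
print(backup_is_stale(old, now=reference))     # True
```

<p class="wp-block-paragraph">In Zabbix, the same comparison is expressed as a trigger expression on the item. The threshold should follow the frequency of your level 0 backups.</p>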



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="25" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/02/Screenshot-2026-02-10-160538-1024x25.png" alt="" class="wp-image-42834" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/02/Screenshot-2026-02-10-160538-1024x25.png 1024w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/02/Screenshot-2026-02-10-160538-300x7.png 300w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/02/Screenshot-2026-02-10-160538-768x18.png 768w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/02/Screenshot-2026-02-10-160538.png 1041w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<h2 class="wp-block-heading" id="h-conclusion">Conclusion</h2>



<p class="wp-block-paragraph">By using custom Oracle queries with Zabbix Agent 2, you can quickly extend the default Oracle monitoring capabilities with very little configuration effort.</p>



<p class="wp-block-paragraph">More importantly, this approach helps you detect backup issues early before the FRA fills up and impacts production systems.</p>



<p class="wp-block-paragraph">Because the solution remains fully customizable, you can easily extend it later to monitor additional RMAN metrics that match your operational requirements.</p>



<p class="wp-block-paragraph">You can find other blog posts about Zabbix at this <a href="https://www.dbi-services.com/blog/tag/zabbix/page/3/">link</a>.</p>



<p>The post <a href="https://www.dbi-services.com/blog/zabbix-oracle-backups-monitoring/">Zabbix &#8211; Oracle backups monitoring</a> appeared first on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.dbi-services.com/blog/zabbix-oracle-backups-monitoring/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>SQL Server Snapshot Backup and Restore with Proxmox ZFS &#8211; REST API with SQL Server 2025 (3/3)</title>
		<link>https://www.dbi-services.com/blog/sql-server-snapshot-backup-and-restore-with-proxmox-zfs-rest-api-with-sql-server-2025-3-3/</link>
					<comments>https://www.dbi-services.com/blog/sql-server-snapshot-backup-and-restore-with-proxmox-zfs-rest-api-with-sql-server-2025-3-3/#respond</comments>
		
		<dc:creator><![CDATA[Amine Haloui]]></dc:creator>
		<pubDate>Thu, 14 May 2026 21:39:18 +0000</pubDate>
				<category><![CDATA[Database Administration & Monitoring]]></category>
		<category><![CDATA[Database management]]></category>
		<category><![CDATA[Operating systems]]></category>
		<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[proxmox]]></category>
		<category><![CDATA[ZFS]]></category>
		<guid isPermaLink="false">https://www.dbi-services.com/blog/?p=44525</guid>

					<description><![CDATA[<p>The proposed architecture consists in adding a small internal REST API on the Proxmox server in order to expose a controlled ZFS snapshot operation. SQL Server 2025 can then call this API through sp_invoke_external_rest_endpoint, instead of running SSH commands directly or relying on an external tool. The role of the API is deliberately limited: it [&#8230;]</p>
<p>The post <a href="https://www.dbi-services.com/blog/sql-server-snapshot-backup-and-restore-with-proxmox-zfs-rest-api-with-sql-server-2025-3-3/">SQL Server Snapshot Backup and Restore with Proxmox ZFS &#8211; REST API with SQL Server 2025 (3/3)</a> appeared first on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">The proposed architecture adds a small internal REST API on the Proxmox server to expose a controlled ZFS snapshot operation. SQL Server 2025 can then call this API through sp_invoke_external_rest_endpoint, instead of running SSH commands directly or relying on an external tool.</p>



<p class="wp-block-paragraph">The role of the API is deliberately limited: it receives a snapshot request, checks that the requested zvol is authorized, and then runs the zfs snapshot command on the Proxmox side. An allowlist is used to restrict the ZFS volumes that can be accessed. This prevents a REST call from being able to manipulate any dataset on the server.</p>



<p class="wp-block-paragraph">With this approach, we can reproduce behavior close to what an enterprise storage array provides, but using Proxmox and ZFS. Note that Proxmox does not natively provide the same level of SQL Server snapshot integration as Pure Storage, which ships dedicated mechanisms and integrations, so we need to build our own orchestration layer. The REST API therefore acts as an adapter between SQL Server, which drives the snapshot backup workflow, and ZFS, which actually performs the storage-level snapshot.</p>



<h2 class="wp-block-heading" id="h-architecture">Architecture</h2>



<p class="wp-block-paragraph">Here is a global overview of the architecture:</p>



<ul class="wp-block-list">
<li>SQL Server freezes the database I/Os</li>



<li>SQL Server 2025 calls the internal REST API</li>



<li>The REST API validates the request and checks the zvol allowlist</li>



<li>The API triggers the ZFS snapshot on Proxmox</li>



<li>The API returns the snapshot information to SQL Server</li>



<li>SQL Server creates the metadata-only backup</li>



<li>The database I/Os are released</li>
</ul>
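


<p class="wp-block-paragraph">The ordering of these steps matters, because the database stays frozen until the workflow completes or is rolled back. Here is a minimal Python sketch of that control flow. The four callables are hypothetical stand-ins for the real operations (T-SQL statements and the REST call), not part of the actual implementation:</p>

```python
def run_snapshot_backup(freeze, take_snapshot, write_metadata_backup, unfreeze):
    """Run the workflow in order and release the freeze if any step fails."""
    freeze()
    try:
        snapshot_info = take_snapshot()
        # In the workflow above, the database I/Os are released once the
        # metadata-only backup has been created.
        write_metadata_backup(snapshot_info)
    except Exception:
        # A frozen database blocks writes, so always undo the freeze on error.
        unfreeze()
        raise
    return snapshot_info


# Stand-in steps that only record the order in which they are called.
calls = []
result = run_snapshot_backup(
    freeze=lambda: calls.append("freeze"),
    take_snapshot=lambda: calls.append("snapshot") or "sqlpool/pve/vm-302-disk-0@demo",
    write_metadata_backup=lambda info: calls.append("backup:" + info),
    unfreeze=lambda: calls.append("unfreeze"),
)
print(calls)
```

<p class="wp-block-paragraph">On the happy path the explicit unfreeze is never called. It only serves as a safety net when the snapshot or the backup fails.</p>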



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="998" height="1024" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-65-998x1024.png" alt="" class="wp-image-44526" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-65-998x1024.png 998w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-65-292x300.png 292w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-65-768x788.png 768w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-65-1496x1536.png 1496w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-65-1995x2048.png 1995w" sizes="auto, (max-width: 998px) 100vw, 998px" /></figure>



<h2 class="wp-block-heading">REST API implementation</h2>



<p class="wp-block-paragraph">Under Proxmox, we install the required packages:</p>



<pre class="wp-block-code"><code>apt update
apt install -y python3-venv sudo openssl</code></pre>



<p class="wp-block-paragraph">We create a dedicated user:</p>



<pre class="wp-block-code"><code>useradd --system \
&nbsp; --home /opt/sql-zfs-api \
&nbsp; --shell /usr/sbin/nologin \
&nbsp; sqlsnap</code></pre>



<p class="wp-block-paragraph">We create the following folders:</p>



<pre class="wp-block-code"><code>mkdir -p /opt/sql-zfs-api
mkdir -p /etc/sql-zfs-api</code></pre>



<p class="wp-block-paragraph">We declare the authorized zvols:</p>



<pre class="wp-block-code"><code>cat &gt;/etc/sql-zfs-api/allowed-zvols &lt;&lt;'EOF'
sqlpool/pve/vm-302-disk-0
EOF</code></pre>



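<p class="wp-block-paragraph">We make the allowlist writable by root only. The sqlsnap group keeps read access, because the API service runs as sqlsnap and loads this file before calling the helper:</p>



<pre class="wp-block-code"><code>chown root:sqlsnap /etc/sql-zfs-api/allowed-zvols
chmod 640 /etc/sql-zfs-api/allowed-zvols</code></pre>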



<p class="wp-block-paragraph">Then we create the secured ZFS helper. This script is executed as root through sudo, but it rejects any dataset that is not defined in the allowlist.</p>



<pre class="wp-block-code"><code>cat &gt;/usr/local/sbin/sql-zfs-helper &lt;&lt;'EOF'
#!/usr/bin/env bash
set -euo pipefail

ALLOW_FILE="/etc/sql-zfs-api/allowed-zvols"
LOCK_FILE="/run/sql-zfs-helper.lock"

die() {
  echo "$*" &gt;&amp;2
  exit 1
}

exec 9&gt;"$LOCK_FILE"
flock -n 9 || die "another snapshot operation is already running"

&#091;&#091; -r "$ALLOW_FILE" ]] || die "allowlist not readable: $ALLOW_FILE"

mapfile -t ALLOWED_DATASETS &lt; &lt;(grep -Ev '^\s*(#|$)' "$ALLOW_FILE")

is_allowed() {
  local ds="$1"
  local allowed
  for allowed in "${ALLOWED_DATASETS&#091;@]}"; do
    &#091;&#091; "$ds" == "$allowed" ]] &amp;&amp; return 0
  done
  return 1
}

valid_snapname() {
  &#091;&#091; "$1" =~ ^&#091;A-Za-z0-9_.:-]{1,120}$ ]]
}

ACTION="${1:-}"
shift || true

case "$ACTION" in
  snapshot)
    SNAPNAME="${1:-}"
    shift || true

    valid_snapname "$SNAPNAME" || die "invalid snapshot name: $SNAPNAME"
    &#091;&#091; "$#" -ge 1 ]] || die "no zvol specified"
    &#091;&#091; "$#" -le 8 ]] || die "too many zvols"

    SNAPSHOTS=()

    for DS in "$@"; do
      is_allowed "$DS" || die "dataset not allowed: $DS"
      /sbin/zfs list -H -t volume -o name "$DS" &gt;/dev/null 2&gt;&amp;1 || die "zvol not found: $DS"

      FULLSNAP="${DS}@${SNAPNAME}"

      if /sbin/zfs list -H -t snapshot -o name "$FULLSNAP" &gt;/dev/null 2&gt;&amp;1; then
        die "snapshot already exists: $FULLSNAP"
      fi

      SNAPSHOTS+=("$FULLSNAP")
    done

    /sbin/zfs snapshot "${SNAPSHOTS&#091;@]}"
    /sbin/zfs hold sqlsnap "${SNAPSHOTS&#091;@]}"

    printf '{"status":"ok","snapshots":&#091;'
    SEP=""
    for S in "${SNAPSHOTS&#091;@]}"; do
      printf '%s"%s"' "$SEP" "$S"
      SEP=","
    done
    printf ']}\n'
    ;;

  list)
    /sbin/zfs list -H -t snapshot -o name -r sqlpool | grep '@sql_' || true
    ;;

  *)
    die "usage: sql-zfs-helper snapshot SNAPNAME ZVOL &#091;ZVOL...] | sql-zfs-helper list"
    ;;
esac
EOF

chown root:root /usr/local/sbin/sql-zfs-helper
chmod 750 /usr/local/sbin/sql-zfs-helper
</code></pre>
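


<p class="wp-block-paragraph">To see which requests the helper would accept, its checks can be replayed outside of Bash. The following Python sketch uses the same snapshot name pattern and the same limits as the script above. The function names are illustrative, and unlike the helper's grep filter, the allowlist parser also strips surrounding whitespace:</p>

```python
import re

# Same pattern as the helper's valid_snapname check.
SNAP_RE = re.compile(r"^[A-Za-z0-9_.:-]{1,120}$")


def parse_allowlist(text):
    """Keep non-empty lines that are not comments."""
    entries = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            entries.append(stripped)
    return entries


def check_request(snapname, zvols, allowed):
    """Return a rejection reason, or None when the request would be accepted."""
    if not SNAP_RE.fullmatch(snapname):
        return "invalid snapshot name"
    if not zvols:
        return "no zvol specified"
    if len(zvols) > 8:
        return "too many zvols"
    for zvol in zvols:
        if zvol not in allowed:
            return "dataset not allowed: " + zvol
    return None


allowed = parse_allowlist("# SQL Server data disk\n\nsqlpool/pve/vm-302-disk-0\n")
good_zvols = ["sqlpool/pve/vm-302-disk-0"]

print(check_request("sql_StackOverflow_20260514T213918Z", good_zvols, allowed))
print(check_request("bad name; reboot", good_zvols, allowed))
print(check_request("snap1", ["rpool/ROOT"], allowed))
```

<p class="wp-block-paragraph">The first call returns None (accepted). The other two are rejected, one because of the snapshot name and one because the dataset is not in the allowlist.</p>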



<p class="wp-block-paragraph">We only allow the helper through sudo:</p>



<pre class="wp-block-code"><code>cat &gt;/etc/sudoers.d/sql-zfs-helper &lt;&lt;'EOF'
sqlsnap ALL=(root) NOPASSWD: /usr/local/sbin/sql-zfs-helper *
EOF

chmod 440 /etc/sudoers.d/sql-zfs-helper
visudo -cf /etc/sudoers.d/sql-zfs-helper</code></pre>



<p class="wp-block-paragraph">We install the FastAPI API:</p>



<pre class="wp-block-code"><code>python3 -m venv /opt/sql-zfs-api/venv
/opt/sql-zfs-api/venv/bin/pip install fastapi "uvicorn&#091;standard]"</code></pre>



<p class="wp-block-paragraph">We create the application file:</p>



<pre class="wp-block-code"><code>cat &gt;/opt/sql-zfs-api/app.py &lt;&lt;'EOF'
import os
import re
import socket
import secrets
import subprocess
from datetime import datetime, timezone
from fastapi import FastAPI, Header, HTTPException
from pydantic import BaseModel, Field

API_KEY = os.environ.get("SQL_ZFS_API_KEY", "")
ALLOW_FILE = "/etc/sql-zfs-api/allowed-zvols"
SNAP_RE = re.compile(r"^&#091;A-Za-z0-9_.:-]{1,120}$")

app = FastAPI(title="SQL ZFS Snapshot API", version="1.0.0")


class SnapshotRequest(BaseModel):
    database: str = Field(..., min_length=1, max_length=128)
    vmid: int = 302
    snapname: str = Field(..., min_length=1, max_length=120)
    zvols: list&#091;str] = Field(..., min_length=1, max_length=8)


def load_allowed_zvols() -&gt; set&#091;str]:
    with open(ALLOW_FILE, "r", encoding="utf-8") as f:
        return {
            line.strip()
            for line in f
            if line.strip() and not line.strip().startswith("#")
        }


def check_api_key(x_sqlsnap_key: str | None) -&gt; None:
    if not API_KEY:
        raise HTTPException(status_code=500, detail="API key not configured")

    if not x_sqlsnap_key:
        raise HTTPException(status_code=401, detail="missing API key")

    if not secrets.compare_digest(x_sqlsnap_key, API_KEY):
        raise HTTPException(status_code=403, detail="invalid API key")


@app.get("/health")
def health():
    return {
        "status": "ok",
        "host": socket.gethostname(),
        "utc": datetime.now(timezone.utc).isoformat(),
    }


@app.post("/v1/sql-zfs/snapshot")
def create_snapshot(
    req: SnapshotRequest,
    x_sqlsnap_key: str | None = Header(default=None, alias="x-sqlsnap-key"),
):
    check_api_key(x_sqlsnap_key)

    if not SNAP_RE.fullmatch(req.snapname):
        raise HTTPException(status_code=400, detail="invalid snapname")

    allowed = load_allowed_zvols()

    for zvol in req.zvols:
        if zvol not in allowed:
            raise HTTPException(status_code=403, detail=f"zvol not allowed: {zvol}")

    cmd = &#091;
        "sudo",
        "/usr/local/sbin/sql-zfs-helper",
        "snapshot",
        req.snapname,
        *req.zvols,
    ]

    try:
        completed = subprocess.run(
            cmd,
            text=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=30,
            check=False,
        )
    except subprocess.TimeoutExpired:
        raise HTTPException(status_code=504, detail="zfs snapshot timeout")

    if completed.returncode != 0:
        raise HTTPException(
            status_code=500,
            detail={
                "error": completed.stderr.strip(),
                "stdout": completed.stdout.strip(),
            },
        )

    snapshots = &#091;f"{zvol}@{req.snapname}" for zvol in req.zvols]

    return {
        "status": "ok",
        "database": req.database,
        "vmid": req.vmid,
        "snapname": req.snapname,
        "snapshots": snapshots,
        "media_description": "zfs|" + socket.gethostname() + "|" + ";".join(snapshots),
    }
EOF

chown -R root:root /opt/sql-zfs-api
chmod 755 /opt/sql-zfs-api
chmod 644 /opt/sql-zfs-api/app.py
</code></pre>
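


<p class="wp-block-paragraph">The media_description value returned by the API packs the host and the snapshot list into a single string (zfs|host|snapshot;snapshot). As an illustration of how that string could be read back later, for example when locating the snapshots for a restore, here is a small Python sketch. The parser and its name are assumptions, since the restore side is not covered in this part:</p>

```python
def parse_media_description(value):
    """Split 'zfs|host|snap1;snap2' into its host and snapshot list."""
    provider, host, snapshot_part = value.split("|", 2)
    if provider != "zfs":
        raise ValueError("unexpected provider: " + provider)
    snapshots = [s for s in snapshot_part.split(";") if s]
    # Each snapshot is 'dataset@snapname'; split on the last '@'.
    return {
        "host": host,
        "snapshots": [tuple(s.rsplit("@", 1)) for s in snapshots],
    }


info = parse_media_description(
    "zfs|promox1|sqlpool/pve/vm-302-disk-0@sql_StackOverflow_20260514T213918Z"
)
print(info["host"])
print(info["snapshots"])
```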



<p class="wp-block-paragraph">We generate the API key:</p>



<pre class="wp-block-code"><code>APIKEY="$(openssl rand -hex 32)"
echo "$APIKEY"</code></pre>



<p class="wp-block-paragraph">We create the environment file:</p>



<pre class="wp-block-code"><code>cat &gt;/etc/sql-zfs-api/sql-zfs-api.env &lt;&lt;EOF
SQL_ZFS_API_KEY=$APIKEY
EOF

chown root:root /etc/sql-zfs-api/sql-zfs-api.env
chmod 600 /etc/sql-zfs-api/sql-zfs-api.env</code></pre>



<p class="wp-block-paragraph">Save the generated key: it is needed later on the SQL Server side to create the credential.</p>



<p class="wp-block-paragraph">Next, we enable HTTPS. According to the documentation, sp_invoke_external_rest_endpoint only supports HTTPS endpoints secured with TLS.</p>



<pre class="wp-block-code"><code>openssl req -x509 -newkey rsa:4096 -sha256 -days 360 -nodes \
  -keyout /etc/sql-zfs-api/tls.key \
  -out /etc/sql-zfs-api/tls.crt \
  -subj "/CN=promox1" \
  -addext "subjectAltName=DNS:promox1,IP:192.168.1.110"

chown root:sqlsnap /etc/sql-zfs-api/tls.key /etc/sql-zfs-api/tls.crt
chmod 640 /etc/sql-zfs-api/tls.key
chmod 644 /etc/sql-zfs-api/tls.crt</code></pre>



<p class="wp-block-paragraph">The /etc/sql-zfs-api/tls.crt certificate must be imported into the Trusted Root Certification Authorities store of the Windows server hosting SQL Server. Otherwise, SQL Server cannot validate the certificate and the HTTPS call may fail.</p>



<p class="wp-block-paragraph">We create the systemd service:</p>



<pre class="wp-block-code"><code>cat &gt;/etc/systemd/system/sql-zfs-api.service &lt;&lt;'EOF'
&#091;Unit]
Description=SQL Server to ZFS Snapshot API
After=network-online.target
Wants=network-online.target

&#091;Service]
User=sqlsnap
Group=sqlsnap
WorkingDirectory=/opt/sql-zfs-api
EnvironmentFile=/etc/sql-zfs-api/sql-zfs-api.env
ExecStart=/opt/sql-zfs-api/venv/bin/uvicorn app:app --host 0.0.0.0 --port 8443 --ssl-keyfile /etc/sql-zfs-api/tls.key --ssl-certfile /etc/sql-zfs-api/tls.crt
Restart=on-failure
RestartSec=3

&#091;Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now sql-zfs-api
systemctl status sql-zfs-api
</code></pre>



<p class="wp-block-paragraph">We check the status of our API:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="697" height="186" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-67.png" alt="" class="wp-image-44528" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-67.png 697w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-67-300x80.png 300w" sizes="auto, (max-width: 697px) 100vw, 697px" /></figure>



<p class="wp-block-paragraph">The API can be called from PowerShell 7 with Invoke-RestMethod. The -SkipCertificateCheck switch is used here because the certificate is self-signed and not yet trusted on the client:</p>



<pre class="wp-block-code"><code>$headers = @{
"Content-Type"  = "application/json"
"x-sqlsnap-key" = "MyKey"
}

$body = @{
database = "StackOverflow"
vmid     = 302
snapname = "StackOverflow_test010"
zvols    = @("sqlpool/pve/vm-302-disk-0")
} | ConvertTo-Json -Depth 5

Invoke-RestMethod `
-Uri "https://192.168.1.110:8443/v1/sql-zfs/snapshot" `
-Method Post `
-Headers $headers `
-Body $body `
-ContentType "application/json" `
-SkipCertificateCheck
</code></pre>
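


<p class="wp-block-paragraph">The same request can be sketched in Python with the standard library only. Instead of skipping certificate validation, this variant trusts a local copy of the tls.crt file generated earlier. The local file name and the helper function names are assumptions, and the sending function is defined but not executed here:</p>

```python
import json
import ssl
import urllib.request


def build_snapshot_request(base_url, api_key, database, vmid, snapname, zvols):
    """Build (but do not send) the POST request expected by the API."""
    body = json.dumps(
        {"database": database, "vmid": vmid, "snapname": snapname, "zvols": zvols}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url + "/v1/sql-zfs/snapshot",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "x-sqlsnap-key": api_key},
    )


def send(request, ca_file):
    """Send the request, trusting only the certificate stored in ca_file."""
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(request, context=context, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_snapshot_request(
    "https://192.168.1.110:8443",
    "MyKey",
    "StackOverflow",
    302,
    "StackOverflow_test010",
    ["sqlpool/pve/vm-302-disk-0"],
)
print(req.get_method(), req.full_url)
# send(req, "tls.crt")  # hypothetical local copy of /etc/sql-zfs-api/tls.crt
```

<p class="wp-block-paragraph">Because the certificate lists the IP address in its subjectAltName, the URL can use the IP directly while still passing host name verification.</p>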



<p class="wp-block-paragraph">This gives:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="833" height="510" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-80.png" alt="" class="wp-image-44590" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-80.png 833w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-80-300x184.png 300w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-80-768x470.png 768w" sizes="auto, (max-width: 833px) 100vw, 833px" /></figure>



<h2 class="wp-block-heading" id="h-test-from-sql-server">Test from SQL Server</h2>



<p class="wp-block-paragraph">A certificate was generated on Proxmox and it needs to be imported on the SQL Server host. In my case, it was located here:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="404" height="79" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-69.png" alt="" class="wp-image-44530" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-69.png 404w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-69-300x59.png 300w" sizes="auto, (max-width: 404px) 100vw, 404px" /></figure>



<p class="wp-block-paragraph">I then imported it on Windows Server:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="788" height="149" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-70.png" alt="" class="wp-image-44531" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-70.png 788w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-70-300x57.png 300w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-70-768x145.png 768w" sizes="auto, (max-width: 788px) 100vw, 788px" /></figure>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="118" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-71-1024x118.png" alt="" class="wp-image-44532" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-71-1024x118.png 1024w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-71-300x34.png 300w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-71-768x88.png 768w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-71.png 1384w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p class="wp-block-paragraph">For testing purposes, I kept things simple. On the SQL Server side, we create a database to hold the stored procedure that will interact with the API. In my case, the database is called dbi_tools:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="244" height="131" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-72.png" alt="" class="wp-image-44533" /></figure>



<p class="wp-block-paragraph">This database will contain a credential. In our case, the DATABASE SCOPED CREDENTIAL is used to securely store the authentication information required to call the REST API from SQL Server. This allows us, for example, to protect the API key:</p>



<pre class="wp-block-code"><code>USE &#091;dbi_tools]
GO

IF NOT EXISTS (
    SELECT 1
    FROM sys.symmetric_keys
    WHERE name = '##MS_DatabaseMasterKey##'
)
BEGIN
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'MyStrongPassword_%99';
END
GO

CREATE DATABASE SCOPED CREDENTIAL &#091;https://192.168.1.110:8443/v1/sql-zfs/snapshot]
WITH
    IDENTITY = 'HTTPEndpointHeaders',
    SECRET = '{"x-sqlsnap-key":"MyAPIKey"}';
GO</code></pre>



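<p class="wp-block-paragraph">We then create a stored procedure to encapsulate the code used to call the API (on SQL Server 2025, the <code>external rest endpoint enabled</code> server configuration option must be turned on beforehand):</p>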



<pre class="wp-block-code"><code>USE dbi_tools;
GO

CREATE OR ALTER PROCEDURE dbo.usp_BackupDatabase_WithZfsSnapshot
    @DatabaseName sysname,
    @BackupDirectory nvarchar(4000) = N'D:\Backups\'
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @Url nvarchar(4000) =
        N'https://192.168.1.110:8443/v1/sql-zfs/snapshot';

    DECLARE @Vmid int = 302;

    DECLARE @ZvolsJson nvarchar(max) =
        N'&#091;"sqlpool/pve/vm-302-disk-0"]';

    DECLARE @Stamp varchar(20) =
        REPLACE(REPLACE(CONVERT(varchar(19), SYSUTCDATETIME(), 126), '-', ''), ':', '') + 'Z';

    DECLARE @SafeDbName nvarchar(128) =
        REPLACE(REPLACE(REPLACE(@DatabaseName, N' ', N'_'), N'&#091;', N''), N']', N'');

    DECLARE @SnapName nvarchar(128) =
        CONCAT(N'sql_', @SafeDbName, N'_', @Stamp);

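    -- Add a separator only when @BackupDirectory does not already end with one.
    DECLARE @BackupFile nvarchar(4000) =
        CONCAT(@BackupDirectory,
               IIF(RIGHT(@BackupDirectory, 1) = N'\', N'', N'\'),
               @SafeDbName, N'_', @Stamp, N'.bkm');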

    DECLARE @Payload nvarchar(max) =
    (
        SELECT
            @DatabaseName AS &#091;database],
            @Vmid AS &#091;vmid],
            @SnapName AS &#091;snapname],
            JSON_QUERY(@ZvolsJson) AS &#091;zvols]
        FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
    );

    DECLARE @ReturnCode int;
    DECLARE @Response nvarchar(max);
    DECLARE @SnapshotList nvarchar(max);

    SELECT @SnapshotList =
        STRING_AGG(CONCAT(&#091;value], N'@', @SnapName), N';')
    FROM OPENJSON(@ZvolsJson);

    DECLARE @MediaDescription nvarchar(max) =
        CONCAT(N'zfs|promox1|', @SnapshotList);

    DECLARE @Sql nvarchar(max);

    BEGIN TRY
        SET @Sql =
            N'ALTER DATABASE ' + QUOTENAME(@DatabaseName) +
            N' SET SUSPEND_FOR_SNAPSHOT_BACKUP = ON;';

        EXEC sys.sp_executesql @Sql;

        EXEC @ReturnCode = sys.sp_invoke_external_rest_endpoint
            @url = @Url,
            @method = N'POST',
            @headers = N'{"Content-Type":"application/json","Accept":"application/json"}',
            @payload = @Payload,
            @credential = &#091;https://192.168.1.110:8443/v1/sql-zfs/snapshot],
            @timeout = 30,
            @response = @Response OUTPUT;

        IF @ReturnCode &lt;&gt; 0
        BEGIN
            DECLARE @Err nvarchar(max) =
                CONCAT(N'ZFS snapshot API failed. ReturnCode=', @ReturnCode, N' Response=', @Response);
            THROW 51001, @Err, 1;
        END;

        SET @Sql =
            N'BACKUP DATABASE ' + QUOTENAME(@DatabaseName) + N'
              TO DISK = @BackupFile
              WITH METADATA_ONLY,
                   FORMAT,
                   MEDIANAME = @MediaName,
                   MEDIADESCRIPTION = @MediaDescription,
                   NAME = @BackupName;';

        EXEC sys.sp_executesql
            @Sql,
            N'@BackupFile nvarchar(4000),
              @MediaName nvarchar(128),
              @MediaDescription nvarchar(max),
              @BackupName nvarchar(128)',
            @BackupFile = @BackupFile,
            @MediaName = @SnapName,
            @MediaDescription = @MediaDescription,
            @BackupName = @SnapName;

        SELECT
            @DatabaseName AS database_name,
            @SnapName AS zfs_snapshot_name,
            @SnapshotList AS zfs_snapshots,
            @BackupFile AS metadata_backup_file,
            @MediaDescription AS media_description,
            @Response AS api_response;
    END TRY
    BEGIN CATCH
        IF DATABASEPROPERTYEX(@DatabaseName, 'IsDatabaseSuspendedForSnapshotBackup') = 1
        BEGIN
            SET @Sql =
                N'ALTER DATABASE ' + QUOTENAME(@DatabaseName) +
                N' SET SUSPEND_FOR_SNAPSHOT_BACKUP = OFF;';

            EXEC sys.sp_executesql @Sql;
        END;

        THROW;
    END CATCH
END;
GO
</code></pre>
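


<p class="wp-block-paragraph">To make the naming logic of the procedure easier to follow, here is a Python sketch that derives the same timestamp, snapshot name, and metadata backup file name. It is only an illustration: the function name is hypothetical, and the sketch trims a trailing backslash from the directory so that the separator is not doubled:</p>

```python
from datetime import datetime, timezone


def build_names(database_name, backup_directory, now=None):
    """Derive the snapshot name and the metadata backup file name."""
    now = now or datetime.now(timezone.utc)
    # Same shape as @Stamp: UTC time without '-' or ':' and a trailing 'Z'.
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    # Same clean-up as @SafeDbName: spaces become '_', brackets are removed.
    safe_db = database_name.replace(" ", "_").replace("[", "").replace("]", "")
    snap_name = "sql_" + safe_db + "_" + stamp
    backup_file = backup_directory.rstrip("\\") + "\\" + safe_db + "_" + stamp + ".bkm"
    return snap_name, backup_file


fixed = datetime(2026, 5, 14, 21, 39, 18, tzinfo=timezone.utc)
snap_name, backup_file = build_names("Stack Overflow", "D:\\Backups\\", now=fixed)
print(snap_name)    # sql_Stack_Overflow_20260514T213918Z
print(backup_file)  # D:\Backups\Stack_Overflow_20260514T213918Z.bkm
```

<p class="wp-block-paragraph">The resulting snapshot name only contains letters, digits, and underscores, so it satisfies the pattern enforced by both the API and the helper, as long as the database name itself has no other special characters.</p>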



<p class="wp-block-paragraph">We then call the stored procedure:</p>



<pre class="wp-block-code"><code>EXEC dbi_tools.dbo.usp_BackupDatabase_WithZfsSnapshot
    @DatabaseName = N'StackOverflow',
    @BackupDirectory = N'D:\Backups\';</code></pre>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="137" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-73-1024x137.png" alt="" class="wp-image-44534" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-73-1024x137.png 1024w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-73-300x40.png 300w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-73-768x102.png 768w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-73.png 1432w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p class="wp-block-paragraph">The backup was generated:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="630" height="149" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-74.png" alt="" class="wp-image-44535" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-74.png 630w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-74-300x71.png 300w" sizes="auto, (max-width: 630px) 100vw, 630px" /></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="777" height="411" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-75.png" alt="" class="wp-image-44536" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-75.png 777w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-75-300x159.png 300w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-75-768x406.png 768w" sizes="auto, (max-width: 777px) 100vw, 777px" /></figure>



<h2 class="wp-block-heading" id="h-references">References</h2>



<p class="wp-block-paragraph"><a href="https://learn.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-invoke-external-rest-endpoint-transact-sql?view=sql-server-ver17&amp;tabs=request-headers">sp_invoke_external_rest_endpoint</a></p>



<p class="wp-block-paragraph">Thank you. <a href="https://www.linkedin.com/in/amine-haloui-76968056/">Amine Haloui</a></p>
<p>The post <a href="https://www.dbi-services.com/blog/sql-server-snapshot-backup-and-restore-with-proxmox-zfs-rest-api-with-sql-server-2025-3-3/">SQL Server Snapshot Backup and Restore with Proxmox ZFS &#8211; REST API with SQL Server 2025 (3/3)</a> appeared first on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.dbi-services.com/blog/sql-server-snapshot-backup-and-restore-with-proxmox-zfs-rest-api-with-sql-server-2025-3-3/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>SQL Server Snapshot Backup and Restore with Proxmox ZFS &#8211; Powershell implementation (2/3)</title>
		<link>https://www.dbi-services.com/blog/sql-server-snapshot-backup-and-restore-with-proxmox-zfs-2-3/</link>
					<comments>https://www.dbi-services.com/blog/sql-server-snapshot-backup-and-restore-with-proxmox-zfs-2-3/#respond</comments>
		
		<dc:creator><![CDATA[Amine Haloui]]></dc:creator>
		<pubDate>Thu, 14 May 2026 21:35:41 +0000</pubDate>
				<category><![CDATA[Database Administration & Monitoring]]></category>
		<category><![CDATA[Database management]]></category>
		<category><![CDATA[Operating systems]]></category>
		<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[PowerShell]]></category>
		<category><![CDATA[proxmox]]></category>
		<category><![CDATA[ZFS]]></category>
		<guid isPermaLink="false">https://www.dbi-services.com/blog/?p=44497</guid>

					<description><![CDATA[<p>In the previous section, we discussed the drawbacks of running the commands manually. Indeed, the manual process was taking too much time and could directly impact the database state while the freeze was occurring. To address this issue, it is possible to automate the solution with PowerShell. The idea is to automate the different operations [&#8230;]</p>
<p>The post <a href="https://www.dbi-services.com/blog/sql-server-snapshot-backup-and-restore-with-proxmox-zfs-2-3/">SQL Server Snapshot Backup and Restore with Proxmox ZFS &#8211; Powershell implementation (2/3)</a> appeared first on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">In the previous part, we discussed the drawbacks of running the commands manually. The manual process took too much time, and because the database stays frozen throughout, it could directly impact the database.</p>



<p class="wp-block-paragraph">To address this issue, it is possible to automate the solution with PowerShell. The idea is to automate the different operations involved in the snapshot backup and restore process.</p>



<p class="wp-block-paragraph">We will use two scripts:</p>



<ul class="wp-block-list">
<li>One script to perform the backups and create the snapshots.</li>



<li>One script to perform the restores.</li>
</ul>



<h2 class="wp-block-heading" id="h-backup-process">Backup process</h2>



<p class="wp-block-paragraph">Here is how the backup process works:</p>



<ul class="wp-block-list">
<li>We connect to the corresponding SQL Server instance.</li>



<li>We change the state of the database using ALTER DATABASE &#8230; SET SUSPEND_FOR_SNAPSHOT_BACKUP = ON. At this point, the I/Os are frozen.</li>



<li>We connect to the hypervisor through SSH.</li>



<li>We create the snapshot.</li>



<li>We back up the database using BACKUP DATABASE &#8230; WITH METADATA_ONLY.</li>



<li>We change the state of the database using ALTER DATABASE &#8230; SET SUSPEND_FOR_SNAPSHOT_BACKUP = OFF. At this point, the I/Os are unfrozen.</li>
</ul>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="627" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-50-1024x627.png" alt="" class="wp-image-44499" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-50-1024x627.png 1024w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-50-300x184.png 300w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-50-768x470.png 768w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-50-1536x941.png 1536w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-50-2048x1254.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<h2 class="wp-block-heading">PowerShell implementation (backup)</h2>



<p class="wp-block-paragraph">Here is the code used to perform the backup:</p>



<pre class="wp-block-code"><code>param(
    &#091;string]$SqlInstance = "VM-WS25-SQL2",
    &#091;string]$Database    = "StackOverflow",
    &#091;string]$BackupDir   = "D:\Backups",
    &#091;string]$PveHost     = "192.168.1.110",
    &#091;string]$PveUser     = "MyUser",
    &#091;string&#091;]]$Zvols     = @("sqlpool/pve/vm-302-disk-0")
)

$Timestamp = Get-Date -Format "yyyyMMddTHHmmss"
$SnapName  = "sql_${Database}_${Timestamp}"

$DbSafe = $Database.Replace("]", "]]")
$BackupFile = Join-Path $BackupDir "${Database}_${Timestamp}.bkm"

$ZfsSnapshots = $Zvols | ForEach-Object { "$_@$SnapName" }
$ZfsSnapshotArgs = $ZfsSnapshots -join " "

$MediaDescription = "zfs|$PveHost|$ZfsSnapshotArgs"

$BackupFileSql = $BackupFile.Replace("'", "''")
$MediaSql = $MediaDescription.Replace("'", "''")

$connString = "Server=$SqlInstance;Database=master;Integrated Security=True;TrustServerCertificate=True;Application Name=ZFS-TSQL-Snapshot;"
$conn = New-Object System.Data.SqlClient.SqlConnection $connString

function Invoke-SqlNonQuery {
    param(&#091;string]$Sql)

    $cmd = $conn.CreateCommand()
    $cmd.CommandTimeout = 0
    $cmd.CommandText = $Sql
    &#091;void]$cmd.ExecuteNonQuery()
}

try {
    $conn.Open()

    Write-Host "Freezing SQL database writes..."
    Invoke-SqlNonQuery "ALTER DATABASE &#091;$DbSafe] SET SUSPEND_FOR_SNAPSHOT_BACKUP = ON;"

    Write-Host "Taking ZFS snapshot on Proxmox..."
    ssh "$PveUser@$PveHost" "zfs snapshot $ZfsSnapshotArgs &amp;&amp; zfs hold sqlsnap $ZfsSnapshotArgs"

    if ($LASTEXITCODE -ne 0) {
        throw "ZFS snapshot failed on $PveHost"
    }

    Write-Host "Writing SQL metadata backup..."

    Invoke-SqlNonQuery @"
BACKUP DATABASE &#091;$DbSafe]
TO DISK = N'$BackupFileSql'
WITH METADATA_ONLY,
     MEDIADESCRIPTION = N'$MediaSql',
     NAME = N'$SnapName';
"@

    Write-Host "Snapshot backup completed:"
    Write-Host "  Snapshot: $ZfsSnapshotArgs"
    Write-Host "  Metadata: $BackupFile"
}
catch {
    Write-Warning $_

    try {
        Write-Warning "Attempting to unfreeze SQL database..."
        Invoke-SqlNonQuery "ALTER DATABASE &#091;$DbSafe] SET SUSPEND_FOR_SNAPSHOT_BACKUP = OFF;"
    }
    catch {
        Write-Warning "Could not unfreeze cleanly. Check SQL Server error log."
    }

    throw
}
finally {
    $conn.Close()
}</code></pre>



<h2 class="wp-block-heading">Restore process</h2>



<p class="wp-block-paragraph">Here is how the restore process works:</p>



<ul class="wp-block-list">
<li>We connect to the corresponding SQL Server instance.</li>



<li>We take the database offline.</li>



<li>We take the Windows volume dedicated to the StackOverflow database offline.</li>



<li>We connect to the hypervisor through SSH.</li>



<li>We roll the volume back to the corresponding snapshot.</li>



<li>We restore the database using the corresponding backup, which was created at the same time as the snapshot.</li>
</ul>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="627" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-51-1024x627.png" alt="" class="wp-image-44501" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-51-1024x627.png 1024w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-51-300x184.png 300w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-51-768x470.png 768w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-51-1536x941.png 1536w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-51-2048x1254.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<h2 class="wp-block-heading">PowerShell implementation (restore)</h2>



<p class="wp-block-paragraph">Here is the code used to perform the restore:</p>



<pre class="wp-block-code"><code>param(
    &#091;string]$SqlInstance = "VM-WS25-SQL2",
    &#091;string]$Database    = "StackOverflow",
    &#091;string]$BackupFile  = "D:\Backups\StackOverflow_20260514T122642.bkm",
    &#091;string]$SnapName    = "sql_StackOverflow_20260514T122642",
    &#091;string]$PveHost     = "192.168.1.110",
    &#091;string]$PveUser     = "MyUser",
    &#091;string&#091;]]$Zvols     = @("sqlpool/pve/vm-302-disk-0"),
    &#091;string&#091;]]$DatabaseDriveLetters = @("T"),
    &#091;switch]$NoRecovery
)

$ErrorActionPreference = "Stop"

function Assert-SafeName {
    param(
        &#091;string]$Value,
        &#091;string]$Name,
        &#091;string]$Pattern
    )

    if ($Value -notmatch $Pattern) {
        throw "$Name contains characters that are not allowed: $Value"
    }
}

function Normalize-DriveLetter {
    param(&#091;string]$DriveLetter)

    $letter = $DriveLetter.Trim().TrimEnd(":").ToUpperInvariant()

    if ($letter -notmatch '^&#091;A-Z]$') {
        throw "Invalid drive letter: $DriveLetter"
    }

    return $letter
}

function Get-DiskForDriveLetter {
    param(&#091;string]$DriveLetter)

    $letter = Normalize-DriveLetter $DriveLetter

    $partition = Get-Partition -DriveLetter $letter -ErrorAction Stop
    $disk = $partition | Get-Disk -ErrorAction Stop

    return &#091;pscustomobject]@{
        DriveLetter = $letter
        DiskNumber  = &#091;int]$disk.Number
        IsOffline   = &#091;bool]$disk.IsOffline
        FriendlyName = $disk.FriendlyName
        Size        = $disk.Size
    }
}

function Invoke-SshChecked {
    param(&#091;string]$Command)

    Write-Host "SSH $PveUser@$PveHost :: $Command"

    &amp; ssh "$PveUser@$PveHost" "$Command"

    if ($LASTEXITCODE -ne 0) {
        throw "SSH command failed with code $LASTEXITCODE : $Command"
    }
}

function New-SqlConnection {
    $connString = "Server=$SqlInstance;Database=master;Integrated Security=True;TrustServerCertificate=True;Application Name=ZFS-TSQL-Restore-NoVmRestart;"
    return New-Object System.Data.SqlClient.SqlConnection $connString
}

function Invoke-SqlNonQuery {
    param(&#091;string]$Sql)

    $conn = New-SqlConnection

    try {
        $conn.Open()
        $cmd = $conn.CreateCommand()
        $cmd.CommandTimeout = 0
        $cmd.CommandText = $Sql
        &#091;void]$cmd.ExecuteNonQuery()
    }
    finally {
        $conn.Close()
    }
}

function Invoke-SqlScalar {
    param(&#091;string]$Sql)

    $conn = New-SqlConnection

    try {
        $conn.Open()
        $cmd = $conn.CreateCommand()
        $cmd.CommandTimeout = 0
        $cmd.CommandText = $Sql
        return $cmd.ExecuteScalar()
    }
    finally {
        $conn.Close()
    }
}

function Set-DatabaseDisksOffline {
    param(&#091;object&#091;]]$DiskInfos)

    $offlinedByScript = @()

    foreach ($diskInfo in ($DiskInfos | Sort-Object DiskNumber -Unique)) {
        if ($diskInfo.IsOffline) {
            Write-Host "Disk $($diskInfo.DiskNumber) is already offline. Drive $($diskInfo.DriveLetter):"
            continue
        }

        Write-Host "Taking Windows disk $($diskInfo.DiskNumber) offline, drive $($diskInfo.DriveLetter):"
        Set-Disk -Number $diskInfo.DiskNumber -IsOffline $true

        $offlinedByScript += $diskInfo
    }

    return $offlinedByScript
}

function Set-DatabaseDisksOnline {
    param(&#091;object&#091;]]$DiskInfos)

    foreach ($diskInfo in ($DiskInfos | Sort-Object DiskNumber -Unique)) {
        Write-Host "Bringing Windows disk $($diskInfo.DiskNumber) back online, drive $($diskInfo.DriveLetter):"
        Set-Disk -Number $diskInfo.DiskNumber -IsOffline $false
    }

    Write-Host "Update-HostStorageCache..."
    Update-HostStorageCache
}

Assert-SafeName -Value $SnapName -Name "SnapName" -Pattern '^&#091;A-Za-z0-9_.:-]{1,160}$'

foreach ($zvol in $Zvols) {
    Assert-SafeName -Value $zvol -Name "Zvol" -Pattern '^&#091;A-Za-z0-9_.:/-]{1,240}$'
}

$DbQuoted = "&#091;" + $Database.Replace("]", "]]") + "]"
$DbLiteral = $Database.Replace("'", "''")
$BackupFileSql = $BackupFile.Replace("'", "''")

$ZfsSnapshots = $Zvols | ForEach-Object { "$_@$SnapName" }
$ZfsSnapshotArgs = ($ZfsSnapshots | ForEach-Object { "'$_'" }) -join " "

$RecoveryOption = if ($NoRecovery) { "NORECOVERY" } else { "RECOVERY" }

$DatabaseDiskInfos = @()
$DisksOfflinedByScript = @()

Write-Host ""
Write-Host "Restore SQL Server from a ZFS snapshot, without restarting the VM"
Write-Host "SQL Instance : $SqlInstance"
Write-Host "Database     : $Database"
Write-Host "BackupFile   : $BackupFile"
Write-Host "DB volumes   : $($DatabaseDriveLetters -join ', ')"
Write-Host "Snapshots    :"
$ZfsSnapshots | ForEach-Object { Write-Host "  $_" }
Write-Host ""

try {
    Write-Host "Checking ZFS snapshots..."
    Invoke-SshChecked "zfs list -H -t snapshot -o name $ZfsSnapshotArgs &gt;/dev/null"

    Write-Host "Identifying Windows disks containing SQL Server files..."
    foreach ($driveLetter in $DatabaseDriveLetters) {
        $diskInfo = Get-DiskForDriveLetter $driveLetter
        $DatabaseDiskInfos += $diskInfo

        Write-Host "Drive $($diskInfo.DriveLetter): -&gt; Windows disk $($diskInfo.DiskNumber) &#091;$($diskInfo.FriendlyName)]"
    }

    $backupDrive = $null
    if ($BackupFile -match '^(&#091;A-Za-z]):\\') {
        $backupDrive = Normalize-DriveLetter $Matches&#091;1]

        # Refuse to continue if the .bkm file sits on a disk that is about to be rolled back.
        $backupDiskInfo = Get-DiskForDriveLetter $backupDrive
        $targetDiskNumbers = @($DatabaseDiskInfos | ForEach-Object { $_.DiskNumber } | Select-Object -Unique)

        if ($targetDiskNumbers -contains $backupDiskInfo.DiskNumber) {
            throw @"
The backup file $BackupFile is located on drive $backupDrive, which is on the same Windows disk as the SQL Server data volume.
Taking the data disk offline would make the .bkm file inaccessible, and a rollback could also make the .bkm file disappear.
Move the .bkm file to C:, a network share, or another disk that is not rolled back.
"@
        }
    }

    Write-Host "Checking whether the SQL Server database exists..."
    $DbExists = Invoke-SqlScalar "SELECT CASE WHEN DB_ID(N'$DbLiteral') IS NULL THEN 0 ELSE 1 END;"

    if ($DbExists -eq 1) {
        Write-Host "Taking database $Database OFFLINE..."
        Invoke-SqlNonQuery @"
ALTER DATABASE $DbQuoted SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
ALTER DATABASE $DbQuoted SET OFFLINE WITH ROLLBACK IMMEDIATE;
"@
    }
    else {
        Write-Host "Database $Database does not exist in SQL Server. Continuing with disk offline and ZFS rollback."
    }

    Write-Host "Taking Windows disks containing MDF/LDF files offline..."
    $DisksOfflinedByScript = Set-DatabaseDisksOffline -DiskInfos $DatabaseDiskInfos

    Write-Host "Rolling back ZFS snapshot..."
    $RollbackCommands = ($ZfsSnapshots | ForEach-Object { "zfs rollback -r '$_'" }) -join "; "
    Invoke-SshChecked "set -e; $RollbackCommands"

    Write-Host "Bringing Windows disks back online..."
    Set-DatabaseDisksOnline -DiskInfos $DisksOfflinedByScript
    $DisksOfflinedByScript = @()

    Write-Host "Short pause to let Windows and SQL Server detect the restored disk state..."
    Start-Sleep -Seconds 5

    Write-Host "Restoring SQL Server metadata-only backup..."

    $RestoreSql = @"
RESTORE DATABASE $DbQuoted
FROM DISK = N'$BackupFileSql'
WITH METADATA_ONLY,
     REPLACE,
     $RecoveryOption;
"@

    Invoke-SqlNonQuery $RestoreSql

    if (-not $NoRecovery) {
        Write-Host "Setting database back to MULTI_USER..."
        Invoke-SqlNonQuery @"
ALTER DATABASE $DbQuoted SET MULTI_USER;
"@
    }

    Write-Host ""
    Write-Host "Restore completed."
    Write-Host "Database : $Database"
    Write-Host "Snapshot : $SnapName"
    Write-Host "Backup   : $BackupFile"
}
catch {
    Write-Warning "Restore failed: $_"

    if ($DisksOfflinedByScript.Count -gt 0) {
        try {
            Write-Warning "Attempting to bring disks offlined by the script back online..."
            Set-DatabaseDisksOnline -DiskInfos $DisksOfflinedByScript
            $DisksOfflinedByScript = @()
        }
        catch {
            Write-Warning "Unable to automatically bring the disks back online. Check with Get-Disk."
        }
    }

    try {
        $DbExistsAfterError = Invoke-SqlScalar "SELECT CASE WHEN DB_ID(N'$DbLiteral') IS NULL THEN 0 ELSE 1 END;"

        if ($DbExistsAfterError -eq 1 -and -not $NoRecovery) {
            Write-Warning "Attempting to set the database back ONLINE/MULTI_USER..."
            Invoke-SqlNonQuery @"
ALTER DATABASE $DbQuoted SET ONLINE;
ALTER DATABASE $DbQuoted SET MULTI_USER;
"@
        }
    }
    catch {
        Write-Warning "Unable to automatically set the database back ONLINE/MULTI_USER."
    }

    throw
}</code></pre>



<h2 class="wp-block-heading">What does it look like?</h2>



<p class="wp-block-paragraph">We start the backup process:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="530" height="82" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-52.png" alt="" class="wp-image-44503" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-52.png 530w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-52-300x46.png 300w" sizes="auto, (max-width: 530px) 100vw, 530px" /></figure>



<p class="wp-block-paragraph">We verify that the snapshot is present:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="750" height="131" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-53.png" alt="" class="wp-image-44504" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-53.png 750w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-53-300x52.png 300w" sizes="auto, (max-width: 750px) 100vw, 750px" /></figure>



<p class="wp-block-paragraph">We verify that the backup is present:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="601" height="36" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-54.png" alt="" class="wp-image-44505" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-54.png 601w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-54-300x18.png 300w" sizes="auto, (max-width: 601px) 100vw, 601px" /></figure>



<p class="wp-block-paragraph">We drop the StackOverflow database:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="314" height="301" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-55.png" alt="" class="wp-image-44506" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-55.png 314w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-55-300x288.png 300w" sizes="auto, (max-width: 314px) 100vw, 314px" /></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="310" height="231" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-56.png" alt="" class="wp-image-44507" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-56.png 310w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-56-300x224.png 300w" sizes="auto, (max-width: 310px) 100vw, 310px" /></figure>



<p class="wp-block-paragraph">We start the restore process:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="951" height="384" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-57.png" alt="" class="wp-image-44508" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-57.png 951w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-57-300x121.png 300w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-57-768x310.png 768w" sizes="auto, (max-width: 951px) 100vw, 951px" /></figure>



<p class="wp-block-paragraph">The database is available again. The restore took only a few seconds for a database of approximately 200 GB.</p>



<h2 class="wp-block-heading">Major drawbacks</h2>



<p class="wp-block-paragraph">In my case, the solution is executed from the SQL Server host itself. Ideally, it should run from another server or a client machine. The scripts could also be run from a scheduler such as Rundeck.</p>



<p class="wp-block-paragraph">During the restore, the database is switched to SINGLE_USER mode. This could be an issue if the applications using the database reconnect very frequently, because one of them could take the only available connection before the script does. A better approach would probably be to explicitly terminate the active sessions using the KILL command.</p>
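
<p class="wp-block-paragraph">As a rough sketch of that alternative (it is not part of the scripts above), the helper below builds a T-SQL batch that kills the user sessions currently attached to the database. The function name New-KillSessionsSql is hypothetical. The batch relies on sys.dm_exec_sessions and STRING_AGG, and its output could be passed to the Invoke-SqlNonQuery function of the restore script just before the database is taken offline.</p>

<pre class="wp-block-code"><code>function New-KillSessionsSql {
    param(&#091;string]$Database)

    # Escape single quotes so the name is safe inside an N'...' literal.
    $dbLiteral = $Database.Replace("'", "''")

    # One KILL statement per user session attached to the database,
    # excluding the session that runs the batch itself.
    return @"
DECLARE @sql nvarchar(max);
SELECT @sql = STRING_AGG(CONVERT(nvarchar(max), N'KILL ' + CONVERT(nvarchar(10), session_id) + N';'), N' ')
FROM sys.dm_exec_sessions
WHERE database_id = DB_ID(N'$dbLiteral')
  AND is_user_process = 1
  AND session_id &lt;&gt; @@SPID;
IF @sql IS NOT NULL EXEC sys.sp_executesql @sql;
"@
}</code></pre>

<p class="wp-block-paragraph">New sessions can still connect between this batch and the SET OFFLINE statement, so this only narrows the window; it does not close it.</p>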



<p class="wp-block-paragraph">We have also not yet covered the use of a REST API.</p>



<p class="wp-block-paragraph">Thank you. <a href="https://www.linkedin.com/in/amine-haloui-76968056/">Amine Haloui</a></p>
<p>The post <a href="https://www.dbi-services.com/blog/sql-server-snapshot-backup-and-restore-with-proxmox-zfs-2-3/">SQL Server Snapshot Backup and Restore with Proxmox ZFS &#8211; Powershell implementation (2/3)</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.dbi-services.com/blog/sql-server-snapshot-backup-and-restore-with-proxmox-zfs-2-3/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>A Misleading SSAS Error in Power BI Report Server When Using DirectQuery Mode</title>
		<link>https://www.dbi-services.com/blog/a-misleading-ssas-error-in-power-bi-report-server-when-using-directquery-mode/</link>
					<comments>https://www.dbi-services.com/blog/a-misleading-ssas-error-in-power-bi-report-server-when-using-directquery-mode/#respond</comments>
		
		<dc:creator><![CDATA[Amine Haloui]]></dc:creator>
		<pubDate>Thu, 14 May 2026 21:17:45 +0000</pubDate>
				<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Database Administration & Monitoring]]></category>
		<category><![CDATA[Database management]]></category>
		<category><![CDATA[Power BI Report Server]]></category>
		<guid isPermaLink="false">https://www.dbi-services.com/blog/?p=44400</guid>

					<description><![CDATA[<p>Our client was experiencing issues after publishing a report that used DirectQuery mode. Specifically, when the report was queried, the following error occurred: Error: We couldn&#8217;t connect to the Analysis Services server. Make sure you&#8217;ve entered the connection string correctly. However, this issue did not occur in Power BI Desktop. In Power BI, [&#8230;]</p>
<p>The post <a href="https://www.dbi-services.com/blog/a-misleading-ssas-error-in-power-bi-report-server-when-using-directquery-mode/">A Misleading SSAS Error in Power BI Report Server When Using DirectQuery Mode</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">Our client was experiencing issues after publishing a report that used DirectQuery mode. Specifically, when the report was queried, the following error occurred:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="738" height="154" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-20.png" alt="" class="wp-image-44402" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-20.png 738w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-20-300x63.png 300w" sizes="auto, (max-width: 738px) 100vw, 738px" /></figure>



<p class="wp-block-paragraph">Error: We couldn&#8217;t connect to the Analysis Services server. Make sure you&#8217;ve entered the connection string correctly.</p>



<p class="wp-block-paragraph">However, this issue did not occur in Power BI Desktop.</p>



<p class="wp-block-paragraph">In Power BI, several data loading modes are available. Import mode loads data into the Power BI model, which usually provides faster performance and richer modeling capabilities. DirectQuery mode does not store the data in the model; instead, each interaction sends queries to the source system in real time. Import is generally better for speed and flexibility, while DirectQuery is useful when data must stay in the source or remain near real-time. The trade-off is that DirectQuery depends more heavily on source performance, network latency, and source-system limitations.</p>



<h2 class="wp-block-heading" id="h-configuration">Configuration</h2>



<p class="wp-block-paragraph">At first glance, one might think that the corresponding report is trying to connect to an SSAS service and that there is a connectivity issue between Power BI Report Server and a SQL Server Analysis Services instance.</p>



<p class="wp-block-paragraph">However, when we reviewed the data source, we found no connection to SSAS:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="667" height="388" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-22.png" alt="" class="wp-image-44405" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-22.png 667w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-22-300x175.png 300w" sizes="auto, (max-width: 667px) 100vw, 667px" /></figure>



<p class="wp-block-paragraph">We did not have this type of configuration:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="357" height="145" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-21.png" alt="" class="wp-image-44407" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-21.png 357w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-21-300x122.png 300w" sizes="auto, (max-width: 357px) 100vw, 357px" /></figure>



<p class="wp-block-paragraph"><strong>The questions that arise</strong></p>



<p class="wp-block-paragraph">Why are we getting an error message even though the report is not trying to connect to a SQL Server Analysis Services instance?</p>



<p class="wp-block-paragraph">Why is our client seeing this error message and unable to query the report?</p>



<h2 class="wp-block-heading" id="h-troubleshooting">Troubleshooting</h2>



<p class="wp-block-paragraph">The Power BI Report Server logs contained messages of this type:</p>



<p class="wp-block-paragraph">Failed to get CSDL. &#8212;&gt; MsolapWrapper.MsolapWrapperException: Failure encountered while getting schema.</p>



<p class="wp-block-paragraph">CannotRetrieveModelException: An error occurred while loading the model&#8230; Verify that the connection information is correct and that you have permissions to access the data source.</p>



<p class="wp-block-paragraph">It is also possible to retrieve some information from the ExecutionLog3 view:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="40" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-19-1024x40.png" alt="" class="wp-image-44401" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-19-1024x40.png 1024w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-19-300x12.png 300w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-19-768x30.png 768w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-19.png 1329w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p class="wp-block-paragraph">Whenever a Power BI report is rendered or a scheduled refresh is executed, new entries are written to the execution log in the Report Server catalog database. These entries can be queried through the ExecutionLog3 view. The ConceptualSchema event corresponds to a user viewing the report.</p>
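
<p class="wp-block-paragraph">As an illustration, the sketch below reads the most recent entries for one event type from the ExecutionLog3 view. It assumes that the catalog database uses the default name ReportServer and that the event name is stored in the ItemAction column. The function names New-ExecutionLogQuery and Get-ReportExecutionLog are hypothetical.</p>

<pre class="wp-block-code"><code>function New-ExecutionLogQuery {
    param(
        &#091;string]$ItemAction = "ConceptualSchema",
        &#091;int]$Top = 50
    )

    # Escape single quotes so the value is safe inside an N'...' literal.
    $action = $ItemAction.Replace("'", "''")

    return @"
SELECT TOP ($Top) TimeStart, ItemPath, UserName, RequestType, ItemAction, Status
FROM dbo.ExecutionLog3
WHERE ItemAction = N'$action'
ORDER BY TimeStart DESC;
"@
}

function Get-ReportExecutionLog {
    param(
        &#091;string]$SqlInstance,
        &#091;string]$CatalogDatabase = "ReportServer"   # assumed default catalog name
    )

    $conn = New-Object System.Data.SqlClient.SqlConnection "Server=$SqlInstance;Database=$CatalogDatabase;Integrated Security=True;TrustServerCertificate=True;"

    try {
        $conn.Open()
        $cmd = $conn.CreateCommand()
        $cmd.CommandText = New-ExecutionLogQuery
        $table = New-Object System.Data.DataTable
        $table.Load($cmd.ExecuteReader())
        return $table.Rows
    }
    finally {
        $conn.Close()
    }
}</code></pre>

<p class="wp-block-paragraph">Calling Get-ReportExecutionLog with the name of the instance that hosts the catalog database and piping the result to Format-Table gives a quick view of who opened which report, and with which status.</p>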



<p class="wp-block-paragraph">The Windows Event Viewer showed these errors at the time we tried to query the report:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="146" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-25-1024x146.png" alt="" class="wp-image-44404" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-25-1024x146.png 1024w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-25-300x43.png 300w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-25-768x109.png 768w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-25.png 1348w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<h2 class="wp-block-heading" id="h-more-details-about-the-first-errors">More details about the first errors</h2>



<p class="wp-block-paragraph">We have two groups of error messages that seem to point in different directions. In reality, the first group is not very useful: although those messages refer to Analysis Services, the report was not connecting to an external SSAS instance. They were raised by the Analysis Services engine embedded in Power BI Report Server, not by a standalone SQL Server Analysis Services instance.</p>



<p class="wp-block-paragraph">Power BI Report Server may report an Analysis Services-related error even when the report does not connect to an external SSAS instance. This is because PBIRS uses an internal Analysis Services engine to host and execute the Power BI semantic model behind the report. In DirectQuery mode, the data remains in SQL Server, but the report model, metadata, relationships, measures, and DAX queries are still processed through this internal engine.</p>



<p class="wp-block-paragraph">When a user opens the report, PBIRS asks this local Analysis Services process to load the model and generate the queries sent to SQL Server.</p>



<p class="wp-block-paragraph">Therefore, if the internal engine fails while loading the model, validating metadata, or connecting to the SQL Server data source, the error may mention Analysis Services. This does not mean that the report is connected to a standalone SSAS instance.</p>



<h2 class="wp-block-heading" id="h-more-details-about-the-second-errors">More details about the second errors</h2>



<p class="wp-block-paragraph">It was the second error that pointed us in the right direction. After looking at it more closely, we started considering connection encryption and certificates. This problem is documented, and several solutions are available.</p>



<p class="wp-block-paragraph">Indeed, the SQL Server instance queried to retrieve the data did not have a certificate issued by a trusted certificate authority. It was using a self-generated certificate.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="832" height="257" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-24.png" alt="" class="wp-image-44403" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-24.png 832w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-24-300x93.png 300w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-24-768x237.png 768w" sizes="auto, (max-width: 832px) 100vw, 832px" /></figure>



<p class="wp-block-paragraph">This can lead to errors such as the ones mentioned above, or errors like the following:</p>



<p class="wp-block-paragraph">Microsoft SQL: A connection was successfully established with the server, but then an error occurred during the login process. Provider: SSL Provider, error: 0 &#8211; The certificate chain was issued by an authority that is not trusted.</p>



<h2 class="wp-block-heading" id="h-solutions">Solutions</h2>



<p class="wp-block-paragraph">We had at least three options to resolve this issue:</p>



<ul class="wp-block-list">
<li>Change the connection mode to Import</li>



<li>Install a certificate issued by a trusted certificate authority; however, this would represent a major change</li>



<li>Create a new environment variable on the Power BI Report Server</li>
</ul>



<p class="wp-block-paragraph">The client chose the easiest solution to implement: creating the corresponding environment variable.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="833" height="538" src="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-23.png" alt="" class="wp-image-44406" srcset="https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-23.png 833w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-23-300x194.png 300w, https://www.dbi-services.com/blog/wp-content/uploads/sites/2/2026/05/image-23-768x496.png 768w" sizes="auto, (max-width: 833px) 100vw, 833px" /></figure>



<p class="wp-block-paragraph">We then restarted the corresponding Power BI Report Server service and this resolved the issue.</p>
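
<p class="wp-block-paragraph">For reference, here is a minimal sketch of that change in PowerShell. The text above does not name the variable; this sketch assumes it is PBI_SQL_TRUSTED_SERVERS, the variable described in the Microsoft documentation listed in the references, with a comma-separated list of server names as its value. The function names and the service name PowerBIReportServer are also assumptions to adapt to your environment.</p>

<pre class="wp-block-code"><code>function Join-TrustedServers {
    param(&#091;string&#091;]]$Servers)

    # Trim entries, drop empty ones and duplicates, then join with commas.
    $clean = @($Servers | ForEach-Object { $_.Trim() } | Where-Object { $_ } | Select-Object -Unique)
    return ($clean -join ",")
}

function Set-PbiTrustedSqlServers {
    param(&#091;string&#091;]]$Servers)

    $value = Join-TrustedServers -Servers $Servers

    # Machine scope so that the report server service sees the variable (requires elevation).
    &#091;Environment]::SetEnvironmentVariable("PBI_SQL_TRUSTED_SERVERS", $value, "Machine")

    # Assumed service name; the service only picks up the variable after a restart.
    Restart-Service -Name "PowerBIReportServer"
}</code></pre>

<p class="wp-block-paragraph">Set-PbiTrustedSqlServers would be called from an elevated session with the names of the SQL Server instances whose certificates should be trusted. Listing only the servers that are actually needed keeps the exception as narrow as possible.</p>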



<h2 class="wp-block-heading" id="h-references">References</h2>



<p class="wp-block-paragraph"><a href="https://learn.microsoft.com/en-us/power-bi/report-server/scheduled-refresh-troubleshoot">https://learn.microsoft.com/en-us/power-bi/report-server/scheduled-refresh-troubleshoot</a></p>



<p class="wp-block-paragraph"><a href="https://learn.microsoft.com/en-us/power-query/connectors/sql-server#sql-server-certificate-isnt-trusted-on-the-client-power-bi-desktop-or-on-premises-data-gateway">https://learn.microsoft.com/en-us/power-query/connectors/sql-server#sql-server-certificate-isnt-trusted-on-the-client-power-bi-desktop-or-on-premises-data-gateway</a></p>



<p class="wp-block-paragraph">Thank you. <a href="https://www.linkedin.com/in/amine-haloui-76968056/">Amine Haloui</a></p>
<p>The post <a href="https://www.dbi-services.com/blog/a-misleading-ssas-error-in-power-bi-report-server-when-using-directquery-mode/">A Misleading SSAS Error in Power BI Report Server When Using DirectQuery Mode</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.dbi-services.com/blog/a-misleading-ssas-error-in-power-bi-report-server-when-using-directquery-mode/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>PostgreSQL 19: Dynamically adjust the I/O worker pool</title>
		<link>https://www.dbi-services.com/blog/postgresql-19-dynamically-adjust-the-i-o-worker-pool/</link>
					<comments>https://www.dbi-services.com/blog/postgresql-19-dynamically-adjust-the-i-o-worker-pool/#respond</comments>
		
		<dc:creator><![CDATA[Daniel Westermann]]></dc:creator>
		<pubDate>Wed, 13 May 2026 05:12:15 +0000</pubDate>
				<category><![CDATA[Database Administration & Monitoring]]></category>
		<category><![CDATA[Database management]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<guid isPermaLink="false">https://www.dbi-services.com/blog/?p=44393</guid>

					<description><![CDATA[<p>When PostgreSQL 18 was released last year one of the major features was the introduction of the asynchronous I/O subsystem. The main configuration parameter for this was (and still is) io_method, which can be &#8220;worker&#8221; (the default), io_uring or sync (the old behavior). If you opted for &#8220;workers&#8221; the number of those workers is controlled [&#8230;]</p>
<p>The post <a href="https://www.dbi-services.com/blog/postgresql-19-dynamically-adjust-the-i-o-worker-pool/">PostgreSQL 19: Dynamically adjust the I/O worker pool</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">When <a href="https://www.postgresql.org/docs/current/release-18.html#RELEASE-18-CHANGES" target="_blank" rel="noreferrer noopener">PostgreSQL 18 was released</a> last year, one of the major features was the <a href="https://www.dbi-services.com/blog/postgresql-18-support-for-asynchronous-i-o/" target="_blank" rel="noreferrer noopener">introduction of the asynchronous I/O subsystem</a>. The main configuration parameter for this was (and still is) <a href="https://www.postgresql.org/docs/18/runtime-config-resource.html#GUC-IO-METHOD" target="_blank" rel="noreferrer noopener">io_method</a>, which can be &#8220;worker&#8221; (the default), <a href="https://en.wikipedia.org/wiki/Io_uring" target="_blank" rel="noreferrer noopener">io_uring</a> or &#8220;sync&#8221; (the old behavior). If you opted for &#8220;worker&#8221;, the number of those workers is controlled by &#8220;<a href="https://www.postgresql.org/docs/18/runtime-config-resource.html#GUC-IO-WORKERS" target="_blank" rel="noreferrer noopener">io_workers</a>&#8221;, and the default for this is 3. PostgreSQL 19 will most probably change how many of those workers are launched: instead of using the static value of &#8220;io_workers&#8221;, workers are launched dynamically from a predefined pool.</p>



<p class="wp-block-paragraph">The configuration parameter &#8220;io_workers&#8221; is gone, and four new parameters show up to control the pool:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; highlight: [1]; title: ; notranslate">
postgres=# \dconfig io_*work*
 List of configuration parameters
         Parameter         | Value 
---------------------------+-------
 io_max_workers            | 8
 io_min_workers            | 2
 io_worker_idle_timeout    | 1min
 io_worker_launch_interval | 100ms
(4 rows)
</pre></div>


<p class="wp-block-paragraph">&#8220;io_min_workers&#8221; (as the name implies) controls how many workers are available by default, which is two:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; highlight: [1]; title: ; notranslate">
postgres@:/home/postgres/ &#x5B;DEV] ps -ef | grep postgres | grep worker | grep -v grep
postgres    8564    8562  0 06:34 ?        00:00:00 postgres: pgdev: io worker 0
postgres    8565    8562  0 06:34 ?        00:00:00 postgres: pgdev: io worker 1
</pre></div>
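

<p class="wp-block-paragraph">The workers can also be counted from inside the database instead of with ps. This is only a sketch: it assumes the I/O workers are listed in pg_stat_activity with the backend type &#8220;io worker&#8221; (which is how PostgreSQL 18 reports them) and that psql can connect with its defaults. The function name is made up for this example:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
# Hypothetical helper: count the I/O workers that are currently running.
# Assumption: the workers show up in pg_stat_activity as backend_type 'io worker'.
count_io_workers() {
  # -X: skip psqlrc, -A: unaligned output, -t: tuples only
  psql -X -At -c "select count(*) from pg_stat_activity where backend_type = 'io worker'"
}

# Usage: count_io_workers
</pre></div>


<p class="wp-block-paragraph">Calling this before and during a large scan should show the number changing in the same way the ps output does.</p>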


<p class="wp-block-paragraph">&#8220;io_max_workers&#8221; (again, as the name implies) controls the maximum number of worker processes that can be launched for the whole instance.</p>



<p class="wp-block-paragraph">To see the dynamic startup of workers in action, let&#8217;s create a simple table containing twenty million rows:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; highlight: [1,3]; title: ; notranslate">
postgres=# create table t ( a int, b text, c timestamptz );
CREATE TABLE
postgres=# insert into t select i, i::text, now() from generate_series(1,20000000) i;
INSERT 0 20000000
</pre></div>


<p class="wp-block-paragraph">While watching the workers in a separate session:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; highlight: [1]; title: ; notranslate">
postgres@:/home/postgres/ &#x5B;DEV] watch &quot;ps -ef | grep postgres | grep worker | grep -v grep&quot;

Every 2.0s: ps -ef | grep postgres | grep worker | grep -v grep               pgbox.it.dbi-services.com: 06:52:20 AM
                                                                                                       in 0.022s (0)
postgres    8564    8562  0 06:34 ?        00:00:00 postgres: pgdev: io worker 0
postgres    8565    8562  0 06:34 ?        00:00:00 postgres: pgdev: io worker 1
</pre></div>


<p class="wp-block-paragraph">&#8230; and doing a count(*) over the whole table in session one:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; highlight: [1]; title: ; notranslate">
postgres=# select count(*) from t;
  count   
----------
 20000000
(1 row)
</pre></div>


<p class="wp-block-paragraph">&#8230; you&#8217;ll notice that an additional worker (io worker 2) shows up in the second session watching the processes (you may have to play a bit with the number of rows, depending on your PostgreSQL configuration):</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; highlight: [6]; title: ; notranslate">
Every 2.0s: ps -ef | grep postgres | grep worker | grep -v grep               pgbox.it.dbi-services.com: 07:02:40 AM
                                                                                                       in 0.018s (0)
postgres    8564    8562  0 06:34 ?        00:00:02 postgres: pgdev: io worker 0
postgres    8565    8562  0 06:34 ?        00:00:00 postgres: pgdev: io worker 1
postgres   11914    8562  0 07:02 ?        00:00:00 postgres: pgdev: io worker 2
</pre></div>


<p class="wp-block-paragraph">Once this additional worker has been idle for one minute, it disappears and we&#8217;re back to two worker processes:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; highlight: [1]; title: ; notranslate">
Every 2.0s: ps -ef | grep postgres | grep worker | grep -v grep               pgbox.it.dbi-services.com: 07:04:24 AM
                                                                                                       in 0.020s (0)
postgres    8564    8562  0 06:34 ?        00:00:02 postgres: pgdev: io worker 0
postgres    8565    8562  0 06:34 ?        00:00:00 postgres: pgdev: io worker 1
</pre></div>


<p class="wp-block-paragraph">This is controlled by &#8220;io_worker_idle_timeout&#8221;, which defaults to one minute.</p>



<p class="wp-block-paragraph">The remaining configuration knob is &#8220;io_worker_launch_interval&#8221;. This is the interval at which additional workers can be launched, so that not too many workers are started at once.</p>
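

<p class="wp-block-paragraph">Whether changing one of these parameters needs a reload or a restart is not covered here, but pg_settings can answer that. A small sketch, assuming psql connects with its defaults; the helper name is hypothetical:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
# Hypothetical helper: list the I/O worker parameters together with their
# unit and the context in which they can be changed.
show_io_worker_settings() {
  psql -X -c "select name, setting, unit, context from pg_settings where name like 'io%worker%' order by name"
}

# Usage: show_io_worker_settings
</pre></div>


<p class="wp-block-paragraph">The context column tells you if a parameter can be changed with a reload (sighup) or only with a restart (postmaster).</p>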



<p class="wp-block-paragraph">This will make tuning the workers easier, compared to PostgreSQL 18. Again, thanks to all involved, the commit is <a href="https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=d1c01b79d4ae90e52bf9db9c05c9de17b7313e85">here</a>.</p>



<p>The article <a href="https://www.dbi-services.com/blog/postgresql-19-dynamically-adjust-the-i-o-worker-pool/">PostgreSQL 19: Dynamically adjust the I/O worker pool</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.dbi-services.com/blog/postgresql-19-dynamically-adjust-the-i-o-worker-pool/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>PostgreSQL 19: pg_waldump can now read from archives</title>
		<link>https://www.dbi-services.com/blog/postgresql-19-pg_waldump-can-now-read-from-archives/</link>
					<comments>https://www.dbi-services.com/blog/postgresql-19-pg_waldump-can-now-read-from-archives/#respond</comments>
		
		<dc:creator><![CDATA[Daniel Westermann]]></dc:creator>
		<pubDate>Mon, 11 May 2026 04:48:04 +0000</pubDate>
				<category><![CDATA[Database Administration & Monitoring]]></category>
		<category><![CDATA[Database management]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<guid isPermaLink="false">https://www.dbi-services.com/blog/?p=44025</guid>

					<description><![CDATA[<p>When PostgreSQL 18 introduced the ability to verify tar based (and compressed) backups with pg_verifybackup there was one limitation: The verification of the WAL files in the tars (or compressed files) had to be skipped (--no-parse-wal) because pg_waldump in that version of PostgreSQL is not able to cope with that (and pg_waldump is used by [&#8230;]</p>
<p>The article <a href="https://www.dbi-services.com/blog/postgresql-19-pg_waldump-can-now-read-from-archives/">PostgreSQL 19: pg_waldump can now read from archives</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">When <a href="https://www.postgresql.org/docs/current/release-18.html#RELEASE-18-HIGHLIGHTS" target="_blank" rel="noreferrer noopener">PostgreSQL 18 introduced the ability to verify tar based (and compressed) backups with pg_verifybackup</a> there was one limitation: <a href="https://www.dbi-services.com/blog/postgresql-18-verify-tar-format-and-compressed-backups/" target="_blank" rel="noreferrer noopener">The verification of the WAL files in the tars (or compressed files) had to be skipped</a> (<code>--no-parse-wal</code>) because <a href="https://www.postgresql.org/docs/18/pgwaldump.html" target="_blank" rel="noreferrer noopener">pg_waldump</a> in that version of PostgreSQL is not able to cope with that (and pg_waldump is used by pg_verifybackup). This will change with PostgreSQL 19 because of this <a href="https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b15c1513984e6eafd264bf6e84a08549905621f1" target="_blank" rel="noreferrer noopener">commit</a>: &#8220;pg_waldump: Add support for reading WAL from tar archives&#8221;.</p>



<p class="wp-block-paragraph">This is maybe not a feature a lot of people have waited for but it makes two tasks a lot easier:</p>



<ul class="wp-block-list">
<li>As mentioned above: pg_verifybackup can now read WAL from tar and compressed files and can therefore do WAL verification</li>



<li>When you have WAL in a tar or compressed file and you know what you&#8217;re looking for, you do not need to manually extract those archives before using pg_waldump</li>
</ul>
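

<p class="wp-block-paragraph">For comparison, this is roughly what the manual way looks like with PostgreSQL 18 and earlier: unpack the archive into a scratch directory and point pg_waldump at that directory. It is only a sketch, the function name is made up, and it assumes the WAL segments sit at the top level of the tar file:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
# Hypothetical helper for PostgreSQL 18 and earlier:
# extract pg_wal.tar into a temporary directory, then run pg_waldump on it.
waldump_from_tar() {
  local tarfile="$1" start_lsn="$2" workdir status
  workdir=$(mktemp -d)
  tar -xf "$tarfile" -C "$workdir"
  pg_waldump --path="$workdir" --start="$start_lsn"
  status=$?
  # clean up the extracted segments, but keep the exit status of pg_waldump
  rm -rf "$workdir"
  return "$status"
}

# Usage: waldump_from_tar /path/to/pg_wal.tar START_LSN
</pre></div>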



<p class="wp-block-paragraph">To see that in action, one can create a tar or compressed backup with pg_basebackup:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; highlight: [1,2,3]; title: ; notranslate">
postgres@:/home/postgres/ &#x5B;pgdev] mkdir /var/tmp/dummy
postgres@:/home/postgres/ &#x5B;pgdev] pg_basebackup --checkpoint=fast --format=t --pgdata=/var/tmp/dummy
postgres@:/home/postgres/ &#x5B;pgdev] ls -la /var/tmp/dummy
total 128476
drwxr-xr-x. 1 postgres postgres        66 May 11 06:36 .
drwxrwxrwt. 1 root     root           762 May 11 06:33 ..
-rw-------. 1 postgres postgres    149515 May 11 06:36 backup_manifest
-rw-------. 1 postgres postgres 114619904 May 11 06:36 base.tar
-rw-------. 1 postgres postgres  16778752 May 11 06:36 pg_wal.tar
</pre></div>


<p class="wp-block-paragraph">Looking at the PostgreSQL log file while the backup is running gives us an LSN we can give to pg_waldump:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: plain; highlight: [3]; title: ; notranslate">
2026-05-11 06:36:18.397 CEST - 2 - 1731 -  - @ - 0LOG:  checkpoint complete: fast force wait: wrote 2 buffers (0.0%), wrote 3 SLRU buffers; 0 WAL file(s) added, 1 removed, 0 recycled; write=0.002 s, sync=0.005 s, total=0.019 s; sync files=4, longest=0.003 s, average=0.002 s; distance=16384 kB, estimate=16384 kB; lsn=0/0D000088, redo lsn=0/0D000028

postgres@:/home/postgres/ &#x5B;pgdev] pg_waldump --path=/var/tmp/dummy/pg_wal.tar -s &quot;0/0D000088&quot; 
rmgr: XLOG        len (rec/tot):    122/   122, tx:          0, lsn: 0/0D000088, prev 0/0D000050, desc: CHECKPOINT_ONLINE redo 0/0D000028; tli 1; prev tli 1; fpw true; wal_level replica; logical decoding false; xid 0:729; oid 16420; multi 1; offset 1; oldest xid 684 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 729; checksums on; online
rmgr: Standby     len (rec/tot):     54/    54, tx:          0, lsn: 0/0D000108, prev 0/0D000088, desc: RUNNING_XACTS nextXid 729 latestCompletedXid 728 oldestRunningXid 729; dbid: 0
rmgr: XLOG        len (rec/tot):     34/    34, tx:          0, lsn: 0/0D000140, prev 0/0D000108, desc: BACKUP_END 0/0D000028
rmgr: XLOG        len (rec/tot):     24/    24, tx:          0, lsn: 0/0D000168, prev 0/0D000140, desc: SWITCH 
pg_waldump: error: could not find WAL &quot;00000001000000000000000E&quot; in archive &quot;pg_wal.tar&quot;
</pre></div>
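

<p class="wp-block-paragraph">If the server log is not at hand, the backup manifest also records the WAL range the backup needs. The following is a sketch under the assumption that the manifest contains a &#8220;WAL-Ranges&#8221; entry with a &#8220;Start-LSN&#8221; key, as described in the backup manifest format documentation. The helper name is hypothetical:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
# Hypothetical helper: print the Start-LSN recorded in a backup_manifest.
# Assumption: the key is written as "Start-LSN": "value" (whitespace may vary).
manifest_start_lsn() {
  sed -n 's/.*"Start-LSN"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Usage: manifest_start_lsn /var/tmp/dummy/backup_manifest
</pre></div>


<p class="wp-block-paragraph">This is the LSN at which replay of the backup has to begin, so it is a reasonable starting point to pass to pg_waldump with -s.</p>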


<p class="wp-block-paragraph">This helps pg_verifybackup fully verify a backup (in previous versions you had to use <code>--no-parse-wal</code>):</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; highlight: [1]; title: ; notranslate">
postgres@:/home/postgres/ &#x5B;pgdev] pg_verifybackup --progress /var/tmp/dummy/
111933/111933 kB (100%) verified
backup successfully verified
</pre></div>


<p class="wp-block-paragraph">As usual, thanks to all involved.</p>
<p>The article <a href="https://www.dbi-services.com/blog/postgresql-19-pg_waldump-can-now-read-from-archives/">PostgreSQL 19: pg_waldump can now read from archives</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.dbi-services.com/blog/postgresql-19-pg_waldump-can-now-read-from-archives/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>PostgreSQL 19: Importing statistics from remote servers</title>
		<link>https://www.dbi-services.com/blog/postgresql-19-importing-statistics-from-remote-servers/</link>
					<comments>https://www.dbi-services.com/blog/postgresql-19-importing-statistics-from-remote-servers/#respond</comments>
		
		<dc:creator><![CDATA[Daniel Westermann]]></dc:creator>
		<pubDate>Mon, 20 Apr 2026 08:15:22 +0000</pubDate>
				<category><![CDATA[Database Administration & Monitoring]]></category>
		<category><![CDATA[Database management]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<guid isPermaLink="false">https://www.dbi-services.com/blog/?p=43948</guid>

					<description><![CDATA[<p>Usually we do not see many foreign data wrappers being used by our customers. Most of them use the foreign data wrapper for Oracle to fetch data from Oracle systems. Some of them use the foreign data wrapper for files but that&#8217;s mostly it. Only one (I am aware of) actually uses the foreign data [&#8230;]</p>
<p>The article <a href="https://www.dbi-services.com/blog/postgresql-19-importing-statistics-from-remote-servers/">PostgreSQL 19: Importing statistics from remote servers</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">Usually we do not see many foreign data wrappers being used by our customers. Most of them use the <a href="https://github.com/laurenz/oracle_fdw" target="_blank" rel="noreferrer noopener">foreign data wrapper for Oracle</a> to fetch data from Oracle systems. Some of them use the <a href="https://www.dbi-services.com/blog/external-tables-in-postgresql/">foreign data wrapper for files</a> but that&#8217;s mostly it. Only one (that I am aware of) actually uses the <a href="https://www.postgresql.org/docs/18/postgres-fdw.html" target="_blank" rel="noreferrer noopener">foreign data wrapper for PostgreSQL</a>, which obviously connects PostgreSQL to PostgreSQL. Some foreign data wrappers allow for collecting optimizer statistics on foreign tables, and the foreign data wrappers for Oracle and PostgreSQL are examples of this. These local statistics are better than nothing, but you need to take care that they are up to date, and for that you need a fresh copy of the statistics on the remote data. PostgreSQL 19 will come with a solution for that when it comes to the foreign data wrapper for PostgreSQL. Actually, the solution is not in the foreign data wrapper for PostgreSQL but in the underlying framework, and postgres_fdw can use that from version 19 on.</p>



<p class="wp-block-paragraph">For looking at this we need a simple setup, so we initialize two new PostgreSQL 19 clusters and connect them with postgres_fdw:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; highlight: [1,3,4,5,6,7,8,9,11,13,15,17,19,21]; title: ; notranslate">
postgres@:/home/postgres/ &#x5B;pgdev] initdb --version
initdb (PostgreSQL) 19devel
postgres@:/home/postgres/ &#x5B;pgdev] initdb --pgdata=/var/tmp/pg1
postgres@:/home/postgres/ &#x5B;pgdev] initdb --pgdata=/var/tmp/pg2
postgres@:/home/postgres/ &#x5B;pgdev] echo &quot;port=8888&quot; &gt;&gt; /var/tmp/pg1/postgresql.auto.conf 
postgres@:/home/postgres/ &#x5B;pgdev] echo &quot;port=8889&quot; &gt;&gt; /var/tmp/pg2/postgresql.auto.conf 
postgres@:/home/postgres/ &#x5B;pgdev] pg_ctl --pgdata=/var/tmp/pg1/ start
postgres@:/home/postgres/ &#x5B;pgdev] pg_ctl --pgdata=/var/tmp/pg2/ start
postgres@:/home/postgres/ &#x5B;pgdev] psql -p 8888 -c &quot;create extension postgres_fdw&quot;
CREATE EXTENSION
postgres@:/home/postgres/ &#x5B;pgdev] psql -p 8889 -c &quot;create table t ( a int, b text, c timestamptz )&quot;
CREATE TABLE
postgres@:/home/postgres/ &#x5B;pgdev] psql -p 8889 -c &quot;insert into t select i, md5(i::text), now() from generate_series(1,1000000) i&quot;
INSERT 0 1000000
postgres@:/home/postgres/ &#x5B;pgdev] psql -p 8888 -c &quot;create server srv_pg2 foreign data wrapper postgres_fdw options(port &#039;8889&#039;, dbname &#039;postgres&#039;)&quot;
CREATE SERVER
postgres@:/home/postgres/ &#x5B;pgdev] psql -p 8888 -c &quot;create user mapping for postgres server srv_pg2 options (user &#039;postgres&#039;, password &#039;postgres&#039;)&quot;
CREATE USER MAPPING
postgres@:/home/postgres/ &#x5B;pgdev] psql -p 8888 -c &quot;create foreign table ft (a int, b text, c timestamptz) server srv_pg2 options (schema_name &#039;public&#039;, table_name &#039;t&#039;)&quot;
CREATE FOREIGN TABLE
postgres@:/home/postgres/ &#x5B;pgdev] psql -p 8888 -c &quot;select count(*) from ft&quot;
  count  
---------
 1000000
(1 row)
</pre></div>


<p class="wp-block-paragraph">What we have now is one table in the cluster on port 8889 and this table is attached as a foreign table in the cluster on port 8888.</p>



<p class="wp-block-paragraph">We already have statistics on the source table in the cluster on port 8889:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; highlight: [1]; title: ; notranslate">
postgres@:/home/postgres/ &#x5B;pgdev] psql -p 8889 -c &quot;select reltuples::bigint from pg_class  where relname = &#039;t&#039;&quot;

 reltuples 
-----------
   1000000
(1 row)
</pre></div>


<p class="wp-block-paragraph">&#8230; but we do not have any statistics on the foreign table in the cluster on port 8888:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; highlight: [1]; title: ; notranslate">
postgres@:/home/postgres/ &#x5B;pgdev] psql -p 8888 -c &quot;select reltuples::bigint from pg_class  where relname = &#039;ft&#039;&quot;

 reltuples 
-----------
        -1

(1 row)
</pre></div>


<p class="wp-block-paragraph">Only after manually analyzing the foreign table do the statistics show up:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; highlight: [1,3]; title: ; notranslate">
postgres@:/home/postgres/ &#x5B;DEV] psql -p 8888 -c &quot;analyze ft&quot;
ANALYZE
postgres@:/home/postgres/ &#x5B;DEV] psql -p 8888 -c &quot;select reltuples::bigint from pg_class  where relname = &#039;ft&#039;&quot;

 reltuples 
-----------
   1000000
(1 row)
</pre></div>


<p class="wp-block-paragraph">The issue that can arise with these local statistics is that they become outdated when the source table is modified:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; highlight: [1,3,10]; title: ; notranslate">
postgres@:/home/postgres/ &#x5B;pgdev] psql -p 8889 -c &quot;insert into t select i, md5(i::text), now() from generate_series(1000001,2000000) i&quot;
INSERT 0 1000000
postgres@:/home/postgres/ &#x5B;DEV] psql -p 8889 -c &quot;select reltuples::bigint from pg_class  where relname = &#039;t&#039;&quot;

 reltuples 
-----------
   2000000
(1 row)

postgres@:/home/postgres/ &#x5B;DEV] psql -p 8888 -c &quot;select reltuples::bigint from pg_class  where relname = &#039;ft&#039;&quot;

 reltuples 
-----------
   1000000
(1 row)
</pre></div>


<p class="wp-block-paragraph">As you can see, the row counts do not match anymore. Once the local statistics are gathered again, we have the same picture on both sides:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; highlight: [1,3]; title: ; notranslate">
postgres@:/home/postgres/ &#x5B;DEV] psql -p 8888 -c &quot;analyze ft&quot;
ANALYZE
postgres@:/home/postgres/ &#x5B;DEV] psql -p 8888 -c &quot;select reltuples::bigint from pg_class  where relname = &#039;ft&#039;&quot;

 reltuples 
-----------
   2000000
(1 row)
</pre></div>


<p class="wp-block-paragraph">One way to avoid this issue even before PostgreSQL 19 is to tell postgres_fdw to ask the remote server for its estimates at planning time (it issues EXPLAIN commands on the remote side) instead of relying on the local statistics:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; highlight: [1]; title: ; notranslate">
postgres@:/home/postgres/ &#x5B;pgdev] psql -p 8888 -c &quot;alter foreign table ft options ( use_remote_estimate &#039;true&#039; )&quot;
</pre></div>


<p class="wp-block-paragraph">In this case the estimates come from the remote server rather than from the local statistics, but of course this comes with the overhead of the additional remote EXPLAIN calls while planning.</p>
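

<p class="wp-block-paragraph">To check what the option changes for a given query, you can look at the planner&#8217;s row estimate before and after setting it. A minimal sketch, reusing the port and the foreign table from above; the helper name and the predicate are only for illustration:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
# Hypothetical helper: show the plan (including the estimated rows)
# for a query against the foreign table ft in the cluster on port 8888.
explain_ft() {
  psql -p 8888 -X -c "explain select * from ft where a between 1 and 1000"
}

# Usage: run explain_ft, change use_remote_estimate, run explain_ft again
</pre></div>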



<p class="wp-block-paragraph">From PostgreSQL 19 there is another option:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; highlight: [1]; title: ; notranslate">
postgres@:/home/postgres/ &#x5B;pgdev] psql -p 8888 -c &quot;alter foreign table ft options ( restore_stats &#039;true&#039; )&quot;
ALTER FOREIGN TABLE
</pre></div>


<p class="wp-block-paragraph">This option tells postgres_fdw to import the statistics from the remote side and store them locally when the foreign table is analyzed. If the import fails, the statistics are collected locally as before. The <a href="https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=28972b6fc3dcd1296e844246b635eddfa29c38e1" target="_blank" rel="noreferrer noopener">commit message</a> nicely explains this:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: plain; title: ; notranslate">
Add support for importing statistics from remote servers.

Add a new FDW callback routine that allows importing remote statistics
for a foreign table directly to the local server, instead of collecting
statistics locally.  The new callback routine is called at the beginning
of the ANALYZE operation on the table, and if the FDW failed to import
the statistics, the existing callback routine is called on the table to
collect statistics locally.

Also implement this for postgres_fdw.  It is enabled by &quot;restore_stats&quot;
option both at the server and table level.  Currently, it is the user&#039;s
responsibility to ensure remote statistics to import are up-to-date, so
the default is false.
</pre></div>
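

<p class="wp-block-paragraph">Because the commit message says the option also works at the server level, it could be set once for all foreign tables of a server. The following is a sketch based on that statement only: the helper name is made up, and the behavior may still change before PostgreSQL 19 is released:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: bash; title: ; notranslate">
# Hypothetical helper: enable the statistics import for a whole foreign server,
# then analyze one foreign table so the import is attempted.
# Assumption: restore_stats is accepted as a server option, as the commit message states.
import_remote_stats() {
  local port="$1" server="$2" table="$3"
  # "add" fails if the option is already set on the server, use "set" in that case
  psql -p "$port" -X -c "alter server $server options ( add restore_stats 'true' )"
  psql -p "$port" -X -c "analyze verbose $table"
}

# Usage: import_remote_stats 8888 srv_pg2 ft
</pre></div>


<p class="wp-block-paragraph">As the commit message points out, the remote statistics must be up to date for this to be useful, so the source table needs to be analyzed regularly on the remote side.</p>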


<p class="wp-block-paragraph">As usual, thanks to all involved.</p>
<p>The article <a href="https://www.dbi-services.com/blog/postgresql-19-importing-statistics-from-remote-servers/">PostgreSQL 19: Importing statistics from remote servers</a> first appeared on <a href="https://www.dbi-services.com/blog">dbi Blog</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.dbi-services.com/blog/postgresql-19-importing-statistics-from-remote-servers/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
