{"id":43437,"date":"2026-03-15T17:17:27","date_gmt":"2026-03-15T16:17:27","guid":{"rendered":"https:\/\/www.dbi-services.com\/blog\/?p=43437"},"modified":"2026-03-15T17:17:29","modified_gmt":"2026-03-15T16:17:29","slug":"mariadb-12-3-binlog-inside-innodb","status":"publish","type":"post","link":"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/","title":{"rendered":"MariaDB 12.3 &#8211; Binlog Inside InnoDB"},"content":{"rendered":"\n<p>MariaDB has been quite active lately, the version 11.8 was already quite a step forward and apart from the changes on LTS schedule and the EOL durations, the RC 12.3 LTS is bringing some interesting changes to binlogs, pushing even further performance improvements and reliability. Instead of using the traditional flat file binlogs, they can be now integrated into InnoDB WALs. This is quite the change and will have some implications for production. Although this is quite new and there will be some optimizations in the future I guess, I figured we could already benchmark the performance gains and the pros and cons.<\/p>\n\n\n\n<p>Every MariaDB (and MySQL) production DBA has felt the pain of the classic binlog performance tax or replication management. Every committed transaction must cross a <strong>two-phase commit (2PC) boundary<\/strong> between InnoDB and the binary log \u2014 two separate, sequential <code>fsync()<\/code> calls per transaction group, regardless of what <code>innodb_flush_log_at_trx_commit<\/code> is set to. 
The binlog has always been a flat file, written independently of InnoDB&#8217;s Write-Ahead Log (WAL), which means:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Two durability paths to synchronize on every commit<\/li>\n\n\n\n<li>Complex crash recovery logic to reconcile InnoDB state with binlog state<\/li>\n\n\n\n<li><code>sync_binlog=1<\/code> + <code>innodb_flush_log_at_trx_commit=1<\/code> = two fsyncs, always<\/li>\n\n\n\n<li>Performance degrades steeply under high concurrency, especially on cloud disks where <code>fsync<\/code> latency is measured in milliseconds<\/li>\n<\/ul>\n\n\n\n<p>MariaDB 12.3 ships a fundamental architectural answer to this problem, tracked under <strong>MDEV-34705<\/strong> and authored by Kristian Nielsen: the binary log is now stored <em>inside InnoDB tablespaces<\/em>, using InnoDB&#8217;s own Write-Ahead Log for durability. This is not an incremental tuning \u2014 it is a redesign of the commit pipeline.<\/p>\n\n\n\n<p>To understand what changed, you need to understand what the classic binlog commit path looks like:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>BEGIN transaction\n  \u2192 InnoDB: write undo log (redo log)\n  \u2192 InnoDB: prepare (XA prepare, writes to redo log)\n  \u2192 Binlog: write event to binlog file\n  \u2192 Binlog: fsync() &#091;sync_binlog=1]\n  \u2192 InnoDB: commit (writes commit record to redo log)\n  \u2192 InnoDB: fsync() &#091;innodb_flush_log_at_trx_commit=1]\nCOMMIT\n<\/code><\/pre>\n\n\n\n<p><strong>The problem: two separate <code>fsync()<\/code> calls,<\/strong> two separate file paths, and InnoDB cannot commit without first knowing that the binlog has been durably written. If the server crashes between the binlog write and the InnoDB commit, recovery must scan the binlog to find prepared-but-not-committed transactions. 
This is expensive both in steady-state (latency) and at recovery time (scan time).<\/p>\n\n\n\n<p><strong>Group commit mitigates this partially <\/strong>\u2014 multiple transactions can be flushed together \u2014 but the fundamental 2PC overhead remains.<\/p>\n\n\n\n<p><strong>With <code>binlog_storage_engine=innodb<\/code>, the binlog is no longer a separate flat file<\/strong>. It lives inside the InnoDB storage engine, in InnoDB tablespace files with the <code>.ibb<\/code> extension, pre-allocated at <code>max_binlog_size<\/code> (default 1 GB each). The commit path becomes:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>BEGIN transaction\n  \u2192 InnoDB: write undo log + binlog event (single redo log)\n  \u2192 InnoDB: commit (one fsync if innodb_flush_log_at_trx_commit=1)\nCOMMIT\n<\/code><\/pre>\n\n\n\n<p>The two-phase commit between binlog and engine is <strong>eliminated<\/strong>. The binlog and data changes are atomic \u2014 they land or fail together in the InnoDB WAL.<\/p>\n\n\n\n<p>This has two dramatic consequences:<\/p>\n\n\n\n<p><strong>1. <code>innodb_flush_log_at_trx_commit=0<\/code> or <code>=2<\/code> becomes safe.<\/strong> Because crash recovery is now handled entirely by InnoDB, you can run with <code>innodb_flush_log_at_trx_commit=0<\/code> (no fsync, OS buffer only) and still have a <em>consistent<\/em> binlog after a crash. Previously, this setting was dangerous for replication because the binlog and InnoDB could diverge after a crash. With the new model, they cannot diverge \u2014 they are the same file.<\/p>\n\n\n\n<p><strong>2. With <code>innodb_flush_log_at_trx_commit=1<\/code>, one fsync replaces two.<\/strong> The most common production setting now only costs a single WAL fsync. 
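<\/p>\n\n\n\n<p>A quick back-of-the-envelope sketch makes the arithmetic concrete. Assuming a hypothetical 1 ms <code>fsync<\/code> latency (typical of premium cloud disks), the ceiling for fully serialized commit groups looks like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Hypothetical 1 ms fsync latency; ceilings for serialized commit groups\nFSYNC_MS=1\necho FILE binlog, 2 fsyncs: $(( 1000 \/ (2 * FSYNC_MS) )) commit groups per second\necho InnoDB binlog, 1 fsync: $(( 1000 \/ FSYNC_MS )) commit groups per second\n<\/code><\/pre>\n\n\n\n<p>Two serialized fsyncs halve the ceiling before group commit batching even enters the picture.<\/p>\n\n\n\n<p>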
Group commit opportunities also improve because the binlog is no longer a separate bottleneck.<\/p>\n\n\n\n<p>(You might want to check the documentation for this parameter <a href=\"https:\/\/mariadb.com\/docs\/server\/server-usage\/storage-engines\/innodb\/innodb-system-variables#innodb_flush_log_at_trx_commit\" id=\"https:\/\/mariadb.com\/docs\/server\/server-usage\/storage-engines\/innodb\/innodb-system-variables#innodb_flush_log_at_trx_commit\">link<\/a>.)<\/p>\n\n\n\n<p>New binlog files use the <code>.ibb<\/code> extension and are pre-allocated. Here is what the binlog directory looks like side by side:<\/p>\n\n\n\n<p><strong>Classic FILE binlog:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ ls -lah \/var\/lib\/mysql\/binlog\/\ndrwx------ 2 mysql mysql 4.0K  #binlog_cache_files\n-rw-rw---- 1 mysql mysql 1.9K  mysqld-bin.000001\n-rw-rw---- 1 mysql mysql 4.0K  mysqld-bin.000001.idx\n-rw-rw---- 1 mysql mysql 1.4K  mysqld-bin.000002\n-rw-rw---- 1 mysql mysql    0  mysqld-bin.000002.idx\n-rw-rw---- 1 mysql mysql   80  mysqld-bin.index\n<\/code><\/pre>\n\n\n\n<p><strong>InnoDB binlog:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ ls -lah \/var\/lib\/mysql\/binlog\/\n-rw-rw---- 1 mysql mysql 1.0G  binlog-000000.ibb\n-rw-rw---- 1 mysql mysql 1.0G  binlog-000001.ibb\n<\/code><\/pre>\n\n\n\n<p>The difference is immediately striking: the <code>.ibb<\/code> files are <strong>pre-allocated at <code>max_binlog_size<\/code><\/strong> (default 1 GB). There is no <code>.index<\/code> file, no <code>.idx<\/code> GTID index files, no <code>.state<\/code> file. GTID state is periodically written as <em>state records<\/em> inside the binlog itself, controlled by <code>--innodb-binlog-state-interval<\/code> (bytes between state records). 
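<\/p>\n\n\n\n<p>Putting the pieces together, opting in might look like this in <code>my.cnf<\/code> (a sketch: the option names are the ones discussed in this post, the state-interval value is purely illustrative):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>[mariadbd]\nlog_bin\nbinlog_storage_engine        = innodb   # opt-in, classic FILE remains the default\nmax_binlog_size              = 1G       # each .ibb file is pre-allocated at this size\ninnodb_binlog_state_interval = 1M       # illustrative: bytes between GTID state records\ngtid_strict_mode             = ON       # recommended with GTID-based replication\n<\/code><\/pre>\n\n\n\n<p>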
GTID position recovery scans from the last state record forward.<\/p>\n\n\n\n<p>Another subtle difference visible in the <code>mariadb-binlog<\/code> output: all InnoDB binlog events show <code>end_log_pos 0<\/code> (position tracking is handled by InnoDB internally), whereas FILE binlog events show actual byte offsets (<code>end_log_pos 257<\/code>, <code>end_log_pos 330<\/code>, etc.).<\/p>\n\n\n\n<p>Events can span multiple <code>.ibb<\/code> files, and parts of the same event can be interleaved across files. <code>mariadb-binlog<\/code> coalesces them transparently. For correct output across files, pass all files at once in order.<\/p>\n\n\n\n<p>This is a significant improvement over the old model: <code>mariadb-backup<\/code> now <strong>includes the binlog in the backup by default<\/strong>, transactionally consistent with the data. The backed-up server is not blocked during binlog copy. A restored backup can be turned into a replica using <code>CHANGE MASTER ... MASTER_DEMOTE_TO_SLAVE=1<\/code> directly \u2014 no separate binlog position reconciliation needed.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-operational-impact-gtid-replica-resync-and-split-brain-recovery\">Operational Impact: GTID, Replica Resync, and Split-Brain Recovery<\/h2>\n\n\n\n<p>This is arguably the most impactful day-to-day change for DBAs, apart from the raw TPS gains. With <code>innodb<\/code> as the binlog storage engine, GTID is the only supported replication mode. I still run into a lot of MariaDB 10.11 installations, so some DBAs may not be working with GTID on a daily basis yet. As a refresher before the next part, here is a quick reminder of the GTID capabilities that have been around for 12 years already&#8230;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-master-use-gtid-slave-pos-actually-means\">What <code>MASTER_USE_GTID = slave_pos<\/code> Actually Means<\/h3>\n\n\n\n<p>A common point of confusion when reading MariaDB documentation for the first time: in the <code>CHANGE MASTER TO<\/code> statement, <code>slave_pos<\/code> looks like a placeholder you are supposed to fill in. It is not. It is a <strong>literal enum keyword<\/strong> \u2014 one of exactly three accepted values:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>MASTER_USE_GTID = no           -- classic file\/offset mode, GTID disabled\nMASTER_USE_GTID = slave_pos    -- use @@gtid_slave_pos (what this replica last received)\nMASTER_USE_GTID = current_pos  -- use @@gtid_current_pos (slave_pos + any local writes)\n<\/code><\/pre>\n\n\n\n<p><code>slave_pos<\/code> is not a value you provide \u2014 it is an instruction telling MariaDB to read the starting replication position from the server&#8217;s own <code>@@gtid_slave_pos<\/code> system variable, which was populated automatically during the backup restore. You are providing an <strong>instruction<\/strong>, not data. 
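<\/p>\n\n\n\n<p>Before issuing <code>CHANGE MASTER TO<\/code>, you can check what that instruction will resolve to on the restored replica (values here are illustrative):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>-- On the freshly restored replica\nSELECT @@gtid_slave_pos;    -- e.g. 0-1-10039, populated by the restore\nSELECT @@gtid_current_pos;  -- slave_pos plus any local writes\n<\/code><\/pre>\n\n\n\n<p>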
The actual position is determined by the engine, not by you.<\/p>\n\n\n\n<p>For completeness: MariaDB accepts the keyword unquoted (unlike MySQL&#8217;s tendency to quote string enums), which is why it looks like a variable name in documentation examples.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-replica-resync-before-and-after-the-concrete-difference\">Replica Resync Before and After: The Concrete Difference<\/h3>\n\n\n\n<p>This is best understood by comparing the full procedure side by side.<\/p>\n\n\n\n<p><strong>With FILE binlog (classic):<\/strong><\/p>\n\n\n\n<p>Problem: two independent sources of truth to reconcile.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Take fresh backup with mariadb-backup<\/li>\n\n\n\n<li>Read xtrabackup_binlog_info from the backup:<br>\u2192 binlog.000042 position 198472910<br>BUT: this position may not match what InnoDB actually committed<br>(MDEV-21611 \u2014 mariabackup does not always update the InnoDB<br>binlog position during prepare, causing mismatch)<\/li>\n\n\n\n<li>Restore the backup on the replica<\/li>\n\n\n\n<li>Manually declare:<br>CHANGE MASTER TO<br>MASTER_LOG_FILE = &#8216;binlog.000042&#8217;, \u2190 you looked this up<br>MASTER_LOG_POS = 198472910; \u2190 you looked this up<\/li>\n\n\n\n<li>START SLAVE;<\/li>\n\n\n\n<li>Watch for errors, validate row counts, hope the position was right<\/li>\n<\/ol>\n\n\n\n<p>The fragility: step 2\/4 requires you to provide specific <strong>data<\/strong> (a filename string and a byte offset integer). If that data is wrong by even one transaction \u2014 which MDEV-21611 demonstrates can happen \u2014 the replica either re-applies events it already has, or misses events entirely. This produces silent data drift, not an immediate error.<\/p>\n\n\n\n<p><strong>With InnoDB binlog + GTID:<\/strong><\/p>\n\n\n\n<p>Problem: eliminated. 
One source of truth.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Take fresh backup with mariadb-backup<br>\u2192 .ibb binlog files included automatically, consistent with data<\/li>\n\n\n\n<li>Restore the backup on the replica<br>\u2192 @@gtid_slave_pos populated from InnoDB state automatically<\/li>\n\n\n\n<li>Declare the replica:<br>CHANGE MASTER TO<br>MASTER_HOST = &#8216;primary&#8217;,<br>MASTER_USER = &#8216;replicator&#8217;,<br>MASTER_PASSWORD = &#8216;replpass&#8217;,<br>MASTER_USE_GTID = slave_pos, \u2190 instruction, not a value<br>MASTER_DEMOTE_TO_SLAVE = 1; \u2190 folds local writes into GTID pos<\/li>\n\n\n\n<li>START SLAVE;<br>\u2192 replica announces its GTID set to the primary<br>\u2192 primary streams from that point forward<br>\u2192 done<\/li>\n<\/ol>\n\n\n\n<p>You never touch a file name or a byte offset. The position is embedded in the InnoDB state from the moment the backup was consistent.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-split-brain-recovery\">Split-Brain Recovery <\/h3>\n\n\n\n<p>Split-brain is where this matters most. Consider a scenario where the primary and replica became isolated:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Primary:  committed GTIDs 0-1-1 through 0-1-10042\nReplica:  applied GTIDs   0-1-1 through 0-1-10039  (was lagging at split)\n          then received 3 rogue writes (briefly promoted to primary)\n          now has:  0-1-10040, 0-1-10041, 0-1-10042  (divergent, different transactions)\n<\/code><\/pre>\n\n\n\n<p><strong>With FILE binlog<\/strong>, reconciling this required manually scanning both binlog files to find the exact divergence point \u2014 line by line, event by event. Many teams simply wiped the replica and re-provisioned from scratch to be safe. 
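<\/p>\n\n\n\n<p>The manual hunt typically looked something like this (a sketch; run on each server against its own last binlog file, names are illustrative):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Decode both binlogs to readable events, then diff them event by event\nmariadb-binlog -vv --base64-output=decode-rows mysqld-bin.000042 &gt; primary.events\nmariadb-binlog -vv --base64-output=decode-rows mysqld-bin.000042 &gt; replica.events\ndiff primary.events replica.events | less\n<\/code><\/pre>\n\n\n\n<p>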
Even experienced DBAs would spend hours on this.<\/p>\n\n\n\n<p><strong>With GTID<\/strong>, the divergence is immediately visible and unambiguous:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>-- On the drifted replica:\nSELECT @@gtid_slave_pos;\n-- Returns: 0-1-10042  (but these last 3 GTIDs are rogue, divergent from primary)\n\n-- On the primary:\nSELECT @@gtid_binlog_pos;\n-- Returns: 0-1-10042  (completely different transactions at 10040-10042)\n<\/code><\/pre>\n\n\n\n<p>To resync, declare the last known good position explicitly and restart:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>STOP SLAVE;\nRESET SLAVE;\nSET GLOBAL gtid_slave_pos = '0-1-10039';  -- last confirmed good GTID from primary\nSTART SLAVE;\n-- MariaDB streams 10040-10042 from the primary, no manual intervention\n<\/code><\/pre>\n\n\n\n<p>With <code>gtid_strict_mode = ON<\/code> (recommended), MariaDB will refuse to apply a GTID it has already seen with different content \u2014 you get an explicit error rather than silent data corruption. The divergence surface is precisely identified, not guessed at.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-master-demote-to-slave-1-the-safety-net-for-former-primaries\"><code>MASTER_DEMOTE_TO_SLAVE=1<\/code> \u2014 The Safety Net for Former Primaries<\/h3>\n\n\n\n<p>When resetting a server that was briefly acting as a primary (during split-brain, or when repurposing a former primary as a replica), its <code>@@gtid_binlog_pos<\/code> includes local writes that the current primary does not know about. Without reconciliation, <code>MASTER_USE_GTID = slave_pos<\/code> would start from <code>@@gtid_slave_pos<\/code>, which might be behind the local writes \u2014 creating a gap.<\/p>\n\n\n\n<p><code>MASTER_DEMOTE_TO_SLAVE = 1<\/code> handles this automatically: it computes the union of <code>@@gtid_slave_pos<\/code> and <code>@@gtid_binlog_pos<\/code>, sets that as the new <code>gtid_slave_pos<\/code>, and proceeds. 
In plain English: <em>&#8220;I may have had local writes \u2014 include them in my starting position declaration so I don&#8217;t replay things I already have.&#8221;<\/em><br><\/p>\n\n\n\n<p>Without MASTER_DEMOTE_TO_SLAVE=1:<br>&#8212; Risk: potential GTID gap or conflict if the server had local writes<br><br>With MASTER_DEMOTE_TO_SLAVE=1:<br>&#8212; MariaDB automatically: SET gtid_slave_pos = gtid_slave_pos \u222a gtid_binlog_pos<br>&#8212; Then connects to the primary from that unified position<br>&#8212; Safe regardless of the server&#8217;s previous role<\/p>\n\n\n\n<p>In summary, the GTID + InnoDB binlog combination reduces your split-brain\/resync runbook from a multi-step forensic procedure with manual position arithmetic to four commands:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>STOP SLAVE;\nSET GLOBAL gtid_slave_pos = '&lt;last_known_good_gtid&gt;';\nCHANGE MASTER TO MASTER_USE_GTID = slave_pos, MASTER_DEMOTE_TO_SLAVE = 1;\nSTART SLAVE;\n<\/code><\/pre>\n\n\n\n<p>For teams (like mine) that have been burned by binlog position mismatches in production, this operational simplification is a game changer, and with binlogs now stored in the InnoDB WAL the procedure becomes fully position-agnostic. Creating a replica or automating a full re-sync gets faster and simpler: mariadb-backup captures the data and the binlog atomically, and restoring the backup sets @@gtid_slave_pos automatically. No parsing, no conditionals, no position arithmetic: the same idempotent script works whether you are re-syncing after lag, a crash, or a (now even less likely) split-brain.  
<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-galera-cluster-and-innodb-cluster-what-changes-and-what-doesn-t\">Galera Cluster and InnoDB Cluster: What Changes (and What Doesn&#8217;t)<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-why-galera-is-blocked-and-why-it-s-non-trivial-to-fix\">Why Galera Is Blocked and Why It&#8217;s Non-Trivial to Fix<\/h3>\n\n\n\n<p>Understanding the block requires understanding what Galera actually does at commit time \u2014 and crucially, what it does <strong>not<\/strong> use the binlog for between cluster nodes.<\/p>\n\n\n\n<p>Galera nodes do not replicate by shipping binlog events to each other. The replication mechanism is the <strong>wsrep write set protocol<\/strong>: at commit time, the node extracts the changed rows as a write set, broadcasts it to all cluster nodes, and all nodes run a certification protocol (conflict detection) before any node commits. The commit is synchronous across all nodes.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Galera write path:\n  Client writes to Node A\n  \u2192 InnoDB: prepare transaction\n  \u2192 wsrep: extract write set from transaction\n  \u2192 wsrep: broadcast write set to all nodes\n  \u2192 wsrep: certification round (all nodes vote \u2014 conflicts?)\n  \u2192 wsrep: apply write set on all nodes simultaneously\n  \u2192 InnoDB: commit on all nodes\n  \u2192 binlog written AFTER commit (for async slaves attached to the cluster)\n<\/code><\/pre>\n\n\n\n<p>The binlog in a Galera cluster has exactly one purpose: feeding <strong>async replica slaves<\/strong> that hang off one of the cluster nodes for analytics, backup, or reporting. The Galera nodes themselves never read each other&#8217;s binlog.<\/p>\n\n\n\n<p>The problem with InnoDB binlog is the certification hook. 
The wsrep plugin currently intercepts <strong>between InnoDB prepare and InnoDB commit<\/strong> \u2014 it sits at the 2PC boundary that InnoDB binlog eliminates. With the new architecture, InnoDB prepare and commit are atomic \u2014 there is no pause point where wsrep can insert its certification round. The wsrep plugin needs to be re-integrated at a different layer of the storage engine API, which is a significant engineering effort. The <code>GALERA26<\/code> label on MDEV-34705 acknowledges this as planned future work, but it is not present in 12.3.<\/p>\n\n\n\n<p><strong>What wsrep needs:<\/strong><br>InnoDB prepare \u2192 [wsrep certification] \u2192 InnoDB commit<br><br><strong>What InnoDB binlog does:<\/strong><br>InnoDB prepare + commit = atomic, no pause point \u2192 incompatible; wsrep cannot insert itself.<br><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-impact-on-galera-async-slaves\">Impact on Galera Async Slaves<\/h3>\n\n\n\n<p>Even if you are not running Galera nodes with InnoDB binlog, you may be running <strong>async replicas attached to a Galera node<\/strong> \u2014 a common pattern for offloading analytics queries or running <code>mariadb-backup<\/code> from a dedicated replica. These are unaffected as long as the Galera node itself continues to use <code>binlog_storage_engine=FILE<\/code>. The async slave receives standard binlog events from the Galera node and nothing in its pipeline changes.<\/p>\n\n\n\n<p>The restriction is: the Galera node acting as the async replication source must stay on FILE binlog. 
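<\/p>\n\n\n\n<p>For such a node, the relevant configuration stays classic (a sketch; the wsrep settings themselves are unchanged):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>[mariadbd]\n# Galera node feeding async replicas: keep the flat-file binlog\nbinlog_storage_engine = FILE\nlog_slave_updates     = ON    # write applied write sets to the binlog for the async slave\nbinlog_format         = ROW   # required by Galera anyway\n<\/code><\/pre>\n\n\n\n<p>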
You cannot mix binlog storage engines within the same replication chain in a meaningful way \u2014 the source determines the format.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-production-decision-matrix-by-topology\">Production Decision Matrix by Topology<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Topology<\/th><th>InnoDB binlog viable?<\/th><th>Notes<\/th><\/tr><\/thead><tbody><tr><td>Standalone primary + async replicas (GTID)<\/td><td>\u2705 Yes<\/td><td>Primary use case, fully supported<\/td><\/tr><tr><td>Primary + async replicas (file\/offset)<\/td><td>\u274c Not yet<\/td><td>Migrate to GTID first<\/td><\/tr><tr><td>Semi-sync AFTER_COMMIT + async replicas<\/td><td>\u2705 Yes<\/td><td>Supported in 12.3.1; AFTER_SYNC is architecturally incompatible<\/td><\/tr><tr><td>Semi-sync AFTER_SYNC<\/td><td>\u274c Never<\/td><td>Architecturally incompatible with 2PC removal<\/td><\/tr><tr><td>Galera cluster nodes<\/td><td>\u274c Not yet<\/td><td>wsrep hook incompatible, planned for future<\/td><\/tr><tr><td>Async slave off a Galera node<\/td><td>\u2705 Unaffected<\/td><td>Galera node uses FILE binlog as source<\/td><\/tr><tr><td>MaxScale read\/write split<\/td><td>\ud83d\udd36 Test required<\/td><td>Replication protocol unchanged, failover scripts may need GTID update<\/td><\/tr><tr><td>MySQL InnoDB Cluster<\/td><td>N\/A<\/td><td>MySQL-only, not applicable to MariaDB<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-known-limitations-and-open-issues-12-3-1-rc\">Known Limitations and Open Issues (12.3.1 RC)<\/h2>\n\n\n\n<p>This feature is marked as opt-in for good reason. The following limitations are documented and\/or reported in the JIRA tracker. 
<strong>Read these before recommending it for production.<\/strong><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-documented-limitations-by-design-12-3-1\">Documented Limitations (by design, 12.3.1)<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Limitation<\/th><th>Detail<\/th><\/tr><\/thead><tbody><tr><td><strong>Mutual exclusivity<\/strong><\/td><td>Enabling the new binlog ignores existing <code>.bin<\/code> files. No migration path from old to new binlog format in-place.<\/td><\/tr><tr><td><strong>GTID mandatory<\/strong><\/td><td>Replicas must use GTID-based replication (<code>CHANGE MASTER ... MASTER_USE_GTID=slave_pos<\/code>). Filename\/offset positions are unavailable (<code>MASTER_POS_WAIT()<\/code> does not work).<\/td><\/tr><tr><td><strong>Semi-sync: AFTER_SYNC not supported<\/strong><\/td><td><code>AFTER_SYNC<\/code> semi-sync cannot work because 2PC no longer exists. <code>AFTER_COMMIT<\/code> semi-sync is supported. MDEV-38190 tracks further semi-sync enhancements.<\/td><\/tr><tr><td><strong><code>sync_binlog<\/code> ignored<\/strong><\/td><td>The option is accepted but silently ignored. Durability is controlled by <code>innodb_flush_log_at_trx_commit<\/code> only.<\/td><\/tr><tr><td><strong>Old filename\/offset API gone<\/strong><\/td><td><code>BINLOG_GTID_POS()<\/code>, <code>MASTER_POS_WAIT()<\/code> unavailable. Use <code>MASTER_GTID_WAIT()<\/code>.<\/td><\/tr><tr><td><strong>Third-party binlog readers<\/strong><\/td><td>Tools that read <code>.bin<\/code> files directly (e.g., older versions of Debezium, Maxwell) will not understand <code>.ibb<\/code> format. 
Tools that connect to the server via the replication protocol may work unmodified.<\/td><\/tr><tr><td><strong>Galera not supported (12.3.1)<\/strong><\/td><td>The GALERA26 label on MDEV-34705 indicates future support, but wsrep-based clusters cannot use this feature in the current RC.<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-open-jira-issues-as-of-march-2026\">Open JIRA Issues (as of March 2026)<\/h3>\n\n\n\n<p><strong>MDEV-38462 \u2014 <code>InnoDB: Crash recovery is broken due to insufficient innodb_log_file_size<\/code> with new binlog<\/strong> Severity: likely Major. The new binlog writes into the InnoDB redo log, which means large transactions or high write rates can exhaust <code>innodb_log_file_size<\/code> more rapidly than before. If the redo log is undersized for the binlog load, crash recovery may fail. <strong>Lab implication:<\/strong> always set <code>innodb_log_file_size<\/code> generously (\u22652 GB) when enabling the new binlog.<\/p>\n\n\n\n<p><strong>MDEV-38304 \u2014 InnoDB Binlog to be Stored in Archived Redo Log<\/strong> A follow-on feature request for archiving binlog data within the InnoDB redo log archiving infrastructure. Not a blocker, but signals that the archival story is incomplete.<\/p>\n\n\n\n<p><strong>MDEV-34705 (parent, closed\/fixed in 12.3.1, resolved 2026-02-08)<\/strong> \u2014 The main implementation MDEV. Review the sub-tasks and linked issues for outstanding items, notably MDEV-38190 (semi-sync enhancements) and MDEV-38307 (wsrep\/Galera feasibility study).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-operational-cautions-for-your-lab\">Operational Cautions for Your Lab<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Do <strong>not<\/strong> test <code>innodb_flush_log_at_trx_commit=0<\/code> on a shared Azure disk \u2014 results will be misleading due to OS-level write caching. 
Use a dedicated Premium SSD P30+ or Ultra Disk.<\/li>\n\n\n\n<li>The <code>sync_binlog<\/code> parameter being silently ignored is a foot-gun during migration \u2014 add it to your monitoring and alerting if teams use it as a proxy for &#8220;durable binlog.&#8221;<\/li>\n\n\n\n<li>Pre-allocated 1 GB <code>.ibb<\/code> files mean disk space consumption &#8220;looks&#8221; immediately high even under low load.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-the-benchmark-design\">The Benchmark Design<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-we-are-measuring\">What We Are Measuring<\/h3>\n\n\n\n<p>Three performance dimensions across two configurations:<\/p>\n\n\n\n<p><strong>CONFIG A:<\/strong> binlog_storage_engine=FILE (classic)<br><strong>CONFIG B:<\/strong> binlog_storage_engine=innodb (new)<\/p>\n\n\n\n<p>Each dimension tested under three durability profiles:<\/p>\n\n\n\n<p><strong>D1<\/strong>: innodb_flush_log_at_trx_commit=1 + sync_binlog=1 (full durability, &#8220;gold&#8221;)<br><strong>D2<\/strong>: innodb_flush_log_at_trx_commit=2 + sync_binlog=0 (OS-buffered WAL)<br><strong>D3<\/strong>: innodb_flush_log_at_trx_commit=0 (no fsync, maximum throughput)<\/p>\n\n\n\n<p><strong>Note:<\/strong> D2\/D3 on CONFIG B is where the architectural gain is most visible. 
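<\/p>\n\n\n\n<p>One cell of this matrix translates into a sysbench invocation of roughly this shape (a sketch; host and credentials are placeholders):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Prepare once per profile: 4 tables, 1,000,000 rows each\nsysbench oltp_write_only --tables=4 --table-size=1000000 --mysql-host=10.0.0.4 --mysql-user=sbtest prepare\n\n# One 60 s run at a given thread count (each data point run twice and averaged)\nsysbench oltp_write_only --tables=4 --table-size=1000000 --mysql-host=10.0.0.4 --mysql-user=sbtest --threads=32 --time=60 run\n<\/code><\/pre>\n\n\n\n<p>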
D3 on CONFIG A is effectively unsafe for replication; we benchmark it anyway as a theoretical ceiling.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-metrics\">Metrics<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Metric<\/th><th>Tool<\/th><th>Notes<\/th><\/tr><\/thead><tbody><tr><td>Write TPS<\/td><td>sysbench oltp_write_only<\/td><td>Primary headline number<\/td><\/tr><tr><td>Commit latency p50\/p95\/p99<\/td><td>sysbench<\/td><td>Shows tail latency behavior<\/td><\/tr><tr><td>Large transaction commit time<\/td><td>custom SQL<\/td><td>Single 100MB INSERT<\/td><\/tr><tr><td>Crash recovery time<\/td><td><code>kill -9<\/code> + timer<\/td><td>Wall clock to <code>[Ready for connections]<\/code><\/td><\/tr><tr><td>Replica lag under load<\/td><td><code>Seconds_Behind_Master<\/code><\/td><td>1 primary \u2192 1 replica<\/td><\/tr><tr><td>InnoDB redo log write bytes<\/td><td><code>SHOW ENGINE INNODB STATUS<\/code><\/td><td>Redo amplification<\/td><\/tr><tr><td>fsync count (fio\/strace proxy)<\/td><td>iostat + custom<\/td><td>Actual IO calls<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-concurrency-levels\">Concurrency Levels<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>1 thread (establish single-threaded overhead)<\/li>\n\n\n\n<li>8 threads (typical application concurrency)<\/li>\n\n\n\n<li>32 threads (high concurrency, group commit regime)<\/li>\n\n\n\n<li>64 threads (stress, disk saturation test)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-azure-lab-setup\">Azure Lab Setup<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Component<\/th><th>Primary<\/th><th>Replica<\/th><\/tr><\/thead><tbody><tr><td>VM SKU<\/td><td><code>Standard_E4ds_v5<\/code> (4 vCPU, 32 GB RAM)<\/td><td><code>Standard_E2ds_v5<\/code> (2 vCPU, 16 GB RAM)<\/td><\/tr><tr><td>OS Disk<\/td><td>Premium SSD (128 GB)<\/td><td>Premium SSD (128 
GB)<\/td><\/tr><tr><td>Data Disk<\/td><td><strong>P30 Premium SSD 500 GB (5,000 IOPS)<\/strong><\/td><td><strong>P30 Premium SSD 500 GB (5,000 IOPS)<\/strong><\/td><\/tr><tr><td>OS<\/td><td>Ubuntu 24.04 LTS<\/td><td>Ubuntu 24.04 LTS<\/td><\/tr><tr><td>MariaDB<\/td><td>12.3.1 RC (binary tarball)<\/td><td>12.3.1 RC (binary tarball)<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Host caching was set to <strong>None<\/strong> on both P30 data disks to ensure we measure actual I\/O latency, not Azure&#8217;s host-level read cache. The data directory, InnoDB redo log, and binlog all reside on the P30 data disk (<code>\/data\/mysql<\/code>).<\/p>\n\n\n\n<p>Each profile was tested with 2 runs per thread count (averaged). The datadir was fully reinitialized between profiles to prevent cross-contamination.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-my-lab-results\">My Lab Results<\/h2>\n\n\n\n<p>The following results were obtained on Azure <code>Standard_E4ds_v5<\/code> VMs (4 vCPU, 32 GB RAM) with P30 Premium SSD data disks (500 GB, 5,000 IOPS, host caching disabled). 
MariaDB 12.3.1 RC installed from binary tarball.<\/p>\n\n\n\n<p><strong>Lab parameters:<\/strong> sysbench <code>oltp_write_only<\/code>, 4 tables \u00d7 1M rows, 60s per run (15s warmup), 2 runs per data point (averaged), 20 GB InnoDB buffer pool, 4 GB redo log.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-write-tps-the-d1-headline-2-4-3-3-faster\">Write TPS \u2014 The D1 Headline: 2.4\u20133.3\u00d7 Faster<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Threads<\/th><th>FILE (D1) TPS<\/th><th>InnoDB (D1) TPS<\/th><th>Speedup<\/th><th>FILE p99<\/th><th>InnoDB p99<\/th><\/tr><\/thead><tbody><tr><td>1<\/td><td>129<\/td><td>307<\/td><td><strong>2.4\u00d7<\/strong><\/td><td>16.1 ms<\/td><td>8.1 ms<\/td><\/tr><tr><td>8<\/td><td>444<\/td><td>1,073<\/td><td><strong>2.4\u00d7<\/strong><\/td><td>37.3 ms<\/td><td>15.1 ms<\/td><\/tr><tr><td>32<\/td><td>1,392<\/td><td>4,625<\/td><td><strong>3.3\u00d7<\/strong><\/td><td>43.8 ms<\/td><td>34.1 ms<\/td><\/tr><tr><td>64<\/td><td>2,564<\/td><td>8,279<\/td><td><strong>3.2\u00d7<\/strong><\/td><td>72.1 ms<\/td><td>29.4 ms<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Even at single-thread concurrency, InnoDB binlog is 2.4\u00d7 faster. At 64 threads, the gap widens to 3.2\u00d7 \u2014 the elimination of the binlog <code>fsync<\/code> becomes more valuable as group commit contention increases.<\/p>\n\n\n\n<p>The p99 latency tells the story most clearly: FILE binlog latency rises steeply with concurrency (16 ms \u2192 72 ms), while InnoDB binlog stays dramatically lower (8 ms \u2192 29 ms at 64 threads). 
The second <code>fsync<\/code> creates a latency cliff under contention that InnoDB binlog simply does not have.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-d2-d3-parity-when-fsyncs-are-already-gone\">D2\/D3 \u2014 Parity When fsyncs Are Already Gone<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Threads<\/th><th>FILE (D2) TPS<\/th><th>InnoDB (D2) TPS<\/th><th>FILE (D3) TPS<\/th><th>InnoDB (D3) TPS<\/th><\/tr><\/thead><tbody><tr><td>1<\/td><td>2,950<\/td><td>2,861<\/td><td>2,943<\/td><td>2,985<\/td><\/tr><tr><td>8<\/td><td>9,961<\/td><td>9,924<\/td><td>10,027<\/td><td>11,681<\/td><\/tr><tr><td>32<\/td><td>10,818<\/td><td>11,091<\/td><td>11,202<\/td><td>10,973<\/td><\/tr><tr><td>64<\/td><td>10,815<\/td><td>10,462<\/td><td>11,121<\/td><td>10,751<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>At D2 (<code>innodb_flush_log_at_trx_commit=2<\/code>, OS-buffered) and D3 (<code>=0<\/code>, no fsync), both binlog engines produce nearly identical throughput: <strong>~10,000\u201311,000 TPS at saturation<\/strong>. The bottleneck shifts from binlog sync overhead to InnoDB page writes and the P30 disk&#8217;s 5,000 IOPS ceiling.<\/p>\n\n\n\n<p>This is the control experiment that validates the D1 results: the InnoDB binlog advantage comes specifically from eliminating the <code>sync_binlog=1<\/code> fsync. 
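<\/p>\n\n\n\n<p>One way to confirm this on your own hardware is to count the sync calls directly while a benchmark is running (a sketch):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Attach to mariadbd for 30 s and tally fsync\/fdatasync calls\nstrace -f -c -e trace=fsync,fdatasync -p $(pidof mariadbd) &amp; STRACE_PID=$!\nsleep 30\nkill -INT $STRACE_PID   # strace prints its per-call summary when interrupted\n<\/code><\/pre>\n\n\n\n<p>Under D1 you should see roughly half the sync calls per committed transaction with the InnoDB binlog; under D2\/D3 both configurations drop to background flushing only.<\/p>\n\n\n\n<p>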
When fsyncs are already absent, there is nothing to eliminate.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-crash-recovery-same-speed-different-architecture\">Crash Recovery \u2014 Same Speed, Different Architecture<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Profile<\/th><th>Recovery time<\/th><th>Pages to recover<\/th><th>Notes<\/th><\/tr><\/thead><tbody><tr><td>file_d1<\/td><td>34.1s<\/td><td>54,820<\/td><td>5+ XA prepared transactions, 2PC reconciliation<\/td><\/tr><tr><td>innodb_d1<\/td><td>39.2s<\/td><td>41,688<\/td><td>Single-engine recovery, no XA<\/td><\/tr><tr><td>file_d2<\/td><td>29.1s<\/td><td>45,430<\/td><td>XA prepared, binlog sync needed<\/td><\/tr><tr><td>innodb_d2<\/td><td>27.1s<\/td><td>43,219<\/td><td>Clean single-path recovery<\/td><\/tr><tr><td>file_d3<\/td><td>33.1s<\/td><td>44,309<\/td><td>Binlog-InnoDB sync at recovery<\/td><\/tr><tr><td>innodb_d3<\/td><td>39.2s<\/td><td>44,846<\/td><td>Clean single-path recovery<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>All six profiles: <strong>zero data loss<\/strong> (1,000,000 rows before crash = 1,000,000 rows after recovery).<\/p>\n\n\n\n<p>Recovery times are comparable (~27\u201339 seconds). 
The real difference is architectural:<\/p>\n\n\n\n<p><strong>FILE binlog<\/strong> recovery log shows the classic 2PC reconciliation:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>InnoDB: Starting crash recovery from checkpoint LSN=613360854\nInnoDB: To recover: 54820 pages\nInnoDB: Transaction 1418253 was in the XA prepared state.\nInnoDB: Transaction 1418254 was in the XA prepared state.\n...\n<\/code><\/pre>\n\n\n\n<p><strong>InnoDB binlog<\/strong> recovery log shows single-path recovery \u2014 no XA coordination:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>InnoDB: Starting crash recovery from checkpoint LSN=6438545394\nInnoDB: To recover: 41688 pages\nInnoDB: Continuing binlog number 2 from position 583656195.\nmariadbd: ready for connections.\n<\/code><\/pre>\n\n\n\n<p>The critical operational difference: with FILE binlog at D2\/D3 (<code>sync_binlog=0<\/code>), the binlog can lose events on crash while InnoDB has already committed them \u2014 creating <strong>silent primary-replica divergence<\/strong>. 
With InnoDB binlog, this class of failure is architecturally impossible because the binlog is inside the redo log.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-large-transaction-commit-redo-amplification-is-real\">Large Transaction Commit \u2014 Redo Amplification Is Real<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Metric<\/th><th>FILE (D1)<\/th><th>InnoDB (D1)<\/th><\/tr><\/thead><tbody><tr><td>Single 100K-row UPDATE commit time<\/td><td>1,793 ms<\/td><td>1,564 ms<\/td><\/tr><tr><td>Redo amplification<\/td><td>1.03\u00d7<\/td><td><strong>1.98\u00d7<\/strong><\/td><\/tr><tr><td>Raw data modified<\/td><td>104 MB<\/td><td>104 MB<\/td><\/tr><tr><td>Redo log written<\/td><td>107 MB<\/td><td>206 MB<\/td><\/tr><tr><td>Per-iteration (10K rows) redo<\/td><td>~5.6 MB<\/td><td>~10.8 MB<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>InnoDB binlog writes <strong>~2\u00d7 the redo log<\/strong> because every binlog event is also an InnoDB redo log record. Despite this, the commit time is actually slightly faster (1.56s vs 1.79s) because there is no separate binlog fsync to wait for.<\/p>\n\n\n\n<p>For OLTP workloads with small transactions, the redo amplification is negligible \u2014 the fsync elimination dominates. 
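<\/p>\n\n\n\n<p>The amplification figures can be spot-checked on any instance by sampling <code>Innodb_os_log_written<\/code> around a bulk statement (illustrative SQL; <code>sbtest1<\/code> and column <code>k<\/code> are from the sysbench schema used here):<\/p>\n\n\n\n<pre class="wp-block-code"><code>-- redo bytes written so far\nSHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';\n\n-- run the bulk change, then sample again\nUPDATE sbtest1 SET k = k + 1 LIMIT 100000;\nSHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';\n\n-- amplification = delta of redo bytes \/ raw data modified\n<\/code><\/pre>\n\n\n\n<p>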
For bulk ETL operations producing large redo volumes, this overhead is measurable and should be factored into <code>innodb_log_file_size<\/code> sizing.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-replication-lag-same-lag-56-more-throughput\">Replication Lag \u2014 Same Lag, 56% More Throughput<\/h3>\n\n\n\n<p>We measured replication lag with a single async replica (E2ds_v5, 2 vCPU) under sustained 16-thread write load for 180 seconds, monitoring <code>Seconds_Behind_Master<\/code> every 10 seconds.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Metric<\/th><th>FILE (D1)<\/th><th>InnoDB (D1)<\/th><\/tr><\/thead><tbody><tr><td>Primary TPS during load<\/td><td>1,276<\/td><td><strong>1,994<\/strong><\/td><\/tr><tr><td>Lag at 60s<\/td><td>58s<\/td><td>89s<\/td><\/tr><tr><td>Lag at 120s<\/td><td>121s<\/td><td>148s<\/td><\/tr><tr><td>Lag at 180s (load stops)<\/td><td>185s<\/td><td>206s<\/td><\/tr><tr><td>Lag at 300s<\/td><td>313s<\/td><td>321s<\/td><\/tr><tr><td>Lag growth rate<\/td><td>~1.0 s\/s<\/td><td>~1.1 s\/s<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The lag growth rate is nearly identical despite the primary pushing 56% more transactions with InnoDB binlog. The bottleneck is the replica&#8217;s single-threaded SQL apply \u2014 it falls behind at approximately the same rate regardless of how fast the primary commits.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed_s | file_d1 lag | innodb_d1 lag\n       10 |          6s |           39s\n       60 |         58s |           89s\n      120 |        121s |          148s\n      180 |        185s |          206s\n      240 |        249s |          265s\n      300 |        313s |          321s\n<\/code><\/pre>\n\n\n\n<p>The implication: InnoDB binlog does not make replication lag worse \u2014 it makes the <em>primary faster<\/em> while the replica remains the limiting factor. 
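<\/p>\n\n\n\n<p>The lag samples above were taken every 10 seconds; a minimal loop to do the same (a sketch, assuming a local <code>mariadb<\/code> client with credentials already configured):<\/p>\n\n\n\n<pre class="wp-block-code"><code>while true; do\n  mariadb -e \"SHOW REPLICA STATUS\\G\" | grep Seconds_Behind_Master\n  sleep 10\ndone\n<\/code><\/pre>\n\n\n\n<p>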
Enabling <code>slave_parallel_threads<\/code> on the replica would likely amplify the advantage further, but that is a test for another day.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-analysis-and-recommendations\">Analysis and Recommendations<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-the-results-mean-for-production\">What the Results Mean for Production<\/h3>\n\n\n\n<p>The 2.4\u20133.3\u00d7 TPS improvement at D1 durability on Azure P30 disks (5,000 IOPS) is not a synthetic artifact \u2014 it represents the removal of a real <code>fsync()<\/code> call from the commit path. On higher-IOPS storage (Azure Ultra Disk at 10K+ IOPS, NVMe at 100K+ IOPS), the absolute TPS numbers will be higher, but the relative improvement should hold because the 2PC <code>fsync<\/code> overhead is a fixed latency cost per commit group.<\/p>\n\n\n\n<p>The p99 latency curve tells the production story most clearly: FILE binlog p99 climbs from 16 ms to 72 ms as threads increase from 1 to 64, while InnoDB binlog stays at 8 ms to 29 ms. Tail latency stability under increasing concurrency is exactly what application teams need.<\/p>\n\n\n\n<p>The D2\/D3 parity results are equally important: they prove the improvement comes specifically from eliminating <code>sync_binlog=1<\/code>, not from some general overhead reduction. When fsyncs are already absent, InnoDB binlog adds no benefit \u2014 but crucially, it adds no penalty either.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-when-not-to-use-innodb-binlog\">When NOT to Use InnoDB Binlog<\/h3>\n\n\n\n<p>Despite the compelling performance numbers:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Large batch\/ETL workloads:<\/strong> The 2\u00d7 redo amplification on large transactions increases redo log pressure. 
For workloads dominated by bulk INSERTs or mass UPDATEs, size <code>innodb_log_file_size<\/code> accordingly.<\/li>\n\n\n\n<li><strong>Galera clusters:<\/strong> Not supported in 12.3.1 RC. The wsrep certification hook is incompatible with the new commit path.<\/li>\n\n\n\n<li><strong>Semi-sync AFTER_SYNC:<\/strong> Architecturally incompatible \u2014 the 2PC boundary that AFTER_SYNC hooks into no longer exists. AFTER_COMMIT is supported and is the only available wait point with InnoDB binlog.<\/li>\n\n\n\n<li><strong>Third-party binlog readers:<\/strong> Tools that read <code>.bin<\/code> files directly will not understand the <code>.ibb<\/code> format. Tools using the replication protocol are unaffected.<\/li>\n\n\n\n<li><strong>Disk-constrained environments:<\/strong> Pre-allocated 1 GB <code>.ibb<\/code> files mean immediate disk usage even under low load.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-innodb-log-file-size-recommendation\"><code>innodb_log_file_size<\/code> Recommendation<\/h3>\n\n\n\n<p>With <code>binlog_storage_engine=innodb<\/code>, the redo log absorbs all binlog writes. Our lab used <code>innodb_log_file_size=4G<\/code> without encountering MDEV-38462. For production with high write throughput, start with 4 GB and monitor <code>SHOW ENGINE INNODB STATUS<\/code> for a checkpoint age approaching the log file size. If crash recovery reports &#8220;insufficient innodb_log_file_size&#8221;, increase it (to 8 GB, for example) and keep monitoring. The 2\u00d7 redo amplification we measured means your redo log needs to be roughly twice as large as it would be without InnoDB binlog.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-bottom-line\">Bottom Line<\/h3>\n\n\n\n<p>For GTID-based async replication with standard tooling, <code>binlog_storage_engine=innodb<\/code> delivers a 2.4\u20133.3\u00d7 TPS improvement at full durability, with 2\u20132.5\u00d7 lower p99 latency under concurrency. 
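<\/p>\n\n\n\n<p>The switch itself is a small configuration change (a sketch; sizes should follow the redo recommendation above):<\/p>\n\n\n\n<pre class="wp-block-code"><code># my.cnf\nlog-bin\nbinlog_storage_engine = innodb\n# roughly 2x the redo sizing you would use with FILE binlog\ninnodb_log_file_size = 4G\n<\/code><\/pre>\n\n\n\n<p>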
The migration requires GTID (mandatory), gives up file\/offset positioning (use <code>MASTER_GTID_WAIT()<\/code> instead), and needs a generously sized <code>innodb_log_file_size<\/code>. At relaxed durability (D2\/D3), throughput is identical, but crash consistency is now guaranteed, which was never the case with FILE binlog.<\/p>\n\n\n\n<p>More importantly, this changes the architecture decision matrix. Binlog-based replication has always carried a performance tax that pushed some teams toward Galera clusters not because they needed synchronous multi-master or local HA, but simply to avoid that penalty while maintaining an offsite copy for DR. I often meet teams that conflate HA and DR requirements and end up implementing a 3-node Galera cluster spread across 3 sites \u2014 paying the cross-site certification latency on every single commit for the sake of a disaster recovery copy that doesn&#8217;t need to be synchronous. That trade-off no longer exists. With InnoDB binlog, a simple primary + async replica gives you DR capability at effectively the same write throughput as a standalone server, while keeping your Galera cluster (if you need one) local, where certification latency stays low.<\/p>\n\n\n\n<p>For OLTP workloads, this is the single largest performance improvement available in MariaDB 12.3, and it comes with better crash consistency, simpler automation, and a cleaner separation between HA and DR as a bonus.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-references\">References<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>MDEV-34705: Improving performance of binary logging by removing the need of syncing it \u2014 <a href=\"https:\/\/jira.mariadb.org\/browse\/MDEV-34705\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/jira.mariadb.org\/browse\/MDEV-34705<\/a><\/li>\n\n\n\n<li>MDEV-38462: Crash recovery broken with insufficient innodb_log_file_size (new binlog) \u2014 <a 
href=\"https:\/\/jira.mariadb.org\/browse\/MDEV-38462\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/jira.mariadb.org\/browse\/MDEV-38462<\/a><\/li>\n\n\n\n<li>MariaDB 12.3 Changes &amp; Improvements \u2014 <a href=\"https:\/\/mariadb.com\/docs\/release-notes\/community-server\/12.3\/mariadb-12.3-changes-and-improvements\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/mariadb.com\/docs\/release-notes\/community-server\/12.3\/mariadb-12.3-changes-and-improvements<\/a><\/li>\n\n\n\n<li>Official binlog implementation documentation (knielsen branch) \u2014 <a href=\"https:\/\/github.com\/MariaDB\/server\/blob\/knielsen_binlog_in_engine\/Docs\/replication\/binlog.md\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/github.com\/MariaDB\/server\/blob\/knielsen_binlog_in_engine\/Docs\/replication\/binlog.md<\/a><\/li>\n\n\n\n<li>Mark Callaghan&#8217;s benchmark of the new binlog feature \u2014 referenced in the MariaDB.org release blog<\/li>\n\n\n\n<li>MariaDB Foundation: New binlog implementation blog post \u2014 <a href=\"https:\/\/mariadb.org\/tag\/mariadb-releases\/\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/mariadb.org\/tag\/mariadb-releases\/<\/a><\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>MariaDB has been quite active lately, the version 11.8 was already quite a step forward and apart from the changes on LTS schedule and the EOL durations, the RC 12.3 LTS is bringing some interesting changes to binlogs, pushing even further performance improvements and reliability. 
Instead of using the traditional flat file binlogs, they can [&hellip;]<\/p>\n","protected":false},"author":153,"featured_media":43498,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1726],"tags":[141],"type_dbi":[3328],"class_list":["post-43437","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-mariadb","tag-mariadb","type-mariadb"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.2 (Yoast SEO v27.2) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>MariaDB 12.3 - Binlog Inside InnoDB - dbi Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"MariaDB 12.3 - Binlog Inside InnoDB\" \/>\n<meta property=\"og:description\" content=\"MariaDB has been quite active lately, the version 11.8 was already quite a step forward and apart from the changes on LTS schedule and the EOL durations, the RC 12.3 LTS is bringing some interesting changes to binlogs, pushing even further performance improvements and reliability. 
Instead of using the traditional flat file binlogs, they can [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/\" \/>\n<meta property=\"og:site_name\" content=\"dbi Blog\" \/>\n<meta property=\"article:published_time\" content=\"2026-03-15T16:17:27+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-03-15T16:17:29+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2026\/03\/download.png\" \/>\n\t<meta property=\"og:image:width\" content=\"249\" \/>\n\t<meta property=\"og:image:height\" content=\"203\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Adrien Obernesser\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Adrien Obernesser\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"18 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/\"},\"author\":{\"name\":\"Adrien Obernesser\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/fd2ab917212ce0200c7618afaa7fdbcd\"},\"headline\":\"MariaDB 12.3 &#8211; Binlog Inside InnoDB\",\"datePublished\":\"2026-03-15T16:17:27+00:00\",\"dateModified\":\"2026-03-15T16:17:29+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/\"},\"wordCount\":3977,\"commentCount\":0,\"image\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2026\/03\/download.png\",\"keywords\":[\"MariaDB\"],\"articleSection\":[\"MariaDB\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/\",\"url\":\"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/\",\"name\":\"MariaDB 12.3 - Binlog Inside InnoDB - dbi 
Blog\",\"isPartOf\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2026\/03\/download.png\",\"datePublished\":\"2026-03-15T16:17:27+00:00\",\"dateModified\":\"2026-03-15T16:17:29+00:00\",\"author\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/fd2ab917212ce0200c7618afaa7fdbcd\"},\"breadcrumb\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#primaryimage\",\"url\":\"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2026\/03\/download.png\",\"contentUrl\":\"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2026\/03\/download.png\",\"width\":249,\"height\":203},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Accueil\",\"item\":\"https:\/\/www.dbi-services.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"MariaDB 12.3 &#8211; Binlog Inside InnoDB\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/#website\",\"url\":\"https:\/\/www.dbi-services.com\/blog\/\",\"name\":\"dbi 
Blog\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.dbi-services.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/fd2ab917212ce0200c7618afaa7fdbcd\",\"name\":\"Adrien Obernesser\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/secure.gravatar.com\/avatar\/dc9316c729e50107159e0a1e631b9c1742ce8898576887d0103c83b1ca3bc9e6?s=96&d=mm&r=g\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/dc9316c729e50107159e0a1e631b9c1742ce8898576887d0103c83b1ca3bc9e6?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/dc9316c729e50107159e0a1e631b9c1742ce8898576887d0103c83b1ca3bc9e6?s=96&d=mm&r=g\",\"caption\":\"Adrien Obernesser\"},\"url\":\"https:\/\/www.dbi-services.com\/blog\/author\/adrienobernesser\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"MariaDB 12.3 - Binlog Inside InnoDB - dbi Blog","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/","og_locale":"en_US","og_type":"article","og_title":"MariaDB 12.3 - Binlog Inside InnoDB","og_description":"MariaDB has been quite active lately, the version 11.8 was already quite a step forward and apart from the changes on LTS schedule and the EOL durations, the RC 12.3 LTS is bringing some interesting changes to binlogs, pushing even further performance improvements and reliability. 
Instead of using the traditional flat file binlogs, they can [&hellip;]","og_url":"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/","og_site_name":"dbi Blog","article_published_time":"2026-03-15T16:17:27+00:00","article_modified_time":"2026-03-15T16:17:29+00:00","og_image":[{"width":249,"height":203,"url":"http:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2026\/03\/download.png","type":"image\/png"}],"author":"Adrien Obernesser","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Adrien Obernesser","Est. reading time":"18 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#article","isPartOf":{"@id":"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/"},"author":{"name":"Adrien Obernesser","@id":"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/fd2ab917212ce0200c7618afaa7fdbcd"},"headline":"MariaDB 12.3 &#8211; Binlog Inside InnoDB","datePublished":"2026-03-15T16:17:27+00:00","dateModified":"2026-03-15T16:17:29+00:00","mainEntityOfPage":{"@id":"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/"},"wordCount":3977,"commentCount":0,"image":{"@id":"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#primaryimage"},"thumbnailUrl":"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2026\/03\/download.png","keywords":["MariaDB"],"articleSection":["MariaDB"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/","url":"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/","name":"MariaDB 12.3 - Binlog Inside InnoDB - dbi 
Blog","isPartOf":{"@id":"https:\/\/www.dbi-services.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#primaryimage"},"image":{"@id":"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#primaryimage"},"thumbnailUrl":"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2026\/03\/download.png","datePublished":"2026-03-15T16:17:27+00:00","dateModified":"2026-03-15T16:17:29+00:00","author":{"@id":"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/fd2ab917212ce0200c7618afaa7fdbcd"},"breadcrumb":{"@id":"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#primaryimage","url":"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2026\/03\/download.png","contentUrl":"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2026\/03\/download.png","width":249,"height":203},{"@type":"BreadcrumbList","@id":"https:\/\/www.dbi-services.com\/blog\/mariadb-12-3-binlog-inside-innodb\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Accueil","item":"https:\/\/www.dbi-services.com\/blog\/"},{"@type":"ListItem","position":2,"name":"MariaDB 12.3 &#8211; Binlog Inside InnoDB"}]},{"@type":"WebSite","@id":"https:\/\/www.dbi-services.com\/blog\/#website","url":"https:\/\/www.dbi-services.com\/blog\/","name":"dbi 
Blog","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.dbi-services.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/fd2ab917212ce0200c7618afaa7fdbcd","name":"Adrien Obernesser","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/dc9316c729e50107159e0a1e631b9c1742ce8898576887d0103c83b1ca3bc9e6?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/dc9316c729e50107159e0a1e631b9c1742ce8898576887d0103c83b1ca3bc9e6?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/dc9316c729e50107159e0a1e631b9c1742ce8898576887d0103c83b1ca3bc9e6?s=96&d=mm&r=g","caption":"Adrien Obernesser"},"url":"https:\/\/www.dbi-services.com\/blog\/author\/adrienobernesser\/"}]}},"_links":{"self":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts\/43437","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/users\/153"}],"replies":[{"embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/comments?post=43437"}],"version-history":[{"count":44,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts\/43437\/revisions"}],"predecessor-version":[{"id":43505,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts\/43437\/revisions\/43505"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/media\/43498"}],"wp:attachment":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/media?parent=43437"}],"wp:
term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/categories?post=43437"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/tags?post=43437"},{"taxonomy":"type","embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/type_dbi?post=43437"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}