{"id":43036,"date":"2026-02-22T22:36:36","date_gmt":"2026-02-22T21:36:36","guid":{"rendered":"https:\/\/www.dbi-services.com\/blog\/?p=43036"},"modified":"2026-02-22T22:42:16","modified_gmt":"2026-02-22T21:42:16","slug":"rag-series-embedding-versioning-lab","status":"publish","type":"post","link":"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/","title":{"rendered":"RAG Series \u2013 Embedding Versioning LAB"},"content":{"rendered":"\n<h1 class=\"wp-block-heading\" id=\"h-introduction\">Introduction<\/h1>\n\n\n\n<p>This is Part 2 of the embedding versionin, in <a href=\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-with-pgvector-why-event-driven-architecture-is-a-precondition-to-ai-data-workflows\/\" target=\"_blank\" rel=\"noreferrer noopener\">Part 1<\/a>, I covered the theory: why event-driven embedding refresh matters, the three levels of architecture (triggers, logical replication, Flink CDC), and how to detect and skip insignificant changes. If you haven&#8217;t read it, go there first, this post won&#8217;t through the entire intent of the designs but just demonstrate how it can work.<\/p>\n\n\n\n<p>Here, I&#8217;m going to <strong>run the whole thing<\/strong> on the Wikipedia dataset from the <a href=\"https:\/\/github.com\/boutaga\/pgvector_RAG_search_lab\">pgvector_RAG_search_lab<\/a> repository. 25,000 articles, triggers, OpenAI API calls, real numbers.<\/p>\n\n\n\n<p>The goal is to answer the questions you&#8217;d actually have when implementing this:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>How do you adapt the schema to an existing table that wasn&#8217;t designed for versioning?<\/li>\n\n\n\n<li>What do the SKIP vs EMBED decisions actually look like with real data?<\/li>\n\n\n\n<li>Does <code>SELECT FOR UPDATE SKIP LOCKED<\/code> really work with concurrent workers? 
<\/li>\n\n\n\n<li>What does the freshness monitoring report show in practice?<\/li>\n\n\n\n<li>How does the quality feedback loop close the circle?<\/li>\n<\/ul>\n\n\n\n<p>All the code is in the <code>lab\/05_embedding_versioning\/<\/code> directory of the repository.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-s-in-the-lab-directory\">What&#8217;s in the lab directory<\/h3>\n\n\n\n<p>Before diving in, here&#8217;s what each file does \u2014 so you know what you&#8217;re running:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nlab\/05_embedding_versioning\/\n\u251c\u2500\u2500 schema.sql                          # DDL: tables, triggers, indexes\n\u251c\u2500\u2500 worker.py                           # Embedding worker (claims queue items, calls OpenAI, writes vectors)\n\u251c\u2500\u2500 change_detector.py                  # Compares new vs old embeddings to decide SKIP or EMBED\n\u251c\u2500\u2500 freshness_monitor.py                # Generates a full health report on embedding staleness\n\u2514\u2500\u2500 examples\/\n    \u251c\u2500\u2500 simulate_document_changes.py    # Generates a realistic mix of article mutations\n    \u251c\u2500\u2500 targeted_mutations.py           # Applies specific change types to specific articles\n    \u251c\u2500\u2500 demo_skip_locked.py             # Demonstrates concurrent worker queue distribution\n    \u251c\u2500\u2500 demo_trigger_flow.py            # End-to-end: UPDATE \u2192 trigger \u2192 queue \u2192 embed\n    \u2514\u2500\u2500 demo_quality_drift.py           # Simulates declining search quality + automatic re-queuing\n\n<\/pre><\/div>\n\n\n<p>Every script connects to the local <code>wikipedia<\/code> database and uses the same embedding queue. 
They&#8217;re designed to run sequentially \u2014 each step builds on the state left by the previous one.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-the-starting-point\">The Starting Point<\/h2>\n\n\n\n<p>My lab environment runs PostgreSQL 17.6 with pgvector 0.8.0 and pgvectorscale (DiskANN). The <code>articles<\/code> table already has 25,000 Wikipedia articles with dense and sparse embeddings from the previous labs (the <code>sparsevec(30522)<\/code> column holds SPLADE sparse vectors \u2014 30,522 is the BERT WordPiece vocabulary size):<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nwikipedia=# \\d articles\n                          Table &quot;public.articles&quot;\n         Column         |       Type       | Collation | Nullable | Default\n------------------------+------------------+-----------+----------+---------\n id                     | integer          |           | not null |\n url                    | text             |           |          |\n title                  | text             |           |          |\n content                | text             |           |          |\n title_vector           | vector(1536)     |           |          |\n content_vector         | vector(1536)     |           |          |\n vector_id              | integer          |           |          |\n content_tsv            | tsvector         |           |          |\n title_content_tsvector | tsvector         |           |          |\n content_sparse         | sparsevec(30522) |           |          |\n title_vector_3072      | vector(3072)     |           |          |\n content_vector_3072    | vector(3072)     |           |          |\nIndexes:\n    &quot;articles_pkey&quot; PRIMARY KEY, btree (id)\n    &quot;articles_content_3072_diskann&quot; diskann (content_vector_3072)\n    &quot;articles_sparse_hnsw&quot; 
hnsw (content_sparse sparsevec_cosine_ops)\n    &quot;articles_title_vector_3072_diskann&quot; diskann (title_vector_3072)\n    &quot;idx_articles_content_tsv&quot; gin (content_tsv)\n    &quot;idx_articles_title_content_tsvector&quot; gin (title_content_tsvector)\nTriggers:\n    tsvectorupdate BEFORE INSERT OR UPDATE ON articles FOR EACH ROW ...\n    tsvupdate BEFORE INSERT OR UPDATE ON articles FOR EACH ROW ...\n\n<\/pre><\/div>\n\n\n<p>No <code>content_hash<\/code>, no <code>updated_at<\/code>, no versioned embeddings. This is the reality of most existing deployments \u2014 you need to retrofit versioning without breaking what&#8217;s already working.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-1-apply-the-versioning-schema\">Step 1: Apply the Versioning Schema<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-schema-sql-does\">What <code>schema.sql<\/code> does<\/h3>\n\n\n\n<p>The schema file adapts the generic pattern from Part 1 to the existing <code>articles<\/code> table. 
It runs inside a single transaction and performs these operations in order:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Adds two columns<\/strong> to <code>articles<\/code>: <code>content_hash TEXT<\/code> and <code>updated_at TIMESTAMPTZ DEFAULT now()<\/code><\/li>\n\n\n\n<li><strong>Creates a BEFORE trigger<\/strong> (<code>trg_content_hash<\/code>) that automatically computes <code>md5(content)<\/code> before every INSERT or UPDATE of the <code>content<\/code> column \u2014 this is our change detection fingerprint<\/li>\n\n\n\n<li><strong>Backfills<\/strong> <code>content_hash<\/code> for all 25,000 existing articles with <code>UPDATE articles SET content_hash = md5(content)<\/code><\/li>\n\n\n\n<li><strong>Creates <code>article_embeddings_versioned<\/code><\/strong> \u2014 the versioned embeddings table with <code>model_name<\/code>, <code>model_version<\/code>, <code>source_hash<\/code>, <code>is_current<\/code>, and a partial DiskANN index on <code>WHERE is_current = true<\/code><\/li>\n\n\n\n<li><strong>Creates <code>embedding_queue<\/code><\/strong> \u2014 the work queue with <code>status<\/code>, <code>content_hash<\/code>, <code>change_type<\/code>, <code>claimed_at<\/code>, and retry tracking<\/li>\n\n\n\n<li><strong>Creates <code>embedding_change_log<\/code><\/strong> \u2014 records every SKIP\/EMBED decision with similarity scores for audit<\/li>\n\n\n\n<li><strong>Creates <code>retrieval_quality_log<\/code><\/strong> \u2014 for the quality feedback loop (Step 9b)<\/li>\n\n\n\n<li><strong>Creates an AFTER trigger<\/strong> (<code>trg_queue_embedding<\/code>) that fires on <code>INSERT OR UPDATE OF content<\/code> and inserts a queue entry automatically<\/li>\n<\/ol>\n\n\n\n<p>Key differences from the &#8220;clean&#8221; schema in Part 1:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>No generated column for <code>content_hash<\/code><\/strong>: <code>GENERATED ALWAYS AS (md5(content)) STORED<\/code> would rewrite the entire 25K-row 
table. The BEFORE trigger achieves the same result without a table rewrite \u2014 important for large production tables.<\/li>\n\n\n\n<li><strong>Column-targeted trigger<\/strong>: <code>AFTER UPDATE OF content<\/code> instead of <code>AFTER UPDATE<\/code>. The trigger only fires when the <code>content<\/code> column is touched \u2014 title-only or metadata-only updates are ignored at the PostgreSQL level, not inside application code.<\/li>\n\n\n\n<li><strong>Table naming<\/strong>: <code>article_embeddings_versioned<\/code> (not <code>document_embeddings<\/code>) to match the existing <code>articles<\/code> table naming convention.<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\npsql -d wikipedia -f lab\/05_embedding_versioning\/schema.sql\n\n<\/pre><\/div>\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nBEGIN\nALTER TABLE\nCREATE FUNCTION\nDROP TRIGGER\nCREATE TRIGGER\nUPDATE 25000\nCREATE TABLE\nCREATE INDEX\nCREATE INDEX\nCREATE INDEX\nCREATE TABLE\nCREATE INDEX\nCREATE INDEX\nCREATE TABLE\nCREATE TABLE\nCREATE FUNCTION\nDROP TRIGGER\nCREATE TRIGGER\nCOMMIT\n\n<\/pre><\/div>\n\n\n<p>Let me walk through the important lines:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong><code>ALTER TABLE<\/code><\/strong> \u2014 adds <code>content_hash<\/code> and <code>updated_at<\/code> columns<\/li>\n\n\n\n<li><strong><code>CREATE FUNCTION<\/code> + <code>CREATE TRIGGER<\/code><\/strong> (first pair) \u2014 the BEFORE trigger that computes <code>md5(content)<\/code><\/li>\n\n\n\n<li><strong><code>UPDATE 25000<\/code><\/strong> \u2014 the backfill. This is the most expensive line: PostgreSQL computes MD5 for every article and writes the hash. 
On 25K rows it takes a few seconds; on millions of rows, plan a maintenance window<\/li>\n\n\n\n<li><strong><code>CREATE TABLE<\/code> + <code>CREATE INDEX<\/code> (\u00d73)<\/strong> \u2014 the versioned embeddings table with its partial DiskANN index, version lookup index, and staleness detection index<\/li>\n\n\n\n<li><strong><code>CREATE FUNCTION<\/code> + <code>CREATE TRIGGER<\/code><\/strong> (second pair) \u2014 the AFTER trigger that queues embedding work<\/li>\n<\/ul>\n\n\n\n<p>After applying, the table now has versioning infrastructure:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nwikipedia=# \\d articles\n         Column         |           Type           | Nullable | Default\n------------------------+--------------------------+----------+---------\n ...existing columns...\n content_hash           | text                     |          |\n updated_at             | timestamp with time zone |          | now()\nReferenced by:\n    TABLE &quot;article_embeddings_versioned&quot; CONSTRAINT ... FOREIGN KEY (article_id) ...\n    TABLE &quot;embedding_queue&quot; CONSTRAINT ... FOREIGN KEY (article_id) ...\nTriggers:\n    trg_content_hash BEFORE INSERT OR UPDATE OF content ON articles ...\n    trg_queue_embedding AFTER INSERT OR UPDATE OF content ON articles ...\n    tsvectorupdate BEFORE INSERT OR UPDATE ON articles ...\n    tsvupdate BEFORE INSERT OR UPDATE ON articles ...\n\n<\/pre><\/div>\n\n\n<p>Two new triggers alongside the existing tsvector triggers. 
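<\/p>\n\n\n\n<p>The fingerprint logic behind <code>trg_content_hash<\/code> is easy to reproduce outside the database, which is handy for tests or for pre-computing hashes during bulk loads. Here is a minimal Python sketch; the function name <code>content_fingerprint<\/code> is illustrative, not part of the lab code, and PostgreSQL&#8217;s <code>md5()<\/code> produces the same lowercase hex digest:<\/p>

```python
import hashlib

def content_fingerprint(content: str) -> str:
    # Mirrors the trigger's md5(content): hash the UTF-8 bytes and
    # return the lowercase hex digest, same output as PostgreSQL's md5().
    return hashlib.md5(content.encode("utf-8")).hexdigest()

old_hash = content_fingerprint("April is the fourth month of the year.")
new_hash = content_fingerprint("April is the fourth month of the year. [test trigger]")

# Any byte-level change flips the hash; it answers "did the bytes change?",
# never "how much did the meaning change?".
assert old_hash != new_hash
assert len(old_hash) == 32  # md5 hex digests are always 32 characters
```

<p>That limitation is exactly why the hash is only the first, free filter: separating trivial edits from semantic ones is left to the similarity-based detector later on.<\/p>\n\n\n\n<p>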
They coexist without conflict because <code>trg_content_hash<\/code> is BEFORE (updates the hash) and <code>trg_queue_embedding<\/code> is AFTER (queues the embedding work using the already-computed hash).<\/p>\n\n\n\n<p>Four new tables: <code>article_embeddings_versioned<\/code>, <code>embedding_queue<\/code>, <code>embedding_change_log<\/code>, and <code>retrieval_quality_log<\/code>, plus the queue&#8217;s indexes.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-2-test-the-trigger-manually\">Step 2: Test the Trigger Manually<\/h2>\n\n\n\n<p>Before running anything complex, verify the trigger actually works. This is just a sanity check \u2014 one UPDATE, then look at the queue:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nwikipedia=# SELECT id, title, content_hash FROM articles WHERE id = 1;\n id | title |           content_hash\n----+-------+----------------------------------\n  1 | April | 47761052aee1158134fc07f3f7337952\n\nwikipedia=# UPDATE articles SET content = content || &#039; &#x5B;test trigger]&#039; WHERE id = 1;\nUPDATE 1\n\nwikipedia=# SELECT id, article_id, status, content_hash, change_type, queued_at\n  FROM embedding_queue ORDER BY queued_at DESC LIMIT 5;\n id | article_id | status  |           content_hash           |  change_type   |           queued_at\n----+------------+---------+----------------------------------+----------------+-------------------------------\n  1 |          1 | pending | 59e5ebe6fa9fce7ab87beccf6523dda6 | content_update | 2026-02-18 14:38:01.626792+00\n\n<\/pre><\/div>\n\n\n<p><strong>What happened here, step by step:<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>We checked article 1 (&#8220;April&#8221;) \u2014 its <code>content_hash<\/code> was <code>4776...<\/code><\/li>\n\n\n\n<li>We appended <code>' [test trigger]'<\/code> to its content<\/li>\n\n\n\n<li>The 
<strong>BEFORE trigger<\/strong> (<code>trg_content_hash<\/code>) fired first, recomputing <code>content_hash<\/code> to <code>59e5...<\/code> (the new MD5)<\/li>\n\n\n\n<li>The <strong>AFTER trigger<\/strong> (<code>trg_queue_embedding<\/code>) fired next, inserting a row into <code>embedding_queue<\/code> with the new hash and <code>change_type = 'content_update'<\/code><\/li>\n\n\n\n<li>The queue entry has <code>status = 'pending'<\/code> \u2014 nothing has processed it yet<\/li>\n<\/ol>\n\n\n\n<p>The <code>change_type<\/code> column is important: it&#8217;s how we&#8217;ll later distinguish content-triggered re-embeddings from quality-triggered ones (Step 9b).<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-3-simulate-50-document-mutations\">Step 3: Simulate 50 Document Mutations<\/h2>\n\n\n\n<p>Real knowledge bases don&#8217;t get 1 change at a time. The <code>simulate_document_changes.py<\/code> script generates a realistic mix of changes to random articles.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-the-script-does\">What the script does<\/h3>\n\n\n\n<p>The script picks 50 random articles from the database and applies one of five mutation types to each, chosen by a weighted random distribution that mimics real-world editing patterns:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong><code>typo_fix<\/code><\/strong> (most common): appends a period or fixes a word \u2014 the kind of minor edit that shouldn&#8217;t trigger re-embedding<\/li>\n\n\n\n<li><strong><code>paragraph_add<\/code><\/strong>: appends a substantial paragraph (3-5 sentences) \u2014 new information that changes the semantic content<\/li>\n\n\n\n<li><strong><code>section_rewrite<\/code><\/strong>: replaces a portion of the article with new text \u2014 significant semantic shift<\/li>\n\n\n\n<li><strong><code>major_rewrite<\/code><\/strong>: rewrites most of the article \u2014 entirely new embedding 
needed<\/li>\n\n\n\n<li><strong><code>metadata_only<\/code><\/strong>: changes only the title (not the content) \u2014 should NOT trigger the embedding pipeline at all<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\npython examples\/simulate_document_changes.py --count 50\n\n<\/pre><\/div>\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nMutation Summary:\n----------------------------------------\n  major_rewrite        3\n  metadata_only        6\n  paragraph_add       15\n  section_rewrite      4\n  typo_fix            22\n  TOTAL               50\n\n<\/pre><\/div>\n\n\n<p>This distribution is realistic: most changes are minor fixes, a smaller portion adds new content, and a few are major rewrites. The 6 <code>metadata_only<\/code> changes simulate edits to fields other than <code>content<\/code> \u2014 think correcting a title or updating a URL.<\/p>\n\n\n\n<p>Now check the queue:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nwikipedia=# SELECT status, count(*) FROM embedding_queue GROUP BY status;\n status  | count\n---------+-------\n pending |    44\n\n<\/pre><\/div>\n\n\n<p><strong>50 mutations, but only 44 queue entries.<\/strong> Where did the other 6 go?<\/p>\n\n\n\n<p>The 6 <code>metadata_only<\/code> mutations changed only the title (not content), so the trigger \u2014 which fires on <code>UPDATE OF content<\/code> \u2014 <strong>didn&#8217;t fire for them<\/strong>. Those 6 changes never reached the embedding pipeline. 
This is the first cost optimization, and it happens at the PostgreSQL trigger level with zero application code, zero API calls, zero overhead.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><strong>Why this matters<\/strong>: In a real knowledge base, a meaningful fraction of updates are metadata-only \u2014 tags, categories, status flags, author fields (in some orgs, 30-50% of all UPDATEs). Filtering them at the trigger level means your embedding worker never even sees them.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-4-change-detection-without-a-baseline\">Step 4: Change Detection Without a Baseline<\/h2>\n\n\n\n<p>Now let&#8217;s run the change detector to see which items should be embedded vs skipped.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-change-detector-py-does\">What <code>change_detector.py<\/code> does<\/h3>\n\n\n\n<p>The change detector is the &#8220;smart filter&#8221; in our pipeline. 
For each pending queue item, it:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Fetches the article&#8217;s current content<\/strong> from the <code>articles<\/code> table<\/li>\n\n\n\n<li><strong>Looks up the most recent embedding<\/strong> for that article in <code>article_embeddings_versioned<\/code><\/li>\n\n\n\n<li><strong>If no previous embedding exists<\/strong>: marks the item as EMBED (similarity = 0.0) \u2014 there&#8217;s nothing to compare against<\/li>\n\n\n\n<li><strong>If a previous embedding exists<\/strong>: generates a new embedding for the current content via OpenAI, computes the <strong>cosine similarity<\/strong> between old and new embeddings, and applies the threshold:\n<ul class=\"wp-block-list\">\n<li>Similarity \u2265 0.95 \u2192 <strong>SKIP<\/strong> (the semantic meaning barely changed, re-embedding would be wasteful)<\/li>\n\n\n\n<li>Similarity &lt; 0.95 \u2192 <strong>EMBED<\/strong> (the meaning shifted enough to warrant a new embedding)<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Logs every decision<\/strong> to <code>embedding_change_log<\/code> with the similarity score \u2014 this is your audit trail<\/li>\n<\/ol>\n\n\n\n<p><strong>Multi-chunk articles<\/strong>: When an article has multiple chunks (like &#8220;Dean Martin&#8221; with 3), the detector compares against <code>chunk_index = 0<\/code> only \u2014 the lead section, which concentrates the article&#8217;s core topic. This is a deliberate tradeoff: it&#8217;s fast (one comparison, not N), and for Wikipedia-style content where the introduction summarizes the whole article, it&#8217;s a reliable proxy. For corpora where meaning is spread more evenly across chunks, you&#8217;d want a centroid approach (average the L2-normalized chunk vectors) or max pairwise similarity across corresponding chunks. 
The threshold may need recalibration depending on which strategy you choose.<\/p>\n\n\n\n<p>The <code>--analyze-queue<\/code> flag tells it to analyze all pending items without actually embedding anything. Think of it as a dry run that records decisions.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\npython change_detector.py --analyze-queue\n\n<\/pre><\/div>\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n2026-02-18 14:43:22 &#x5B;DETECTOR] INFO Analyzing 44 pending queue items (threshold=0.95)\n2026-02-18 14:43:22 &#x5B;DETECTOR] INFO Article 6607: EMBED (similarity=0.0000)\n2026-02-18 14:43:22 &#x5B;DETECTOR] INFO Article 36870: EMBED (similarity=0.0000)\n...all 44 show similarity=0.0000...\n2026-02-18 14:43:22 &#x5B;DETECTOR] INFO Results: 44 EMBED, 0 SKIP\n\n<\/pre><\/div>\n\n\n<p>Every single article shows <code>similarity=0.0<\/code>. Why?<\/p>\n\n\n\n<p>Because <code>article_embeddings_versioned<\/code> is <strong>empty<\/strong>. There are no previous embeddings to compare against. The change detector hit step 3 for every article: &#8220;no previous embedding exists \u2192 must EMBED.&#8221;<\/p>\n\n\n\n<p><strong>This is an important operational insight<\/strong>: the change detector needs a baseline to work. On the very first run \u2014 or when you deploy to a new system \u2014 everything must be embedded. The SKIP optimization only kicks in on <strong>subsequent<\/strong> changes, after embeddings exist to compare against. 
If you&#8217;re migrating from a system that already has embeddings in a different format, you&#8217;d need to populate the <code>source_hash<\/code> column from those existing embeddings first to bootstrap the comparison.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-5-create-baseline-embeddings\">Step 5: Create Baseline Embeddings<\/h2>\n\n\n\n<p>Now we need to establish that baseline. Let&#8217;s run the worker for one small batch.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-worker-py-does\">What <code>worker.py<\/code> does<\/h3>\n\n\n\n<p>The worker is the component that actually calls the OpenAI API and writes embeddings to PostgreSQL. Here&#8217;s its internal flow:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Claim items from the queue<\/strong> using <code>SELECT ... FOR UPDATE SKIP LOCKED<\/code> \u2014 this is the concurrency primitive from Part 1. Multiple workers can run simultaneously, and each gets a non-overlapping set of items.<\/li>\n\n\n\n<li><strong>For each claimed item<\/strong>: fetch the article content, split it into chunks (2000-character windows with overlap), and call the OpenAI <code>text-embedding-3-small<\/code> API to generate a 1536-dimensional vector for each chunk.<\/li>\n\n\n\n<li><strong>Write the embeddings<\/strong> to <code>article_embeddings_versioned<\/code> with <code>is_current = true<\/code>, <code>model_name<\/code>, <code>model_version<\/code>, and <code>source_hash<\/code> (the content&#8217;s MD5 at the moment of embedding).<\/li>\n\n\n\n<li><strong>Mark old embeddings<\/strong> for the same article as <code>is_current = false<\/code> (soft delete \u2014 they&#8217;re kept for rollback).<\/li>\n\n\n\n<li><strong>Update the queue item<\/strong> to <code>status = 'completed'<\/code> with <code>processed_at = now()<\/code>.<\/li>\n<\/ol>\n\n\n\n<p>The <code>--once<\/code> flag means &#8220;process one batch and exit&#8221; 
(instead of running in an infinite polling loop). The <code>--batch-size 10<\/code> flag means &#8220;claim up to 10 items at a time.&#8221;<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\npython worker.py --once --batch-size 10\n\n<\/pre><\/div>\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n2026-02-18 14:45:58 &#x5B;8526] INFO Worker worker-once claimed 10 items\n2026-02-18 14:46:02 &#x5B;8526] INFO Article 6607: embedded 1 chunks\n2026-02-18 14:46:02 &#x5B;8526] INFO Article 36870: embedded 1 chunks\n2026-02-18 14:46:05 &#x5B;8526] INFO Article 19078: embedded 1 chunks\n2026-02-18 14:46:05 &#x5B;8526] INFO Article 7947: embedded 1 chunks\n2026-02-18 14:46:05 &#x5B;8526] INFO Article 75802: embedded 2 chunks\n2026-02-18 14:46:05 &#x5B;8526] INFO Article 5150: embedded 1 chunks\n2026-02-18 14:46:06 &#x5B;8526] INFO Article 55579: embedded 1 chunks\n2026-02-18 14:46:06 &#x5B;8526] INFO Article 92697: embedded 1 chunks\n2026-02-18 14:46:06 &#x5B;8526] INFO Article 49417: embedded 3 chunks\n2026-02-18 14:46:06 &#x5B;8526] INFO Article 70595: embedded 1 chunks\nProcessed 10 items\n\n<\/pre><\/div>\n\n\n<p><strong>Reading the output:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>claimed 10 items<\/code> \u2014 the worker took 10 items from the queue using SKIP LOCKED. If another worker ran simultaneously, it would get different items.<\/li>\n\n\n\n<li><code>Article 6607: embedded 1 chunks<\/code> \u2014 this article&#8217;s content fit within a single 2000-character chunk. One API call, one embedding vector stored.<\/li>\n\n\n\n<li><code>Article 75802: embedded 2 chunks<\/code> \u2014 &#8220;Brandenburg Gate&#8221; was longer and required two chunks. 
Two API calls, two embedding vectors, both linked to the same article with <code>chunk_index<\/code> 0 and 1.<\/li>\n\n\n\n<li><code>Article 49417: embedded 3 chunks<\/code> \u2014 &#8220;Dean Martin&#8221; was the longest article in this batch, requiring three chunks.<\/li>\n<\/ul>\n\n\n\n<p>Let&#8217;s verify the data in PostgreSQL:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nwikipedia=# SELECT count(DISTINCT article_id) AS articles, count(*) AS chunks\n  FROM article_embeddings_versioned WHERE is_current = true;\n articles | chunks\n----------+--------\n       10 |     13\n\n<\/pre><\/div>\n\n\n<p>10 articles, 13 chunks. The numbers match the worker output.<\/p>\n\n\n\n<p>Total time: ~8 seconds for 10 articles. <strong>The bottleneck is the OpenAI API call<\/strong> (~300-600ms per embedding request), not PostgreSQL. In this lab, the trigger overhead, queue operations, and embedding writes were all negligible compared to API latency. If you need faster throughput, the answer is more workers (see Step 8) or a local embedding model \u2014 not database optimization.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-6-the-real-demo-skip-vs-embed\">Step 6: The Real Demo \u2014 SKIP vs EMBED<\/h2>\n\n\n\n<p>Now we have a baseline: 10 articles with embeddings and known <code>source_hash<\/code> values. This is the step where the change detector can finally do its job properly.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-targeted-mutations-py-does\">What <code>targeted_mutations.py<\/code> does<\/h3>\n\n\n\n<p>This script applies <strong>specific, known mutation types<\/strong> to the 10 articles we just embedded. 
Unlike <code>simulate_document_changes.py<\/code> (which picks random articles and random mutations), this script is deterministic \u2014 we control exactly what changes happen so we can verify the detector&#8217;s decisions:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>5 articles<\/strong>: append a single period character (<code>.<\/code>) to the content \u2014 the smallest possible content change. This is a typo-level edit that should not change the semantic meaning at all.<\/li>\n\n\n\n<li><strong>3 articles<\/strong>: append a substantial paragraph (~100 words of new information) \u2014 this adds genuine semantic content that should shift the embedding.<\/li>\n\n\n\n<li><strong>2 articles<\/strong>: rewrite the second half of the content \u2014 a major structural change that dramatically alters the meaning.<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\npython examples\/targeted_mutations.py\n\n<\/pre><\/div>\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nEmbedded articles: &#x5B;5150, 6607, 7947, 19078, 36870, 49417, 55579, 70595, 75802, 92697]\n  Article 5150: appended period (typo fix)\n  Article 6607: appended period (typo fix)\n  Article 7947: appended period (typo fix)\n  Article 19078: appended period (typo fix)\n  Article 36870: appended period (typo fix)\n  Article 49417: appended major paragraph\n  Article 55579: appended major paragraph\n  Article 70595: appended major paragraph\n  Article 75802: rewrote second half\n  Article 92697: rewrote second half\nDone - 10 targeted mutations applied\n\n<\/pre><\/div>\n\n\n<p>Each of these UPDATEs fires the trigger, which creates a new queue entry. 
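<\/p>\n\n\n\n<p>A sketch of what those three mutation helpers can look like as plain string transformations (illustrative implementations with made-up filler text; the actual script chooses its own wording and applies the changes with SQL UPDATEs):<\/p>

```python
FILLER = (
    "In later years the subject attracted renewed scholarly attention, "
    "with several studies re-examining its history and influence. "
) * 2

def typo_fix(content: str) -> str:
    # Smallest possible content change: append a single period.
    return content + "."

def paragraph_add(content: str) -> str:
    # Append a substantial paragraph of genuinely new information.
    return content + "\n\n" + FILLER.strip()

def rewrite_second_half(content: str) -> str:
    # Keep the first half of the article, replace the rest with new text.
    half = len(content) // 2
    return content[:half] + FILLER.strip()

doc = "Dean Martin was an American singer and actor. " * 10
assert typo_fix(doc) == doc + "."
assert len(paragraph_add(doc)) > len(doc) + 100
assert rewrite_second_half(doc)[: len(doc) // 2] == doc[: len(doc) // 2]
```

<p>Each helper&#8217;s output, written back with <code>UPDATE articles SET content = ...<\/code>, is enough to fire the trigger pair from Step 1 and enqueue a new comparison.<\/p>\n\n\n\n<p>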
But now \u2014 unlike Step 4 \u2014 we have <strong>existing embeddings<\/strong> to compare against.<\/p>\n\n\n\n<p>Now run the change detector again:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-happens-inside-the-detector-this-time\">What happens inside the detector this time<\/h3>\n\n\n\n<p>For each of the 10 mutated articles, the detector:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Takes the article&#8217;s current (modified) content<\/li>\n\n\n\n<li>Generates a new embedding via OpenAI<\/li>\n\n\n\n<li>Retrieves the existing embedding from <code>article_embeddings_versioned<\/code><\/li>\n\n\n\n<li>Computes cosine similarity between old and new<\/li>\n\n\n\n<li>Applies the 0.95 threshold<\/li>\n<\/ol>\n\n\n\n<p>For the 34 other pending items (from Step 3, still without baseline embeddings), it still returns similarity=0.0.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\npython change_detector.py --analyze-queue\n\n<\/pre><\/div>\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n...34 articles without baseline still show EMBED (similarity=0.0000)...\n\n2026-02-18 14:54:03 &#x5B;DETECTOR] INFO Article 5150: SKIP (similarity=0.9981)\n2026-02-18 14:54:03 &#x5B;DETECTOR] INFO Article 6607: SKIP (similarity=0.9994)\n2026-02-18 14:54:03 &#x5B;DETECTOR] INFO Article 7947: SKIP (similarity=0.9993)\n2026-02-18 14:54:03 &#x5B;DETECTOR] INFO Article 19078: SKIP (similarity=0.9993)\n2026-02-18 14:54:04 &#x5B;DETECTOR] INFO Article 36870: SKIP (similarity=0.9997)\n2026-02-18 14:54:04 &#x5B;DETECTOR] INFO Article 49417: EMBED (similarity=0.9263)\n2026-02-18 14:54:04 &#x5B;DETECTOR] INFO Article 55579: EMBED (similarity=0.9255)\n2026-02-18 14:54:04 &#x5B;DETECTOR] INFO Article 70595: EMBED (similarity=0.9369)\n2026-02-18 14:54:04 &#x5B;DETECTOR] INFO Article 75802: EMBED (similarity=0.6256)\n2026-02-18 14:54:04 &#x5B;DETECTOR] 
INFO Article 92697: EMBED (similarity=0.5090)\n2026-02-18 14:54:04 &#x5B;DETECTOR] INFO Results: 39 EMBED, 5 SKIP\n\n<\/pre><\/div>\n\n\n<h3 class=\"wp-block-heading\" id=\"h-reading-the-results\">Reading the results<\/h3>\n\n\n\n<p><strong>What the similarity numbers mean:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>0.998\u20130.999 (typo fixes)<\/strong>: The old and new embeddings are nearly identical. Adding a period barely shifts the vector in 1536-dimensional space. The detector correctly says: &#8220;this content hasn&#8217;t meaningfully changed \u2014 skip the re-embed.&#8221; That avoids 5 unnecessary write operations, index churn, and version flips.<\/li>\n\n\n\n<li><strong>0.925\u20130.937 (paragraph additions)<\/strong>: Adding 100 words of new information shifts the embedding enough to drop below 0.95. The detector correctly says: &#8220;the semantic content changed \u2014 re-embed.&#8221; The new paragraph about Dean Martin&#8217;s film career or Brandenburg Gate&#8217;s Cold War history needs to be reflected in the vector.<\/li>\n\n\n\n<li><strong>0.509\u20130.626 (section rewrites)<\/strong>: Rewriting half the article dramatically changes the meaning. These similarities are far below the threshold \u2014 clearly needing re-embedding.<\/li>\n\n\n\n<li><strong>0.0 (no baseline)<\/strong>: The 34 articles from Step 3 that still have no embeddings. Can&#8217;t compare what doesn&#8217;t exist yet.<\/li>\n<\/ul>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><strong>Cost honesty note<\/strong>: The detector uses embedding similarity, which means it calls OpenAI once per article to generate the comparison vector \u2014 even for articles it ultimately SKIPs. So SKIP doesn&#8217;t eliminate API spend; it eliminates <strong>unnecessary writes, index churn, and version flips<\/strong>. 
For single-chunk articles (the majority in this lab), the detection call is the same cost as the embedding call itself. The real savings show up with multi-chunk articles: the detector spends 1 API call to decide, versus N calls to re-embed all chunks. In production, you&#8217;d add <strong>cheaper pre-filters first<\/strong>: <code>content_hash<\/code> comparison (free, catches identical content), then text diff ratio (cheap, catches typos), and reserve embedding-similarity checks for borderline cases where the content changed but the semantic impact is unclear. That&#8217;s the graduation path Part 1 describes.<\/p>\n<\/blockquote>\n\n\n\n<p><strong>The key insight<\/strong>: there&#8217;s a <strong>clean gap<\/strong> between the typo group (lowest: 0.9981) and the paragraph group (highest: 0.9369). That gap from 0.937 to 0.998 is where our 0.95 threshold sits. It doesn&#8217;t fall in ambiguous territory. The change types cluster naturally, which is what makes threshold-based detection practical in the real world.<\/p>\n\n\n\n<p>The queue now reflects the decisions:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nwikipedia=# SELECT status, count(*) FROM embedding_queue GROUP BY status;\n  status   | count\n-----------+-------\n skipped   |     5\n completed |    10\n pending   |    39\n\n<\/pre><\/div>\n\n\n<ul class=\"wp-block-list\">\n<li><strong>5 skipped<\/strong>: the typo-level changes \u2014 unnecessary writes avoided, no quality loss<\/li>\n\n\n\n<li><strong>10 completed<\/strong>: the baseline embeddings from Step 5<\/li>\n\n\n\n<li><strong>39 pending<\/strong>: 34 no-baseline articles + 5 newly-detected EMBED items, waiting for the worker<\/li>\n<\/ul>\n\n\n\n<p>The <code>skipped<\/code> status is an audit trail \u2014 you can always go back and see what was skipped, when, and at what similarity score (recorded in <code>embedding_change_log<\/code>).<\/p>\n\n\n\n<hr 
class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-7-freshness-monitoring-report\">Step 7: Freshness Monitoring Report<\/h2>\n\n\n\n<p>In production, you need a dashboard \u2014 not individual log lines. The <code>freshness_monitor.py<\/code> script consolidates all the monitoring queries from Part 1 into a single diagnostic report.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-freshness-monitor-py-does\">What <code>freshness_monitor.py<\/code> does<\/h3>\n\n\n\n<p>The script runs five monitoring queries against the database and formats them into a human-readable report:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Freshness summary<\/strong>: How many articles have embeddings? How many are stale (content changed since last embedding)?<\/li>\n\n\n\n<li><strong>Stale articles detail<\/strong>: Which specific articles have drifted \u2014 showing both the current content hash and the embedding&#8217;s source hash so you can see the mismatch<\/li>\n\n\n\n<li><strong>Queue health<\/strong>: Breakdown by status with timestamps \u2014 tells you if items are stuck or if the queue is draining properly<\/li>\n\n\n\n<li><strong>Version coverage<\/strong>: Which embedding models are in use and how many articles\/chunks each covers<\/li>\n\n\n\n<li><strong>Change detection decisions<\/strong>: Aggregated SKIP\/EMBED statistics with average similarity scores<\/li>\n<\/ol>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\npython freshness_monitor.py --report\n\n<\/pre><\/div>\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nEmbedding Freshness Report \u2014 2026-02-18 14:57:53\n\n============================================================\n  Freshness Summary\n============================================================\n  Total articles:        25000\n  With embeddings:       
10  (0.0%)\n  Without embeddings:    24990\n  Stale embeddings:      10  (100.0%)\n\n<\/pre><\/div>\n\n\n<p><strong>Reading this<\/strong>: Only 10 of 25,000 articles have versioned embeddings (from Step 5). All 10 are &#8220;stale&#8221; because we just mutated all of them in Step 6. In a real deployment, you&#8217;d see something like &#8220;23,450 with embeddings (93.8%), 312 stale (1.3%)&#8221; \u2014 and you&#8217;d alert if stale exceeded, say, 5%.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n============================================================\n  Stale Articles (content changed since embedding)\n============================================================\n  ID    | Title                                  | Current Hash     | Embed Hash       | ...\n  ------+----------------------------------------+------------------+------------------+----\n  5150  | 1787                                   | 5b14bc4a2d...    | 11e81bc4de...    | ...\n  6607  | Needle                                 | 3ebb3c3cbb...    | 5c5290b5a7...    | ...\n  49417 | Dean Martin                            | 7061f1803f...    | f7fd9f30e6...    | ...\n  75802 | Brandenburg Gate                       | 7da53df7a0...    | 5a2dcc01f9...    | ...\n  ...6 more...\n\n<\/pre><\/div>\n\n\n<p>The <code>Current Hash<\/code> and <code>Embed Hash<\/code> columns are the two MD5 fingerprints. When they don&#8217;t match, it means the article&#8217;s content has changed since we last generated its embedding. Article 5150 (&#8220;1787&#8221;) shows different hashes even though we only appended a period \u2014 the MD5 captures <em>any<\/em> content change, even trivial ones. 
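<\/p>

<p>A toy sketch of how the two checks stack, with made-up three-dimensional vectors standing in for the real 1536-dimensional OpenAI embeddings (the hash flags any byte change; the similarity check decides whether it matters):<\/p>

```python
import hashlib
import math

def content_hash(text: str) -> str:
    # Byte-level fingerprint: flips on ANY edit, even an appended period.
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# The trivial edit changes the hash...
assert content_hash("1787 was a year.") != content_hash("1787 was a year..")

# ...but barely moves the vector, so the 0.95 threshold says SKIP.
old_vec, new_vec = [0.20, 0.70, 0.10], [0.21, 0.69, 0.11]
decision = "SKIP" if cosine_similarity(old_vec, new_vec) >= 0.95 else "EMBED"
```

<p>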
The <strong>change detector<\/strong> is what decides whether the difference matters semantically (and it said SKIP for this one).<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n============================================================\n  Queue Health\n============================================================\n  Status    | Count | Oldest                 | Newest\n  ----------+-------+------------------------+------------------------\n  pending   | 39    | 2026-02-18 14:39:59    | 2026-02-18 14:53:55\n  completed | 10    | 2026-02-18 14:39:58    | 2026-02-18 14:39:59\n  skipped   | 5     | 2026-02-18 14:53:55    | 2026-02-18 14:53:55\n\n<\/pre><\/div>\n\n\n<p>The queue is healthy but has a backlog. 39 items pending, oldest from ~15 minutes ago. In production, you&#8217;d watch the gap between &#8220;Oldest&#8221; and &#8220;Newest&#8221; \u2014 if the oldest item keeps getting older while new items are added, your workers can&#8217;t keep up. That&#8217;s when you scale up workers (see Step 8) or increase batch size.<\/p>\n\n\n\n<p>The 10 <code>completed<\/code> items are from Step 5, the 5 <code>skipped<\/code> from Step 6&#8217;s change detector.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n============================================================\n  Embedding Version Coverage\n============================================================\n  Model Version          | Articles | Chunks | Current\n  -----------------------+----------+--------+--------\n  text-embedding-3-small | 10       | 13     | 13\n\n<\/pre><\/div>\n\n\n<p> Only one model version in use, covering 10 articles with 13 chunks, all current. 
During a blue-green model upgrade (Part 1&#8217;s model versioning section), you&#8217;d see two rows here \u2014 v1 and v2 \u2014 and track coverage convergence.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n============================================================\n  Change Detection Decisions\n============================================================\n  Decision | Count | Avg Similarity\n  ---------+-------+---------------\n  EMBED    | 83    | 0.0473\n  SKIP     | 5     | 0.9992\n\n<\/pre><\/div>\n\n\n<p>The average similarity for EMBED decisions is 0.0473 because most of those 83 decisions had similarity=0.0 (no baseline). The 5 SKIPs have an average of 0.9992 \u2014 confirming these were truly trivial changes. In a mature deployment, the EMBED average similarity would be higher (0.7\u20130.9 range) and the SKIP\/EMBED ratio would tell you how efficient your threshold is.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-8-skip-locked-multi-worker-concurrency\">Step 8: SKIP LOCKED \u2014 Multi-Worker Concurrency<\/h2>\n\n\n\n<p>This is the demo that proves the theory from Part 1&#8217;s deep dive on <code>SELECT FOR UPDATE SKIP LOCKED<\/code>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-demo-skip-locked-py-does\">What <code>demo_skip_locked.py<\/code> does<\/h3>\n\n\n\n<p>The script launches multiple Python threads that each behave like independent embedding workers. 
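<\/p>

<p>The property the demo verifies \u2014 zero overlap without blocking \u2014 can be mimicked in pure Python with non-blocking lock acquisition. This is an analogy only: the real workers coordinate through PostgreSQL row locks, not threading primitives.<\/p>

```python
import threading

# Toy queue: item id -> a lock standing in for a PostgreSQL row lock.
queue = {i: threading.Lock() for i in range(39)}
claims = {}

def worker(name: str, limit: int = 14) -> None:
    got = []
    for item, lock in queue.items():
        if len(got) >= limit:
            break
        # acquire(blocking=False) is the analogue of SKIP LOCKED:
        # if another worker holds the item, skip it instead of waiting.
        if lock.acquire(blocking=False):
            got.append(item)
    claims[name] = got

threads = [threading.Thread(target=worker, args=(f"w{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

claimed = [item for got in claims.values() for item in got]
# Every item ends up claimed exactly once: 39 claims, no duplicates.
```

Swap the locks for rows and the threads for database sessions and you have what the demo exercises against the real queue.

<p>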
Each thread:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Opens its own database connection<\/li>\n\n\n\n<li>Runs <code>UPDATE embedding_queue SET status='processing' WHERE queue_id IN (SELECT queue_id FROM embedding_queue WHERE status='pending' ORDER BY queued_at FOR UPDATE SKIP LOCKED LIMIT n)<\/code> \u2014 the exact same claim query the real worker uses (note the <code>ORDER BY queued_at<\/code> \u2014 without it, selection order is not deterministic and oldest-first is not guaranteed)<\/li>\n\n\n\n<li>Records which <code>queue_id<\/code> values it got<\/li>\n\n\n\n<li>Does NOT actually call OpenAI (this is a concurrency demo, not an embedding demo)<\/li>\n<\/ol>\n\n\n\n<p>After all threads finish, the script checks for <strong>overlap<\/strong>: did any two workers claim the same item? The answer should always be zero.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\npython examples\/demo_skip_locked.py --workers 4 --items 39\n\n<\/pre><\/div>\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n============================================================\n  Demo: SKIP LOCKED Multi-Worker Concurrency\n  Workers: 4  |  Target items: 39\n============================================================\n\nLaunching 4 workers (each requesting up to 14 items)...\n\n  demo-worker-0: claimed 14 items  (articles: &#x5B;96746, 37330, 67708, 32834, 46541]...)\n  demo-worker-1: claimed 14 items  (articles: &#x5B;57924, 20028, 65749, 92016, 24921]...)\n  demo-worker-2: claimed 11 items  (articles: &#x5B;66390, 27221, 30148, 97917, 30449]...)\n  demo-worker-3: claimed 0 items   (articles: &#x5B;])\n\n========================================\n  Total items claimed:  39\n  Unique articles:      39\n  Elapsed time:         0.05s\n\n  ZERO OVERLAP \u2014 SKIP LOCKED working 
correctly!\n============================================================\n\n<\/pre><\/div>\n\n\n<h3 class=\"wp-block-heading\" id=\"h-reading-the-output\">Reading the output<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>14 + 14 + 11 + 0 = 39<\/strong> \u2014 every pending item was claimed exactly once<\/li>\n\n\n\n<li><strong>Zero overlap<\/strong> \u2014 no item was processed by more than one worker<\/li>\n\n\n\n<li><strong>0.05 seconds<\/strong> \u2014 the entire distribution happened in 50 milliseconds<\/li>\n\n\n\n<li><strong>Worker 3 got 0 items<\/strong>: This is actually the ideal outcome. The first 3 workers were fast enough to drain the queue before Worker 3&#8217;s <code>SELECT ... SKIP LOCKED<\/code> could find any unlocked rows. In a real deployment where each item takes 300-500ms (OpenAI API call), all 4 workers would stay busy and you&#8217;d see approximately even distribution.<\/li>\n<\/ul>\n\n\n\n<p><strong>Why <code>SKIP LOCKED<\/code> and not regular <code>FOR UPDATE<\/code>?<\/strong> With regular <code>FOR UPDATE<\/code>, Worker 1 would lock rows and Worker 2 would <strong>wait<\/strong> (block) until Worker 1&#8217;s transaction commits. With <code>SKIP LOCKED<\/code>, Worker 2 <strong>skips<\/strong> the locked rows and grabs the next available ones immediately. No blocking, no deadlocks, no coordination.<\/p>\n\n\n\n<p>This is pure PostgreSQL. No Redis, no RabbitMQ, no SQS. One SQL query, one feature (<code>SKIP LOCKED<\/code>), and you have a production-grade concurrent work queue. If you need to process your embedding queue faster, just add workers \u2014 throughput scales linearly.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-9a-end-to-end-trigger-flow\">Step 9a: End-to-End Trigger Flow<\/h2>\n\n\n\n<p>Every previous step ran parts of the pipeline in isolation. 
This demo shows the <strong>complete lifecycle<\/strong> of a single article change \u2014 from <code>UPDATE<\/code> to searchable embedding.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-demo-trigger-flow-py-does\">What <code>demo_trigger_flow.py<\/code> does<\/h3>\n\n\n\n<p>The script picks one article and walks through the full pipeline synchronously:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Checks the queue<\/strong> for this article (should be empty)<\/li>\n\n\n\n<li><strong>Updates the article&#8217;s content<\/strong> (appending demo text)<\/li>\n\n\n\n<li><strong>Verifies the trigger fired<\/strong> by checking the queue again (should now have a pending entry)<\/li>\n\n\n\n<li><strong>Shows the article&#8217;s metadata<\/strong> (new content_hash, updated_at)<\/li>\n\n\n\n<li><strong>Runs the worker<\/strong> for exactly this one item (calls OpenAI, writes embeddings)<\/li>\n\n\n\n<li><strong>Verifies the embeddings<\/strong> are in <code>article_embeddings_versioned<\/code><\/li>\n<\/ol>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\npython examples\/demo_trigger_flow.py\n\n<\/pre><\/div>\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n============================================================\n  Demo: End-to-End Trigger Flow\n  Article: &#x5B;86698] Thin film transistor liquid crystal display\n============================================================\n\n1. Queue entries (pending) for article 86698 BEFORE update: 0\n\n<\/pre><\/div>\n\n\n<p>Nothing in the queue yet \u2014 clean starting state.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n2. Updated article content (appended demo text)\n\n<\/pre><\/div>\n\n\n<p>An <code>UPDATE articles SET content = content || '...' WHERE id = 86698<\/code> just ran. 
Two triggers fired: <code>trg_content_hash<\/code> (BEFORE, recomputed the MD5) and <code>trg_queue_embedding<\/code> (AFTER, inserted a queue entry).<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n3. Trigger fired! Queue entry created:\n   Queue ID:     57\n   Status:       pending\n   Content Hash: b5a7c0820832fd54...\n   Queued At:    2026-02-18 15:05:29.303062+00:00\n\n<\/pre><\/div>\n\n\n<p>The trigger did its job. A new <code>pending<\/code> item is in the queue with the article&#8217;s current content hash. Note the timestamp \u2014 in this lab, the trigger overhead was negligible.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n4. Article metadata updated:\n   Content Hash: b5a7c0820832fd54...\n   Updated At:   2026-02-18 15:05:29.303062+00:00\n\n<\/pre><\/div>\n\n\n<p>The article&#8217;s <code>content_hash<\/code> matches the queue entry&#8217;s hash \u2014 they were set by the same trigger. This hash will later be stored as <code>source_hash<\/code> on the embedding, creating the audit chain: <em>&#8220;this embedding was generated from this exact version of the content.&#8221;<\/em><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n5. Running worker for one batch...\n   Article 86698: embedded 3 chunks\n   Processed 1 items\n\n<\/pre><\/div>\n\n\n<p>The worker claimed this item, called OpenAI 3 times (3 chunks), and wrote the embeddings to <code>article_embeddings_versioned<\/code>.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n6. 
Embeddings for article 86698:\n   Current chunks: 3\n   Last created:   2026-02-18 15:05:29.388968+00:00\n\n============================================================\n  Demo complete!\n============================================================\n\n<\/pre><\/div>\n\n\n<p><strong>The complete flow \u2014 from content modification to searchable embeddings \u2014 took about 1 second.<\/strong> The latency breakdown: ~50ms for PostgreSQL (trigger + queue + insert), ~900ms for OpenAI (3 embedding API calls). In a production system with a continuously running worker, this latency would be the norm for every content change.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-9b-quality-feedback-loop\">Step 9b: Quality Feedback Loop<\/h2>\n\n\n\n<p>The final piece, and the one that closes the architecture. Everything so far reacts to <strong>content changes<\/strong>. But what if the embeddings are technically &#8220;fresh&#8221; (content hasn&#8217;t changed) yet <strong>search quality is degrading<\/strong>? Maybe the model isn&#8217;t capturing certain topics well, or the chunking strategy doesn&#8217;t work for some article types.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-demo-quality-drift-py-does\">What <code>demo_quality_drift.py<\/code> does<\/h3>\n\n\n\n<p>This script simulates the quality feedback loop described in Part 1&#8217;s monitoring section. It works in four phases:<\/p>\n\n\n\n<p><strong>Phase 1 \u2014 Simulate retrieval quality logs<\/strong>: The script generates 20 fake search queries with associated quality metrics (nDCG, precision@k, user satisfaction scores). 
It deliberately creates a pattern where quality metrics decline for certain topics \u2014 simulating what would happen if embeddings for some subject areas became less effective over time.<\/p>\n\n\n\n<p><strong>Phase 2 \u2014 Quality analysis<\/strong>: The script scans <code>retrieval_quality_log<\/code> looking for queries with poor results: low nDCG scores (below a configurable threshold) or negative user feedback. It identifies 8 queries where quality dropped.<\/p>\n\n\n\n<p><strong>Phase 3 \u2014 Article correlation<\/strong>: For each poor-performing query, the script finds related articles using <code>title ILIKE '%keyword%'<\/code> matching. This is a simplified version of what a production system would do (where you&#8217;d use the query&#8217;s actual retrieved results instead of keyword matching). It identifies 29 articles that might be causing poor search results.<\/p>\n\n\n\n<p><strong>Phase 4 \u2014 Automatic re-queuing<\/strong>: All 29 articles are inserted into <code>embedding_queue<\/code> with <code>change_type = 'quality_reembed'<\/code> instead of the usual <code>'content_update'<\/code>. 
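<\/p>

<p>Phases 2 to 4 in miniature \u2014 the rows, threshold, and article IDs below are invented for illustration; the real logic reads <code>retrieval_quality_log<\/code> and writes <code>embedding_queue<\/code>:<\/p>

```python
NDCG_FLOOR = 0.5  # assumed threshold; the lab makes this configurable

quality_log = [  # hypothetical retrieval_quality_log rows
    {"query": "dean martin films", "ndcg": 0.31, "articles": [49417]},
    {"query": "berlin landmarks", "ndcg": 0.44, "articles": [75802, 92697]},
    {"query": "sewing needles", "ndcg": 0.82, "articles": [6607]},
]

# Flag poor queries, collect their correlated articles, and re-queue them
# with a change_type that tells the worker to bypass the significance filter.
to_requeue = sorted({a for row in quality_log if row["ndcg"] < NDCG_FLOOR
                     for a in row["articles"]})
queue_rows = [{"article_id": a, "change_type": "quality_reembed"}
              for a in to_requeue]
```

<p>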
This distinction is critical \u2014 it means the re-embedding is happening not because the content changed, but because the quality metrics flagged a problem.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\npython examples\/demo_quality_drift.py\n\n<\/pre><\/div>\n\n\n<p>The demo runs through all four phases and produces a final queue state:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nwikipedia=# SELECT change_type, status, count(*) \n  FROM embedding_queue GROUP BY change_type, status ORDER BY change_type, status;\n  change_type    |  status   | count\n-----------------+-----------+-------\n content_update  | completed |    50\n content_update  | skipped   |     5\n quality_reembed | pending   |    29\n\n<\/pre><\/div>\n\n\n<h3 class=\"wp-block-heading\" id=\"h-reading-the-queue-state\">Reading the queue state<\/h3>\n\n\n\n<p>Three distinct categories tell the full pipeline story:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>50 <code>content_update<\/code> \/ <code>completed<\/code><\/strong>: the normal pipeline flow \u2014 content changed, trigger fired, worker embedded. This is Layers 1 and 2 doing their job.<\/li>\n\n\n\n<li><strong>5 <code>content_update<\/code> \/ <code>skipped<\/code><\/strong>: the typo-level changes from Step 6 \u2014 the change detector said &#8220;not worth re-embedding.&#8221; This is Layer 2&#8217;s cost optimization.<\/li>\n\n\n\n<li><strong>29 <code>quality_reembed<\/code> \/ <code>pending<\/code><\/strong>: the feedback loop&#8217;s contribution \u2014 these articles weren&#8217;t re-queued because their content changed (it may not have). 
They were re-queued because <strong>search quality dropped<\/strong> for queries related to them.<\/li>\n<\/ul>\n\n\n\n<p><strong>Why the <code>quality_reembed<\/code> change type matters<\/strong>: When the worker processes these items, it bypasses the change significance detector. If the detector were to analyze them, it might say &#8220;similarity=0.998 \u2192 SKIP&#8221; because the content barely changed. But that&#8217;s the whole point \u2014 the content didn&#8217;t change, yet the embeddings aren&#8217;t serving search well. The quality feedback overrides the filter.<\/p>\n\n\n\n<p>This is the three-layer architecture from Part 1 working in practice:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Triggers<\/strong> (Layer 1): react to content changes immediately \u2014 the broadest net<\/li>\n\n\n\n<li><strong>Change significance<\/strong> (Layer 2): filter out trivial changes, saving API cost \u2014 the optimization layer<\/li>\n\n\n\n<li><strong>Quality feedback<\/strong> (Layer 3): catch what the filter missed or what wasn&#8217;t about content changes at all \u2014 the safety net<\/li>\n<\/ol>\n\n\n\n<p>Each layer compensates for the blind spots of the previous one.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-key-takeaways\">Key Takeaways<\/h2>\n\n\n\n<p><strong>1. The trigger is smarter than you think.<\/strong> Using <code>UPDATE OF content<\/code> means metadata-only changes never touch the embedding pipeline. In our test, 12% of mutations (6 out of 50) were filtered out at the trigger level, before any Python code ran. In a real knowledge base with tag edits, status changes, and metadata updates, this fraction could be substantially higher.<\/p>\n\n\n\n<p><strong>2. The change detector needs a baseline.<\/strong> On the first run, every article shows <code>similarity=0.0<\/code> because there&#8217;s nothing to compare against. 
This is correct behavior, but you need to plan for the initial backfill being 100% EMBED. Budget the API cost and time accordingly.<\/p>\n\n\n\n<p><strong>3. The 0.95 threshold is validated.<\/strong> Typo-level changes (appending a period) scored 0.998+, paragraph additions scored ~0.93, and section rewrites scored 0.51\u20130.63. There&#8217;s a clear gap between &#8220;trivial&#8221; and &#8220;significant&#8221; that the threshold exploits. You don&#8217;t need machine learning or complex heuristics \u2014 cosine similarity with a simple threshold works.<\/p>\n\n\n\n<p><strong>4. SKIP LOCKED is production-ready.<\/strong> 4 workers, 39 items, zero overlap, 0.05 seconds. No external dependencies, no coordination service. This is the simplest correct way to build a concurrent work queue in PostgreSQL. Need more throughput? Add workers.<\/p>\n\n\n\n<p><strong>5. Quality metrics close the loop.<\/strong> The change significance filter reduces unnecessary writes and index churn, but it can&#8217;t know if a small change was semantically important \u2014 or if the embedding was poor to begin with. The quality feedback loop catches those cases by correlating low-quality retrievals with specific articles and forcing re-embedding. Three layers, each compensating for the blind spots of the previous one.<\/p>\n\n\n\n<p><strong>6. The bottleneck is the API, not PostgreSQL.<\/strong> 10 articles embedded in ~8 seconds, with each OpenAI call taking 300-600ms. In this lab, PostgreSQL&#8217;s trigger + queue overhead was negligible compared to API latency. 
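<\/p>

<p>Back-of-envelope drain-time math with this lab&#8217;s numbers (0.45 s is an assumed midpoint of the observed 300-600 ms per call):<\/p>

```python
ITEMS = 39         # pending queue depth after Step 6
PER_CALL_S = 0.45  # assumed average OpenAI embedding call latency

def drain_time_s(workers: int) -> float:
    # API-bound queue: wall-clock drain time shrinks ~linearly with workers.
    return ITEMS * PER_CALL_S / workers

single, four = drain_time_s(1), drain_time_s(4)
# one worker: ~17.6 s; four workers: ~4.4 s
```

<p>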
If you need faster throughput, add workers (SKIP LOCKED scales linearly) or switch to a local embedding model like <code>nomic-embed-text<\/code> via Ollama.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-running-it-yourself\">Running It Yourself<\/h2>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\ngit clone https:\/\/github.com\/boutaga\/pgvector_RAG_search_lab.git\ncd pgvector_RAG_search_lab\n\n# Ensure Wikipedia database is loaded (see Lab 2 in README)\n# You&#039;ll need: PostgreSQL 17+, pgvector, pgvectorscale, an OpenAI API key\n\n# Step 1: Apply schema\npsql -d wikipedia -f lab\/05_embedding_versioning\/schema.sql\n\n# Step 3: Simulate changes (Step 2 is a manual SQL test)\npython lab\/05_embedding_versioning\/examples\/simulate_document_changes.py --count 50\n\n# Step 4: Run change detector (all EMBED on first run \u2014 no baseline yet)\npython lab\/05_embedding_versioning\/change_detector.py --analyze-queue\n\n# Step 5: Create baseline embeddings (requires OPENAI_API_KEY env var)\npython lab\/05_embedding_versioning\/worker.py --once --batch-size 10\n\n# Step 6: Apply targeted mutations, then re-run detector\npython lab\/05_embedding_versioning\/examples\/targeted_mutations.py\npython lab\/05_embedding_versioning\/change_detector.py --analyze-queue\n\n# Step 7: Full freshness report\npython lab\/05_embedding_versioning\/freshness_monitor.py --report\n\n# Step 8: SKIP LOCKED concurrency demo\npython lab\/05_embedding_versioning\/examples\/demo_skip_locked.py --workers 4\n\n# Step 9a: End-to-end trigger flow\npython lab\/05_embedding_versioning\/examples\/demo_trigger_flow.py\n\n# Step 9b: Quality feedback loop\npython lab\/05_embedding_versioning\/examples\/demo_quality_drift.py\n\n<\/pre><\/div>\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h2 class=\"wp-block-heading\" 
id=\"h-what-s-next\">What&#8217;s Next<\/h2>\n\n\n\n<p>In the next post, I&#8217;ll explore <strong>benchmarking pgvectorscale&#8217;s StreamingDiskANN at scale<\/strong> \u2014 with real numbers on query latency, recall, index build time, and memory footprint at different dataset sizes. We&#8217;ll use the same Wikipedia dataset and the versioned embedding infrastructure from this lab.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction This is Part 2 of the embedding versionin, in Part 1, I covered the theory: why event-driven embedding refresh matters, the three levels of architecture (triggers, logical replication, Flink CDC), and how to detect and skip insignificant changes. If you haven&#8217;t read it, go there first, this post won&#8217;t through the entire intent of [&hellip;]<\/p>\n","protected":false},"author":153,"featured_media":40270,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[83],"tags":[3523,2602,3678],"type_dbi":[3869,2749,3868],"class_list":["post-43036","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-postgresql","tag-pgvector","tag-postgresql-2","tag-rag","type-pgvector","type-postgresql","type-rag"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.2 (Yoast SEO v27.2) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>RAG Series \u2013 Embedding Versioning LAB - dbi Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"RAG Series \u2013 Embedding Versioning LAB\" \/>\n<meta 
property=\"og:description\" content=\"Introduction This is Part 2 of the embedding versionin, in Part 1, I covered the theory: why event-driven embedding refresh matters, the three levels of architecture (triggers, logical replication, Flink CDC), and how to detect and skip insignificant changes. If you haven&#8217;t read it, go there first, this post won&#8217;t through the entire intent of [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/\" \/>\n<meta property=\"og:site_name\" content=\"dbi Blog\" \/>\n<meta property=\"article:published_time\" content=\"2026-02-22T21:36:36+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-02-22T21:42:16+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2025\/09\/elephant.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1344\" \/>\n\t<meta property=\"og:image:height\" content=\"768\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Adrien Obernesser\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Adrien Obernesser\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"18 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/\"},\"author\":{\"name\":\"Adrien Obernesser\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/fd2ab917212ce0200c7618afaa7fdbcd\"},\"headline\":\"RAG Series \u2013 Embedding Versioning LAB\",\"datePublished\":\"2026-02-22T21:36:36+00:00\",\"dateModified\":\"2026-02-22T21:42:16+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/\"},\"wordCount\":3945,\"commentCount\":0,\"image\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2025\/09\/elephant.png\",\"keywords\":[\"pgvector\",\"postgresql\",\"RAG\"],\"articleSection\":[\"PostgreSQL\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/\",\"url\":\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/\",\"name\":\"RAG Series \u2013 Embedding Versioning LAB - dbi 
Blog\",\"isPartOf\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2025\/09\/elephant.png\",\"datePublished\":\"2026-02-22T21:36:36+00:00\",\"dateModified\":\"2026-02-22T21:42:16+00:00\",\"author\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/fd2ab917212ce0200c7618afaa7fdbcd\"},\"breadcrumb\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#primaryimage\",\"url\":\"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2025\/09\/elephant.png\",\"contentUrl\":\"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2025\/09\/elephant.png\",\"width\":1344,\"height\":768},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Accueil\",\"item\":\"https:\/\/www.dbi-services.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"RAG Series \u2013 Embedding Versioning LAB\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/#website\",\"url\":\"https:\/\/www.dbi-services.com\/blog\/\",\"name\":\"dbi 
Blog\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.dbi-services.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/fd2ab917212ce0200c7618afaa7fdbcd\",\"name\":\"Adrien Obernesser\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/secure.gravatar.com\/avatar\/dc9316c729e50107159e0a1e631b9c1742ce8898576887d0103c83b1ca3bc9e6?s=96&d=mm&r=g\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/dc9316c729e50107159e0a1e631b9c1742ce8898576887d0103c83b1ca3bc9e6?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/dc9316c729e50107159e0a1e631b9c1742ce8898576887d0103c83b1ca3bc9e6?s=96&d=mm&r=g\",\"caption\":\"Adrien Obernesser\"},\"url\":\"https:\/\/www.dbi-services.com\/blog\/author\/adrienobernesser\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"RAG Series \u2013 Embedding Versioning LAB - dbi Blog","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/","og_locale":"en_US","og_type":"article","og_title":"RAG Series \u2013 Embedding Versioning LAB","og_description":"Introduction This is Part 2 of the embedding versionin, in Part 1, I covered the theory: why event-driven embedding refresh matters, the three levels of architecture (triggers, logical replication, Flink CDC), and how to detect and skip insignificant changes. 
If you haven&#8217;t read it, go there first; this post won&#8217;t go through the entire intent of [&hellip;]","og_url":"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/","og_site_name":"dbi Blog","article_published_time":"2026-02-22T21:36:36+00:00","article_modified_time":"2026-02-22T21:42:16+00:00","og_image":[{"width":1344,"height":768,"url":"http:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2025\/09\/elephant.png","type":"image\/png"}],"author":"Adrien Obernesser","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Adrien Obernesser","Est. reading time":"18 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#article","isPartOf":{"@id":"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/"},"author":{"name":"Adrien Obernesser","@id":"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/fd2ab917212ce0200c7618afaa7fdbcd"},"headline":"RAG Series \u2013 Embedding Versioning LAB","datePublished":"2026-02-22T21:36:36+00:00","dateModified":"2026-02-22T21:42:16+00:00","mainEntityOfPage":{"@id":"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/"},"wordCount":3945,"commentCount":0,"image":{"@id":"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#primaryimage"},"thumbnailUrl":"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2025\/09\/elephant.png","keywords":["pgvector","postgresql","RAG"],"articleSection":["PostgreSQL"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/","url":"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/","name":"RAG Series 
\u2013 Embedding Versioning LAB - dbi Blog","isPartOf":{"@id":"https:\/\/www.dbi-services.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#primaryimage"},"image":{"@id":"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#primaryimage"},"thumbnailUrl":"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2025\/09\/elephant.png","datePublished":"2026-02-22T21:36:36+00:00","dateModified":"2026-02-22T21:42:16+00:00","author":{"@id":"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/fd2ab917212ce0200c7618afaa7fdbcd"},"breadcrumb":{"@id":"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#primaryimage","url":"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2025\/09\/elephant.png","contentUrl":"https:\/\/www.dbi-services.com\/blog\/wp-content\/uploads\/sites\/2\/2025\/09\/elephant.png","width":1344,"height":768},{"@type":"BreadcrumbList","@id":"https:\/\/www.dbi-services.com\/blog\/rag-series-embedding-versioning-lab\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Accueil","item":"https:\/\/www.dbi-services.com\/blog\/"},{"@type":"ListItem","position":2,"name":"RAG Series \u2013 Embedding Versioning LAB"}]},{"@type":"WebSite","@id":"https:\/\/www.dbi-services.com\/blog\/#website","url":"https:\/\/www.dbi-services.com\/blog\/","name":"dbi 
Blog","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.dbi-services.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/fd2ab917212ce0200c7618afaa7fdbcd","name":"Adrien Obernesser","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/dc9316c729e50107159e0a1e631b9c1742ce8898576887d0103c83b1ca3bc9e6?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/dc9316c729e50107159e0a1e631b9c1742ce8898576887d0103c83b1ca3bc9e6?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/dc9316c729e50107159e0a1e631b9c1742ce8898576887d0103c83b1ca3bc9e6?s=96&d=mm&r=g","caption":"Adrien Obernesser"},"url":"https:\/\/www.dbi-services.com\/blog\/author\/adrienobernesser\/"}]}},"_links":{"self":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts\/43036","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/users\/153"}],"replies":[{"embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/comments?post=43036"}],"version-history":[{"count":9,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts\/43036\/revisions"}],"predecessor-version":[{"id":43100,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts\/43036\/revisions\/43100"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/media\/40270"}],"wp:attachment":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/media?parent=43036"}],"wp:t
erm":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/categories?post=43036"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/tags?post=43036"},{"taxonomy":"type","embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/type_dbi?post=43036"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}