{"id":4379,"date":"2015-02-15T19:21:00","date_gmt":"2015-02-15T18:21:00","guid":{"rendered":"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/"},"modified":"2015-02-15T19:21:00","modified_gmt":"2015-02-15T18:21:00","slug":"oracletext-deletes-and-garbage","status":"publish","type":"post","link":"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/","title":{"rendered":"OracleText: deletes and garbage"},"content":{"rendered":"<h2>By Franck Pachot<\/h2>\n<p>.<br \/>\nIn the <a href=\"https:\/\/www.dbi-services.com\/index.php\/blog\/entry\/oracletext-inserts-and-fragmentation\">previous post<\/a> we have seen how the OracleText index tables are maintained when new document arrives: At sync the new documents are read up to the available memory and words are inserted in the $I table with their mapping information. Now we will see how removed documents are processed. We will not cover updates as their are just delete + insert.<\/p>\n<h3>Previous state<\/h3>\n<p>Here is the state from the previous post where I had those 3 documents:<\/p>\n<ul>\n<li>&#8216;Hello World&#8217;<\/li>\n<\/ul>\n<p>which was synced alone, and then the two following ones were synced together:<\/p>\n<ul>\n<li>&#8216;Hello Moon, hello, hello&#8217;<\/li>\n<li>&#8216;Hello Mars&#8217;<\/li>\n<\/ul>\n<p>The $K is a IOT which maps the OracleText table ROWID to the DOCID (the fact that the primary key TEXTKEY is not at start is a bit misleading):<\/p>\n<pre><code>SQL&gt; select * from DR$DEMO_CTX_INDEX$K;\n\n     DOCID TEXTKEY\n---------- ------------------\n         1 AAAXUtAAKAAABWlAAA\n         2 AAAXUtAAKAAABWlAAB\n         3 AAAXUtAAKAAABWlAAC\n<\/code><\/pre>\n<p>The $R is a table mapping the opposite navigation (docid to rowid) storing a fixed-length array of ROWIDs indexed by the docid, and split into several lines:<\/p>\n<pre><code>SQL&gt; select * from DR$DEMO_CTX_INDEX$R;\n\n    ROW_NO DATA\n---------- ------------------------------------------------------------\n         0 00001752D000280000056940414100001752D00028000005694041420000\n<\/code><\/pre>\n<p>The $I table stores the tokens, the first 5 columns being indexed ($X) and the TOKEN_INFO blob stores detailed location of the token:<\/p>\n<pre><code>SQL&gt; select * from DR$DEMO_CTX_INDEX$I;\n\nTOKEN_TEXT TOKEN_TYPE TOKEN_FIRST TOKEN_LAST TOKEN_COUNT TOKEN_INFO\n---------- ---------- ----------- ---------- ----------- ----------\nHELLO               0           1          1           1 008801\nWORLD               0           1          1           1 008802\nHELLO               0           2          3           2 0098010201\nMARS                0           3          3           1 008802\nMOON                0           2          2           1 008802\n<\/code><\/pre>\n<p>We have seen that the $I table can be fragmented for 3 reasons:<\/p>\n<ul>\n<li>Each sync insert his tokens (instead of merging with other ones)<\/li>\n<li>TOKEN_INFO size is limited to fit in-row (we will see 12c new features later)<\/li>\n<li>Only tokens that fit in the allocated memory can be merged<\/li>\n<\/ul>\n<p>And the $N is empty for the moment:<\/p>\n<pre><code>SQL&gt; select * from DR$DEMO_CTX_INDEX$N;\n\nno rows selected\n<\/code><\/pre>\n<h3>Delete<\/h3>\n<p>Do you remember how inserts are going to the CTXSYS.DR$PENDING table? Deletes are going to CTXSYS.DR$DELETE table:<\/p>\n<pre><code>SQL&gt; delete from DEMO_CTX_FRAG where num=0002;\n\n1 row deleted.\n\nSQL&gt; select * from CTXSYS.DR$DELETE;\n\nDEL_IDX_ID DEL_IXP_ID  DEL_DOCID\n---------- ---------- ----------\n      1400          0          2\n<\/code><\/pre>\n<p>I&#8217;ve deleted docid=2 but the tokens are still there:<\/p>\n<pre><code>SQL&gt; select * from DR$DEMO_CTX_INDEX$I;\n\nTOKEN_TEXT TOKEN_TYPE TOKEN_FIRST TOKEN_LAST TOKEN_COUNT TOKEN_INFO\n---------- ---------- ----------- ---------- ----------- ----------\nHELLO               0           1          1           1 008801\nWORLD               0           1          1           1 008802\nHELLO               0           2          3           2 0098010201\nMARS                0           3          3           1 008802\nMOON                0           2          2           1 008802\n<\/code><\/pre>\n<p>as well as their mapping to the ROWID:<\/p>\n<pre><code>SQL&gt; -- $R is for rowid - docid mapping (IOT)\nSQL&gt; select * from DR$DEMO_CTX_INDEX$R;\n\n    ROW_NO DATA\n---------- ------------------------------------------------------------\n         0 00001752D000280000056940414100001752D00028000005694041420000\n<\/code><\/pre>\n<p>However, the $N has been maintained to know that docid=2 has been removed:<\/p>\n<pre><code>SQL&gt; select * from DR$DEMO_CTX_INDEX$N;\n\n NLT_DOCID N\n---------- -\n         2 U\n<\/code><\/pre>\n<p>This is the goal of $N (Negative) table which records the docid that should not be there and that must be deleted at next optimization (garbage collection).<\/p>\n<p>From there, a search by words (&#8216;normal lookup&#8217;) will give docid&#8217;s and rowid&#8217;s and the CTXSYS.DR$DELETE must be read in order to know that the document is not there anymore. It&#8217;s an IOT and the docid can be found with an index unique scan.<\/p>\n<p>However for the opposite way, having a ROWID and checking if it contains some words (&#8216;functional lookup&#8217;) we need to know that there is no document. In my case I deleted the row, but you may update the document, so the ROWID is still there. There is no pending table for that. It is maintained immediately in the $K table:<\/p>\n<pre><code>SQL&gt; select * from DR$DEMO_CTX_INDEX$K;\n\n     DOCID TEXTKEY\n---------- ------------------\n         1 AAAXUtAAKAAABWlAAA\n         3 AAAXUtAAKAAABWlAAC\n<\/code><\/pre>\n<p>the entry that addressed docid=2 has been deleted.<\/p>\n<h3>Commit<\/h3>\n<p>All those changes were done within the same transaction. Other sessions still see the old values. No need to read CTXSYS.DR$DELETE for them. What I described above is only for my session: the normal lookup reading the queuing table, and the functional lookup stopping at $K. We don&#8217;t have to wait a sync to process CTXSYS.DR$DELETE. It&#8217;s done at commit:<\/p>\n<pre><code>SQL&gt; commit;\n\nCommit complete.\n\nSQL&gt; select * from CTXSYS.DR$DELETE;\n\nno rows selected\n\nSQL&gt; select * from DR$DEMO_CTX_INDEX$R;\n\n    ROW_NO DATA\n---------- ------------------------------------------------------------\n         0 00001752D000280000056940414100000000000000000000000000000000\n<\/code><\/pre>\n<p>Of course we can&#8217;t read it but we see that part of it has been zeroed. That $R table is definitely special: it&#8217;s not stored in a relational way, and its maintenance is deferred at commit time.<\/p>\n<p>But nothing has changed in $I which contains garbage (and sync is not changing anything to that):<\/p>\n<pre><code>SQL&gt; select * from DR$DEMO_CTX_INDEX$I;\n\nTOKEN_TEXT TOKEN_TYPE TOKEN_FIRST TOKEN_LAST TOKEN_COUNT TOKEN_INFO\n---------- ---------- ----------- ---------- ----------- ----------\nHELLO               0           1          1           1 008801\nWORLD               0           1          1           1 008802\nHELLO               0           2          3           2 0098010201\nMARS                0           3          3           1 008802\nMOON                0           2          2           1 008802\n<\/code><\/pre>\n<p>And of course $N row is still there to record the deleted docid:<\/p>\n<pre><code>SQL&gt; select * from DR$DEMO_CTX_INDEX$N;\n\n NLT_DOCID N\n---------- -\n         2 U\n<\/code><\/pre>\n<h3>Sync<\/h3>\n<p>I&#8217;ve not reproduced it here, but sync is not changing anything. Sync is for new documents &#8211; not for deleted ones.<\/p>\n<h3>Conclusion<\/h3>\n<p>What you need to remember here is:<\/p>\n<ul>\n<li>New documents are made visible through OracleText index <strong>at sync<\/strong><\/li>\n<li>Removed document are immediately made invisible <strong>at commit<\/strong><\/li>\n<\/ul>\n<p>Of course, you can sync at commit, but the second thing to remember is that<\/p>\n<ul>\n<li>New documents brings fragmentation<\/li>\n<li>Removed document brings garbage<\/li>\n<\/ul>\n<p>and both of them increase the size of the $I table and its $X index, making range scans less efficient. We will see more about that but the next post will be about queries. I&#8217;ve talked about normal and functional lookups and we will see how they are done. Let&#8217;s detail that.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>By Franck Pachot . In the previous post we have seen how the OracleText index tables are maintained when new document arrives: At sync the new documents are read up to the available memory and words are inserted in the $I table with their mapping information. Now we will see how removed documents are processed. [&hellip;]<\/p>\n","protected":false},"author":27,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[198,59],"tags":[96],"type_dbi":[],"class_list":["post-4379","post","type-post","status-publish","format-standard","hentry","category-database-management","category-oracle","tag-oracle"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.2 (Yoast SEO v27.2) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>OracleText: deletes and garbage - dbi Blog<\/title>\n<meta name=\"description\" content=\"How OracleText processes deletes.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"OracleText: deletes and garbage\" \/>\n<meta property=\"og:description\" content=\"How OracleText processes deletes.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/\" \/>\n<meta property=\"og:site_name\" content=\"dbi Blog\" \/>\n<meta property=\"article:published_time\" content=\"2015-02-15T18:21:00+00:00\" \/>\n<meta name=\"author\" content=\"Oracle Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Oracle Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/\"},\"author\":{\"name\":\"Oracle Team\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/66ab87129f2d357f09971bc7936a77ee\"},\"headline\":\"OracleText: deletes and garbage\",\"datePublished\":\"2015-02-15T18:21:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/\"},\"wordCount\":700,\"commentCount\":0,\"keywords\":[\"Oracle\"],\"articleSection\":[\"Database management\",\"Oracle\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/\",\"url\":\"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/\",\"name\":\"OracleText: deletes and garbage - dbi Blog\",\"isPartOf\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/#website\"},\"datePublished\":\"2015-02-15T18:21:00+00:00\",\"author\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/66ab87129f2d357f09971bc7936a77ee\"},\"description\":\"How OracleText processes deletes.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Accueil\",\"item\":\"https:\/\/www.dbi-services.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"OracleText: deletes and garbage\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/#website\",\"url\":\"https:\/\/www.dbi-services.com\/blog\/\",\"name\":\"dbi Blog\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.dbi-services.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/66ab87129f2d357f09971bc7936a77ee\",\"name\":\"Oracle Team\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/secure.gravatar.com\/avatar\/f711f7cd2c9b09bf2627133755b569fb5be0694810cfd33033bdd095fedba86d?s=96&d=mm&r=g\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/f711f7cd2c9b09bf2627133755b569fb5be0694810cfd33033bdd095fedba86d?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/f711f7cd2c9b09bf2627133755b569fb5be0694810cfd33033bdd095fedba86d?s=96&d=mm&r=g\",\"caption\":\"Oracle Team\"},\"url\":\"https:\/\/www.dbi-services.com\/blog\/author\/oracle-team\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"OracleText: deletes and garbage - dbi Blog","description":"How OracleText processes deletes.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/","og_locale":"en_US","og_type":"article","og_title":"OracleText: deletes and garbage","og_description":"How OracleText processes deletes.","og_url":"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/","og_site_name":"dbi Blog","article_published_time":"2015-02-15T18:21:00+00:00","author":"Oracle Team","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Oracle Team","Est. reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/#article","isPartOf":{"@id":"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/"},"author":{"name":"Oracle Team","@id":"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/66ab87129f2d357f09971bc7936a77ee"},"headline":"OracleText: deletes and garbage","datePublished":"2015-02-15T18:21:00+00:00","mainEntityOfPage":{"@id":"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/"},"wordCount":700,"commentCount":0,"keywords":["Oracle"],"articleSection":["Database management","Oracle"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/","url":"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/","name":"OracleText: deletes and garbage - dbi Blog","isPartOf":{"@id":"https:\/\/www.dbi-services.com\/blog\/#website"},"datePublished":"2015-02-15T18:21:00+00:00","author":{"@id":"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/66ab87129f2d357f09971bc7936a77ee"},"description":"How OracleText processes deletes.","breadcrumb":{"@id":"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.dbi-services.com\/blog\/oracletext-deletes-and-garbage\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Accueil","item":"https:\/\/www.dbi-services.com\/blog\/"},{"@type":"ListItem","position":2,"name":"OracleText: deletes and garbage"}]},{"@type":"WebSite","@id":"https:\/\/www.dbi-services.com\/blog\/#website","url":"https:\/\/www.dbi-services.com\/blog\/","name":"dbi Blog","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.dbi-services.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.dbi-services.com\/blog\/#\/schema\/person\/66ab87129f2d357f09971bc7936a77ee","name":"Oracle Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/f711f7cd2c9b09bf2627133755b569fb5be0694810cfd33033bdd095fedba86d?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/f711f7cd2c9b09bf2627133755b569fb5be0694810cfd33033bdd095fedba86d?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/f711f7cd2c9b09bf2627133755b569fb5be0694810cfd33033bdd095fedba86d?s=96&d=mm&r=g","caption":"Oracle Team"},"url":"https:\/\/www.dbi-services.com\/blog\/author\/oracle-team\/"}]}},"_links":{"self":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts\/4379","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/users\/27"}],"replies":[{"embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/comments?post=4379"}],"version-history":[{"count":0,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts\/4379\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/media?parent=4379"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/categories?post=4379"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/tags?post=4379"},{"taxonomy":"type","embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/type_dbi?post=4379"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}