{"id":32936,"date":"2024-04-30T19:30:00","date_gmt":"2024-04-30T17:30:00","guid":{"rendered":"https:\/\/www.dbi-services.com\/blog\/?p=32936"},"modified":"2024-04-30T15:56:46","modified_gmt":"2024-04-30T13:56:46","slug":"alfresco-mass-removal-cleanup-of-documents","status":"publish","type":"post","link":"https:\/\/www.dbi-services.com\/blog\/alfresco-mass-removal-cleanup-of-documents\/","title":{"rendered":"Alfresco &#8211; Mass removal\/cleanup of documents"},"content":{"rendered":"\n<p>At a customer, I recently had a case where a mass-import job was executed on an interface that, in the background, uses Alfresco for document and metadata storage. From the point of view of the interface team, there was no problem as documents were properly being created in Alfresco (although performance wasn&#8217;t exceptional). However, after some time, our monitoring started sending us alerts that Solr indexing nearly stopped \/ was very slow. I might talk about the Solr part in a future blog but what happened is that the interface was configured to import documents into Alfresco in a way that caused too many documents in a single folder.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-too-many-documents-in-the-same-folder-of-alfresco\">Too many documents in the same folder of Alfresco<\/h2>\n\n\n\n<p>The interface was trying to import documents in the folder &#8220;<em>YYYY\/MM\/DD\/HH<\/em>&#8221; (<em>YYYY<\/em> being the year, <em>MM<\/em> the month, <em>DD<\/em> the day and <em>HH<\/em> the hour). This might be fine for Business-As-Usual (BAU), when the load isn&#8217;t too high, but when mass-importing documents, that meant several thousand documents per folder (5&#8217;000, 10&#8217;000, 20&#8217;000, \u2026), the limit being what Alfresco can ingest in an hour or what the interface manages to send. 
As you probably know, Alfresco definitely doesn&#8217;t like folders with much more than a thousand nodes inside (in particular because of associations and indexing design)\u2026 When I saw that, I asked the interface team to stop the import job, but unfortunately, it wasn&#8217;t stopped right away and almost 190 000 documents had already been imported into Alfresco.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-alfresco-apis-for-the-win\">Alfresco APIs for the win?<\/h2>\n\n\n\n<p>You cannot really leave Alfresco in this state since Solr would be heavily impacted by this kind of situation and any change to a document in such a folder could result in heavy load. Therefore, from my point of view, the best option is to remove the documents and execute a new\/correct import with a better distribution of documents per folder.<\/p>\n\n\n\n<p>A first solution could be to restore the DB to a point in time before the activity started, but that means downtime, and anything else that happened in the meantime would be lost. A second option would be to find all the documents imported and remove them through the API. As you might know, Share UI will not really be useful in this case since Share will either crash or just take way too long to open the folder, so don&#8217;t even try\u2026 And even if it is able to somehow open the folder containing <em>XX&#8217;XXX<\/em> nodes, you probably shouldn&#8217;t try to delete it because it will take forever, and you will not be able to know the status of this process that runs in the background. Therefore, from my point of view, the only reasonable solution is through the API.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-finding-documents-to-delete\">Finding documents to delete<\/h2>\n\n\n\n<p>As mentioned, Solr indexing was nearly dead, so I couldn&#8217;t rely on it to find what was imported recently. Using the REST-API could be possible but there are some limitations when working with huge result sets. 
In this case, I decided to go with a simple DB query (if you are interested, see these other <a href=\"https:\/\/www.dbi-services.com\/blog\/alfresco-some-useful-database-queries\/\" target=\"_blank\" rel=\"noreferrer noopener\">useful Alfresco DB queries<\/a>), listing all documents created since the start of the mass-import by the interface user:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nSQL&gt; SELECT n.id AS &quot;Node ID&quot;,\n  n.store_id AS &quot;Store ID&quot;,\n  n.uuid AS &quot;Document ID (UUID)&quot;,\n  n.audit_creator AS &quot;Creator&quot;,\n  n.audit_created AS &quot;Creation Date&quot;,\n  n.audit_modifier AS &quot;Modifier&quot;,\n  n.audit_modified AS &quot;Modification Date&quot;,\n  n.type_qname_id\nFROM alfresco.alf_node n,\n  alfresco.alf_node_properties p\nWHERE n.id=p.node_id\n  AND p.qname_id=(SELECT id FROM alf_qname WHERE local_name=&#039;content&#039;)\n  AND n.audit_created&gt;=&#039;2023-11-23T19:00:00Z&#039;\n  AND n.audit_creator=&#039;itf_user&#039;\n  AND n.audit_created is not null;\n<\/pre><\/div>\n\n\n<p>In case the interface isn&#8217;t using a dedicated user for the mass-import process, it might be a bit more difficult to find the correct list of documents to be removed, as you would need to take care not to remove the BAU documents\u2026 In that case, you could maybe use a recursive query based on the folder into which the documents were imported, some custom type\/metadata, or similar. 
The result of the above query was put in a text file for the processing:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nalfresco@acs01:~$ cat alfresco_documents.txt\n  Node ID Store ID Document ID (UUID)                   Creator   Creation Date             Modifier  Modification Date         TYPE_QNAME_ID\n--------- -------- ------------------------------------ --------- ------------------------- --------- ------------------------- -------------\n156491155        6 0f16ef7a-4cf1-4304-b578-71480570c070 itf_user  2023-11-23T19:01:02.511Z  itf_user  2023-11-23T19:01:03.128Z            265\n156491158        4 2f65420a-1105-4306-9733-210501ae7efb itf_user  2023-11-23T19:01:03.198Z  itf_user  2023-11-23T19:01:03.198Z            265\n156491164        6 a208d56f-df1a-4f2f-bc73-6ab39214b824 itf_user  2023-11-23T19:01:03.795Z  itf_user  2023-11-23T19:01:03.795Z            265\n156491166        4 908d385f-d6bb-4b94-ba5c-6d6942bb75c3 itf_user  2023-11-23T19:01:03.918Z  itf_user  2023-11-23T19:01:03.918Z            265\n...\n159472069        6 cabf7343-35c4-4e8b-8a36-0fa0805b367f itf_user  2023-11-24T07:50:20.355Z  itf_user  2023-11-24T07:50:20.355Z            265\n159472079        4 1bcc7301-97ab-4ddd-9561-0ecab8d09efb itf_user  2023-11-24T07:50:20.522Z  itf_user  2023-11-24T07:50:20.522Z            265\n159472098        6 19d1869c-83d9-449a-8417-b460ccec1d60 itf_user  2023-11-24T07:50:20.929Z  itf_user  2023-11-24T07:50:20.929Z            265\n159472107        4 bcd0f8a2-68b3-4cc9-b0bd-2af24dc4ff43 itf_user  2023-11-24T07:50:21.074Z  itf_user  2023-11-24T07:50:21.074Z            265\n159472121        6 74bbe0c3-2437-4d16-bfbc-97bfa5a8d4e0 itf_user  2023-11-24T07:50:21.365Z  itf_user  2023-11-24T07:50:21.365Z            265\n159472130        4 f984679f-378b-4540-853c-c36f13472fac itf_user  2023-11-24T07:50:21.511Z  itf_user  2023-11-24T07:50:21.511Z            265\n159472144        6 
579a2609-f5be-47e4-89c8-daaa983a314e itf_user  2023-11-24T07:50:21.788Z  itf_user  2023-11-24T07:50:21.788Z            265\n159472153        4 7f408815-79e1-462a-aa07-182ee38340a3 itf_user  2023-11-24T07:50:21.941Z  itf_user  2023-11-24T07:50:21.941Z            265\n\n379100 rows selected.\nalfresco@acs01:~$\n<\/pre><\/div>\n\n\n<p>The above Store ID of &#8216;6&#8217; is for the &#8216;<em>workspace:\/\/SpacesStore<\/em>&#8217; (live document store) and &#8216;4&#8217; is for the &#8216;<em>workspace:\/\/version2Store<\/em>&#8217; (version store):<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\nSQL&gt; SELECT id, protocol, identifier FROM alf_store;\n ID PROTOCOL   IDENTIFIER\n--- ---------- ----------\n  1 user       alfrescoUserStore\n  2 system     system\n  3 workspace  lightWeightVersionStore\n  4 workspace  version2Store\n  5 archive    SpacesStore\n  6 workspace  SpacesStore\n<\/pre><\/div>\n\n\n<p>Looking at the number of rows for each Store ID gives the exact same number and confirms there are no deleted documents yet:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nalfresco@acs01:~$ grep &quot;  4 &quot; alfresco_documents.txt | wc -l\n189550\nalfresco@acs01:~$\nalfresco@acs01:~$ grep &quot;  5 &quot; alfresco_documents.txt | wc -l\n0\nalfresco@acs01:~$\nalfresco@acs01:~$ grep &quot;  6 &quot; alfresco_documents.txt | wc -l\n189550\nalfresco@acs01:~$\n<\/pre><\/div>\n\n\n<p>Therefore, there are around 190k docs to remove in total, which is roughly the same number as seen on the filesystem. 
The Alfresco ContentStore obviously contains a few more, since it also holds the BAU documents.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-rest-api-environment-preparation\">REST-API environment preparation<\/h2>\n\n\n\n<p>Now that the list is complete, the next step is to extract the IDs of the documents, so that we can use these in REST-API calls. The IDs are simply the third column from the file (Document ID (UUID)):<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nalfresco@acs01:~$ grep &quot;  6 &quot; alfresco_documents.txt | awk &#039;{print $3}&#039; &gt; input_file_6_id.txt\nalfresco@acs01:~$\nalfresco@acs01:~$ wc -l alfresco_documents.txt input_file_6_id.txt\n   379104 alfresco_documents.txt\n   189550 input_file_6_id.txt\n   568654 total\nalfresco@acs01:~$\n<\/pre><\/div>\n\n\n<p>Now, to be able to execute REST-API calls, we will also need to define the username\/password as well as the URL to be used. I executed the REST-API calls from the Alfresco server itself, so I didn&#8217;t really need to think too much about security, and I just used a BASIC authorization method using localhost and HTTPS. If you are executing that remotely, you might want to use tickets instead (and obviously keep the HTTPS protocol). 
To prepare for the removal, I defined the needed environment variables as follows:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nalfresco@acs01:~$ alf_user=admin\nalfresco@acs01:~$ read -s -p &quot;Enter ${alf_user} password: &quot; alf_passwd\nEnter admin password:\nalfresco@acs01:~$\nalfresco@acs01:~$ auth=$(echo -n &quot;${alf_user}:${alf_passwd}&quot; | base64)\nalfresco@acs01:~$\nalfresco@acs01:~$ alf_base_url=&quot;https:\/\/localhost:8443\/alfresco&quot;\nalfresco@acs01:~$ alf_node_url=&quot;${alf_base_url}\/api\/-default-\/public\/alfresco\/versions\/1\/nodes&quot;\nalfresco@acs01:~$\nalfresco@acs01:~$ input_file=&quot;$HOME\/input_file_6_id.txt&quot;\nalfresco@acs01:~$ output_file=&quot;$HOME\/output_file_6.txt&quot;\nalfresco@acs01:~$\n<\/pre><\/div>\n\n\n<p>With the above, we have our authorization string (base64 encoding of &#8216;<em>username:password<\/em>&#8217;) as well as the Alfresco API URL. In case you wonder, you can find the definition of the REST-APIs in the <a href=\"https:\/\/api-explorer.alfresco.com\/api-explorer\/\">Alfresco API Explorer<\/a>. I also defined the input file, which contains all document IDs, and an output file, which will contain the list of all documents processed, with the outcome of the command, to be able to check for any issues and follow the progress.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-deleting-documents-with-rest-api\">Deleting documents with REST-API<\/h2>\n\n\n\n<p>The last step is now to create a small command\/script that will execute the deletion of the documents through the REST-API. One thing to note here is that I&#8217;m using &#8216;<em>permanent=true<\/em>&#8217; so that the documents will not end up in the trashcan but will be completely and permanently deleted. Therefore, you need to make sure the list of documents is correct! 
You can obviously set that parameter to false if you really want to, but please be aware that it will impact the performance quite a bit\u2026 Otherwise, the command is fairly simple: it loops over the input file, executes the deletion query, gets its output and logs it:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nalfresco@acs01:~$ while read -u 3 line; do\n  out=$(curl -k -s -X DELETE &quot;${alf_node_url}\/${line}?permanent=true&quot; -H &quot;accept: application\/json&quot; -H &quot;Authorization: Basic ${auth}&quot; | sed &#039;s\/.*\\(statusCode&quot;:&#x5B;0-9]*\\),.*\/\\1\/&#039;)\n  echo &quot;${line} -- ${out}&quot; &gt;&gt; &quot;${output_file}&quot;\ndone 3&lt; &quot;${input_file}&quot;\n<\/pre><\/div>\n\n\n<p>The above is the simplest way\/form of removal, with a single thread executed on a single server. You can obviously do multi-threaded deletions by splitting the input file into several and triggering commands in parallel, either on the same host or even on other hosts (if you have an Alfresco Cluster). In this example, I was able to get a consistent throughput of ~3130 documents deleted every 5 minutes, which means ~10.4 documents deleted per second. 
Again, that was on a single server with a single thread:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nalfresco@acs01:~$ while true; do\n  echo &quot;$(date) -- $(wc -l output_file_6.txt)&quot;\n  sleep 300\ndone\nFri Nov 24 09:57:38 CET 2023 -- 810 output_file_6.txt\n...\nFri Nov 24 10:26:55 CET 2023 -- 18920 output_file_6.txt\nFri Nov 24 10:31:55 CET 2023 -- 22042 output_file_6.txt\nFri Nov 24 10:36:55 CET 2023 -- 25180 output_file_6.txt\nFri Nov 24 10:41:55 CET 2023 -- 28290 output_file_6.txt\n...\n<\/pre><\/div>\n\n\n<p>Since the cURL output (&#8216;<em>statusCode<\/em>&#8217;) is also recorded in the log file, I was able to confirm that 100% of the queries were successfully executed and all my documents were permanently deleted. With multi-threading and offloading to other members of the Cluster, it would have been possible to increase the throughput by a lot (x5? x10? x20?) but that wasn&#8217;t needed in this case since the interface job needed to be updated before a new import could be triggered.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>At a customer, I recently had a case where a mass-import job was executed on an interface that, in the background, uses Alfresco for document and metadata storage. From the point of view of the interface team, there was no problem as documents were properly being created in Alfresco (although performance wasn&#8217;t exceptional). 
However, after [&hellip;]<\/p>\n","protected":false},"author":20,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[525],"tags":[3169,2473,338,3167,1269],"type_dbi":[3266,3349,3348],"class_list":["post-32936","post","type-post","status-publish","format-standard","hentry","category-enterprise-content-management","tag-alfresco","tag-import","tag-job","tag-rest-api","tag-solr","type-alfresco","type-rest-api","type-solr"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.2 (Yoast SEO v27.5) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Alfresco - Mass removal\/cleanup of documents - dbi Blog<\/title>\n<meta name=\"description\" content=\"Need to remove a lot of documents from Alfresco but Share isn&#039;t responding? You can easily use DB and REST-API queries for that purpose!\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.dbi-services.com\/blog\/alfresco-mass-removal-cleanup-of-documents\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Alfresco - Mass removal\/cleanup of documents\" \/>\n<meta property=\"og:description\" content=\"Need to remove a lot of documents from Alfresco but Share isn&#039;t responding? 
You can easily use DB and REST-API queries for that purpose!\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.dbi-services.com\/blog\/alfresco-mass-removal-cleanup-of-documents\/\" \/>\n<meta property=\"og:site_name\" content=\"dbi Blog\" \/>\n<meta property=\"article:published_time\" content=\"2024-04-30T17:30:00+00:00\" \/>\n<meta name=\"author\" content=\"Morgan Patou\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@MorganPatou\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Morgan Patou\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/alfresco-mass-removal-cleanup-of-documents\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/alfresco-mass-removal-cleanup-of-documents\\\/\"},\"author\":{\"name\":\"Morgan Patou\",\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/#\\\/schema\\\/person\\\/c4d05b25843a9bc2ab20415dae6bd2d8\"},\"headline\":\"Alfresco &#8211; Mass removal\\\/cleanup of documents\",\"datePublished\":\"2024-04-30T17:30:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/alfresco-mass-removal-cleanup-of-documents\\\/\"},\"wordCount\":1187,\"commentCount\":2,\"keywords\":[\"Alfresco\",\"import\",\"Job\",\"REST-API\",\"Solr\"],\"articleSection\":[\"Enterprise content 
management\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/alfresco-mass-removal-cleanup-of-documents\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/alfresco-mass-removal-cleanup-of-documents\\\/\",\"url\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/alfresco-mass-removal-cleanup-of-documents\\\/\",\"name\":\"Alfresco - Mass removal\\\/cleanup of documents - dbi Blog\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/#website\"},\"datePublished\":\"2024-04-30T17:30:00+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/#\\\/schema\\\/person\\\/c4d05b25843a9bc2ab20415dae6bd2d8\"},\"description\":\"Need to remove a lot of documents from Alfresco but Share isn't responding? You can easily use DB and REST-API queries for that purpose!\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/alfresco-mass-removal-cleanup-of-documents\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/alfresco-mass-removal-cleanup-of-documents\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/alfresco-mass-removal-cleanup-of-documents\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Accueil\",\"item\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Alfresco &#8211; Mass removal\\\/cleanup of documents\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/\",\"name\":\"dbi 
Blog\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/#\\\/schema\\\/person\\\/c4d05b25843a9bc2ab20415dae6bd2d8\",\"name\":\"Morgan Patou\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/5d7f5bec8b597db68a09107a6f5309e3870d6296ef94fb10ead4b09454ca67e5?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/5d7f5bec8b597db68a09107a6f5309e3870d6296ef94fb10ead4b09454ca67e5?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/5d7f5bec8b597db68a09107a6f5309e3870d6296ef94fb10ead4b09454ca67e5?s=96&d=mm&r=g\",\"caption\":\"Morgan Patou\"},\"description\":\"Morgan Patou has over 12 years of experience in Enterprise Content Management (ECM) systems, with a strong focus in recent years on platforms such as Alfresco, Documentum, and M-Files. He specializes in the architecture, setup, customization, and maintenance of ECM infrastructures in complex &amp; critical environments. Morgan is well-versed in both engineering and operations aspects, including high availability design, system integration, and lifecycle management. He also has a solid foundation in open-source and proprietary technologies - ranging from Apache, OpenLDAP or Kerberos to enterprise-grade systems like WebLogic. Morgan Patou holds an Engineering Degree in Computer Science from ENSISA (\u00c9cole Nationale Sup\u00e9rieure d'Ing\u00e9nieurs Sud Alsace) in Mulhouse, France. 
He is Alfresco Content Services Certified Administrator (ACSCA), Alfresco Content Services Certified Engineer (ACSCE) as well as OpenText Documentum Certified Administrator. His industry experience spans the Public Sector, IT Services, Financial Services\\\/Banking, and the Pharmaceutical industry.\",\"sameAs\":[\"https:\\\/\\\/blog.dbi-services.com\\\/author\\\/morgan-patou\\\/\",\"https:\\\/\\\/x.com\\\/MorganPatou\"],\"url\":\"https:\\\/\\\/www.dbi-services.com\\\/blog\\\/author\\\/morgan-patou\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","_links":{"self":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts\/32936","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/users\/20"}],"replies":[{"embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/comments?post=32936"}],"version-history":[{"count":8,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts\/32936\/revisions"}],"predecessor-version":[{"id":32952,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/posts\/32936\/revisions\/32952"}],"wp:attachment":[{"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/media?parent=32936"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/categories?post=32936"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/tags?post=32936"},{"taxonomy":"type","embeddable":true,"href":"https:\/\/www.dbi-services.com\/blog\/wp-json\/wp\/v2\/type_dbi?post=32936"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}