BigSearch
BigSearch is a module for performing bulk operations on very large sets of documents in the Alfresco system. Instead of burdening the system by processing thousands of files in a single pass, BigSearch divides the data into smaller parcels and processes them in parallel, each in its own transaction. It selects documents with Lucene queries and applies operations defined in user-supplied JavaScript. This reduces the risk of system overload and enables flexible document management, even in environments with very large data volumes.
Key capabilities of the module:
- Splitting large data sets into parcels – When a Lucene query returns a large number of documents, BigSearch splits the results into smaller groups. This allows processing to be done in stages without overloading the system, ensuring stability even when working with hundreds of thousands of files.
- Parallel processing of documents in separate threads – Each parcel of documents is processed in a separate thread, so multiple parcels can be handled simultaneously. This saves significant time on bulk tasks.
- Isolated transactions for each parcel – Operations on each group of documents run in a separate transaction. If an error occurs, only the affected parcel is skipped or rolled back, without affecting the rest of the process.
- Flexible user JavaScript – For each parcel, a user-defined script written in JavaScript is run in the Alfresco environment. This makes it possible to automate actions such as modifying metadata, moving documents, changing statuses or deleting documents.
- Compatibility with Lucene – BigSearch is built on the Lucene search engine, which supports advanced queries that search the content, structure and metadata of documents with high precision.
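
The first three capabilities (splitting into parcels, one thread per parcel, and failure isolation per parcel) can be illustrated with a minimal Python sketch. This is not the BigSearch code or its API: the names split_into_parcels, run_parcel, process_all and mark_archived are hypothetical, the list of document IDs stands in for the result of a Lucene query, and the "transaction" is simulated by discarding a parcel's partial results when an error occurs. A real implementation would rely on the repository's own transaction handling, and the per-parcel action would be the user's JavaScript.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence, Tuple


def split_into_parcels(items: Sequence[str], parcel_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most parcel_size items."""
    if parcel_size < 1:
        raise ValueError("parcel_size must be at least 1")
    for start in range(0, len(items), parcel_size):
        yield items[start:start + parcel_size]


def run_parcel(parcel: Sequence[str], action: Callable[[str], str]) -> Tuple[str, object]:
    """Apply action to every document in one parcel.

    Stand-in for a transaction (assumption): if any document fails,
    the partial results of this parcel are discarded.
    """
    try:
        results = [action(doc_id) for doc_id in parcel]
        return ("committed", results)
    except Exception as exc:
        return ("rolled_back", str(exc))


def process_all(items: Sequence[str], parcel_size: int,
                action: Callable[[str], str], workers: int = 4) -> List[Tuple[str, object]]:
    """Process all parcels on a thread pool; outcomes keep parcel order."""
    parcels = list(split_into_parcels(items, parcel_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda parcel: run_parcel(parcel, action), parcels))


def mark_archived(doc_id: str) -> str:
    """Hypothetical per-document action; fails for one document on purpose."""
    if doc_id == "doc-5":
        raise RuntimeError("locked")
    return doc_id + ":archived"


# Seven hypothetical document IDs, processed in parcels of three.
docs = ["doc-%d" % i for i in range(1, 8)]
for status, detail in process_all(docs, 3, mark_archived):
    print(status, detail)
```

In this run the second parcel (doc-4 to doc-6) is reported as rolled back because doc-5 fails, while the first and third parcels are committed. That is the isolation property described above: one failing parcel does not stop or undo the others.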