Lucene Search Engine


Identity Card

Plugin Label: Lucene Search Engine
Short Description: Zend_Search_Lucene implementation to index all files and search a whole workspace quickly.
Plugin Identifier: index.lucene
Author: Charles du Jeu
Dependencies: access.fs, access.smb, access.imap, access.swift, access.s3, access.inbox, access.demo, access.dropbox, access.webdav, access.sftp_psl, access.smbicewind, access.sftp, access.ftp


This plugin uses the Zend_Search_Lucene library, a PHP implementation of Apache Lucene, to index files and provide an efficient search tool. You must add the meta source "index.lucene" to each repository that you want to be indexed.

The plugin supports indexing of metadata, background indexing of huge folders (if the framework can be run in the background via the command line), and indexing of file contents when they are textual files (TXT, HTML). It would be possible to add PDF indexing using a pdf-to-text conversion, but this is not implemented yet.

The search results display a "Hit Score" that is provided by the search engine.


If you can install the unoconv utility on your server, along with the headless OpenOffice or LibreOffice suite, and the xpdf utility, the plugin will be able to extract and index textual content from office documents (Word, Excel, PowerPoint, and all their closed or open-source variants).

To install the packages on CentOS: yum install unoconv xpdf
Or on Debian: apt-get install unoconv xpdf
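
To make the extraction step more concrete, here is a minimal Python sketch of how text could be pulled out of a document with these two tools. It is an illustration only, not the plugin's actual code: the function name, the extension list, and the exact way the plugin invokes the binaries are assumptions. The sketch relies on the unoconv options -f and --stdout, and on pdftotext writing to standard output when the output file is given as "-".

```python
import os

# Hypothetical list of office extensions, chosen for illustration only.
OFFICE_EXTENSIONS = {"doc", "docx", "odt", "xls", "xlsx", "ods", "ppt", "pptx", "odp"}


def build_extract_command(path, unoconv="unoconv", pdftotext="pdftotext"):
    """Return the argument list that would print the text of `path` to stdout.

    Returns None when the extension is handled by neither tool.
    The `unoconv` and `pdftotext` arguments play the role of the
    "Unoconv Path" and "PdftoText Path" plugin parameters.
    """
    ext = os.path.splitext(path)[1][1:].lower()
    if ext == "pdf":
        # pdftotext sends the text to stdout when the output file is "-".
        return [pdftotext, path, "-"]
    if ext in OFFICE_EXTENSIONS:
        # unoconv converts to plain text and prints the result to stdout.
        return [unoconv, "-f", "txt", "--stdout", path]
    return None
```

The returned list can be passed to subprocess.run(cmd, capture_output=True) on a server where the binaries are installed; building the command separately keeps the path handling easy to check without running the tools.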

Plugin parameters

Parse Content Until *
Skip content parsing and indexing for files bigger than this size (must be in bytes). Type: String. Default: 500000.
HTML files *
List of extensions to consider as HTML files whose content is parsed. Type: String. Default: html,htm.
Text files *
List of extensions to consider as text files whose content is parsed. Type: String. Default: txt.
Unoconv Path
Full path on the server to the 'unoconv' binary. Type: String. Default: (empty).
PdftoText Path
Full path on the server to the 'pdftotext' binary. Type: String. Default: (empty).
Query Analyzer
Analyzer used by Zend to parse the queries. Warning: the UTF8 analyzers require the PHP mbstring extension. Type: Select (utf8num_insensitive, utf8num_sensitive, utf8_insensitive, utf8_sensitive, textnum_insensitive, textnum_sensitive, text_insensitive, text_sensitive). Default: textnum_insensitive.
Wildcard limitation
For performance reasons, it is not recommended to use a wildcard as the very first character of a query string. Lucene recommends requiring at least 3 characters from the user before a wildcard. You can still set this to 0 if your use cases require it. Type: Integer. Default: 3.
Automatically append a * after the user query to make the search broader. Type: Boolean. Default: false.
Hide 'My Shares'
Hide the My Shares section in the Orbit theme GUI. Type: Boolean. Default: false.
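
The wildcard limitation and the automatic trailing wildcard can be pictured with the following Python sketch. It is a simplified model written for this page, not the Zend_Search_Lucene implementation: both function names are hypothetical, and terms are split on whitespace only.

```python
def check_wildcard_prefix(query, min_prefix=3):
    """Return True if every term has at least `min_prefix` plain characters
    before its first wildcard (* or ?). Terms without wildcards always pass.
    """
    for term in query.split():
        positions = [i for i, ch in enumerate(term) if ch in "*?"]
        if positions and positions[0] < min_prefix:
            return False
    return True


def append_wildcard(query):
    """Append a trailing * to the query unless it is empty or already ends with one."""
    if not query or query.endswith("*"):
        return query
    return query + "*"
```

With the default limit of 3, "rep*" is accepted while "re*" and "*port" are rejected; with the limit set to 0, a leading wildcard such as "*port" is accepted as well.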

Instance parameters

Index Content *
Parses the file when possible and indexes its content (see the plugin global options). Type: Boolean. Default: false.
Index Meta Fields
Which additional fields to index and search. Type: String. Default: (empty).
Repository keywords
If your workspace path is defined dynamically by specific keywords like AJXP_USER, or your own, mention them here. Type: String. Default: (empty).
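
As a rough illustration of how the Index Content option could combine with the global Parse Content Until, HTML files and Text files options, the Python sketch below decides whether a file's content would be parsed and with which parser. The function names and the exact order of the checks are assumptions made for this example; only the default values come from the parameter lists above.

```python
import os


def parse_extension_list(csv):
    """Turn a comma-separated option value such as "html,htm" into a set."""
    return {e.strip().lower() for e in csv.split(",") if e.strip()}


def content_parser_for(filename, size_bytes, index_content=False,
                       max_size=500000, html_exts="html,htm", text_exts="txt"):
    """Return "html", "text", or None when the content should not be parsed."""
    # Content indexing is off by default and skipped for oversized files.
    if not index_content or size_bytes > max_size:
        return None
    ext = os.path.splitext(filename)[1][1:].lower()
    if ext in parse_extension_list(html_exts):
        return "html"
    if ext in parse_extension_list(text_exts):
        return "text"
    return None
```

For example, with Index Content enabled, a 10 KB index.htm file would go to the HTML parser, while a 2 MB notes.txt file would be skipped because it exceeds the default 500000 bytes.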