Tika Extractor

Extracts file contents by querying a running Apache Tika server.

Parameters

Label (internal name) | Type | Default | Description
Tika Service Address (serverAddress) | string | localhost:9998 | Given as HOSTNAME:PORT or IP:PORT; the Tika server is queried at http://host:port.
Extract and store textual content (extractContent) | string | pydio-binaries/tika-{{.Node.Uuid}}.gz | Tika can extract content from many file types, and this content can then be indexed by the search server. Extracted content is stored in the specified file, and the URL of this file is attached to the 'content ref' metadata.
Content Reference Metadata (contentRef) | string | pydio:ContentRef | Metadata key under which the reference to the extracted textual content is attached.
Compress extracted contents (compressContent) | boolean | true | If switched on, extracted content is compressed before being stored.
Additional metadata fields (additionalMeta) | string | Content-Type | Looks for additional known keys in the metadata extracted by Tika. Use a comma-separated list of field names.
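
To make the parameters more concrete, here is a minimal Python sketch of the kind of exchange they describe. It is an illustration only, not the action's actual implementation: all function names are hypothetical. The /tika and /meta endpoints are part of Tika Server's REST API, but whether this action calls exactly those endpoints is an assumption. The sketch builds the query URL from a serverAddress value, filters Tika metadata by an additionalMeta list, and optionally gzips the extracted text as compressContent suggests.

```python
import gzip
import json
import urllib.request


def tika_url(server_address, endpoint):
    """Build the query URL from a HOST:PORT address (assumed to carry no scheme)."""
    return "http://{}/{}".format(server_address, endpoint.lstrip("/"))


def pick_additional_meta(tika_meta, additional_meta):
    """Keep only the keys named in a comma-separated field list."""
    wanted = [name.strip() for name in additional_meta.split(",") if name.strip()]
    return {name: tika_meta[name] for name in wanted if name in tika_meta}


def pack_content(text, compress=True):
    """Encode extracted text as UTF-8, gzipping it when compress is true."""
    data = text.encode("utf-8")
    return gzip.compress(data) if compress else data


def query_tika(server_address, payload):
    """Send file bytes to a Tika server and return (plain text, metadata dict).

    Assumption: the server exposes Tika Server's standard PUT /tika and
    PUT /meta endpoints; this function needs a running server to work.
    """
    def put(endpoint, accept):
        req = urllib.request.Request(
            tika_url(server_address, endpoint),
            data=payload,
            method="PUT",
            headers={"Accept": accept},
        )
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.read()

    text = put("tika", "text/plain").decode("utf-8", errors="replace")
    meta = json.loads(put("meta", "application/json"))
    return text, meta
```

With a Tika server listening on the default address, `query_tika("localhost:9998", file_bytes)` would return the extracted text and metadata; `pack_content(text)` then yields the bytes to store, and `pick_additional_meta(meta, "Content-Type")` the extra fields to keep.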