Strus documentation
Project organization
Strus is implemented as several subprojects hosted on GitHub. Here is a list of its subprojects:
- strusBase implements helper functions that form the common code base. It also defines the error buffer interface, which buffers exception error messages because exceptions cannot be thrown across the boundaries of dynamically linked libraries and modules. All other Strus projects depend on strusBase.
- strus provides the interface to the storage and the query processing of the search engine. It also provides the default key value store database connector (to LevelDB).
- strusAnalyzer provides document and query analysis for transforming content into retrievable items.
- strusTrace implements all methods of the strus analyzer and core as a proxy that logs the calls made through a specified interface. The whole mechanism is implemented as a separate aspect, without touching any code of the core or the analyzer. The produced trace can be visualized as a call tree, or the logs can be processed as a readable dump.
- strusPattern provides an implementation of the analyzer pattern matching interface with a lexer based on the Intel Hyperscan library.
- strusVector provides an implementation of the strus vector storage interface with a search for the nearest vectors implemented with brute-force LSH (Locality-Sensitive Hashing).
- strusModule provides the loading of search engine components from dynamically loadable modules. (Depends on strus and strusAnalyzer)
- strusRpc provides a proxy interface for strus objects residing on a server via RPC. If you want to use strus in a web server context, where loading modules by an instance other than the web server itself is not allowed or at least not recommended, you should access strus via RPC or a similar mechanism. (Depends on strus, strusAnalyzer and strusModule)
- strusUtilities provides some command line programs to access the search engine. (Depends on strusModule and strusRpc)
- strusBindings provides language bindings to use strus with other programming languages like PHP, Python, Lua, etc. (Depends on strusModule and strusRpc)
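The error buffer idea mentioned for strusBase can be illustrated with a minimal sketch. All names here (ErrorBuffer, SimpleErrorBuffer, parsePositive, describe) are hypothetical and chosen for illustration only; they are not the actual strusBase interface. The sketch only shows the general pattern: a function at a module boundary catches every exception and stores the message in a buffer that the caller inspects afterwards.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical interface, NOT the real strusBase API.
class ErrorBuffer {
public:
    virtual ~ErrorBuffer() = default;
    virtual void report(const std::string& msg) = 0;
    virtual bool hasError() const = 0;
    virtual std::string fetchError() = 0;
};

// Simplest possible implementation: remembers the first reported error.
// (Assumption: a production version would avoid allocating in report().)
class SimpleErrorBuffer : public ErrorBuffer {
public:
    void report(const std::string& msg) override {
        if (!hasError_) { msg_ = msg; hasError_ = true; }
    }
    bool hasError() const override { return hasError_; }
    std::string fetchError() override {
        std::string rt = msg_;
        msg_.clear();
        hasError_ = false;   // fetching clears the buffer
        return rt;
    }
private:
    std::string msg_;
    bool hasError_ = false;
};

// Stands in for a function exported by a dynamically loaded module.
// Exceptions are caught here and reported through the buffer instead
// of being propagated to the caller.
int parsePositive(const std::string& text, ErrorBuffer& errors) {
    try {
        int value = std::stoi(text);
        if (value <= 0) throw std::runtime_error("value must be positive");
        return value;
    } catch (const std::exception& e) {
        errors.report(e.what());
    } catch (...) {
        errors.report("unknown error");
    }
    return 0;   // failure indicator; details are in the error buffer
}

// Example caller: inspects the buffer instead of using try/catch.
std::string describe(const std::string& text) {
    SimpleErrorBuffer errors;
    int value = parsePositive(text, errors);
    if (errors.hasError()) return "error: " + errors.fetchError();
    return "ok: " + std::to_string(value);
}
```

With this pattern the caller never needs a try/catch around calls into the module; it checks hasError() after the call, as describe() does.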
Interface documentation
Strus provides two classes of interfaces with opposing objectives. One is the C++ interface, which is the base of the implementation. The other is the interface for language bindings, whose implementations wrap the C++ interface.
C++ interfaces
- strus base: Interface for common functions and interfaces
- strus core: Interface to storage and query evaluation
- strus analyzer: Interface to process documents and queries in strus
- strus trace: Interface for logging and inspecting method call traces for debugging or another kind of execution analysis.
- strus module loader: Interface to define strus components and to load them from dynamically loadable modules.
- strus RPC client: Interfaces for strus RPC client and server
- strus program loader: Functions to instantiate strus components from source
Language bindings
Bindings are available for Lua, Python3 and PHP7. To enable the language bindings for PHP or Python, you have to build strusBindings (or strusAll) with cmake -DWITH_PHP=YES or -DWITH_PYTHON=YES, respectively. The language bindings are referenced as "strus". Currently there are not many tutorials available. You can have a look at the tests (for PHP, Lua, Python) for examples.
Functions documentation
When writing an application with Strus, you have functions of various kinds at your disposal. You can write your own functions in C++ and load them dynamically into your application, but strus also comes with many predefined functions. You find a complete list of the built-in functions of the core and the analyzer here.
Command line tools documentation
You do not need the command line tools of strus; all functionality is accessible through the API. However, there are many command line tools that help with accessing and maintaining a strus storage. A list of the standard command line tools and their documentation can be found here (utilities).
Developer documentation
Programming guidelines
The programming guidelines that contributors should respect can be found here. Suggestions for strengthening these rules are welcome.
Writing Strus extension modules in C++
Strus core
The Strus core can be extended with dynamically loadable modules containing weighting functions, summarizers and posting join operators written in C++. This CodeProject article describes the expandability of Strus and tries to explain the basic models and concepts used in the query evaluation.
Strus analyzer
You can also extend the Strus analyzer with your own dynamically loadable modules containing functions written in C++. A CodeProject article about the expandability of the Strus analyzer is planned. For now, here is a short list of components you can write as dynamically loadable analyzer modules for Strus:
- Segmenter: You can define your own segmenters for the document formats you need to process.
- Tokenizer: You can define your own tokenizers splitting the document segments into tokens.
- Normalizer: You can define your own normalizer functions to produce the retrievable items from the document tokens for the storage and the query.
- Aggregator: You can define your own aggregator functions to produce some statistical values from the document structure after analysis.
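To make the roles of these four component types more concrete, here is a rough sketch of how they could fit together. It is a conceptual illustration only: the type aliases, the AnalyzedDocument struct and the analyze function are hypothetical and do not reflect the actual strusAnalyzer interfaces, which are richer (for example, they distinguish feature types and positions). The example components (line segmenter, whitespace tokenizer, lowercase normalizer, term counter) are assumptions chosen for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical component signatures, NOT the strusAnalyzer interfaces.
using Segmenter  = std::function<std::vector<std::string>(const std::string&)>;
using Tokenizer  = std::function<std::vector<std::string>(const std::string&)>;
using Normalizer = std::function<std::string(const std::string&)>;
using Aggregator = std::function<double(const std::vector<std::string>&)>;

struct AnalyzedDocument {
    std::vector<std::string> terms;  // retrievable items
    double statistic;                // value produced by the aggregator
};

// Segmenter: every non-empty line of the document is one segment.
std::vector<std::string> lineSegmenter(const std::string& doc) {
    std::vector<std::string> rt;
    std::istringstream in(doc);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) rt.push_back(line);
    }
    return rt;
}

// Tokenizer: split a segment on whitespace.
std::vector<std::string> whitespaceTokenizer(const std::string& segment) {
    std::vector<std::string> rt;
    std::istringstream in(segment);
    std::string token;
    while (in >> token) rt.push_back(token);
    return rt;
}

// Normalizer: map a token to lowercase (ASCII only).
std::string lowercaseNormalizer(const std::string& token) {
    std::string rt = token;
    std::transform(rt.begin(), rt.end(), rt.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return rt;
}

// Aggregator: a simple statistic, the number of terms in the document.
double termCountAggregator(const std::vector<std::string>& terms) {
    return static_cast<double>(terms.size());
}

// Runs the four stages in order: segment, tokenize, normalize, aggregate.
AnalyzedDocument analyze(const std::string& doc, const Segmenter& seg,
                         const Tokenizer& tok, const Normalizer& norm,
                         const Aggregator& aggr) {
    assert(seg && tok && norm && aggr);  // all four components must be set
    AnalyzedDocument rt{{}, 0.0};
    for (const std::string& segment : seg(doc)) {
        for (const std::string& token : tok(segment)) {
            rt.terms.push_back(norm(token));
        }
    }
    rt.statistic = aggr(rt.terms);
    return rt;
}
```

Calling analyze with a two-line document and the four example components yields the lowercased words as terms and their count as the aggregated value. Any stage can be swapped independently, which is the point of making each one a separately loadable component.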