There are a number of ways to pair ONgDB with ElasticSearch. One common way was the Rivers plugin, but that was deprecated in ElasticSearch 1.5 and removed in ElasticSearch 2.0. Going forward, any pairing requires a more sophisticated integration that indexes the desired nodes and relationships from ONgDB into ElasticSearch.
For those that don’t know, ElasticSearch is an open source search server based on Lucene that provides a distributed full-text search engine that utilizes JSON documents with a RESTful API.
ElasticSearch provides language analyzers, aggregations and other features right out of the box, which are some of the reasons it’s an ideal search solution to pair with ONgDB, as opposed to trying to recreate all of that text search capability within ONgDB. Some of the key advantages of the ONgDB ElasticSearch pairing include:
- Swift search against large data volumes
Large and complex graph traversal queries spanning tens to hundreds of thousands of nodes, which would take many seconds to run, can be answered in milliseconds with ElasticSearch because the query result is stored in a single document that can be easily indexed. The design of ElasticSearch is leaner and a lot simpler than that of a database consisting of columns, rows, tables, fields, and schemas. This allows many documents with concise results to be indexed as a kind of cache, provided the number of attribute variations in the query doesn’t explode the number of combinations that need to be stored.
- Document indexing to repository
ElasticSearch can easily convert raw data (message files or log files) into internal documents, which it then stores in a basic data structure. Pushing documents from ONgDB to ElasticSearch can be automated reliably.
- Quick data access via de-normalized storage
ElasticSearch will usually house a copy of a document in every repository in which it lives. Full text searches are swift since documents are housed close to their corresponding metadata within the index. The aggregations and language analyzers can then be used effectively to build search queries that go from text entry to a starting set of nodes, from which the ONgDB graph query completes the process of returning a result.
- Scalable and distributable
ElasticSearch is capable of scaling to thousands of servers while accommodating petabytes of data. Its capacity results directly from its highly distributed and intricate architecture. This scalability makes it a great front for the query result documents, taking the complex and potentially long-running query load off ONgDB.
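To make the "query result stored in a single document" idea concrete, here is a minimal sketch that flattens one traversal result into a de-normalized document and serializes it as newline-delimited JSON in the shape the ElasticSearch bulk endpoint expects. The index name `people`, the field names, and the sample data are all assumptions made for illustration; nothing here talks to a real cluster or to ONgDB.

```python
import json


def build_result_document(person, friends):
    """Flatten one traversal result (a node plus its neighbours) into one document."""
    return {
        "node_id": person["id"],
        "name": person["name"],
        "friend_names": sorted(f["name"] for f in friends),
        "friend_count": len(friends),
    }


def to_bulk_payload(index_name, documents):
    """Serialize documents as newline-delimited JSON: an action line, then the source."""
    lines = []
    for doc in documents:
        action = {"index": {"_index": index_name, "_id": str(doc["node_id"])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"


# Hypothetical traversal output; in practice this would come from a graph query.
alice = {"id": 1, "name": "Alice"}
friends = [{"id": 2, "name": "Bob"}, {"id": 3, "name": "Carol"}]
payload = to_bulk_payload("people", [build_result_document(alice, friends)])
```

The resulting `payload` string could then be sent as the body of a bulk request by whatever HTTP client the integration uses.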
On the ONgDB side of the ElasticSearch integration, one approach is to implement an ONgDB TransactionEventHandler that pushes any graph changes over to ElasticSearch. Another approach, external to the core ONgDB transactional commit lifecycle, is to push to ElasticSearch, on successful commit, those changes to nodes and relationships that would impact the desired query result documents.
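The push-on-commit pattern can be sketched independently of any database API. The class below is a hypothetical stand-in, not the ONgDB TransactionEventHandler interface: it buffers changes while a transaction is open, forwards them to a sink only after a successful commit, and discards them on rollback. The names `CommitChangeBuffer`, `record`, `after_commit`, and `after_rollback` are assumptions chosen for this example.

```python
class CommitChangeBuffer:
    """Collects node changes and releases them only once the commit has succeeded."""

    def __init__(self, sink):
        # sink: any callable that accepts one change dict, e.g. an indexing client.
        self._sink = sink
        self._pending = []

    def record(self, node_id, properties):
        """Remember a change made inside the current transaction."""
        self._pending.append({"node_id": node_id, "properties": dict(properties)})

    def after_commit(self):
        """Push every buffered change to the sink and return how many were sent."""
        changes, self._pending = self._pending, []
        for change in changes:
            self._sink(change)
        return len(changes)

    def after_rollback(self):
        """Drop buffered changes so that uncommitted data never reaches the index."""
        self._pending.clear()


# Usage with a plain list standing in for the search index.
pushed = []
change_buffer = CommitChangeBuffer(pushed.append)
change_buffer.record(1, {"name": "Alice"})
change_buffer.after_commit()      # the change for node 1 is pushed
change_buffer.record(2, {"name": "Bob"})
change_buffer.after_rollback()    # the change for node 2 is discarded
```

Deferring the push until after the commit keeps the search index from ever seeing data that the graph later rolls back, at the cost of the index lagging slightly behind the graph.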
GraphGrid makes it possible to pair ONgDB and ElasticSearch in a way that allows ONgDB to house all nodes and relationships, while allowing ElasticSearch to provide text search by indexing certain nodes, relationships and query results. An example of this type of ONgDB ElasticSearch integration may be seen by following the Search tutorial in GraphGrid Connected Data Platform. Whenever you type a query in the search field, ElasticSearch begins surfacing search options, and once a result is selected, a request is made to ONgDB, where it can leverage the relationship connectedness specified in the graph to offer comprehensive results.
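The two-step flow described here (text search first, graph query second) might look roughly like the sketch below. It only builds the request bodies and parses a canned response; it is not the GraphGrid implementation. The `name` field, the `node_id` property, and the Cypher pattern are assumptions for illustration.

```python
def build_search_body(text, size=5):
    """Build an ElasticSearch match query for the text typed into the search field."""
    return {
        "size": size,
        "query": {"match": {"name": {"query": text, "fuzziness": "AUTO"}}},
    }


def extract_node_ids(search_response):
    """Pull the graph node ids out of the hits of a search response."""
    return [hit["_source"]["node_id"] for hit in search_response["hits"]["hits"]]


def build_graph_query(node_ids):
    """Return a parameterized Cypher query that expands from the starting nodes."""
    cypher = (
        "MATCH (n) WHERE n.node_id IN $ids "
        "MATCH (n)-[r]-(m) "
        "RETURN n, r, m"
    )
    return cypher, {"ids": list(node_ids)}


# Canned response in the shape ElasticSearch returns; a real one has more fields.
response = {
    "hits": {
        "hits": [
            {"_id": "1", "_source": {"node_id": 1, "name": "Alice"}},
            {"_id": "3", "_source": {"node_id": 3, "name": "Alicia"}},
        ]
    }
}
search_body = build_search_body("alic")
cypher, params = build_graph_query(extract_node_ids(response))
```

Passing the ids as a query parameter rather than splicing them into the Cypher string keeps the graph query reusable and avoids injection problems.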
Surfacing relevant search results isn’t easy, but when you leverage the connectedness of your data through ONgDB along with the aggregation and language capabilities provided by ElasticSearch, a powerful pairing emerges, which many refer to as “graph-aided search”. At GraphGrid, we see it as using each technology for what it’s best at, and we have been enabling businesses to leverage this auto-indexing connection since ElasticSearch 0.90 and very early ONgDB versions. This deep expertise is built into the GraphGrid Data Platform as a core integration for all to use: the ElasticGraph service.