Using Solr Search with RDBMS

In many shops, some of the most common queries used in large-scale RDBMS systems such as Oracle are pattern searches within ranges of criteria, typically targeted searches for data by users to answer and meet certain business needs. Writing standardized reports or simple relational queries can answer the questions, but such mechanisms can be inflexible and costly to maintain. One more efficient way to address these challenges is through the power of Solr.

Getting the Data

After installing Lucid Imagination's Solr onto a standalone server outside of the production complex, the next step involved actually configuring Solr so that we could get the data we needed. A few decisions were made at this point.

The first decision was about the data itself. I decided to target many of the existing information structures within the application which had been simplified to meet other business reporting needs. Using these structures would also make configuration easier later on.

The second decision involved whether to store the data values in the index itself. While ideally the data would have been accessed from the production database instance, I decided instead to store the data within the index for easier retrieval and to reduce the queries against the production database instance itself.

The final decision involved how much of the data could be safely retrieved via the DataImportHandler and stored within Solr. This actually turned out to be pretty simple. The Oracle constructs only held a week's worth of data, per an agreement with the business users. I would start with that amount and from there determine how much more could be held within the Solr instance.

Searching with Solr

The data once imported was not very large, only 50 GB overall. This again could be managed by adjusting the field types, whether data had to be stored or not, and the amount of historical information to be imported. Now that the data was available, searches could be executed on the data.

I also found the packaged Schema Browser very handy. Admittedly, the Schema Browser processes all the fields in the index, so if you have a lot of data this can take a while. However, the benefit is that it can provide answers to some of the more common questions that could be asked, such as: the number of documents per value, which can help for groups of items such as types of orders; how many documents actually have parent accounts; how many orders are provided by various sending systems; how many orders are for a given state or postal code; etc. The data can also yield additional insights from more advanced searches such as faceted searches: which postal codes are responding to which advertising or product promotions; which areas have the most activity for certain types of orders; or how many domains are covered per type of account. And the list goes on.

Operationally speaking, the Solr instances were managed in one of two ways: periodic updates from the main production instances, or continual updates with application code not only adding data to the Oracle database but inserting it into the Solr index as well. Hence the operations against the existing production instances could be managed to minimize impacts and eliminate any unnecessary processing.

Conclusion

With these new capabilities, answers to key questions can be found in seconds. Data can be mined quickly, efficiently and flexibly without a lot of specialized training for business users. Additionally, the indexes could be managed in such a way that additional data could be added to increase the scope of analysis, or subsets of data could be indexed and searched for specific business reasons such as service outages or legal reasons.
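
As an illustration of the faceted searches mentioned under Searching with Solr (for example, the number of orders per postal code), the following is a minimal Python sketch of how such a request could be built and its counts read back. The core name "orders" and the field names "postal_code" and "order_type" are assumptions made for the example and do not come from the actual schema; the request parameters (q, fq, rows, wt, facet, facet.field, facet.mincount) are standard Solr parameters.

```python
from urllib.parse import urlencode

# Hypothetical Solr core URL; adjust host, port and core name to the real setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/orders/select"


def build_facet_query(facet_field, filter_query=None, min_count=1):
    """Return a select URL that counts documents per value of facet_field."""
    params = [
        ("q", "*:*"),                    # match every document
        ("rows", 0),                     # only the facet counts are needed
        ("wt", "json"),                  # ask for a JSON response
        ("facet", "true"),
        ("facet.field", facet_field),
        ("facet.mincount", min_count),   # skip values with fewer hits
    ]
    if filter_query:
        # Optional filter, e.g. restrict the counts to one type of order.
        params.append(("fq", filter_query))
    return SOLR_SELECT_URL + "?" + urlencode(params)


def parse_facet_counts(response, facet_field):
    """Turn Solr's flat [value, count, value, count, ...] list into a dict."""
    flat = response["facet_counts"]["facet_fields"][facet_field]
    return dict(zip(flat[0::2], flat[1::2]))
```

The URL returned by build_facet_query could be fetched with urllib.request.urlopen and decoded with json.load before being passed to parse_facet_counts. Setting rows to 0 keeps the response small, since only the counts are of interest here.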