Apache Solr Architecture

by Online Tutorials Library

Apache Solr is a Java-based (J2EE) web application that uses the Apache Lucene libraries internally to build indexes and to provide user-friendly search. The block diagram below describes the architecture of Apache Solr.

Apache Solr Architecture

An Apache Solr instance can run as a single-core or multicore application and follows the client-server model. Early versions of Apache Solr supported only a single core, which limited users to running one application per Solr instance with a single configuration file and schema. Later versions added support for multiple cores, so one Solr instance can serve multiple schemas and configurations under unified administration. Apache Solr can also run in a distributed mode for high availability and scalability.

Logically, the overall architecture of Solr can be divided into four layers: storage, container, Solr engine, and interaction. The storage layer is responsible for managing indexes and configuration metadata. The container is the J2EE container in which the instance runs, and the Solr engine is the application package that runs on top of the container. Finally, the interaction layer denotes how clients, such as a web browser, interact with the Apache Solr server. The following sections describe these components in detail.

Storage

Apache Solr storage is used mainly for storing metadata and the index information itself. It is typically local file storage, set up in the Apache Solr configuration files. The installation package ships with Jetty as its servlet container and HTTP server by default, and the related configuration can be found in the $solr.home/conf folder inside the Solr installation. An index contains a sequence of documents. External storage systems, for example databases or Big Data storage systems, can also be configured with Apache Solr.

Components of Storage

  • A document is a collection of fields.
  • A field is a named sequence of terms.
  • A term is a string.
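
The document, field, and term hierarchy can be pictured with a small toy model. The Python sketch below is only an illustration and does not reflect how Lucene actually stores its index on disk; the sample documents and the function name build_inverted_index are hypothetical.

```python
from collections import defaultdict

# Two hypothetical documents; each document is a collection of named fields.
documents = {
    1: {"title": "apache solr architecture", "author": "jane"},
    2: {"title": "apache lucene basics", "author": "john"},
}


def build_inverted_index(docs):
    """Map (field, term) pairs to the sorted ids of documents containing them."""
    index = defaultdict(set)
    for doc_id, fields in docs.items():
        for field_name, text in fields.items():
            # A field is a named sequence of terms; in this toy model a term
            # is simply a whitespace-separated string.
            for term in text.split():
                index[(field_name, term)].add(doc_id)
    return {key: sorted(ids) for key, ids in index.items()}


index = build_inverted_index(documents)
print(index[("title", "apache")])  # [1, 2]
print(index[("title", "solr")])    # [1]
```

Looking up a (field, term) pair returns the documents that contain it, which is the basic idea behind searching an index instead of scanning every document.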

Building Blocks of Apache Solr

Below are the essential building blocks and components of Apache Solr:

Request Handler – Request handlers process the requests that we send to the Apache Solr server, such as index update requests or query requests. We choose the request handler according to our requirement. In general, each handler is mapped to a specific URI endpoint, and a request sent to that endpoint is served by the corresponding handler.
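
To make the mapping between handlers and URI endpoints concrete, here is a minimal Python sketch that only builds request URLs and does not contact a server. It assumes a local Solr server on port 8983 and a core named "techproducts"; both are placeholders, and handler_url is a hypothetical helper. The /select and /update paths are the usual endpoints for query and update handlers.

```python
from urllib.parse import urlencode

# Assumptions: a local Solr server and a core named "techproducts".
# Both values are placeholders for illustration.
BASE_URL = "http://localhost:8983/solr"
CORE = "techproducts"


def handler_url(handler, params=None):
    """Build the URL of a request handler endpoint such as /select or /update."""
    url = f"{BASE_URL}/{CORE}/{handler.lstrip('/')}"
    if params:
        url += "?" + urlencode(params)
    return url


# A query request goes to the /select handler ...
query_url = handler_url("/select", {"q": "name:solr", "rows": 5})
# ... while an index update request goes to the /update handler.
update_url = handler_url("/update", {"commit": "true"})

print(query_url)
print(update_url)
```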

Search Component – A search component provides one kind of search feature in Apache Solr, such as querying, faceting, spell checking, or hit highlighting. Search components are registered with search handlers, and multiple components can be registered with a single search handler.
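
From the client side, search components are typically switched on through request parameters. The Python sketch below assembles a query string that asks for faceting, highlighting, and spell checking in one request. The field names "name" and "cat" are hypothetical, and the spellcheck parameter only has an effect if a spell check component is configured on the handler.

```python
from urllib.parse import urlencode

# Hypothetical field names ("name", "cat"); adjust them to your schema.
params = {
    "q": "name:ipod",      # handled by the query component
    "facet": "true",       # enables the facet component
    "facet.field": "cat",
    "hl": "true",          # enables the highlight component
    "hl.fl": "name",
    "spellcheck": "true",  # needs a spell check component on the handler
}

query_string = urlencode(params)
print(query_string)
```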

Query Parser – The query parser parses the queries that we pass to the Solr server and checks them for syntax errors. After parsing, it translates them into a form that Lucene understands.
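
The two jobs of a query parser, checking syntax and producing a structured form, can be illustrated with a toy example. The Python sketch below is not Solr's parser and supports only field:value clauses joined by AND; parse_query is a hypothetical name, and a phrase that itself contains the word AND is not handled.

```python
import re

# One clause looks like field:value, where value is a word or a quoted phrase.
CLAUSE = re.compile(r'(\w+):("[^"]*"|\S+)')


def parse_query(query):
    """Split a query such as 'title:solr AND author:"jane doe"' into clauses."""
    clauses = []
    for part in re.split(r"\s+AND\s+", query.strip()):
        match = CLAUSE.fullmatch(part)
        if match is None:
            # Syntax check: every clause must have the form field:value.
            raise ValueError(f"syntax error in clause: {part!r}")
        field, value = match.groups()
        clauses.append((field, value.strip('"')))
    return clauses


print(parse_query('title:solr AND author:"jane doe"'))
```

A clause without a field name, such as a bare word, raises a ValueError in this sketch, which mirrors the parser rejecting a malformed query before it reaches the index.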

Response Writer – The response writer is the component that generates the formatted output for user queries. Apache Solr supports response formats such as XML, JSON, and CSV, with a different response writer for each format.
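
As an illustration of what a client receives from the JSON response writer, the Python sketch below parses a hand-written sample shaped like a Solr JSON response. The documents and values are made up for the example; a real response depends on your index and request parameters.

```python
import json

# A hand-written sample shaped like a Solr JSON response.
# The documents and values are made up for illustration.
sample_response = """
{
  "responseHeader": {"status": 0, "QTime": 1},
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [
      {"id": "1", "name": "Apache Solr"},
      {"id": "2", "name": "Apache Lucene"}
    ]
  }
}
"""

data = json.loads(sample_response)
num_found = data["response"]["numFound"]
names = [doc["name"] for doc in data["response"]["docs"]]
print(num_found, names)
```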

Analyzer/tokenizer – Apache Solr recognizes data in the form of tokens. It analyzes the content, divides it into tokens, and passes the tokens to Lucene. An analyzer examines the text of a field and generates a token stream; the tokenizer is the part that breaks the text into the individual tokens of that stream.
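
The idea of an analyzer as a tokenizer followed by token filters can be sketched in a few lines of Python. This is a toy emulation, not Solr's or Lucene's implementation; the function names and the stop word list are assumptions made for illustration.

```python
import re

STOP_WORDS = {"a", "an", "the", "of", "is"}  # illustrative list only


def tokenize(text):
    """Tokenizer: break the text into word tokens."""
    return re.findall(r"\w+", text)


def lowercase_filter(tokens):
    """Token filter: normalize every token to lower case."""
    return [token.lower() for token in tokens]


def stop_filter(tokens):
    """Token filter: drop common words that carry little meaning."""
    return [token for token in tokens if token not in STOP_WORDS]


def analyze(text):
    """Analyzer: a tokenizer followed by a chain of token filters."""
    tokens = tokenize(text)
    for token_filter in (lowercase_filter, stop_filter):
        tokens = token_filter(tokens)
    return tokens


print(analyze("The Architecture of Apache Solr"))
# ['architecture', 'apache', 'solr']
```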

Update Request Processor – When we send an update request to Apache Solr, the request is run through a set of plugins (for example signature, logging, and indexing), collectively known as the update request processor chain. The chain is responsible for modifications such as adding a field or dropping a field.
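
The chain idea, where each plugin receives the document, changes or records something, and hands it to the next plugin, can be sketched as follows. This Python example is a simplified model rather than Solr's plugin API; the processor functions and the internal_note field are hypothetical.

```python
import hashlib

log = []  # collects the messages written by the logging step


def drop_field_processor(doc):
    """Drop a field that should not be indexed (hypothetical 'internal_note')."""
    return {key: value for key, value in doc.items() if key != "internal_note"}


def signature_processor(doc):
    """Add a field holding a content hash, similar in spirit to a signature step."""
    content = "|".join(f"{key}={doc[key]}" for key in sorted(doc))
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return {**doc, "signature": digest}


def log_processor(doc):
    """Record that the document passed through the chain."""
    log.append(f"update id={doc['id']}")
    return doc


CHAIN = [drop_field_processor, signature_processor, log_processor]


def run_chain(doc, chain=CHAIN):
    """Pass the document through each processor in order."""
    for processor in chain:
        doc = processor(doc)
    return doc


result = run_chain({"id": "1", "title": "Solr", "internal_note": "draft"})
print(sorted(result))  # ['id', 'signature', 'title']
```

The order of the chain matters in this sketch: the field is dropped before the signature is computed, so the dropped content does not influence the hash.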

