Core Concepts of Elasticsearch
- Shards and replicas
- Mappings and types
- Inverted Indexes
- An index is a container that stores and manages documents of a single type in Elasticsearch.
- Typically, documents that share most of their fields are grouped under one type.
- JSON documents are first-class citizens of Elasticsearch.
- A document consists of multiple fields and is the basic unit of information stored in Elasticsearch.
- In addition to the fields sent by the user in the document, Elasticsearch maintains internal metafields. These include:
- _id: the unique identifier of the document within a type
- _type: the type of the document
- _index: the name of the index that holds the document
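To make the metafields concrete, here is a minimal sketch of what a stored document can look like when it is read back. The type name `employee` and the field values are made up for illustration, and the response is trimmed to the three metafields listed above plus `_source`, which is where the fields sent by the user come back.

```python
import json

# Hypothetical, trimmed response for one document; all values are illustrative.
response_text = """
{
  "_index": "qualitythought",
  "_type": "employee",
  "_id": "1",
  "_source": {"name": "Asha", "city": "Hyderabad"}
}
"""

doc = json.loads(response_text)

# The identifying metafields are maintained by Elasticsearch itself.
metafields = {key: doc[key] for key in ("_index", "_type", "_id")}

# The fields sent by the user are returned under _source.
user_fields = doc["_source"]

print(metafields)
print(user_fields)
```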
Let's Experiment with Indexes, Types and Documents
- Refer Here for the API documentation of indexes
- Let's create an index called qualitythought. Refer Here
- Refer Here for the Get Index API
- Let's create a document in the index. Refer Here
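The three steps above (create the index, get it, create a document) can be sketched as plain HTTP requests. This is only a sketch under assumptions: a node listening on `http://localhost:9200`, a hypothetical type called `course`, and a made-up document body. Newer Elasticsearch versions expect `_doc` in place of a custom type name. The requests are built but not sent, so the snippet runs without a cluster.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: a local node on the default HTTP port


def create_index_request(index):
    """Build (but do not send) a PUT request that creates an index."""
    return urllib.request.Request(f"{BASE_URL}/{index}", method="PUT")


def get_index_request(index):
    """Build a GET request that returns information about the index."""
    return urllib.request.Request(f"{BASE_URL}/{index}", method="GET")


def index_document_request(index, doc_type, doc_id, document):
    """Build a PUT request that stores a JSON document under the given id."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{index}/{doc_type}/{doc_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


requests_to_send = [
    create_index_request("qualitythought"),
    get_index_request("qualitythought"),
    # "course", the id "1" and the field values are hypothetical.
    index_document_request("qualitythought", "course", "1", {"title": "Elasticsearch basics"}),
]

for req in requests_to_send:
    print(req.get_method(), req.full_url)
    # To actually send a request, pass it to urllib.request.urlopen(req)
    # while a node is running at BASE_URL.
```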
- An Elasticsearch node is a single server running Elasticsearch, which may be part of a larger cluster of nodes.
- Every Elasticsearch node is assigned a unique ID and name when it starts.
- A cluster hosts one or more indexes and is responsible for operations such as indexing, searching and aggregations.
- A cluster is formed by one or more nodes.
- An Elasticsearch node is always part of a cluster, even when it is the only node.
The cluster name is configured in /etc/elasticsearch/elasticsearch.yml => cluster.name
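As a rough illustration of that setting, the sketch below reads `cluster.name` from the text of a configuration file. The sample contents and the names `qt-cluster` and `node-1` are made up, and `read_setting` is a hypothetical helper that only understands flat `key: value` lines; it is not a real YAML parser. When `cluster.name` is not set, a node falls back to the default name `elasticsearch`.

```python
def read_setting(config_text, key, default=None):
    """Return the value of a flat `key: value` setting, skipping comments."""
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":")
        if sep and name.strip() == key:
            return value.strip()
    return default


# Illustrative file contents; the cluster and node names are made up.
sample_config = """
# /etc/elasticsearch/elasticsearch.yml
cluster.name: qt-cluster
node.name: node-1
"""

print(read_setting(sample_config, "cluster.name", default="elasticsearch"))
print(read_setting(sample_config, "node.name"))
```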
Shards and replicas
- An index contains documents of one type.
- Shards help in dividing the documents of a single index over multiple nodes.
- The process of dividing the data among shards is called sharding.
- This is a built-in process of Elasticsearch.
- By default, every index is configured to have five primary shards (this default applies to versions before 7.0; later versions default to one).
- At the time of creating the index, we can specify the number of shards.
- Once the index is created, the number of primary shards cannot be changed for that index.
- Consider the example below, where we have three nodes and five shards.
- Now if one node goes down, we lose some data in this setup (until the node is back up).
- Distributed systems such as Elasticsearch are expected to keep running in spite of failures. This issue is addressed by replica shards, or replicas.
- Each shard in an index can be configured to have zero or more replica shards.
- Let's understand this with one replica per shard.
- Even if one node goes down, the replica shards on the remaining nodes still hold the data and keep serving the clients of Elasticsearch.
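The three-node, five-shard example can be simulated in a few lines. This is a toy model, not how Elasticsearch allocates shards internally: `crc32` stands in for the real routing hash, placement is simple round-robin, and the node names are hypothetical. The settings keys `number_of_shards` and `number_of_replicas` are the ones passed when an index is created.

```python
import zlib

# Index settings as they could be passed at index creation time.
index_settings = {"settings": {"number_of_shards": 5, "number_of_replicas": 1}}
nodes = ["node-1", "node-2", "node-3"]  # hypothetical node names


def shard_for(doc_id, number_of_shards):
    """Pick a shard from the document id (crc32 is a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards


def allocate(settings, nodes):
    """Round-robin placement of each shard's primary and replica copies.

    Copies of one shard land on different nodes as long as
    number_of_replicas + 1 does not exceed the number of nodes.
    """
    shards = settings["settings"]["number_of_shards"]
    replicas = settings["settings"]["number_of_replicas"]
    placement = {}  # shard number -> nodes holding a copy (primary first)
    for shard in range(shards):
        placement[shard] = [nodes[(shard + copy) % len(nodes)] for copy in range(replicas + 1)]
    return placement


def unavailable_shards(placement, failed_node):
    """Shards that have no surviving copy once failed_node is down."""
    return [s for s, holders in placement.items() if all(n == failed_node for n in holders)]


with_replica = allocate(index_settings, nodes)
without_replica = allocate({"settings": {"number_of_shards": 5, "number_of_replicas": 0}}, nodes)

print(unavailable_shards(without_replica, "node-1"))  # [0, 3]
print(unavailable_shards(with_replica, "node-1"))     # []
```

Because `shard_for` depends on the shard count, the same document id would map to a different shard if that count changed, which is one way to see why the number of primary shards is fixed once the index exists.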