Core Concepts of Elasticsearch
Types: Documents that share a mostly common set of fields are grouped under one type.
Document: A document consists of multiple fields and is the basic unit of information stored in Elasticsearch. When you send a document to Elasticsearch, it maintains the following internal metafields:
- _id: Unique identifier for the document
- _type: The type of the document
- _index: The index the document belongs to
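As a rough illustration of these metafields, the sketch below models a stored document as a plain Python dictionary and separates the metadata from the user-supplied fields. The index name, field values, and the helper `split_metadata` are hypothetical. The `_source` key, which Elasticsearch uses to return the original JSON body, is not covered above and is included here only as an assumption about how the response looks.

```python
# Hypothetical stored document, shaped like an Elasticsearch response.
stored = {
    "_index": "customer",   # which index the document belongs to
    "_type": "_doc",        # type of the document
    "_id": "1",             # unique identifier
    "_source": {"name": "John Doe", "city": "Hyderabad"},  # assumed sample fields
}


def split_metadata(hit):
    """Return (metafields, user fields) for a document dictionary."""
    meta = {k: v for k, v in hit.items() if k.startswith("_") and k != "_source"}
    fields = hit.get("_source", {})
    return meta, fields


meta, fields = split_metadata(stored)
print(meta)
print(fields)
```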
Nodes: An Elasticsearch node is a single Elasticsearch server, which may be part of a larger cluster of nodes. It participates in indexing and searching, and performs the other operations supported by Elasticsearch.
- Every node is assigned a unique ID
- The node name can be changed by setting node.name in elasticsearch.yml
- A cluster hosts one or more indexes and is responsible for searching, indexing and aggregations.
- A cluster is formed by one or more Elasticsearch nodes.
- The cluster name can be set with cluster.name in elasticsearch.yml
- Shards help distribute an index across the cluster
- The process of dividing data among shards is called sharding.
- By default, a replica shard is also created for every primary shard, and it is kept on a different node than its primary
- Replication gives Elasticsearch high availability
- If one node (say, Node1) fails, the data is still available on the rest of the cluster in the form of primary or replica shards, so a single node failure does not make the data unavailable.
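The sharding and replication ideas above can be sketched in a few lines of Python. This is a simplified model, not Elasticsearch's actual allocation logic: `zlib.crc32` stands in for the real routing hash, the node names and shard count are hypothetical, and the placement rule (replica on the next node) is an assumption chosen only to keep a primary and its replica on different nodes.

```python
import zlib

NUM_PRIMARY_SHARDS = 3                 # hypothetical index setting
NODES = ["node1", "node2", "node3"]    # hypothetical cluster


def shard_for(doc_id, num_shards=NUM_PRIMARY_SHARDS):
    """Pick a shard as hash(id) % number_of_primary_shards (simplified routing)."""
    # crc32 is a stand-in hash used only for illustration.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def place_shards(num_shards, nodes):
    """Put each primary on one node and its replica on the next node."""
    layout = {}
    for shard in range(num_shards):
        layout[shard] = {
            "primary": nodes[shard % len(nodes)],
            "replica": nodes[(shard + 1) % len(nodes)],
        }
    return layout


def surviving_copies(layout, failed_node):
    """List the nodes that still hold a copy of each shard after a failure."""
    return {
        shard: [node for node in copies.values() if node != failed_node]
        for shard, copies in layout.items()
    }


layout = place_shards(NUM_PRIMARY_SHARDS, NODES)
print("document 'customer-42' goes to shard", shard_for("customer-42"))
print("after node1 fails:", surviving_copies(layout, "node1"))
```

With at least two nodes, every shard in this model keeps one copy after any single node fails, which is the availability property described above.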
Mappings and datatypes:
- Elasticsearch is schemaless, but in real-world scenarios data is never completely schemaless or unstructured, so we usually define a mapping. Before we do this, we need to understand the datatypes supported by Elasticsearch
- A mapping is the structure given to a type
- String datatypes:
- text: Useful for supporting full-text search on fields
- keyword: Enables analytics (exact-match filtering, sorting and aggregations) on string fields
- Numeric datatypes:
- byte, short, integer and long
- float and double
- Date datatype
- Boolean datatype
- Binary datatype
- Range datatypes:
- integer_range, float_range, long_range, double_range and date_range
Complex datatypes
- Array datatype
- Object datatype: Allows JSON objects inside a document
- Nested datatype
- Geo-point datatype
- Geo-shape datatype
- IP datatype
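To make the datatypes above concrete, here is a minimal sketch of a mapping written as a Python dictionary and serialized to JSON. All field names are hypothetical. The body uses the typeless "mappings" / "properties" layout of recent Elasticsearch versions, which is an assumption; older versions that still use types nest "properties" under a type name.

```python
import json

# Hypothetical mapping for a customer index; field names are illustrative only.
mapping = {
    "properties": {
        "name": {"type": "text"},                 # full-text search
        "email": {"type": "keyword"},             # exact match, aggregations
        "age": {"type": "integer"},
        "balance": {"type": "double"},
        "joined": {"type": "date"},
        "active": {"type": "boolean"},
        "age_group": {"type": "integer_range"},
        "address": {                              # object: a JSON object field
            "properties": {"city": {"type": "keyword"}}
        },
        "orders": {                               # nested: array of objects
            "type": "nested",
            "properties": {"amount": {"type": "float"}},
        },
        "location": {"type": "geo_point"},
        "last_login_ip": {"type": "ip"},
    }
}


def field_types(props, prefix=""):
    """Flatten a mapping into {dotted.field.name: type}."""
    result = {}
    for name, spec in props.items():
        path = prefix + name
        # Object fields have no explicit "type"; label them "object".
        result[path] = spec.get("type", "object")
        if "properties" in spec:
            result.update(field_types(spec["properties"], path + "."))
    return result


body = json.dumps({"mappings": mapping}, indent=2)  # request body for index creation
print(body)
print(field_types(mapping["properties"]))
```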
REST APIs and Elasticsearch
- Elasticsearch exposes its functionality over REST. It exposes different APIs for different purposes.
- Index APIs: manage indexes along with their settings and mappings.
- Let's use the Kibana console to get the mapping of the customer index we created
- An inverted index is a core data structure of Elasticsearch. It helps in supporting full-text search.
- An inverted index is similar to the index you see at the end of a book
- Let's assume we have a few documents, each containing a short piece of text
- An inverted index is built from them, recording for each term its frequency and the documents it appears in, so text searches are faster because the matching documents can be looked up directly by term
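A toy version of this structure can be sketched as follows. The three sample documents are hypothetical, and the tokenizer (lowercase, split on non-letters) is a crude stand-in for Elasticsearch's analyzers, so treat this as an illustration of the idea rather than how Lucene stores its index.

```python
import re
from collections import defaultdict

# Hypothetical sample documents, keyed by document ID.
docs = {
    1: "It is Sunday tomorrow.",
    2: "Sunday is the last day of the week.",
    3: "The choice is yours.",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic terms."""
    return re.findall(r"[a-z]+", text.lower())


def build_inverted_index(documents):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, term):
    """Look up a single term; no scan over the documents is needed."""
    return sorted(index.get(term.lower(), set()))


index = build_inverted_index(docs)
# Print term, frequency (number of documents containing it), and document IDs.
for term in sorted(index):
    print(term, len(index[term]), sorted(index[term]))
```

A lookup such as `search(index, "Sunday")` touches only the entry for that term, which is why queries stay fast as the number of documents grows.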