DevOps Classroom Series – 21/Jul/2020 – Direct DevOps from Quality Thought

Analyzing Log Data

Every application or system writes a log entry whenever an event occurs. These logs contain rich information about the state and behavior of your application.
With so many logs, collecting them, extracting the relevant information, and analyzing it is a challenge.
Logs from different applications and systems do not share the same format.
Let's look at two different log formats:

# event log format
<event>
  <occured> 7:53 AM 7/21/2020 </occured>
  <message> User admin created </message>
  <type> INFO </type>
  
</event>

# text log format
7:53 AM 7/21/2020 org.qt.ecommerce.com INFO user admin created

To analyze logs, we need a tool that can extract them, transform them into a common format, and then store them (ETL: extract, transform, load). That is where Logstash comes into the picture.
Typical reasons for using logs
- Troubleshooting
- To understand system/application behavior
- Auditing
- Predictive Analysis
Challenges with Logs
- No common or consistent format
- Logs are decentralized
- No consistent time formats
- Data is unstructured

Logstash

Logstash is a popular open source data collection engine with real-time pipelining capabilities. It lets us build a pipeline that collects data from various sources, then parses, enriches, and unifies it before storing it in a wide variety of destinations.
Installation: refer to the Logstash installation documentation. Logstash runs on the JVM, so the command below installs OpenJDK 8 as a prerequisite.

sudo apt-get install openjdk-8-jdk -y

Architecture

The Logstash event processing pipeline has three stages:
- Inputs
- Filters
- Outputs
Inputs and outputs are required, whereas filters are optional.
The functionality of inputs, filters, and outputs is provided by Logstash plugins.
By default, Logstash buffers events in a bounded in-memory queue between the input stage and the filter and output stages. To persist this queue to disk and prevent data loss on failure, edit the logstash.yml file:
- Set the queue.type property to persisted
The Logstash pipeline format is:

input {
    <any input plugins>
}
filter {
    <any filter plugins>
}
output {
    <any output plugins>
}

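Generally, we create a .conf file and store it in the Logstash configuration directory.
Let's find the directories that hold the configuration files and the Logstash binaries (LOGSTASH_HOME). On a Debian package install these are typically /etc/logstash and /usr/share/logstash.
Let's create a basic pipeline from the command line: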

cd /usr/share/logstash/bin/
sudo ./logstash -e "input {stdin {} } output { stdout{} } "

Let's create a basic configuration file called simple.conf:

input {
    stdin{}
}

filter {
    mutate {
        uppercase => ["message" ]
    }
}

output {
    stdout {
        codec => rubydebug
    }
}

Now let's run Logstash with this file:

sudo ./logstash -f simple.conf
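
To tie this back to the text log format shown earlier, here is a minimal sketch of a pipeline that splits such a line into fields and parses its timestamp. It uses the standard grok and date filter plugins. The field names (timestamp, source, level, log_message) are assumptions chosen for illustration, not names from the original material.

```conf
input {
    stdin {}
}

filter {
    # Split the raw line into named fields (field names are illustrative)
    grok {
        match => {
            "message" => "(?<timestamp>%{HOUR}:%{MINUTE} (?:AM|PM) %{DATE_US}) %{NOTSPACE:source} %{LOGLEVEL:level} %{GREEDYDATA:log_message}"
        }
    }
    # Parse the extracted text (e.g. "7:53 AM 7/21/2020") into the event's @timestamp
    date {
        match => ["timestamp", "h:mm a M/d/yyyy"]
    }
}

output {
    stdout {
        codec => rubydebug
    }
}
```

Saved under a name of your choice and run with -f in the same way as simple.conf, this should print an event with separate source, level, and log_message fields when the sample text log line is pasted on stdin. Lines that do not match the pattern are tagged with _grokparsefailure by the grok filter.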

Logstash plugins

Logstash has a rich collection of input, filter, codec, and output plugins. Plugins are available as self-contained packages called gems.
To list the plugins that are part of the Logstash installation:

sudo ./logstash-plugin list
sudo ./logstash-plugin list --group input
sudo ./logstash-plugin list --group filter
sudo ./logstash-plugin list --group output
sudo ./logstash-plugin list 'grok'

If needed, additional plugins can be installed using:

sudo ./logstash-plugin install logstash-output-email

Plugins can be updated using the command below. With no plugin name given, it updates all installed plugins:

sudo ./logstash-plugin update

Documentation for each plugin is available in the Logstash reference on the Elastic website.
