Grok Filter
- This is a powerful and widely used plugin for parsing unstructured data into structured data, which makes the data easy to query.
- The general syntax of a grok pattern is
%{PATTERN:FIELDNAME}
- By default, grokked fields are strings; they can be cast to float or int
%{PATTERN:FIELDNAME:type}
- Logstash ships with about 120 patterns by default. Refer Here
- To work with grok patterns, Refer Here
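As a small illustration of the `%{PATTERN:FIELDNAME:type}` form above, the filter below casts two numeric captures and leaves the IP address as a string. The input line format and the field names `client`, `bytes`, and `duration` are assumptions made for this sketch; they are not taken from the samples that follow.

```
filter {
  grok {
    # Hypothetical input line: 55.3.244.1 15824 0.043
    # client stays a string, bytes becomes an integer, duration a float
    match => { "message" => "%{IP:client} %{NUMBER:bytes:int} %{NUMBER:duration:float}" }
  }
}
```

Without the `:int` and `:float` suffixes, `bytes` and `duration` would be stored as strings.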
- Converting a log message into multiple fields
- Sample 1
- Log message:
2021-04-29 02:59:23.110 DEBUG 353 --- [nio-8080-exec-7] s.w.s.m.m.a.RequestMappingHandlerMapping : Mapped to org.springframework.samples.petclinic.system.WelcomeController#welcome()
- Grok pattern:
%{TIMESTAMP_ISO8601}%{SPACE}%{LOGLEVEL:level}%{SPACE}(?<thread>\d+\s+-*\s*\[[\b\w*\d*-]*\]\s)(?<class>[\b\w*\.]*)%{SPACE}(?::\s*)%{GREEDYDATA:logmessage}
- Sample 2: Apache log Refer Here
- Log message:
83.149.9.216 - - [17/May/2015:10:05:03 +0000] "GET /presentations/logstash-monitorama-2013/images/kibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
- Grok pattern:
%{COMBINEDAPACHELOG}
- Sample 3: MySQL error logs
- Log message:
060516 22:38:54 [ERROR] Fatal error: Can't open privilege tables: Table 'mysql.host' doesn't exist
- Grok pattern (the leading date is in YYMMDD form):
(?<year>\d\d)(?<month>\d\d)(?<day>\d\d)%{SPACE}%{TIME}%{SPACE}(?<error>\[\b\w+\]\s+)%{GREEDYDATA:errormessage}
- Refer Here for some other samples
- Now let's create a conf that reads a log line from stdin for Sample 1
input {
  stdin {}
}
filter {
  grok {
    # Parse the Sample 1 log line into level, thread, class and logmessage
    match => { "message" => "%{TIMESTAMP_ISO8601}%{SPACE}%{LOGLEVEL:level}%{SPACE}(?<thread>\d+\s+-*\s*\[[\b\w*\d*-]*\]\s)(?<class>[\b\w*\.]*)%{SPACE}(?::\s*)%{GREEDYDATA:logmessage}" }
  }
}
output {
  stdout {}
}
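In the filter above, `%{TIMESTAMP_ISO8601}` has no field name, so with grok's default settings the timestamp is matched but not stored, and the `thread` capture also takes in the process ID and the `---` separator. A possible variant, sketched below, names each part separately. The field names `timestamp`, `pid`, and `thread` are choices made for this sketch, and it assumes the stock `POSINT`, `DATA`, and `JAVACLASS` patterns are present in the installed pattern set.

```
filter {
  grok {
    # Variant of the Sample 1 pattern: every part of the line gets its own field
    match => {
      "message" => "%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{LOGLEVEL:level}%{SPACE}%{POSINT:pid}%{SPACE}---%{SPACE}\[%{DATA:thread}\]%{SPACE}%{JAVACLASS:class}%{SPACE}:%{SPACE}%{GREEDYDATA:logmessage}"
    }
  }
}
```

For the Sample 1 line, this should produce the fields `timestamp`, `level`, `pid`, `thread`, `class`, and `logmessage`, with the brackets around the thread name left out of the `thread` value.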
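- Now let's read Apache logs from a file and write the parsed events to stdout using grok
input {
  file {
    path => "/var/log/apache.log"
    # "beginning" only applies to files Logstash has not read before
    start_position => "beginning"
  }
}
filter {
  grok {
    # COMBINEDAPACHELOG is one of the patterns shipped with Logstash
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
output {
  stdout {}
}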
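- Syslog messages can be parsed the same way
input {
  file {
    # The wildcard also picks up rotated files such as syslog.1
    path => "/var/log/syslog*"
    start_position => "beginning"
  }
}
filter {
  grok {
    # SYSLOGBASE matches the standard prefix; the rest of the line goes to logmessage
    match => { "message" => "%{SYSLOGBASE}%{GREEDYDATA:logmessage}" }
  }
}
output {
  stdout {}
}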
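Sample 3 can be wired up the same way. The sketch below reads MySQL error lines from stdin. The date groups follow the `YYMMDD` layout of that log format, and naming the time field `time` is an assumption added here so that the value is stored rather than only matched.

```
input {
  stdin {}
}
filter {
  grok {
    # Dates in this log format are YYMMDD, e.g. 060516 is 16 May 2006
    match => { "message" => "(?<year>\d\d)(?<month>\d\d)(?<day>\d\d)%{SPACE}%{TIME:time}%{SPACE}(?<error>\[\b\w+\]\s+)%{GREEDYDATA:errormessage}" }
  }
}
output {
  stdout {}
}
```

Pasting the Sample 3 log line into the running pipeline should print an event carrying `year`, `month`, `day`, `time`, `error`, and `errormessage`.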
