
@visuve
Last active August 17, 2023 06:37
Grok parser tutorial

0. Introduction

  • In this tutorial, a Grok parser is built for F-Secure (Windows) products that use the "fs_ccf_log" logger component

1. Open the Grok debugger & familiarize yourself with the Grok syntax

2021-09-07 22:06:42.377 [2a88.23e0]  D: main: Debug message
2021-09-07 22:06:42.377 [2a88.23e0]  I: main: Informative message
2021-09-07 22:06:42.377 [2a88.23e0] .W: main: Warning message
2021-09-07 22:06:42.377 [2a88.23e0] *E: main: Error message

1.1. Open the Grok pattern cheat sheet

1.2. Start trying out the Grok debugger

  • NOTE: the F-Secure "fs_ccf_log" format is as follows:
  • year-month-day hour:minute:second.millisecond [pid.tid] loglevel: function: message
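To make the layout concrete outside the Grok debugger, the format above can be sketched as a plain Python regular expression. This is only an approximation for experimenting: the character classes are simplified stand-ins for the Grok patterns, and the group names and the parse_line helper are illustrative choices, not part of any F-Secure or Elastic API.

```python
import re

# Rough regex equivalent of:
# year-month-day hour:minute:second.millisecond [pid.tid] loglevel: function: message
LINE_RE = re.compile(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2}) "
    r"(?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2})\.(?P<millisecond>\d{3}) "
    r"\[(?P<process_id>[0-9a-fA-F]+)\.(?P<thread_id>[0-9a-fA-F]+)\]\s+"
    r"(?P<loglevel>.*?):\s(?P<function>\w+):\s(?P<message>.*)"
)


def parse_line(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


# A line modeled on the samples in section 1
fields = parse_line("2021-09-07 22:06:42.377 [2a88.23e0] *E: main: Error message")
print(fields["loglevel"], fields["function"], fields["message"])
```

Note that the log level is captured together with its "." or "*" prefix, just as the DATA pattern used later in this tutorial does.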

1.3 Try to parse

  • Year
  • Month
  • Date
  • Hour
  • Minute

1.4 Try to parse

  • Seconds (without the milliseconds)
  • Milliseconds

2. Make the log format ECS-compliant

  • Open Elastic Stack Common Schema (ECS) documentation https://www.elastic.co/guide/en/ecs/current/index.html
  • Rename the captured fields in the following pattern so that they are ECS-compliant
  • %{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:day} %{HOUR:hour}:%{MINUTE:minute}:%{INT:second}\.%{INT:millisecond} \[%{BASE16NUM:process_id}\.%{BASE16NUM:thread_id}\]\s+%{DATA:loglevel}\:\s%{WORD:function}\:\s%{GREEDYDATA:message}
  • Hint: nested fields are written in the Grok parser by wrapping each level of the path in [ ] brackets
    • E.g. [host][cpu][usage]
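One possible target layout can be sketched in Python as a nested dictionary. The choice of process.pid, process.thread.id, log.level and log.origin.function is an assumption about which ECS fields fit best, so check it against the ECS documentation. Converting the hexadecimal IDs to integers and mapping the level letters to names are likewise illustrative choices; the to_ecs helper is hypothetical.

```python
def to_ecs(fields):
    """Map the flat captures from the pattern above to nested ECS-style fields."""
    # Assumption: the last character of the capture is the level letter and
    # the "." / "*" prefixes seen in the samples are decoration.
    level_names = {"D": "debug", "I": "info", "W": "warning", "E": "error"}
    letter = fields["loglevel"][-1]
    return {
        "process": {
            # The pattern captures pid and tid as BASE16NUM, so parse them as hex
            "pid": int(fields["process_id"], 16),
            "thread": {"id": int(fields["thread_id"], 16)},
        },
        "log": {
            "level": level_names.get(letter, fields["loglevel"]),
            "origin": {"function": fields["function"]},
        },
        "message": fields["message"],
    }


event = to_ecs({
    "process_id": "2a88",
    "thread_id": "23e0",
    "loglevel": "*E",
    "function": "main",
    "message": "Error message",
})
print(event["process"]["pid"], event["log"]["level"])  # prints: 10888 error
```

In a Grok pattern the same nesting would be written with the bracket notation from the hint, for example %{BASE16NUM:[process][pid]}. Grok captures strings, so the hex-to-integer conversion shown here would need a separate step.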

3. Putting it all together

3.1 Configure Filebeat

  • Add a new input under the filebeat.inputs: section of the Filebeat config file
    • /etc/filebeat/filebeat.yml
  • Mark the filestream input with a custom field, which the Logstash filter will use to recognize these events
filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - /home/ite/f-secure.log
  fields:
    fsecure: true
  fields_under_root: true

3.2 Configure the Logstash filter

  • Add a conditional filter block that only handles events carrying the new field
filter {
  if [fsecure] {
  }
}

3.3 Add Grok

  • Add grok inside the newly added if-block
    • Match against the message field
    • Use the pattern from section 2 to extract the fields

Example:

grok {
  match => { "message" => "%{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:day} %{HOUR:hour}:%{MINUTE:minute}:%{INT:second}\.%{INT:millisecond} \[%{BASE16NUM:process_id}\.%{BASE16NUM:thread_id}\]\s+%{DATA:loglevel}\:\s%{WORD:function}\:\s%{GREEDYDATA:message}" }
  # Replace the original message with the captured text instead of appending to it
  overwrite => [ "message" ]
}

3.4 Add Mutate

  • Add mutate after the grok object (still within the if-block)
  • Add a new field called fsecure_timestamp to the event
    • Assign to it the values parsed with grok
  • Remove the fields that are no longer needed

Example:

mutate {
  add_field => { "fsecure_timestamp" => "%{year}-%{month}-%{day}T%{hour}:%{minute}:%{second}+03:00" }
  remove_field => [ "year", "month", "day", "hour", "minute", "second", "millisecond" ]
}
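The effect of this mutate step on a single event can be sketched in Python as follows. The add_timestamp helper and its utc_offset parameter are hypothetical names used for illustration. The fixed +03:00 offset and the omission of the milliseconds from the timestamp mirror the example above.

```python
# Fields that are only needed to build the timestamp
TIME_KEYS = ("year", "month", "day", "hour", "minute", "second", "millisecond")


def add_timestamp(event, utc_offset="+03:00"):
    """Return a copy of the event with fsecure_timestamp added and the parts removed."""
    result = dict(event)
    # Same template as in the add_field example; the offset is hard-coded there too
    result["fsecure_timestamp"] = (
        "{year}-{month}-{day}T{hour}:{minute}:{second}".format(**event) + utc_offset
    )
    # Equivalent of remove_field: drop the individual date and time parts
    for key in TIME_KEYS:
        result.pop(key, None)
    return result


event = {
    "year": "2021", "month": "09", "day": "07",
    "hour": "22", "minute": "06", "second": "42", "millisecond": "377",
    "message": "Error message",
}
print(add_timestamp(event))
```

Because the offset is fixed, logs written outside that UTC offset (for example during winter time) would be shifted by an hour unless the offset is adjusted.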

3.5 Use the timestamp

  • Use the fsecure_timestamp field to set the event's @timestamp field with the date filter; the value built in 3.4 is already in ISO 8601 format
  • Add date after the mutate object (still within the if-block)
  • Example:
date {
  match => [ "fsecure_timestamp", "ISO8601" ]
}
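The date filter stores the parsed value in @timestamp as UTC. What that means for the example string can be checked with a minimal Python sketch, which uses only the standard library; the to_utc helper is a hypothetical name.

```python
from datetime import datetime, timezone


def to_utc(timestamp):
    """Parse an ISO 8601 timestamp with a UTC offset and convert it to UTC."""
    return datetime.fromisoformat(timestamp).astimezone(timezone.utc)


print(to_utc("2021-09-07T22:06:42+03:00").isoformat())
# prints: 2021-09-07T19:06:42+00:00
```

So a log line stamped 22:06:42 local time is expected to appear as 19:06:42 UTC in Elasticsearch.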

3.6 Full Logstash filter example:

filter {
  if [fsecure] {

    grok {
      match => { "message" => "%{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:day} %{HOUR:hour}:%{MINUTE:minute}:%{INT:second}\.%{INT:millisecond} \[%{BASE16NUM:process_id}\.%{BASE16NUM:thread_id}\]\s+%{DATA:loglevel}\:\s%{WORD:function}\:\s%{GREEDYDATA:message}" }
      overwrite => [ "message" ]
    }

    mutate {
      add_field => { "fsecure_timestamp" => "%{year}-%{month}-%{day}T%{hour}:%{minute}:%{second}+03:00" }
      remove_field => [ "year", "month", "day", "hour", "minute", "second", "millisecond" ]
    }
    
    date {
      match => [ "fsecure_timestamp", "ISO8601" ]
    }

  }
}