Accessing Mustache Array Elements

This post is older than a year. Some information might no longer be accurate.

Used: Elasticsearch v5.6.4

The QA (Quality Assurance) team uses simulators like Astrex to check and test changes and features. I was asked whether I could bring the simulator logs into our Elasticsearch so they are available in real time. Tailing log files is still difficult unless you can use bash.

A log message from the simulator has several segments and uses a tab as the delimiter.

0000000062	2017-12-28T12:52:08.643	OLI Interface	OLI Tst P60501	ID	4	[ChanCtrl] Requesting one channel from [TCP/IP (OLI Tst P60501)]

For the log ingestion, I was only interested in the timestamp and the log message at the end. I use an ingest pipeline in Elasticsearch. The splitting part is quite simple: applying the split processor with a tab separator results in this data structure:

{
 "_source": {
  "data": [
    "0000000062",
    "2017-12-28T12:52:08.643",
    "OLI Interface",
    "OLI Tst P60501",
    "ID",
    "4",
    "[ChanCtrl] Requesting one channel from [TCP/IP (OLI Tst P60501)]"
  ]
 }
}
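
To see why the timestamp ends up at index 1 and the log message at index 6, the split can be mimicked outside Elasticsearch. This is only a minimal Python sketch of the idea, not what the split processor runs internally; the variable names are chosen for illustration.

```python
# Sample simulator line from above; segments are separated by tab characters.
line = (
    "0000000062\t2017-12-28T12:52:08.643\tOLI Interface\tOLI Tst P60501\t"
    "ID\t4\t[ChanCtrl] Requesting one channel from [TCP/IP (OLI Tst P60501)]"
)

# Comparable in spirit to the split processor with separator "\t".
data = line.split("\t")

timestamp = data[1]    # second segment, zero-based index 1
logmessage = data[6]   # last segment, zero-based index 6

print(len(data))
print(timestamp)
print(logmessage)
```

Indices are zero-based, so the seven segments occupy positions 0 through 6.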

The challenge was to access only the elements at index 1 and 6. Since Elasticsearch uses Mustache for templating, it was quite difficult to find the relevant documentation. But it is possible! Just use the index position in dot notation, like this: {{data.1}}.
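
The dot notation can be pictured as a path walk in which a numeric segment selects a list position. The following Python sketch illustrates that idea only; it is not the Mustache implementation Elasticsearch uses, and `resolve` and `render` are hypothetical helper names.

```python
import re


def resolve(path, context):
    """Walk a dotted path; numeric parts index into lists, others into dicts."""
    value = context
    for part in path.split("."):
        if isinstance(value, list):
            value = value[int(part)]
        else:
            value = value[part]
    return value


def render(template, context):
    """Replace every {{ path }} placeholder with the resolved value."""
    return re.sub(
        r"\{\{\s*([\w.]+)\s*\}\}",
        lambda m: str(resolve(m.group(1), context)),
        template,
    )


doc = {"data": ["0000000062", "2017-12-28T12:52:08.643", "OLI Interface"]}
print(render("{{data.1}}", doc))
```

Here `{{data.1}}` resolves to the second element of the `data` list, which is the timestamp.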

PUT _ingest/pipeline/le-mapper
{
  "description": "mapperstuff",
  "processors": [
    {
      "split": {
        "field": "message",
        "separator": "\t",
        "target_field": "data"
      }
    },
    {
      "set": {
        "field": "jesus",
        "value": "{{data.1}}"
      }
    },
    {
      "set": {
        "field": "logmessage",
        "value": "{{data.6}}"
      }
    },
    {
      "set": {
        "field": "application",
        "value": "simulator"
      }
    },
    {
      "date": {
        "field": "jesus",
        "target_field": "datetime",
        "formats": [
          "yyyy-MM-dd'T'HH:mm:ss.SSS"
        ],
        "timezone": "Europe/Zurich"
      }
    },
    {
      "remove": {
        "field": [
          "data",
          "jesus",
          "message"
        ]
      }
    }
  ],
  "on_failure": [
    {
      "set": {
        "field": "error",
        "value": "{{ _ingest.on_failure_message }} on operation: {{ _ingest.on_failure_processor_type }}"
      }
    }
  ]
}
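
The date processor in this pipeline interprets the extracted timestamp as local time in Europe/Zurich. As a rough cross-check of the offset to expect, the same conversion can be sketched in Python. This assumes Python 3.9 or newer with the `zoneinfo` module and a time zone database available on the system; it is an illustration, not how Elasticsearch parses dates.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

raw = "2017-12-28T12:52:08.643"  # value of data.1 from the sample line

# Parse the naive timestamp; %f accepts the three millisecond digits.
naive = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%f")

# Attach the zone configured in the date processor ("timezone" option).
local = naive.replace(tzinfo=ZoneInfo("Europe/Zurich"))

print(local.isoformat(timespec="milliseconds"))
```

In December, Zurich is on CET, so the offset is +01:00.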

Test the pipeline with the simulate API:

POST /_ingest/pipeline/le-mapper/_simulate
{
  "docs": [
    {
      "_source": {
        "message": """0000000062	2017-12-28T12:52:08.643	OLI Interface	OLI Tst P60501	ID	4	[ChanCtrl] Requesting one channel from [TCP/IP (OLI Tst P60501)]"""
      }
    }
  ]
}

The output in JSON:

{
  "docs": [
    {
      "doc": {
        "_index": "_index",
        "_type": "_type",
        "_id": "_id",
        "_source": {
          "datetime": "2017-12-28T12:52:08.643+01:00",
          "application": "simulator",
          "logmessage": "[ChanCtrl] Requesting one channel from [TCP/IP (OLI Tst P60501)]"
        },
        "_ingest": {
          "timestamp": "2018-02-02T10:38:02.550Z"
        }
      }
    }
  ]
}