This post is more than a year old. Some of the information may no longer be accurate.
Used: Elasticsearch v5.6.4
The QA (Quality Assurance) team uses simulators like Astrex to test changes and new features. I was asked whether I could bring the simulator logs into our Elasticsearch so they can be followed in near real time. Tailing log files is otherwise awkward, unless you are comfortable with bash.
A log message from the simulator consists of several segments, delimited by tabs.
0000000062 2017-12-28T12:52:08.643 OLI Interface OLI Tst P60501 ID 4 [ChanCtrl] Requesting one channel from [TCP/IP (OLI Tst P60501)]
For the log ingestion, I was only interested in the timestamp and the log message at the end. I use an ingest pipeline in Elasticsearch. The splitting part is quite simple: the split processor produces this data structure:
{
  "_source": {
    "data": [
      "0000000062",
      "2017-12-28T12:52:08.643",
      "OLI Interface",
      "OLI Tst P60501",
      "ID",
      "4",
      "[ChanCtrl] Requesting one channel from [TCP/IP (OLI Tst P60501)]"
    ]
  }
}
The challenge was to access only the elements at index 1 and 6. Elasticsearch uses Mustache for templating, and it was quite difficult to find the relevant documentation. But it is possible! Just use the index position, like this: {{data.1}}.
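To make the index lookup concrete, here is a small Python sketch that mimics what the split processor and the `{{data.1}}` / `{{data.6}}` lookups do. It is only a local emulation for illustration, not Elasticsearch code; the function name `split_and_pick` is made up, and the tab positions in the sample line are reconstructed from the split output shown above.

```python
# Local emulation of the split processor plus the Mustache lookups
# {{data.1}} and {{data.6}}. Not Elasticsearch code, illustration only.

# Sample line with explicit tabs (positions taken from the split output above).
message = (
    "0000000062\t2017-12-28T12:52:08.643\tOLI Interface\tOLI Tst P60501"
    "\tID\t4\t[ChanCtrl] Requesting one channel from [TCP/IP (OLI Tst P60501)]"
)


def split_and_pick(line, separator="\t"):
    """Split a log line and return the two segments of interest."""
    data = line.split(separator)      # what the split processor stores in "data"
    return {
        "timestamp": data[1],         # equivalent of {{data.1}}
        "logmessage": data[6],        # equivalent of {{data.6}}
    }


picked = split_and_pick(message)
print(picked["timestamp"])   # 2017-12-28T12:52:08.643
print(picked["logmessage"])  # [ChanCtrl] Requesting one channel from [TCP/IP (OLI Tst P60501)]
```

Both the Python list and the Mustache path are zero-based, which is why the timestamp sits at position 1 and the message at position 6.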
PUT _ingest/pipeline/le-mapper
{
  "description": "mapperstuff",
  "processors": [
    {
      "split": {
        "field": "message",
        "separator": "\t",
        "target_field": "data"
      }
    },
    {
      "set": {
        "field": "jesus",
        "value": "{{data.1}}"
      }
    },
    {
      "set": {
        "field": "logmessage",
        "value": "{{data.6}}"
      }
    },
    {
      "set": {
        "field": "application",
        "value": "simulator"
      }
    },
    {
      "date": {
        "field": "jesus",
        "target_field": "datetime",
        "formats": [
          "yyyy-MM-dd'T'HH:mm:ss.SSS"
        ],
        "timezone": "Europe/Zurich"
      }
    },
    {
      "remove": {
        "field": [
          "data",
          "jesus",
          "message"
        ]
      }
    }
  ],
  "on_failure": [
    {
      "set": {
        "field": "error",
        "value": "{{ _ingest.on_failure_message }} on operation: {{ _ingest.on_failure_processor_type }}"
      }
    }
  ]
}
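The date processor in this pipeline reads the timestamp as local time in Europe/Zurich, because the log line itself carries no offset. The following Python sketch reproduces that interpretation outside Elasticsearch, which can help when cross-checking results. It assumes Python 3.9+ with the `zoneinfo` module; the helper name `to_zurich_iso` and the fixed-offset fallback are my own additions, and the fallback is only correct for winter (CET) timestamps such as the sample.

```python
from datetime import datetime, timedelta, timezone

try:
    from zoneinfo import ZoneInfo
    ZURICH = ZoneInfo("Europe/Zurich")
except Exception:
    # Fallback when no time zone database is available.
    # Fixed +01:00 is only right for CET (winter) timestamps.
    ZURICH = timezone(timedelta(hours=1))


def to_zurich_iso(raw):
    """Parse a simulator timestamp and attach the Europe/Zurich zone."""
    # Python equivalent of the pattern yyyy-MM-dd'T'HH:mm:ss.SSS
    naive = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%f")
    return naive.replace(tzinfo=ZURICH).isoformat(timespec="milliseconds")


print(to_zurich_iso("2017-12-28T12:52:08.643"))  # 2017-12-28T12:52:08.643+01:00
```

Without the `timezone` option the date processor would treat the value as UTC, so the stored instant would be off by the local offset.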
To test the pipeline, use the simulate API:
POST /_ingest/pipeline/le-mapper/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "0000000062\t2017-12-28T12:52:08.643\tOLI Interface\tOLI Tst P60501\tID\t4\t[ChanCtrl] Requesting one channel from [TCP/IP (OLI Tst P60501)]"
      }
    }
  ]
}
The output in JSON:
{
  "docs": [
    {
      "doc": {
        "_index": "_index",
        "_type": "_type",
        "_id": "_id",
        "_source": {
          "datetime": "2017-12-28T12:52:08.643+01:00",
          "application": "simulator",
          "logmessage": "[ChanCtrl] Requesting one channel from [TCP/IP (OLI Tst P60501)]"
        },
        "_ingest": {
          "timestamp": "2018-02-02T10:38:02.550Z"
        }
      }
    }
  ]
}
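Once the simulation looks right, real log lines can be sent through the pipeline by adding the `pipeline` query parameter to an index request. The Python sketch below only builds such a request with the standard library, so it runs without a cluster. The host `http://localhost:9200`, the index name `simulator-logs`, the type name `log` and the helper `build_index_request` are assumptions for illustration, not values from my setup.

```python
import json
import urllib.request


def build_index_request(line,
                        host="http://localhost:9200",   # assumed host
                        index="simulator-logs",         # hypothetical index name
                        doc_type="log",                 # hypothetical type name
                        pipeline="le-mapper"):
    """Build an index request that routes the raw line through the pipeline."""
    url = f"{host}/{index}/{doc_type}?pipeline={pipeline}"
    # The pipeline expects the raw, tab-delimited line in the "message" field.
    body = json.dumps({"message": line}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_index_request(
    "0000000062\t2017-12-28T12:52:08.643\tOLI Interface\tOLI Tst P60501"
    "\tID\t4\t[ChanCtrl] Requesting one channel from [TCP/IP (OLI Tst P60501)]"
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Sending is left as a comment on purpose. In a real setup a log shipper or a small loop over the tailed file would call this for each new line.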