Delimiter based KV extraction – advanced

If you’ve read my previous post on delimiter based KV extraction, you might be wondering whether you could do more with it (Anonymous Coward did). Well, yes you can, and I am going to cover the “advanced” cases here. Before covering the capabilities, as in other posts, I will first go over some observations and examples.

Observations

Header-body. Some applications, for various reasons, format their log files using a header section and a body section. The header usually describes how the fields are organized in each logged event, while the body consists of the logged events, usually one per line, with field values delimited as described in the header. W3C and CSV formats come to mind; see the examples below.
Single-delimiter. Other applications use a single delimiter to separate keys from values and values from the next key. While this is not very common, it has been observed in the field.

Data Examples

The following header-body sample, as you can probably guess, is from an Exchange server. The header section contains, among other things, the list of field names, separated by the same delimiter that separates values in the body section. In this case a tab character is used (even though our blogging platform chooses to mangle tabs to spaces, gotta love it!).

# Message Tracking Log File
# Exchange System Attendant Version 6.5.7638.1
# Fields: time client-ip cs-method sc-status
14:13:11 10.1.1.9 HELO 250
14:13:13 10.1.1.9 MAIL 250
14:13:19 10.1.1.9 RCPT 250
14:13:29 10.1.1.9 DATA 250
14:13:31 10.1.1.9 QUIT 240

The following example shows how a single delimiter can be used to list fields. It is pretty easy for us, as humans, to recognize the key-value pairs:

"url http://splunk.com referer http://dev.splunk.com ip 10.10.10.10"

Enabling header-body kv/extract

The delimiter based KV extraction solves the header-body problem by adding the capability to assign field names to extracted values. It does so with single-level tokenization/splitting (i.e. a single delimiter) instead of the normal two-level one described earlier. Unfortunately, this is only available through transforms.conf, and it requires manual specification of the field names (there is no automatic field name detection). To this end, we introduce another transforms.conf configuration variable, FIELDS, which lists the field names to assign, in order, to the extracted values.
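To make the single-level splitting concrete, here is a minimal Python sketch of the idea. It is not Splunk's implementation, only an illustration that assumes the Exchange sample above with real tab delimiters. The names FIELD_NAMES and extract_header_body are made up for this example; FIELD_NAMES plays the role of the manually specified field list.

```python
# Illustrative sketch only: mimics header-body extraction outside Splunk.
# FIELD_NAMES and extract_header_body are hypothetical names.

FIELD_NAMES = ["time", "client-ip", "cs-method", "sc-status"]


def extract_header_body(line, field_names, delim="\t"):
    """Split a body line on one delimiter and name the values by position."""
    if line.startswith("#"):
        # Header lines describe the format; they carry no event data.
        return {}
    values = line.rstrip("\n").split(delim)
    # zip stops at the shorter list, so surplus values or names are ignored
    # (an assumption of this sketch, not a statement about Splunk).
    return dict(zip(field_names, values))


sample = "14:13:11\t10.1.1.9\tHELO\t250"
print(extract_header_body(sample, FIELD_NAMES))
```

Each value gets its name purely from its position in the line, which is why the field list has to be supplied in the same order as the header declares it.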

Enabling single-delimiter kv/extract

There’s yet another trick in the delimiter KV extraction: the single-delimiter extraction. Single-delimiter extraction pairs up consecutive extracted values as key=value, that is value1=value2, value3=value4 and so on. To enable this extraction via the command line, set kvdelim and pairdelim to the same value. For the sample data above, the extract command should look as follows:

.... | extract kvdelim=" " pairdelim=" " auto=f | ....

To enable single-delimiter extraction via transforms.conf, you can specify either one delimiter or two identical delimiters in the DELIMS config variable. Thus the following two transforms.conf stanzas are equivalent to each other and to the above command:

....transforms.conf....
[single-delim-1]
DELIMS = " "

[single-delim-2]
DELIMS = " ", " "

The results of these extractions for our sample data would be:

"url http://splunk.com referer http://dev.splunk.com ip 10.10.10.10"
url=http://splunk.com
referer=http://dev.splunk.com
ip=10.10.10.10

NOTE: do not specify a FIELDS variable for the single-delimiter extraction because that will enable header-body extraction.
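The pairing behaviour of single-delimiter extraction can be mimicked in a few lines of Python. This is only a sketch of the idea, not how Splunk implements it; extract_single_delim is a hypothetical name, and dropping a trailing unpaired token is an assumption of the sketch.

```python
# Illustrative sketch only: mimics single-delimiter pairing outside Splunk.
# extract_single_delim is a hypothetical name.


def extract_single_delim(event, delim=" "):
    """Split on one delimiter, then pair tokens as key, value, key, value."""
    # Drop the surrounding quotes of the sample and any empty tokens.
    tokens = [t for t in event.strip('"').split(delim) if t]
    # Even positions are keys, odd positions are values; a trailing
    # unpaired token is dropped by zip (an assumption of this sketch).
    return dict(zip(tokens[0::2], tokens[1::2]))


event = '"url http://splunk.com referer http://dev.splunk.com ip 10.10.10.10"'
print(extract_single_delim(event))
```

Because keys and values are told apart only by position, a single missing value shifts every pair that follows it, which is worth keeping in mind before relying on this format.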

Thoughts, questions, ideas, and comments are always welcome.

----------------------------------------------------
Thanks!
Ledion Bitincka
