Multiline Multivalued Fields Extraction in Splunk Search

Multiline Multivalued Fields Extraction in Splunk refers to a more complex data extraction scenario where a single event (log entry) contains multiple lines, each containing numerous field-value pairs. There are several ways to solve this. You can extract fields by changing the input ingestion of your data by using the “Extract Fields” feature from Splunk’s UI or “transforms.conf” and “props.conf” file alternation. But, if you’re in a hurry to test a feature in Splunk or to see statistics of your data, the less time-consuming and most effortless approach would be using Splunk Search SPL commands without altering your input data.

Affiliate: Experience limitless no-code automation, streamline your workflows, and effortlessly transfer data between apps with Make.com.

Let’s take an example: You may have an event that looks something like this:

14:00:00.892305|0.0003443|0.045435|0.0005435|2957
14:00:01.892305|0.0005456|0.044311|0.0006432|2962
14:00:02.892305|0.0004341|0.045876|0.0005442|2953

We’ll need a multiline multivalued fields extraction solution. Each log line is a separate event, with multiple values separated by the “|” character. Assuming these values correspond to data gathered from the external vibration sensor and internal engine speed sensor:
1. Sampled time of the event.
2. External vibration sensor X-axis acceleration measured in “g.”
3. External vibration sensor Y-axis acceleration measured in “g.”
4. External vibration sensor Z-axis acceleration measured in “g.”
5. Internal engine speed sensor measured in RPM.

Another problem of multiline multivalued fields extraction in such a case is that each log entry is inconsistent. It can have three lines, sometimes four or five. You can try a variety of means in Splunk, such as the use of “spath,” “mvexpand,” “rex,” and “makemv” commands in your Splunk search. Note that the complexity of this process may vary depending on the structure and consistency of your data.

We’ll show you the solution to multiline multivalued fields extraction and then break it down to understand separate commands:

index=your_index
| multikv noheader=t
| makemv delim="|" Column_1
| eval sampled_time=mvindex(Column_1,0), external_vib_x=mvindex(Column_1, 1), external_vib_y=mvindex(Column_1, 2), external_vib_z=mvindex(Column_1, 3), internal_speed=mvindex(Column_1, 4)

In this command:
index=your_index: This is the initial search command. It tells Splunk to search the data in the index named “your_index.”
| multikv noheader=t: The command extracts field-value pairs from events with a multiline, key-value format with a delimiter. The “noheader=t” argument means that the command treats the first row of the data as data, not headers (i.e., it doesn’t try to use the first line as field names). This command is the main part that separates the multi-lines in the multiline multivalued fields extraction solution.
| makemv delim=”|” Column_1: This command splits the “Column_1” field into multiple values based on the specified delimiter (“|” in this case). “makemv” creates multivalue fields (fields that can contain more than one value).
| eval sampled_time=mvindex(Column_1,0), external_vib_x=mvindex(Column_1, 1), external_vib_y=mvindex(Column_1, 2), external_vib_z=mvindex(Column_1, 3), internal_speed=mvindex(Column_1, 4): This command creates new fields using the “eval” function. The “mvindex” function pulls out specific values from the multivalue field produced in the last step (Column_1), with indexing starting from “0”. So, for example, “sampled_time=mvindex(Column_1,0)” creates a new field “sampled_time” and assigns it the first value from “Column_1”. The other parts of this command do the same for “external_vib_x,” “external_vib_y,” “external_vib_z,” and “internal_speed” using the second, third, fourth, and fifth values from “Column_1”, respectively.

In summary, this command sequence searches for data in a specific index, extracts multivalue fields, splits a particular field into multiple values, and creates new fields based on these values.

Data Result After Each Step

To remind you that this is our example:

14:00:00.892305|0.0003443|0.045435|0.0005435|2957
14:00:01.892305|0.0005456|0.044311|0.0006432|2962
14:00:02.892305|0.0004341|0.045876|0.0005442|2953

Let’s see what’s happening to it step by step. You can check it in the “_raw” special field.

Step 1: Using “multikv” Command.

Before the first step of multiline multivalued fields extraction solution, there is one field and one row that will look like this:

RowCounter  |  _raw
----------------------------------------------------------------
1           |  14:00:00.892305|0.0003443|0.045435|0.0005435|2957
            |  14:00:01.892305|0.0005456|0.044311|0.0006432|2962
            |  14:00:02.892305|0.0004341|0.045876|0.0005442|2953

After the execution of the command, this will be the search result:

RowCounter  |  Column_1
----------------------------------------------------------------
1           |  14:00:00.892305|0.0003443|0.045435|0.0005435|2957
----------------------------------------------------------------
2           |  14:00:01.892305|0.0005456|0.044311|0.0006432|2962
----------------------------------------------------------------
3           |  14:00:02.892305|0.0004341|0.045876|0.0005442|2953

In this case, “_raw” will look exactly like the “Column_1” field since this is what gets divided.

Step 2: “makemv” Command

The second step of the multiline multivalued fields extraction solution is the “makemv” command will divide the field into several rows inside the same single row:

RowCounter  |  Column_1
----------------------------------------------------------------
1           |  14:00:00.892305
            |  0.0003443
            |  0.045435
            |  0.0005435
            |  2957
----------------------------------------------------------------
2           |  14:00:01.892305
            |  0.0005456
            |  0.044311
            |  0.0006432
            |  2962
----------------------------------------------------------------
3           |  14:00:02.892305
            |  0.0004341
            |  0.045876
            |  0.0005442
            |  2953

Step 3 of Multiline Multivalued Fields Extraction Solution: “eval” and “mvindex”.

As we can see that each row now has a multiline cell inside the “Column_1” field. Each row inside that cell has an index that we will extract with the “mvindex” command and put inside a variable that we create with the “eval” command. As a result, this creates a separate field for each value. Off course “Column_1” field will be intact after using the full multiline multivalued fields extraction solution.

RowCounter  |  sampled_time     |  extrernal_vib_x  |  extrernal_vib_y  |  extrernal_vib_z  |  internal_speed
-------------------------------------------------------------------------------------------------------------
1           |  14:00:00.892305  |  0.0003443        |  0.045435         |  0.0005435        |  2957
-------------------------------------------------------------------------------------------------------------
2           |  14:00:01.892305  |  0.0005456        |  0.044311         |  0.0006432        |  2962
-------------------------------------------------------------------------------------------------------------
3           |  14:00:02.892305  |  0.0004341        |  0.045876         |  0.0005442        |  2953

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.