Yielding Optimized SPL Using TERM()

By Leor Roynsky

SPL (Splunk Processing Language) is the fundamental way users interact with data in their Splunk environment. While many of us use SPL on a regular basis, we often fail to realize that search load is one of the biggest factors in determining long-term sizing. Optimized ingestion is also important, but writing efficient searches is a crucial way to reduce server load and provide a better experience for everyone using the environment.

Growing environments, in particular, tend to accumulate searches that get more complicated, more convoluted, and more resource-intensive over time. The direct result is slow-loading dashboards and a more frustrating user experience. Improving search performance reduces the overall load on the environment, improves productivity, and replaces potentially unusable SPL, freeing users to tackle new problems. Consequently, this yields improved ROI on hardware investments.

To accomplish our goals, we first need to understand LISPY and how Splunk turns your SPL into the search parameters it uses to find candidate events and then match them against your search.

What is LISPY? 

When a search is executed, Splunk uses your SPL to generate an optimized search, a literal search, and a LISPY. At a very basic level, LISPY is the lexicon (or keywords) and Boolean operators (AND/OR/NOT) which Splunk uses to initially gather relevant events before matching those events against your full search. Essentially, it determines which events Splunk will pull out of all relevant buckets before checking whether they match your search criteria: it’s the search before the search.

When events are ingested, Splunk builds a mapping from keywords to the events that contain them. It takes the raw event and splits it into segments using major breakers, after which it applies minor breakers to those segments. To find out more regarding breakers, head over to Splunk’s own documentation page. For our purposes, it is sufficient to recognize that major breakers are characters such as spaces, quotation marks, and a handful of other special characters, and that the words they delimit in the raw event become indexed keywords. Minor breakers, such as periods, equals signs, and underscores, split those keywords further. What makes this super-interesting is that the breakers are applied at index time (yes, you read that right). This makes using TERM a possibility in commands like tstats.
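To make the two-pass breaking a little more concrete, here is a rough Python sketch. It is not Splunk’s segmenter: the breaker lists are simplified subsets of the defaults, the function name index_tokens and the sample event are invented for illustration, and real segmentation stores more than this.

```python
import re

# Simplified subsets of the default breakers (assumption: the full lists
# are defined in Splunk's segmentation configuration, not reproduced here).
MAJOR_BREAKERS = r"[\s\[\]<>(){}|!;,'\"&?+]+"
MINOR_BREAKERS = r"[/:=@.\-$#%\\_]+"

def index_tokens(raw_event):
    """Return the lowercase keywords a simplified indexer would store."""
    tokens = set()
    # Pass 1: split the raw event on major breakers.
    for major in re.split(MAJOR_BREAKERS, raw_event):
        if not major:
            continue
        tokens.add(major.lower())
        # Pass 2: split each major segment on minor breakers.
        for minor in re.split(MINOR_BREAKERS, major):
            if minor:
                tokens.add(minor.lower())
    return tokens

# Hypothetical event, invented for illustration.
event = "EventCode=4624 user=alice"
print(sorted(index_tokens(event)))
```

The point to notice is that both the whole segment (eventcode=4624) and its pieces (eventcode, 4624) end up as keywords, which is what gives TERM something to look up.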

With LISPY, major and minor breakers, and the TERM function in hand, we can now talk about optimizing our search. Let’s take the following reference search query:

Reference Query:

index=solsys_es_windows sourcetype="wineventlog" EventCode=4*

Test Query:

index=solsys_es_windows sourcetype="wineventlog" TERM(EventCode=4*)

You may ask yourself, why is the runtime using TERM more than 50% faster than the reference query?

The answer lies in the LISPY. 

LISPY of the reference query = [AND 4* index::solsys_es_windows sourcetype::wineventlog]

LISPY of the test query = [AND eventcode=4* index::solsys_es_windows sourcetype::wineventlog]

In the reference search, the only keyword pulled from the search into the LISPY is 4*. This means that Splunk will pull ANY event containing a keyword that begins with 4 under the corresponding index and sourcetype, and those events make up the “scanned event count” (29,560). Following this, it will match those events against your search request and discard any that don’t meet the requirements, which results in a total of 11,116 matched events.

Our test query using the TERM function yields a different LISPY. In this case we’ve asked Splunk to isolate only events with a keyword matching eventcode=4*, which results in a much smaller scanned event count. In fact, the scanned count and the matched count are 1-to-1: out of the 11,116 events scanned using the LISPY, we’ve matched 100%. As a result, our search runtime has also decreased.
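The gap between scanned and matched counts can be mimicked with a toy model. The Python sketch below is only an illustration under simplifying assumptions: the four events are invented, the tokens helper is a cut-down stand-in for Splunk’s segmentation, and the counts have nothing to do with the real figures quoted above.

```python
import re

def tokens(raw):
    """Simplified segmentation: major segments plus their minor pieces."""
    out = set()
    for major in re.split(r"[\s\"',;()\[\]]+", raw.lower()):
        if major:
            out.add(major)
            out.update(t for t in re.split(r"[=.:/_\-]+", major) if t)
    return out

# Hypothetical events, invented for illustration.
events = [
    "EventCode=4624 Logon_Type=3",
    "EventCode=4634 Logon_Type=3",
    "EventCode=5156 Port=443",
    "EventCode=1102 RecordNumber=4711",
]

def scan(prefix):
    """Events holding at least one keyword that starts with `prefix`."""
    return [e for e in events if any(t.startswith(prefix) for t in tokens(e))]

def matches(event):
    """The actual search criterion: EventCode=4*."""
    return re.search(r"\bEventCode=4", event) is not None

reference_scanned = scan("4")        # mimics the LISPY keyword 4*
term_scanned = scan("eventcode=4")   # mimics the LISPY keyword eventcode=4*

print(len(reference_scanned), sum(map(matches, reference_scanned)))
print(len(term_scanned), sum(map(matches, term_scanned)))
```

In this toy data set the bare 4* keyword drags in every event, because a port number and a record number also start with 4, while the eventcode=4* keyword scans only the events that end up matching.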

To use TERM, the keyword you are attempting to isolate must be bounded by major breakers and must not contain any major breakers itself. The TERM function works best when the keyword you are searching for has one or more minor breakers throughout the keyword. For example, an IP address would result in a LISPY of [AND 44 34 24 111], whereas wrapping it in the TERM function would keep the whole address together as a single keyword. Keep in mind that when trying to optimize your query with the TERM function, it is best to compare the original search results with the new query results within the same, fixed time frame.
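The “bounded by major breakers” requirement is easy to get wrong, so here is a small Python sketch of the check. It is an approximation only: the breaker list is a simplified subset, term_can_match is a made-up helper name, and the events and the address 10.1.2.3 are hypothetical.

```python
import re

# Simplified subset of the default major breakers (assumption).
MAJOR = r"[\s\"',;()\[\]<>{}|!&?+]+"

def term_can_match(raw_event, term):
    """True if `term` appears as a whole major-breaker-delimited segment."""
    segments = {s.lower() for s in re.split(MAJOR, raw_event) if s}
    return term.lower() in segments

# Bare address surrounded by spaces: bounded by major breakers.
print(term_can_match("connection from 10.1.2.3 accepted", "10.1.2.3"))

# Address preceded by "=", a minor breaker: not bounded on the left.
print(term_can_match("src_ip=10.1.2.3 action=allowed", "10.1.2.3"))

# Including the field name makes the keyword bounded again.
print(term_can_match("src_ip=10.1.2.3 action=allowed", "src_ip=10.1.2.3"))
```

Under these assumptions the first and third checks succeed and the second fails, which is why it pays to look at the raw event before deciding what to put inside TERM.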