KubeSense

Search Processing Language (SPL)

The Search Processing Language (SPL) is KubeSense's built-in, pipeline-style query language for searching, filtering, transforming, and aggregating log data in real time. Inspired by Splunk SPL and CloudWatch Logs Insights, SPL lets you compose queries as a series of commands separated by the pipe character |, where the output of each command becomes the input to the next.

SPL is available directly in the Log Explorer search bar. Toggle to advanced query mode and start writing SPL queries to unlock powerful analytical capabilities beyond simple filtering.


How SPL Works

Every SPL query is a pipeline. You start with a data selection or search, then pipe the results through one or more transformation stages:

<source or search> | <command> | <command> | ...

All queries run against your ingested log data and are automatically scoped to the selected time range and cluster(s) in the Log Explorer.

A Simple Example

fields @timestamp, body, level
| filter level = "ERROR"
| sort @timestamp desc
| limit 50

This query selects three fields, filters for ERROR logs, sorts by newest first, and returns the top 50 results.

A Complex Example

fields @timestamp, workload as application_name, level as severity, body as text
| filter application_name != "" and severity in ["ERROR", "WARN", "INFO"]
| stats count(*) as total, sum(severity = "ERROR") as errors, sum(severity = "WARN") as warns by application_name
| eval error_rate = errors / total
| sort error_rate desc

This query computes the error rate for every application in your cluster, producing a table with total logs, error count, warning count, and the calculated error rate per workload.


Built-in Field Names

SPL provides built-in aliases that map directly to your log data columns. Use these in any SPL command:

Alias            Description
@timestamp       Log event time (UTC)
@message         Raw log message text
body             Same as @message
level            Log severity (ERROR, WARN, INFO, etc.)
cluster          Kubernetes cluster name
namespace        Kubernetes namespace
workload         Workload name (Deployment, StatefulSet, etc.)
pod_name         Pod name
container_name   Container name
host             Node hostname
service          Service name
source           Log source identifier
node_name        Kubernetes node name
env_type         Environment type

Accessing Parsed Log Fields

For structured logs (JSON, logfmt, etc.), use the log_processed. prefix to access extracted fields:

filter log_processed.status = "200"
stats count(*) as total by log_processed.url

Parsed fields are resolved in priority order: string attributes → float attributes → JSON-parsed body.
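As a sketch of how this works in practice (assuming a structured log whose parsed output includes a `duration` field; the field name is illustrative), parsed fields combine with the type functions documented below:

```
fields @timestamp, log_processed.duration as dur
| filter ispresent(dur)
| eval dur_ms = tonumber(dur) * 1000
| sort dur_ms desc
| limit 20
```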


Command Reference

Command Overview

Command                   Description
"keyword"                 Bare string search — implicit full-text substring match on body
fields                    Select specific columns or compute derived expressions
filter / where / search   Filter rows by a condition
parse                     Extract a value from a field using a glob pattern
rex                       Extract named groups using a regular expression
eval                      Compute new columns using expressions
stats                     Aggregate data with functions and optional GROUP BY
timechart                 Time-series aggregation with a configurable interval
dedup                     Remove duplicate rows by one or more fields
head                      Keep the first N rows
tail                      Keep the last N rows (by timestamp)
sort                      Order results by one or more fields
limit                     Restrict the number of returned rows
rename                    Rename columns
table                     Select only the listed columns for the final output
top                       Return the N most frequent values of a field
rare                      Return the N least frequent values of a field

"keyword"

Write one or more quoted strings without any command keyword to search the raw log body for substrings. Each string matches case-insensitively against the body field. This is the fastest way to find logs containing specific keywords.

Syntax:

"keyword"
"keyword1" and "keyword2"
"keyword1" or "keyword2"

Examples:

Search for all logs containing "KYC":

"KYC"

Search for logs containing multiple keywords:

"KYC" and "request time:" and "makePostRequestToNPCI"
| sort @timestamp desc
| limit 50

Search for either "timeout" or "connection refused" and aggregate by workload:

"timeout" or "connection refused"
| stats count(*) as occurrences by workload, namespace
| sort occurrences desc

Bare string search can also begin a pipeline — pipe additional commands after the search term:

"KYC" and "request time:"
| parse @message 'request time: * ms;' as latency
| filter ispresent(latency)
| timechart span=1m avg(latency) as avg_latency

info: Only quoted strings are supported as bare search terms. Bare strings must come before any pipe commands.


fields

Selects columns or computes derived expressions. Use this to control which fields appear in your output and to create calculated columns.

Syntax:

fields expr [as alias], expr [as alias], ...

Examples:

Select basic fields:

fields @timestamp, body, level

Select and rename parsed fields:

fields @timestamp, log_processed.url as url, log_processed.status as status

Create a boolean column from an expression:

fields (level = "ERROR") as is_error

Parenthesized boolean expressions are converted to 0/1 integers, which can be used in downstream aggregations.
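For instance, these 0/1 columns feed directly into sum for a per-namespace error ratio (a sketch built only from the built-in aliases):

```
fields (level = "ERROR") as is_error
| stats sum(is_error) as errors, count(*) as total by namespace
| eval error_ratio = errors / total
| sort error_ratio desc
```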


filter / where / search

Filters rows based on a condition. All three command names are interchangeable. When placed before stats, the condition becomes a WHERE clause; when placed after stats, it becomes a HAVING clause for post-aggregation filtering.

Syntax:

filter <condition>

Operators:

Operator           Description           Example
=                  Equal                 level = "ERROR"
!=                 Not equal             level != "DEBUG"
<, >, <=, >=       Numeric comparisons   log_processed.status >= "500"
and / or / not     Logical connectives   level = "ERROR" and namespace = "prod"
in [v1, v2, ...]   Membership test       level in ["ERROR", "WARN"]
like "pattern%"    SQL LIKE pattern      body like "%timeout%"

Examples:

Basic filter:

filter level = "ERROR"

Combined conditions:

filter level in ["ERROR", "WARN"] and namespace = "production"

Negation:

filter not (level = "INFO")

Check if a parsed field exists:

filter ispresent(log_processed.url)

Post-aggregation filter (HAVING):

stats count(*) as total by level
| filter total > 100
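Both placements can appear in one pipeline: the first filter acts as a WHERE clause before aggregation, the second as a HAVING clause after it. A sketch:

```
filter namespace = "production" and level in ["ERROR", "WARN"]
| stats count(*) as total by workload
| filter total > 50
| sort total desc
```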

parse

Extracts a value from a field using a glob-style pattern where * is the wildcard capture group.

Syntax:

parse <field> '<pattern>' as <alias>

Example:

Extract a username from a JSON-like log body:

parse body '"user":"*"' as username

Extract request ID and status code:

parse body '"requestId":"*"' as request_id
| parse body '"statusCode":*,' as status_code
| fields @timestamp, request_id, status_code, body
| filter status_code != "200"
| sort @timestamp desc

rex

Extracts one or more named groups from a field using a full regular expression. Named groups use (?P<name>...) syntax.

Syntax:

rex field=<field> "<regex>"

Each named group becomes a new column in the output.

Examples:

Extract HTTP method and path:

rex field=body "(?P<method>[A-Z]+) (?P<path>/[^ ]*) HTTP/[0-9.]+"
| stats count(*) as hits by method, path
| sort hits desc
| limit 25

Extract API version and resource from a URL:

rex field=log_processed.url "/api/(?P<version>v[0-9]+)/(?P<resource>[^/]+)"

eval

Computes new columns from expressions. When placed before stats, the columns are available for aggregation. When placed after stats, they operate on the aggregated result.

Syntax:

eval <alias> = <expr>, <alias> = <expr>, ...

Examples:

Create a boolean error flag:

eval is_error = (level = "ERROR")

Conditional scoring:

eval score = if(level = "ERROR", 10, if(level = "WARN", 5, 1))

String concatenation:

eval full_path = log_processed.service + "/" + log_processed.endpoint

Unit conversion:

eval duration_ms = toFloat64(log_processed.duration) * 1000
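Both placements of eval can also appear in one pipeline: the first computes a column for aggregation, the second derives a ratio from the aggregated result. A sketch (log_processed.duration is an illustrative parsed field):

```
eval dur = tonumber(log_processed.duration)
| stats avg(dur) as avg_dur, perc95(dur) as p95_dur by service
| eval tail_ratio = p95_dur / avg_dur
| sort tail_ratio desc
```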

Eval Functions Reference

Math Functions:

Function          Description
abs(x)            Absolute value
ceil(x)           Ceiling
floor(x)          Floor
round(x)          Round to nearest integer
sqrt(x)           Square root
exp(x)            e^x
log(x)            Natural logarithm
log10(x)          Base-10 logarithm
sin(x) / cos(x)   Trigonometric functions
pi()              π constant
random()          Random float in [0, 1)

String Functions:

Function                        Description
len(s)                          String length
lower(s) / upper(s)             Case conversion
ltrim(s) / rtrim(s) / trim(s)   Whitespace trimming
replace(s, pattern, repl)       Regex replacement
split(s, delim)                 Split string into array
substr(s, start, len)           Substring (0-indexed start)
match(s, regex)                 Returns 1 if string matches regex
md5(s)                          MD5 hash
urldecode(s)                    URL-decode a string
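A sketch combining several of these string functions (the search pattern is illustrative):

```
eval msg_lower = lower(trim(body)), msg_len = len(body)
| filter msg_lower like "%timeout%"
| stats count(*) as hits, avg(msg_len) as avg_len by workload
| sort hits desc
```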

Type Functions:

Function                          Description
tonumber(s)                       Parse string as float64
typeof(x)                         Returns the type name of x
isstr(x) / isint(x) / isbool(x)   Type checks

Conditional Functions:

Function                                       Description
if(cond, then, else)                           Conditional expression
case(cond1, val1, cond2, val2, ..., default)   Multi-branch conditional
in(x, a, b, ...)                               Returns 1 if x equals any of a, b, ...
coalesce(a, b, ...)                            First non-empty value
nullif(a, b)                                   NULL if a = b, else a
null()                                         NULL literal
isnull(x)                                      True if x is NULL
ispresent(x)                                   True if x is non-empty string
isblank(x)                                     True if x is empty string
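For example, coalesce and if compose for normalizing sparse fields and weighting severities (a sketch; the fallback value and weights are illustrative):

```
eval svc = coalesce(service, workload, "unknown"), weight = if(level = "ERROR", 10, 1)
| stats sum(weight) as weighted_volume by svc
| sort weighted_volume desc
```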

Date/Time Functions:

Function            Description
strftime(fmt, ts)   Format timestamp
strptime(s, fmt)    Parse timestamp string
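A sketch of bucketing by calendar day with strftime (assuming common strftime format specifiers such as %Y-%m-%d):

```
eval day = strftime("%Y-%m-%d", @timestamp)
| stats count(*) as total by day
| sort day asc
```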

stats

Aggregates data, producing one row per unique combination of by fields. This is the primary command for building summaries, counts, averages, and other statistical views of your log data.

Syntax:

stats <aggr_fn>(<expr>) as <alias>, ... [by <field>, ...]

Aggregation Functions

Function              Description
count(*) / count(x)   Row count / non-null count
sum(x)                Sum (as float64)
avg(x)                Average
min(x) / max(x)       Minimum / maximum
dc(x)                 Distinct count
median(x)             Median value
mode(x)               Most frequent value
perc<N>(x)            N-th percentile (e.g., perc95(latency))
range(x)              max(x) − min(x)
stdev(x)              Sample standard deviation
stdevp(x)             Population standard deviation
var(x)                Sample variance
sumsq(x)              Sum of squares
latest(x)             Value at the latest timestamp
earliest(x)           Value at the earliest timestamp
values(x) / list(x)   Array of all values

Examples:

Count logs by severity:

stats count(*) as total by level

API performance analysis:

stats count(*) as requests, avg(log_processed.duration) as avg_dur, perc95(log_processed.duration) as p95_dur by log_processed.url

Active pods and latest severity per workload:

stats dc(pod_name) as active_pods, latest(level) as last_level by workload

timechart

Creates a time-series aggregation, grouping results into time buckets. This is the primary command for building trend visualizations and time-based dashboards.

Syntax:

timechart [span=<N><unit>] <aggr_fn>(<expr>) as <alias> [by <field>]

Span units: s (seconds), m (minutes), h (hours), d (days)

When no span is specified, the default interval is 1 minute. The time bucket column is named _time.

Examples:

Request count per 5-minute bucket:

timechart span=5m count(*) as req_count

Error count by severity per hour:

timechart span=1h count(*) as errors by level

Average latency per service over 15-minute intervals:

timechart span=15m avg(log_processed.duration) as avg_latency by log_processed.service

dedup

Removes duplicate rows, keeping only the most recent (by timestamp) occurrence of each unique combination of the specified fields.

Syntax:

dedup [keepcount=N] <field>, <field>, ...

keepcount defaults to 1 if omitted.

Examples:

Keep only the latest log per pod:

dedup pod_name

Keep the 3 most recent logs per cluster/namespace combination:

dedup keepcount=3 cluster, namespace

head / tail

Keep only the first or last N rows. tail orders by timestamp descending to retrieve the most recent events.

Syntax:

head [N]
tail [N]

Default N is 10.

Examples:

head 100
tail 50

sort

Orders results by one or more fields.

Syntax:

sort <field> [asc|desc], <field> [asc|desc], ...

Default direction is ascending.

Examples:

Sort by timestamp, newest first:

sort @timestamp desc

Multi-field sort:

sort level asc, @timestamp desc

Sort aggregated results:

sort total desc

limit

Restricts the total number of returned rows. A default limit of 1000 is applied automatically when no limit command is present.

Syntax:

limit <N>

Example:

limit 500

rename

Renames one or more columns in the result.

Syntax:

rename <old_name> as <new_name>, ...

Example:

stats count(*) as c by level
| rename c as event_count, level as severity

table

Projects the result to only the listed columns. Use this as the final stage of a pipeline to control exactly which fields appear in the output.

Syntax:

table <field>, <field>, ...

Example:

fields @timestamp, body, level, namespace, pod_name
| filter level = "ERROR"
| table @timestamp, pod_name, body

top / rare

Return the most or least frequent values of a field, with their counts.

Syntax:

top [N] <field> [by <group_field>]
rare [N] <field> [by <group_field>]

Default N is 10.

Examples:

Top 5 namespaces by log volume:

top 5 namespace

Top 10 URLs by cluster:

top 10 log_processed.url by cluster

Least common pod names:

rare 3 pod_name

warn: top and rare run as standalone aggregations and cannot be combined with other pipeline stages.


Practical Examples

Find Recent ERROR Logs

filter level = "ERROR"
| sort @timestamp desc
| limit 50

Error Count by Service Over Time

filter level = "ERROR"
| timechart span=10m count(*) as errors by service

Top Slow API Endpoints

fields log_processed.url as url, log_processed.duration as dur
| filter ispresent(dur)
| stats count(*) as requests, avg(dur) as avg_dur, perc95(dur) as p95 by url
| filter requests > 10
| sort p95 desc
| limit 20

Extract Structured Data with parse

parse body '"requestId":"*"' as request_id
| parse body '"statusCode":*,' as status_code
| fields @timestamp, request_id, status_code, body
| filter status_code != "200"
| sort @timestamp desc

Extract HTTP Method and Path with rex

rex field=body "(?P<method>[A-Z]+) (?P<path>/[^ ]*) HTTP/[0-9.]+"
| stats count(*) as hits by method, path
| sort hits desc
| limit 25

Compute a Risk Score with eval

eval severity_score = if(level = "ERROR", 3, if(level = "WARN", 1, 0))
| stats sum(severity_score) as risk, count(*) as total by workload
| eval risk_per_event = risk / total
| sort risk_per_event desc
| limit 15

Deduplicate by Pod — Most Recent Event

filter level = "ERROR"
| dedup pod_name
| table @timestamp, pod_name, namespace, body
| sort @timestamp desc

Bare String Search + Latency Extraction

"KYC" and "request time:"
| parse @message 'request time: * ms;' as latency
| filter ispresent(latency)
| timechart span=1m avg(latency) as avg_latency

Full Pipeline — Server Error Analysis

fields @timestamp, log_processed.url as url, log_processed.status as status, log_processed.duration as dur
| filter status in ["500", "502", "503", "504"]
| eval is_timeout = (status = "504")
| stats count(*) as errors, sum(is_timeout) as timeouts, avg(dur) as avg_dur by url
| filter errors > 5
| eval timeout_rate = timeouts / errors
| sort timeout_rate desc
| limit 10

Count Failed Requests by Message

fields *
| filter body like "%Request failed%"
| stats count(*) as failures by @message

Per-Application Error Rate Dashboard

fields @timestamp, workload as application_name, level as severity, body as text
| filter application_name != "" and severity in ["ERROR", "WARN", "INFO"]
| stats count(*) as total, sum(severity = "ERROR") as errors, sum(severity = "WARN") as warns by application_name
| eval error_rate = errors / total
| sort error_rate desc

Performance Tips

  • Narrow your time range — Always select the smallest possible time window in the Log Explorer to improve query performance.
  • Use explicit limits — Add | limit N to your queries. The default limit is 1000 rows, but specifying it explicitly ensures predictable results.
  • Filter early — Place filter commands as early as possible in your pipeline to reduce the amount of data processed by downstream commands.
  • Prefer string comparisons for parsed fields — log_processed fields are resolved via attribute maps and JSON extraction. Use string literal comparisons over numeric ones for best performance.
  • Use head/tail for non-aggregated queries — Long-running queries without aggregation should use head or tail to cap the result set.
  • Query timeout — Queries have a 300-second timeout. For large datasets, use aggregations (stats, timechart) rather than scanning raw rows.
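Putting several of these tips together, a performance-conscious query filters first, aggregates rather than scanning raw rows, and caps its output explicitly. A sketch:

```
filter namespace = "production" and level = "ERROR"
| stats count(*) as errors by workload
| sort errors desc
| limit 20
```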

Error Reference

If your SPL query fails, the Log Explorer will display an error message. Common error conditions include:

Error               Description
Parse error         Syntax error in your query — check for missing quotes, unbalanced parentheses, or invalid operators
Translation error   The query is syntactically valid but cannot be translated to an executable form
Execution error     The query ran but encountered a runtime error (e.g., type mismatch, division by zero)
Query timeout       The query exceeded the 300-second timeout — narrow your time range or add aggregation