KubeSense

Search Processing Language (SPL)

The Search Processing Language (SPL) is KubeSense's built-in, pipeline-style query language for searching, filtering, transforming, and aggregating log data in real time. Inspired by Splunk SPL and CloudWatch Logs Insights, SPL lets you compose queries as a series of commands separated by the pipe character |, where the output of each command becomes the input to the next.

SPL is available directly in the Log Explorer search bar. Toggle to advanced query mode and start writing SPL queries to unlock powerful analytical capabilities beyond simple filtering.


How SPL Works

Every SPL query is a pipeline. You start with a data selection or search, then pipe the results through one or more transformation stages:

<source or search> | <command> | <command> | ...

All queries run against your ingested log data and are automatically scoped to the selected time range and cluster(s) in the Log Explorer.

A Simple Example

fields @timestamp, body, level
| filter level = "ERROR"
| sort @timestamp desc
| limit 50

This query selects three fields, filters for ERROR logs, sorts by newest first, and returns the top 50 results.

A Complex Example

fields @timestamp, workload as application_name, level as severity, body as text
| filter application_name != "" and severity in ["ERROR", "WARN", "INFO"]
| stats count(*) as total, sum(severity = "ERROR") as errors, sum(severity = "WARN") as warns by application_name
| eval error_rate = errors / total
| sort error_rate desc

This query computes the error rate for every application in your cluster, producing a table with total logs, error count, warning count, and the calculated error rate per workload.


Built-in Field Names

SPL provides built-in aliases that map directly to your log data columns. Use these in any SPL command:

Alias            Description
@timestamp       Log event time (UTC)
@message         Raw log message text
body             Same as @message
level            Log severity (ERROR, WARN, INFO, etc.)
cluster          Kubernetes cluster name
namespace        Kubernetes namespace
workload         Workload name (Deployment, StatefulSet, etc.)
pod_name         Pod name
container_name   Container name
host             Node hostname
service          Service name
source           Log source identifier
node_name        Kubernetes node name
env_type         Environment type

Accessing Parsed Log Fields

For structured logs (JSON, logfmt, etc.), use the log_processed. prefix to access extracted fields:

filter log_processed.status = "200"
stats count(*) as total by log_processed.url

Parsed fields are resolved in priority order: string attributes → float attributes → JSON-parsed body.
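As a sketch of how this works in practice (assuming a structured log whose parsed output includes a `duration` field; the field name is illustrative), parsed fields combine with the type functions documented below:

```
fields @timestamp, log_processed.duration as dur
| filter ispresent(dur)
| eval dur_ms = tonumber(dur) * 1000
| sort dur_ms desc
| limit 20
```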


Command Reference

Command Overview

Command                   Description
"keyword"                 Bare string search — implicit full-text substring match on body
fields                    Select specific columns or compute derived expressions
filter / where / search   Filter rows by a condition
parse                     Extract a value from a field using a glob pattern
rex                       Extract named groups using a regular expression
eval                      Compute new columns using expressions
stats                     Aggregate data with functions and optional GROUP BY
timechart                 Time-series aggregation with a configurable interval
dedup                     Remove duplicate rows by one or more fields
head                      Keep the first N rows
tail                      Keep the last N rows (by timestamp)
sort                      Order results by one or more fields
limit                     Restrict the number of returned rows
rename                    Rename columns
table                     Select only the listed columns for the final output
top                       Return the N most frequent values of a field
rare                      Return the N least frequent values of a field

"keyword"

Write one or more quoted strings without any command keyword to search the raw log body for substrings. Each string matches case-insensitively against the body field. This is the fastest way to find logs containing specific keywords.

Syntax:

"keyword"
"keyword1" and "keyword2"
"keyword1" or "keyword2"

Examples:

Search for all logs containing "KYC":

"KYC"

Search for logs containing multiple keywords:

"KYC" and "request time:" and "makePostRequestToNPCI"
| sort @timestamp desc
| limit 50

Search for either "timeout" or "connection refused" and aggregate by workload:

"timeout" or "connection refused"
| stats count(*) as occurrences by workload, namespace
| sort occurrences desc

Bare string search can also begin a pipeline — pipe additional commands after the search term:

"KYC" and "request time:"
| parse @message 'request time: * ms;' as latency
| filter ispresent(latency)
| timechart span=1m avg(latency) as avg_latency

info: Only quoted strings are supported as bare search terms. Bare strings must come before any pipe commands.


fields

Selects columns or computes derived expressions. Use this to control which fields appear in your output and to create calculated columns.

Syntax:

fields expr [as alias], expr [as alias], ...

Examples:

Select basic fields:

fields @timestamp, body, level

Select and rename parsed fields:

fields @timestamp, log_processed.url as url, log_processed.status as status

Create a boolean column from an expression:

fields (level = "ERROR") as is_error

Parenthesized boolean expressions are converted to 0/1 integers, which can be used in downstream aggregations.
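For instance, these 0/1 columns feed directly into sum for a per-namespace error ratio (a sketch built only from the built-in aliases):

```
fields (level = "ERROR") as is_error
| stats sum(is_error) as errors, count(*) as total by namespace
| eval error_ratio = errors / total
| sort error_ratio desc
```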


filter / where / search

Filters rows based on a condition. All three command names are interchangeable. When placed before stats, the condition becomes a WHERE clause; when placed after stats, it becomes a HAVING clause for post-aggregation filtering.

Syntax:

filter <condition>

Operators:

Operator           Description           Example
=                  Equal                 level = "ERROR"
!=                 Not equal             level != "DEBUG"
<, >, <=, >=       Numeric comparisons   log_processed.status >= "500"
and / or / not     Logical connectives   level = "ERROR" and namespace = "prod"
in [v1, v2, ...]   Membership test       level in ["ERROR", "WARN"]
like "pattern%"    SQL LIKE pattern      body like "%timeout%"

Examples:

Basic filter:

filter level = "ERROR"

Combined conditions:

filter level in ["ERROR", "WARN"] and namespace = "production"

Negation:

filter not (level = "INFO")

Check if a parsed field exists:

filter ispresent(log_processed.url)

Post-aggregation filter (HAVING):

stats count(*) as total by level
| filter total > 100
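Both placements can appear in one pipeline: the first filter acts as a WHERE clause before aggregation, the second as a HAVING clause after it. A sketch:

```
filter namespace = "production" and level in ["ERROR", "WARN"]
| stats count(*) as total by workload
| filter total > 50
| sort total desc
```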

parse

Extracts a value from a field using a glob-style pattern where * is the wildcard capture group.

Syntax:

parse <field> '<pattern>' as <alias>

Example:

Extract a username from a JSON-like log body:

parse body '"user":"*"' as username

Extract request ID and status code:

parse body '"requestId":"*"' as request_id
| parse body '"statusCode":*,' as status_code
| fields @timestamp, request_id, status_code, body
| filter status_code != "200"
| sort @timestamp desc

rex

Extracts one or more named groups from a field using a full regular expression. Named groups use (?P<name>...) syntax.

Syntax:

rex field=<field> "<regex>"

Each named group becomes a new column in the output.

Examples:

Extract HTTP method and path:

rex field=body "(?P<method>[A-Z]+) (?P<path>/[^ ]*) HTTP/[0-9.]+"
| stats count(*) as hits by method, path
| sort hits desc
| limit 25

Extract API version and resource from a URL:

rex field=log_processed.url "/api/(?P<version>v[0-9]+)/(?P<resource>[^/]+)"

eval

Computes new columns from expressions. When placed before stats, the columns are available for aggregation. When placed after stats, they operate on the aggregated result.

Syntax:

eval <alias> = <expr>, <alias> = <expr>, ...

Examples:

Create a boolean error flag:

eval is_error = (level = "ERROR")

Conditional scoring:

eval score = if(level = "ERROR", 10, if(level = "WARN", 5, 1))

String concatenation:

eval full_path = log_processed.service + "/" + log_processed.endpoint

Unit conversion:

eval duration_ms = toFloat64(log_processed.duration) * 1000
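Both placements of eval can also appear in one pipeline: the first computes a column for aggregation, the second derives a ratio from the aggregated result. A sketch (log_processed.duration is an illustrative parsed field):

```
eval dur = tonumber(log_processed.duration)
| stats avg(dur) as avg_dur, perc95(dur) as p95_dur by service
| eval tail_ratio = p95_dur / avg_dur
| sort tail_ratio desc
```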

Eval Functions Reference

Math Functions:

Function          Description
abs(x)            Absolute value
ceil(x)           Ceiling
floor(x)          Floor
round(x)          Round to nearest integer
sqrt(x)           Square root
exp(x)            e^x
log(x)            Natural logarithm
log10(x)          Base-10 logarithm
sin(x) / cos(x)   Trigonometric functions
pi()              π constant
random()          Random float in [0, 1)

String Functions:

Function                        Description
len(s)                          String length
lower(s) / upper(s)             Case conversion
ltrim(s) / rtrim(s) / trim(s)   Whitespace trimming
replace(s, pattern, repl)       Regex replacement
split(s, delim)                 Split string into array
substr(s, start, len)           Substring (0-indexed start)
match(s, regex)                 Returns 1 if string matches regex
md5(s)                          MD5 hash
urldecode(s)                    URL-decode a string
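A sketch combining several of these string functions (the search pattern is illustrative):

```
eval msg_lower = lower(trim(body)), msg_len = len(body)
| filter msg_lower like "%timeout%"
| stats count(*) as hits, avg(msg_len) as avg_len by workload
| sort hits desc
```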

Type Functions:

Function                          Description
tonumber(s)                       Parse string as float64
typeof(x)                         Returns the type name of x
isstr(x) / isint(x) / isbool(x)   Type checks

Conditional Functions:

Function                                       Description
if(cond, then, else)                           Conditional expression
case(cond1, val1, cond2, val2, ..., default)   Multi-branch conditional
in(x, a, b, ...)                               Returns 1 if x equals any of a, b, ...
coalesce(a, b, ...)                            First non-empty value
nullif(a, b)                                   NULL if a = b, else a
null()                                         NULL literal
isnull(x)                                      True if x is NULL
ispresent(x)                                   True if x is non-empty string
isblank(x)                                     True if x is empty string
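For example, coalesce and if compose for normalizing sparse fields and weighting severities (a sketch; the fallback value and weights are illustrative):

```
eval svc = coalesce(service, workload, "unknown"), weight = if(level = "ERROR", 10, 1)
| stats sum(weight) as weighted_volume by svc
| sort weighted_volume desc
```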

Date/Time Functions:

Function            Description
strftime(fmt, ts)   Format timestamp
strptime(s, fmt)    Parse timestamp string
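A sketch of bucketing by calendar day with strftime (assuming common strftime format specifiers such as %Y-%m-%d):

```
eval day = strftime("%Y-%m-%d", @timestamp)
| stats count(*) as total by day
| sort day asc
```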

stats

Aggregates data, producing one row per unique combination of by fields. This is the primary command for building summaries, counts, averages, and other statistical views of your log data.

Syntax:

stats <aggr_fn>(<expr>) as <alias>, ... [by <field>, ...]

Aggregation Functions

Function              Description
count(*) / count(x)   Row count / non-null count
sum(x)                Sum (as float64)
avg(x)                Average
min(x) / max(x)       Minimum / maximum
dc(x)                 Distinct count
median(x)             Median value
mode(x)               Most frequent value
perc<N>(x)            N-th percentile (e.g., perc95(latency))
range(x)              max(x) − min(x)
stdev(x)              Sample standard deviation
stdevp(x)             Population standard deviation
var(x)                Sample variance
sumsq(x)              Sum of squares
latest(x)             Value at the latest timestamp
earliest(x)           Value at the earliest timestamp
values(x) / list(x)   Array of all values

Examples:

Count logs by severity:

stats count(*) as total by level

API performance analysis:

stats count(*) as requests, avg(log_processed.duration) as avg_dur, perc95(log_processed.duration) as p95_dur by log_processed.url

Active pods and latest severity per workload:

stats dc(pod_name) as active_pods, latest(level) as last_level by workload

timechart

Creates a time-series aggregation, grouping results into time buckets. This is the primary command for building trend visualizations and time-based dashboards.

Syntax:

timechart [span=<N><unit>] <aggr_fn>(<expr>) as <alias> [by <field>]

Span units: s (seconds), m (minutes), h (hours), d (days)

When no span is specified, the default interval is 1 minute. The time bucket column is named _time.

Examples:

Request count per 5-minute bucket:

timechart span=5m count(*) as req_count

Error count by severity per hour:

timechart span=1h count(*) as errors by level

Average latency per service over 15-minute intervals:

timechart span=15m avg(log_processed.duration) as avg_latency by log_processed.service

dedup

Removes duplicate rows, keeping only the most recent (by timestamp) occurrence of each unique combination of the specified fields.

Syntax:

dedup [keepcount=N] <field>, <field>, ...

keepcount defaults to 1 if omitted.

Examples:

Keep only the latest log per pod:

dedup pod_name

Keep the 3 most recent logs per cluster/namespace combination:

dedup keepcount=3 cluster, namespace

head / tail

Keep only the first or last N rows. tail orders by timestamp descending to retrieve the most recent events.

Syntax:

head [N]
tail [N]

Default N is 10.

Examples:

head 100
tail 50

sort

Orders results by one or more fields.

Syntax:

sort <field> [asc|desc], <field> [asc|desc], ...

Default direction is ascending.

Examples:

Sort by timestamp, newest first:

sort @timestamp desc

Multi-field sort:

sort level asc, @timestamp desc

Sort aggregated results:

sort total desc

limit

Restricts the total number of returned rows. A default limit of 1000 is applied automatically when no limit command is present.

Syntax:

limit <N>

Example:

limit 500

rename

Renames one or more columns in the result.

Syntax:

rename <old_name> as <new_name>, ...

Example:

stats count(*) as c by level
| rename c as event_count, level as severity

table

Projects the result to only the listed columns. Use this as the final stage of a pipeline to control exactly which fields appear in the output.

Syntax:

table <field>, <field>, ...

Example:

fields @timestamp, body, level, namespace, pod_name
| filter level = "ERROR"
| table @timestamp, pod_name, body

top / rare

Return the most or least frequent values of a field, with their counts.

Syntax:

top [N] <field> [by <group_field>]
rare [N] <field> [by <group_field>]

Default N is 10.

Examples:

Top 5 namespaces by log volume:

top 5 namespace

Top 10 URLs by cluster:

top 10 log_processed.url by cluster

Least common pod names:

rare 3 pod_name

warn: top and rare run as standalone aggregations and cannot be combined with other pipeline stages.


Practical Examples

Find Recent ERROR Logs

filter level = "ERROR"
| sort @timestamp desc
| limit 50

Error Count by Service Over Time

filter level = "ERROR"
| timechart span=10m count(*) as errors by service

Top Slow API Endpoints

fields log_processed.url as url, log_processed.duration as dur
| filter ispresent(dur)
| stats count(*) as requests, avg(dur) as avg_dur, perc95(dur) as p95 by url
| filter requests > 10
| sort p95 desc
| limit 20

Extract Structured Data with parse

parse body '"requestId":"*"' as request_id
| parse body '"statusCode":*,' as status_code
| fields @timestamp, request_id, status_code, body
| filter status_code != "200"
| sort @timestamp desc

Extract HTTP Method and Path with rex

rex field=body "(?P<method>[A-Z]+) (?P<path>/[^ ]*) HTTP/[0-9.]+"
| stats count(*) as hits by method, path
| sort hits desc
| limit 25

Compute a Risk Score with eval

eval severity_score = if(level = "ERROR", 3, if(level = "WARN", 1, 0))
| stats sum(severity_score) as risk, count(*) as total by workload
| eval risk_per_event = risk / total
| sort risk_per_event desc
| limit 15

Deduplicate by Pod — Most Recent Event

filter level = "ERROR"
| dedup pod_name
| table @timestamp, pod_name, namespace, body
| sort @timestamp desc

Bare String Search + Latency Extraction

"KYC" and "request time:"
| parse @message 'request time: * ms;' as latency
| filter ispresent(latency)
| timechart span=1m avg(latency) as avg_latency

Full Pipeline — Server Error Analysis

fields @timestamp, log_processed.url as url, log_processed.status as status, log_processed.duration as dur
| filter status in ["500", "502", "503", "504"]
| eval is_timeout = (status = "504")
| stats count(*) as errors, sum(is_timeout) as timeouts, avg(dur) as avg_dur by url
| filter errors > 5
| eval timeout_rate = timeouts / errors
| sort timeout_rate desc
| limit 10

Count Failed Requests by Message

fields *
| filter body like "%Request failed%"
| stats count(*) as failures by @message

Per-Application Error Rate Dashboard

fields @timestamp, workload as application_name, level as severity, body as text
| filter application_name != "" and severity in ["ERROR", "WARN", "INFO"]
| stats count(*) as total, sum(severity = "ERROR") as errors, sum(severity = "WARN") as warns by application_name
| eval error_rate = errors / total
| sort error_rate desc

Performance Tips

  • Narrow your time range — Always select the smallest possible time window in the Log Explorer to improve query performance.
  • Use explicit limits — Add | limit N to your queries. The default limit is 1000 rows, but specifying it explicitly ensures predictable results.
  • Filter early — Place filter commands as early as possible in your pipeline to reduce the amount of data processed by downstream commands.
  • Prefer string comparisons for parsed fields — log_processed fields are resolved via attribute maps and JSON extraction. Use string literal comparisons over numeric ones for best performance.
  • Use head/tail for non-aggregated queries — Long-running queries without aggregation should use head or tail to cap the result set.
  • Query timeout — Queries have a 300-second timeout. For large datasets, use aggregations (stats, timechart) rather than scanning raw rows.
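Putting several of these tips together, a performance-conscious query filters first, aggregates rather than scanning raw rows, and caps its output explicitly. A sketch:

```
filter namespace = "production" and level = "ERROR"
| stats count(*) as errors by workload
| sort errors desc
| limit 20
```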

Error Reference

If your SPL query fails, the Log Explorer will display an error message. Common error conditions include:

Error               Description
Parse error         Syntax error in your query — check for missing quotes, unbalanced parentheses, or invalid operators
Translation error   The query is syntactically valid but cannot be translated to an executable form
Execution error     The query ran but encountered a runtime error (e.g., type mismatch, division by zero)
Query timeout       The query exceeded the 300-second timeout — narrow your time range or add aggregation