The complete PVQL syntax is as follows:
<values> [BY <group>] FROM <source>
[WHERE <group-filter>] [HAVING <values-filter>]
[TOP <window-and-ordering>] [SINCE <timestamp>] [UNTIL <timestamp>]
[RAW]
[SETTINGS <key-values>]
A valid PVQL query includes at least one <value>
and one <source>
in the FROM
clause;
the BY
/WHERE
/HAVING
clauses are optional,
and all other clauses default to specific values (defined below).
<values>
and <group>
are expressions built from fields; the
exhaustive list of PVQL fields can be found in the
Definitions page. A field is either a value field
(a metric), or a key field (a group), and can be used in the
corresponding clause.
The fields are the simplest form of expressions, and you can request them as follows:
client.traffic FROM transport
To request multiple values at once, use a comma as separator:
client.traffic, server.traffic FROM transport
Some fields are built from simpler ones; for example, traffic
is defined as
client.traffic + server.traffic
. You can use either one when querying:
traffic FROM transport
client.traffic + server.traffic FROM transport
When requesting multiple complex expressions, it is useful to name them to
distinguish which is which. PVQL supports aliases with the AS
keyword:
client.traffic + server.traffic AS "Total traffic",
client.traffic / (client.traffic + server.traffic) AS "Client Traffic Ratio"
FROM transport
The group clause (with the BY
keyword) specifies how to partition the results: by
time, by zones, applications, IP subnets, etc. You can group by multiple expressions.
client.traffic BY application.name FROM transport
traffic BY client.zone.name, server.zone FROM transport
traffic BY time(600) FROM transport
traffic BY time_exclusion.business_hours, time_exclusion.maintenance_windows
FROM tcp
In the last example, we group the data for each period of 10 minutes
(600 s). A given point in PVQL corresponds to the data that preceded
it. For example, when using BY time(600)
, the point at 12:00:00 is
actually the data accumulated between 11:50:00 and 12:00:00.
The filter specified with the WHERE
keyword lets you limit the query to certain keys.
client.traffic BY layer, application.name FROM transport WHERE application.name != "http"
client.rtt BY client.ip FROM transport WHERE (application.name = "http"
OR application.name = "https") AND server.zone IN "/Local"
traffic FROM tcp WHERE NOT time_exclusion.any
Notice the IN
operator in this last example. It allows for filtering on the "/Local"
zone or any of its children.
The IN
operator may also be used to check if an element belongs to a set:
client.traffic FROM transport WHERE server.port IN {80, 443, 8000..8800}
and for wildcard or pattern matching (using re2 syntax):
client.traffic FROM tls WHERE server_name IN glob("*.google.*")
client.traffic FROM http WHERE host IN re("^www\d")
These functions exist in case-insensitive variants (iglob
, ire
).
There are also functions text
/ itext
for simple string comparisons
(default is case-insensitive).
You can further filter the final result with an HAVING
clause,
to keep only the rows matching the given expression.
client.traffic BY application.name FROM transport HAVING client.traffic >= 10000
Additionnaly, if you need to eliminate values at an early stage
of the query, you can filter fields directly in the WHERE
clause
using the raw.
prefix. Note that filtering this way requires a
certain knowledge of the way we store data.
client.traffic, client.dtt BY application.name FROM transport WHERE raw.client.dtt != 0
The window & ordering clause (which uses the TOP
keyword) allows to order and extract
a portion of the result.
To extract the first 5 results, use the following:
client.traffic BY application.name FROM transport TOP 5
To request 10 lines of data while skipping the first 5 (that is, at offset 5, since offsets start at 0), use the following:
client.traffic, server.traffic BY client.zone.name, server.zone FROM transport TOP 10@5
The ALL
keyword lets you request all lines of data starting at a given offset:
client.traffic, server.traffic BY client.zone.name, server.zone FROM transport TOP ALL@5
As for sorting, PVQL lets you sort on any or all of the values and keys in the
query. The i-th value is referred to by the name vi
, while the j-th key is
referred to by the name kj
. These names are used in the TOP
clause:
client.traffic, server.traffic BY client.zone.name, server.zone FROM transport
TOP 10@5 {v2 ASC, k1 ASC}
The sort direction accepts ASC
or DESC
. By default, the keys are sorted in
ascending order, and the values in descending order.
Some additional names are available in the ordering clause:
v*
refers to v1
, v2
, v3
, …k*
refers to k1
, k2
, k3
, …*
refers to v1
, v2
, v3
, …, k1
, k2
, k3
, …By default the result is sorted by values in descending order unless the query is grouped by time, in which case the default behavior is to return the result in chronological order.
Please note that you can only sort on expressions present in the query.
The FROM
clause lets you specify which layer to request.
global.dtt BY application.name FROM http
The SINCE
/ UNTIL
clauses allow for limitting the query to a
given time interval. They accept either the ISO8601 standard, or time expressions:
global.dtt BY application.name FROM http SINCE @(2018-02-20T10:00:00) UNTIL @now - 60*60
In this example, @now
refers to the current time in seconds,
thus @now - 60*60
refers to 3600 seconds ago (1 hour ago).
PVQL provides a special #
operator if we want to retrieve for a key, the number of distinct elements.
#client.ip FROM tcp
In this example, we request to PVQL the number of distinct client IPs. While without this operator, PVQL will return a set of distinct client IPs.
A PVQL query can provide a SETTINGS
clause to change some internal behaviors
of the query, and can be requested as follows:
client.traffic FROM transport SETTINGS name = value [, ...]
We provide several settings for PVQL:
limit_size_set
, which limits the size of the returned sets, for example when we request a key as a value,
the argument can take a positive number, or a special value ALL
.
client.ip FROM transport SETTINGS limit_size_set = 10
This query above will return the first 10 client IPs found during the execution of the query.
client.ip FROM transport SETTINGS limit_size_set = ALL
While this query will return all client IPs found during the execution of the query.
By default, PVQL returns the first 100 elements on a set.
query_id
is a parameter if we want to specify a specific PVQL query_id
in case we want to cancel the current query,
the query could be cancel using the endpoint query-cancel.
degradation_mode
, this setting is used to know how much PVQL should degrade the query.
The more a query has been degraded, the faster but the less accurate it will be, and the less
a query has been degraded, the slower but more precise it will be.
We have several parameters for this setting:
fast
mode is designed for queries where response time is really important (ex: a dashboard with
many widgets) and where we just need to have an approximation regarding the result of the query. Generally
speaking, only complex queries are degraded: for example queries involving keys or values that are expensive to process, or queries executed on long time ranges.
client.ip FROM transport SETTINGS degradation_mode = "fast"
precise
mode will give more accurate results, while keeping the execution time under control. It’s a middle-ground between accuracy
and execution time, it can be used for a single widget with many values or keys that are complex to process.
client.ip FROM transport SINCE @(2019-01-20T10:00:00) UNTIL @(2019-02-20T10:00:00) SETTINGS degradation_mode = "precise"
unlimited
mode is the most accurate mode and will give the result with the best accuracy as possible, but should be used
only if we don’t have any other choice, or for debugging purpose as the execution time will be unchecked: it might prove too long for real-world queries and may slow down the whole system.
client.ip FROM transport SETTINGS degradation_mode = "unlimited"
Finally, the RAW
clause lets you disable result aggregation and retrieve raw
values stored in the database:
traffic BY application.name FROM transport RAW
This query does not return the total traffic for each application, but the traffic and application of each database record.
A string literal in PVQL is an arbitrary sequence of non-null characters
(characters with nonzero values) bounded by double quotes ("
),
for example "This is a string"
.
To include a double-quote character within a string literal,
write two adjacent double quotes, e.g. "Alice says ""Hello"" to Bob"
.
PVQL also accepts “escape” string literals, bounded by backquotes (
`
). Within an escape string, a backslash character (\
) begins a
backslash escape sequence, in which the combination of backslash and
the following character(s) represents a special character. The
recognized escape sequences are:
Escape Sequence | Meaning |
---|---|
\\ |
Backslash |
\` |
Backquote |
\a |
ASCII Bell (BEL) |
\b |
ASCII Backspace (BS) |
\f |
ASCII Formfeed (FF) |
\n |
ASCII Linefeed (LF) |
\r |
ASCII Carriage Return (CR) |
\t |
ASCII Horizontal Tab (TAB) |
\v |
ASCII Vertical Tab (VT) |
\xhh |
Character with hex value hh |
\uxxxx |
Character with 16-bit hex value xxxx |
\Uxxxxxxxx |
Character with 32-bit hex value xxxxxxxx |
\N{name} |
Character named name in the Unicode database |
PVQL supports variables. This allows you to insert
variables with names you choose (e.g. $server
)
whereever PVQL is expecting a value. When querying
data with such a PVQL expression, you can assign
the variables with custom values. The query is
performed as if all variables were substituted
with the corresponding values.
For example, if you want a PVQL query regarding an application for a specific server, you would usually write something like this:
server.rt, server.dtt FROM tcp WHERE application.name = "NC" AND server.ip = 8.8.8.8
However, you might want to be able to easily choose
another IP address without rewriting the query each
time. For this purpose, you can introduce a variable,
such as $server
, in place of the actual value:
server.rt, server.dtt FROM tcp WHERE application.name = "NC" AND server.ip = $server
Then wherever you request data using this query, you
pass the corresponding value in the variables
parameters.
Example:
=> query expr="server.rt, server.dtt FROM tcp WHERE application.name = \"NC\" AND server.ip = $server" \
.. variables={"server": {"value": "8.8.4.4", "type": {"type": "ip"}}}
In the following examples, it is implied that the queries are applied
on the last hour (because if neither SINCE
nor UNTIL
are provided,
the default time interval is used).
Retrieving the client traffic over time for the “transport” layer (which is the combination of TCP, UDP, ICMP, Non IP and Other IP layers):
client.traffic FROM transport BY time()
Retrieving both the client traffic and the server traffic, over time for each “transport” layer:
client.traffic, server.traffic FROM transport BY time(), layer
Retrieving the client traffic, the server traffic, and the mean of both, over time by points of 5 minutes (300 seconds):
client.traffic, server.traffic, (client.traffic + server.traffic) / 2
BY time(300) FROM transport
Retrieving the total traffic for each /24
IPv4 subnet limited to
those in 192.168.0.0/16
(thus for each 256 192.168.x.0/24
networks):
traffic BY server.ip[24] FROM transport WHERE server.ip[16] = 192.168.0.0
Retrieving the 20 highest average server response times for a matrix of application × client zone:
server.rt BY application.name, client.zone.name FROM transport TOP 20
Retrieving the server traffic per client zone:
server.traffic BY client.zone.name FROM transport
Retrieving the server traffic for a matrix of client zone × application:
server.traffic BY client.zone.name, application.name FROM transport
Retrieving the total traffic by protocol stack for each IPv4 client
within 192.168.0.0/16
:
traffic BY protostack, client.ip FROM transport WHERE client.ip[16] = 192.168.0.0
Retrieving the total traffic from the HTTP layer for the given host:
traffic FROM http WHERE host = "www.google.com"
Retrieving the top 10 total traffic by host from the HTTP layer limited to those matching the given pattern:
traffic BY host FROM http WHERE host IN "*.google.com" TOP 10
The server.ip[n]
expression in these examples refer to subnets.
date(spec: set) -> set
Parse a date in ISO 8601 format as a PVQL time.
date
: a literal string that represents a date in ISO 8601 format.A PVQL time based on the date string.
date(spec: string) -> time
Parse a date in ISO 8601 format as a PVQL time.
date
: a literal string that represents a date in ISO 8601 format.A PVQL time based on the date string.
family(ip: set) -> set
Get the family type of the current IP.
ip
: An IP field.IPv4 or IPv6, depending of the current IP address type.
family(ip: ip) -> ipfamily
Get the family type of the current IP.
ip
: An IP field.IPv4 or IPv6, depending of the current IP address type.
flatten(zone: set, depth: set) -> set
Simplify a zone hierarchy up to a maximum depth level.
zone
: A zone id field.depth
: A literal positive non-negative value that represents the maximum depth of the returned zone path.A parent zone ID, depending on depth
.
flatten(client.zone.id, 1)
where zone hierarchy of client.zone.id = [1, 2]
returns 1
.
flatten(url: set, depth: set) -> set
Return an URL to a maximum depth level using its hierarchical path.
url
: An URL field.depth
: A literal positive non-negative value that represents the maximum depth of its hierarchical path.An URL to a maximum depth level using its hierarchical path.
traffic BY flatten(url, 2) FROM http
where url is http://example.com/a/b/c/d
returns http://example.com/a/b/
.
flatten(url: url, depth: number) -> url
Return an URL to a maximum depth level using its hierarchical path.
url
: An URL field.depth
: A literal positive non-negative value that represents the maximum depth of its hierarchical path.An URL to a maximum depth level using its hierarchical path.
traffic BY flatten(url, 2) FROM http
where url is http://example.com/a/b/c/d
returns http://example.com/a/b/
.
flatten(zone: set, depth: set) -> set
Simplify a zone hierarchy up to a maximum depth level.
zone
: A zone name field.depth
: A literal non-negative value that represents the maximum depth of the returned zone path.A parent zone, depending on depth
.
flatten(client.zone.name, 2)
where client.zone.name = "/All/Public/Documents"
returns /All/Public
.
flatten(zone: zone, depth: number) -> zone
Simplify a zone hierarchy up to a maximum depth level.
zone
: A zone name field.depth
: A literal non-negative value that represents the maximum depth of the returned zone path.A parent zone, depending on depth
.
flatten(client.zone.name, 2)
where client.zone.name = "/All/Public/Documents"
returns /All/Public
.
flatten(zone: zone_id, depth: number) -> zone_id
Simplify a zone hierarchy up to a maximum depth level.
zone
: A zone id field.depth
: A literal positive non-negative value that represents the maximum depth of the returned zone path.A parent zone ID, depending on depth
.
flatten(client.zone.id, 1)
where zone hierarchy of client.zone.id = [1, 2]
returns 1
.
glob(value: set) -> set
Build a wildcard pattern matcher.
Internally, the PVQL wildcard pattern is converted to a case sensitive regex.
value
: A PVQL wildcard pattern.A case sensitive regex.
traffic FROM http WHERE url IN glob("http://*")
returns the traffic from all URLs that use the HTTP scheme.
glob(value: string) -> regex
Build a wildcard pattern matcher.
Internally, the PVQL wildcard pattern is converted to a case sensitive regex.
value
: A PVQL wildcard pattern.A case sensitive regex.
traffic FROM http WHERE url IN glob("http://*")
returns the traffic from all URLs that use the HTTP scheme.
iglob(value: set) -> set
Build a wildcard pattern matcher.
Internally, the PVQL wildcard pattern is converted to a non-case sensitive regex.
value
: A PVQL wildcard pattern.A non-case sensitive regex.
traffic FROM http WHERE url IN glob("http://*")
returns the traffic from all URL using the HTTP protocol.
iglob(value: string) -> regex
Build a wildcard pattern matcher.
Internally, the PVQL wildcard pattern is converted to a non-case sensitive regex.
value
: A PVQL wildcard pattern.A non-case sensitive regex.
traffic FROM http WHERE url IN glob("http://*")
returns the traffic from all URL using the HTTP protocol.
ire(value: string) -> regex
Build with a re2 regex string, as a PVQL case-insensitive regex.
value
: A re2 regex.The same string, converted to a regex type.
traffic FROM http WHERE url IN re("^(http|https)://*.com$")
returns the traffic from HTTP URLs that use the top-level domain .com
.
ire(value: set) -> set
Build with a re2 regex string, as a PVQL case-insensitive regex.
value
: A re2 regex.The same string, converted to a regex type.
traffic FROM http WHERE url IN re("^(http|https)://*.com$")
returns the traffic from HTTP URLs that use the top-level domain .com
.
itext(value: set) -> set
Force PVQL to apply a case insentitive comparaison on filtering.
value
: A string.The same string, with an internal marker for doing case-insensitive operations.
query FROM databases WHERE itext(system) = itext("mysql")
returns all queries from the MySQL databases.
itext(value: Parameter<S>) -> normalized_case
Force PVQL to apply a case insentitive comparaison on filtering.
value
: A string.The same string, with an internal marker for doing case-insensitive operations.
query FROM databases WHERE itext(system) = itext("mysql")
returns all queries from the MySQL databases.
lower(value: Parameter<S>) -> Parameter<S>
Transform a string to a lowercase string.
value
: A string.The same string, on lowercase format.
lower(value: set) -> set
Transform a string to a lowercase string.
value
: A string.The same string, on lowercase format.
meta_category_capture(cat_type: set) -> set
meta_category_capture(cat_type: string) -> string
meta_category_capture_iface_client(cat_type: string) -> string
meta_category_capture_iface_client(cat_type: set) -> set
meta_category_capture_iface_server(cat_type: set) -> set
meta_category_capture_iface_server(cat_type: string) -> string
meta_category_zones_client(cat_type: string) -> string
meta_category_zones_client(cat_type: set) -> set
meta_category_zones_server(cat_type: set) -> set
meta_category_zones_server(cat_type: string) -> string
prefix(mac: set, mask: set) -> set
Keep the nth bytes on the current MAC address. The expression prefix_mac(client.mac, 24)
is equivalent to client.mac/24
.
mac
: A MAC address.mask
: A literal numeric value between 0 to 48 that represents the number of bytes to keep.A MAC address where the first nth bytes have been kept, and the remaining ones are zeros.
prefix(ip: ip, mask: number) -> ip
Apply a CIDR mask on the current IPv4. The expression prefix(client.ip, 24)
is equivalent to client.ip/24
.
ip
: An IPv4 address.mask
: A literal numeric value between 0 to 32 that represents the CIDR prefix length.An IPv4 with the applied CIDR mask.
prefix(ip: set, mask: set) -> set
Apply a CIDR mask on the current IPv4. The expression prefix(client.ip, 24)
is equivalent to client.ip/24
.
ip
: An IPv4 address.mask
: A literal numeric value between 0 to 32 that represents the CIDR prefix length.An IPv4 with the applied CIDR mask.
prefix(mac: mac, mask: number) -> mac
Keep the nth bytes on the current MAC address. The expression prefix_mac(client.mac, 24)
is equivalent to client.mac/24
.
mac
: A MAC address.mask
: A literal numeric value between 0 to 48 that represents the number of bytes to keep.A MAC address where the first nth bytes have been kept, and the remaining ones are zeros.
prefix6(ip: ip, mask: number) -> ip
Apply a CIDR mask on the current IPv6. The expression prefix_ipv6(client.ip, 48)
is equivalent to client.ip/48
.
ip
: An IPv6 address.mask
: A literal numeric value between 0 to 128 that represents the CIDR prefix length.An IPv6 with the applied CIDR mask.
prefix6(ip: set, mask: set) -> set
Apply a CIDR mask on the current IPv6. The expression prefix_ipv6(client.ip, 48)
is equivalent to client.ip/48
.
ip
: An IPv6 address.mask
: A literal numeric value between 0 to 128 that represents the CIDR prefix length.An IPv6 with the applied CIDR mask.
re(value: set) -> set
Build with a re2 regex string, as a PVQL case sensitive regex.
value
: A re2 regex.The same string, converted as a regex type.
traffic FROM http WHERE url IN re("^(http|https)://*.com$")
returns the traffic from HTTP URLs that use the top-level domain .com
.
re(value: string) -> regex
Build with a re2 regex string, as a PVQL case sensitive regex.
value
: A re2 regex.The same string, converted as a regex type.
traffic FROM http WHERE url IN re("^(http|https)://*.com$")
returns the traffic from HTTP URLs that use the top-level domain .com
.
safesum(left: set, right: set) -> set
Safe operator for add
if we have a nullable argument. If either operand is NULL, it will be replaced by 0.
left
: A numeric value that can be nullable.right
: A numeric value that can be nullable.The result of left + right.
safesum(left: number, right: number) -> number
Safe operator for add
if we have a nullable argument. If either operand is NULL, it will be replaced by 0.
left
: A numeric value that can be nullable.right
: A numeric value that can be nullable.The result of left + right.
strip_url(value: set) -> set
Removes the query string and fragment identifier from an URL.
url
: An URL.An URL, without the query string and fragmant identifier.
strip_url(value: string) -> url
Removes the query string and fragment identifier from an URL.
url
: An URL.An URL, without the query string and fragmant identifier.
strip_url(value: url) -> url
Removes the query string and fragment identifier from an URL.
url
: An URL.An URL, without the query string and fragmant identifier.
strip_url(value: set) -> set
Removes the query string and fragment identifier from an URL.
url
: An URL.An URL, without the query string and fragmant identifier.
substr(value: set, offset: set, length: set) -> set
Return a substring from an offset, up to a certain bytes length.
value
: A type value who should be a subset of string.offset
: A constant positive numeric value that represents the start position of the substring.length
: A constant positive numeric value that represents the length of the substring.A substring.
substr(value: Parameter<S>, offset: number, length: number) -> Parameter<S>
Return a substring from an offset, up to a certain bytes length.
value
: A type value who should be a subset of string.offset
: A constant positive numeric value that represents the start position of the substring.length
: A constant positive numeric value that represents the length of the substring.A substring.
substr(value: Parameter<S>, length: number) -> Parameter<S>
Return a substring from the beginning, up to a certain bytes length.
value
: A type value that should be a subset of string.length
: A constant positive numeric value that represents the length of the substring.A substring.
substr(value: set, length: set) -> set
Return a substring from the beginning, up to a certain bytes length.
value
: A type value that should be a subset of string.length
: A constant positive numeric value that represents the length of the substring.A substring.
text(value: Parameter<S>) -> normalized_case
Force PVQL to apply a case-insentitive comparaison on the filtering.
value
: A string.The same string, with an internal marker for not doing case-insensitive operations.
query FROM databases WHERE text(system) = text("MySQL")
returns all queries from the MySQL databases.
text(value: set) -> set
Force PVQL to apply a case-insentitive comparaison on the filtering.
value
: A string.The same string, with an internal marker for not doing case-insensitive operations.
query FROM databases WHERE text(system) = text("MySQL")
returns all queries from the MySQL databases.