read Method
Description
Reads (consumes) data from a TSDB table into pandas DataFrames.
A TSDB query can include aggregation functions ("aggregators") to apply to the sample metrics; for a list of the supported aggregation functions, see the description of the aggregators parameter.
The Frames TSDB backend currently supports "over-time aggregation", which aggregates the data for unique metric label sets over time, and returns a separate aggregation time series for each label set.
The aggregation is done at each aggregation step (a.k.a., aggregation interval) — the time interval for executing the aggregation functions over the query's time range; the step determines the aggregation data points, starting at the query's start time.
The default step is the query's time range (which can be configured via the start and end parameters).
The aggregation is applied to all sample data within the query's aggregation window, which currently always equals the query's aggregation step. For example, for an aggregation step of 1 hour, the aggregation at step 10:00 is done for an aggregation window of 10:00–11:00.
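To make the step and window semantics concrete, the following sketch emulates an over-time aggregation locally with pandas. It is not how the Frames backend is implemented; the sample values, the single "cpu" series, and the mapping of the avg aggregator to the pandas mean function are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical raw samples for one metric label set (values are made up).
samples = pd.Series(
    [10.0, 20.0, 30.0, 50.0],
    index=pd.to_datetime([
        "2019-01-01T10:05:00Z",
        "2019-01-01T10:40:00Z",
        "2019-01-01T11:10:00Z",
        "2019-01-01T11:55:00Z",
    ]),
    name="cpu",
)

# A 1-hour step: each result row is labeled with the window start
# (10:00, 11:00) and covers the samples that fall within that hour.
step = pd.Timedelta(hours=1)
hourly = samples.resample(step).agg(["mean", "min", "max", "count"])
print(hourly)
```

The row labeled 10:00 aggregates the two samples taken between 10:00 and 11:00, which matches the window described above.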
When creating a TSDB table, you can optionally configure pre-aggregates that will be calculated for all metric samples as part of their ingestion into the TSDB table.
For each aggregation request in an over-time aggregation query, if the TSDB table has matching pre-aggregated data (same aggregation function and the query's aggregation window is a sufficient multiplier of the table's aggregation granularity), the pre-aggregated data is used instead of performing a new aggregation calculation, which speeds up the query processing.
For more information about pre-aggregation and how to configure it, see the description of the
Syntax
read(backend[, table='', columns=None, filter='', max_rows_in_msg=0,
iterator=False, **kw])
The following syntax statement replaces the kw parameter with the additional keyword arguments that are supported for the TSDB backend:
read(backend[, table='', columns=None, filter='', max_rows_in_msg=0,
iterator=False, start, end, aggregators, step, multi_index])
The method has additional parameters that aren't currently supported for the TSDB backend. Therefore, when calling the method, be sure to explicitly specify the names of all parameters after
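One way to keep every argument named is to collect the arguments in a dictionary and unpack it into the call. The sketch below assumes a hypothetical helper, tsdb_read_kwargs, that is not part of Frames; its keys mirror the parameter names in the syntax statement above, and the client object in the final comment is assumed to have been created separately.

```python
def tsdb_read_kwargs(table, columns=None, filter="", start=None, end=None,
                     aggregators=None, step=None, multi_index=False):
    """Collect named arguments for a TSDB read call, skipping unset ones.

    Hypothetical helper (not part of Frames).
    """
    kwargs = {"backend": "tsdb", "table": table, "multi_index": multi_index}
    if columns:
        kwargs["columns"] = list(columns)
    if filter:
        kwargs["filter"] = filter
    if aggregators:
        # The aggregators parameter is a comma-separated string.
        kwargs["aggregators"] = ",".join(aggregators)
    for name, value in (("start", start), ("end", end), ("step", step)):
        if value is not None:
            kwargs[name] = value
    return kwargs


kwargs = tsdb_read_kwargs("mytsdb", columns=["cpu"], start="now-1d",
                          aggregators=["avg", "max"], step="1h")
print(kwargs)
# With a Frames client object (assumed to exist as `client`):
# df = client.read(**kwargs)
```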
Parameters
- backend
  The backend type: "tsdb" for the TSDB backend. See Backend Types.
  - Type: str
  - Requirement: Required
- table
  The relative path to the backend data: a directory in the target data container (as configured for the client object) that represents a TSDB table. For example, "mytable" or "examples/tsdb/my_metrics".
  - Type: str
  - Requirement: Required
- iterator
  Determines whether to return a pandas DataFrames iterator or a single DataFrame: True returns a DataFrames iterator; False (default) returns a single DataFrame.
  - Type: bool
  - Requirement: Optional
  - Valid Values: True | False
  - Default Value: False (return a single DataFrame)
- columns
  A list of metric names to which to apply the query. For example, ["cpu", "temperature", "disk"]. By default, the query is applied to all metrics in the TSDB table.
  Note:
    - Queries with multiple metric names are currently supported only as Tech Preview.
    - You can restrict the metrics list for the query within the query filter, as explained for the filter parameter.
  - Type: []str
  - Requirement: Optional
- filter
  A platform filter expression that restricts the information that will be returned. See Filter Expression for syntax details and examples.
  The filter is typically applied to metric labels; for example, "os=='linux' AND arch=='amd64'".
  You can also apply the filter to the _name attribute, which stores the metric name. This is less efficient than specifying the metric names in the columns parameter, but it might be useful in some cases. For example, if you have many "cpu<n>" metrics, you can use "starts(_name,'cpu')" in your filter expression to apply the query to all metrics (or all metrics specified in the columns parameter, if set) whose names begin with the string "cpu".
  Note: Currently, only labels of type string are supported; see the Software Specifications and Restrictions. Therefore, ensure that you embed label attribute values in your filter expression within quotation marks even when the values represent a number (for example, "node == '1'"), and don't apply arithmetic operators to such attributes (unless you want to perform a lexicographic string comparison).
  - Type: str
  - Requirement: Optional
- kw
  This parameter is used for passing a variable-length list of additional keyword (named) arguments. See the following kw Arguments section for a list of additional arguments that are supported for the TSDB backend via the kw parameter.
  - Type: ** (variable-length keyword arguments list)
  - Requirement: Optional
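The quoting rule described for the filter parameter (label values are strings, so numeric values must also be quoted) can be sketched with a small builder function. The label_filter helper is hypothetical and handles only equality conditions joined with AND; it is an illustration, not part of Frames.

```python
def label_filter(labels):
    """Build an equality filter from a dict of label names to values.

    Hypothetical helper: every value is converted to a string and wrapped
    in single quotes, because only string labels are supported.
    """
    parts = []
    for name, value in labels.items():
        text = str(value)
        if "'" in text:
            # Escaping rules are not covered here, so reject such values.
            raise ValueError(f"unsupported quote in label value: {text!r}")
        parts.append(f"{name}=='{text}'")
    return " AND ".join(parts)


expr = label_filter({"os": "linux", "node": 1})
print(expr)  # os=='linux' AND node=='1'
```

The resulting string can be passed as the filter argument of a read call.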
kw Arguments
The TSDB backend supports the following kw arguments:
- start
  The query's start time, which is the earliest sample time to query: read only items whose data sample time is at or after (>=) the specified start time.
  - Type: str
  - Requirement: Optional
  - Valid Values: A string containing an RFC 3339 time, a Unix timestamp in milliseconds, a relative time of the format "now" or "now-[0-9]+[mhd]" (where m = minutes, h = hours, and d = days), or 0 for the earliest time. For example: "2016-01-02T15:34:26Z"; "1451748866000"; "now-90m"; "0".
  - Default Value: <end time> - 1h
- end
  The query's end time, which is the latest sample time to query: read only items whose data sample time is before or at (<=) the specified end time.
  - Type: str
  - Requirement: Optional
  - Valid Values: A string containing an RFC 3339 time, a Unix timestamp in milliseconds, a relative time of the format "now" or "now-[0-9]+[mhd]" (where m = minutes, h = hours, and d = days), or 0 for the earliest time. For example: "2018-09-26T14:10:20Z"; "1537971006000"; "now-3h"; "now-7d".
  - Default Value: now
- aggregators
  A list of aggregation functions ("aggregators") to apply to the raw sample data of the configured query metrics (see the columns parameter) in order to perform an aggregation query. You can configure the aggregation step, which also serves as the aggregation window, in the step parameter.
  - Type: str
  - Requirement: Optional
  - Valid Values: A string containing a comma-separated list of supported aggregation functions ("aggregators"); for example, "count,avg,min,max". The following aggregation functions are supported:
    - avg: the average of the sample values.
    - count: the number of ingested samples.
    - last: the value of the last sample (i.e., the sample with the latest time).
    - max: the maximal sample value.
    - min: the minimal sample value.
    - rate: the change rate of the sample values, which is calculated as (<last sample value of the previous interval> - <last sample value of the current interval>) / <aggregation granularity>.
    - stddev: the standard deviation of the sample values.
    - stdvar: the standard variance of the sample values.
    - sum: the sum of the sample values.
- step
  The query step (interval), which determines the points over the query's time range at which to perform aggregations (for an aggregation query) or downsample the data (for a query without aggregators). The default step is the query's time range, which can be configured via the start and end parameters. In the current release, the aggregation step is also the aggregation window to which the aggregators are applied. For more information, see Aggregation Queries.
  - Type: str
  - Requirement: Optional
  - Valid Values: A string of the format "[0-9]+[mhd]", where m = minutes, h = hours, and d = days. For example, "30m" (30 minutes), "2h" (2 hours), or "1d" (1 day).
- multi_index
  Determines the indexing of the returned DataFrames: True returns a multi-index DataFrame in which all metric-label attributes are defined as index columns in addition to the metric sample-time attribute (the primary-key attribute); False (default) returns a single-index DataFrame in which only the metric sample-time attribute is defined as an index column.
  - Type: bool
  - Requirement: Optional
  - Default Value: False (return a single-index DataFrame)
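The string formats listed for the start, end, and step arguments can be checked on the client side before a query is issued. The following sketch uses only the Python standard library and the two patterns quoted above; the function names are hypothetical, and the code does not handle RFC 3339 or Unix-timestamp values.

```python
import re
from datetime import timedelta

_UNITS = {"m": "minutes", "h": "hours", "d": "days"}
_STEP_RE = re.compile(r"([0-9]+)([mhd])")
_RELATIVE_RE = re.compile(r"now(?:-([0-9]+)([mhd]))?")


def parse_step(step):
    """Convert a step string such as "30m" or "2h" to a timedelta."""
    match = _STEP_RE.fullmatch(step)
    if match is None:
        raise ValueError(f"invalid step: {step!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def relative_offset(spec):
    """Return how far before "now" a relative time string points."""
    match = _RELATIVE_RE.fullmatch(spec)
    if match is None:
        raise ValueError(f"not a relative time: {spec!r}")
    amount, unit = match.groups()
    if amount is None:  # plain "now"
        return timedelta(0)
    return timedelta(**{_UNITS[unit]: int(amount)})


print(parse_step("30m"))          # 0:30:00
print(relative_offset("now-7d"))  # 7 days, 0:00:00
```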
Return Value
- When the value of the iterator parameter is True: returns a pandas DataFrames iterator.
- When the value of the iterator parameter is False (default): returns a single pandas DataFrame.
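Code that must accept both return types can normalize the result into a single DataFrame. This is a minimal sketch with a hypothetical helper, as_single_frame; the chunks iterator only stands in for what a read call with iterator=True would return, and its data is made up.

```python
import pandas as pd


def as_single_frame(result):
    """Return one DataFrame from either a DataFrame or an iterable of them."""
    if isinstance(result, pd.DataFrame):
        return result
    frames = list(result)
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames)


# Stand-in for the iterator that a read call with iterator=True would return.
chunks = iter([
    pd.DataFrame({"cpu": [1.0, 2.0]},
                 index=pd.to_datetime(["2019-01-01", "2019-01-02"])),
    pd.DataFrame({"cpu": [3.0]},
                 index=pd.to_datetime(["2019-01-03"])),
])
df = as_single_frame(chunks)
print(len(df))  # 3
```

Concatenating loads all chunks into memory, so this suits results that are small enough to hold at once; otherwise, process each DataFrame as the iterator yields it.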
Examples
Following are some usage examples for the read method. All examples set the multi_index parameter to True to display metric-label attributes as index columns (in addition to the sample-time attribute, which is always displayed as an index column).
Except where otherwise specified, the examples return a single DataFrame (the default iterator value, False).
- Read all items (rows) of a mytsdb table in the client's data container (table): start = "0", with the default end ("now") and columns (all metrics):

      # `client` is a Frames client object; `display` is the Jupyter display function.
      tsdb_table = "mytsdb"
      df = client.read(backend="tsdb", table=tsdb_table, start="0", multi_index=True)
      display(df.head())
      display(df.tail())
- Issue an aggregation query (aggregators) to a mytsdb table in the client's data container (table) for the "cpu" metric (columns); use the default aggregation step (step not set), which is the query's time range: 09:00–17:00 on 1 Jan 2019 (see start and end):

      tsdb_table = "mytsdb"
      df = client.read("tsdb", table=tsdb_table, start="2019-01-01T09:00:00Z",
                       end="2019-01-01T17:00:00Z", columns=["cpu"],
                       aggregators="avg,min,max", multi_index=True)
      display(df)
- Issue an aggregation query to a tsdb/my_metrics table in the client's data container (table) for the 24-hour period that ended one day ago (start = "now-2d" and end = "now-1d"); apply the sum and avg aggregators (aggregators) to the "disk" and "memory" metrics (columns) with a 12-hour aggregation step (step), and only apply the query to samples with a "linux" os label (filter = "os=='linux'"):

      tsdb_table = "/tsdb/my_metrics"
      df = client.read("tsdb", table=tsdb_table, columns=["disk", "memory"],
                       filter="os=='linux'", aggregators="sum,avg", step="12h",
                       start="now-2d", end="now-1d", multi_index=True)
      display(df)
- Issue a 1-hour raw-data downsampling query (step = "1h" and aggregators not set) to a mytsdb table in the client's data container (table); apply the query to all metric samples (default columns) between 1 Jan 2019 and 1 Feb 2019 (start = "2019-01-01T00:00:00Z" and end = "2019-02-01T00:00:00Z"):

      tsdb_table = "mytsdb"
      df = client.read("tsdb", table=tsdb_table, start="2019-01-01T00:00:00Z",
                       end="2019-02-01T00:00:00Z", step="1h", multi_index=True)
      display(df)
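Because the examples above request a multi-index DataFrame, it can help to see how such a result might be handled in pandas. The sketch below builds a small stand-in DataFrame by hand; the index level names ("time" and "os") and the values are assumptions for illustration and might differ from what the TSDB backend returns.

```python
import pandas as pd

# Hypothetical result shaped like a multi_index=True read: the sample time
# plus one label ("os") form the index. Names and values are made up.
index = pd.MultiIndex.from_tuples(
    [
        (pd.Timestamp("2019-01-01T00:00:00Z"), "linux"),
        (pd.Timestamp("2019-01-01T00:00:00Z"), "windows"),
        (pd.Timestamp("2019-01-01T01:00:00Z"), "linux"),
    ],
    names=["time", "os"],
)
df = pd.DataFrame({"cpu": [0.5, 0.7, 0.6]}, index=index)

# Select the series for one label value; the "os" level is dropped.
linux = df.xs("linux", level="os")

# Or flatten the index so that labels become regular columns.
flat = df.reset_index()
print(linux)
print(flat.columns.tolist())  # ['time', 'os', 'cpu']
```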
See Also
- Querying a TSDB (The TSDB CLI)
- Frames TSDB-Backend Overview
- Frames Client Constructor