How to use historical data?
Required module
You will need the Machine Insights Cloud: PoC module or higher. Check your modules at Admin > Licenses. To obtain these modules, contact your IXON account manager or IXON distributor.
If you have configured historical data in the IXON Cloud, your IXrouter or IXagent sends PLC data to the IXON Cloud at the frequency selected when setting up historical data. You can use the APIv2 to request and use this data in your own software application. This article explains how to use the APIv2 to retrieve and structure historical data.
Historical data basics
With historical data, you store the value of a machine variable together with a timestamp. This is done in a special database called a time series database. How a variable is logged and how often it is logged is stored in a data tag. A data tag also records how long data of that variable should be retained. You can log variable data in three different manners:
- Interval: periodically, at a set interval;
- Change: every time the value of the variable changes;
- Trigger: every time a certain machine condition is met.
Aggregation in a nutshell
In general, historical data can be aggregated before and after it is stored. Data aggregation is the downsampling of data: multiple values are combined into a single value according to a certain rule (e.g. mean, median, sum).
In the context of data analytics, there are two terms that are widely used:
- Pre-aggregation: data is stored aggregated in the database, which means that the data won't be stored raw. In our case, it will be aggregated and stored in the IXON database;
- Post-aggregation: in this case data is aggregated after storage.
Please note:
The IXON Cloud stores logging data with the pre-aggregator set to `raw`, a legacy value. Therefore, you do not need to change it; instead, you can work freely using post-aggregators.
For a more practical example of data aggregation in a Line Graph, check out this support article!
Requesting data with the DataList endpoint
You can request historical data from the API with the DataList endpoint. To make a successful request, you first need to obtain the following data through separate requests:
Required data | Endpoint |
---|---|
The correct company ID | CompanyList |
The correct IXrouter or IXagent (agent ID) | AgentList |
The correct data source, usually a PLC (source ID) | AgentDataSourceList |
The correct data tag (tag ID) | DataList |
You can request data for multiple tags at once. For every tag you need to give either the tag ID (Int32) or slug (string); do not use both at the same time. You have to set the pre-aggregator to `raw`.

Because the data is stored in a time series database, you always have to include a time query in the request, with both a start and an end time. The first two examples below contain the exact required format of the request, including the POST body, when using either the `$tag_id` or the `$slug`. Additionally, an example where two different types of tags are used in the request is also included.
curl --location 'https://portal.ixon.cloud/api/data' \
--header 'Api-Version: 2' \
--header 'Api-Application: $application_id' \
--header 'Api-Company: $company_id' \
--header 'Content-Type: application/json; charset=utf-8' \
--header 'Authorization: Bearer $bearer_token' \
--data '{
"source": {
"publicId": "$source_id"
},
"tags": [
{
"id": "$tag_id",
"preAggr": "raw",
"queries": [
{
"ref": "a",
"limit": 2,
"offset": 0
}
]
}
],
"start": "2024-01-14T00:00:00+00:00",
"end": "2024-01-15T23:59:59+00:00",
"timeZone": "utc"
}'
curl --location 'https://portal.ixon.cloud/api/data' \
--header 'Api-Version: 2' \
--header 'Api-Application: $application_id' \
--header 'Api-Company: $company_id' \
--header 'Content-Type: application/json; charset=utf-8' \
--header 'Authorization: Bearer $bearer_token' \
--data '{
"source": {
"publicId": "$source_id"
},
"tags": [
{
"slug": "$slug",
"preAggr": "raw",
"queries": [
{
"ref": "a",
"limit": 2,
"offset": 0
}
]
}
],
"start": "2024-01-14T00:00:00+00:00",
"end": "2024-01-15T23:59:59+00:00",
"timeZone": "utc"
}'
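The third example, mixing both reference types, is a sketch in the same format as the requests above: one tag referenced by id and one by slug (never both for the same tag), each with its own query reference:

```shell
curl --location 'https://portal.ixon.cloud/api/data' \
--header 'Api-Version: 2' \
--header 'Api-Application: $application_id' \
--header 'Api-Company: $company_id' \
--header 'Content-Type: application/json; charset=utf-8' \
--header 'Authorization: Bearer $bearer_token' \
--data '{
  "source": {
    "publicId": "$source_id"
  },
  "tags": [
    {
      "id": "$tag_id",
      "preAggr": "raw",
      "queries": [
        {
          "ref": "a",
          "limit": 2,
          "offset": 0
        }
      ]
    },
    {
      "slug": "$slug",
      "preAggr": "raw",
      "queries": [
        {
          "ref": "b",
          "limit": 2,
          "offset": 0
        }
      ]
    }
  ],
  "start": "2024-01-14T00:00:00+00:00",
  "end": "2024-01-15T23:59:59+00:00",
  "timeZone": "utc"
}'
```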
Please note
The applicable timezone has to be added if you want to use a different timezone than the default (+00:00, UTC). Timezones have to be specified in the format of the tz database.
Data return
Your data will be returned in JSON format, with all requested values grouped per timestamp, as shown in the example below.
{
"type": "Data",
"data": {
"start": "2024-01-14T00:00:00Z",
"end": "2024-01-15T23:59:59Z",
"timeZone": "UTC",
"source": {
"publicId": "$source_id",
"reference": {
"name": "source"
}
},
"points": [
{
"time": "2024-01-15T23:59:50.132Z",
"values": {
"b": 900
}
},
{
"time": "2024-01-15T23:29:50.196Z",
"values": {
"b": 900
}
},
{
"time": "2024-01-15T10:59:03.824Z",
"values": {
"a": true
}
},
{
"time": "2024-01-15T10:57:55.370Z",
"values": {
"a": false
}
}
]
},
"status": "success"
}
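As a sketch of how this response could be processed, the snippet below groups the returned points by their query reference. The sample data is abbreviated from the response above; the grouping helper is not part of the API, just an illustration:

```python
# Abbreviated sample of the "Data return" response shown above.
sample_response = {
    "type": "Data",
    "data": {
        "points": [
            {"time": "2024-01-15T23:59:50.132Z", "values": {"b": 900}},
            {"time": "2024-01-15T10:59:03.824Z", "values": {"a": True}},
            {"time": "2024-01-15T10:57:55.370Z", "values": {"a": False}},
        ]
    },
    "status": "success",
}

def points_by_ref(response):
    """Group (time, value) pairs by their query reference ('a', 'b', ...)."""
    series = {}
    for point in response["data"]["points"]:
        for ref, value in point["values"].items():
            series.setdefault(ref, []).append((point["time"], value))
    return series

series = points_by_ref(sample_response)
print(series["b"])  # [('2024-01-15T23:59:50.132Z', 900)]
```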
Export to CSV file
If you prefer your data to be returned as a CSV file, send the request to the DataExport endpoint instead.
How to downsample data using post-aggregators in query fields
To downsample data and make it more suitable for data analysis, you can use query fields. Query fields are parameters that determine how data should be returned; the post-aggregator is one of the available query fields. The table below describes the possible parameters:
Field | Structure | Description | Default |
---|---|---|---|
ref | string | The reference determines to which query a value belongs. | - |
postAggr | AggrEnum, optional | Aggregator to apply during the query. To know which ones are supported, check out this paragraph. | raw is used if absent. |
postTransform | TransformEnum, optional | Transformation to apply to the aggregated values. Note: the value of TransformEnum can only be difference at this moment. | No transformation. |
limit | int, optional | Limit on the amount of points that can be returned. | config.data_max_limit (5000) |
step | int or string, optional | Amount of time contained in each time bucket in seconds, or alternatively day or week as a string (TZ aware). | If the step parameter is not given, it is calculated (in milliseconds) based on the (implicit/default) limit. |
offset | int, optional | Starting point for returned values. | 0 |
factor | Decimal, optional | Value with which you want to recalculate data points. Note: int and float values are also supported. A string can be used as well when passing a large decimal value (e.g. many decimals after the '.'). | Expression is not applied if absent. |
operator | OperatorEnum, optional | Operator for the query expression (*, /, +, -). | Multiply |
decimals | int, optional | Number of decimals you want returned. | 2 |
extendedBoundary | bool, optional | Whether or not you want to query for extra data points: one before the start date and one beyond the end date (if they exist). | False |
order | OrderEnum, optional | Order in which to return the data: asc for ascending or desc for descending. | Descending |
Valid query combinations
Here you can find the possible and valid query combinations:
Combination | Use-case | What it does |
---|---|---|
limit, offset | Data tables | Obtain raw data with pagination. |
offset | Data tables | Obtain raw data with the default point limit. |
postAggr, limit | Graphs | Obtain downsampled data, with the step size based on the given limit. |
postAggr | Graphs | Obtain downsampled data, with the step size based on the default point limit. |
postAggr, step | Graphs | Obtain downsampled data with a known step size. |
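As an illustration of the postAggr, step combination, the snippet below builds a hypothetical tags entry for the request body, following the field names from the tables above (the tag id is a placeholder, and the chosen aggregator and step size are just examples):

```python
# One query: hourly mean values, oldest first, rounded to 2 decimals.
query = {
    "ref": "a",
    "postAggr": "mean",  # aggregate each time bucket to its mean
    "step": 3600,        # bucket size in seconds (1 hour)
    "decimals": 2,
    "order": "asc",
}

# The tag entry this query belongs to, as used in the "tags" array of
# the DataList request body. The pre-aggregator must stay raw.
tag_entry = {
    "id": "$tag_id",     # placeholder, as in the curl examples
    "preAggr": "raw",
    "queries": [query],
}
print(tag_entry)
```

Note that the query deliberately sets step without limit, since the two cannot be combined.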
Time bucketing
In any case where a post-aggregator is used (except `raw`), the data between start and end is bucketed into time buckets, and the given aggregator is applied to reduce the data in each bucket to a single value. If the `step` parameter is given, the data is bucketed into buckets of the given step size. If the step parameter is not given, it is calculated (in milliseconds) based on the (implicit/default) `limit`.

As for the returned timestamps, when a post-aggregator is used (even `last`), the bucket start timestamps are returned, not the timestamps at which those values were logged.
Please note:
By default, Influx aligns its time buckets to the Unix epoch (1970-01-01 00:00 UTC). This leads to unexpected behavior when, for instance, data is requested from 17-05-2023 until 17-05-2024 with a `step` size of 1 year: instead of the expected single data point, 2 data points are returned. Even though you might expect a single data point for the year between the start and end, you will instead get a data point for the bucket 17-05-2023 until 31-12-2023 and a data point for the bucket 01-01-2024 until 17-05-2024.
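To see where the two data points in this example come from, the epoch-aligned bucket boundaries can be computed directly. The sketch below only illustrates the alignment rule described in the note, approximating a 1-year step as 365 days; it is not IXON or Influx code:

```python
from datetime import datetime, timezone

def epoch_aligned_buckets(start, end, step_seconds):
    """Return the epoch-aligned bucket start times covering [start, end)."""
    start_s = int(start.timestamp())
    end_s = int(end.timestamp())
    # Snap the first bucket start down to the epoch-aligned grid.
    first = (start_s // step_seconds) * step_seconds
    return [
        datetime.fromtimestamp(t, tz=timezone.utc)
        for t in range(first, end_s, step_seconds)
    ]

# A one-year range crosses an epoch-grid boundary, so two buckets
# (hence two data points) are returned instead of one.
buckets = epoch_aligned_buckets(
    datetime(2023, 5, 17, tzinfo=timezone.utc),
    datetime(2024, 5, 17, tzinfo=timezone.utc),
    365 * 24 * 3600,
)
print(len(buckets))  # 2
```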
Post-aggregators
Post-aggregation queries are applied for each time bucket in the query. The time buckets are defined by the given step size or the limit as shown in this previous paragraph. These are the post-aggregators that the API currently supports:
Name | Description | Influx Function |
---|---|---|
raw | Does not aggregate the data but returns it as-is. | - |
first | Returns the value with the oldest timestamp. | FIRST() |
last | Returns the value with the most recent timestamp. | LAST() |
min | Returns the minimum value. | MIN() |
max | Returns the maximum value. | MAX() |
sum | Returns the sum of all values. | SUM() |
spread | Returns the difference between the maximum and minimum value (max - min). | SPREAD() |
count | Counts the amount of (non-null) values. | COUNT() |
mean | Returns the arithmetic mean average. | MEAN() |
median | Returns the middle value from a sorted list of all values. | MEDIAN() |
mode | Returns the most frequent value. | MODE() |
distinct | Returns unique values with no duplicates. | DISTINCT() |
integral | Returns the area under the curve for subsequent values and integrates them with respect to time. | INTEGRAL() |
stddev | Returns the standard deviation over a set of values, measuring how much the values deviate from their mean. | STDDEV() |
Useful tips
A single value can be helpful to determine all kinds of things about your machine. You can easily get a single value by setting a very short timeframe or by adding the query field `limit` and setting it to 1. To get more out of that single value, a post-aggregator can be helpful. You can use the combination of the factor and operator queries if you want to do a calculation on the single value.
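For instance, a hypothetical query entry combining these tips could return the most recent value in the window, rescaled by a factor. Field names follow the query-fields table; the factor value and unit conversion are only an illustration:

```python
# Single-value query: the last value in the time window, multiplied by
# a factor (e.g. a hypothetical millibar-to-bar conversion).
single_value_query = {
    "ref": "a",
    "postAggr": "last",  # reduce the window to its most recent value
    "limit": 1,
    "factor": "0.001",   # string form is allowed for decimal precision
    "operator": "*",     # multiply is also the default operator
}
print(single_value_query)
```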
If you want to add data to a data table, it is best to use the raw data, without a post-aggregator, and export it to a CSV file using the DataExport endpoint instead of the DataList endpoint. You can use an `offset` query field equal to the given limit to get the next set of data points after your first request (e.g. `limit = 500, offset = 0` for the first 500 values and `limit = 500, offset = 500` for the second 500).
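The pagination loop described above can be sketched as follows. `fetch_page` is a stand-in that simulates the DataList call against a fake dataset; in a real application it would perform the HTTP request:

```python
# Fake dataset standing in for logged data points on the server.
FAKE_POINTS = list(range(1234))

def fetch_page(limit, offset):
    """Simulated DataList call: returns one page of points."""
    return FAKE_POINTS[offset:offset + limit]

def fetch_all(limit=500):
    """Collect all points by stepping offset in increments of limit."""
    points, offset = [], 0
    while True:
        page = fetch_page(limit, offset)
        points.extend(page)
        if len(page) < limit:  # a short page means we reached the end
            return points
        offset += limit  # e.g. limit=500: offset 0, 500, 1000, ...

print(len(fetch_all()))  # 1234
```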
To make data ready for depiction in a graph, query fields can be helpful as well. Post-aggregation using the `limit` or `step` query fields can make the amount of data for your graph both easier to handle and easier to understand, which is why both of these queries are popular for this purpose. Note that it is not possible to use the limit and step queries together!
The `moreAfter` attribute in this API response is always `null`. With other endpoints this attribute serves to confirm that you received everything. When requesting all logged data, use `offset` as explained in the example above instead of relying on `moreAfter`.
Note on parameter interactions and the point limit
Please note that the start, end, and step parameters interact to determine the total number of data points in a query (the limit). Together, they determine whether your request stays within the point limit. Adjusting any of these parameters can help manage the total number of points and avoid limit-related errors.
Example
The Python script below is an example of how you can request historical data.
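A minimal sketch of such a script, using only the Python standard library. The endpoint URL, header names, and body follow the curl examples earlier in this article; the $-style values are placeholders you must replace with real IDs and a valid bearer token:

```python
import json
import urllib.request

API_URL = "https://portal.ixon.cloud/api/data"

# Headers as in the curl examples; fill in your own application id,
# company id, and bearer token.
HEADERS = {
    "Api-Version": "2",
    "Api-Application": "$application_id",
    "Api-Company": "$company_id",
    "Content-Type": "application/json; charset=utf-8",
    "Authorization": "Bearer $bearer_token",
}

# Request body as in the first curl example: one tag by id,
# pre-aggregator raw, raw data query with a point limit of 2.
BODY = {
    "source": {"publicId": "$source_id"},
    "tags": [
        {
            "id": "$tag_id",
            "preAggr": "raw",
            "queries": [{"ref": "a", "limit": 2, "offset": 0}],
        }
    ],
    "start": "2024-01-14T00:00:00+00:00",
    "end": "2024-01-15T23:59:59+00:00",
    "timeZone": "utc",
}

def fetch_historical_data():
    """POST the DataList request and return the parsed JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(BODY).encode("utf-8"),
        headers=HEADERS,
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)

# With real credentials, fetch_historical_data()["data"]["points"] holds
# the timestamped values shown in the "Data return" section.
```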