How to use historical data?

πŸ“˜

Required module

You will need the Machine Insights Cloud: PoC module or higher. Check your modules at Admin > Licenses. To obtain these modules, contact your IXON account manager or IXON distributor.

If you have configured historical data in the IXON Cloud, your IXrouter or IXagent will send PLC data to the IXON Cloud. The data is sent at the frequency selected when setting up historical data. You can use the APIv2 to request and use this data in your own software application. This article explains how to use the APIv2 to retrieve and structure historical data.

Historical data basics

With historical data, you store the value of a certain machine variable together with a timestamp. This is done in a special database called a time series database. Information on how and how often a variable is logged is stored in a data tag. A data tag also keeps information on how long the data of that variable should be stored. You can log variable data in three different ways:

  • Interval: periodically on a set timeframe;
  • Change: every time the value of a variable changes;
  • Trigger: every time a certain machine condition is met.

πŸ‘

Aggregation in a nutshell

In general, historical data can be aggregated before and after it is stored. Data aggregation is the downsampling of data: multiple values are combined into one according to a certain rule (e.g. mean, median, sum).
In the context of data analytics, two terms are widely used:

  • Pre-aggregation: the data is aggregated before it is stored, so no raw data is kept. In our case, it would be aggregated and then stored in the IXON database;
  • Post-aggregation: the data is aggregated after storage, at the moment it is queried.

❗️

Please note:

The IXON Cloud stores logging data with the pre-aggregator set to raw, which is a legacy value that you do not need to change. Instead, you can work freely with post-aggregators.

For a more practical example of data aggregation in a Line Graph, check out this support article!

Requesting data with the DataList endpoint

You can request historical data from the API with the DataList endpoint. To make a successful request, you first need to obtain the following data, each via its own endpoint:

  • The correct company ID: CompanyList
  • The correct IXrouter or IXagent (agent ID): AgentList
  • The correct data source, usually a PLC (source ID): AgentDataSourceList
  • The correct data tag (tag ID): DataList

You can request data for multiple tags at once. For every tag, provide either the tag ID (Int32) or the slug (string), but never both at the same time. The pre-aggregator has to be set to raw.

Because the data is stored in a time series database, you always have to include a time query in the request with both a start and an end time. The two examples below show the exact required format of the request, including the POST body, when using either the $tag_id or the $slug. A request that references two tags in different ways (one by ID and one by slug) is sketched in Python after these examples.

curl --location 'https://portal.ixon.cloud/api/data' \
--header 'Api-Version: 2' \
--header 'Api-Application: $application_id' \
--header 'Api-Company: $company_id' \
--header 'Content-Type: application/json; charset=utf-8' \
--header 'Authorization: Bearer $bearer_token' \
--data '{
    "source": {
        "publicId": "$source_id"
    },
    "tags": [
        {
            "id": "$tag_id",
            "preAggr": "raw",
            "queries": [
                {
                    "ref": "a",
                    "limit": 2,
                    "offset": 0
                }
            ]
        }
    ],
    "start": "2024-01-14T00:00:00+00:00",
    "end": "2024-01-15T23:59:59+00:00",
    "timeZone": "utc"
}'

curl --location 'https://portal.ixon.cloud/api/data' \
--header 'Api-Version: 2' \
--header 'Api-Application: $application_id' \
--header 'Api-Company: $company_id' \
--header 'Content-Type: application/json; charset=utf-8' \
--header 'Authorization: Bearer $bearer_token' \
--data '{
    "source": {
        "publicId": "$source_id"
    },
    "tags": [
        {
            "slug": "$slug",
            "preAggr": "raw",
            "queries": [
                {
                    "ref": "a",
                    "limit": 2,
                    "offset": 0
                }
            ]
        }
    ],
    "start": "2024-01-14T00:00:00+00:00",
    "end": "2024-01-15T23:59:59+00:00",
    "timeZone": "utc"
}'
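
The same request can also reference two tags in different ways: one by its tag ID and the other by its slug. Below is a minimal Python sketch of such a request using the requests library; the placeholder values are the same as in the curl examples above.

import requests

# Placeholders, identical to the ones used in the curl examples above.
headers = {
    "Api-Version": "2",
    "Api-Application": "$application_id",
    "Api-Company": "$company_id",
    "Content-Type": "application/json; charset=utf-8",
    "Authorization": "Bearer $bearer_token",
}

body = {
    "source": {"publicId": "$source_id"},
    "tags": [
        # First tag, referenced by its tag ID.
        {
            "id": "$tag_id",
            "preAggr": "raw",
            "queries": [{"ref": "a", "limit": 2, "offset": 0}],
        },
        # Second tag, referenced by its slug.
        {
            "slug": "$slug",
            "preAggr": "raw",
            "queries": [{"ref": "b", "limit": 2, "offset": 0}],
        },
    ],
    "start": "2024-01-14T00:00:00+00:00",
    "end": "2024-01-15T23:59:59+00:00",
    "timeZone": "utc",
}

response = requests.post(
    "https://portal.ixon.cloud/api/data", headers=headers, json=body
)
print(response.json())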

πŸ“˜

Please note

The applicable timezone has to be added if you want to use a different timezone than the default (+00:00 UTC). Timezones have to be specified in the format of the tz database (e.g. Europe/Amsterdam).

Data return

Your data will be returned in JSON format. For every timestamp, you receive the requested values. You can see this format clearly in the example below.

{
    "type": "Data",
    "data": {
        "start": "2024-01-14T00:00:00Z",
        "end": "2024-01-15T23:59:59Z",
        "timeZone": "UTC",
        "source": {
            "publicId": "$source_id",
            "reference": {
                "name": "source"
            }
        },
        "points": [
            {
                "time": "2024-01-15T23:59:50.132Z",
                "values": {
                    "b": 900
                }
            },
            {
                "time": "2024-01-15T23:29:50.196Z",
                "values": {
                    "b": 900
                }
            },
            {
                "time": "2024-01-15T10:59:03.824Z",
                "values": {
                    "a": true
                }
            },
            {
                "time": "2024-01-15T10:57:55.370Z",
                "values": {
                    "a": false
                }
            }
        ]
    },
    "status": "success"
}
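
If you process this response in your own code, the points array is usually all you need. The Python sketch below (assuming response is the requests response object from the earlier Python sketch) flattens the points into simple (timestamp, reference, value) rows.

# Assumes 'response' is the requests response object from the Python sketch above.
data = response.json()

rows = []
for point in data["data"]["points"]:
    # Every point holds one timestamp and the values per query reference.
    for ref, value in point["values"].items():
        rows.append((point["time"], ref, value))

for timestamp, ref, value in rows:
    print(f"{timestamp}  ref={ref}  value={value}")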

πŸ“˜

Export to CSV file

If you prefer your data to be returned as a CSV file, you can send the request to the DataExport endpoint.

How to downsample data using post-aggregators in query fields

To downsample data and make it more suitable for data analysis, you can use query fields: parameters in each query object that determine how data should be returned. The post-aggregator (postAggr) is one of these query fields. The list below describes the available query fields, their structure, and their default value; a request sketch using several of them follows after the list.

  • ref (string): The reference that determines to which query a returned value belongs. Default: -.
  • postAggr (AggrEnum | None): Aggregator to apply during the query. See the Post-aggregators paragraph below for the supported values. Default: raw is used if absent.
  • postTransform (TransformEnum | None): Transformation to apply during the query. Note: the value of TransformEnum can only be difference at this moment. Default: no transformation.
  • limit (int | None): Limit on the number of points that can be returned. Default: config.data_max_limit (5000).
  • step (int | string | None): Amount of time contained in each time bucket, in seconds, or alternatively day or week as a string (TZ aware). Default: if the step parameter is not given, it is calculated (in milliseconds) based on the (implicit/default) limit.
  • offset (int | None): Starting point for the returned values. Default: 0.
  • factor (Decimal | None): Value with which you want to recalculate each data point. Note: int and float values are also supported; a string can be used as well when passing a large decimal value (e.g. many decimals after the '.'). Default: the expression is not applied if absent.
  • operator (OperatorEnum | None): Operator for the query expression (*, /, +, -). Default: multiply.
  • decimals (int | None): Number of decimals you want returned. Default: 2.
  • extendedBoundary (bool | None): Whether or not you want to query for extra data points, one before the start date and one beyond the end date (if they exist). Default: false.
  • order (OrderEnum | None): The order in which to return the data: asc for ascending or desc for descending. Default: descending.
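
As an illustration of these fields, the sketch below shows a possible POST body for a DataList request that averages a tag per hour, rounds the result to one decimal and returns the buckets in ascending order. The field names come from the list above; the URL and headers are the same as in the earlier examples, and the concrete values are only an example.

# POST body for https://portal.ixon.cloud/api/data, sent with the same
# headers as in the earlier examples. The concrete values are illustrative.
body = {
    "source": {"publicId": "$source_id"},
    "tags": [
        {
            "id": "$tag_id",
            "preAggr": "raw",  # pre-aggregator always stays raw
            "queries": [
                {
                    "ref": "a",
                    "postAggr": "mean",  # average the values in each bucket
                    "step": 3600,        # bucket size of one hour, in seconds
                    "decimals": 1,       # round returned values to 1 decimal
                    "order": "asc",      # oldest bucket first
                }
            ],
        }
    ],
    "start": "2024-01-14T00:00:00+00:00",
    "end": "2024-01-15T23:59:59+00:00",
    "timeZone": "utc",
}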

Valid query combinations

Here you can find the valid query combinations:

  • limit, offset (data tables): obtain raw data with pagination.
  • offset (data tables): obtain raw data with the default point limit.
  • postAggr, limit (graphs): obtain downsampled data, with the step size based on the given limit.
  • postAggr (graphs): obtain downsampled data, with the step size based on the default point limit.
  • postAggr, step (graphs): obtain downsampled data with a known step size.
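
For instance, the postAggr + limit combination for a graph could use a query object like the sketch below; the rest of the request body stays the same as in the earlier examples. Because no step is given, the API derives the bucket size from the limit.

# One query object inside the "queries" array of a tag. postAggr plus limit,
# without step, so the bucket size is derived from the given limit.
query = {
    "ref": "a",
    "postAggr": "mean",  # downsample each bucket to its mean value
    "limit": 200,        # at most 200 buckets/points for the graph
}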

Time bucketing

Whenever a post-aggregator other than raw is used, the data between start and end is divided into time buckets and the given aggregator reduces the data in each bucket to a single value. If the step parameter is given, the buckets have that step size. If the step parameter is not given, it is calculated (in milliseconds) based on the (implicit/default) limit.

As for the returned timestamps: when a post-aggregator is used (even last), the bucket start timestamps are returned, not the timestamps at which the values were logged.

🚧

Please note:

By default, Influx aligns its time buckets to the Unix epoch (1970-01-01 00:00 UTC). This leads to unexpected behavior when, for instance, data is requested from 17-05-2023 until 17-05-2024 with a step size of 1 year: instead of the expected single data point, 2 data points are returned. Even though you might expect a single data point for the year between the start and end, you will get one data point for the bucket 17-05-2023 until 31-12-2023 and one for the bucket 01-01-2024 until 17-05-2024.

Post-aggregators

Post-aggregation queries are applied to each time bucket in the query. The time buckets are defined by the given step size or the limit, as explained in the previous paragraph. These are the post-aggregators that the API currently supports, with the corresponding Influx function in parentheses:

  • raw: does not aggregate the data but returns it as-is.
  • first: returns the value with the oldest timestamp (FIRST()).
  • last: returns the value with the most recent timestamp (LAST()).
  • min: returns the minimum value (MIN()).
  • max: returns the maximum value (MAX()).
  • sum: returns the sum of all values (SUM()).
  • spread: returns max - min (SPREAD()).
  • count: counts the number of (non-null) values (COUNT()).
  • mean: returns the arithmetic mean (MEAN()).
  • median: returns the middle value from a sorted list of all values (MEDIAN()).
  • mode: returns the most frequent value (MODE()).
  • distinct: returns unique values with no duplicates (DISTINCT()).
  • integral: returns the area under the curve for subsequent values, integrating them with respect to time (INTEGRAL()).
  • stddev: returns the standard deviation over a set of values, measuring how much the values deviate from their mean (STDDEV()).
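
Since the queries field of a tag is an array, it appears you can request several post-aggregations of the same tag in a single call, each under its own ref. The sketch below builds on that assumption and asks for the daily minimum, maximum and mean of one tag.

# Three queries on the same tag, one per post-aggregator, using day-sized
# buckets. The rest of the request body is unchanged.
tag = {
    "id": "$tag_id",
    "preAggr": "raw",
    "queries": [
        {"ref": "daily_min", "postAggr": "min", "step": "day"},
        {"ref": "daily_max", "postAggr": "max", "step": "day"},
        {"ref": "daily_mean", "postAggr": "mean", "step": "day"},
    ],
}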

Useful tips

A single value can help you determine all kinds of things about your machine. You can easily obtain a single value by setting a very short timeframe or by adding the limit query field and setting it to 1. A post-aggregator helps you get more out of that single value, and the combination of the factor and operator query fields lets you perform a calculation on it, as sketched below.
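
A minimal sketch of such a single-value query: it takes the last logged value in the time window and multiplies it by a factor. The operator value "*" is taken from the operators listed in the query fields list above; treat the exact enum spelling as an assumption.

# Query that returns a single data point: the most recent value in the time
# window, rescaled by multiplying it with 0.001 (e.g. converting g to kg).
query = {
    "ref": "a",
    "postAggr": "last",  # take the most recent value
    "limit": 1,          # only one point is returned
    "factor": 0.001,     # value used to recalculate the data point
    "operator": "*",     # multiply (assumed spelling of the default operator)
    "decimals": 3,
}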

If you want to add data to a data table, it is best to use the raw data, without a post-aggregator, and export it to a CSV file using the DataExport endpoint instead of the DataList endpoint. You can set the offset query field to a multiple of the given limit to get subsequent sets of data points (e.g. limit = 500, offset = 0 for the first 500 values and limit = 500, offset = 500 for the next 500), as in the pagination sketch below.
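
A minimal pagination sketch in Python, assuming the same placeholders as in the earlier examples: it keeps increasing the offset by the limit until a request returns fewer points than the limit.

import requests

# Placeholders, identical to the ones used in the earlier examples.
headers = {
    "Api-Version": "2",
    "Api-Application": "$application_id",
    "Api-Company": "$company_id",
    "Content-Type": "application/json; charset=utf-8",
    "Authorization": "Bearer $bearer_token",
}

limit = 500
offset = 0
all_points = []

# Page through the raw data points of one tag, 500 at a time.
while True:
    body = {
        "source": {"publicId": "$source_id"},
        "tags": [{
            "id": "$tag_id",
            "preAggr": "raw",
            "queries": [{"ref": "a", "limit": limit, "offset": offset}],
        }],
        "start": "2024-01-14T00:00:00+00:00",
        "end": "2024-01-15T23:59:59+00:00",
        "timeZone": "utc",
    }
    response = requests.post(
        "https://portal.ixon.cloud/api/data", headers=headers, json=body
    )
    points = response.json()["data"]["points"]
    all_points.extend(points)
    if len(points) < limit:  # last page reached
        break
    offset += limit          # request the next batch of points

print(f"Received {len(all_points)} points in total")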

Query fields are also helpful when preparing data for a graph. Post-aggregation using the limit or step query field makes the amount of data for your graph both easier to handle and easier to understand, which is why both are popular for this purpose. Note that it is not possible to use the limit and step queries together.

The moreAfter attribute in this API response is always null. With other endpoints, this attribute indicates whether there is more data to retrieve. When requesting all logged data, use offset as explained in the example above instead of relying on moreAfter.

🚧

Note on Parameter Interactions and Point Limit

Please note that the start, end, and step parameters all interact to determine the total number of data points in a query. Together, they determine whether your request stays within the point limit. For example, requesting 30 days of data with a step of 60 seconds yields 43,200 buckets, well above the default limit of 5,000 points. Adjusting any of these parameters can help manage the total number of points and avoid limit-related errors.

Example

The Python script below is an example of how you can request historical data.