# Filters

Filters allow you to slice your data based on property parameters you set. On the Filters page, you will find all of your existing filters, plus the ability to add new filters and modify existing ones.

Filters have five components:

| # | Component | Description |
|---|-----------|-------------|
| 1 | Filter Name | A name to describe your filter. |
| 2 | Filter | Defining the top-level filter based on event properties is the first step in deciding what data to aggregate. You can further filter data using groupings and aggregation sub-filters, making it easy to group related metrics together based on a broad filter. Filters are defined using JPath. |
| 3 | Groupings | Groupings allow you to group your aggregation results using additional properties in your event payload. When retrieving and visualizing data, you can exclude, filter, or roll up individual groupings. See more below. |
| 4 | Interval | The aggregation interval is the specific time interval over which aggregations are computed. For guidance on choosing the proper interval, see How to Choose? below. |
| 5 | Aggregations | Aggregations are the statistical calculations you want to run on your data. You can further filter your data per calculation using a sub-filter and choose up to 10 calculations per filter. To learn more about aggregations, see Aggregations. |
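
To make the relationship between the top-level filter and per-aggregation sub-filters concrete, here is a minimal Python sketch. The event shape and property names are illustrative only, not a real Aggregations.io API; real filters are JPath expressions evaluated by the platform:

```python
# Hypothetical event payload; property names are illustrative only
event = {"event_name": "AppOpen", "duration": 101, "cold_open": True}

# Top-level filter (conceptually, a JPath condition on @.event_name):
# decides whether the event is considered at all.
if event.get("event_name") == "AppOpen":
    # Aggregation 1, no sub-filter: always extracts @.duration
    overall_duration = event["duration"]

    # Aggregation 2 with a sub-filter (conceptually, a condition on @.cold_open):
    # only extracts @.duration when the sub-filter also matches.
    if event.get("cold_open") is True:
        cold_open_duration = event["duration"]
```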

# Groupings

Groupings define how to slice your aggregations, based on properties in your event payload.

When creating a grouping, you can provide an alias, which will be the value output when retrieving results via the Metrics API or the Grafana Plugin.

# Requirements
  • Groupings need to be represented as valid JPath.
  • If using a grouping alias, it cannot exceed 100 characters.
  • Groupings must be unique within a filter.
  • Aliases must be unique within a filter.
  • You can have up to 10 groupings per filter.
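
As a rough illustration of what a grouping does (the events below are hypothetical; a real grouping would be the JPath expression @.device.type, aliased here as platform):

```python
from collections import defaultdict

# Hypothetical events; grouping on @.device.type with alias "platform"
events = [
    {"device": {"type": "iOS"}, "duration": 101},
    {"device": {"type": "android"}, "duration": 87},
    {"device": {"type": "iOS"}, "duration": 54},
]

counts = defaultdict(int)
for event in events:
    platform = event["device"]["type"]  # JPath: @.device.type
    counts[platform] += 1

print(dict(counts))  # {'iOS': 2, 'android': 1}
```

Per the alias behavior described above, platform is the name you would see for this grouping when retrieving results.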

# Aggregation Interval

Your aggregation interval is the baseline frequency at which your data will be aggregated. This can range from 30s to 100 years. You can always re-aggregate your data to a coarser interval when retrieving metric results with the Metrics API or the Grafana Plugin.

# Interval Truncation

Interval truncation refers to the process of rounding timestamps down based on your aggregation interval. With Aggregations.io, all timestamps are truncated to the start of the interval. The effect is easier to understand with standard intervals than with irregular, custom intervals; see the examples below.

# Examples

If we set our filter to aggregate Every 5 Minutes and send data like the following:

| Time | Event Count |
|------|-------------|
| 2023-01-01 12:00:00 | 10 |
| 2023-01-01 12:01:00 | 20 |
| 2023-01-01 12:11:00 | 100 |
| 2023-01-01 12:25:00 | 60 |
| 2023-01-01 12:27:00 | 30 |

Our results will be:

| Time | Event Count |
|------|-------------|
| 2023-01-01 12:00:00 | 30 |
| 2023-01-01 12:10:00 | 100 |
| 2023-01-01 12:25:00 | 90 |
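
A minimal sketch of the bucketing arithmetic behind this example (illustrative Python, not Aggregations.io's implementation):

```python
from collections import defaultdict
from datetime import datetime, timezone

INTERVAL = 5 * 60  # "Every 5 Minutes", in seconds

events = [
    ("2023-01-01 12:00:00", 10),
    ("2023-01-01 12:01:00", 20),
    ("2023-01-01 12:11:00", 100),
    ("2023-01-01 12:25:00", 60),
    ("2023-01-01 12:27:00", 30),
]

buckets = defaultdict(int)
for ts, count in events:
    t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    # Truncate the timestamp down to the start of its 5-minute interval
    start = int(t.timestamp()) // INTERVAL * INTERVAL
    buckets[start] += count

for start, total in sorted(buckets.items()):
    print(datetime.fromtimestamp(start, tz=timezone.utc), total)
# 2023-01-01 12:00:00+00:00 30
# 2023-01-01 12:10:00+00:00 100
# 2023-01-01 12:25:00+00:00 90
```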

Suppose we have an aggregation with a special need for 13-minute intervals, such as a full day in our video game being equal to 13 minutes of time in real life. Truncation happens in relation to the number of intervals since the UNIX Epoch:

  • At 2023-11-01 00:00:00, the timestamp in seconds since the Epoch was 1698796800
  • We get the number of intervals since the Epoch as 1698796800 / (60 * 13) = 2177944.615
  • Rounding down, the interval start will be (60 * 13) * 2177944 = 1698796320
  • 1698796320 is 2023-10-31 23:52:00

The 13-minute interval will range from 2023-10-31 23:52:00 until 2023-11-01 00:04:59.999, and any event with a timestamp in that range will be aggregated into that interval.
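
The same truncation, written out as a quick sketch (just the arithmetic from the walkthrough above):

```python
from datetime import datetime, timezone

interval = 60 * 13    # 13 minutes = 780 seconds
ts = 1698796800       # 2023-11-01 00:00:00 UTC

n = ts // interval    # 2177944 whole intervals since the UNIX Epoch
start = n * interval  # 1698796320

print(datetime.fromtimestamp(start, tz=timezone.utc))
# 2023-10-31 23:52:00+00:00; the interval runs through 2023-11-01 00:04:59.999
```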

Aggregations.io weeks start on Mondays. For all events ingested in a given week, the timestamp will be truncated to the Monday of that week at 00:00:00 UTC.


# How to Choose?

Choosing the right aggregation interval is key to obtaining useful and accurate results. You may be tempted to choose the lowest granularity possible, but that isn't necessarily best. To choose an interval that best fits your use case, review the following considerations:

The shorter the interval, the more granular the aggregation results will be. Longer intervals create a broader perspective and better identification of overall trends. If you choose an interval too short for what you want to measure, you may wind up with noisy or sparse data. If you choose an interval that is too long, you may smooth out important variations.

For example, if your desired aggregation is a count of daily active users, your aggregation interval would be daily, even if you collect data every hour. If you instead selected a minutely interval, many intervals would contain little or no data.

Consider the robustness of your data producers. In a perfect world, your producers may emit data once a minute, but in the real world, this may not be practical. For example, client-side events are imperfect, and electrical sensors may suffer from interference. In these situations, we may want to use a slightly larger interval, such as every two minutes, to mitigate that variability.

Higher-level aggregate forecasts are typically easier to create and more accurate than more granular forecasts. So for forecasts, use the longest interval that still satisfies your business case.

For example, in retail supply chains, forecasts often focus on weekly sales because supply trucks deliver once a week. In this case, we don't care about sales on any particular day, so we choose weekly. Choosing daily intervals and rolling them up to weekly afterward would compound the biases of the daily forecasts.

To detect irregularities, consider an interval that adheres to your data volatility and fluctuations. For example, if you want to analyze daily patterns in website traffic, your aggregation interval would be daily. Choosing an interval that is too long may not accurately capture those changes because it smooths out important variations.

Consider how you want the aggregation results to be presented in reports or visualizations, and who they are for.

Visualization: If you're interested in seeing monthly sales performance in the form of a line graph, daily metrics would be more readable than minutely.

Reporting: Align your interval with your reporting frequency. Teams may want daily reports for operational monitoring purposes and monthly or quarterly reports for strategic planning. In these scenarios, you would set your interval to daily for ops-related aggregations and monthly or quarterly for strategic aggregations.

In real-time monitoring situations, shorter aggregation intervals are more typical, as they allow quicker detection. For example:

  • Server performance: monitoring servers for rapid response to issues may require shorter intervals such as 30s or minutely.
  • Quality control: where compliance standards are critical, shorter intervals allow quicker detection and correction of any deviations from quality standards.
  • Marketing campaigns: especially in digital marketing, shorter intervals allow marketers to make timely adjustments based on real-time performance data.
  • Customer behavior: for businesses that require a real-time understanding of customer behavior, such as e-commerce or online platforms, shorter intervals provide better insights into user interactions and preferences.


# Changing Aggregation Intervals

Your needs aren't static, so your intervals don't need to be. You can adjust an interval by editing filters to accommodate your evolving requirements.


# Cardinality Limits

Cardinality refers to the number of unique elements or distinct combinations of attributes in a set. In dimensional metrics, this set is the collection of distinct combinations of properties observed for a given metric within your aggregation interval.
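
For example, if a filter has two hypothetical groupings, the cardinality of an interval is the number of distinct value combinations observed in it; a quick sketch:

```python
# Hypothetical grouping values: @.device.type and @.country
events = [
    {"device": "iOS",     "country": "US"},
    {"device": "iOS",     "country": "US"},  # duplicate combination
    {"device": "android", "country": "US"},
    {"device": "iOS",     "country": "DE"},
]

# Cardinality = number of distinct grouping combinations in the interval
combinations = {(e["device"], e["country"]) for e in events}
print(len(combinations))  # 3
```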

Aggregations.io does not have a hard cardinality limit, but will introduce soft, system-protecting limits in the future.


# Modify or Duplicate?

After a filter is modified, your past aggregations are retained, so modifying an active filter may lead to unexpected behavior.

For instance, modifying a filter does not backfill your aggregation's metrics based on the new filter definition. Instead, Aggregations.io will start producing the new metric from the moment you save onward. If you're making substantial changes to a filter or individual aggregations, it may make more sense to duplicate the filter and save a fresh version.


# Removing a Filter

When a filter is deleted, Aggregations.io will stop producing its aggregation results. This will not instantly delete existing aggregation results; they will remain available for consumption, but new results will not be produced.

# Debug Mode

Sometimes JPath can be complicated. When working with real-time aggregations, you should test your changes before committing them. Debug Mode helps you understand how Aggregations.io will turn your JSON data into metrics.

Debug Mode will check all aspects of your filter definition against the supplied JSON payload(s) and ensure you understand what will be filtered, grouped and calculated.

Your payloads are evaluated client-side; the JSON you debug is not sent to our servers.

As you make adjustments to your filter configuration, your JSON will be continuously re-evaluated.

# Debugging Example

Using this example Filter:

Click the Enter Debug Mode button in the bottom right.

  • Start by inputting your JSON (object) in the left box. Immediately, your fields will be evaluated.

We're going to use the following payload to start:

```json
{
    "event_name": "AppOpen",
    "ts": 1721491167001,
    "device": {
        "type": "iOS"
    },
    "duration": 101
}
```

# We can see:

  • Our ingestion defines a Custom Timestamp Property of @.ts in Milliseconds; it was successfully found and properly converted.
  • If your property is present, but not parsable to the expected format, you'll see a "Could not parse as number" error.
  • Our filter matched successfully
  • The Overall Duration aggregation, with no sub-filter, extracted 101 as the value for @.duration
  • The Cold Open Duration aggregation did not match its sub-filter, because the @.cold_open property was not found.
  • The Grouping value for @.device.type is iOS

# Using an Array:

```json
[
    {
        "event_name": "AppOpen",
        "ts": 1721491167001,
        "device": {
            "type": "iOS"
        },
        "duration": 101
    },
    {
        "event_name": "appOpen",
        "ts": 1721491167001,
        "device": {
            "type": "android"
        },
        "duration": 101,
        "cold_open": true
    }
]
```

We can also put an array of objects into the debugger, which yields a more condensed evaluation view.

Here, we can see that the second object matches the aggregations and groupings, but since the filter fails to match (appOpen vs AppOpen), no metrics will be calculated for that payload.

# Debugging Limits

  • At this time, filters using a regex operator are not debuggable.