Compound Aggregations
Beta
Learn how to implement compound incremental aggregations like averages and arrays.
This guide explains compound aggregations and how to implement them as incremental features in Profiles.
Overview
Compound aggregations require intermediate calculations or multiple steps, each implemented as a simple composable aggregation. They break complex aggregations into component parts that can be maintained incrementally.
Motivation: When and why to use compound aggregations
Use compound aggregations when:
- You need to compute averages (
AVG, WEIGHTED_AVG), which cannot be computed directly incrementally - You need to aggregate arrays or lists that require union and sorting operations
- Your aggregation requires multiple intermediate calculations
Compound aggregations use:
- Component parts (like sum and count for averages)
- Intermediate entity vars maintained incrementally
- Final calculations that combine components

Tip:
If compound aggregations don’t fit your use case, consider
Incremental SQL Models for stateful processing or complex transformations.
Real-world examples
Here are common use cases that map to compound aggregations:
| Use case | Pattern | Example |
|---|
| Average order value | AVG using sum/count | Average purchase amount across all orders |
| Average session duration | AVG using sum/count | Average time spent per session |
| Weighted average rating | WEIGHTED_AVG using weighted sum/weight sum | Average product rating weighted by number of reviews |
| List of countries visited | Array aggregation | Sorted list of unique countries a user has visited |
| Array of product IDs | Array aggregation | List of all product IDs a user has purchased |
Average (AVG) using component features
AVG cannot be computed directly incrementally. Express it using sum and count components:
- Sum component: Maintain sum incrementally using
SUM with SUM merge - Count component: Maintain count incrementally using
COUNT with SUM merge (counts from previous checkpoints are summed together) - Average calculation: Divide sum by count in a final entity var with division by zero protection
Example: Average order value
- entity_var:
name: sum_of_order_values
select: SUM({{orders.amount}})
merge: SUM(rowset.sum_of_order_values)
from: inputs/orders
is_feature: false
description: Sum of order values (used for average calculation)
- entity_var:
name: count_of_orders
select: COUNT({{orders.order_id}})
merge: SUM(rowset.count_of_orders)
from: inputs/orders
is_feature: false
description: Count of orders (used for average calculation)
- entity_var:
name: avg_order_value
select: "CASE WHEN {{user.count_of_orders}} > 0 THEN {{user.sum_of_order_values}} / {{user.count_of_orders}} ELSE 0 END"
description: Average order value across all orders
How it works:
- Sum component: Maintains total sum incrementally using
SUM with SUM merge - Count component: Maintains total count incrementally using
COUNT with SUM merge (counts from previous checkpoints are summed together) - Average calculation: Divides sum by count using
{{user.var_name}} syntax, with division by zero protection

Tip:
Mark sum and count components with is_feature: false since they’re intermediate calculations. The average is computed on-the-fly from the incrementally maintained components.
Workflow
Input Data → [Sum Component (Incremental)] → Final Calculation
→ [Count Component (Incremental)] ↗
- Profiles creates intermediate tables for sum and count components
- Both components are maintained incrementally using simple aggregation patterns
- The average is computed by dividing sum by count (with division by zero protection)
Weighted Average (WEIGHTED_AVG) using component features
WEIGHTED_AVG cannot be computed directly incrementally. You can express it using weighted sum and sum of weights components:
- Weighted sum component: Maintain sum of (value × weight) incrementally using
SUM with SUM merge - Sum of weights component: Maintain sum of weights incrementally using
SUM with SUM merge - Weighted average calculation: Divide weighted sum by sum of weights in a final entity var with division by zero protection
Example: Weighted average product rating
- entity_var:
name: weighted_sum_of_ratings
select: SUM({{reviews.rating}} * {{reviews.review_count}})
merge: SUM(rowset.weighted_sum_of_ratings)
from: inputs/reviews
is_feature: false
description: Sum of (rating × review_count) for weighted average calculation
- entity_var:
name: sum_of_review_counts
select: SUM({{reviews.review_count}})
merge: SUM(rowset.sum_of_review_counts)
from: inputs/reviews
is_feature: false
description: Sum of review counts (weights) for weighted average calculation
- entity_var:
name: weighted_avg_rating
select: "CASE WHEN {{user.sum_of_review_counts}} > 0 THEN {{user.weighted_sum_of_ratings}} / {{user.sum_of_review_counts}} ELSE 0 END"
description: Weighted average product rating across all reviews
How it works:
- Weighted sum component: Maintains sum of (value × weight) incrementally using
SUM with SUM merge
-Sum of weights component: Maintains total sum of weights incrementally using SUM with SUM merge - Weighted average calculation: Divides weighted sum by sum of weights using
{{user.var_name}} syntax, with division by zero protection

Tip:
Mark weighted sum and sum of weights components with is_feature: false since they’re intermediate calculations. The weighted average is computed on-the-fly from the incrementally maintained components.
Workflow
Input Data → [Weighted Sum Component (Incremental)] → Final Calculation
→ [Sum of Weights Component (Incremental)] ↗
- Profiles creates intermediate tables for weighted sum and sum of weights components
- Both components are maintained incrementally using simple aggregation patterns
- The weighted average is computed by dividing weighted sum by sum of weights (with division by zero protection)
Array aggregations
For array aggregations, use ARRAY_UNION_AGG or ARRAY_SORT in the merge property to combine arrays from previous checkpoints with new arrays.

Warehouse-specific syntax: Array aggregation syntax varies by warehouse. The examples in this section are specific to Snowflake. Adjust syntax for your warehouse.
Example: List of countries visited
- entity_var:
name: countries_list
select: ARRAY_AGG(DISTINCT {{tracks.country}}) WITHIN GROUP (ORDER BY {{tracks.country}})
merge: ARRAY_SORT(ARRAY_UNION_AGG(rowset.countries_list))
from: models/tracks
where: country IS NOT NULL
description: Sorted list of unique countries
How it works:
Select clause: Creates an array of distinct countries from new data, sorted
Merge property:
ARRAY_UNION_AGG(rowset.countries_list): Unions previous and new arrays, removing duplicatesARRAY_SORT(...): Sorts the final array
Array merge functions
| Select function | Merge function | Notes |
|---|
ARRAY_AGG(...) | ARRAY_UNION_AGG(rowset.feature_name) | Unions arrays, removing duplicates |
ARRAY_AGG(...) | ARRAY_SORT(ARRAY_UNION_AGG(rowset.feature_name)) | Unions and sorts arrays |
When compound aggregations don’t work
Compound aggregations may not be sufficient when:
- You need activity windows: Tracking recent activity over time windows (like MAU/DAU) requires Incremental SQL Models with Incremental Input Windows.
- You need per-object aggregations: Instead of one value per user, you need one value per (user_id, object_id) pair, where object_id could be product_id, location_id, or any other dimension. This requires Incremental SQL Models.
- You need complex business logic: Multi-step transformations with conditional logic may require Incremental SQL Models.

Input-vars cannot be built incrementally
An entity_var dependent on an input_var cannot be incremental. To enable incremental features, replace the input_var with an Incremental SQL Model and reference the model from the entity_var instead.
Best practices
- Break
AVG into sum and count components with separate merge properties - Break
WEIGHTED_AVG into weighted sum (value × weight) and sum of weights components with separate merge properties - Mark intermediate entity vars with
is_feature: false to avoid exposing them as features - Add division by zero protection when computing averages (use
CASE WHEN count > 0 THEN sum/count ELSE 0 END for AVG, or CASE WHEN sum_of_weights > 0 THEN weighted_sum / sum_of_weights ELSE 0 END for WEIGHTED_AVG) - Use
ARRAY_UNION_AGG for array aggregations to maintain deduplication - Be aware of warehouse-specific syntax differences for array operations
- Verify incremental results match full refresh results
Troubleshooting
| Issue | Solution |
|---|
| Average calculation returns incorrect values | Verify that sum and count components are correctly maintained incrementally and include division by zero protection |
| Weighted average calculation returns incorrect values | Verify that weighted sum (value × weight) and sum of weights components are correctly maintained incrementally and include division by zero protection |
| Division by zero errors | Add CASE WHEN count > 0 THEN sum/count ELSE 0 END for AVG, or CASE WHEN sum_of_weights > 0 THEN weighted_sum / sum_of_weights ELSE 0 END for WEIGHTED_AVG |
| Array aggregation syntax errors | Check warehouse-specific syntax requirements (Snowflake, BigQuery, etc.) |
| Duplicate values in arrays | Ensure ARRAY_UNION_AGG is used in merge property to remove duplicates |
| Intermediate features appearing in output | Mark intermediate entity vars with is_feature: false |
See more
Questions? Contact us by Email or on
Slack