Incremental SQL Models
Beta
Learn about incremental SQL models and how to use them for complex stateful processing in Profiles.
Overview
Incremental SQL models are SQL models that dereference their own previous materialization to enable incremental processing. They enable complex patterns requiring stateful processing, running totals, or multi-step transformations that cannot be expressed using simple incremental entity vars.
See Incremental SQL Models Overview for conceptual information.
What are incremental SQL models?
An incremental SQL model is a SQL-template model that:
Dereferences a previous materialization of itself using this.DeRef() with pre_existing=true
Produces the same final result whether the previous state is available or not (idempotence)
Operates with two checkpoints:
- Current checkpoint: The checkpoint being produced now
- Baseline checkpoint: A previous checkpoint represented by the previous materialization
When the previous state is present, the model merges its prior state with new/changed inputs; otherwise it runs a full scan.
Motivation: When and why to use incremental SQL models
Use incremental SQL models when:
- Simple aggregations don’t work: Your use case requires stateful processing that cannot be expressed with SUM, MIN, MAX, or COUNT in your merge property
- Compound aggregations don’t work: Your use case requires complex transformations beyond averages or arrays
- You need activity windows: Tracking recent activity over time windows (like MAU/DAU)
- You need window functions or composite key aggregations: Per-dimension aggregations or window-based calculations
- You need multi-step transformations: Complex business logic that requires multiple steps or conditional processing
Incremental SQL models let you:
- Reference previous computation results using this.DeRef()
- Maintain stateful aggregations across runs
- Implement complex business logic with full SQL control
- Handle both incremental and full refresh modes

Tip:
For simple aggregations like SUM, MIN, MAX, and COUNT, prefer Simple Aggregations with the merge property. For averages and arrays, use Compound Aggregations. Incremental SQL models are best for complex scenarios.
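For contrast, a simple aggregation entity var looks like the following sketch. The input and column names are illustrative assumptions, and the merge property that makes it incremental is described in the Simple Aggregations guide:

```yaml
# Illustrative sketch only: inputs/orders and amount are assumed names.
# Add the merge property per the Simple Aggregations guide to make it incremental.
- entity_var:
    name: total_order_amount
    select: SUM(amount)
    from: inputs/orders
    description: Lifetime order amount per user
```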
Real-world examples
Here are common use cases that map to incremental SQL models:
| Use case | Pattern | Example |
|---|---|---|
| Monthly Active Users (MAU) | Incremental Input Window Models | Count of users active in the past 30 days |
| Daily Active Users (DAU) | Incremental Input Window Models | Count of users active today |
| Account balances by wallet | Composite Key Models | Account balance per (user_id, wallet_id) composite key maintained across transactions - each user has multiple wallets with individual balances |
| Recent activity tracking | Incremental Input Window Models | Tracking recent interactions within a time window |
Incremental SQL models subcategories
Incremental SQL models can be categorized into two main patterns:
Incremental input window models
When to use: Tracking recent activity or maintaining rolling windows of data.
Characteristics
- Maintain a rolling window of recent interactions (e.g., last 90 days of activity)
- Track activity over time windows (MAU, DAU, weekly active users)
- Filter and maintain only recent data incrementally
Example: Monthly Active Users (MAU) or Daily Active Users (DAU), where you need to track which users were active in a recent time window. A complete MAU example appears later in this guide.
Composite key models
When to use: Per-dimension aggregations where you need one value per (entity_id, dimension_id) composite key pair.
Characteristics
- Maintain aggregations per (entity_id, dimension_id) composite key instead of just per entity_id
- Handle multi-dimensional state with composite key granularity
- Support complex logic where each entity has multiple related objects to track
Example: Account balances by wallet where each user (entity_id) has multiple wallets (dimension_id), and you need to maintain a separate balance for each wallet. This is a composite key pattern common in financial applications.
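This guide does not include a full composite key example, but a hedged sketch of the SQL template for per-wallet balances might look like the following. It reuses the DeRef pattern from the basic template shown in the next section; the transactions input and the user_id, wallet_id, and amount columns are illustrative assumptions, not a prescribed implementation:

```sql
{%- set lastThis = this.DeRef(
      pre_existing=true,
      dependency="optional",
      checkpoint_name="baseline") -%}
{% set txns_delta = this.DeRef("inputs/transactions/incr_delta_tbl",
      dependency="coercive",
      baseline_name="baseline",
      prereqs=[lastThis]) %}
{% set txns_full = this.DeRef("inputs/transactions",
      prereqs=[txns_delta.Except()]) %}
SELECT user_id, wallet_id, SUM(amount) AS balance
FROM (
  {%- if lastThis.IsEnabled() %}
  -- Incremental mode: treat each prior balance as one synthetic transaction
  SELECT user_id, wallet_id, balance AS amount FROM {{ lastThis }}
  UNION ALL
  SELECT user_id, wallet_id, amount FROM {{ txns_delta }}
  {%- else %}
  -- Full refresh mode: aggregate every historical transaction
  SELECT user_id, wallet_id, amount FROM {{ txns_full }}
  {%- endif %}
)
GROUP BY user_id, wallet_id
```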
How incremental SQL models work
Incremental SQL models follow this workflow:
- Reference previous state using this.DeRef() with pre_existing=true
- Reference delta models that represent changes since the baseline checkpoint
- Collect new rows from upstream sources that arrived since the baseline checkpoint
- If previous state exists, merge it with new data; otherwise perform a full scan
- Materialize the result as a table, view, or ephemeral model
Key advantages
Profiles handles the complexity for you:
- Automatic creation: Delta models are automatically generated for append-only inputs
- Automatic cleanup: Old materials are cleaned up based on retention periods
- Automatic build order: Dependency resolution ensures correct execution order
- No manual backfill required: Full refresh mode handles first run automatically
- Per-model incremental base: If one entity var changes, others can still run incrementally
- Efficient time management: Large changes may take longer, but subsequent runs remain fast
Basic template structure
```yaml
models:
  - name: example_incremental_model
    model_type: sql_template
    model_spec:
      materialization:
        output_type: table
      single_sql: |
        {%- set lastThis = this.DeRef(
              pre_existing=true,
              dependency="optional",
              checkpoint_name="baseline") -%}
        {% set events_delta = this.DeRef("inputs/events/incr_delta_tbl",
              dependency="coercive",
              baseline_name="baseline",
              prereqs=[lastThis]) %}
        {% set events_full = this.DeRef("inputs/events",
              prereqs=[events_delta.Except()]) %}
        {# All DeRef calls go before this line. Do not put DeRef inside conditionals: DeRef marks dependencies, so use prereqs for conditional dependencies. #}
        {% if lastThis.IsEnabled() && events_delta.IsEnabled() %}
        -- Incremental mode: merge with baseline
        SELECT * FROM {{ lastThis }}
        UNION ALL
        SELECT * FROM {{ events_delta }}
        {% else %}
        -- Full refresh mode: process all data
        SELECT * FROM {{ events_full }}
        {% endif %}
      contract:
        is_event_stream: true
        is_append_only: false
```

Tip:
Note the following:
- Incremental SQL models work on append-only inputs: The upstream inputs must have contract: is_append_only: true to enable delta model generation.
- Also works on append-only SQL models: Incremental SQL models can build on other append-only SQL models, even ephemeral ones.
- Set is_append_only: false for incremental SQL models that merge or update data.
- Set is_append_only: true only if your model truly only appends new rows without modifying existing ones.
- Profiles also supports keeping the SQL template in a separate file and referencing it using {% exec %} {{ this.ReadFile("file/path") }} {% endexec %}, as shown in the sketch below.
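A minimal sketch of the external-file approach; the file path is a hypothetical example:

```yaml
models:
  - name: example_incremental_model
    model_type: sql_template
    model_spec:
      materialization:
        output_type: table
      single_sql: |
        {% exec %} {{ this.ReadFile("sql/example_incremental_model.sql") }} {% endexec %}
```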
Key concepts
This section explains the key concepts of incremental SQL models.
Previous state reference
Use this.DeRef() with pre_existing=true to reference the previous materialization of the current model:
```sql
{%- set lastThis = this.DeRef(
      pre_existing=true,
      dependency="optional",
      checkpoint_name="baseline") -%}
```
This looks for a previous materialization, uses it as a baseline checkpoint, and returns nil if no baseline exists (enabling full refresh mode).
See DeRef Reference for complete syntax and usage.
Collecting new data
When running incrementally, collect only the new data that arrived since the baseline checkpoint. Profiles handles this automatically for append-only inputs when you reference them with the appropriate baseline context: use a baseline_name that matches the checkpoint_name used for lastThis.
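For example, reusing the names from the basic template above:

```sql
{# baseline_name here must match the checkpoint_name used for lastThis #}
{% set events_delta = this.DeRef("inputs/events/incr_delta_tbl",
      dependency="coercive",
      baseline_name="baseline",
      prereqs=[lastThis]) %}
```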
Incremental vs. full refresh modes
- Incremental mode (when baseline exists): Merges previous state with new data, processing only changes since the last checkpoint.
- Full refresh mode (when no baseline exists): Processes all historical data, used on first run or when baseline is unavailable.
See the Checkpoints and Baselines guide for more information on these modes.
Example: Monthly Active Users (MAU)
This example demonstrates an incremental input window model pattern for tracking monthly active users, where an incremental SQL model maintains a rolling window of active days, and an entity var references it to compute MAU.
Input configuration (inputs.yaml):
```yaml
inputs:
  - name: events
    table: PROD.ANALYTICS.EVENTS
    contract:
      is_append_only: true
      is_event_stream: true
    occurred_at_col: timestamp
```
Incremental SQL model (sql_models.yaml):
```yaml
models:
  - name: recent_user_activity
    model_type: sql_template
    model_spec:
      occurred_at_col: timestamp
      materialization:
        output_type: table
        run_type: incremental
      single_sql: |
        {%- set lastThis = this.DeRef(
              pre_existing=true,
              dependency="optional",
              checkpoint_name="baseline") -%}
        {% set events_delta = this.DeRef("inputs/events/incr_delta_tbl",
              dependency="coercive",
              baseline_name="baseline",
              prereqs=[lastThis]) %}
        {% set events_full = this.DeRef("inputs/events",
              prereqs=[events_delta.Except()]) %}
        {# All DeRef calls go before this line. Do not put DeRef inside conditionals: DeRef marks dependencies, so use prereqs for conditional dependencies. #}
        WITH base AS (
          {%- if lastThis.IsEnabled() %}
          -- Incremental mode: merge previous active days with new events
          SELECT DISTINCT user_id, CAST(timestamp AS DATE) AS active_day
          FROM (
            SELECT user_id, timestamp FROM {{ events_delta }}
            UNION ALL
            SELECT user_id, active_day AS timestamp FROM {{ lastThis }}
          )
          {%- else %}
          -- Full refresh mode: process all events
          SELECT DISTINCT user_id, CAST(timestamp AS DATE) AS active_day
          FROM {{ events_full }}
          {%- endif %}
        )
        SELECT user_id, active_day
        FROM base
        WHERE active_day >= DATEADD('day', -365, COALESCE({{end_time}}, CURRENT_DATE()))
      ids:
        - select: user_id
          type: user_id
          entity: user
      contract:
        is_event_stream: true
        is_append_only: false
        with_entity_ids:
          - user
        with_columns:
          - name: user_id
          - name: active_day
```
This incremental SQL model has is_append_only: false because it maintains a rolling window that removes old data (older than 365 days). Incremental SQL models that merge or update data should set is_append_only: false — only set is_append_only: true if your model truly only appends new rows without modifying existing ones.
Entity variable (profiles.yaml):
```yaml
- entity_var:
    name: monthly_active_users
    select: COUNT(DISTINCT user_id)
    from: models/recent_user_activity
    where: active_day >= DATEADD('day', -30, {{end_time}})
    description: Number of distinct users active in the past 30 days
```
How it works
Input configuration: The events input has contract: is_append_only: true, enabling incremental processing.
Incremental SQL model: The recent_user_activity model:
- References its previous materialization using lastThis with pre_existing=true
- Collects new events from events_delta when running incrementally
- Merges previous active days with new event dates using UNION ALL and DISTINCT
- Filters to maintain only the last 365 days of activity
- Handles full refresh mode when no baseline exists
Entity var reference: The monthly_active_users entity var references the incremental SQL model and filters to the past 30 days, counting distinct users from the pre-computed model.
This pattern is efficient because the SQL model maintains state incrementally (only processing new events), and the entity var performs a simple aggregation on the pre-computed model.
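Because the model retains a 365-day window, the same materialization can back other window metrics. For example, a hypothetical daily_active_users entity var (the name is illustrative) only changes the filter:

```yaml
- entity_var:
    name: daily_active_users
    select: COUNT(DISTINCT user_id)
    from: models/recent_user_activity
    where: active_day >= DATEADD('day', -1, {{end_time}})
    description: Number of distinct users active in the past day
```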
Troubleshooting
| Issue | Solution |
|---|---|
| Model fails on first run | Ensure your model handles cases when lastThis.IsEnabled() is both false and true. Use events_full with prereqs=[events_delta.Except()] to ensure proper fallback when baseline doesn’t exist |
| New data not being processed | Verify that upstream inputs have contract: is_append_only: true configured and that your baseline_name matches the checkpoint_name used for lastThis |
| Incorrect merge results | Check that your merge logic correctly handles both existing and new entities. Use appropriate SQL joins, UNION ALL patterns, or ordering/de-duplication functions based on your use case |
| Performance not improving | Ensure your filtering logic correctly identifies only new data and that your merge logic is efficient. Verify that delta models are being used instead of full inputs |
| DeRef errors | See DeRef Reference for syntax and troubleshooting |
| SQL syntax errors | Note that SQL syntax varies by warehouse. DATEADD is Snowflake/Redshift syntax. For BigQuery, use DATE_SUB; for Databricks, use date_sub |
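For reference, the 30-day window predicate from the MAU example translates across common warehouse dialects roughly as follows (illustrative, with CURRENT_DATE standing in for {{end_time}}):

```sql
-- Snowflake
WHERE active_day >= DATEADD('day', -30, CURRENT_DATE())
-- Redshift
WHERE active_day >= DATEADD(day, -30, CURRENT_DATE)
-- BigQuery
WHERE active_day >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
-- Databricks
WHERE active_day >= date_sub(current_date(), 30)
```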
Avoiding cycles and infinite regress

Incremental features are powerful but need to be used carefully to avoid cycles and infinite regress.
Important: All DeRef before conditionals
All DeRef calls must be placed before any conditional logic (if statements). Profiles needs to discover all dependencies during the dependency discovery phase, which happens before SQL execution. If DeRef is inside a conditional, Profiles cannot determine dependencies correctly.
```sql
{# CORRECT: All DeRef at the top #}
{% set lastThis = this.DeRef(pre_existing=true, dependency="optional") %}
{% set events_delta = this.DeRef("inputs/events/incr_delta_tbl", prereqs=[lastThis]) %}
{% if lastThis.IsEnabled() %}
  {# Use materials here #}
{% endif %}

{# WRONG: DeRef inside conditional - breaks dependency discovery #}
{% if some_condition %}
  {% set material = this.DeRef("some/model") %} {# ERROR! #}
{% endif %}
```
Common mistakes and how to avoid them:
Infinite regress: Self pre-existing DeRef must be optional
```sql
{# WRONG: mandatory self-reference creates infinite regress #}
{% set lastThis = this.DeRef(pre_existing=true, dependency="normal") %}

{# CORRECT: self-reference must be optional #}
{% set lastThis = this.DeRef(pre_existing=true, dependency="optional", checkpoint_name="baseline") %}
```
Why: Without dependency="optional", Profiles tries to create an infinite historical chain of materials.
Cycle: Missing pre_existing=true flag
```sql
{# WRONG: missing pre_existing creates cycle #}
{% set lastThis = this.DeRef(checkpoint_name="baseline") %}

{# CORRECT: must specify pre_existing=true #}
{% set lastThis = this.DeRef(pre_existing=true, dependency="optional", checkpoint_name="baseline") %}
```
Why: Without pre_existing=true, the model depends on itself in the same time context, creating a cycle.
Inefficient: Delta DeRefs must be conditional
```sql
{# INEFFICIENT: delta runs even in full refresh mode #}
{% set events_delta = this.DeRef("inputs/events/incr_delta_tbl", baseline_name="baseline") %}

{# CORRECT: delta only runs when baseline exists #}
{% set lastThis = this.DeRef(pre_existing=true, dependency="optional", checkpoint_name="baseline") %}
{% set events_delta = this.DeRef("inputs/events/incr_delta_tbl",
      dependency="coercive",
      baseline_name="baseline",
      prereqs=[lastThis]) %}
```
Why: Without prereqs=[lastThis], the delta model runs even during a full refresh, wasting compute. Var tables may also run unnecessarily.
Profiles protections:
Profiles has built-in protections to catch these issues:
- Cycle detection: Graph validation catches same-time-context cycles during dependency resolution
- Infinite regress detection: If >10,000 temporary sequence numbers are created, Profiles throws an error with diagnostic information
- Helpful error messages: Errors include the call chain, recent material allocations, and suggestions for fixing the issue
Debug tips:
- Check that all self-referencing DeRefs have dependency="optional"
- Verify all pre-existing references include pre_existing=true
- Ensure delta model DeRefs include prereqs=[lastThis] to make them conditional
- Use a baseline_name matching the checkpoint_name for consistency
Questions? Contact us by Email or on Slack.