Version:

Input Var

Learn how to create and use input_var in your Profiles project.

3 minute read

An input_var adds a calculated column to an input table. It calculates a single value per row of the input table which can be used as an input for entity_vars.

Performance consideration: Input vars scan all input data on every run, which can be expensive for large datasets.
Consider using the following approaches for better performance:
entity_var with Incremental Features for aggregations like SUM, COUNT, MIN, MAX
Incremental SQL Models for complex transformations
Use input_var only when you need row-level calculations on the input table before entity-level aggregation (for example, window functions like ROW_NUMBER() or LAG()).

- input_var:
    name: input_var_name
    select: SQL statement
    from: inputs/table
    window:
        partition_by:
            - partition_column
    description: description

Field	Description
`name`	Name of the `input_var`.
`select`	SQL state, equivalent to `SELECT` statement.
`from`	Path of the input table, equivalent to `FROM` statement.
`where`	Filtering condition, equivalent to `WHERE` statement.
`window`	Adds `partition_by` and/or `order_by` keys for window functions
`description`	Description of the `input_var`.

Example

- input_var:
    name: page_count
    description: Adds column to a SQL model to get a page count per date.
    select: count(distinct url)
    from: models/pages_orderby_table
    window:
        partition_by:
            - profile_id
            - date

Migrate input vars to incremental alternatives

Input vars scan all input data on every run and block incremental processing for any entity var that depends on them. When migrating a project to incremental mode, you must convert or replace every input var.

Decision tree

For each input var, follow this decision tree to choose the best replacement:

Input Var
+-- Uses LAST_VALUE / FIRST_VALUE window function?
|   +-- Yes: Convert to entity_var using MAX_BY / MIN_BY
|
+-- Uses a simple aggregation with partition_by?
|   +-- Yes: Convert to entity_var with merge clause
|
+-- Requires complex multi-step logic?
    +-- Yes: Create an incremental SQL model and
        reference it from entity vars

Option 1: Convert to entity var (most common)

You can replace window functions like LAST_VALUE and FIRST_VALUE with MAX_BY and MIN_BY:

# Before (input var with LAST_VALUE)
- input_var:
    name: last_known_city
    select: last_value(city)
    from: inputs/events
    window:
      partition_by:
        - user_id
      order_by:
        - timestamp
    where: city IS NOT NULL

# After (entity var with MAX_BY + ordering helper)
- entity_var:
    name: last_known_city
    select: MAX_BY({{events.city}}, {{events.timestamp}})
    merge: MAX_BY({{rowset.last_known_city}}, {{rowset.last_known_city_by_param}})
    from: inputs/events
    where: {{events.city}} IS NOT NULL

- entity_var:
    name: last_known_city_by_param
    select: MAX({{events.timestamp}})
    merge: MAX({{rowset.last_known_city_by_param}})
    from: inputs/events
    where: {{events.city}} IS NOT NULL
    is_feature: false

For detailed patterns, see the following sections:

Option 2: Create an incremental SQL model

When the input var logic is too complex for an entity var (multi-step transformations, complex joins, or logic that truly requires row-level windowing), create an Incremental SQL Model that handles the computation incrementally, then reference it from entity vars.