Complete reference for defining your SQL model resources using YAML configuration files.
Available Plans
free
starter
growth
enterprise
5 minute read
This guide serves as a detailed reference for the CLI project YAML files that contain definitions of your Reverse ETL SQL model resources.
Overview
In the context of the Rudder CLI (rudder-cli) tool, SQL model resources are defined using YAML files that specify the configuration for your Reverse ETL SQL model sources. These YAML files enable you to manage your SQL models through a Git-based workflow with version control and collaboration features.
The location and naming of these YAML files is flexible, as you can store the YAML files anywhere within the project’s root directory or subdirectories.
You can also group multiple SQL model resources in the same file, allowing structures that can best serve your project’s requirements.
The Rudder CLI tool processes all valid YAML files within the project structure to recognize the defined resources.
The following sections detail the specific YAML formats and parameter definitions for SQL model resources.
SQL model resources
You can define one or more SQL model resources in the YAML file by setting kind to retl-source-sql-model.
The spec parameter of the YAML file has the following structure:
Property
Type
Description
id Required
String
Unique identifier for the SQL model resource within the project. This parameter must be unique across all SQL model resources.
display_name Required
String
Display name for the SQL model resource as shown in the RudderStack dashboard.
primary_key Required
String
Column name to use as the primary key for the model. This column must exist in your SQL query results.
Whether the SQL model resource is active. Set to true to enable the resource.
account_id Required
String
The account ID for the data warehouse connection. You can obtain this using ./rudder-cli workspace accounts list.
description
String
Optional description for the SQL model resource.
sql
String
Inline SQL query definition.
Use the YAML literal block scalar `
file
String
Path to an external SQL file containing the query. Path is relative to the YAML file location.
You must use exactly one of sql or file — they are mutually exclusive. Use sql for inline queries or file to reference external SQL files.
Supported warehouse types
The source_definition field supports the following data warehouse types:
Warehouse
source_definition value
PostgreSQL
postgres
MySQL
mysql
Snowflake
snowflake
Amazon Redshift
redshift
Google BigQuery
bigquery
Databricks
databricks
Trino
trino
Example definitions
The following examples highlight the YAML configurations for SQL model resources defined using inline SQL and external SQL files. You can group multiple resources in the same YAML file to manage them together.
version:"rudder/v0.1"kind:"retl-source-sql-model"metadata:name:"product-purchases-model"spec:id:"product-purchases-model"display_name:"Product Purchases SQL Model"primary_key:"user_id"description:"SQL model for tracking product purchases over the last 180 days"source_definition:"postgres"enabled:trueaccount_id:"2rzPI1ARibivIvNH5DSpAKGTATy"sql:| SELECT
user_id,
event_name,
event_type,
timestamp,
properties,
product_id
FROM
user_events
WHERE
timestamp >= CURRENT_DATE() - INTERVAL 180 DAY
AND event_name = 'Product Purchased'
ORDER BY
timestamp DESC
version:"rudder/v0.1"kind:"retl-source-sql-model"metadata:name:"user-analytics-model"spec:id:"user-analytics-model"display_name:"User Analytics Model"primary_key:"user_id"description:"Comprehensive user analytics model"source_definition:"snowflake"enabled:trueaccount_id:"2rzPI1ARibivIvNH5DSpAKGTATy"file:"./queries/user-analytics.sql"
In this example, the SQL query would be stored in a separate file at ./queries/user-analytics.sql relative to the YAML file location.
version:"rudder/v0.1"kind:"retl-source-sql-model"metadata:name:"ecommerce-models"spec:- id:"product-views-model"display_name:"Product Views Model"primary_key:"user_id"description:"Model for tracking product page views"source_definition:"bigquery"enabled:trueaccount_id:"2rzPI1ARibivIvNH5DSpAKGTATy"sql:| SELECT
user_id,
product_id,
view_timestamp,
session_id
FROM
product_views
WHERE
view_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)- id:"cart-abandonment-model"display_name:"Cart Abandonment Model"primary_key:"session_id"description:"Model for identifying cart abandonment events"source_definition:"bigquery"enabled:trueaccount_id:"2rzPI1ARibivIvNH5DSpAKGTATy"file:"./queries/cart-abandonment.sql"
Best practices
When defining SQL model resources in YAML files, follow these best practices:
File organization
Logical grouping: Group related SQL models in the same YAML file when they share similar purposes or data sources.
Clear naming: Use descriptive file names that reflect the models’ purpose, such as user-analytics-models.yaml or ecommerce-models.yaml.
Directory structure: Organize SQL files in subdirectories (like ./queries/ or ./sql/) to keep your project structure clean.
Resource configuration
Unique IDs: Choose clear, descriptive IDs that reflect the model’s purpose and avoid conflicts.
Meaningful names: Use display names that clearly describe what the model does for dashboard users.
Documentation: Always include descriptions to help team members understand the model’s purpose.
External files: For complex queries, use the file option to keep SQL in separate files for better syntax highlighting and version control. When importing resources, use the --sql-location parameter if you prefer external SQL files.
Environment management
Variable substitution: Consider using variable substitutions for account_id values when working across multiple environments (development, staging, or production).
Environment-specific configurations: Maintain separate configuration files or use templating for environment-specific settings.
Query optimization
Primary key selection: Choose primary keys that are unique, stable, and efficiently indexed in your warehouse.
Performance considerations: Write efficient queries that minimize warehouse compute costs and execution time.
Data freshness: Include appropriate date filters to control data volume and ensure relevance.
This site uses cookies to improve your experience while you navigate through the website. Out of
these
cookies, the cookies that are categorized as necessary are stored on your browser as they are as
essential
for the working of basic functionalities of the website. We also use third-party cookies that
help
us
analyze and understand how you use this website. These cookies will be stored in your browser
only
with
your
consent. You also have the option to opt-out of these cookies. But opting out of some of these
cookies
may
have an effect on your browsing experience.
Necessary
Always Enabled
Necessary cookies are absolutely essential for the website to function properly. This
category only includes cookies that ensures basic functionalities and security
features of the website. These cookies do not store any personal information.
This site uses cookies to improve your experience. If you want to
learn more about cookies and why we use them, visit our cookie
policy. We'll assume you're ok with this, but you can opt-out if you wish Cookie Settings.