Sync Data Using Models

Import data using RudderStack’s Models feature.

This guide applies to Reverse ETL sources configured using the Model option. If you have configured your Reverse ETL source using the Table option, refer to the Importing Data using Tables guide.

RudderStack’s Models feature lets you define and run custom SQL queries on your warehouse data and send the results to your specified destinations. You can create a model in the RudderStack dashboard and use it while connecting a Reverse ETL source to a destination.

RudderStack provides the following options to map your warehouse columns to specific destination fields while importing the data:

  • Map with Visualizer (Refer to the Visual Data Mapper guide for the list of the supported destinations.)
  • Map with JSON

This guide lists the JSON mapping settings required to import and sync data from your model to the specified destination.

Connecting a model to a destination

  1. Create a new model in the RudderStack dashboard by going to Activate > Models.
  2. Set up a Reverse ETL source by going to Collect > Sources. Select the warehouse used to create the model.
  3. Under Source type, select Model and configure the rest of the settings.
  4. Connect the Reverse ETL source to your destination. Configure the destination settings and click Continue.
  5. In the Data Mapping section, select the model created in step 1 from the dropdown.
  6. Follow the steps listed in Data import settings to complete the configuration.

Data import settings

The settings to import and sync data from your model are listed below:

  • Model: Select the required model from the dropdown.
The dropdown will only display the list of models corresponding to the Reverse ETL source you have configured. For example, only the BigQuery models will be listed for a BigQuery warehouse source.
  • Sync mode: Select the sync mode that RudderStack should use to sync your data. For more information on these modes, refer to the Sync Modes guide.
  • Primary Key: Select a column from the data returned by the model (specified above) to uniquely identify your records in the warehouse.
RudderStack uses the primary key column for diffing during incremental syncs. You can generate this column while creating the model. Combining the timestamp and user_id columns generally gives a good primary key.
  • Event Type: Select identify or track as the event type, depending on how you want to send the event data to the downstream destinations. If you select track, you also need to provide:

    • Event Name: Enter the event name that should be sent with all events to the downstream destinations.

    You can also send different event names by enabling the lookup event name by column setting and specifying, in the Event Name field, the column whose values should be used as the event name.
Refer to the Syncing Events guide for more information on sending the event data using the identify or track call.
  • Choose user identifier: Choose at least one user identifier, user_id or anonymous_id, from the dropdown.
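
The two Event Name options for track events described above can be pictured with a short Python sketch. This is only an illustration of the behavior as the guide describes it, not RudderStack's implementation: the function name resolve_event_name, its parameters, and the sample column name action are all hypothetical.

```python
from typing import Any, Mapping, Optional


def resolve_event_name(
    row: Mapping[str, Any],
    event_name_field: str,
    lookup_by_column: bool = False,
) -> Optional[str]:
    """Return the event name for one row of model output.

    event_name_field stands for the value entered in the Event Name field:
    a fixed name, or a column name when the lookup setting is enabled.
    (Hypothetical helper; the names are assumptions for illustration.)
    """
    if lookup_by_column:
        # Per-row name: read it from the named column of this row.
        value = row.get(event_name_field)
        return None if value is None else str(value)
    # Fixed name: every row gets the same event name.
    return event_name_field


# Hypothetical rows returned by a model query.
rows = [
    {"user_id": "u1", "action": "Order Completed"},
    {"user_id": "u2", "action": "Cart Abandoned"},
]

fixed = [resolve_event_name(r, "Warehouse Sync") for r in rows]
per_row = [resolve_event_name(r, "action", lookup_by_column=True) for r in rows]
```

With the lookup disabled, every row is sent under the same name. With it enabled, each row takes its name from the chosen column.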

You can also preview the data snippet that RudderStack will send to the destination. All the columns returned by the model's query are selected by default. However, you can retain only specific columns by searching for them and selecting or deselecting them. The resulting JSON is previewed on the right.

The JSON payload carries the user_id and anonymous_id from the columns selected in the Choose user identifier section. The traits are taken from the columns selected in the Column section.
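
As a rough illustration of how the preview is assembled, the following Python sketch builds an identify-style payload from one row. It assumes the key names user_id, anonymous_id, and traits mentioned above; the exact layout of the payload RudderStack generates may differ, and the function build_identify_preview and the sample columns are hypothetical.

```python
import json
from typing import Any, Dict, Iterable, Mapping, Optional


def build_identify_preview(
    row: Mapping[str, Any],
    selected_columns: Iterable[str],
    user_id_column: Optional[str] = None,
    anonymous_id_column: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a preview payload for one row (illustrative layout only)."""
    if user_id_column is None and anonymous_id_column is None:
        # The guide requires at least one user identifier.
        raise ValueError("choose at least one user identifier")

    payload: Dict[str, Any] = {"type": "identify"}
    if user_id_column is not None:
        payload["user_id"] = row[user_id_column]
    if anonymous_id_column is not None:
        payload["anonymous_id"] = row[anonymous_id_column]
    # Only the selected columns end up in the traits.
    payload["traits"] = {column: row[column] for column in selected_columns}
    return payload


# Hypothetical row from a model query; internal_score is deselected.
row = {
    "user_id": "u1",
    "anon_id": "a1",
    "email": "alex@example.com",
    "plan": "pro",
    "internal_score": 0.42,
}

preview = build_identify_preview(
    row,
    selected_columns=["email", "plan"],
    user_id_column="user_id",
    anonymous_id_column="anon_id",
)
print(json.dumps(preview, indent=2))
```

Deselecting a column, such as internal_score here, simply leaves it out of the traits.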

Add Constant

You can also use the Add Constant option to add a constant key-value pair that is always sent in the JSON payload.

The new constant appears in the table and also in the JSON preview, inside the traits.

You can also use dot notation to define a constant.
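
The sketch below shows one plausible reading of dot notation, in which each dot in the key introduces a nested level inside the traits. That interpretation is an assumption, as are the helper name add_constant and the sample keys; check the JSON preview in the dashboard for the actual result.

```python
from typing import Any, Dict


def add_constant(traits: Dict[str, Any], key: str, value: Any) -> Dict[str, Any]:
    """Set a constant in traits, treating dots in key as nesting levels.

    Hypothetical helper illustrating an assumed dot-notation behavior.
    """
    parts = key.split(".")
    target = traits
    for part in parts[:-1]:
        # Create intermediate objects as needed.
        target = target.setdefault(part, {})
        if not isinstance(target, dict):
            raise ValueError(f"'{part}' already holds a non-object value")
    target[parts[-1]] = value
    return traits


traits = {"email": "alex@example.com"}
add_constant(traits, "source", "warehouse")      # plain key
add_constant(traits, "address.country", "US")    # dot notation
```

Under this assumption, the key address.country produces a nested address object containing country, while source stays a top-level trait.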

Once you have finalized the configuration, click Save.

Updating an existing configuration

To update an existing configuration, follow these steps:

  1. Go to the Schema tab of your configured source.
  2. Click the Update button.
  3. Update your column selection.
When updating an existing configuration, you can only change the existing mappings. The Model, Sync mode, and User identifier fields are not editable.
  4. Finally, click the Save button.
After updating the configuration, the next sync will be a full sync.
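
To make the distinction concrete, the following Python sketch contrasts a full sync with the primary-key diffing that incremental syncs rely on. It is a simplified model and not RudderStack's actual algorithm: the function diff_by_primary_key and the sample rows are hypothetical, and deleted rows are not handled.

```python
from typing import Any, Dict, List


def diff_by_primary_key(
    previous_rows: List[Dict[str, Any]],
    current_rows: List[Dict[str, Any]],
    primary_key: str,
) -> List[Dict[str, Any]]:
    """Return rows that are new or changed since the previous sync."""
    # Index the previous snapshot by primary key for constant-time lookups.
    previous = {row[primary_key]: row for row in previous_rows}
    changed = []
    for row in current_rows:
        # A missing key (new row) or differing values (updated row) counts.
        if previous.get(row[primary_key]) != row:
            changed.append(row)
    return changed


# Hypothetical model output at two consecutive syncs.
previous = [
    {"user_id": "u1", "plan": "free"},
    {"user_id": "u2", "plan": "pro"},
]
current = [
    {"user_id": "u1", "plan": "pro"},   # updated
    {"user_id": "u2", "plan": "pro"},   # unchanged
    {"user_id": "u3", "plan": "free"},  # new
]

incremental = diff_by_primary_key(previous, current, "user_id")
full = list(current)  # a full sync sends every row, changed or not
```

In this model, the incremental sync sends only the rows for u1 and u3, while the full sync that follows a configuration update sends all three.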
