Customer.io Cloud Extract Source

Sync data from Customer.io to your warehouse destination via RudderStack.

4 minute read

RudderStack’s Cloud Extract (ETL) product will be sunset on December 1, 2025. See this release note for more details.

Customer.io is a popular marketing platform for sending targeted emails and push and SMS notifications to improve conversions and customer engagement.

This document guides you in setting up Customer.io as a source in RudderStack. Once configured, RudderStack automatically ingests your Customer.io data and routes it to your specified data warehouse destination.

This source works only for the Customer.io accounts in US data-centers.

All the Cloud Extract sources support sending data only to a data warehouse destination.

Getting started

To set up Customer.io as a source in RudderStack, follow these steps:

Log into your RudderStack dashboard.
Go to Sources > New source > Cloud Extract and select Customer.io from the list of sources.
Assign a name to your source and click Continue.

Connection settings

To set up Customer.io as a Cloud Extract source, you need to configure the following settings:

App API Key: Enter your Customer.io API key which can be obtained in the Customer.io dashboard by navigating to Settings > Account Settings > API Credentials.
Cutoff Days: Enter the number of days after which RudderStack fetches the updated data.

Destination settings

The following settings specify how RudderStack sends the data ingested from Customer.io to the connected warehouse destination:

Table prefix: RudderStack uses this prefix to create a table in your data warehouse and loads all your Customer.io data into it.

Note that RudderStack does not add special characters like - or _ to the prefix by default. Hence, you need to specify it while setting the prefix.

Schedule Settings: RudderStack gives you three options to ingest the data from Customer.io:
- Basic: Runs the syncs at the specified time interval.
- CRON: Runs the syncs based on the user-defined CRON expression.
- Manual: You are required to run the syncs manually.

For more information on the schedule types, refer to the Common Settings guide.

Selecting the data to import

You can choose the Customer.io data you want to ingest by selecting the required resources:

The below table lists the syncs supported by the Customer.io resources to your warehouse destination:

Resource	Full Refresh sync	Incremental sync
`newsletters`	Yes	Yes
`campaigns`	Yes	Yes
`campaign_actions`	Yes	Yes

For more information on the Full Refresh and Incremental sync modes, refer to the Common Settings guide.

RudderStack ingests the Customer.io data using the Customer.io Beta API which limits the API requests by 10 requests per second.

Customer.io is now configured as a source. RudderStack will start ingesting data from Customer.io as per your specified schedule and frequency.

You can further connect this source to your data warehouse by clicking on Add Destination:

Use the Use Existing Destination option if you have an already-configured data warehouse destination in RudderStack. To configure a data warehouse destination from scratch, select the Create New Destination button.

FAQ

Is it possible to have multiple Cloud Extract sources writing to the same schema?

Yes, it is.

RudderStack associates a table prefix for every Cloud Extract source writing to a warehouse schema. This way, multiple Cloud Extract sources can write to the same schema with different table prefixes.

How does RudderStack count the events for Cloud Extract sources?

RudderStack counts the number of records returned by the source APIs when queried during each sync. It considers each record as an event.

How does RudderStack set the table name for the data sent via Cloud Extract sources?

RudderStack sets the table name for the resource you are syncing to the warehouse by adding rudder_ to the Table prefix you set while configuring your Cloud Extract source in the dashboard.

For example, if you set test_ as the Table prefix in the dashboard, RudderStack sets the table name as test_rudder_<resource_name>, where <resource_name> is the name of the resource you are syncing (for example, contacts, messages, etc.).

Note that RudderStack does not add the character _ to the prefix by default. Hence, you need to specify it while setting the prefix.

Was this page helpful?

Glad to hear it! Please tell us how we can improve.

Sorry to hear that. Please tell us how we can improve.

Questions? Contact us by email or on Slack