From the left panel, go to Directory > Sources > Reverse ETL. Then, select Amazon S3.
Assign a name and click Continue.
Configure the following settings so that RudderStack can authenticate with and access your S3 account:
Connection Mode: RudderStack provides the following options to connect to S3:
Cross-Account Role (recommended): This option lets you connect to S3 through an IAM access role. To do so, you need to first create an IAM role for RudderStack with the required permissions to access your S3 account. Refer to the Creating the RudderStack IAM Role for S3 section below for the detailed steps.
Access Key: This option lets you connect to S3 using your AWS access key ID and secret access key.
It is highly recommended to use the Cross-Account Role method for connecting to S3 as the Access Key method will be deprecated soon.
Account Name: Specify a name that will be used to identify the connection account.
AWS Access Key ID: If you select the Access Key connection mode for authenticating RudderStack, specify your AWS access key ID. For more information on obtaining your access key ID and secret access key, refer to the FAQ section below.
AWS Secret Access Key: Enter the corresponding secret access key.
The IAM role or the access keys (depending on your connection method) must have the minimum S3 permissions that RudderStack needs to read the data from your bucket.
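As a rough illustration of what a read-only permission set could look like, the following Python sketch builds an IAM policy document. The actions (`s3:ListBucket` and `s3:GetObject`), the helper name `build_read_only_policy`, and the bucket name are assumptions made for illustration. They are a typical read-only set, not RudderStack's documented minimum, so confirm the exact permissions before attaching a policy.

```python
import json


def build_read_only_policy(bucket: str, prefix: str = "") -> dict:
    """Build an IAM policy document granting read-only access to a bucket.

    ASSUMPTION: the actions below are a typical read-only set chosen for
    illustration; they are not taken from RudderStack's documentation.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    key_prefix = prefix.strip("/")
    # Object-level actions apply to keys, so scope them to the prefix if given.
    object_arn = f"{bucket_arn}/{key_prefix}/*" if key_prefix else f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Reading file contents is an object-level action.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [object_arn],
            },
        ],
    }


if __name__ == "__main__":
    # "my-example-bucket" and "RUDDER" are placeholder values.
    print(json.dumps(build_read_only_policy("my-example-bucket", "RUDDER"), indent=2))
```

Splitting the bucket-level and object-level actions into separate statements keeps each action paired with the resource type it applies to.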
Specify the Schedule Settings to schedule the data syncs from your S3 source.
RudderStack lets you schedule data syncs for your Reverse ETL sources and specify how and when the syncs will run. For more information on the Basic, CRON, and Manual schedule types, refer to the Sync Schedule guide.
Connecting to a destination
Once you have successfully set up your S3 source, you can connect it to your preferred destination by clicking the Add Destination button.
Specifying the data to import
While configuring the destination, specify the following bucket configuration settings needed for RudderStack to import the data and sync it to the connected destination:
S3 Bucket Name: Enter the name of the S3 bucket.
Prefix: Prefix refers to the path within your S3 bucket from where RudderStack will import the data. For example, if Prefix is set to RUDDER, then RudderStack will import the data stored in the location <your_s3_bucket>/RUDDER.
Your S3 bucket (or the prefix path within it, if specified above) should contain only Apache Parquet files, as RudderStack cannot extract any other file format. Also, the first row of each Parquet file should not have a null value for any column (empty strings are allowed). RudderStack uses this row to determine the correct schema of the file.
Choose user identifier: Choose a user identifier for user_id and/or anonymous_id from the dropdown.
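The bucket requirements above can be checked before connecting the source. The sketch below is a minimal pre-flight check in plain Python: it works on a list of object keys and on rows that have already been loaded as dictionaries, so it does not call S3 or read Parquet itself. All function names are hypothetical, and identifying Parquet files by their `.parquet` extension is an assumption made for illustration.

```python
from typing import Any, Iterable, Mapping, Sequence


def keys_under_prefix(keys: Iterable[str], prefix: str) -> list[str]:
    """Return the object keys stored under <bucket>/<prefix>."""
    normalized = prefix.strip("/")
    if not normalized:
        return list(keys)
    return [k for k in keys if k.startswith(normalized + "/")]


def non_parquet_keys(keys: Iterable[str]) -> list[str]:
    """Return keys that do not look like Parquet files.

    ASSUMPTION: files are identified by a ".parquet" extension.
    """
    return [k for k in keys if not k.lower().endswith(".parquet")]


def null_columns_in_first_row(rows: Sequence[Mapping[str, Any]]) -> list[str]:
    """Return the columns whose value in the first row is null (None).

    Empty strings are allowed, so only None is reported.
    """
    if not rows:
        return []
    return [column for column, value in rows[0].items() if value is None]


if __name__ == "__main__":
    # Placeholder keys and rows for illustration only.
    keys = ["RUDDER/users.parquet", "RUDDER/notes.txt", "other/data.parquet"]
    in_scope = keys_under_prefix(keys, "RUDDER")
    print("Non-Parquet files under prefix:", non_parquet_keys(in_scope))
    rows = [{"user_id": "u1", "email": "", "plan": None}]
    print("Null columns in first row:", null_columns_in_first_row(rows))
```

In practice, the key list could come from a bucket listing and the rows from any Parquet reader; both are left out here to keep the sketch self-contained.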
Once you specify the above settings, you can preview a snippet of your data.
Here, you can select all or only specific columns of your choice, search the columns by a keyword, and also edit the JSON Trait Key. You can also preview the resulting JSON on the right.
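To make the column selection and JSON Trait Key renaming concrete, the sketch below maps one source row to a JSON object. The function name `build_preview`, the sample columns, and the output shape (a `user_id` field plus a `traits` object) are hypothetical and chosen only for illustration; the actual preview format in the dashboard may differ.

```python
import json
from typing import Any, Mapping


def build_preview(
    row: Mapping[str, Any],
    identifier_column: str,
    trait_keys: Mapping[str, str],
) -> dict:
    """Map one source row to a JSON-ready dictionary.

    trait_keys maps each selected column name to its JSON trait key.
    Columns missing from trait_keys are treated as unselected and dropped.
    ASSUMPTION: the output shape is illustrative, not RudderStack's format.
    """
    traits = {
        trait_key: row[column]
        for column, trait_key in trait_keys.items()
        if column in row
    }
    return {"user_id": row[identifier_column], "traits": traits}


if __name__ == "__main__":
    # Placeholder row; "internal_flag" is not selected, so it is omitted.
    row = {
        "id": "u1",
        "email_address": "a@example.com",
        "plan": "pro",
        "internal_flag": 1,
    }
    selected = {"email_address": "email", "plan": "plan"}
    print(json.dumps(build_preview(row, "id", selected), indent=2))
```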
FAQ
How do I obtain my AWS access key ID and secret access key?
Sign in to the AWS Management Console. From the upper right corner, click your account name and go to Security Credentials. Your access key IDs are listed there. To create a new access key, click the Create access key button.