Amazon S3 destination

Send your event data from RudderStack to Amazon S3.

Amazon S3 (Simple Storage Service) is a cloud-based object storage service that lets customers and businesses store their data securely and at scale.

Find the open source transformer code for this destination in the GitHub repository.

Prerequisites

Before you set up S3 as a destination in RudderStack, make sure to set up your S3 bucket with the required permissions.

Setup

  1. From your RudderStack dashboard, add a source. Then, from the list of destinations, select Amazon S3.
  2. Assign a name to the destination and click Continue.

Connection settings

success
If you have already configured the AWS credentials in your RudderStack setup via the environment credentials or by following these steps, specifying only S3 Bucket Name and Prefix (optional but recommended) is sufficient to set up your S3 destination.
  • S3 Bucket Name: Enter your S3 bucket name.
  • Prefix: If specified, RudderStack creates a folder in the S3 bucket with this name and pushes all data within that folder. For example, s3://<bucket_name>/<prefix>/.
  • Role-based Authentication: This setting is toggled on by default and lets you use the RudderStack IAM role for authentication.
    • IAM Role ARN: Enter the ARN of the IAM role.
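Before saving the destination, it can help to sanity-check that the value pasted into IAM Role ARN has the general shape of an IAM role ARN. The following is a minimal Python sketch, not part of RudderStack. The function name `looks_like_role_arn` and the example ARNs are hypothetical, and the pattern checks only the format, not whether the role exists.

```python
import re

# General shape of an IAM role ARN:
#   arn:<partition>:iam::<12-digit account ID>:role/<optional path/><role name>
_ROLE_ARN = re.compile(
    r"arn:(aws|aws-cn|aws-us-gov):iam::\d{12}:role/[\w+=,.@/-]+",
    re.ASCII,
)


def looks_like_role_arn(value: str) -> bool:
    """Return True if value is formatted like an IAM role ARN."""
    return _ROLE_ARN.fullmatch(value.strip()) is not None


# Hypothetical values, for illustration only.
print(looks_like_role_arn("arn:aws:iam::123456789012:role/rudderstack-s3-writer"))
print(looks_like_role_arn("arn:aws:iam::123456789012:user/s3-copy"))
```

The second call returns False because the ARN identifies an IAM user, not a role.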

If Role-based Authentication is disabled, enter the AWS Access Key ID and AWS Secret Access Key to authorize RudderStack to write to your S3 bucket. For more information on obtaining these credentials, see the Permissions section.

info

Note the following:

  • Using Role-based Authentication is highly recommended as the access key-based authentication method is deprecated and will be discontinued soon.
  • In both the role-based and access key-based authentication methods, you need to set a policy specifying the required permissions for RudderStack to write to your S3 bucket.
  • If you are using your S3 bucket as an intermediary object storage for sending events to a warehouse destination, then see the S3 permissions for warehouse destinations.

  • Enable Server Side Encryption: When you enable this setting, RudderStack adds a header x-amz-server-side-encryption with the value AES256 to the PutObject request when sending the data to the S3 bucket. See Encryption with S3 managed keys for more information.
  • Consent settings: Specify the OneTrust category ID and/or Ketch purpose ID.

S3 bucket setup

  1. Go to your S3 Management console.
  2. Create a new bucket. Alternatively, you can choose an existing bucket.
info
It is recommended to create a new bucket for storing events coming from RudderStack.

Permissions

To send events to S3 successfully, you need to give RudderStack the necessary permissions to write to your bucket. You can choose any of the following approaches based on your company’s security policies and setup preferences:

Option 1: Use RudderStack IAM role

success
It is highly recommended to use this option for setting up the required S3 bucket permissions.

Use this approach if you plan to set up the S3 destination in RudderStack using Role-based Authentication.

Role based authentication
  1. Create a RudderStack IAM role.
  2. Use the following S3 permissions policy for creating the role:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::<S3_BUCKET_NAME>/*"
    }
  ]
}

Make sure to replace <S3_BUCKET_NAME> with the actual bucket name.

  3. After creating the role, note its IAM Role ARN and specify it when setting up your S3 destination.
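The placeholder substitution in the policy above can be scripted so that an unreplaced <S3_BUCKET_NAME> is never submitted by mistake. This is a minimal Python sketch under stated assumptions: the function name `build_put_object_policy` and the bucket name `my-rudderstack-events` are hypothetical, and the name check is a simplified version of the S3 bucket naming rules, not an exhaustive one.

```python
import json
import re

# Simplified bucket-name check (assumption: 3-63 characters; lowercase
# letters, digits, dots, and hyphens; starts and ends with a letter or digit).
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def build_put_object_policy(bucket_name: str) -> str:
    """Return the write-only policy shown above for the given bucket."""
    if not _BUCKET_NAME.fullmatch(bucket_name):
        raise ValueError(f"not a valid S3 bucket name: {bucket_name!r}")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Hypothetical bucket name, for illustration only.
print(build_put_object_policy("my-rudderstack-events"))
```

Passing the literal placeholder `<S3_BUCKET_NAME>` raises a ValueError, which catches the most common copy-paste mistake.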

Option 2: Create IAM user and provide credentials

danger

Note that:

  • Using Role-based Authentication (Option 1) is highly recommended as this method is now deprecated and will be discontinued soon.
  • AWS does not recommend access key credentials-based authentication.

Use this approach to set up the S3 destination in RudderStack using access key-based authentication.

Access key based authentication
info
If the AWS credentials are already configured on the instance where the RudderStack server is set up (see Option 4), you do not need to specify these credentials.
  1. Log in to your Amazon AWS IAM Console.
  2. Create an IAM user. Choose a policy that has write access to your bucket. Alternatively, you can create a new policy with the following permissions and attach it to the IAM user:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::<S3_BUCKET_NAME>/*"
    }
  ]
}

Make sure to replace <S3_BUCKET_NAME> with the actual bucket name.

  3. Return to the IAM dashboard and go to Users under Access management. Then, click on the newly-created user.
  4. Go to the Security credentials tab and scroll down to Access keys.
  5. Click Create access key, select the use case as per your requirement, and click Next.
  6. If required, set the Description tag value, and click Create access key.
  7. Note and secure the Access key and Secret access key. Use these credentials to set up your S3 destination in RudderStack.

Option 3: Allow RudderStack to write into bucket

warning

Use this option only if:

  • You are using RudderStack Cloud to set up your connection.
  • You want to allow RudderStack to write into your S3 bucket directly.

For this option, you can leave both the role-based authentication field (IAM Role ARN) and the access key-based authentication fields (AWS Access Key ID and AWS Secret Access Key) blank while setting up your S3 destination in RudderStack.

Add the following JSON in your bucket policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::422074288268:user/s3-copy"
      },
      "Action": ["s3:PutObject", "s3:PutObjectAcl"],
      "Resource": ["arn:aws:s3:::<S3_BUCKET_NAME>/*"]
    }
  ]
}

Make sure to replace <S3_BUCKET_NAME> with the actual bucket name.

By adding the above policy, the RudderStack user arn:aws:iam::422074288268:user/s3-copy will get the necessary permission to write into your bucket.
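A bucket holds a single policy document, so if your bucket already has a policy, the statement above needs to be merged into it instead of replacing it. The following Python sketch illustrates one way to do that on the parsed JSON. The function name `add_rudderstack_statement` and the bucket name in the usage lines are hypothetical; applying the resulting document to the bucket is left to the console or your own tooling.

```python
import copy
import json

# Principal taken from the bucket policy above.
RUDDERSTACK_PRINCIPAL = "arn:aws:iam::422074288268:user/s3-copy"


def add_rudderstack_statement(existing_policy, bucket_name):
    """Return a copy of existing_policy with the write statement appended.

    existing_policy is a parsed bucket policy (dict), or None when the
    bucket has no policy yet. The input is never modified.
    """
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": RUDDERSTACK_PRINCIPAL},
        "Action": ["s3:PutObject", "s3:PutObjectAcl"],
        "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
    }
    if existing_policy is None:
        return {"Version": "2012-10-17", "Statement": [statement]}

    merged = copy.deepcopy(existing_policy)
    statements = merged.get("Statement", [])
    # A policy may hold a single statement object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]
    if statement not in statements:  # avoid appending a duplicate
        statements = statements + [statement]
    merged["Statement"] = statements
    return merged


# Hypothetical bucket name, for illustration only.
print(json.dumps(add_rudderstack_statement(None, "my-rudderstack-events"), indent=2))
```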

Option 4: Self-hosted RudderStack

warning
Use this approach only if you are hosting RudderStack on your own instance and don’t want to follow the above options.
  1. Create a new IAM user and attach the below policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "*",
      "Resource": "arn:aws:s3:::*"
    }
  ]
}
  2. Add the following policy to your bucket. Replace ACCOUNT_ID, USER_ARN, and <S3_BUCKET_NAME> with the AWS account ID, the IAM user name (so that the Principal value forms the user's full ARN), and the S3 bucket name.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::ACCOUNT_ID:user/USER_ARN"
      },
      "Action": ["s3:PutObject", "s3:PutObjectAcl"],
      "Resource": ["arn:aws:s3:::<S3_BUCKET_NAME>/*"]
    }
  ]
}
  3. Return to the IAM dashboard and go to Users under Access management. Then, click on the newly-created user.
  4. Go to the Security credentials tab and scroll down to Access keys.
  5. Click Create access key, select the use case as per your requirement, and click Next.
  6. If required, set the Description tag value, and click Create access key.
  7. Note and secure the Access key and Secret access key.
  8. Add the above credentials to your RudderStack setup environment:
RUDDER_AWS_S3_COPY_USER_ACCESS_KEY_ID=<access_key_id>
RUDDER_AWS_S3_COPY_USER_ACCESS_KEY=<secret_access_key>
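To confirm that both variables are visible to the process that runs RudderStack, a small check like the one below can be run in the same environment. It is a sketch only: the function name `missing_credentials` is hypothetical, and it verifies that the variables are set and non-empty, not that the keys are valid.

```python
import os

# Variable names taken from the environment settings above.
REQUIRED_VARS = (
    "RUDDER_AWS_S3_COPY_USER_ACCESS_KEY_ID",
    "RUDDER_AWS_S3_COPY_USER_ACCESS_KEY",
)


def missing_credentials(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name, "").strip()]


missing = missing_credentials()
if missing:
    print("Missing: " + ", ".join(missing))
else:
    print("Both S3 credential variables are set.")
```

The check deliberately never prints the values themselves, so it is safe to run where output is logged.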

S3 permissions for warehouse destinations

If you’re using your S3 bucket as an intermediary object storage for a warehouse destination, then attach the below permissions policy depending on your use case:

warning

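Note that the following policy applies only to the role-based (Option 1) and access key-based (Option 2) authentication options. Options 3 and 4 use the bucket policies listed after it.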

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "s3:GetObject",
      "s3:PutObject",
      "s3:PutObjectAcl",
      "s3:ListBucket"
    ],
    "Resource": [
      "arn:aws:s3:::<S3_BUCKET_NAME>",
      "arn:aws:s3:::<S3_BUCKET_NAME>/*"
    ]
  }]
}

To allow RudderStack to write into your bucket directly (Option 3), use the following policy:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "AWS": "arn:aws:iam::422074288268:user/s3-copy"
    },
    "Action": [
      "s3:GetObject",
      "s3:PutObject",
      "s3:PutObjectAcl",
      "s3:ListBucket"
    ],
    "Resource": [
      "arn:aws:s3:::<S3_BUCKET_NAME>",
      "arn:aws:s3:::<S3_BUCKET_NAME>/*"
    ]
  }]
}

For self-hosted RudderStack (Option 4), use the following bucket policy in Step 2:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "AWS": "arn:aws:iam::ACCOUNT_ID:user/USER_ARN"
    },
    "Action": [
      "s3:GetObject",
      "s3:PutObject",
      "s3:PutObjectAcl",
      "s3:ListBucket"
    ],
    "Resource": [
      "arn:aws:s3:::<S3_BUCKET_NAME>",
      "arn:aws:s3:::<S3_BUCKET_NAME>/*"
    ]
  }]
}

Make sure to replace <S3_BUCKET_NAME> with the actual bucket name.
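When reusing a bucket that was first set up for the plain S3 destination, it is easy to forget the extra actions that the warehouse policies above require. The following Python sketch compares a parsed policy against that action list. The names `allowed_actions` and `missing_warehouse_actions` are hypothetical, and the check is intentionally simple: it matches action names literally and does not expand wildcards such as `s3:*`.

```python
# Actions listed in the warehouse policies above.
REQUIRED_WAREHOUSE_ACTIONS = {
    "s3:GetObject",
    "s3:PutObject",
    "s3:PutObjectAcl",
    "s3:ListBucket",
}


def allowed_actions(policy):
    """Collect the actions named in the Allow statements of a policy."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # single statement instead of a list
        statements = [statements]
    actions = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        action = statement.get("Action", [])
        actions.update([action] if isinstance(action, str) else action)
    return actions


def missing_warehouse_actions(policy):
    """Return the required actions the policy does not name, sorted."""
    return sorted(REQUIRED_WAREHOUSE_ACTIONS - allowed_actions(policy))


# The write-only policy from Option 1, with a hypothetical bucket name.
write_only = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-rudderstack-events/*",
        }
    ],
}
print(missing_warehouse_actions(write_only))
```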

Encryption

Amazon S3 provides encryption at rest. Objects are encrypted when they are saved to the bucket and decrypted when they are downloaded from S3.

S3 lets you set the default encryption behavior for a bucket. It encrypts the objects using server-side encryption with either Amazon S3 managed keys (SSE-S3) or AWS KMS-managed keys (SSE-KMS).

Set default encryption

  1. Log in to your S3 Management console and select your bucket.
  2. Go to the Properties tab and scroll down to Default encryption. Then, click Edit.
  3. Under Encryption key type, choose from Amazon S3 managed keys (SSE-S3) or AWS KMS-managed keys (SSE-KMS):
S3 default encryption

The following settings are applicable if you choose AWS KMS-managed keys (SSE-KMS) as the encryption key type:

KMS encryption configuration
info
You can choose an existing AWS KMS key, enter the ARN of an AWS KMS key, or create a new KMS key.
  4. Under Bucket Key, choose Enable and click Save changes.

For more information on setting the default encryption behavior for a bucket, see the S3 documentation.

AWS KMS keys

When the default encryption is set to AWS KMS-managed keys (SSE-KMS), S3 encrypts the objects using the customer managed keys (CMK) when they are uploaded to the bucket.

Create a new customer managed key

  1. Log in to the AWS Key Management Service (KMS) console.
KMS console login
  2. From the left sidebar, go to Customer managed keys and click Create key.
Create new CMK
  3. Under Key type, choose Symmetric. Under Key usage, select Encrypt and decrypt.
warning
S3 supports only symmetric CMKs.
Key type and usage
  4. Set an Alias for the key. You can also add a description or tags for the key as required.
Key alias and description
  5. Choose the IAM users or roles that can administer and use this key.
Key administration and usage
  6. Review the configuration and click Finish to create the customer managed key.
  7. Finally, set the default encryption for your S3 bucket as AWS KMS-managed keys (SSE-KMS) and select this customer managed key.

S3 managed keys

When you turn on the Enable Server Side Encryption dashboard setting while configuring your S3 destination, RudderStack adds an x-amz-server-side-encryption header with the value AES256 to all the PutObject requests. S3 then encrypts the object with the AES256 encryption algorithm. For more information, see S3 encryption with S3 managed keys.

warning
If you set the default encryption key type to Amazon S3 managed keys (SSE-S3), S3 encrypts the objects uploaded to the bucket with AES256 encryption, regardless of whether Enable Server Side Encryption is turned on in the RudderStack dashboard or whether the x-amz-server-side-encryption header is present in the PutObject requests.
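The interaction between the dashboard setting and the bucket default described in this section can be summarized in a few lines of Python. This is an illustrative sketch, not RudderStack's implementation: the function names and the `"SSE-S3"` / `"SSE-KMS"` labels for the bucket default are assumptions made for the example.

```python
SSE_HEADER = "x-amz-server-side-encryption"


def sse_headers(enable_server_side_encryption: bool) -> dict:
    """Headers added to a PutObject request for the dashboard setting."""
    if enable_server_side_encryption:
        return {SSE_HEADER: "AES256"}
    return {}


def encrypted_with_sse_s3(request_headers: dict, bucket_default: str) -> bool:
    """Return True if the object ends up encrypted with S3 managed keys.

    bucket_default is a hypothetical label for the bucket's default
    encryption key type: "SSE-S3" or "SSE-KMS".
    """
    return request_headers.get(SSE_HEADER) == "AES256" or bucket_default == "SSE-S3"


# Setting turned off, bucket default is SSE-S3: still AES256-encrypted.
print(encrypted_with_sse_s3(sse_headers(False), "SSE-S3"))
# Setting turned off, bucket default is SSE-KMS: the KMS key is used instead.
print(encrypted_with_sse_s3(sse_headers(False), "SSE-KMS"))
```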
