Data Management

Manage your event data with our data retention options.

RudderStack provides a comprehensive data retention policy and options for opting in or out of data storage.

The retention options available for your event data depend on your RudderStack plan.

The following sections define different types of RudderStack data and provide steps on opting in to the right setup for your needs.

Data definitions

RudderStack does not permanently store any customer data except the following:

  • Aggregate “Count” data on Event Name, Event Type, Source ID, Destination ID.
  • Error codes.
  • RudderStack customers’ records (for example, usernames, billing-related details).
The storage options vary with the nature of the data and your RudderStack plan. See Plan-based retention options for more details.

All other customer data can be classified as either transient or non-transient. It may be stored either in your own location (for example, AWS) or by RudderStack for up to 7 days (or 30 days on the Enterprise plan).

RudderStack’s data retention policy defines data as they pertain to the primary components of its service - the data plane and control plane.

Transient customer data

Transient customer data is all in-transit data, that is, data stored for less than 3 hours as an essential part of delivering the RudderStack product experience. This data includes:

  • Data plane: Events that hit the RudderStack gateway. See data plane architecture for more details.
  • Control plane: The in-transit data captured in the Live Events tab of the RudderStack dashboard.

Non-transient customer data

Non-transient customer data can be defined as data that can persist for more than 3 hours only if configured by the RudderStack user. This includes:

  • Data plane: This includes processing errors and gateway dumps.

    • Processing errors: Events rejected at various stages of the data pipeline, including errors from user transformation, destination transformation (internal to RudderStack), and events rejected by the destination after 3 hours of retry attempts.
    • Gateway dumps: Raw data for every successfully-ingested event.
  • Control plane: Data in the reporting service (sample events and responses).
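
The 3-hour boundary between the two categories can be illustrated with a minimal Python sketch. The function name, the constant, and the choice to treat exactly 3 hours as non-transient are assumptions made for illustration only; they are not part of any RudderStack API.

```python
from datetime import timedelta

# Threshold taken from the definitions above. Treating exactly 3 hours as
# non-transient is an assumption of this sketch.
TRANSIENT_LIMIT = timedelta(hours=3)


def classify_customer_data(stored_for: timedelta) -> str:
    """Return 'transient' or 'non-transient' for a given storage duration."""
    if stored_for < timedelta(0):
        raise ValueError("storage duration cannot be negative")
    return "transient" if stored_for < TRANSIENT_LIMIT else "non-transient"


# An event held briefly at the gateway versus a gateway dump kept for days.
print(classify_customer_data(timedelta(minutes=20)))
print(classify_customer_data(timedelta(days=2)))
```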

Data retention options

RudderStack provides 3 options for your event data storage. To choose how you want to store the event data, follow these steps:

  1. Log in to your RudderStack dashboard.
  2. Go to Settings > Workspace > Data Management.
  3. Choose one of the 3 data storage options in the Data retention section.

The following sections explain the data retention options in detail.

1. Do not store event data

If you choose this option, RudderStack will not store any of your event data.


2. Store event data in your cloud storage

This is the recommended event storage option and is available in the Starter, Growth, and Enterprise plans. Selecting this option opens a modal where you can connect a storage bucket for your RudderStack data.

RudderStack supports storage via AWS, GCS, Azure, and MinIO if you select this option.

When connecting your cloud storage provider to RudderStack, you first need to create a storage bucket and configure the credentials that RudderStack uses to access it. The exact steps depend on your cloud provider.

3. Store event data in RudderStack cloud storage

Choosing this option allows RudderStack to store and delete your event data on a rolling 7-day basis. This is the default setting.


If you are on the Enterprise plan, RudderStack lets you store and delete your event data on a rolling 30-day basis.
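
To make "rolling basis" concrete, the following Python sketch computes the cutoff before which stored data falls outside the retention window. It only illustrates the arithmetic; the function and constant names are hypothetical and do not describe how RudderStack implements deletion internally.

```python
from datetime import datetime, timedelta, timezone

# Window lengths taken from the text above; the names are hypothetical.
DEFAULT_WINDOW_DAYS = 7
ENTERPRISE_WINDOW_DAYS = 30


def retention_cutoff(now: datetime, window_days: int = DEFAULT_WINDOW_DAYS) -> datetime:
    """Return the oldest timestamp still inside the rolling window."""
    return now - timedelta(days=window_days)


def is_outside_window(stored_at: datetime, now: datetime,
                      window_days: int = DEFAULT_WINDOW_DAYS) -> bool:
    """True if data stored at `stored_at` is older than the rolling window."""
    return stored_at < retention_cutoff(now, window_days)


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
stored_at = datetime(2024, 1, 20, tzinfo=timezone.utc)  # 11 days before `now`

print(is_outside_window(stored_at, now))                          # 7-day window
print(is_outside_window(stored_at, now, ENTERPRISE_WINDOW_DAYS))  # 30-day window
```

In this example the data is 11 days old, so it falls outside the 7-day window but remains inside the 30-day window.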


Sample event data

When the Sample event data setting is enabled, RudderStack stores and deletes sample events and responses on a rolling 30-day basis. This data may be helpful for debugging your events.

RudderStack does not consider the event name or event type to be Personally Identifiable Information (PII).

Plan-based retention options

Based on your plan, RudderStack provides different options for event storage, giving you the ability to enable or disable retention for the following kinds of data:

  • Sample events and responses: As mentioned above, RudderStack will store and delete sample events and responses on a rolling 30-day basis.
  • Raw event data: Events sent to RudderStack. This includes processing errors and gateway dumps.

The following table lists the storage options supported by each RudderStack pricing plan:

Data type | Free | Starter/Growth | Enterprise
Sample events/responses
Raw event data
  • No data storage
  • Connect your own cloud storage
  • RudderStack 7-day storage (default)
  • No data storage
  • Connect your own cloud storage
  • RudderStack 7-day storage (default)
  • RudderStack 30-day storage

Data governance

To enable the Event Audit API in the RudderStack dashboard:

  1. Go to Settings > Workspace and click the Data Management tab.
  2. Scroll down to the Data governance section and toggle on the Event audit API setting.
Only users with the Org Admin role have access to this setting.

When this setting is turned on, you can leverage the Event Audit API to create and manage your tracking plans. Use these plans to monitor and act on any non-compliant data coming into your RudderStack sources based on predefined rules.

See Event Audit API for more information.

Data privacy

RudderStack’s data privacy options let you safeguard your customers’ privacy by controlling who has access to the raw event data containing PII (Personally Identifiable Information). You can allow anyone on your team to access the PII or restrict access to a select list of members.

Only members with PII permissions can view the customers’ PII in the Live Events and error logs in your destination’s Events tab.

To set the PII permissions, follow these steps:

  1. In your RudderStack dashboard, go to Settings > Workspace > Data Management and scroll down to the Data Privacy section.

  2. Under Who can view restricted data?, select the appropriate option:

    • Anyone on your team: All the members in your workspace can view the raw event data containing PII.
    • Only people you select: Only specific members with access can view the raw data.
  3. To allow only specific members of your team to view the restricted data, click Only people you select, followed by Add member.
  4. Finally, select the workspace members from the drop-down and click Add Members.
If the admins are removed from the access list, RudderStack will restrict them from viewing the PII.
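
The access rules described above can be summarized in a short Python sketch. It is a hypothetical model of the two dashboard options, not RudderStack code; the class, constants, and member names are invented for illustration. Note that it gives admins no special bypass, matching the behavior described above.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the two "Who can view restricted data?" options.
ANYONE_ON_TEAM = "anyone_on_your_team"
ONLY_SELECTED = "only_people_you_select"


@dataclass
class PiiPolicy:
    mode: str = ANYONE_ON_TEAM
    allowed_members: set = field(default_factory=set)

    def can_view_pii(self, member: str, workspace_members: set) -> bool:
        """Return True if `member` may view raw event data containing PII."""
        if member not in workspace_members:
            return False  # Not part of the workspace at all.
        if self.mode == ANYONE_ON_TEAM:
            return True
        # In restricted mode only listed members qualify, admins included.
        return member in self.allowed_members


workspace = {"admin@example.com", "analyst@example.com"}
policy = PiiPolicy(mode=ONLY_SELECTED, allowed_members={"analyst@example.com"})

print(policy.can_view_pii("analyst@example.com", workspace))
print(policy.can_view_pii("admin@example.com", workspace))
```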
