Snowflake Sample Data

Sample Snowflake data you can use to run a Profiles project.

RudderStack provides a sample dataset for the Snowflake warehouse, available in the Snowflake marketplace. You can use this data to run the Profiles project and Propensity Scores through the UI or the CLI.

Note that:

  • The number of columns in this dataset is intentionally limited to make the dataset easy to understand.
  • All email addresses are generated randomly.
  • No PII is used in the dataset generation.

The following tables, properties, and user information are included in the dataset:

Tables

This dataset includes the following RudderStack event data tables and the approximate number of rows in each table:

| Table | Description | Number of rows |
| --- | --- | --- |
| PAGES | Page view events from anonymous and known users. | ~172k |
| TRACKS | Summarized tracked user actions (such as login, signup, and order_completed). | ~56k |
| IDENTIFIES | Associated with identify calls when the user provides a unique identifier. | ~22k |
| ORDER_COMPLETED | Detailed payloads from tracked order_completed events. | ~1.2k |
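
To compare the approximate row counts above against your own copy of the dataset, you could generate a single count query, as in the minimal sketch below. The table names come from the list above. The database and schema names (`MY_SAMPLE_DB`, `MY_SCHEMA`) and the helper `row_count_query` are hypothetical placeholders, so substitute the location where the marketplace data is mounted in your account.

```python
# Table names are taken from the dataset description above.
SAMPLE_TABLES = ("PAGES", "TRACKS", "IDENTIFIES", "ORDER_COMPLETED")


def row_count_query(database: str, schema: str, tables=SAMPLE_TABLES) -> str:
    """Build one SQL statement that returns a row count per table.

    `database` and `schema` are supplied by the caller; this sketch does
    not know where the sample data lives in your Snowflake account.
    """
    selects = [
        f"SELECT '{table}' AS table_name, COUNT(*) AS row_count "
        f"FROM {database}.{schema}.{table}"
        for table in tables
    ]
    # UNION ALL stacks the per-table counts into a single result set.
    return "\nUNION ALL\n".join(selects)


if __name__ == "__main__":
    # MY_SAMPLE_DB and MY_SCHEMA are hypothetical names, not real locations.
    print(row_count_query("MY_SAMPLE_DB", "MY_SCHEMA"))
```

You can paste the printed statement into a Snowflake worksheet. The function only builds the SQL text and does not connect to the warehouse.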

The events that generate these tables follow the pattern of a standard ecommerce conversion funnel (pageview, signup, order).

This data starts in June 2023 and remains valid until mid-2027, so that future users can still use it to run their Profiles projects.

Properties

This dataset includes a subset of the standard properties found in the RudderStack Warehouse Schema spec for each table. The required columns for running Profiles and Predictions projects are also present.

User information

The user data includes a subset of our standard properties for identify calls.

This dataset contains a total of ~10k unique users by anonymousId. About 5.8k of these unique users are known users (with an associated identify call).
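
The distinction between unique users and known users can be illustrated with a small sketch. It assumes each row carries an `anonymous_id` field (an assumed column name) and treats a user as known when their anonymous ID appears in at least one identify row. The function `summarize_users` and the toy rows are illustrative only and are not taken from the dataset.

```python
def summarize_users(event_rows, identify_rows):
    """Count unique and known users by anonymous ID.

    Both arguments are iterables of dicts with an "anonymous_id" key
    (an assumed column name). A user is "known" if their anonymous ID
    appears in at least one identify row.
    """
    known_ids = {row["anonymous_id"] for row in identify_rows}
    all_ids = {row["anonymous_id"] for row in event_rows} | known_ids
    return {
        "unique_users": len(all_ids),
        "known_users": len(known_ids),
        "anonymous_only": len(all_ids - known_ids),
    }


if __name__ == "__main__":
    # Toy rows for illustration; they are not part of the sample dataset.
    pages = [
        {"anonymous_id": "a1"},
        {"anonymous_id": "a2"},
        {"anonymous_id": "a2"},
        {"anonymous_id": "a3"},
    ]
    identifies = [{"anonymous_id": "a2"}, {"anonymous_id": "a3"}]
    print(summarize_users(pages, identifies))
```

Run against the full dataset, the same logic should give figures close to the ~10k unique users and ~5.8k known users stated above, provided the column names match your schema.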


