How Profiles Works

Learn how Profiles collects, unifies, and activates your data to enhance the overall customer experience.

RudderStack helps you build a complete CDP on top of your data warehouse in three stages: Collect, Unify, and Activate.

The following sections highlight RudderStack’s comprehensive solution at every stage to create a complete customer profile.

Profiles Overview


Collect

First, RudderStack collects and stores all the source data in your warehouse. This includes:

  • Event data, for example, user interaction data from web and mobile apps.
  • Data from cloud sources, for example, CRM platforms like Salesforce and support tools like Zendesk.

Known Data

RudderStack collects information from:

  1. First-party sources: The enterprise's own mobile applications, websites, POS systems, etc.
  2. Third-party apps: Salesforce CRM, Zendesk Support, ecommerce payments via Stripe, etc.

RudderStack collects this data against a known identifier, such as an email or user ID.

Unknown Data

This includes identifiers that are not tied to a known user, like the anonymousId captured by the RudderStack SDKs in web and mobile apps. It helps track user activity when the user is not logged in.

The difference between known and unknown data is that known data carries information that identifies the user. Data from cloud sources is always known. First-party data from event stream sources can be known or unknown, depending on whether the user was logged in.

As the data is collected, you can apply relevant transformations to it for compliance and security purposes such as data governance and privacy.

The following image shows a snapshot of the identifies and tracks tables that RudderStack uses to unify the data.
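The known/unknown distinction can be sketched in a few lines of Python. This is only an illustration: the userId and anonymousId field names follow the RudderStack event format, but the is_known helper and the sample events are hypothetical, and the rule (a non-empty userId means "known") is a simplification of the description above.

```python
def is_known(event: dict) -> bool:
    """Treat an event as 'known' when it carries a non-empty userId."""
    return bool(event.get("userId"))


# Hypothetical sample events: the same visitor before and after logging in.
events = [
    {"anonymousId": "anon-1", "userId": None, "event": "Product Viewed"},
    {"anonymousId": "anon-1", "userId": "user-42", "event": "Order Completed"},
]

labels = ["known" if is_known(e) else "unknown" for e in events]
print(labels)
```

Both events share the same anonymousId, which is what later allows the anonymous activity to be linked to the logged-in user.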

Identifies and Tracks tables for stitching


Unify

At this stage, Profiles takes over and does the following:

ID Stitching

Profiles stitches together all the known and unknown IDs into a single table. The IDs are linked using an autogenerated ID known as rudderId, which is akin to a golden record. Think of it as a one-to-many relationship, in which one rudderId maps to multiple values of other IDs such as user ID, anonymous ID, and email.

With a rudderId, you can easily identify that a customer who shopped on your website 6 months ago, anonymously browsed from mobile 4 months ago, and raised a complaint with the support team 2 months ago is actually the same customer. RudderStack represents these identifiers as different nodes/edges of the ID stitching graph.
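As a rough illustration of the stitching idea, the sketch below groups identifiers that were observed together into connected components using a union-find structure. This is a stand-in technique chosen for clarity, not a description of how Profiles implements stitching internally. The stitch function, the sample edges, and the "rid-N" labels are all hypothetical; real rudderId values are generated by Profiles.

```python
from collections import defaultdict


def stitch(edges):
    """Group identifiers linked by edges into clusters (one per customer)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)

    # Assign a hypothetical stand-in for rudderId to each cluster.
    ordered = sorted(clusters.values(), key=lambda ids: sorted(ids))
    return {f"rid-{i}": ids for i, ids in enumerate(ordered, start=1)}


# Each edge says "these two identifiers were seen on the same record".
edges = [
    (("anonymous_id", "anon-1"), ("user_id", "user-42")),
    (("user_id", "user-42"), ("email", "a@example.com")),
    (("anonymous_id", "anon-9"), ("user_id", "user-7")),
]

profiles = stitch(edges)
for rid, ids in profiles.items():
    print(rid, sorted(ids))
```

The first two edges share user-42, so the anonymous ID, user ID, and email collapse into one cluster, which mirrors the one-to-many relationship described above.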

Identity stitching

Feature Views

If an entity's features/traits are spread across multiple entity vars and ML models, you can use Feature Views to bring them together into a single view. These models are usually defined in the pb_project.yaml file by adding entries under the feature_views key of the corresponding entity.

Features generation

Feature Table

The entity vars specified in the project are unified into a view. A Feature Table is a unified customer profile containing useful information for each customer. Once all the known and unknown identities are stitched together, you can trace back activities for all such identifiers and aggregate them under the common rudderId. This is helpful in calculating features across all such interactions.

Some common use cases are computing the customer’s total LTV (lifetime value), purchase history, number of days a customer was active, etc.
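The aggregation step can be sketched as follows, assuming the stitching step has already produced a mapping from each identifier to its rudderId. The mapping, the event records, and the build_feature_table helper are hypothetical, and summed revenue is used here as a simple stand-in for LTV.

```python
from collections import defaultdict
from datetime import date

# Hypothetical output of ID stitching: identifier -> rudderId stand-in.
id_to_rudder = {"anon-1": "rid-1", "user-42": "rid-1", "user-7": "rid-2"}

# Hypothetical activity recorded under different identifiers.
events = [
    {"id": "anon-1", "day": date(2024, 1, 5), "revenue": 0.0},
    {"id": "user-42", "day": date(2024, 1, 5), "revenue": 30.0},
    {"id": "user-42", "day": date(2024, 3, 1), "revenue": 20.0},
    {"id": "user-7", "day": date(2024, 2, 10), "revenue": 15.0},
]


def build_feature_table(events, id_to_rudder):
    """Aggregate activity from all identifiers under their common rudderId."""
    revenue = defaultdict(float)
    active_days = defaultdict(set)
    for e in events:
        rid = id_to_rudder[e["id"]]
        revenue[rid] += e["revenue"]
        active_days[rid].add(e["day"])
    return {
        rid: {"total_revenue": revenue[rid], "days_active": len(active_days[rid])}
        for rid in active_days
    }


feature_table = build_feature_table(events, id_to_rudder)
print(feature_table)
```

Because anon-1 and user-42 resolve to the same rudderId, their activity on the same day counts as one active day rather than two.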


Activate

Activation is a two-step process. First, you create the target audience using RudderStack's Audiences feature. Then, you use Reverse ETL to route this audience information (also persisted in the same cloud warehouse) to downstream marketing tools like Braze, Mailchimp, etc.
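Conceptually, an audience is a filter over the unified profiles. The sketch below is not the Audiences feature or its API. It only illustrates the idea with a hypothetical feature table, hypothetical column names, and a hypothetical build_audience helper.

```python
# Hypothetical feature table keyed by a rudderId stand-in.
feature_table = {
    "rid-1": {"total_revenue": 50.0, "days_active": 2},
    "rid-2": {"total_revenue": 15.0, "days_active": 1},
}


def build_audience(feature_table, min_revenue):
    """Return the profiles whose total revenue meets the threshold."""
    return sorted(
        rid
        for rid, features in feature_table.items()
        if features["total_revenue"] >= min_revenue
    )


high_value = build_audience(feature_table, min_revenue=40.0)
print(high_value)
```

The resulting list of profiles is the kind of audience information that the second step would route to a downstream tool.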

Audience Builder / Cohorts

Additionally, you can use the feature tables as inputs to ML models for use cases like churn prediction, lifetime value (LTV) prediction, etc. You can also route the output of such models to marketing platforms via Reverse ETL.

As you keep collecting data, Profiles continues to unify and send it for activation.

Profiles lifecycle

In this way, you can make informed decisions, run personalized marketing campaigns, and enhance the overall customer experience across multiple platforms.
