
Data Modeling

Model your unorganized and scattered warehouse data using RudderStack’s Profiles.

Profiles models your organizational data by analyzing all the data in your warehouse to create unified customer profiles and enrich them with features to help you scale your business efficiently and swiftly.

When you run the Profiles project, it creates an identity graph and feature views as outputs. You can augment the graph and create new user features by writing simple definitions in a configuration file or via SQL models.

Profiles data modeling

Highlights

  • Flexibility to use event stream, ETL, or any external tools as input sources.
  • Support to define various entities like user, product, organization, etc.
  • Intelligent merging of entities with different identifiers, like stitching Salesforce IDs.
  • Ease of creating features/traits for any entity and using them to deliver personalization.
  • Support to create core customer segments and activate them in downstream systems.
  • Support for advanced use cases through entity_vars and ML models.

Use varied input sources

RudderStack Profiles gives you the flexibility of using a variety of input sources. These sources are essentially the tables or views which you can create using:

  • Event Stream (loaded from event data)
  • ETL extract (loaded from Cloud Extract)
  • Existing tables in the warehouse (generated by external tools like dbt)

Define entities

An entity is any object for which you can create a profile. RudderStack lets you use any object as an entity, for example, a user, customer, product, or anything else that requires profiling.

You can define the entities in the pb_project.yaml file and use them declaratively while describing the columns of your input sources.

Unify entities

Once you define the entities, you can resolve different identities for an entity using the process of identity stitching. It matches the different identifiers across multiple devices, digital touchpoints, and other data (like offline point-of-sale interactions) to build a comprehensive identity graph. The identity graph includes nodes (identifiers) and their relationships (edges), and it is generated as a transparent table in the warehouse.

For example, you can stitch Salesforce IDs or other ID types.
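Conceptually, identity stitching amounts to finding connected components in a graph of identifiers. The following is a minimal Python sketch of that idea using union-find. It is not RudderStack's implementation: the `stitch` function, the edge list, and the identifier values are all hypothetical and serve only to illustrate how nodes and edges resolve into one cluster per entity.

```python
from collections import defaultdict


def stitch(edges):
    """Group identifiers linked by edges into clusters (one cluster per entity).

    Hypothetical helper for illustration; uses a plain union-find structure.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller identifier as root so the result is deterministic.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return dict(clusters)


# Hypothetical edges: each pair of (id_type, value) nodes was seen in the same row.
edges = [
    (("anonymous_id", "a1"), ("user_id", "u1")),
    (("user_id", "u1"), ("email", "x@example.com")),
    (("salesforce_id", "sf9"), ("email", "x@example.com")),
    (("anonymous_id", "a2"), ("user_id", "u2")),
]

identity_graph = stitch(edges)
for root, members in identity_graph.items():
    print(root, "->", sorted(members))
```

In this sketch the Salesforce ID joins the first cluster only because it shares an email with an already known user, which is the kind of indirect link stitching is meant to resolve.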

Enrich with features

Once you map all the available identifiers to an individual user or entity, it is easier to collect their traits and compute the user features you want in your customer 360 table.

Using the identity graph as a map, the Profiles entity var models let you define or perform calculations over the customer data in your warehouse. Each var materialises as a column in the entity_var table and represents a distinct feature. In addition, ML models can use the identity graph and other entity vars to create new features. Finally, the feature view model lets you unify entity vars and ML features into a single view.
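To make the "identity graph as a map" idea concrete, here is a small Python sketch of how per-entity features could be aggregated once every identifier resolves to a single entity. This only illustrates the concept. In Profiles these calculations are declared in configuration or SQL and run in the warehouse. The names `id_map`, `rows`, `compute_vars`, `total_spend`, and `last_seen` are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical output of identity stitching: identifier -> resolved entity ID.
id_map = {"a1": "user_1", "u1": "user_1", "a2": "user_2"}

# Hypothetical warehouse rows: (identifier, event_date, order_amount).
rows = [
    ("a1", "2024-01-03", 20.0),
    ("u1", "2024-02-10", 35.0),
    ("a2", "2024-01-15", 12.5),
]


def compute_vars(rows, id_map):
    """Aggregate one record per entity; each key plays the role of one entity var."""
    features = defaultdict(lambda: {"total_spend": 0.0, "last_seen": None})
    for identifier, event_date, amount in rows:
        entity = id_map[identifier]  # the identity graph acts as the lookup map
        record = features[entity]
        record["total_spend"] += amount
        # ISO-8601 date strings compare correctly as plain strings.
        if record["last_seen"] is None or event_date > record["last_seen"]:
            record["last_seen"] = event_date
    return dict(features)


feature_view = compute_vars(rows, id_map)
for entity, record in feature_view.items():
    print(entity, record)
```

Rows recorded under different identifiers (`a1` and `u1`) land on the same entity, so the features reflect the whole customer rather than a single device or login.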

You don’t need any other tool or deep technical/SQL expertise to create these features. Trait definition is in a single unified framework and there is no need to move data across silos.

To implement advanced use cases, you can use custom SQL queries to define user features.

Define cohorts

Using cohorts, you can define core customer segments in the warehouse via a simple YAML config, and the entire business can use them as a single source of truth. A cohort is a subset of instances of an entity that meet a specified set of characteristics, behaviours, or attributes. For example, if you have user as an entity, you can define cohorts such as known users, new users, or North American users.

Cohorts let you run targeted campaigns and analysis against specific customer segments. See Cohorts for more information.
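As a rough illustration, a cohort can be thought of as a filter over the instances of an entity. The Python sketch below mirrors the known users, new users, and North American users examples. It does not reflect the Profiles YAML syntax: the field names, the 30-day threshold, and the `cohort` helper are hypothetical.

```python
# Hypothetical feature rows, one per user entity.
users = [
    {"id": "user_1", "email": "x@example.com", "country": "US", "days_since_first_seen": 400},
    {"id": "user_2", "email": None, "country": "DE", "days_since_first_seen": 5},
    {"id": "user_3", "email": "y@example.com", "country": "CA", "days_since_first_seen": 12},
]

NORTH_AMERICA = {"US", "CA", "MX"}  # assumed country set for this example


def cohort(rows, predicate):
    """Return the IDs of the entity instances that satisfy the predicate."""
    return [row["id"] for row in rows if predicate(row)]


known_users = cohort(users, lambda u: u["email"] is not None)
new_users = cohort(users, lambda u: u["days_since_first_seen"] <= 30)  # assumed threshold
north_american_users = cohort(users, lambda u: u["country"] in NORTH_AMERICA)

print(known_users, new_users, north_american_users)
```

Defining each segment once, as a named predicate over the same feature rows, is what allows every team to reuse an identical definition.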

