May 09, 2022

Customer data is arguably the most valuable data that a company possesses. Understanding customer preferences and behaviors is the key to unlocking new value propositions, enabling better customer experiences, and discovering new revenue opportunities.

To understand their customers better, companies have sought to build a unified view of their customers using one of two approaches: (1) building and managing multiple data pipelines and enterprise infrastructure (like Hadoop clusters) to collect, store, and analyze customer data, or (2) purchasing a Customer Data Platform (CDP). The first approach remains the purview of sophisticated tech companies with massive engineering teams (think Google, Facebook, and Uber), while the second is the primary option for everyone else. In this article, we focus on the second approach.

When the term “CDP” was coined in 2013, one of the explicit goals of these tools, according to the CDP Institute, was to create “a persistent, unified customer database.” Since then, however, traditional CDPs have fallen short of that goal. Data continues to grow explosively, privacy regulations are proliferating, and cloud data warehouses, led by Snowflake, have matured, reducing storage and compute costs. These trends have led companies to move their most valuable data to cloud data warehouses. To illustrate the scale of this shift, Snowflake alone has almost 6,000 customers (including 48% of the Fortune 500). For traditional CDP implementations, the result is a new data silo: if a company already has its customer data in Snowflake, why store a duplicate of that data in a CDP? Herein lies the issue: in the modern data stack, the SaaS CDP creates the very problem it was meant to solve.


To be clear, the CDP is not the only data silo in the marketing data stack. Different teams require customer data delivered to the tools of their trade. For example, a product manager may need behavioral data to A/B test product features whereas a marketer may require the same data to personalize marketing emails. As these point-to-point integrations are implemented, each tool risks becoming a data silo. More importantly, it is very hard for these disparate tools to share data with each other, which is necessary for enabling use cases like triggering customer engagement (in a marketing automation tool) based on payment data (from a finance tool). Perhaps worst of all, organizations are forced to give up control of their customer data to multiple vendors.

In the connected apps model, software providers are turning the SaaS model for customer data on its head. Instead of forcing users to load their data into a third party tool, they are enabling functionality on top of the customer data where it already lives: the cloud data warehouse.

RudderStack, a warehouse-native customer data platform, is a great example of a connected app that runs on top of the Snowflake Data Cloud. Instead of offering another bloated customer data SaaS platform, RudderStack offers a suite of tools that lets users collect all of their customer data in their own warehouse, enrich it, then take action on insights with best-of-breed marketing tooling.

This is a significant shift in the way we think about tools like customer data platforms: RudderStack doesn’t store any data, because they don’t have to, which is a win for everyone. Users retain full control and ownership of their data and RudderStack can leverage the power of those users’ modern cloud data platforms.

Managing customer data in the connected apps model is enabling companies to solve what were traditionally very challenging customer data projects like identity resolution, comprehensive enrichment of customer records and advanced ML apps for personalization and recommendations. This model enables ad hoc analytics without limitations, collaboration with internal data teams, reporting in the enterprise BI tool of choice and combining customer data with additional business datasets.

Solving customer identity resolution in the connected apps model

As the customer journey has grown more complex and the world has become increasingly digitized, identifying customers across devices, brands and platforms is more difficult than ever.

Traditional customer data SaaS providers have struggled to keep up for several reasons:

  • First, they are generally built as UI-focused tools for marketers, not robust data platforms. So, while powerful, their products often can’t collect and process all types of customer data from across the stack.
  • Second, one-size-fits-all identity resolution solutions aren’t flexible enough to meet the needs of modern businesses that are increasingly demanding visibility into and control over the way their customer data is processed and modeled.

RudderStack as a connected app on the Snowflake ecosystem solves these problems. Anonymous and known users, as well as a comprehensive event-based view of their individual behaviors, are tracked by RudderStack’s cross-platform Event Stream SDKs, but all of the data lives in their users’ Snowflake environment. This provides not only control, but full visibility into the raw, cross-platform data.

RudderStack also leverages Snowflake’s power to compute a deterministic identity graph in their users’ Snowflake instance. This makes the logic fully transparent and configurable, and it provides a ready-made data foundation for data science teams who want to build predictive identity resolution models that meet the requirements of their unique business and customer journey.
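To make the idea of a deterministic identity graph more concrete, here is a minimal Python sketch. It is not RudderStack's implementation. It assumes that identity resolution can be modeled as finding connected components over pairs of identifiers that were observed together, and it uses a union-find structure to do so. The function name `resolve_identities`, the identifier formats, and the sample observations are all hypothetical.

```python
from collections import defaultdict


def resolve_identities(edges):
    """Group identifiers into profiles using exact (deterministic) links.

    `edges` is an iterable of (id_a, id_b) pairs. Each pair means the two
    identifiers were seen together, for example an anonymous ID and an
    email address on the same event. This input shape is an assumption
    made for illustration.
    """
    parent = {}

    def find(x):
        # Register unseen identifiers as their own root.
        parent.setdefault(x, x)
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Attach the larger root under the smaller one so the result
            # does not depend on the order in which edges arrive.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a

    for a, b in edges:
        union(a, b)

    clusters = defaultdict(set)
    for identifier in list(parent):
        clusters[find(identifier)].add(identifier)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical observations: two anonymous devices share one email,
# and a third device is tied to a known user ID.
edges = [
    ("anon:a1", "email:kim@example.com"),
    ("anon:b2", "email:kim@example.com"),
    ("anon:c3", "user:42"),
]
profiles = resolve_identities(edges)
for profile in profiles:
    print(profile)
```

Because the links are exact matches rather than probabilistic scores, the same input always yields the same profiles, which is what makes the logic easy to inspect and adjust. In a warehouse, the same grouping would typically be expressed in SQL over an edges table.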


Enriching and activating customer records

Many data enrichment solutions have been tailored to the specific needs of individual teams. For example, a sales team might add an app to their CRM system that automatically adds demographic information to contact records. While this solution helps the sales team, it remains difficult for other teams to leverage that enriched data.

In the connected apps model, enrichment happens centrally in the cloud data warehouse, and is then syndicated to tools across the stack through tools like RudderStack Reverse ETL. Here’s how it works:

Users enrich contact records within their own Snowflake instance using both their own first-party data (behavioral events, transactional data, inventory data, etc.) and third-party data (available directly via the Snowflake Data Marketplace). This enrichment goes far beyond just combining data. It also includes running algorithms on the combined data, from simple, rule-based SQL (e.g. total revenue for a user or the number of times they have logged in) to advanced ML models that predict churn. Ultimately, the enriched results are materialized in a table in the users’ Snowflake environment.
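As a rough illustration of the simple, rule-based end of that spectrum, the sketch below computes total revenue and login count per user and materializes the result as a table. It uses Python's built-in SQLite module as a local stand-in for a warehouse, so the SQL dialect is SQLite's rather than Snowflake's. The table names (`orders`, `logins`, `enriched_users`), column names, and sample rows are all hypothetical.

```python
import sqlite3

# An in-memory SQLite database stands in for the warehouse (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical first-party source tables with a few sample rows.
conn.executescript("""
    CREATE TABLE orders (user_id TEXT, amount REAL);
    CREATE TABLE logins (user_id TEXT, logged_in_at TEXT);
    INSERT INTO orders VALUES ('u1', 40.0), ('u1', 60.0), ('u2', 15.0);
    INSERT INTO logins VALUES
        ('u1', '2022-05-01'), ('u1', '2022-05-03'), ('u3', '2022-05-02');
""")

# Rule-based enrichment: one row per user, materialized as a table.
conn.executescript("""
    CREATE TABLE enriched_users AS
    SELECT u.user_id,
           COALESCE(r.total_revenue, 0.0) AS total_revenue,
           COALESCE(l.login_count, 0) AS login_count
    FROM (SELECT user_id FROM orders
          UNION
          SELECT user_id FROM logins) AS u
    LEFT JOIN (SELECT user_id, SUM(amount) AS total_revenue
               FROM orders GROUP BY user_id) AS r
           ON r.user_id = u.user_id
    LEFT JOIN (SELECT user_id, COUNT(*) AS login_count
               FROM logins GROUP BY user_id) AS l
           ON l.user_id = u.user_id;
""")

rows = conn.execute(
    "SELECT user_id, total_revenue, login_count "
    "FROM enriched_users ORDER BY user_id"
).fetchall()
for row in rows:
    print(row)
```

Users with orders but no logins (or the reverse) still get a row, with the missing measure defaulted to zero, so downstream tools see one complete record per user.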

RudderStack’s Reverse ETL pipeline connects to that computed table and sends updated contact records to the company’s entire stack, meaning that marketing, sales, customer success and other teams not only have the same complete set of customer data, but can also leverage the power of enriched data.
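The general shape of a reverse ETL sync can be sketched in a few lines of Python. This is an illustration of the concept only, not a description of how RudderStack Reverse ETL works internally. It assumes a sync that fingerprints each row of the computed table and forwards only rows that are new or changed since the previous run. The helper names (`row_fingerprint`, `plan_sync`), the `churn_risk` column, and the sample rows are hypothetical.

```python
import hashlib
import json


def row_fingerprint(row):
    """Return a stable hash of a row's contents."""
    encoded = json.dumps(row, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def plan_sync(rows, previous_fingerprints, key="user_id"):
    """Pick the rows that are new or changed since the last sync.

    Returns (changed_rows, new_fingerprints). The caller would send
    `changed_rows` to each downstream tool and persist `new_fingerprints`
    for the next run.
    """
    changed = []
    fingerprints = {}
    for row in rows:
        fingerprint = row_fingerprint(row)
        fingerprints[row[key]] = fingerprint
        if previous_fingerprints.get(row[key]) != fingerprint:
            changed.append(row)
    return changed, fingerprints


# First run: nothing has been synced yet, so every row is sent.
first_run = [
    {"user_id": "u1", "total_revenue": 100.0, "churn_risk": "low"},
    {"user_id": "u2", "total_revenue": 15.0, "churn_risk": "high"},
]
to_send, state = plan_sync(first_run, {})

# Second run: only u2 changed in the computed table.
second_run = [
    {"user_id": "u1", "total_revenue": 100.0, "churn_risk": "low"},
    {"user_id": "u2", "total_revenue": 45.0, "churn_risk": "medium"},
]
to_send_again, state = plan_sync(second_run, state)
print([row["user_id"] for row in to_send_again])
```

Because every destination is fed from the same computed table, each team's tool receives the same enriched record rather than its own partial copy.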

With this architecture in place, companies can use the full power of Snowflake and RudderStack to deliver value regardless of their data maturity level, from those looking to build a unified customer profile for BI to those looking to send personalized messages and recommendations based on predictive models. In a way, the connected apps model is the closest one can come to realizing the promise of the customer data platform.

About the author
Eric Dodds
Growth at RudderStack