Why Your Data Warehouse Should Be the Foundation of Your CDP

The Customer Data Platform has reached a tipping point, fueled by increasing demand for useful customer data across the stack and by major innovations in data tooling that meet that demand. The traditional CDP was built primarily for marketing activation use cases and is technologically incapable of integrating across the modern data stack. These systems can no longer deliver on promises like the single source of truth or real-time data activation. But we’re in a new era of customer data management, ushered in by the modern data stack. In this era, the warehouse is at the center, data teams build and manage the data layer on their own infrastructure, and valuable data from every touchpoint is made available across the stack, no matter which tools downstream stakeholders are using (including marketing CDPs).

A brief history: Defining the customer data platform

In 2013, David Raab, Founder of the CDP Institute, recognized confusion building around an emerging technology – tools that promised the coveted customer 360. These tools built customer profiles by stitching data points from various sources together and enabled predictive modeling on the resulting dataset. They fueled marketing with comprehensive customer information faster than ever before, but there was a lot of variance in the features of each tool, and no one really knew what to call them. Excitement around the new technology was high, but clarity was low.

So, David decided to put a stake in the ground. He published a blog post recognizing the new category and eventually launched the CDP Institute. The CDP Institute created a definition for the CDP based on a set of common consumer expectations driven, notably, by marketing use cases. In his original blog post, David hit the nail on the head, noting that “‘customer’ shows the scope extends to all customer-related functions, not just marketing.” But because marketing is the tip of the spear when it comes to customer data, and because the 2013-2020 period was defined by the explosion of the "mar-tech" landscape, CDPs have always been inextricably linked to marketing use cases.

Today, the CDP Institute defines the CDP as “packaged software that creates a persistent, unified customer database that is accessible to other systems.”

The modern, warehouse-first data stack delivers all the value suggested by this definition but differs on a key point: the location and ownership of the persistent customer database.

The original intent, but a new approach

The ultimate goal of the traditional CDP was to provide a function-agnostic SaaS platform that created value by moving data throughout the organization's digital ecosystem. But these CDPs failed to live up to the promise because they didn’t really unify data. Instead, they created a data silo, a problem that became increasingly painful as the complexity of the data stack increased. They also fell short when it came to sharing customer profiles “with any system that need[ed] it.” In other words, most traditional CDPs are closed systems that confine data value instead of making it available throughout the stack.

That’s not necessarily the fault of the CDP providers—the market wanted to drive customer engagement for marketing use cases, so the CDPs built for that demand instead of making data integration a first-class citizen.

So, while many CDPs are great for engagement, the need for centralization and integration has become increasingly acute. This means data teams must explore new architectures to liberate data and provide integration flexibility for constantly changing toolsets.

Fortunately, the tool of choice to enable that flexibility already exists: the cloud data warehouse. It makes total sense. The warehouse has the most complete picture of data, and the tooling around it enables fully customizable data flows.

We spoke with David Raab as we did our research for this post, and he made a salient point: “modern warehouses, such as Snowflake, use more flexible data stores and can do more things, including much of what would typically be done in a CDP.”

Warehouse technology advancements mean the limitations of the traditional CDP can be overcome, but this requires a different approach: building the entire customer data stack around the warehouse, not a third-party marketing CDP. And that’s our mission at RudderStack: to enable data engineers to easily ingest and move data across the entire stack while maintaining and enriching a complete set of customer data in their cloud data warehouse.

With a warehouse-native architecture, you get the features required to build and activate real-time, unified customer profiles without needing to store any data with a third-party vendor or subject your stack to their technological limitations. You can serve every business user with the data they need in the tools they use, whether it’s a Salesforce CRM or the Business Intelligence tool your BI team uses for visualizations. This bring-your-own-warehouse approach allows you to build a CDP on top of infrastructure you’re already familiar with and invested in. Here’s why companies are building CDPs with this new architecture:

  • Flexibility & accessibility - when all of your raw customer data lives in your own data warehouse, it’s easily accessible to everyone, and usage isn’t subject to vendor-specific limitations
  • More complete customer data sets - because you own your warehouse, you can combine your customer data with internal data (like transactional data) that you wouldn’t send to a third-party system
  • More advanced use cases - your warehouse is connected to your other customer data infrastructure that runs functions like data science, so you can enable more advanced machine learning use cases like churn prediction and personalized recommendations (as opposed to relying on vendor-provided data models, etc.)
  • Enhanced data privacy and governance - utilizing your existing data warehouse as your customer data store means one less tool where you have to deal with data security and privacy concerns, giving you more control in the era of GDPR and CCPA regulations
  • Cost savings - it’s cheaper to store your data in the warehouse you’re already paying for than it is to pay your CDP to store it for you (again)
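To make the "more complete customer data sets" point concrete, here is a minimal sketch of joining behavioral events with internal transactional data where both already live. It assumes an in-memory SQLite database as a stand-in for a cloud data warehouse, and the table names (`events`, `orders`), columns, and the `build_profiles` helper are hypothetical, not part of any RudderStack schema or API.

```python
import sqlite3


def build_profiles(conn):
    """Return one row per user: (user_id, event_count, total_spend).

    Joins behavioral events with internal order data. Table and column
    names are illustrative assumptions, not a real vendor schema.
    """
    query = """
        SELECT e.user_id,
               e.event_count,
               COALESCE(o.total_spend, 0) AS total_spend
        FROM (SELECT user_id, COUNT(*) AS event_count
              FROM events GROUP BY user_id) AS e
        LEFT JOIN (SELECT user_id, SUM(amount) AS total_spend
                   FROM orders GROUP BY user_id) AS o
          ON o.user_id = e.user_id
        ORDER BY e.user_id
    """
    return conn.execute(query).fetchall()


# SQLite stands in for the warehouse; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id TEXT, event TEXT);
    CREATE TABLE orders (user_id TEXT, amount REAL);
    INSERT INTO events VALUES
        ('u1', 'page_view'), ('u1', 'add_to_cart'), ('u2', 'page_view');
    INSERT INTO orders VALUES ('u1', 40.0), ('u1', 60.0);
""")
profiles = build_profiles(conn)
```

Because the transactional data never leaves the warehouse, the join happens where the data already is, and the resulting profile rows can feed whichever downstream tool needs them.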
Start building a CDP on your warehouse today
Sign up for RudderStack to test drive our end-to-end solution for data collection, unification, and activation.

Built for engineering vs. built for marketing

While observing that modern data warehouses are capable of doing many of the things typically done in a CDP, David made another excellent point in our conversation. He said, “it’s not just a matter of dumping your data into Snowflake. There’s still plenty of data cleaning, transformation, and unification needed to make the data truly useful.” In other words, the warehouse isn’t a CDP on its own. Enter the data engineer, the key to making all of that customer data useful.
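As one illustration of the unification work described above, here is a minimal sketch of identity stitching: grouping identifiers that were observed together (say, an anonymous ID and an email) into a single profile. It uses a generic union-find approach chosen for this example, not a description of how any particular CDP or RudderStack resolves identities. The `stitch_identities` function and the sample identifiers are hypothetical.

```python
from collections import defaultdict


def stitch_identities(links):
    """Group identifiers that were seen together into unified profiles.

    `links` is an iterable of (id_a, id_b) pairs. Returns a sorted list of
    sorted identifier groups. This is a generic union-find sketch, not any
    vendor's identity resolution algorithm.
    """
    parent = {}

    def find(x):
        # Register unseen identifiers as their own root, then walk up.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(list)
    for identifier in list(parent):
        groups[find(identifier)].append(identifier)
    return sorted(sorted(group) for group in groups.values())


# Made-up sample data: two anonymous IDs share an email, a third does not.
links = [
    ("anon-1", "a@example.com"),
    ("anon-2", "a@example.com"),
    ("anon-3", "b@example.com"),
]
unified = stitch_identities(links)
```

Here `anon-1` and `anon-2` end up in the same profile because both were linked to the same email, while `anon-3` stays separate. In practice this logic would more likely run as SQL or a transformation job inside the warehouse.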

The marketing CDP was designed to make it easy for non-technical marketing teams to gather and use first-party data to understand the customer journey and enhance their initiatives. The CDP Institute articulates this: “The packaged nature of the [CDP] makes it much easier to deploy and change as new needs arise. Corporate IT must cooperate to set up and maintain the CDP but most technical resources are usually provided by the vendor or an agency hired by marketing.”

That approach made sense for marketing use cases when limited technology made moving data painful and expensive. CDPs provided value by making data accessible to non-technical teams, abstracting away the technical reality of data ingestion, movement, and unification. No need to worry about APIs, data pipelines, and identity resolution if the CDP does it for you. But this approach leads to a view where data teams are seen as necessary obstacles whose cooperation non-technical teams must secure in order to get what they want.

But, in reality, data and engineering teams are best suited to own the implementation and management of the CDP. This actually allows them to supercharge the efforts of other teams because engineering:

  • Controls core technical infrastructure that’s directly related to the customer experience (and subsequently, customer data)
  • Understands the technical requirements for data security and is ultimately held responsible for keeping customer data secure
  • Can aggregate and centralize all customer data in a modern data warehouse or data lake
  • Can integrate every tool in the organization's data stack and enable real-time use cases

So, at RudderStack we’re building a CDP for data teams because our mission is to make data engineers, data scientists, and developers the heroes of their companies by providing every team with rich customer data.

The future is warehouse native and best of both worlds

With modern data warehouses and data lakes paving the way, we believe we’re at the beginning of a seismic shift in customer data management. The next five years will be full of innovation in the space, and the modern data warehouse will sit at the center.

In this new era, companies won’t have to ditch their marketing CDP for a warehouse-first approach. The most advanced companies will leverage a Warehouse Native CDP to get more value out of their marketing CDP by driving it with more complete, enriched data from the warehouse. This means better marketing attribution, more streamlined marketing automation, and stronger marketing campaigns with richly personalized messaging.

And with the data team acting as a strategic partner, stakeholder teams organization-wide will solve harder problems and unlock new opportunities. Data engineers will no longer be seen as an obstacle to work around. They’ll become the heroes of their organizations.

October 6, 2021
Soumyadeb Mitra

Founder and CEO of RudderStack

Brooks Patterson

Product Marketing Manager