June 24, 2021
Reading Soumyadeb’s recent post reflecting on the past year, I was reminded of the very first conversations we had about the acquisition of Blendo. The major driver behind the idea to integrate Blendo’s ETL platform with RudderStack was a shared belief that unified data pipelines aren’t just a nice-to-have feature for data engineers, but the only way that a true customer data platform can be built.
It’s very exciting to me that this shared vision and Soumyadeb’s determination to execute it are being validated. Our amazing growth since the project launched is validation from the market, but investors also see the value, as evidenced by our recent Series A funding round led by Kleiner Perkins.
But, as Soumya said, this is just the beginning. We have a lot of work ahead of us, and in this post I'd like to share with you why we think the vision is so important, how it translates into a product, and a glimpse of some exciting new features in the works.
The Vision: Why a Customer Data Platform for Developers?
The first question I want to address is why we would create a new ‘category’ of customer data platform (CDP). Why a CDP for developers, specifically? After all, RudderStack provides data pipelines, and last time I checked a Gartner report, CDPs are products for marketers.
If you look at most CDP vendors, the marketing focus makes sense. They all offer the ability to slice and dice customer data into audiences, then run marketing campaigns on them (email, ads, etc.). Segment.com didn’t always call themselves a CDP, but their Personas product, which allows you to build audiences for marketing, is a big reason they moved in that direction, and their new Journeys product will trigger campaigns. In short, marketers buy traditional CDPs for marketing audiences and marketing engagement.
But here at RudderStack, we believe that a CDP is much more than a marketing platform. Let me explain what I mean.
First of all, customer data does not only fuel marketing. In fact, if marketing were the only use case, we’d severely underutilize the data we collect. At the highest level, customer data is a reconstruction of the behavior our customers exhibit when interacting with our company. This behavior should drive marketing, of course, but also product, sales, data science, support, finance, and, to some degree, the overall business strategy.
Second, a platform is categorically different from an application. A platform exists to enable applications that are built on top of it. Restricting a platform to one application, marketing, defeats the true meaning of the term. In a recent webinar, Soumyadeb shared a great opinion on this: it’s funny that to most people CDP means software for marketers, because in the mind of a developer, the term “customer data platform” inherently means a much larger, more comprehensive, and more complex system than marketing audiences and campaigns.
Our vision is important because using customer data well across the entire stack and business is, and will increasingly be, a defining characteristic of the most successful companies.
At RudderStack, we believe that:
- Customer data is a cross-functional resource
- A platform should deliver an environment where applications can be built on top
- Data is the platform
Data is the Platform
Data is the platform? Yes, because on top of this data, we can build segments and activate the data for marketing, understand how our product delivers value to our customers, figure out if a customer is going to churn, etc., etc. And all of these use cases are applications that are enabled by the customer data a company generates.
But managing customer data is not an easy task. This is the reason data engineers are such a precious resource these days. This will be a hard problem for a long time, which is why at RudderStack we’re not trying to replace the data engineer—we want to deliver on a much bigger promise:
We will offer a platform to help data engineers build and manage a customer data platform on their existing infrastructure, enabling them to integrate with and even build applications across the business that rely on customer data. Not only that, but we want to make sure that data engineers love this platform.
For all of these reasons, we decided that we needed to reclaim the term “customer data platform,” turn it into an actual platform and build it for the people who are best fit to maximize the value that such a platform can deliver.
The first step in architecting a customer data platform is collecting all of the customer data. So, our first step was bringing all of the pipelines into one platform, and even though we’ve come a long way, we still have a lot of work to do on that front. As we continue that work, though, we’re also building some amazing new features that will take the platform to the next level.
I want to share a few highlights from our product roadmap, plus a glimpse of some of the exciting new things we’re building at RudderStack.
Let’s start with a few things that are coming in the immediate future that are iterations on the existing platform.
What You’ll See Soon
More destinations. RudderStack already supports more than 150 downstream applications, but we still have a lot of work to do here to make sure that our customers can use any product they want.
More SDKs, more optimization. Capturing first-party data is the first step in making the data available, so we will keep innovating here by adding more SDKs, but we’re also working hard to optimize the SDKs as much as possible for top performance.
More data sources. SDKs are just one part of the data collection tools a company needs. Today customer data is spread across every application we use, from CRMs to CSMs and marketing platforms to accounting software. Every customer interaction helps us understand our customers better, and we want to make every data point easy to collect. Keep an eye out for lots of new data sources in the coming months.
Closing the loop by making the warehouse a source (reverse-ETL). To maximize the value of customer data and a customer data platform, we have to close the ETL loop. Once collected, the data has to leave our stack and flow back to the applications where our people can use it to act. This is the promise of “reverse-ETL,” which we deliver through our Reverse ETL feature. We have planned a lot of exciting new features, including visual data mapping together with programmatic access to the Cloud Actions functionality.
Deeper integration with the modern data stack. We want RudderStack to be an integral part of the modern data stack companies are running. We already integrate with important parts of the stack like data warehouses, data lakes, and brokers, but we have more work to do here. Better integration with data lakes is coming: support for Delta Lake, Parquet files, and Iceberg is in progress right now.
Better access to metadata. Metadata is equally, and in some cases more, important than the data itself. Integration with schema registries and appropriate APIs to expose a rich set of metadata is something we are building and very excited about.
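To make the reverse-ETL item above a little more concrete, here is a minimal Python sketch of what a field mapping from warehouse rows to a downstream application could look like. It is an illustration only, not RudderStack's implementation or API: the function names, the mapping format, and the `userId`/`traits` payload shape are all assumptions made for this example.

```python
def map_row(row, mapping):
    """Build a destination payload from one warehouse row.

    `mapping` is a hypothetical format: destination field -> warehouse column.
    Columns that are missing or NULL (None) are skipped.
    """
    payload = {}
    for dest_field, column in mapping.items():
        value = row.get(column)
        if value is not None:
            payload[dest_field] = value
    return payload


def build_sync_batch(rows, mapping, id_column):
    """Turn warehouse rows into records ready to send downstream.

    Rows without a user identifier are dropped, because the destination
    would have no way to attach the traits to a customer.
    """
    batch = []
    for row in rows:
        if row.get(id_column) is None:
            continue
        batch.append({
            "userId": str(row[id_column]),      # assumed payload shape
            "traits": map_row(row, mapping),
        })
    return batch
```

Given a row such as `{"id": 1, "plan_name": "pro"}` and the mapping `{"plan": "plan_name"}`, the sketch produces one record that carries the `plan` trait for user `"1"`. A visual mapper would, in this picture, simply be a friendlier way to produce the `mapping` dictionary.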
What You’ll See in the Next Several Quarters
As you can see, many exciting improvements are in active development. But I’m most excited about some of our longer-term plans on the roadmap. Here’s a glimpse of what’s coming.
Better Developer Experience
Here at RudderStack, we put our bets on the value data engineers can deliver, and we want to build the tools that will help them maximize it. To do that, we have planned a number of really cool features, including a rich API that will deliver an experience similar to what "infrastructure as code" offers to SREs and DevOps.
Managing data infrastructure is hard and it's even harder when you rely on tens of third-party applications and APIs to operate. To ensure that the data engineering team will always react on time and appropriately to issues, we want to offer the maximum possible operational transparency for the platform.
We are orchestrating the whole platform, and we are building many different APIs for accessing data about the status of your pipelines. Surfacing this information in the application UI will be just one use of these APIs.
All the above will ensure that RudderStack is the best CDP for developers. But the beauty of a platform is that it allows you to keep building value on top of it and we have big plans for that. Here are two big product focus areas that this API-first architecture will enable.
Focus 1: Data Quality and Governance
After we solve the hard problem of collecting and managing the data, we are faced with another hard problem: can we trust our data? That's a fundamental question, the answer to which can turn a disaster into a success.
There's a lot of work to be done here and many vendors are working on providing solutions, but owning all the pipes and semantics of the data models can turn RudderStack into an enabler and multiplier for all these vendors.
This ecosystem is still early in its creation, but we see RudderStack as an important force for its success in the market.
Focus 2: From Events to Behavior
Customer data is a lower-dimensional projection of customer behavior. We can think of it as breadcrumbs that vaguely define the relationship and journey a customer has with our company. Having the raw data is not enough, though. The big question is, how can we turn this data into a better and consistent representation of the customer’s behavior?
The combination of the complete set of customer data, a platform that can interoperate with every component of the data stack, and the involvement of the data engineer offers a unique opportunity to succeed in this. Tailor-made identity graphs are a good example of the kinds of things we want to build here.
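For readers who have not worked with identity graphs, the core idea can be sketched in a few lines of Python. This is a generic union-find illustration, not a description of what RudderStack is building: the class name, method names, and identifier types below are hypothetical. Each identifier (an anonymous ID, a user ID, an email) is a node, each observed link merges two nodes, and every connected group is treated as one customer.

```python
from collections import defaultdict


class IdentityGraph:
    """Union-find over identifiers; each connected group is one customer."""

    def __init__(self):
        self._parent = {}

    def _find(self, node):
        # Register unseen identifiers as their own root.
        self._parent.setdefault(node, node)
        root = node
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node straight at the root.
        while self._parent[node] != root:
            self._parent[node], node = root, self._parent[node]
        return root

    def link(self, a, b):
        """Record that identifiers `a` and `b` belong to the same customer."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_customer(self, a, b):
        return self._find(a) == self._find(b)

    def profiles(self):
        """Return one set of identifiers per resolved customer."""
        groups = defaultdict(set)
        for node in list(self._parent):
            groups[self._find(node)].add(node)
        return list(groups.values())
```

Linking `("anonymous_id", "anon-1")` to `("user_id", "u-42")`, and then `("user_id", "u-42")` to `("email", "ada@example.com")`, places all three identifiers in a single profile. A "tailor-made" graph would differ mainly in which links it trusts, which is exactly the kind of rule a data engineer is well placed to define.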
Thank You...and You Should Join the Team!
As you can see, I cannot hide my excitement for what we have achieved so far and what lies ahead of us. The vision is bold, and we will keep executing with the same core values of being extremely customer-focused, working hard, being humble, accepting mistakes and failures, and celebrating victories and achievements no matter how small or big they are. These are values that Soumyadeb has instilled in this team since its creation. And most importantly, thank you to the open-source users and customers who have been a part of our journey.
If you'd like to be part of such an ambitious team and you are driven by the impact you can have, we are hiring across all functions.