Developer-first tooling for customer data


In this webinar, Eric will interview our founder and CEO, Soumyadeb Mitra, who will dig into the details of what a CDP for developers is and discuss why RudderStack is building developer-first tooling for customer data. At the end, he will also explain our mission and how it will grow with our recent Series A funding led by Kleiner Perkins.

Here's what Soumyadeb will cover:

  • CDPs were built for marketers, but engineering needs to own the data stack

  • Open source and warehouse-first - building for transparency and ownership in customer data

  • Programmable pipelines - why dynamic customer data and ever-evolving data stacks require flexible, programmable pipelines

  • DevOps for CDP - why customer data tooling needs to be integrated into existing developer workflows

  • What’s next? The RudderStack platform as API first, the future of building on top of RudderStack pipelines


Eric Dodds
Growth at RudderStack

Eric leads growth at RudderStack and has a long history of helping companies architect customer data stacks to use their data to grow.

Soumyadeb Mitra
Founder and CEO of RudderStack

Founder and CEO of RudderStack. Passionate about finding engineering solutions to real-world problems.


Eric Dodds (00:03)

Thank you, everyone, for joining us. We have a very special guest on the webinar: our founder and CEO, Soumyadeb. Welcome, Soumyadeb.

Soumyadeb Mitra (00:15)

Thanks, Eric. Glad to be here doing this with you.

Eric Dodds (00:21)

Great. Well, today, we're going to talk about what a CDP for developers is. A little background on this. So last week we announced that we raised a Series A from Kleiner Perkins, which is a huge milestone for our company. S28 Capital and Uncorrelated Ventures also participated. And we announced that what we're building is what we're calling a CDP for developers. And so we're going to talk about lots of different things, including the history of CDPs, and then we're going to end with the future of what we're building at RudderStack and lots of things in between.

Here's a brief agenda of what we'll cover, just so you can keep in mind what we're going to chat about. And I started thinking of some questions. So we'll talk about the history of CDPs. We'll also talk about what the goal of using customer data really is, and how it's changed from a marketing use case to optimizing all parts of the business. And then a particular passion of ours is the subject of why engineering needs to own the data stack, which is a controversial topic, but one that we love to talk about and one that Soumyadeb has extensive experience in.

Then we'll dig into what a CDP for developers is. We'll talk about two really key components of RudderStack, which are that we're open source and we're warehouse-first. We'll talk about what warehouse-first means in the context of a CDP. We'll run through some key developer-focused features, and then Soumyadeb's going to tell us what's coming up for RudderStack as we look at the next couple of years. So without further ado, Soumyadeb, do you want to give us just a brief background on yourself, and what led you to founding RudderStack a little over two years ago?

Soumyadeb Mitra (02:08)

Yeah. Sure, Eric. So we started RudderStack in 2019, as Eric said, but I have been working in this broad space for almost eight years. Prior to RudderStack, I spent a year at a company called 8x8. And what I was mostly doing at 8x8 was building a stack very similar to RudderStack. We were trying to pull out the customer data and build machine learning models on top for various business use cases, all the way from lead scoring, to churn prediction, to upsell prediction, and so on.

And some of the experiences doing that led me to start RudderStack. There was no tool that could really satisfy the requirements at 8x8, and it made sense to start RudderStack. Before 8x8, I spent five years doing a startup in the B2B marketing space. I'll not get into the details, but one thing I really learned there was that marketing teams only have a limited view of customer data, which makes sense because they don't control all the properties through which enterprises interact with customers.

And that is the reason we strongly believe ... And that is a hypothesis I developed in my previous company: that engineering needs to own this, as we'll talk about in detail later. So that's my high-level background.

Eric Dodds (03:34)

Great. And I'm Eric Dodds and I run growth at RudderStack, but this is about Soumyadeb. So I won't give my background. We'll get into that a little bit actually in this section. So let's talk about a brief history of the customer data platform. So I can speak to this a little bit. So my background is actually in marketing. So I started my career in marketing. And really, it's funny looking at this chart, thinking back to 2011. I remember being a digital marketer in 2011, and I remember just being really excited about the software.

And then as subsequent years passed, the software seemed to keep getting better and better and to do more and more things. But the acceleration's pretty crazy. To go from a couple of hundred tools to 7,000 tools in less than a decade is pretty insane. And so there was almost a whiplash in the industry where the options were almost overwhelming and you struggled even to figure out which tool to pick for the job.

But there's a really specific reason why this dynamic happened in the MarTech landscape with marketing tools specifically. And I'll speak to this a little bit and then would love to hear your perspective on it, Soumyadeb. Really, marketing was the tip of the iceberg in terms of using customer data, right? So marketing teams, as they were able to collect more first-party data and use third-party data, wanted to optimize the top of the funnel to drive traffic, understand user behavior on their websites and apps, et cetera. And then also orchestrate the customer journey.

And in order to do that, really the first breed was all-in-one tools that did everything, even from your website to your email campaigns, everything lived in one place. But that created systems that were good at a number of things, but not excellent at a specific part of the puzzle. And that created point solutions. So companies that really were excellent at email marketing, but they also created data silos. They were really good at one thing, but they created a data silo, which made it hard to share the data with other teams and other tools.

And that led to an integrated MarTech stack where people were using best-of-breed tools that talk to each other. So this is the more modern generation of Salesforce and Pardot, and integrating HubSpot and Salesforce, et cetera, where your tools are talking to each other in a very integrated way. But even still, all of your data lived in different parts of the tech stack. Even though they could talk to each other, you didn't have a unified data layer.

So Soumyadeb, talk us through how the integrated MarTech stack with best-of-breed tools actually led to needing data layer tooling and created a lot of confusion in the marketplace. How did that happen?

Soumyadeb Mitra (06:51)

This was really an interesting journey that MarTech took, as you were saying, from everything in one bucket to all these different point solutions. And the challenge with that approach though, as you rightly pointed out, is how do you get your data into all those tools? You have one tool for emailing, maybe another tool for your newsletters, another tool for your push notifications, one for your CRM, one for some other marketing, and so on.

You have seven tools that need to have customer data. Also, there is some feedback. So let's say you sent an email and you got a response back, and based on that response you want to take some other action. You want to send a push notification, and so on. So, that became a real mess. You've got point solutions that are very good at each task, but how do you make sure firstly, that your customer data is synced to all of them? Every time a record is created or somebody signs up, you have to sync that customer record to Salesforce and your Marketo and your push notification tool and so on.

So this was never an easy task. What people did was build point-to-point integrations, like one SDK for Salesforce, another one for Marketo, and so on. And then your website did not load properly, and so the record made it to Salesforce, but it never made it to Marketo. All these weird problems happen because of these silos of best-of-breed tools. And that is the problem that the Segments and the Tealiums of the world tried to solve.

They said, "There is this confusion. Getting data across to all of them consistently is a problem. So just send it to us and then we'll make sure that everything goes to all the tools properly." So they've definitely added a lot of value. And you can see that by looking at Segment's growth, trajectory, and eventual acquisition. And the same goes for Tealium, I guess. It's a similar solution mostly focused on enterprises, but they're doing very well.
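
[Editor's sketch] The "send it to us and we'll make sure everything goes to all the tools" idea can be illustrated with a small fan-out router. This is a minimal sketch under stated assumptions, not how Segment, Tealium, or RudderStack actually implement delivery: `fan_out`, the destination names, and the in-memory stand-ins for the CRM, email, and push tools are all hypothetical.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]
Destination = Callable[[Event], None]


def fan_out(event: Event, destinations: Dict[str, Destination]) -> Dict[str, str]:
    """Deliver one event to every destination and report a status per destination."""
    results: Dict[str, str] = {}
    for name, send in destinations.items():
        try:
            send(event)
            results[name] = "ok"
        except Exception as exc:
            # One failing tool should not stop delivery to the others.
            results[name] = f"failed: {exc}"
    return results


# Hypothetical stand-ins for real destination clients (assumptions for illustration).
crm_records: List[Event] = []
email_records: List[Event] = []


def broken_push_tool(event: Event) -> None:
    raise ConnectionError("push service unreachable")


destinations: Dict[str, Destination] = {
    "crm": crm_records.append,
    "email": email_records.append,
    "push": broken_push_tool,
}

signup = {"type": "identify", "user_id": "u-1", "email": "a@example.com"}
status = fan_out(signup, destinations)
print(status)
```

The point of the central router is that the website sends the signup once, and a delivery failure in one tool is recorded rather than silently leaving that tool out of sync. A production system would also queue and retry the failed deliveries.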

Eric Dodds (09:04)

And on top of that, and really at the same time, I think you saw a proliferation of CDPs that focus more on unified customer profiles and then customer journey orchestration, right? So the ActionIQs, BlueConics, et cetera. And it's funny because even though they fall under the same CDP category, they're really less about the data collection and more about actually acting on the data. So making sure that you have really robust customer profiles and then being able to do all sorts of automation. Actually orchestrating the customer journey where people are getting emails and messages and push notifications and all that stuff.

But that led to a lot of confusion in the marketplace. And so it's pretty common for ... You see blog posts around CDPs and CDPs for all different purposes, et cetera. And of course, we are in that space as well, which we'll talk about more. But one thing I wanted to talk through was despite the confusion, the goal of what companies are trying to do with the data layer and with the activation tools is really the same. And every business is trying to optimize across business functions with customer data.

So we talked about marketing being the tip of the iceberg when it came to optimizing their function with customer data. And now we're seeing every single team across the organization, including finance, have a huge hunger for customer data and first-party data because they need to optimize. Do you want to talk a little bit about that, Soumyadeb?

Soumyadeb Mitra (10:51)

Yeah, that's a great point. And I saw that firsthand at 8x8. For people who don't know 8x8, it's one of the largest telecom providers, a public company. And there were data silos. We had a mobile app which our customers were using to make phone calls, send text messages, and so on. So there was a lot of customer data being generated in our mobile app. Then we had our billing system and we had our own homegrown CRM system and so on.

So we had multiple sources of data, and each function wanted to get access to that data. So of course, marketing wanted to customize journeys based on what people were doing in the apps. So that made sense. And a lot of the CDPs were trying to solve that problem. How do you define customer journeys and get that single customer view? But the use case of customer data was well beyond marketing, as we learned at 8x8.

The next big use case was support. We had over 100,000 customers using our phone systems, and support really wanted to know which customers are likely to churn. Which are the right customers? They used a tool called Gainsight. Gainsight is a great tool, but the biggest missing block was that we did not have any integration from our sources of data, which were all over the place, as I mentioned, from the events streaming from the app to our billing system, to our backend system, into the support Gainsight tool.

And this was important. Because let's say, what is support interested in? Who are the customers who are going to churn? Now, we found that the best predictor of churn is somebody's own usage going down over time: sending fewer messages, fewer texts, and so on. So that data comes from the app usage. But at the same time, another great feature for predicting who will churn or not is our ticketing data, which was on Sales Cloud. So bringing everything into Gainsight was very important so that the customer success team has a view of which customers are going to churn.
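
[Editor's sketch] The two churn signals described above, declining usage and ticket volume, can be combined in a simple heuristic. This is a minimal sketch and not the model built at 8x8: `CustomerActivity`, `usage_change`, `at_risk`, and both thresholds are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CustomerActivity:
    customer_id: str
    weekly_messages: List[int]  # oldest week first, from app usage events
    open_tickets: int           # from the ticketing system


def usage_change(weekly: List[int]) -> float:
    """Relative change between the earlier half and the recent half of the window."""
    half = len(weekly) // 2
    if half == 0:
        return 0.0
    earlier = sum(weekly[:half]) / half
    recent = sum(weekly[-half:]) / half
    if earlier == 0:
        return 0.0
    return (recent - earlier) / earlier


def at_risk(c: CustomerActivity,
            drop_threshold: float = -0.3,
            ticket_threshold: int = 3) -> bool:
    """Flag a customer whose usage dropped sharply or who has many open tickets."""
    return (usage_change(c.weekly_messages) <= drop_threshold
            or c.open_tickets >= ticket_threshold)


declining = CustomerActivity("a", [100, 90, 50, 40], open_tickets=0)
steady = CustomerActivity("b", [100, 100, 100, 100], open_tickets=0)
ticket_heavy = CustomerActivity("c", [10, 10, 10, 10], open_tickets=5)

flags = {c.customer_id: at_risk(c) for c in (declining, steady, ticket_heavy)}
print(flags)
```

Neither signal is available unless usage events and ticketing data land in the same place, which is the integration gap described in this part of the conversation.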

So that integration was missing, and they really wanted that customer data. Now, if you go beyond support to product. Of course, product wanted to know who is using the product and who is not. But finance was another great example. Pricing the product was a constant debate in the company. It's always a problem in every company, from early stage to public companies like 8x8. Are we pricing the product right? We had a free tier which was being used a lot. We had a tier up to some call limits. And then there was an unlimited tier and so on.

So how do you find out if you're pricing the product correctly? The best way to do that is to understand at each tier, what are we charging for and how are they using the product? But to do that, you need to h