Investigating the architecture of the modern data stack


In this webinar, you’ll learn how engineering and data teams use RudderStack, Snowflake, and Braze to power their customer data and glean key insights about their market.

As teams collect user behaviors from the web, mobile, and more, they route their data through RudderStack. This behavioral data is then loaded into Snowflake to build a comprehensive customer profile and pushed to Braze via RudderStack’s reverse-ETL tool. Cross-channel engagement metrics are then pulled from Braze Currents back into RudderStack.
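The loop described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: `Event`, `Warehouse`, `EngagementTool`, and `run_loop` are hypothetical names, not RudderStack, Snowflake, or Braze APIs, and the sketch assumes engagement events routed back through the pipeline land in the warehouse next to the behavioral data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    user_id: str
    name: str
    source: str  # e.g. "web", "mobile", or "engagement_export"


class Warehouse:
    """Hypothetical stand-in for the warehouse: stores events, builds profiles."""

    def __init__(self):
        self.events = []

    def load(self, events):
        self.events.extend(events)

    def build_profiles(self):
        # A "profile" here is just a per-user count of each event name.
        profiles = defaultdict(lambda: defaultdict(int))
        for event in self.events:
            profiles[event.user_id][event.name] += 1
        return {uid: dict(counts) for uid, counts in profiles.items()}


class EngagementTool:
    """Hypothetical stand-in for the engagement platform."""

    def __init__(self):
        self.profiles = {}

    def sync_profiles(self, profiles):
        self.profiles = dict(profiles)

    def export_engagement(self):
        # Pretend every synced user was sent one email.
        return [
            Event(uid, "email_sent", "engagement_export")
            for uid in sorted(self.profiles)
        ]


def run_loop(behavioral_events):
    warehouse = Warehouse()
    engagement = EngagementTool()
    # 1. Behavioral events are routed into the warehouse.
    warehouse.load(behavioral_events)
    # 2. Profiles built in the warehouse are pushed out (the reverse-ETL step).
    engagement.sync_profiles(warehouse.build_profiles())
    # 3. Engagement events flow back (assumed to land in the warehouse).
    warehouse.load(engagement.export_engagement())
    return warehouse.build_profiles()
```

Calling `run_loop` with a handful of web and mobile events returns profiles that combine behavioral counts with the engagement events fed back in, which is the feedback loop the panel discusses.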

What the panel will cover:

  • The feedback loop between Braze, Snowflake, and RudderStack

  • Applications for customer data in campaigns

  • Results that engineering, product, and data teams are gleaning from this data architecture

Speakers

Benjamin Rogojan
Seattle Data Guy, Data Science and Data Engineering Consultant

Matthew McRoberts
SVP Global Alliances at Braze

Eric Dodds
Growth at RudderStack

Eric leads growth at RudderStack and has a long history of helping companies architect customer data stacks to use their data to grow.

Transcript

Eric Dodds (00:00)

Thank you to everyone who's joining us. We're super excited to chat today. We're going to talk about data activation. If you heard what we were just talking about, we were talking really about the last decade and how we went from primitive technology to a world where we can accomplish pretty amazing things in terms of data activation. That's been a great journey. We have a special guest, so Ben has joined us. So say hi, Ben. We'll do an intro in a minute.

Benjamin Rogojan (00:27)

Hey.

Eric Dodds (00:28)

And we're going to start out actually giving this conversation some context along the lines of what we just discussed. So Ben does a lot of consulting, again we'll do an introduction for him. But he is going to give us some context about what he's seen in the industry and how stacks mature over time, which is just really helpful information. And then Matt and I will dig into where we've been and what we can do today with an architecture that delivers some pretty amazing engagement. So let's dig in. All right. Ben, you want to do a quick intro?

Benjamin Rogojan (01:04)

Yeah, sure. Hey everyone, my name is Ben, also known as the Seattle Data Guy. Basically I help companies do end-to-end data solutions, implementations, using a broad range of tools. And just helping them either modernize their data stacks or sometimes untangle the mess. There are plenty of times I just come in and, with the data stack, maybe they've got all the right tools, but it's just been chaos or it's been developed in such a way that it's hard to tell what's coming from where. And then also I'm a pretty big content creator across Medium and YouTube in the whole data engineering space.

Eric Dodds (01:40)

Great. Matt?

Matt McRoberts (01:42)

Awesome. Thank you, Eric. Thanks, Ben. My name is Matt McRoberts. I'm the senior vice president of global alliances at Braze. I've been in the business at Braze about seven years now. I look after what we call the three pillars of partnerships at Braze, which include technology integration, so RudderStack and Snowflake are two of our top-tier partnership integrations that we're really excited to continue to build and innovate around. The second pillar would be folks like Ben in terms of working with consultancies, GSIs, growth agencies, folks that are building managed services around Braze and helping our shared customers drive best-in-class engagement. And then thirdly, Braze has a mature channel development program, mainly focused around regional reselling in the Asia-Pacific, as well as Latin American markets.

Eric Dodds (02:36)

Very cool. And I'm Eric Dodds. I'm with RudderStack and I work on the growth team and actually manage our implementation of RudderStack. So I get to be in the guts of the product every day. And again, Ben, thanks for joining us. As you can tell, Matt and I love having a third party in here because we like to talk about RudderStack and Braze a lot, but we also want you to hear from people who are out doing the work every day. So Ben, give us some context here, we're going to spend about 15 minutes just building a little bit of context for what we're going to talk about with Braze and RudderStack.

Benjamin Rogojan (03:10)

Yeah. No, I'd love to do that. So basically I'm going to focus on talking about developing a mature data stack, talking about how I've seen different companies go about it. What I've seen in terms of like both working with specific clients, as well as just doing research and seeing other large companies and seeing what they're doing. So you can probably go to the next slide.

Benjamin Rogojan (03:31)

Not to reiterate the same thing, but again, my background is in data engineering, a lot of end-to-end consulting, all over the place in terms of finance, insurance. I also worked at Meta, and I like to say I'm an accidental writer and YouTuber. So those are just some fun things that I have found myself doing that I really like doing. So if you're interested in learning more about data engineering, you can check that out.

Benjamin Rogojan (03:56)

All right. So just going with this talk in terms of what the focus is, basically I wanted to start out with why data infrastructure and analytical maturity is more important than ever and understanding the process by which companies get to some sort of final end point of having a data infrastructure that they can rely on. And I like starting this by thinking about the fact that for some people Excel is their mature data stack. If they're just starting out, if they're a small company, if they're just not aware of all the tools, you might see a combination of just exported Excel files directly from even things like Workday and Salesforce being used and conglomerated together to do a lot of analysis.

Benjamin Rogojan (04:37)

Or if they're lucky, maybe they've got one developer who's duplicated a database and they're relying on that one duplicated database to do a lot of their analytics. Or finally they've got some sort of half-automated system that's got some things connected via various tools or scripts or something to pull in all that data into something like Postgres or Snowflake. It's not really fully connected, but it's getting to the right place. They've got some dashboards.

Benjamin Rogojan (05:03)

But the problem that I think a lot of companies face as they're going through this process is that we're getting to a point where, as the slide points out, there's more data than I think there's ever been. There's more demand for that data. There are more data sources, I think that's one of the bigger things. It's one thing to deal with larger data sets, it's another thing to deal with a larger variety, which equals more connectors, which equals more upkeep and code and all that. And then of course, more data tools make it just hard to understand what to use where, and I just obviously poke at more hype articles, which I'm sure I put out myself.

Eric Dodds (05:37)

So many hype articles.

Benjamin Rogojan (05:39)

Everything is the next everything, every platform is the next solution. And so all of this I think makes it very difficult to understand where do I take my data stack? What tools do I pick? But going back to the more data portion of this and the more variety... Sorry. No, you can go back.

Eric Dodds (05:56)

Okay. Yep. No worries.

Benjamin Rogojan (05:59)

What was I going to say? Basically I like to think that we're getting to this point where we can no longer process data in the way that we have been, which used to be very manual connections, because we're getting to this point where we need to almost industrialize our data stacks. That is to say that before, when we only needed to create 10 cars, it was very easy to do a lot of that stuff more manually, put together a lot of those pieces. But as soon as you start having greater demand and greater variety, we're pushing for that need to develop these mature data stacks and analytical processes, because that's the only way we're going to be able to manage all of the data. So now you can go next slide. Sorry.

Eric Dodds (06:40)

Yeah. One comment: I think the reason I'm so excited to dig into this a little bit with you, Ben, is that for the modern person working in data or the modern person trying to drive engagement with customers, with all of the amazing tooling, it's a really important question to ask: why is this still hard to do? It doesn't seem like it should be that hard, but it is hard because of all these things you just mentioned, mainly the hype articles.

Benjamin Rogojan (07:12)

Speaking of articles, I've recently been reading a ton, obviously, because as a content creator you read 10 times as much as you produce. So there are two articles that I came across recently that I think stuck out, especially in terms of maturity and watching companies just write out their maturity process. Because I think a lot of people externally assume that these big tech companies just automatically have these data stacks that were already from day one the best of the best. But I think data stacks are a process. So looking at, for example, Netflix, there was an article put out by an author who's the founding engineer of the real-time data infrastructure streaming platform at Netflix, who basically discussed how it happened there.

Benjamin Rogojan (07:59)

He has this whole article where he discusses it. Originally, he references that they were on a traditional OLAP system using Hadoop, using Hive, which is what a lot of companies do rely upon. But what they eventually started to realize in 2015 is that in about six months, the capacity that they currently had was no longer going to be able to manage their needs, both on the operational side and for more long-term analytics, answering questions like retention and things of that nature. And so through this article, which I think is very well written if anyone wants to look it up, they end up just going through the different steps they went through in order to go from the concept of we've got six months, what can we put together in six months that will work?

Benjamin Rogojan (08:44)

And so they talk about how they've put together this streaming analytical system that is able to handle their 500 billion events in a day. It's called Keystone. Again, that can also be looked up, if you're interested in the underlying architecture. And that took upwards of six months. And then in 2016, they've got that MVP ready and now they're testing it out. They're getting other teams to adopt it, they're seeing if it works. They've got, at this point, dozens of people using it. And that's a great start. It's never immediate buy-in. Even when I go into a customer, I usually try to get things out in some sort of MVP within that three-month period where it's like we've got something that's working, we've got a process in place.

Benjamin Rogojan (09:26)

And now, in 2017 to 2019 for them, they were like let's really start ramping it up, let's set up more processes. Let's mature around that. And now they've got hundreds of people using it. And in 2020 at that point, I think they've got thousands of people using that infrastructure. And they're now looking to what's next.

Benjamin Rogojan (09:42)

And I think that