Beyond the modern data stack: The customer context engine for the AI era

Evidence is stacking up against the AI naysayers. Early to mid-2025 was full of damning reports claiming that despite massive investments, AI generated little to no return for most companies. But the tide has changed. Last November, McKinsey reported that companies committed to fundamentally changing their business with AI are seeing significant value from their efforts.
Our own research backs that up. RudderStack customers are moving projects from pilot to production. In fact, fifty percent of respondents to a recent survey are offering a customer-facing AI experience. More importantly, seventy percent of these RudderStack customers are driving measurable impact from their AI initiatives.
To drive that success, they’re evolving their data stack architectures to meet the demands of agentic systems that continuously consume data and LLMs that generate content in front of the customer. The AI era is firmly upon us.
Customer context – the key to positive customer-facing AI experiences – is front and center, and RudderStack’s real-time customer data infrastructure is quickly becoming the fresh, trustworthy customer context engine for the AI era. We’re adding customers faster than ever, and usage is exploding.
In 2025, RudderStack processed 1.3 trillion events for over 4,000 organizations, including a rapidly growing portfolio of AI-native companies (👋 AssemblyAI, Otter.ai, N8N, Replicate, and Warp). They’re choosing RudderStack because we’ve been building for this moment since day one, and we’re extending our platform to meet the evolving requirements of today’s data teams.
Beyond the modern data stack
The modern data stack’s crowning achievement was to establish the warehouse as an organization-wide system of record. This single source of truth is a critical building block for the AI era. But it’s not enough.
Agentic and generative AI use cases stress-test every component of the data architecture they sit on top of because they run continuously, autonomously, and non-deterministically in front of the customer. Teams consistently cite two familiar data foes – silos and quality issues – as the primary blockers to productionizing AI and improving performance.
While the modern data stack did much to combat these issues, it was designed, foundationally, for business intelligence around a batch analytics paradigm: ETL data to the cloud data warehouse → clean and model data in the warehouse → fuel BI. Reverse ETL pioneers recognized the opportunity to operationalize rich warehouse data downstream and pushed this further.
But to win in the AI era, companies must evolve their stacks to deliver on-demand access to fresh, trustworthy customer context. The warehouse is the foundation, but the modern data stack’s batch pipelines, fragmented tooling, and govern-after-landing approach don’t cut it.
Defining customer context and the importance of clickstream data
Let’s take a quick step back. Customer context comes from a mix of different data sources and types. At a high level, we can define these as follows, with a quick illustrative sketch after the list:
- First-party behavioral (clickstream) data: Observed user actions across products, websites, servers, and cloud tools. This is high-signal, continuously generated data that captures intent through patterns of behavior.
- First-party operational data: Stateful customer records and transactions from systems like CRMs, billing, support, and ERPs. This data reflects durable business state and evolves through discrete updates.
- Second and third-party enrichment data: Supplemental customer data from external systems or vendors, such as ad platforms, fulfillment services, and third-party data providers, that adds context not directly observable through first-party interactions.
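To make these categories concrete, here’s a minimal sketch of what each type of record might look like in practice. The field names and values are hypothetical, chosen for illustration rather than drawn from any particular schema.

```python
# Hypothetical examples of the three customer context data types.
# Field names and values are illustrative only, not a prescribed schema.

# First-party behavioral (clickstream) event: observed, high-volume, continuously generated
clickstream_event = {
    "type": "track",
    "event": "Checkout Started",
    "anonymousId": "a1b2c3",
    "userId": "user_42",
    "timestamp": "2026-01-29T14:03:11Z",
    "properties": {"cart_value": 128.50, "items": 3, "device": "mobile"},
}

# First-party operational record: stateful business data that evolves through discrete updates
crm_record = {
    "userId": "user_42",
    "plan": "pro",
    "mrr": 99.0,
    "open_support_tickets": 1,
    "renewal_date": "2026-06-01",
}

# Second/third-party enrichment: supplemental context from external systems and vendors
enrichment_record = {
    "userId": "user_42",
    "ad_campaign": "spring_promo",
    "firmographic_segment": "mid-market",
}
```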
Modeled into a customer 360 profile, these datasets comprise everything a company knows about a customer. Traits can be computed on top (including LTV and churn scores from traditional ML models), and subsets of the whole can be prepared and served to AI systems at the moment of inference to give them all of (and only) the relevant context for their decision.
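As a rough illustration of that last step, here’s a hedged sketch of serving a narrow slice of a customer 360 profile to an AI system at inference time. The profile shape, trait names, and the `build_context` helper are assumptions made for this example, not a defined RudderStack interface.

```python
# Hypothetical sketch: hand an AI system only the slice of the customer 360
# profile it needs for a given decision. Profile shape and trait names are assumed.

customer_360 = {
    "userId": "user_42",
    "traits": {"plan": "pro", "ltv": 2450.0, "churn_risk": 0.18},  # e.g., from traditional ML models
    "recent_events": [
        {"event": "Checkout Started", "timestamp": "2026-01-29T14:03:11Z"},
        {"event": "Support Ticket Opened", "timestamp": "2026-01-27T09:12:44Z"},
    ],
    "consent": {"marketing": True, "profiling": True},
}

def build_context(profile: dict, use_case: str) -> dict:
    """Return only the fields a given AI use case needs, and nothing more."""
    if use_case == "support_agent":
        return {
            "plan": profile["traits"]["plan"],
            "churn_risk": profile["traits"]["churn_risk"],
            "recent_events": profile["recent_events"][:5],
        }
    raise ValueError(f"unknown use case: {use_case}")

# The returned context can then be injected into the agent's prompt or tool call.
context = build_context(customer_360, "support_agent")
```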
The AI era calls for evolution and innovation around each type of customer context data, but clickstream data requires special attention. It’s a secret weapon because it contains critical signals of engagement and intent, but it’s less forgiving than operational and enrichment data, which come in lower volumes and evolve more slowly over time.
Core principles of an AI-ready data stack
The companies we see winning with AI are prioritizing clickstream data and pushing boundaries in three core areas:
- Warehouse as the system of record: Centralizing and unifying data in the warehouse to create complete customer profiles that become the foundation for customer context. Leading teams are pushing latency down across the customer data lifecycle with technologies like Snowflake Streaming and Snowflake Dynamic Tables, or contemporary warehouses like ClickHouse.
- Proactive governance: Enforcing governance automatically throughout the pipeline, by design, so only clean and compliant data makes it into systems for analytics, activation, and AI.
- Data infrastructure as code: Programmatically managing data pipelines, data governance, and identity semantics to enable scalable workflows and introduce software-style guarantees. For teams productionizing and scaling autonomous AI systems, this trust prerequisite is all but non-negotiable (see the sketch after this list).
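To show how proactive governance and data infrastructure as code fit together, here’s a minimal, hypothetical sketch of an event schema declared in code and enforced before data lands anywhere downstream. This illustrates the general pattern, not RudderStack’s actual API; the schema and validation helper are assumptions for illustration.

```python
# Hypothetical sketch of proactive governance as code: declare an event schema
# and enforce it in the pipeline so only clean events reach analytics, activation, and AI.
# This is a generic illustration, not RudderStack's actual API.

from dataclasses import dataclass

@dataclass(frozen=True)
class EventSchema:
    name: str
    required: dict  # property name -> expected Python type

CHECKOUT_STARTED = EventSchema(
    name="Checkout Started",
    required={"cart_value": float, "items": int, "device": str},
)

def validate(event: dict, schema: EventSchema) -> list[str]:
    """Return a list of violations; an empty list means the event is clean."""
    violations = []
    if event.get("event") != schema.name:
        violations.append(f"unexpected event name: {event.get('event')}")
    props = event.get("properties", {})
    for key, expected_type in schema.required.items():
        if key not in props:
            violations.append(f"missing property: {key}")
        elif not isinstance(props[key], expected_type):
            violations.append(f"wrong type for {key}: {type(props[key]).__name__}")
    return violations

# Enforce at ingest time: reject or quarantine events that fail validation
# instead of cleaning them up after they land.
incoming = {
    "event": "Checkout Started",
    "properties": {"cart_value": 128.5, "items": 3, "device": "mobile"},
}
problems = validate(incoming, CHECKOUT_STARTED)
assert not problems, problems
```

Versioning schemas like this alongside pipeline code is what gives data infrastructure the same review, testing, and rollback guarantees as application software.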
Before the modern AI era, data completeness, quality, and latency issues were tolerable because most use cases were deterministic and involved close human supervision. AI systems remove those guardrails and demand a new approach.
RudderStack: The customer context engine for the AI era
We won’t claim that we saw the meteoric rise of generative AI coming or anticipated how it would transform the data world. However, when we started building RudderStack five years ago, we did it with two deep, forward-looking convictions, both of which are bearing out in the AI era.
First, we bet that data teams would become increasingly essential to support cross-company customer data use cases. Second, we bet on the data warehouse moving to the center of the customer data stack.
We built the most reliable real-time pipelines for clickstream data, made the warehouse a first-class citizen, embedded code-level control, and prioritized robust governance toolkits for data quality and compliance. Since day one, our driving principles and core products have been aligned with the needs of the AI era.
That’s why the fastest-growing AI-native companies, digital-native titans like Glassdoor, Cars.com, and PrizePicks, and forward-thinking clicks-and-mortar companies like Foot Locker and Crate and Barrel choose RudderStack today. They consider our guiding principles essential to their success:
- Warehouse-native DNA: RudderStack is built with a modern architecture at its core. Instead of storing customer data in a black box, it supports deep, seamless warehouse integration to move data in and out. Plus, it can operate on data in the warehouse to stitch identities and model business logic, enabling teams to establish their warehouse as the central, open hub to fuel analytics, activation, and AI.
- Governance included: RudderStack brings event data governance controls together in a central command center with deep observability and fine-grained control over data schemas, validation logic, consent management, and PII. This gives teams maximum control over data quality and compliance and supports seamless workflows.
- Code based power: RudderStack is built for data teams with programmatic interfaces and APIs to enable code-based management of the customer data lifecycle and provide a foundation for agentic interaction.
These principles, and how they’re expressed in the product, make RudderStack the fresh, trustworthy customer context engine for the AI era.
In 2025, we built on our core strengths to meet the moment, creating an industry-leading event data governance system and enabling deeper programmatic management with Infrastructure as Code. What’s most exciting, though, is what lies ahead.
Strengthen the core. Build the future.
We built the industry's most powerful customer data infrastructure foundation. We’re still investing in our core to give data teams even more control while moving faster. We’re building a future-forward user experience with complementary AI, CLI, and UI workflows. We’re attacking the gap between data and business teams, and we’re building a suite of capabilities to help AI-native companies harness the data from their LLM applications.
Stay tuned next week for more details on the latest features, a real-world architecture for assembling and serving fresh customer context, and a deeper dive on what’s to come.
Published: January 29, 2026