
The analytics stack is the first SaaSpocalypse of AI. Or is it?

Soumyadeb Mitra

Founder and CEO of RudderStack

10 min read | May 21, 2026

Every wave of computing has rewritten the analytics stack, and AI is doing it again. It will kill the analytics UI and many vendors along with it. But the boring infrastructure underneath is more important than ever, and vendors who adapt to that world will thrive.

A short history of analytics (aka asking questions of data)

To understand what AI is doing to analytics, it helps to look at what every previous wave did. Each one started with the same underlying problem: Somebody has a question, somewhere there's data, and there's a gap between them. Each wave closed the gap in a different way, and each wave left some scaffolding behind that the next wave had to work around.

Stage 1: Humans as the query engine.

Analytics predates computers. If you had data (e.g., tax records, ledger entries, inventory counts), you wanted to ask questions of it. The data lived on paper, and a person did the math by hand:

Question → Human doing arithmetic → Paper records

Slow, but conceptually simple: one question, one person, one ledger, one answer.

Stage 2: Code replaces the human.

Computers didn't change the shape of the problem. They changed who did the arithmetic. Someone still had to translate a question into instructions, but now those instructions were written in COBOL or FORTRAN by a programmer, and a machine executed them against digital records:

Question → Code → Data

Faster, but with a new bottleneck: the programmer. Every new question meant a new program. Business users couldn't self-serve.

Stage 3: SQL democratizes querying.

Relational databases introduced SQL, a declarative language that let you describe what you wanted instead of how to fetch it. A query planner translated SQL into actual code under the hood. This was a real democratization: Analysts and power users could now write their own queries without learning to program:

Question → SQL → Query plan → Data

Still not enough, because the people who actually had the most questions (e.g., marketers, PMs, executives) weren't going to write SQL.
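To make the jump from Stage 2 to Stage 3 concrete, here is a minimal Python sketch that answers the same question twice: once with hand-written loop logic, and once with a declarative SQL query that leaves the "how" to the query planner. The `ledger` table, the sample rows, and the function names are assumptions for illustration, and SQLite stands in for whatever relational database you use.

```python
import sqlite3

# Hypothetical sample data: (customer, amount) ledger entries.
ROWS = [("acme", 120.0), ("globex", 75.5), ("acme", 30.0), ("initech", 10.0)]


def totals_imperative(rows):
    """Stage 2 style: the programmer spells out how to compute the answer."""
    totals = {}
    for customer, amount in rows:
        totals[customer] = totals.get(customer, 0.0) + amount
    return totals


def totals_declarative(rows):
    """Stage 3 style: describe the result and let the query planner decide how."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ledger (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO ledger VALUES (?, ?)", rows)
    result = dict(
        conn.execute(
            "SELECT customer, SUM(amount) FROM ledger GROUP BY customer"
        ).fetchall()
    )
    conn.close()
    return result
```

Both functions return the same per-customer totals. The difference is who has to know the algorithm: in the first, the author of the loop; in the second, the database.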

Stage 4: Purpose-built UIs for non-technical users.

So the industry built UIs on top of SQL, one per persona. Google Analytics for marketers. Mixpanel and Amplitude for product teams. Tableau and Looker for business intelligence. Each tool exposed a domain-specific interface (think: funnels, cohorts, dashboards) that hid the SQL underneath:

Question (marketing funnels) → UI → SQL/Code ↘

Question (product funnels) → UI → SQL/Code → Data

Question (executive dashboards) → UI → SQL/Code ↗

This worked. Each persona got a tool that spoke their language. But it created a new structural problem: Every tool needed its own data. These tools couldn't just point at the production database. Operational stores weren't built for the volume or query patterns of analytics, and the cloud data warehouses we have today didn't exist yet at scale. So each tool ingested its own copy of events, stored them in its own backend, and optimized them for its own use case:

Question → UI → Data store (marketing) ↖

Question → UI → Data store (product) ← Source events

Question → UI → Data store (BI) ↙

Three copies of the truth meant three versions of the truth. Marketing's revenue number didn't match finance's. Product's active-user count didn't match the dashboard the CEO was looking at. Reconciling them became a full-time job.

Stage 5: CDPs paper over the inconsistency.

This is where companies like Segment and RudderStack came in. The fix wasn't to eliminate the copies (that wasn't even feasible at the time), but to make them consistent. A CDP sits upstream of every analytics tool and fans out the same clean, standardized event stream to all of them, so at least the inputs match.

It's worth being honest about this stage: The CDP layer was a band-aid for a hardware and economics constraint. You couldn't run everything on one store, so you had to keep many stores in sync. The CDP made the world better, but it didn't fix the underlying architecture. It just patched it.
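The fan-out idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Segment or RudderStack actually implement it: the `Destination` class, the `standardize` function, and the event shape are all hypothetical. The point is only that every downstream store receives the same normalized event.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Destination:
    """Stand-in for one downstream tool's ingestion endpoint (hypothetical)."""
    name: str
    received: list = field(default_factory=list)

    def deliver(self, event: dict) -> None:
        self.received.append(event)


def standardize(raw: dict) -> dict:
    """Normalize one raw event into a shared shape (assumed schema)."""
    user_id = raw.get("user_id", raw.get("userId"))
    if user_id is None:
        raise ValueError("event has no user identifier")
    return {
        "event": raw["event"].strip().lower().replace(" ", "_"),
        "user_id": str(user_id),
        "properties": dict(raw.get("properties", {})),
    }


def fan_out(raw_events, destinations):
    """Clean each event once, then send an identical copy to every destination."""
    for raw in raw_events:
        event = standardize(raw)
        for dest in destinations:
            # Deep copy so one destination cannot mutate another's event.
            dest.deliver(copy.deepcopy(event))
```

Because standardization happens once, upstream of every tool, the copies can still drift in how they are stored and queried, but not in what they were fed.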

Cloud data warehouses removed the constraint

Snowflake, BigQuery, and Databricks changed the math. Suddenly, one store could handle the volume and the query patterns of every team. The “warehouse-native” movement, which we pushed early at RudderStack, said: stop copying data, run analytics in place.

Question → UI → SQL/Code → Warehouse ← Events

This should have collapsed the left side of the stack, too. It mostly didn't. The UI-plus-storage incumbents, such as Amplitude, Mixpanel, Looker, and Tableau, had distribution, muscle memory, and switching costs on their side. Teams were used to them. So the warehouse became the source of truth underneath, but the BI/product/marketing UIs sat on top, often still maintaining their own materialized copies.

And then AI showed up

For the first time, the interface to data can be English again. You can ask a complex question, and an LLM that's seen your schema can write reasonable SQL, run it, and explain the result. You can also ask things the UI was never going to let you ask: “Compare cohort retention for users who hit feature X within 24 hours of signup, broken down by acquisition channel, but exclude self-serve trials shorter than seven days.”

The interface is English again, but that doesn't mean the questions are simple. “How is our trial-to-paid conversion trending?” has at least eight reasonable interpretations depending on cohort window, attribution model, and plan tier. The interface's simplicity is back; the underlying complexity never left.
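A small Python sketch shows how much a single one of those choices moves the number. The trial records, field names, and the 14-day window below are made up for illustration; only the cohort-window dimension is varied here.

```python
from datetime import date

# Hypothetical trial records; paid_on is None if the user never converted.
TRIALS = [
    {"user": "u1", "trial_start": date(2026, 1, 1), "paid_on": date(2026, 1, 5)},
    {"user": "u2", "trial_start": date(2026, 1, 2), "paid_on": date(2026, 2, 20)},
    {"user": "u3", "trial_start": date(2026, 1, 3), "paid_on": None},
    {"user": "u4", "trial_start": date(2026, 1, 4), "paid_on": date(2026, 1, 10)},
]


def conversion_rate(trials, window_days=None):
    """Share of trials that converted, optionally only within window_days of start."""
    if not trials:
        return 0.0
    converted = 0
    for trial in trials:
        if trial["paid_on"] is None:
            continue
        days_to_pay = (trial["paid_on"] - trial["trial_start"]).days
        if window_days is None or days_to_pay <= window_days:
            converted += 1
    return converted / len(trials)


ever_converted = conversion_rate(TRIALS)              # 0.75
within_14_days = conversion_rate(TRIALS, window_days=14)  # 0.5
```

Same data, same English question, two defensible answers. A model that has to guess the window will pick one of them without telling you which.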

This is why every analytics vendor (Amplitude, Mixpanel, PostHog, Looker) has rushed out Model Context Protocol (MCP) servers in the last 12 months. The reported growth in MCP usage on these tools has been steep, with vendors citing 10–20x year-over-year.

Today's stack looks like this:

Claude → Existing Product/UI → SQL/Code → Warehouse

The thesis people are running with: There's no reason for Claude to talk to the UI layer. Claude can write SQL. The UI was always just a translation layer for humans who couldn't write SQL.

Remove it.

Claude → SQL/Code → Warehouse

It's directionally right, but this is where the argument needs more precision.

Where the thesis breaks (and why it still matters)

The UI layer is threatened. The rest of the stack isn't. It's load-bearing.

The diagram makes it look like everything between the question and the data disappears. It doesn't. What disappears is the human-facing chrome: the dropdowns, the chart configurators, the saved-report libraries. What stays, and arguably becomes more important, is everything that makes an LLM-generated SQL query actually correct:

  • Semantic models and metrics layers. “Revenue” means six different things across finance, sales ops, and product. An LLM left to its own devices will pick one, confidently, and be wrong. dbt's semantic layer, Cube, LookML, and similar systems become the substrate Claude reasons over. They go from “nice governance hygiene” to “the thing that makes AI not lie to you.”
  • Identity resolution. The hardest question in product analytics is: who is this user? Web visitor 7a3f, app install d91c, Salesforce contact 0x…4421, and support+jane@acme.com: are they the same person or not? No amount of LLM cleverness fixes a join that was never set up. This is where the data plumbing (the CDP layer, profiles, identity stitching) matters more in an AI world, not less. Claude is only as smart as the keys it can join on.
  • Data quality and freshness. When a human ran a report and the number looked weird, they had context to know “wait, marketing changed UTM tagging last week.” Claude doesn't. Bad data with a confident natural-language explanation is more dangerous than bad data in a chart, because the explanation papers over the smell.
  • Governance and access. “Show me revenue by customer” should not return PII to a junior analyst. The UI used to enforce this with permission checks. Now the warehouse and semantic layer have to.
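Of the four items above, identity resolution is the easiest to sketch in code. One common way to stitch identifiers is a union-find structure: every observed link between two identifiers merges their groups. The `IdentityGraph` class and the prefixed identifier strings below are assumptions for illustration, not a description of any vendor's implementation.

```python
class IdentityGraph:
    """Minimal union-find over identifier strings (illustrative sketch)."""

    def __init__(self):
        self._parent = {}

    def _find(self, identifier):
        # Unseen identifiers start as their own group.
        self._parent.setdefault(identifier, identifier)
        while self._parent[identifier] != identifier:
            # Path halving: point each visited node at its grandparent.
            self._parent[identifier] = self._parent[self._parent[identifier]]
            identifier = self._parent[identifier]
        return identifier

    def link(self, a, b):
        """Record that identifiers a and b were observed together."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_person(self, a, b):
        return self._find(a) == self._find(b)


graph = IdentityGraph()
# Hypothetical observations, e.g. a login on web and a login in the app.
graph.link("web:7a3f", "email:support+jane@acme.com")
graph.link("app:d91c", "email:support+jane@acme.com")
```

After those two links, `graph.same_person("web:7a3f", "app:d91c")` is true even though the two identifiers were never seen together directly. If the links were never captured, no query, human-written or model-written, can recover them.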

So the actual stack collapse isn't:

Claude → SQL → Warehouse

It's:

Claude → Semantic layer → SQL → Warehouse

Identity, governance, freshness, lineage

Event pipeline (CDP)

Sources

The boxes labeled “UI” disappear. Almost nothing else does. If anything, the layers underneath get more scrutiny because there's no human in the loop visually sanity-checking outputs.
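To make the semantic-layer step in that stack concrete, here is a minimal Python sketch of a metric registry that compiles named metrics into SQL. It is not the API of dbt, Cube, or LookML; the `METRICS` registry, the `orders` table, its columns, and both revenue definitions are hypothetical. The idea is that the model asks for a named metric and the layer, not the model, decides what “revenue” means.

```python
import sqlite3

# Hypothetical registry: each team's "revenue" is an explicit, named definition.
METRICS = {
    "finance_revenue": {
        "table": "orders",
        "expr": "SUM(amount - refunded)",
        "filter": "status = 'settled'",
        "dimensions": {"customer"},
    },
    "sales_revenue": {
        "table": "orders",
        "expr": "SUM(amount)",
        "filter": "status IN ('booked', 'settled')",
        "dimensions": {"customer"},
    },
}


def compile_metric(name, group_by=None):
    """Turn a metric name (plus optional dimension) into a SQL string."""
    if name not in METRICS:
        # Refuse to guess: an undefined metric is an error, not an improvisation.
        raise KeyError(f"unknown metric {name!r}; known: {sorted(METRICS)}")
    metric = METRICS[name]
    if group_by is not None and group_by not in metric["dimensions"]:
        raise ValueError(f"{group_by!r} is not a dimension of {name!r}")
    select = f"{group_by}, " if group_by else ""
    tail = f" GROUP BY {group_by} ORDER BY {group_by}" if group_by else ""
    return (
        f"SELECT {select}{metric['expr']} AS {name} "
        f"FROM {metric['table']} WHERE {metric['filter']}{tail}"
    )


def demo_connection():
    """In-memory SQLite table with made-up orders, standing in for a warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL, "
        "refunded REAL, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
        [
            (1, "acme", 100.0, 10.0, "settled"),
            (2, "acme", 50.0, 0.0, "booked"),
            (3, "globex", 200.0, 0.0, "settled"),
            (4, "globex", 30.0, 0.0, "cancelled"),
        ],
    )
    return conn
```

On this sample data, `finance_revenue` comes out to 290.0 and `sales_revenue` to 350.0. Neither is wrong; they answer different questions, which is exactly why the definition has to live somewhere the model can look it up instead of inventing it.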

Shipping an MCP server could mean a vendor is being replaced, or that they're becoming more essential.

The metrics definitions, the cohort logic, the saved attribution models living inside these tools have value well beyond the UI. They're a semantic layer in everything but name. If a vendor's MCP becomes the canonical interface Claude uses to reason about a domain's data, that vendor doesn't get hollowed out; it gets promoted from “app people log into” to “the layer the AI calls.”

The frame that matters: Whoever owns the semantic layer wins. If a vendor's moat was the UI, they're a casualty. If their moat was the metrics, definitions, and business logic underneath the UI, they're more entrenched than ever. The next 18 months will sort vendors into those two buckets.

So what actually gets killed?

The piece of the stack that is genuinely a casualty: the UI-plus-storage vendor model that copied data out of the warehouse to make its own UI fast. That bundle made sense when warehouses couldn't keep up. It doesn't anymore, and a conversational interface needs that bundle even less. There's no chart to render fast, just a query to run.

What replaces it isn't “Claude talks to Snowflake.”

It's:

  1. A reasoning layer (Claude or whatever the next model is)
  2. A semantic and metrics layer that the model reasons against
  3. Identity, data governance, and lineage as first-class infrastructure
  4. The warehouse as the substrate
  5. A consistent event pipeline feeding all of it
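For item 3, one way to picture governance living below the UI is a policy check applied to the query's column list before any SQL runs. The roles, the `PII_COLUMNS` set, and the `project_for_role` function in this Python sketch are hypothetical, and real warehouses usually enforce this with their own access controls instead of application code.

```python
# Hypothetical policy: which columns count as PII and which roles may see them.
PII_COLUMNS = {"email", "phone"}
ROLE_CAN_SEE_PII = {"admin": True, "junior_analyst": False}


def project_for_role(role, requested_columns):
    """Return the subset of requested columns this role is allowed to read."""
    if role not in ROLE_CAN_SEE_PII:
        # Unknown roles get nothing instead of a permissive default.
        raise PermissionError(f"unknown role: {role!r}")
    if ROLE_CAN_SEE_PII[role]:
        return list(requested_columns)
    return [col for col in requested_columns if col not in PII_COLUMNS]


allowed = project_for_role("junior_analyst", ["customer", "email", "revenue"])
```

Here `allowed` is `["customer", "revenue"]`. The check runs no matter who or what wrote the query, which is the property a UI-level permission check cannot give you once the UI is gone.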

The CDP doesn't get displaced by AI; it gets promoted. The whole AI-on-your-data pitch falls apart if “your data” is three inconsistent copies across Amplitude, Salesforce, and the warehouse. Consistency at the pipeline layer is what makes the conversational interface actually trustworthy.

The casualty isn't analytics. The casualty is the UI vendor that built a moat out of dropdowns.

For a deeper look at what this means for your data infrastructure, see this related blog post from April 2026: “Do you still need to centralize your data if your interface is Claude?”

