ERR vs. ARR: The Metric That Reveals Whether Your AI Revenue Is Real
I was listening to Lenny’s podcast with Jeanne DeWitt Grosser (Vercel’s COO) and she mentioned a metric I hadn’t heard before: ERR, or Experimental Run Rate Revenue.
It stopped me mid-workout.
Not because it was complicated. The opposite. It was so simple and so obviously correct that my first thought was: why didn’t I think of this? After eight years working in AI, I’ve watched enterprise customers cycle through pilots, POCs, and “strategic experiments” that never quite converted to real commitments. I knew the pattern. I just didn’t have a name for the revenue it generated.
ERR is that name. And once you see it, you can’t unsee it.
The concept comes from Jamin Ball at Altimeter Capital, and it addresses something that’s been bothering investors and AI product leaders alike: a lot of what’s being reported as ARR in AI companies isn’t actually recurring in any meaningful sense. It’s experimental. The customers are trying, not buying.
This distinction matters more than you might think. And it led me down a rabbit hole that surfaced some uncomfortable questions about where AI products are headed, including one that keeps nagging at me: what happens when companies try, try, try... and then just build internally?
So I dug in.
What You’ll Learn (8-minute read)
What ERR actually means and why it emerged now
The warning signs that revenue might be experimental rather than recurring
How ERR connects to the Build vs. Buy vs. Try calculus in AI
Why this matters for AI PMs building products
How to apply the ERR lens to your own customer relationships
What converts experimental revenue into real, durable ARR
The bigger questions ERR raises that we don’t have answers to yet
The Problem ERR Is Trying to Solve
Here’s the core issue: traditional ARR assumes predictability. You sign a customer to an annual contract, and barring something unusual, they renew. The whole SaaS model depends on this, which is why investor multiples for SaaS companies are calculated off ARR.
But AI has introduced a new category of customer behavior that looks like commitment but isn’t.
I recently spoke with a senior technology leader at a Fortune 100 company who laid it out plainly: his team is running parallel experiments with three different AI coding tools (Cursor, Claude Code, GitHub Copilot) with no intention of picking just one. “We know this landscape changes fast,” he told me. That’s smart risk management from his side. From the vendor side, it’s three simultaneous ERR relationships.
Companies are signing short pilots with easy opt-outs. They’re testing AI tools with discretionary “innovation budgets” that could disappear next quarter. They’re buying because of FOMO (their competitors bought something similar), not because they’ve validated clear ROI. And crucially, they often can’t articulate what success looks like.
Ball’s point isn’t that this revenue is fake. It’s real money, hitting real bank accounts. But it lacks the characteristics that make recurring revenue valuable: predictability, retention, and expansion potential.
As Ball put it: if annual churn is 40%, it’s hard to justify calling it “recurring” revenue.
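To see why 40% churn undercuts the word “recurring,” here is a rough back-of-the-envelope sketch. It assumes a constant churn rate and no expansion revenue, which is a simplification. The $1M cohort and the function name are mine for illustration, not from Ball’s post.

```python
def retained_revenue(starting_revenue: float, annual_churn: float, years: int) -> list[float]:
    """Revenue left from one customer cohort at the end of each year.

    Assumes a constant annual churn rate and no expansion (illustrative only).
    """
    if not 0.0 <= annual_churn <= 1.0:
        raise ValueError("annual_churn must be between 0 and 1")
    retention = 1.0 - annual_churn
    return [starting_revenue * retention ** year for year in range(1, years + 1)]


# Hypothetical $1M cohort at the 40% annual churn mentioned above.
cohort = retained_revenue(1_000_000, 0.40, 3)
for year, revenue in enumerate(cohort, start=1):
    print(f"Year {year}: ${revenue:,.0f}")
# Year 1: $600,000
# Year 2: $360,000
# Year 3: $216,000
```

Under those assumptions, barely a fifth of the original revenue is still there after three years. That is a very different asset from the one a SaaS multiple is pricing.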
What Makes Revenue “Experimental”?
Based on Ball’s framework and follow-up analysis from other investors, here are the behavioral signals that suggest revenue might be ERR rather than ARR:
Contract structure tells you a lot. Short pilots with 30- or 90-day opt-out clauses are a classic ERR signal. So are contracts where the customer can cancel without penalty if certain vague criteria aren’t met. Compare this to traditional enterprise software, where customers sign multi-year commitments with real switching costs.
Budget source matters. Is the spending coming from a line item like “AI experimentation” or “innovation fund”? That’s a discretionary budget that might not exist next year. Compare this to the budget allocated to core operations (like CRM or ERP), which is sticky by nature.
The enterprise leader I spoke with drew a sharp line here. Developer tools with trackable ROI? Worth it. “You can literally track the numbers,” he said. But big AI platforms promising vague innovation benefits? That’s just “scratching the R&D itch,” and his team treats those relationships very differently.
Champion concentration is risky. If your entire customer relationship depends on one AI-enthusiastic VP, you’re exposed. When that person leaves, gets promoted, or shifts priorities, the contract may not survive. True ARR typically involves broader organizational adoption, often procurement involvement, and integration into workflows that multiple teams depend on.
Undefined success criteria are a red flag. Ask the customer: how will you know if this is working? If they can’t answer clearly, you’re in experimental territory. They bought the promise of AI, not a solution to a specific problem with measurable outcomes.
First-mover grabbing behavior is another signal. Did the customer do a thorough evaluation, or did they grab the first AI solution that came along because they felt pressure to “do something with AI”? The former suggests considered adoption. The latter suggests experimentation.
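If you want to make these five signals checkable per account, one option is a simple flag count. This is a minimal sketch, not a validated scoring model: the field names, the equal weighting, and the threshold of three are all my assumptions, and you would want to tune them against your own renewal history.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical customer record; field names are illustrative."""
    name: str
    easy_opt_out: bool          # short pilot or penalty-free cancellation
    discretionary_budget: bool  # paid from an "innovation" style budget
    single_champion: bool       # relationship hinges on one person
    undefined_success: bool     # customer can't say how success is judged
    rushed_purchase: bool       # bought under pressure, little evaluation


def err_signals(account: Account) -> list[str]:
    """Return the labels of the ERR warning signs present on this account."""
    checks = {
        "easy opt-out": account.easy_opt_out,
        "discretionary budget": account.discretionary_budget,
        "single champion": account.single_champion,
        "undefined success criteria": account.undefined_success,
        "rushed purchase": account.rushed_purchase,
    }
    return [label for label, present in checks.items() if present]


def classify(account: Account, threshold: int = 3) -> str:
    """Label an account by how many signals it shows (threshold is an assumption)."""
    return "likely ERR" if len(err_signals(account)) >= threshold else "closer to ARR"


pilot = Account("Pilot Co", True, True, True, False, False)
core = Account("Core Co", False, False, False, False, True)
for account in (pilot, core):
    print(account.name, "->", classify(account), err_signals(account))
```

Equal weights are the crudest possible choice. In practice, budget source or contract structure may deserve more weight than the others, but the list of flags is often more useful in a renewal conversation than the label itself.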
A Real Example: McDonald’s and AI Drive-Thrus
Ball highlighted McDonald’s partnership with IBM on AI-powered drive-thru ordering as a cautionary tale. This was real revenue for IBM, based on a major enterprise customer deploying AI in actual restaurants.
Then McDonald’s ended the partnership in 2024.
The revenue that looked like a major enterprise win evaporated. Was it ever truly “recurring”? In hindsight, it was clearly experimental, and the experiment didn’t work out.
This isn’t an isolated case. Ball specifically flagged vector databases as a high-ERR category, noting that companies often build early functionality with dedicated vector databases during experimentation, then migrate to something like MongoDB when it’s time to scale for real production deployment.
Build vs. Buy vs. Try: The New Calculus
In that same Lenny’s podcast episode, Jeanne DeWitt Grosser discussed how the calculus of build vs. buy is shifting in the AI era. This connects directly to ERR in a way that I think is underappreciated.
The traditional framing was binary: do we build this internally or buy a solution? But AI has introduced a third option that’s become the default for many enterprises: Try.
Try means: run a pilot, allocate some innovation budget, test a few vendors doing similar things, and see what we learn. It’s not a commitment to buy. It’s not a decision to build. It’s information gathering.
And here’s the thing: trying is actually smart behavior for buyers. When the technology is moving this fast and the vendor landscape is this fragmented, running parallel experiments with multiple solutions before committing makes total sense. I’ve been in the AI world long enough to see how informative these experiments can be. They surface integration challenges, reveal actual (vs. promised) capabilities, and help organizations understand what they really need.
The Fortune 100 leader I mentioned earlier put it bluntly: “Hyperscalers have been gold because we can consume APIs and pay as we go. We need the ability to flex.” When vendors come to him with multi-year platform commitments, his team says no, even if the technology looks promising. The lock-in risk is too high when the landscape is shifting this fast.
But for vendors, “Try” revenue is ERR. It looks like traction. It feels like validation. But it’s not a commitment, and treating it like ARR is a mistake.
The uncomfortable question this raises: what happens when the experiment concludes... and the customer decides to build internally? They’ve learned what they needed from the trial. They understand the problem space better now. And they have engineers who are confident they can replicate the core functionality.
This isn’t hypothetical. The same enterprise leader confirmed his company is building its own internal AI assistant for employees rather than buying a platform solution. The vendors who pitched them over the past year? They were learning experiences. Valuable, sure. But not customers who converted.
This is a real risk, and it’s one that the ERR framework helps you see clearly.
Why This Matters for AI Product Managers
You might be thinking: I’m a PM, not a VC. Why do I care about how investors categorize revenue?
Here’s why: the ERR framework changes how you should interpret customer signals.
If most of your customers are in ERR territory, you don’t actually have product-market fit yet. You have proof-of-concept validation at best. The work isn’t done, and it’s dangerous to staff up or expand features as if you have a stable foundation.
Experimental customers also behave differently than committed ones. They’re less likely to give you detailed feedback because they haven’t invested enough to care deeply. They’re more likely to churn at renewal. And they’re more likely to make feature requests that don’t represent your broader market. If you’re prioritizing your roadmap based on input from ERR customers, you might be building for a market that won’t stick around.
Your expansion strategy depends on the ERR/ARR ratio too. Expanding experimental customers (upselling them to bigger contracts) is risky because you’re building on an unstable foundation. Expanding truly committed customers has much higher expected value.
I’ve written before about how 78% of firms used AI last year, but far fewer are seeing large-scale business transformations. Many projects remain stuck at the proof-of-concept stage. The ERR framework helps explain part of why: a lot of what looks like adoption is actually experimentation that never converted.
Converting ERR to ARR: What Actually Works
So how do you move customers from experimental to committed? Based on the research and patterns I’ve seen, a few things seem to matter:
Integration depth is probably the most important factor. When your AI capability is embedded into core workflows (not just available as an optional tool), switching costs increase dramatically. The customer can’t leave without re-engineering their processes. This is Layer 3 of the AI Moat Pyramid (workflow integration), and it applies to individual customer relationships as much as to competitive positioning.
Defined success metrics that the customer owns. Not metrics you chose, but metrics they articulated as important before the pilot started. When a customer can say “we reduced X by 30%” rather than “we’re using the AI tool,” you’ve moved from experiment to validated solution.
Multi-stakeholder adoption within the customer org. When usage spreads beyond the original champion to multiple teams or departments, the relationship becomes more durable. One person can change priorities or leave. An entire organization changing behavior is much stickier.
Budget migration from discretionary to operational. This is a big one. When the customer moves your cost from “innovation budget” to “cost of doing business,” you’ve crossed a threshold. That’s a signal they see you as infrastructure, not experiment.
The ERR Audit: Questions to Ask About Your Own Customers
If you want to apply this framework to your current customer base, here are questions worth asking:
For each major customer:
What’s the contract length, and how easy is it to exit?
Where does the budget come from? Is it discretionary or operational?
Can they articulate specific ROI they’ve achieved?
How many people in the organization actually use the product regularly?
Is the product integrated into workflows, or is it a standalone tool they could drop?
If the original champion left tomorrow, would the contract survive?
For your overall portfolio:
What percentage of revenue comes from customers with short pilots vs. multi-year commitments?
What’s your churn rate for first-year customers specifically? (High first-year churn is a classic ERR signal)
How often do pilots convert to full contracts vs. quietly not renewing?
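The three portfolio questions translate into ratios you can compute from a contract export. Here is a hedged sketch. The Contract fields, the six-month cutoff for what counts as a pilot, and the sample numbers are all hypothetical, and churn is counted by customer rather than by dollars.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    """Hypothetical contract record; adapt the fields to your own data."""
    annual_revenue: float
    term_months: int
    first_year: bool  # customer is in (or just finished) its first year
    renewed: bool     # renewed or converted at the end of the term


def ratio(numerator: float, denominator: float) -> float:
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


def portfolio_metrics(contracts: list[Contract], pilot_max_months: int = 6) -> dict[str, float]:
    """Compute the three portfolio-level ERR ratios.

    Contracts at or under pilot_max_months count as pilots (an assumed cutoff).
    """
    pilots = [c for c in contracts if c.term_months <= pilot_max_months]
    first_year = [c for c in contracts if c.first_year]
    return {
        "pilot_revenue_share": ratio(
            sum(c.annual_revenue for c in pilots),
            sum(c.annual_revenue for c in contracts),
        ),
        "first_year_churn": ratio(sum(not c.renewed for c in first_year), len(first_year)),
        "pilot_conversion": ratio(sum(c.renewed for c in pilots), len(pilots)),
    }


# Made-up sample portfolio for illustration.
contracts = [
    Contract(120_000, 3, True, False),
    Contract(80_000, 3, True, True),
    Contract(100_000, 12, True, True),
    Contract(300_000, 24, False, True),
]
metrics = portfolio_metrics(contracts)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

With this sample, a third of revenue sits in pilots, a third of first-year customers churned, and half of pilots converted. No single ratio settles the question, but watching them together over a few quarters should show whether the base is drifting toward ARR or staying experimental.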
I suspect many AI products would find that their “ARR” is significantly more experimental than they’d like to admit. That’s not necessarily bad news. It’s useful information that should inform strategy.
The Bigger Questions ERR Raises
The ERR concept is useful on its own, but it also opens up some bigger strategic questions that I don’t think we have good answers to yet.
If enterprises are in permanent “Try” mode, what does that mean for AI vendors? Are we heading toward a world where customers cycle through experiments indefinitely, always learning but never committing? And if so, how do you build a sustainable business on that foundation?
What happens when Try leads to Build? The vector database example Ball mentioned is instructive. Companies experiment with dedicated solutions, then migrate to integrated alternatives (or build their own). If your product is a learning experience rather than a long-term commitment, that’s a very different business than SaaS.
Are we heading toward a wave of M&A and acqui-hires? If a significant portion of AI startup revenue is actually ERR, some of these companies won’t survive the transition to needing real, durable ARR. Does that mean 2026 becomes an acqui-hire bonanza, where larger companies absorb AI teams whose products couldn’t convert experimental traction into sustainable revenue?
I don’t have definitive answers to these questions. But I think they’re worth sitting with, especially if you’re building or leading AI products right now.
What’s Next
The next time you’re reviewing a customer renewal or expansion opportunity, try applying the ERR lens. Ask yourself: is this customer committed or experimenting?
The answer might change how you prioritize that relationship, and how honest you are with yourself about the stability of your revenue base.
What signals do you use to distinguish experimental customers from committed ones? Drop a comment below.
Sources
Lenny’s Podcast, “What world-class GTM looks like in 2026 | Jeanne DeWitt Grosser (Vercel, Stripe, Google)” (November 2025): https://www.lennysnewsletter.com/p/what-the-best-gtm-teams-do-differently
Jamin Ball, “Clouded Judgement: ERR vs ARR and the Conundrum of AI Revenue Streams Today” (March 2024): https://cloudedjudgement.substack.com/p/clouded-judgement-32224-err-vs-arr
Jamin Ball, “Clouded Judgement: How to Spot ERR” (June 2025): https://cloudedjudgement.substack.com/p/clouded-judgement-6625-how-to-spot
The GTM Newsletter, “ARR vs. ERR: Why every dollar isn’t equal”: https://thegtmnewsletter.substack.com/p/arr-vs-err-why-every-dollar-isnt



