
The AGI Paradox: Why the Biggest Technology Shift in History Hasn't Shown Up in Your Productivity Numbers Yet


Ali Ghodsi, CEO of Databricks, made a claim in a Stanford lecture earlier this year that should stop every business leader in their tracks: "We already have AGI."

This is not hype. His argument is precise: every peer-reviewed definition of artificial general intelligence written before 2010 has, by his account, now been met. By any measure that existed when researchers coined the term, AGI is here. It does not feel that way because the definition kept moving as the capability arrived.

That reframing matters because it explains the central puzzle of this moment: the most capable technology ever built by humans is producing surprisingly modest results in most organizations. Not because the technology is oversold, but because most organizations are using it wrong.

What follows draws on Stanford's MS&E435 Economics of the AI Supercycle course from Spring 2026, featuring Apoorv Agrawal of Altimeter Capital alongside guests Brad Gerstner (Altimeter Capital), Sunny Madra (formerly Groq and Nvidia), Chase Lochmiller (Crusoe Energy), and Ali Ghodsi (Databricks). Four sessions. One consistent argument.

The Inverted Triangle

To understand where AI value sits today, picture a triangle. At the bottom: the semiconductor layer. In the middle: cloud hyperscalers. At the top: the application layer where most businesses operate.

Right now, that triangle is inverted: value concentrates at the bottom rather than the top. The semiconductor layer earns gross margins around 75%. The application layer earns 0 to 30%. Value sits closest to silicon while silicon remains scarce. Every dollar of application revenue depends on compute infrastructure that is genuinely constrained.

This will change. As supply catches up to demand, the triangle will flip. The infrastructure advantage that currently belongs to chip manufacturers and hyperscalers will compress. Application businesses that have built real differentiation, real data advantages, real switching costs, will capture more of what the technology creates.

But that transition has not happened yet. For now, the people building the shovels are doing better than the people mining.

The Physics of the Build-Out

The scale of AI infrastructure investment is difficult to comprehend without concrete numbers.

A modern AI data center costs approximately $60 million per megawatt to build: $20 million for infrastructure, $40 million for IT compute. Once operational, it generates $15 to $30 million in annual revenue per megawatt, implying a two-to-four year payback on construction costs.
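The payback range follows directly from the figures quoted above. A minimal Python sketch of that arithmetic, assuming simple payback on gross revenue and ignoring operating costs, power, and financing (the function name `payback_years` is illustrative, not from the course):

```python
def payback_years(capex_per_mw: float, annual_revenue_per_mw: float) -> float:
    """Simple payback: years of gross revenue needed to equal the build cost."""
    if annual_revenue_per_mw <= 0:
        raise ValueError("annual revenue must be positive")
    return capex_per_mw / annual_revenue_per_mw


# Figures from the text: $20M infrastructure + $40M IT compute per megawatt.
CAPEX_PER_MW = 20e6 + 40e6

slow = payback_years(CAPEX_PER_MW, 15e6)  # low end of the revenue range
fast = payback_years(CAPEX_PER_MW, 30e6)  # high end of the revenue range

print(f"Payback range: {fast:.1f} to {slow:.1f} years")
# prints "Payback range: 2.0 to 4.0 years"
```

Because this divides cost by revenue rather than by profit, real payback periods would be longer once operating expenses are included.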

Chase Lochmiller described the Abilene, Texas campus that Crusoe is developing: 2.1 gigawatts of capacity, 9,000 construction workers on site, and the largest privately owned electrical substation in the United States. That is one campus.

The implication that matters for business planning: compute committed today will not be operational for two to four years. The capacity that will power AI applications in 2028 is being contracted now. Companies that assume they can acquire compute on demand when they need it are planning against a market that does not exist.

Tokens as Digital Labor

The most useful economic frame for understanding AI comes from standard macroeconomics. The Cobb-Douglas production function describes output as a combination of labor and capital. In the AI economy, tokens are labor. GPUs are capital.
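For readers who want the formula, the standard textbook form of Cobb-Douglas with constant returns to scale is shown below. The mapping of symbols to tokens and GPUs follows the framing above; the value of the exponent is not given in the course material and is left unspecified here.

```latex
Y = A \, K^{\alpha} \, L^{1-\alpha}, \qquad 0 < \alpha < 1
% Y: output
% A: total factor productivity
% K: capital (in this framing, GPUs)
% L: labor (in this framing, tokens)
% \alpha: output elasticity of capital (unspecified assumption)
```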

That reframe has a concrete implication: as token costs fall, the cost of digital labor falls. And token costs have fallen dramatically. A task that cost one dollar in 2023 costs roughly one cent today. Inference costs dropped approximately 90% year over year, and approximately 99% over two and a half years.
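The two rates quoted above are roughly consistent with each other, which a short compounding check makes visible. This is a sketch that assumes a constant annual rate of decline, a simplification the source does not claim; the function names are illustrative.

```python
def remaining_cost_fraction(annual_decline: float, years: float) -> float:
    """Fraction of the original cost left after compounding a yearly decline."""
    return (1.0 - annual_decline) ** years


def implied_annual_decline(total_decline: float, years: float) -> float:
    """Constant yearly decline that compounds to the given total decline."""
    return 1.0 - (1.0 - total_decline) ** (1.0 / years)


# A 90% yearly decline leaves about 1% of the cost after two years,
# matching the dollar-to-a-cent example.
print(f"{remaining_cost_fraction(0.90, 2):.1%}")   # about 1.0%

# A 99% decline spread over two and a half years implies roughly
# 84% per year, in the same range as "approximately 90%".
print(f"{implied_annual_decline(0.99, 2.5):.1%}")  # about 84.2%
```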

This is deflationary on the cost side. But it has also dramatically expanded what is economically feasible to automate. Tasks that were not worth running at a dollar are worth running at a penny. The addressable market for AI-powered work expands every time costs fall.

The Deflationary Paradox


Here is the paradox that confuses most analysis of AI economics: inference costs are falling dramatically, but compute spending is accelerating.

The reason is volume. Early language models generated roughly 100 tokens per response. Reasoning models today generate 10,000 to 100,000 tokens working through a problem. Agents multiply that further by running multiple calls in sequence. The per-token cost fell, but the number of tokens per task increased by orders of magnitude.
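The volume effect can be checked with the numbers already quoted: spend per response is tokens times price per token. The sketch below assumes the roughly 99% price decline applies over the same period in which responses grew from about 100 tokens to 10,000 or 100,000 tokens; that alignment is an assumption for illustration, and `relative_spend` is a hypothetical helper.

```python
def relative_spend(tokens_before: int, tokens_after: int,
                   price_decline: float) -> float:
    """Spend per response after the change, as a multiple of spend before."""
    price_ratio = 1.0 - price_decline          # e.g. 0.01 after a 99% decline
    volume_ratio = tokens_after / tokens_before
    return volume_ratio * price_ratio


# 100 tokens per response then, 10,000 to 100,000 now, at 99% lower prices.
low = relative_spend(100, 10_000, 0.99)
high = relative_spend(100, 100_000, 0.99)

print(f"Spend per response: {low:.1f}x to {high:.1f}x the earlier level")
# prints "Spend per response: 1.0x to 10.0x the earlier level"
```

Under these assumptions a single reasoning response costs as much as or more than an early response did, before counting agents that chain many calls together.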

The result: total spending on compute goes up even as unit costs go down. Anthropic's annualized revenue reportedly grew from $3.5 billion to $8 billion to $10.5 billion in roughly 60 days. By those figures, that is about $7 billion in added annualized revenue in two months, an increase on the scale of the combined revenues of Databricks and Palantir.

Cheaper tokens did not shrink the market. They expanded it by making previously impossible applications economically viable.

The Enterprise Context Problem

This is where the productivity paradox lives.

AI models know essentially everything that has ever been written publicly. They do not know your business. They do not know your customers, your processes, your institutional knowledge, your internal terminology, or your specific competitive context. Every enterprise AI deployment is fundamentally a problem of closing that gap.

Ghodsi used a specific example. Databricks used AI to help build a new data connector. The estimate went from nine months down to seven and a half months. Meaningful, but marginal. Then someone asked a different question: what if we planned this project assuming AI capabilities from the start, rather than applying AI to the existing plan?

The answer was one quarter.

The difference between seven and a half months and three months is not a different AI tool. It is a different mode of deployment. Augmentation means AI helps people work faster within the existing process. Rewiring means starting from first principles given what AI can do, and building a different process.

Productivity gains from augmentation are incremental. Productivity gains from rewiring are discontinuous. Most organizations are pursuing augmentation and measuring the results against benchmarks that assume rewiring. The gap is not AI underperforming. It is the wrong deployment model.

Moats That Survive

The competitive landscape for AI-powered businesses is shifting faster than most strategy frameworks can track.

The open-source gap, which once gave frontier proprietary models a meaningful advantage, has compressed to weeks or months. Kimi 2.6, released in June 2026, matched or exceeded proprietary model benchmarks at open-source cost. Any competitive advantage built purely on model capability is fragile.

What survives? The frameworks that hold up under examination are the ones that predate AI. Hamilton Helmer's Seven Powers remains the most useful lens. Of those powers, the durable moats here are: proprietary data advantages (what the model knows that competitors cannot access), switching costs (the cost to a customer of changing providers), network effects (value that increases as more users participate), and process power (deeply embedded operational know-how that takes years to replicate).

Scale economies at the infrastructure layer are real but concentrated in a small number of players. For the vast majority of businesses, the durable advantage in an AI world looks the same as the durable advantage in any other era: be genuinely close to the customer, accumulate data that reflects your specific context, and make it costly to leave.

Where Value Will Land

Two verticals emerged consistently across the Stanford sessions as the most likely sites of the first trillion-dollar AI applications: healthcare and education.

Healthcare represents approximately 17% of US GDP. Education is the other large sector where access to expertise has historically been rationed by time and cost rather than capability. Both share a structural characteristic: the limiting factor is not the knowledge itself, but the delivery of that knowledge at scale to individuals with specific contexts.

A physician cannot be in a thousand places at once. A great teacher cannot personally serve every student. AI does not replace either. It makes the output of expertise available at a cost that approaches zero per interaction. That is not a marginal improvement. It is a structural change in how knowledge-intensive services scale.

Both sectors are also, notably, context problems. The model needs to know the patient. The model needs to know the student. This is exactly where proprietary data and institutional context create durable value. The organizations that build those context advantages now will be difficult to displace later.

The Dynamo Moment


The most important framework from the entire course comes from economic history.

In 1990, the economist Paul David published a paper noting that the electric dynamo, clearly superior to steam power by the early 1890s, had produced almost no measurable improvement in US industrial productivity statistics for decades. Commercial electric power arrived in the 1880s. The productivity gains did not appear in the data until the 1920s. Forty years.

The reason: factories were built around a central drive shaft. When electrification arrived, factory owners did what was rational in the short term. They bought one large electric motor and plugged it into the existing central shaft. The factory ran on electricity. But it was organized exactly as it had been under steam.

The productivity gains came when a new generation of designers, starting from first principles with electricity as the baseline assumption, redesigned factories entirely. One motor per machine. Flexible floor layouts. Processes organized around what electricity made possible rather than what steam had required. That redesign took decades because it required the old generation of factories, and the managers who built them, to be replaced.

Brad Gerstner's framing in the course: we are somewhere in the 1895-to-1910 range. The technology is real and demonstrably superior. The productivity statistics look unimpressive. The gap is not the technology. The gap is organizational design.

He noted that Airbnb was founded in 2008, roughly fifteen years after the commercial internet arrived. By conventional startup thinking, that was not early. By any sensible analysis of how platform businesses actually scale, it was precisely on time. The infrastructure was mature. The user behavior was established. The regulations had not yet closed the window.

The question for every organization right now is not whether AI is real. That is settled. The question is whether you will spend the next decade plugging electric motors into your central drive shaft, or whether you will redesign around what the technology actually makes possible.

Rewiring is harder than augmentation. It requires rethinking processes that work well enough today, which creates organizational resistance. It requires investment before the productivity gain is visible, which creates budget pressure. And it requires leaders who understand what the technology can do well enough to imagine processes that do not yet exist.

The organizations that do that work now will look, in 2035, the way Airbnb looks today. Not early. Precisely on time.

Ready to rewire instead of just augment?

Book a free 30-minute session. We will walk through where your organization stands today and what a first-principles AI roadmap looks like for your specific context.
