When Sam Altman took the stage at a recent tech conference, the message was unmistakable: OpenAI isn't just a research lab anymore. It's a platform. It's an infrastructure play. It wants to be the Amazon Web Services of artificial intelligence—the layer upon which the entire economy runs. The vision is grand, the funding is astronomical, and the ambition is borderline imperial. But for all the talk of superintelligence and trillion-dollar valuations, OpenAI has one glaring problem that nobody in the boardroom seems willing to address directly: the fundamental economics of inference don't work the way the company seems to think they do.
The Amazon playbook doesn't fit
Let's give credit where it's due. Amazon Web Services succeeded because it commoditized a previously expensive, specialized resource—compute capacity. Jeff Bezos bet that if you could spin up a server for pennies an hour, developers would flood in. And they did. The unit economics were simple: Amazon bought servers at scale, ran them efficiently, and sold the spare cycles at a margin. That model works because a server's marginal cost approaches zero once it's built and running. A single AWS instance can host a thousand customers' WordPress sites. The cost per transaction is effectively nothing.
OpenAI, on the other hand, has to run a new, expensive computation for every single query. Every time you ask ChatGPT to summarize a 50-page PDF or generate an image of a cat in a spacesuit, it burns through GPU cycles. Those cycles cost real money—electricity, cooling, hardware depreciation, and the constant need to upgrade to the next generation of chips. Worse, the demand is unpredictable. A viral tweet can send inference costs through the roof. There is no "idle capacity" to sell at a discount. Every user is a direct, variable cost.
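To make the contrast concrete, here's a back-of-envelope sketch in Python. Every figure in it—the server cost, the GPU rate, the seconds per query—is an illustrative assumption, not a reported number from either company:

```python
# Illustrative unit-economics sketch. All figures below are hypothetical
# assumptions for the sake of comparison, not actual AWS or OpenAI costs.

def shared_server_marginal_cost(monthly_server_cost: float, tenants: int) -> float:
    """Classic cloud model: one fixed server cost amortized across tenants."""
    return monthly_server_cost / tenants

def inference_marginal_cost(gpu_cost_per_hour: float, seconds_per_query: float) -> float:
    """Inference model: every single query consumes dedicated GPU time."""
    return gpu_cost_per_hour * (seconds_per_query / 3600)

# A hypothetical $200/month server hosting 1,000 small sites:
# the cost per tenant collapses toward zero as tenants pile on.
aws_style = shared_server_marginal_cost(200.0, 1_000)

# A hypothetical $2/hour GPU spending 5 seconds of compute per query:
# the cost never amortizes—it recurs on every request.
llm_style = inference_marginal_cost(2.0, 5.0)

# At 1,000 queries a month, a single heavy user carries a real,
# growing cost instead of a vanishing one.
per_user_monthly = llm_style * 1_000

print(f"shared-server cost per tenant: ${aws_style:.2f}/month")
print(f"inference cost per query:      ${llm_style:.4f}")
print(f"inference cost per heavy user: ${per_user_monthly:.2f}/month")
```

The point of the sketch isn't the specific numbers—it's the shape of the two curves: one flattens as customers are added, the other rises in lockstep with usage.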
The scaling paradox
OpenAI's solution to this problem has been to push for ever-larger models. The theory is that a bigger, smarter model will be more efficient per task—it will need fewer tokens to answer a question, require fewer retries, and hallucinate less. That's true, to a point. GPT-4 delivers more quality per token than GPT-3. But the underlying physics of transformer models means that the relationship between model size and cost is brutally nonlinear. The jump from GPT-3 to GPT-4 reportedly required on the order of 100x more training compute, and the inference cost is still significantly higher per query.
And here's the kicker: the market already prices in those efficiency gains. Users and businesses expect better answers at the same price. If OpenAI charges $20 a month for ChatGPT Plus today, it can't suddenly raise that to $200 because the model got smarter. The competition—from Google's Gemini, Meta's Llama, and a dozen open-source alternatives—keeps the pricing pressure on. So the more OpenAI invests in bigger models, the more it has to sell them at razor-thin margins, hoping volume will make up for it.
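The squeeze is easy to see with a toy calculation. Holding the $20 subscription price fixed and varying a hypothetical per-query cost (the query volume and cost figures here are illustrative assumptions, not disclosed data):

```python
# Hypothetical margin-squeeze sketch: the subscription price stays flat
# while per-query inference cost rises with model size. All per-query
# costs and the query volume are illustrative assumptions.

def monthly_margin(price: float, cost_per_query: float, queries: int) -> float:
    """Subscription revenue minus variable inference cost for one user."""
    return price - cost_per_query * queries

PRICE = 20.0       # the ChatGPT Plus price cited above
QUERIES = 1_500    # assumed monthly volume for a heavy user

for cost_per_query in (0.002, 0.005, 0.01, 0.02):
    margin = monthly_margin(PRICE, cost_per_query, QUERIES)
    print(f"cost/query ${cost_per_query:.3f} -> monthly margin ${margin:+.2f}")
```

Under these assumptions, a heavy user flips from profitable to loss-making once the per-query cost crosses about $0.013—without the user doing anything differently. That's the treadmill: a smarter, costlier model can make your best customers your worst ones.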
The enterprise trap
OpenAI's real target is enterprise contracts. They want to be the backbone of customer service chatbots, internal knowledge bases, and automated coding assistants. This is the AWS dream—recurring revenue from Fortune 500 companies. But enterprise buyers are notoriously stingy. They want custom SLAs, data privacy guarantees, and integration support. They also demand predictable pricing. A company building a customer service bot cannot afford a 10x cost spike because a new model upgrade hit the market. They want fixed costs per query, and they will negotiate hard.
This creates a fundamental tension. OpenAI needs to invest billions in R&D and compute infrastructure to stay ahead of the competition. But they also need to sell their product at a price point that enterprise buyers will stomach. The only way to square that circle is to either massively scale up volume—which requires more compute, more cost—or to find a breakthrough in efficiency that nobody else has discovered. The latter is a moonshot. The former is a treadmill.
Where the money actually goes
Let's look at the numbers. OpenAI is reportedly burning through $5–7 billion a year on compute and salaries. Its revenue is growing fast—maybe $2–3 billion this year—but it's still a long way from covering costs. Investors are betting that the revenue curve will eventually outpace the cost curve. That's the classic Silicon Valley growth story. But in this case, the cost curve isn't just fixed hardware; it's variable and scaling with every new user. AWS also required massive upfront investment, but the marginal cost of serving a new customer on a shared server was near zero. For OpenAI, every new customer brings a new, real marginal cost.
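A crude projection shows why the shape of the cost curve matters more than the size of the gap. Starting from the article's rough figures (~$6B burn, ~$2.5B revenue), the growth rates below are hypothetical assumptions, chosen only to contrast a mostly-fixed cost base with one that scales with usage:

```python
# Back-of-envelope breakeven projection. Starting figures are the rough
# numbers cited in the article; all growth rates are hypothetical.

def years_to_breakeven(revenue: float, cost: float,
                       revenue_growth: float, cost_growth: float,
                       max_years: int = 20):
    """Return the first year in which revenue covers cost, or None."""
    for year in range(1, max_years + 1):
        revenue *= 1 + revenue_growth
        cost *= 1 + cost_growth
        if revenue >= cost:
            return year
    return None

# Scenario A: costs are mostly fixed (AWS-like), growing 10% a year
# while revenue grows 80% a year -> breakeven comes quickly.
print(years_to_breakeven(2.5, 6.0, revenue_growth=0.8, cost_growth=0.1))

# Scenario B: costs scale with usage (inference-like), growing 60% a year
# alongside the same 80% revenue growth -> the gap closes far slower.
print(years_to_breakeven(2.5, 6.0, revenue_growth=0.8, cost_growth=0.6))
```

Under these assumed rates, the fixed-cost scenario breaks even in two years; the usage-scaled scenario takes eight. Same revenue curve, radically different business.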
The most telling sign of this problem is OpenAI's own behavior. They have started limiting free-tier usage, throttling high-volume users, and pushing for "tiered" access that charges more for premium features. These are the moves of a company that knows its unit economics are broken, not a company that is confidently scaling into dominance. It's the same pattern we saw with Uber—pricing below cost to capture market share, then scrambling to find a profitable equilibrium. Uber eventually found it by raising prices and cutting driver pay, but that's a much harder game to play when your "drivers" are million-dollar GPU clusters.
The open-source elephant
And then there's the open-source movement. Meta's Llama 2 and 3, Mistral, and a dozen other models are freely available. Any startup or large enterprise can run them on their own hardware or lease GPU time from a cloud provider. The cost of running an open-source model is often a fraction of what OpenAI charges, and the quality gap is narrowing fast. OpenAI's moat was always the quality of their model. But that moat is eroding. Once a model is good enough, businesses will choose the cheaper, self-hosted option. OpenAI is betting that their next-generation models will stay far enough ahead to justify the premium. That's a dangerous bet when the entire open-source ecosystem is collectively spending billions to catch up.
None of this means OpenAI is doomed. They have a strong brand, a talented team, and a massive head start. But the Amazon-level ambition requires solving a problem that Amazon never had: the cost of the core product itself increases with every customer you add. Until OpenAI finds a way to decouple revenue growth from compute cost growth—either through a radical efficiency breakthrough, a subscription model that heavily subsidizes heavy users, or a pivot to something entirely different—the financial math simply doesn't add up. And in the end, the market always does the math.
Ahmed Abed – News journalist