The Cathedral and the Budget
In December 2024, Netflix published a blog post called “Cloud Efficiency at Netflix.” It described an internal team whose job is helping engineers understand what resources they use, how efficiently they use them, and what they cost. The company that the entire industry treats as the gold standard for cloud infrastructure needed dedicated tooling and a dedicated team just to maintain visibility into its own spend.
The Growth Subsidy
Netflix spent fifteen years building genuinely impressive infrastructure. Chaos Monkey. Microservices at unprecedented scale. Open Connect, a private CDN with thousands of servers inside ISP networks across 150+ countries. An open-source portfolio that entire companies bootstrapped on.
All of it was real. All of it was funded by a growth curve that made cost a secondary concern.
When your subscriber count doubles every few years, over-provisioned infrastructure disappears into the top line. You can spend a billion a year on AWS and nobody asks questions because revenue is growing faster than the bill. A rising tide does not just lift boats. It hides every hull below the waterline.
Can You Shrink a Cathedral?
Netflix is not in trouble. They won.
But growth is decelerating. The era where subscriber numbers doubled every few years is over. The ad tier buys time, but it does not shrink the infrastructure. It adds to it. So imagine what happens if Netflix actually has to contract. A recession hits. Ad revenue dries up. Subscriber growth goes negative. Headcount gets cut.
Think about what they would be trying to shrink. Over 40,000 microservices, each with its own deployment pipeline, its own monitoring, its own failure domain. Thousands of Open Connect appliances in ISP data centers across 150+ countries, all needing firmware, capacity planning, and hardware lifecycle management. A logging system that ingests five petabytes of data per day and required Netflix to build custom infrastructure on top of ClickHouse and Apache Iceberg just to make their own logs searchable. Infrastructure to observe the infrastructure.
All of it still exists after the cuts. The pipelines still need upgrades. The hardware still ages. The chaos engineering tooling still injects failures, but the team that understood why the experiments were configured a certain way got reorged. Nobody remembers which monkeys to cage or why they were loose in the first place. The logging infrastructure that watches the infrastructure still needs its own care and feeding. The deployment tooling was written for an organization twice this size and nobody has time to simplify it because everyone left is busy keeping the lights on.
Scaling up is an engineering problem. Scaling down is a political one. You have to decide which services to kill, which teams to consolidate, which capabilities to abandon. Imagine the OKR wars when someone proposes killing a service that powers a beloved A/B testing framework, or sunsetting a chaos tool whose creator is now a director. Every one of those decisions runs into someone who built the thing, someone who depends on the thing, and someone whose headcount is justified by the thing. Growth lets you avoid all of that. You never have to choose because there is always room for one more service, one more team, one more layer. That is the heating bill of a cathedral. And it is a lot easier to build a cathedral than to figure out which rooms you can permanently lock.
But the industry should be paying attention. Because if the company that built the cathedral is struggling just to read its own utility bill, the companies that copied the cathedral on a fraction of the budget have no chance at all.
The Rest of Us
For fifteen years, startups with ten engineers adopted microservices because Netflix used microservices. Companies with comfortable uptime targets implemented chaos engineering because Netflix ran Chaos Monkey. Teams that could have shipped a monolith on one server built distributed systems requiring platform teams to operate.
Netflix does it. Netflix succeeded. Therefore the practice causes success.
Nobody asked whether their growth curve would absorb the overhead. Nobody asked whether patterns built for hundreds of millions of users made sense at a few thousand. Netflix was not prescribing. The industry chose to build a religion around it.
Netflix might figure out how to right-size a cathedral. They have the budget to learn.
Everyone who copied the architecture without the balance sheet gets to learn the same lesson without the safety net.
The tide goes out as fast as it comes in. Check what you are wearing.