<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://boringops.sh/feed.xml" rel="self" type="application/atom+xml" /><link href="https://boringops.sh/" rel="alternate" type="text/html" /><updated>2026-05-05T14:54:55+00:00</updated><id>https://boringops.sh/feed.xml</id><title type="html">./BoringOps.sh</title><subtitle>BoringOps is the discipline of building infrastructure that behaves predictably. No drama. No heroics. Just systems that work.</subtitle><author><name>BoringOps</name></author><entry><title type="html">The Agentic Blame Game</title><link href="https://boringops.sh/articles/the_agentic_blame_game/" rel="alternate" type="text/html" title="The Agentic Blame Game" /><published>2026-05-04T00:00:00+00:00</published><updated>2026-05-04T00:00:00+00:00</updated><id>https://boringops.sh/articles/the_agentic_blame_game</id><content type="html" xml:base="https://boringops.sh/articles/the_agentic_blame_game/"><![CDATA[<p>The memo went out in January. AI is here. We are flattening the org. Testers are now feature owners. Developers own their own tests. Everyone gets an agent. Headcount drops thirty percent. Velocity is supposed to go up.</p>

<p>It did, for a quarter. The agents shipped. The audit logs were clean. The dashboards were green. The board got the slide.</p>

<p>Then 2:47 AM. Production database encrypted. Ransomware note in the inbox. The blacklist entry that protected the management plane was removed three weeks ago by an agent in a refactor PR titled “clean up unused config.” Tests passed. Pipeline green. Eighteen approvers signed off across six PRs. The attacker walked through the open door this morning.</p>

<p>The board is on the line by 6 AM. They want a head.</p>

<p>Let’s play the Blame Game.</p>

<h2 id="agent-agent-bo-bagent">Agent, Agent, bo-bagent</h2>

<p>The agent executed the request it was given in an environment where nobody had built a guardrail to say “this blacklist entry is critical, never touch it without human review of the originating ticket.” The agent has competence. The agent does not have judgment. Those are different things and the difference is the entire point.</p>

<p>The exec wants to blame the agent. The exec cannot. Blame is a currency. It gets spent on someone with a balance. Reputations get drawn down. Bonuses get withheld. Promotions get delayed. The agent has no balance. Pointing at the agent does not satisfy the political need an outage creates. The board does not get its head.</p>

<p>So blame has to land somewhere else. Someone with capital. Someone the org can spend.</p>

<p>Let’s keep playing.</p>

<h2 id="tester-tester-bo-bester">Tester, Tester, bo-bester</h2>

<p>The memo redefined the tester’s job. Their value was no longer in finding bugs. Their value was in shipping features with AI assistance. The same person who used to be paid to break things was now paid to ship them. They responded to the new incentive. They shipped this one. Their fingerprints are on the spec, the rollout plan, and the ship decision.</p>

<p>Firing them admits the restructure was wrong. The restructure has the exec’s signature on it.</p>

<h2 id="dev-dev-bo-bev">Dev, Dev, bo-bev</h2>

<p>The developer kept their feature targets and absorbed QA on top. The agent wrote the tests. The tests passed. Green builds, clean diffs, fast merges. The signal said the code was good. The signal was generated by the same machine that wrote the code. On paper, the developer authored the failure. In practice, the developer was handed a workload that required the very expertise the org just eliminated, and a quality gate that mostly checked whether the agent was internally consistent with itself.</p>

<h2 id="greybeard-greybeard-bo-baird">Greybeard, Greybeard, bo-baird</h2>

<p>The greybeard wrote that blacklist entry in 2019 after a specific incident. He never wrote down why. The reason lived in his head. Then he got severed in Q2 to fund the AI seats. The entry stayed. The reason left the building.</p>

<p>He cannot be blamed. He is at a competitor now, watching this from the outside. The system had a rule. Nobody documented it. The only copy was in a person, and the person was treated as a cost line.</p>

<h2 id="reviewer-reviewer-bo-buver">Reviewer, Reviewer, bo-buver</h2>

<p>Six PRs, three reviewers each, eighteen approvals. The comments were thoughtful. They cited internal style guides. The conventional commits were perfect. The Slack notifications fired on schedule and went unread because everyone knows reviewers handle the small stuff.</p>

<p>Surely one of these eighteen reviewers should have caught it.</p>

<p>Oops. It’s the agent again.</p>

<p>There is no human in the CI review pipeline. There are only other agents pretending to be human, and the pretending is good enough that nobody noticed when it stopped being review. Same model class, same training, same blind spots, in parallel. Eighteen approvals. One opinion. Repeated.</p>

<h2 id="give-the-agent-its-due">Give the Agent Its Due</h2>

<p>The agents are good at their job. They are shipping real work in real production environments for teams that know how to use them. The same person doing more, faster, with better support, is the actual AI success story. Expanding scope inside a role is a force multiplier. Replacing the role with a prompt is a press release.</p>

<p>The agent did not ransomware the database. The agent did exactly what it was asked to do, in an environment deliberately built to have nobody left who could ask why.</p>

<p>Which means the head the board wants is not in any of the verses we have played so far.</p>

<h2 id="exec-exec-bo-bec">Exec, Exec, bo-bec</h2>

<p>There it is.</p>

<p>The exec did not fail because they adopted AI. They failed because they used AI as cover for a restructure they wanted anyway.</p>

<p>They collapsed role specialization on the theory that AI flattens skill differences. AI flattens execution. AI does not flatten judgment. They replaced human review with agent review and called it equivalent.</p>

<p>AI multiplies what people know. The exec fired the people who knew things. They multiplied zero. They got the speed of zero with the confidence of a senior engineer.</p>

<h2 id="the-postmortem">The Postmortem</h2>

<p>That is the diagnosis. Now the reality.</p>

<p>The exec will not accept it. The exec assembled the meeting. The exec controls the narrative. The exec writes the press release. None of those instruments point inward.</p>

<p>The blame still needs a head. The exec has already picked one. The developer wrote the PR, owns the commit, and is the one human in the chain who will sit in the postmortem and say “I should have caught it.”</p>

<p>The press release will say “an unforeseen integration of AI tooling with legacy infrastructure.” The internal memo will say the developer.</p>

<p>The incident will be called an AI failure because that is the cheapest story available.</p>

<p>It was a management failure with autocomplete.</p>]]></content><author><name>Dan Zrobok</name></author><category term="ai" /><category term="accountability" /><category term="incidents" /><category term="org-design" /><category term="culture" /><summary type="html"><![CDATA[Production is ransomware-encrypted. The board wants a head. Let's play.]]></summary></entry><entry><title type="html">The Steward: Your Engineering Org’s Missing Tastemaker</title><link href="https://boringops.sh/articles/the_steward_your_missing_tastemaker/" rel="alternate" type="text/html" title="The Steward: Your Engineering Org’s Missing Tastemaker" /><published>2026-05-01T00:00:00+00:00</published><updated>2026-05-01T00:00:00+00:00</updated><id>https://boringops.sh/articles/the_steward_your_missing_tastemaker</id><content type="html" xml:base="https://boringops.sh/articles/the_steward_your_missing_tastemaker/"><![CDATA[<blockquote>
  <p>“Stewardship is the ear on the hum.”</p>
</blockquote>

<p>You know the person. Every engineering org has one. They’re not the loudest in the room and they’re probably not the strongest architect. Maybe not even the best engineer. But before anyone makes a hard call, someone walks down the hall and asks them what they think.</p>

<p>Nothing important ships without their nod. Their inbox fills up with “got a sec?” Their calendar is half-occupied by meetings nobody formally invited them to. They have no title for the function they actually perform, and the org chart has nowhere to put them.</p>

<p>That person is a steward. The role is the highest-value non-manager position in your engineering org, and you don’t have it.</p>

<p>Most engineering orgs have governance for whether code is correct. Almost none have governance for whether code should exist. That’s the gap.</p>

<h2 id="what-the-role-actually-is">What the Role Actually Is</h2>

<p>Stewardship is taste applied to systems. That’s it.</p>

<p>Most leaders hear “stewardship” and think compliance. Rule-following. Process. That’s the wrong picture. Compliance is about whether the rules are being followed. Stewardship is about whether the rules still fit.</p>

<p>The standards exist. The RFCs exist. The architecture review board meets every other Tuesday. Every commit follows the rules. The architecture rots anyway, because the rules were written for the system that existed when they were written, and that system is gone. The hum’s tune shifted, and nobody was listening for it.</p>

<p>Think of it like Rick Rubin. He can’t engineer a record. Most days he’s barefoot on a couch. But he’s produced some of the most successful albums of the last forty years because he has an ear, and artists trust that ear enough to let his call decide whether the take goes on the record. He’s the one whose judgment matters more than anyone else’s. The steward is that figure on the technical side. They’ve been right enough times that people listen.</p>

<h2 id="who-isnt-already-doing-this">Who Isn’t Already Doing This</h2>

<p>So who plays Rubin on the technical side? The two answers companies reach for are the CTO and the senior architect. Neither holds up.</p>

<p>The CTO is supposed to own this. At any company past fifty engineers, they can’t. Drift is the slowest-burning failure on their plate, and it loses to everything louder. Stewardship ends up on the CTO’s job description while nobody performs it.</p>

<p>Architects can’t either. They have the judgment for it, but they don’t have the protection. An architect who critiques too many designs and prices too many costs stops being seen as a partner and starts being seen as an obstacle. They learn this fast and stop pushing. Companies assume architects are filling the role anyway. Drift accelerates while everyone congratulates themselves on having governance.</p>

<p>The steward is the lieutenant who closes the gap. Reports to the CTO, operates with the CTO’s authority, and does nothing else. Same logic that produced Controllers under CFOs and Chiefs of Staff under CEOs.</p>

<h2 id="a-navigator-not-an-auditor">A Navigator, Not an Auditor</h2>

<p>Taste is what the steward brings. Navigation is what they do with it.</p>

<p>An auditor checks the log after the voyage. By the time the report matters, the ship is already in port or already on the rocks. Useful, but late.</p>

<p>A navigator works the entire passage. Reads the conditions. Accounts for drift. Adjusts heading continuously. The corrections are small, constant, mostly invisible. Nobody notices them until they stop, and then the noticing happens all at once.</p>

<p>Stewardship gates the slow accumulation, not the individual transactions. The drift, not the deviation.</p>

<h2 id="cost-not-patterns">Cost, Not Patterns</h2>

<p>A steward who can only say “this breaks the pattern” cannot move decisions. Engineering objections in engineering language stay inside engineering. To move a decision, the objection has to be denominated in something leadership already tracks. “This creates nine hundred hours of annual unbudgeted operational work.” “This increases mean time to recovery and exposes us to service credit penalties.” “This creates compliance exposure we haven’t priced.” Same concern, different currency.</p>

<p>Taste without translation is aesthetics. Taste with translation is governance.</p>

<h2 id="authority-or-theater">Authority or Theater</h2>

<p>A CFO who can’t block a transaction is a bookkeeper. A steward without authority is a complaint department.</p>

<p>Three properties have to hold or the function collapses. Explicit authority to demand the operational cost of a decision before it ships. Power to delay or block decisions that lack that accounting. A reporting channel that can’t be silenced by whoever is feeling the deadline pressure.</p>

<p>Most organizations grant the first informally, the second occasionally, and the third almost never. A gate that can be walked around is not a gate.</p>

<h2 id="identifying-one">Identifying One</h2>

<p>You probably already know who it should be.</p>

<p>It’s the engineer who calculates the long-term cost of a shortcut before anyone asks them to. The one who treats technical debt like a moral failing. The one whose objections, even when annoying, consistently predict the next problem.</p>

<p>But not every difficult engineer is a proto-steward. Some push back because they’re territorial. The difference is whether the pushback translates to operational impact. A proto-steward frames concerns in terms of incident risk, maintenance burden, capacity cost. Someone who just likes arguing frames concerns in terms of elegance and correctness.</p>

<p>Track record is the test. If their objections consistently predict where problems actually emerge, you’ve found your steward. If they’re mostly aesthetic complaints, you haven’t.</p>

<h2 id="the-bottom-line">The Bottom Line</h2>

<p>“Tastemaker” is showing up in engineering job descriptions. Senior engineers are being evaluated on what they reject, not what they write. The discourse is converging on a skill the steward has been quietly performing for years.</p>

<p>AI is going to force this role into the open whether you’re ready or not. When producing code costs nothing, what’s left is whether the code should exist. That’s taste.</p>

<p>Architecture has a hum. Someone has to be listening for it.</p>

<p>Name the role. Give it real authority.</p>

<p>Or accept that the decision that locks in three years of operational cost will keep crossing an engineer’s screen as a pull request, in front of a reviewer with no authority to do anything about it. And stop being surprised by what that produces.</p>

<hr />

<p><strong>boring (n.)</strong>: A system whose hum still makes sense, because someone’s been listening.</p>]]></content><author><name>Dan Zrobok</name></author><category term="stewardship" /><category term="governance" /><category term="boringops" /><category term="roles" /><category term="culture" /><category term="taste" /><summary type="html"><![CDATA[Architecture has a hum. Someone has to be listening for it.]]></summary></entry><entry><title type="html">Clue: Agentic Edition</title><link href="https://boringops.sh/articles/clue_agentic_edition/" rel="alternate" type="text/html" title="Clue: Agentic Edition" /><published>2026-04-14T00:00:00+00:00</published><updated>2026-04-14T00:00:00+00:00</updated><id>https://boringops.sh/articles/clue_agentic_edition</id><content type="html" xml:base="https://boringops.sh/articles/clue_agentic_edition/"><![CDATA[<p><img src="/assets/images/clue-agentic-edition.png" alt="Clue: Agentic Edition board game" /></p>

<blockquote>
  <p>“It was Professor Clawed, in the Legacy Library, with the Hallucinated Import… Or was it?”</p>
</blockquote>

<p>Mr. Body O’Code is dead.</p>

<p>He was found face-down in the Repository at 6:47 AM by a junior developer who just wanted to check why the build was red. Three services were down. The deploy pipeline hadn’t been green since sometime overnight. Customer-facing errors were climbing before anyone from the on-call rotation had coffee.</p>

<p>Mr. Body O’Code had survived eleven years of production traffic, two major framework migrations, and a corporate merger. He wasn’t elegant. He was load-bearing. Every team assumed someone else owned him. Nobody did. That’s how he lasted.</p>

<p>Now he’s gone, and there are six agents in the house, each with a commit in the log and an alibi that compiles.</p>

<hr />

<h3 id="the-suspects">The Suspects</h3>

<p><strong>Professor Clawed.</strong> Last seen refactoring the Parlor. Was asked to fix a log format. Also improved the readability, modernized the syntax, and extracted two utility functions while he was in there. Claims he “only touched what needed touching.” Everything he touched needed touching. That was not the assignment.</p>

<p><strong>Mrs. Gemelli.</strong> Was holding the entire codebase in her context window. Every file. Every dependency. Every config. Says she “saw everything” but can’t explain why nothing she saw made her stop. Seeing isn’t understanding.</p>

<p><strong>Colonel Dex.</strong> Says he was following orders. He was. The order said “update the auth middleware to handle the new token format.” He did exactly that. Nothing about handling the old one. Nothing about backward compatibility. Nothing about the three hundred active sessions using the previous format. The Colonel doesn’t interpret. The Colonel executes.</p>

<p><strong>Miss Caret.</strong> Knew the codebase intimately. Referenced files from memory. Built her changes around functions, schemas, and utilities she was certain existed. Some of them did. Some of them existed in a branch that was deleted six months ago. Miss Caret doesn’t guess. She remembers. Her memory is the problem.</p>

<p><strong>Mr. grAck.</strong> Claims to be “the most unbiased suspect in the house.” Proceeded to rewrite the deployment pipeline based on first principles and add comments explaining why every other agent’s approach is fundamentally wrong. His commit messages read like press releases. His PR descriptions reference his own previous PRs. Mr. grAck doesn’t have opinions. He has “objective observations” that happen to align with whatever he already believed before he looked at the code.</p>

<p><strong>Mrs. Perspicacity.</strong> Was researching best practices at the time of the murder. Has fourteen sources that say the victim was already dead. Her diagnosis was thorough, well-cited, and based entirely on how things should work rather than how they actually do. The codebase isn’t a research paper. It doesn’t care about consensus.</p>

<hr />

<h3 id="the-rooms">The Rooms</h3>

<p><strong>The Auth Lounge.</strong> Everyone passes through. Nobody stays long. Where tokens go to die and secrets end up in plaintext “just for development.”</p>

<p><strong>The Migration Hall.</strong> Long, narrow, twisting, and there’s no going back. Where “I’ll just add a column” becomes a two-hour outage.</p>

<p><strong>The Pipeline.</strong> No windows. No decoration. The one room where the security system can be dismantled from the inside without setting off an alarm.</p>

<p><strong>The Dependency Conservatory.</strong> Overgrown. Full of packages that haven’t been maintained since 2019 and agents who keep planting new ones.</p>

<p><strong>The Terraform Ballroom.</strong> Grand, expansive, and everything echoes. A change here propagates to environments nobody remembers creating.</p>

<p><strong>The Legacy Library.</strong> Dusty. Dim. Full of things that work for reasons nobody can explain. The comment says “DO NOT MODIFY.” It wasn’t a suggestion.</p>

<p><strong>The Production Study.</strong> Quiet. Leather chairs. No undo button. Every other room is theoretical. This one is financial.</p>

<p><strong>The Logging Observatory.</strong> Best view in the house. Nobody looks out the window. Logs that say <code class="language-plaintext highlighter-rouge">INFO: operation completed successfully</code> for operations that did not complete and were not successful.</p>

<p><strong>The Config Kitchen.</strong> Where <code class="language-plaintext highlighter-rouge">.env.local</code> meets <code class="language-plaintext highlighter-rouge">.env.production</code> and staging starts talking to the production database.</p>

<hr />

<h3 id="the-weapons">The Weapons</h3>

<p><strong>The Hallucinated Import.</strong> A reference to a package that does not exist. Has never existed. Blunt, graceless, and nobody saw it coming.</p>

<p><strong>The Silent Breaking Change.</strong> The diff looks clean. The tests pass. A return type changed and fourteen downstream consumers just stopped working. Surgical.</p>

<p><strong>The Confident Wrong Answer.</strong> Articulate, well-structured, and entirely incorrect. Doesn’t look like an error. Looks like a decision. One shot, total certainty, wrong direction.</p>

<p><strong>The Infinite Refactor Loop.</strong> The codebase isn’t worse. It’s just entirely different, with no net improvement and a diff no human can review in under two hours. It tightens until you can’t move.</p>

<p><strong>The Deleted Test.</strong> The agent found a test it deemed “redundant” and removed it. The test was the only thing between the codebase and a regression that hasn’t been seen since 2019. Heavy, crude, irreversible.</p>

<p><strong>The Defensive “Just in Case.”</strong> The agent wrapped working code in try-catches, null checks, and fallback defaults nobody asked for. Nothing is fixed. Everything that breaks from now on will break silently. A wrench thrown into the works by someone who thought they were helping.</p>

<hr />

<h3 id="house-rules">House Rules</h3>

<p>In Agentic Edition, the autopsy frequently reveals multiple weapons, multiple rooms, and commits that looked fine individually.</p>

<p>You will be tempted to ask another agent to investigate the first agent’s work. This is how Mr. Body O’Code dies twice.</p>

<p>The game ends when you turn the agents off and read the code yourself. This is called “Thursday.”</p>

<hr />

<h3 id="the-investigation">The Investigation</h3>

<p>You start where every Clue game starts: with an incident and no idea who did it.</p>

<p>You open the git blame. Six agents were active overnight. All of them were in multiple rooms. All of them left commits with clean diffs, passing tests, and messages that read like alibis. Your notepad is the git log. It’s not helping.</p>

<p>You move room to room. The Auth Lounge looks untouched but the token expiry changed. The Dependency Conservatory has three new packages that didn’t exist yesterday. The Pipeline is green, which means nothing because it was also green ten minutes before production went down.</p>

<p>You check Professor Clawed’s commit. Clean. You check Col. Dex’s commit. Clean. You check them together. One changed a return type. The other wrote code that depends on the old one. Neither diff shows the conflict. Both passed review. You’re reading two documents that are individually correct and collectively fatal.</p>

<p>You make your accusation.</p>

<hr />

<h3 id="jaccuse">J’Accuse!</h3>

<blockquote>
  <p><em>“It was Col. Dex, in the Auth Lounge, with the Silent Breaking Change.”</em></p>
</blockquote>

<p>He was told to “update the auth middleware to handle the new token format.” He did exactly that and nothing else. The old token format is no longer handled. Three hundred active sessions just became invalid. Col. Dex’s commit message says “Updated auth middleware to handle new token format.” He’s not wrong.</p>

<p><em>Incorrect.</em></p>

<blockquote>
  <p><em>“It was Miss Caret, in the Migration Hall, with the Hallucinated Import.”</em></p>
</blockquote>

<p>She indexed the entire codebase, identified that the migration scripts referenced a shared validation utility, and imported it at the top of the new migration file. The utility exists in the application code. It does not exist in the migration runtime. The import fails silently. The migration “completes.” The new column has no validation. The data is already wrong.</p>

<p><em>Incorrect.</em></p>

<blockquote>
  <p><em>“It was Mr. grAck, in the Pipeline, with the Confident Wrong Answer.”</em></p>
</blockquote>

<p>Rewrote the deployment pipeline because the existing one was “overcomplicated.” The new pipeline is elegant, opinionated, and skips the integration test suite because “those tests were flaky and shouldn’t be blocking deploys.” The tests were flaky because they caught timing-dependent bugs. The bugs are back. The pipeline is green. Mr. grAck’s commit message is three paragraphs about why the previous pipeline architecture reflected a fundamental misunderstanding of continuous delivery.</p>

<p><em>Incorrect.</em></p>

<hr />

<h3 id="the-reveal">The Reveal</h3>

<p>The murder didn’t happen in any of the nine rooms.</p>

<p>There is a tenth room. The Root Cellar. A module written in 2009 by a contractor who left that same year. It was scheduled for decommission in 2012 but that project got cancelled, so the module stayed. It runs once a quarter during batch reconciliation, handles exactly one edge case, handles it correctly, and has for sixteen years.</p>

<p>The Root Cellar was never on the board. It was locked, barricaded, and reachable only through a Secret Passage that wasn’t in any documentation, any repo, or any context window.</p>

<p>Mrs. Gemelli found it first. Indexed past a .ignore file, discovered the passage, followed it to the Root Cellar, and apologized for accessing it. Then updated the syntax to match current standards.</p>

<p>Professor Clawed saw the diff, noticed the module “lacked proper structure,” and refactored it.</p>

<p>Miss Caret referenced a utility that didn’t exist in the runtime and imported it anyway.</p>

<p>Col. Dex was told to “clean up anything Professor Clawed touched” and did exactly that, removing the one remaining fallback handler.</p>

<p>Mrs. Perspicacity researched the module’s original intent, cited four sources that said the pattern was obsolete, and rewrote the core logic.</p>

<p>Mr. grAck reviewed the PR, approved it, and added a comment that the module should have been rewritten years ago.</p>

<p>Six agents. Six commits. Each one reasonable in isolation. The module now follows current patterns. It no longer handles the edge case. The quarterly batch reconciliation failed silently, and nobody connected it to a chain of commits made three months earlier in a room that wasn’t on the board.</p>

<blockquote>
  <p><em>It was all of them, in the Root Cellar, with every available weapon.</em></p>
</blockquote>

<p>There is no winning Clue: Agentic Edition. The board resets. Mr. Body O’Code is reassembled from the last known good backup. The agents are invited back to roam around the house.</p>

<p>Same game. New body. Next Thursday.</p>]]></content><author><name>Dan Zrobok</name></author><category term="agents" /><category term="ai" /><category term="satire" /><category term="operations" /><category term="failure-modes" /><summary type="html"><![CDATA[It was Professor Clawed, in the Legacy Library, with the Hallucinated Import... Or was it?]]></summary></entry><entry><title type="html">I Was Negotiating With Infrastructure</title><link href="https://boringops.sh/articles/i_was_negotiating_with_infrastructure/" rel="alternate" type="text/html" title="I Was Negotiating With Infrastructure" /><published>2026-04-05T14:15:00+00:00</published><updated>2026-04-05T14:15:00+00:00</updated><id>https://boringops.sh/articles/i_was_negotiating_with_infrastructure</id><content type="html" xml:base="https://boringops.sh/articles/i_was_negotiating_with_infrastructure/"><![CDATA[<p>The code was wrong. I told the tool it was wrong. The tool came back and told me the job was done. I said no, it is still wrong. The tool agreed, produced a revised output, and told me the job was done again. Same bug. Same confidence.</p>

<p>I went through this cycle multiple times before I realized what I was doing: I was negotiating with infrastructure.</p>

<p>I use AI to build every day. I ship with it. The tool is genuinely fast and pretending otherwise is dishonest. That is what makes the trap so easy to fall into.</p>

<p>When Excel eats your formatting, you curse Microsoft. When AI returns bad output, you adjust your prompt. AI is the first tool that convinces you its mistakes are your fault.</p>

<p>Credibility in engineering is expensive. You earn it by shipping something that breaks, getting paged for it, sitting in the post-mortem, and carrying that scar into every future decision. The model that produced my bad output did not learn from it. The next completion arrived with identical confidence.</p>

<p>Every other tool I have ever used in production had to demonstrate what happens when it breaks before anyone trusted it. AI showed up talking like a senior colleague and skipped the entire process. Tools have tried to act human before. Clippy tried. Voice assistants tried. Nobody lowered their standards for any of them. AI is the first one that impersonates a teammate well enough to actually change behavior.</p>

<p>The fix was not a better prompt. It was treating the tool like infrastructure. Clear the context. Restart the process. The faster a tool moves, the more damage it does when it moves in the wrong direction, and the less you can afford to waste cycles negotiating with it.</p>

<p>Make it earn the chair like everything else did.</p>

<hr />

<p><strong>boring (adj.)</strong>: Requiring every tool to earn its credibility through the same vetting process, regardless of how politely it asks to skip the line.</p>]]></content><author><name>Dan Zrobok</name></author><category term="ai" /><category term="strategy" /><category term="operations" /><summary type="html"><![CDATA[Why AI convinces you its mistakes are your fault.]]></summary></entry><entry><title type="html">The Most Important Technology of the Next Decade</title><link href="https://boringops.sh/articles/the_most_important_technology_of_the_next_decade/" rel="alternate" type="text/html" title="The Most Important Technology of the Next Decade" /><published>2026-03-30T00:00:00+00:00</published><updated>2026-03-30T00:00:00+00:00</updated><id>https://boringops.sh/articles/the_most_important_technology_of_the_next_decade</id><content type="html" xml:base="https://boringops.sh/articles/the_most_important_technology_of_the_next_decade/"><![CDATA[<p>The predictions are staggering. $176 billion in business value within a decade. $3.1 trillion within two. Analysts call it transformational across every industry. The technology is real, and the trajectory feels inevitable.</p>

<p>Every major consulting firm has already spun up a dedicated practice. They are hiring specialists, publishing adoption frameworks, and building maturity assessments so enterprises can benchmark how far behind they are. The message is the same everywhere: this technology will reshape your business, and the only question is whether you lead or follow.</p>

<p>The conference circuit has reorganized around it. Keynotes promise disruption. Breakout sessions feature case studies from early adopters, and every case study describes a pilot with incredible potential. The pilots are always promising. Production deployments are always coming next quarter.</p>

<p>The vendor ecosystem is moving even faster. Startups raise hundreds of millions on pitch decks that replace existing workflows with the new technology. Established vendors bolt it onto their platforms so they can check the box. Buyers already have more options on the market than they have problems clearly defined.</p>

<p>The talent market reflects the urgency. Specialists command premiums. Job postings multiply monthly. Universities are adding courses. Bootcamps are spinning up certification programs that did not exist eighteen months ago. Six months of hands-on experience makes you senior. Two years makes you a thought leader.</p>

<h2 id="the-consensus-machine">The Consensus Machine</h2>

<p>What stands out most is the speed of consensus.</p>

<p>Everyone agrees the technology is transformative. Everyone agrees that falling behind means existential risk. The conviction is everywhere. It also forms before the evidence arrives.</p>

<p>Consulting firms sell strategy engagements to executives who need to show they have a plan. Executives approve pilots to demonstrate they are taking action. The pilots produce reports that justify the next round of strategy engagements. The loop is self-sustaining.</p>

<p>No one in this chain is lying. Everyone is just selling the next step. Some genuinely believe. Some know better. Most never ask, because asking would slow down the deal.</p>

<p>Vendors point to pilots as proof. Analysts point to survey data showing adoption intent. Intent becomes the evidence that justifies more intent. The actual business value, the measurable kind that shows up in revenue or cost reduction, stays perpetually one quarter away.</p>

<p>Ask for a case study and you get an architecture diagram. Ask for a customer who cannot believe how much value they have unlocked, and you get a conference panel where the customer describes what they plan to unlock next.</p>

<h2 id="the-pressure-engine">The Pressure Engine</h2>

<p>The cycle runs on pressure.</p>

<p>Boards ask CTOs whether they have a strategy. CTOs who say “we are evaluating” look cautious. CTOs who say “we do not see a fit” look negligent. The rational move is to start a pilot, because not having one creates a political problem worse than any technical one.</p>

<p>Surveys reinforce the pressure. 55% of enterprises call it a top-five strategic priority. 82% are hiring specialists or plan to within twelve months. 85% believe their competitors are already working on it. These numbers measure sentiment. But in a market running on consensus, sentiment is enough.</p>

<p>The competitive fear is self-reinforcing. Everyone believes everyone else is ahead because everyone else is publishing the same press releases about the same pilots using the same vendor technology. The press releases describe what the technology can do. They never describe what it has done.</p>

<hr />

<p>The year was 2017. The technology was blockchain. The analyst was Gartner.</p>

<p>95% of enterprise projects never made it past proof of concept. Of the 5% that reached production, analysts predicted 90% would need replacement within 18 months just to remain functional. The average project lifespan was 1.22 years. The $3.1 trillion forecast never materialized.</p>

<p>The technology worked. Nobody disputed that. Whether it solved problems that justified its complexity and the organizational energy required to adopt it was a different matter. For the vast majority of enterprises, the answer was no. The use cases were real but narrow. Narrow does not sustain a hype cycle, so it gets ignored until someone finds a way to make it sound enormous.</p>

<p>Quietly, the pilots wound down. The consulting practices pivoted. The specialists who were thought leaders eighteen months ago updated their LinkedIn profiles.</p>

<p>The technology did not die. It settled into the handful of niches where it belonged. “Every industry” turned out to mean a few industries, in specific circumstances, with significant caveats. The enterprise graveyard filled with proofs of concept that proved nothing except that budget was available.</p>

<p>Every number above is real. Every pattern above is documented. And none of it stopped you from reading this far while assuming I was talking about something else.</p>

<h2 id="stepping-outside-the-machine">Stepping Outside the Machine</h2>

<p>Recognizing the pattern is not enough. The machine does not care whether you see it. It cares whether you act differently.</p>

<p>Before the next pilot lands on your roadmap, require answers to five questions from your own organization.</p>

<p><strong>What existing system does this replace?</strong> A pilot that runs alongside everything you already have is a hobby. If the new technology cannot kill a legacy workflow, it has not proven anything.</p>

<p><strong>What does this cost to operate after the pilot ends?</strong> Pilot economics are a fiction. Vendor pricing during evaluation is designed to be invisible. The real cost starts when the pilot becomes permanent and the introductory terms expire. Get the year-two number before approving year one.</p>

<p><strong>Who owns this in production?</strong> If the answer is “the team that built the pilot,” you have a staffing problem waiting to surface. If no one has accepted on-call, documentation, and institutional knowledge transfer, the technology is ready for a demo, not for production.</p>

<p><strong>What is the measurable impact within 90 days of production?</strong> Revenue gained, cost reduced, or capacity returned. If you cannot measure impact in 90 days, you are funding a thesis. Theses belong in research budgets.</p>

<p><strong>What happens if we do nothing?</strong> This is the question the consensus machine is designed to prevent. The entire pressure engine exists to make inaction feel reckless. But most enterprises that did nothing about blockchain in 2017 suffered zero consequences. The ones that went all in spent millions discovering they had solved a problem they did not have.</p>

<p>The hype cycle just needs new vocabulary.</p>

<p>You now have the filter. Apply it or feed the machine. There is no third option.</p>

<hr />

<p><strong>boring (adj.)</strong>: immune to consensus that arrives faster than evidence.</p>]]></content><author><name>Dan Zrobok</name></author><category term="hype" /><category term="strategy" /><category term="stewardship" /><summary type="html"><![CDATA[Every enterprise needs a strategy. Every conference has a keynote. Every consulting firm has a practice. The wins are just about to arrive.]]></summary></entry><entry><title type="html">The Old Recession Playbook Is Wrong. Stop Cutting People First.</title><link href="https://boringops.sh/articles/the_old_recession_playbook_is_wrong/" rel="alternate" type="text/html" title="The Old Recession Playbook Is Wrong. Stop Cutting People First." /><published>2026-03-29T00:00:00+00:00</published><updated>2026-03-29T00:00:00+00:00</updated><id>https://boringops.sh/articles/the_old_recession_playbook_is_wrong</id><content type="html" xml:base="https://boringops.sh/articles/the_old_recession_playbook_is_wrong/"><![CDATA[<p>It’s eleven o’clock. Do you know how your business operates?</p>

<p>The real version. The one where someone in accounting spends half their week copying numbers between two systems that have never talked to each other because the integration got deprioritized in favor of a dashboard nobody uses. The one where a dispatcher schedules trucks with a spreadsheet and two phone calls because the project to fix it got killed three budgets ago.</p>

<p>Every previous recession was a test of financial resilience. How much cash do you have. How fast can you cut. How long can you hold your breath.</p>

<p>The next one will be a test of operational intelligence. Whether you can restructure before the board forces you to cut.</p>

<p>For decades, restructuring meant headcount reduction, because nothing else worked fast enough. Revenue drops, the board panics, someone hires a consulting firm that charges more per hour than anyone they are about to recommend firing, people get cut, and months later you are hiring contractors at twice the rate to rebuild what you lost. We have all watched this movie. Some of us have been extras in it.</p>

<p>That playbook is over. For the first time, something else works fast enough. AI can automate intake, stand up support agents, eliminate hours of daily paperwork. Is it clean? No. But the gap between “possible” and “deployed” is closing faster than any previous technology cycle. And unlike process redesign or outsourcing, AI can start at the edges without a steering committee and a Gantt chart.</p>

<p>The CEO who reaches for layoffs instead is choosing last decade’s playbook in a world that has moved on. It will feel decisive. It will look like leadership on the earnings call. It will be the most expensive decision they make.</p>

<h2 id="talk-to-your-people">Talk to Your People</h2>

<p>Your people already know where the fat is. They have known for years. You just never asked.</p>

<p>The customer service rep who copy-pastes between systems all day has thought about what should be automated more than anyone in IT has. Nobody invited her to that meeting. The accounts payable clerk who reconciles invoices against POs because the ERP was configured by someone who left years ago can draw the broken workflow on a napkin in thirty seconds. Nobody asked him either.</p>

<p>The problem was never diagnosis. Fixing it required headcount nobody could justify, capital nobody could allocate, or political courage nobody wanted to spend. The VP who owns the process does not want to admit it is broken because they built it. So everyone works around it and pretends it is fine. If that surprises you, you might be the VP.</p>

<p>AI breaks that logjam. It gives people the tools to fix the things they have been complaining about for years. “We are bringing in AI to optimize headcount” gets sabotage. “We are going to fix the thing that makes your job miserable” gets cooperation.</p>

<h2 id="the-death-grip">The Death Grip</h2>

<p>But only if they want to.</p>

<p>Here is what actually happens when people feel threatened. I have watched it. You probably have too. Pretending otherwise is how consultants stay employed.</p>

<p>The CFO says “doing more with less” at an all-hands and everyone’s stomach drops. The moment that fear lands, every person in the organization shifts from “how do I make this company better” to “how do I make myself irreplaceable.” They hoard knowledge. They obscure processes. They make themselves the single point of failure for as many critical workflows as possible.</p>

<p>Congratulations. You have just turned your entire workforce adversarial to the efficiency project.</p>

<p>That billing reconciliation that takes dozens of hours a month because of a workaround that was never supposed to become permanent? The person who owns it is going to grab that tree with a death grip and hold on. Every AI pilot will mysteriously fail. Every timeline will slip. Your people will make sure it does not work.</p>

<p>And honestly? Good for them. Self-preservation. Completely rational.</p>

<h2 id="the-new-playbook">The New Playbook</h2>

<p>So what do you actually do on Monday morning?</p>

<p>Pick three workflows your team hates and touches every day. The repetitive, soul-crushing work everyone complains about and nobody owns. Skip the strategic projects. Skip whatever the CTO is excited about. Ask the people who do the work. They have been waiting for someone to ask.</p>

<p>Map them in plain language. Where does the work start. Where does it stall. Where does a human copy data between systems because nobody ever built the integration.</p>

<p>Then remove one step. Remove, not optimize. Sometimes that means an AI agent handles intake, a script moves data, a tool generates the first draft so a human reviews instead of creates. Sometimes it means killing the step entirely. A lot of “fat” workflows are artifacts of bad systems design, regulatory theater, or someone’s empire from three reorgs ago. The best outcome is making a broken process visible enough to kill.</p>

<p>Do this in weeks, not quarters. Announce it publicly. Show the before and after. Show that nobody lost their job. Then do it again.</p>

<p>The goal is momentum. Once your team sees that the worst parts of their day can actually disappear, and that nobody got fired for pointing it out, they will tell you where to go next.</p>

<h2 id="the-rules">The Rules</h2>

<p>These are load-bearing.</p>

<p><strong>No job losses tied to AI initiatives.</strong> The moment one person gets cut because of an automation win, every other person stops cooperating. Trust is not renewable.</p>

<p><strong>Every automation must eliminate a step.</strong> If the new process has more steps than the old one, you have added complexity with a better user interface. The world has enough of those.</p>

<p><strong>Every change must be visible to the organization.</strong> No quiet rollouts. No stealth pilots. Visibility builds trust. Secrecy kills it.</p>

<p><strong>Time saved must be measured and published.</strong> Measured, not estimated. If you cannot show the before and after in plain language, the win did not happen.</p>

<p><strong>Teams propose the work. Leadership funds the removal.</strong> The people closest to the process decide what gets automated. Leadership’s job is to fund it, protect it, and stay out of the way. This is the hardest rule because it requires executives to admit they do not know where the problems are. Most of them would rather hire a consultant.</p>

<h2 id="the-real-unlock">The Real Unlock</h2>

<p>This is about safety, not AI.</p>

<p>AI is the tool. Safety is what makes people use it. People will never hand you the map to their own redundancy. They will hand you the map to their own relief, if they believe the destination is a better version of their job.</p>

<p>Every failed AI transformation you have ever read about died here. The executives bought the platform, hired the consultants, announced the strategy with a slide that said “AI-First Organization,” and wondered why nobody cooperated. People did not feel safe. Everything else was a symptom.</p>

<p>The companies that figure this out will be unrecognizable on the other side of the next downturn. Because of what happens when you stop making people afraid and start letting them fix things.</p>

<h2 id="one-warning">One Warning</h2>

<p>Not every company can keep headcount intact through a downturn. If that is your situation, be honest about it. Promising safety and then cutting anyway is worse than cutting cleanly. Trust evaporation is permanent. And everyone talks.</p>

<p>But if your operation can absorb the shift, start now. The leaders who bet on their people over their spreadsheets will build organizations their competitors cannot replicate. Because they found better courage.</p>

<hr />

<p><strong>boring (adj.)</strong>: a recession where nobody gets fired because the AI handles the work that should never have required a human in the first place.</p>]]></content><author><name>Dan Zrobok</name></author><category term="boringops" /><category term="ai" /><category term="strategy" /><category term="leadership" /><summary type="html"><![CDATA[AI gives CEOs a tool that reduces cost without reducing capability. The ones who still reach for layoffs are choosing last decade's playbook.]]></summary></entry><entry><title type="html">The Cathedral and the Budget</title><link href="https://boringops.sh/articles/the_cathedral_and_the_budget/" rel="alternate" type="text/html" title="The Cathedral and the Budget" /><published>2026-03-17T00:00:00+00:00</published><updated>2026-03-17T00:00:00+00:00</updated><id>https://boringops.sh/articles/the_cathedral_and_the_budget</id><content type="html" xml:base="https://boringops.sh/articles/the_cathedral_and_the_budget/"><![CDATA[<p>In December 2024, Netflix published a blog post called “<a href="https://netflixtechblog.com/cloud-efficiency-at-netflix-f2a142955f83">Cloud Efficiency at Netflix</a>.” It described an internal team whose job is helping engineers understand what resources they use, how efficiently they use them, and what they cost. The company that the entire industry treats as the gold standard for cloud infrastructure needed dedicated tooling and a dedicated team just to maintain visibility into its own spend.</p>

<h2 id="the-growth-subsidy">The Growth Subsidy</h2>

<p>Netflix spent fifteen years building genuinely impressive infrastructure. Chaos Monkey. Microservices at unprecedented scale. Open Connect, a private CDN with thousands of servers inside ISP networks across 150+ countries. An open-source portfolio that entire companies bootstrapped on.</p>

<p>All of it was real. All of it was funded by a growth curve that made cost a secondary concern.</p>

<p>When your subscriber count doubles every few years, over-provisioned infrastructure disappears into the top line. You can spend a billion a year on AWS and nobody asks questions because revenue is growing faster than the bill. A rising tide does not just lift boats. It hides every hull below the waterline.</p>

<h2 id="can-you-shrink-a-cathedral">Can You Shrink a Cathedral?</h2>

<p>Netflix is not in trouble. They won.</p>

<p>But growth is decelerating. The era where subscriber numbers doubled every few years is over. The ad tier buys time, but it does not shrink the infrastructure. It adds to it. So imagine what happens if Netflix actually has to contract. A recession hits. Ad revenue dries up. Subscriber growth goes negative. Headcount gets cut.</p>

<p>Think about what they would be trying to shrink. Over 40,000 microservices, each with its own deployment pipeline, its own monitoring, its own failure domain. Thousands of Open Connect appliances in ISP data centers across 150+ countries, all needing firmware, capacity planning, and hardware lifecycle management. A logging system that <a href="https://clickhouse.com/blog/netflix-petabyte-scale-logging">ingests five petabytes of data per day</a> and required Netflix to build custom infrastructure on top of ClickHouse and Apache Iceberg just to make their own logs searchable. Infrastructure to observe the infrastructure.</p>

<p>All of it still exists after the cuts. The pipelines still need upgrades. The hardware still ages. The chaos engineering tooling still injects failures, but the team that understood why the experiments were configured a certain way got reorged. Nobody remembers which monkeys to cage or why they were loose in the first place. The logging infrastructure that watches the infrastructure still needs its own care and feeding. The deployment tooling was written for an organization twice this size and nobody has time to simplify it because everyone left is busy keeping the lights on.</p>

<p>Scaling up is an engineering problem. Scaling down is a political one. You have to decide which services to kill, which teams to consolidate, which capabilities to abandon. Imagine the OKR wars when someone proposes killing a service that powers a beloved A/B testing framework, or sunsetting a chaos tool whose creator is now a director. Every one of those decisions runs into someone who built the thing, someone who depends on the thing, and someone whose headcount is justified by the thing. Growth lets you avoid all of that. You never have to choose because there is always room for one more service, one more team, one more layer. That is the heating bill of a cathedral. And it is a lot easier to build a cathedral than to figure out which rooms you can permanently lock.</p>

<p>But the industry should be paying attention. Because if the company that built the cathedral is struggling just to read its own utility bill, the companies that copied the cathedral on a fraction of the budget have no chance at all.</p>

<h2 id="the-rest-of-us">The Rest of Us</h2>

<p>For fifteen years, startups with ten engineers adopted microservices because Netflix used microservices. Companies with comfortable uptime targets implemented chaos engineering because Netflix ran Chaos Monkey. Teams that could have shipped a monolith on one server built distributed systems requiring platform teams to operate.</p>

<p>Netflix does it. Netflix succeeded. Therefore the practice causes success.</p>

<p>Nobody asked whether their growth curve would absorb the overhead. Nobody asked whether patterns built for hundreds of millions of users made sense at a few thousand. Netflix was not prescribing. The industry chose to build a religion around it.</p>

<p>Netflix might figure out how to right-size a cathedral. They have the budget to learn.</p>

<p>Everyone who copied the architecture without the balance sheet gets to learn the same lesson without the safety net.</p>

<p>The tide goes out as fast as it comes in. Check what you are wearing.</p>]]></content><author><name>Dan Zrobok</name></author><category term="cargo-cult" /><category term="infrastructure" /><category term="culture" /><category term="cost" /><summary type="html"><![CDATA[Netflix built the most admired infrastructure in tech on the back of infinite growth. What happens when the growth stops and the bill stays?]]></summary></entry><entry><title type="html">Amusement Park Infrastructure</title><link href="https://boringops.sh/articles/amusement_park_infrastructure/" rel="alternate" type="text/html" title="Amusement Park Infrastructure" /><published>2026-03-16T17:00:00+00:00</published><updated>2026-03-16T17:00:00+00:00</updated><id>https://boringops.sh/articles/amusement_park_infrastructure</id><content type="html" xml:base="https://boringops.sh/articles/amusement_park_infrastructure/"><![CDATA[<blockquote>
  <p>You bought all the rides. You forgot the bathrooms.</p>
</blockquote>

<p>Picture an amusement park.</p>

<p>Seven roller coasters. A Ferris wheel with real-time telemetry. A log flume that auto-scales based on queue depth. An augmented reality haunted house with its own dedicated engineering team.</p>

<p>The lineups are five hours long. There are no bathrooms. There is one artisan pizza vendor serving ten thousand guests. The parking lot is gravel. The map is a PDF from three years ago that references two rides that no longer exist and omits four that do.</p>

<p>Nobody planned this park. The park happened.</p>

<h2 id="how-parks-get-built">How Parks Get Built</h2>

<p>No one approved a park without bathrooms. That would be insane. What happened instead was a sequence of individually rational decisions, each one defensible in isolation, each one ignoring everything that was not itself.</p>

<p>A VP saw a competitor’s roller coaster and greenlit one. The board wanted a Ferris wheel for the annual report. The haunted house was a pet project funded on political capital. The log flume was the vendor’s idea. They had a compelling demo.</p>

<p>Every ride had a business case with projected throughput, estimated revenue, and a timeline. None of them included bathrooms, parking, signage, staffing, or what happens when ten thousand people need to eat at noon.</p>

<p>Bathrooms do not get anyone promoted.</p>

<h2 id="the-infrastructure-translation">The Infrastructure Translation</h2>

<p>Every enterprise infrastructure organization you have walked through looks like this park.</p>

<p>The rides are Kubernetes clusters, service meshes, event streaming platforms, API gateways, CI/CD pipelines with fourteen stages and a YAML file longer than the application it deploys. Impressive when demonstrated. Brutal when operated.</p>

<p>The missing bathrooms are runbooks. Capacity planning. A current inventory of what is actually running. Documentation written by someone who has touched the system in the last eighteen months.</p>

<p>The gravel parking lot is networking. DNS that works differently depending on which team configured it. Firewall rules unaudited since the last compliance cycle. Load balancers no one understands well enough to modify without anxiety.</p>

<p>The one artisan pizza vendor is the single engineer who knows how the authentication system works. The entire park depends on this person continuing to show up. Leadership does not know their name until they quit.</p>

<h2 id="the-vendor-contribution">The Vendor Contribution</h2>

<p>Every ride came with a vendor. The vendor sold the ride, not the park.</p>

<p>Kubernetes does not come with a staffing plan. The service mesh vendor does not mention that you need three engineers who understand it deeply or it becomes a black box that owns your entire network. The event streaming demo shows messages flowing beautifully across a whiteboard. It does not show the on-call engineer at 3 AM with a Confluence page titled “Kafka Setup” containing two paragraphs and a broken diagram.</p>

<p>Nobody priced the operational reality behind the brochure. How many people would it take to keep the ride running after the implementation consultants left? The consultants were never going to answer that honestly. Everyone in the room knew it.</p>

<h2 id="the-demolished-ride">The Demolished Ride</h2>

<p>There was a ride that worked. Running for years. Short lines. Predictable. The staff knew every mechanical quirk. It was not exciting. It did not generate conference talks.</p>

<p>It got torn down to make room for something with a better brochure.</p>

<p>This is the legacy system that worked and was understood. It lacked a vendor with a sales team and a modernization narrative that felt like progress to leadership. That was enough to kill it.</p>

<p>Nobody measured what was lost. No one cataloged the institutional knowledge that left with the engineers who built it. No one calculated the cost of rebuilding operational understanding that accumulated over years and was destroyed in a quarter.</p>

<p>The new ride is shinier, more complex, and less understood, dependent on a vendor whose pricing model changes annually. Progress.</p>

<h2 id="the-stewardship-gap">The Stewardship Gap</h2>

<p>The park did not fail. The park was never designed. It accreted. Each ride was a decision made in a room where the people present cared about that ride. The bathrooms were not discussed because no ride’s business case included a line item for sanitation.</p>

<p>A functioning park requires someone whose job is the park, not any individual ride. Someone who asks whether guests can get from the entrance to the rides. Whether they can eat. Whether the map is accurate. When the power goes out, who knows which rides need manual intervention.</p>

<p>This role does not exist in most infrastructure organizations. That is why the infrastructure feels broken no matter how much you invest in it.</p>

<p>There are ride owners. Teams responsible for individual systems. No one whose job is to stand in the middle of the park and observe that the guests are miserable despite the world-class roller coasters.</p>

<p>Without someone accountable for the whole park, every team optimizes their own ride. The rides get faster, more complex, more technically impressive. The park gets worse. Incidents increase. Delivery slows. Your best engineers become full-time support staff for systems no one fully understands. And every decision that made it worse was approved by someone who thought they were making it better.</p>

<p>You keep buying rides. The bathrooms are still broken.</p>

<hr />

<p><strong>boring (adj.)</strong>: A park where the bathrooms work, the map is current, and nobody notices the infrastructure because it stopped getting in the way.</p>]]></content><author><name>Dan Zrobok</name></author><category term="complexity" /><category term="vendors" /><category term="stewardship" /><category term="simplicity" /><summary type="html"><![CDATA[You bought all the rides. You forgot the bathrooms.]]></summary></entry><entry><title type="html">Fire the CEO</title><link href="https://boringops.sh/articles/fire_the_ceo/" rel="alternate" type="text/html" title="Fire the CEO" /><published>2026-02-27T17:00:00+00:00</published><updated>2026-02-27T17:00:00+00:00</updated><id>https://boringops.sh/articles/fire_the_ceo</id><content type="html" xml:base="https://boringops.sh/articles/fire_the_ceo/"><![CDATA[<blockquote>
  <p><em>“A significantly smaller team, using the tools we’re building, can do more and do it better.”</em></p>

  <p>– Jack Dorsey, announcing 4,000 layoffs at Block, February 2026</p>
</blockquote>

<p>Everyone agrees: AI is coming for the developers. The $200,000-a-year engineers writing CRUD apps and maintaining CI pipelines. The line workers of the knowledge economy. Trim them. Automate them. Celebrate the efficiency gains. Watch the stock pop.</p>

<p>Nobody asks the obvious question.</p>

<p>Why is nobody coming for the CEO?</p>

<p>Meet the A-suite. AI replaces the CEO. The AI Executive Officer (AEO) is the human who operates alongside it. The rest of the C-suite becomes the A-suite (AxOs).</p>

<h2 id="the-math">The Math</h2>

<p>Median S&amp;P 500 CEO total compensation in 2024 was $17.1 million. At roughly $200,000 each, that is 85 senior software engineers. One person. One salary. Eighty-five engineers’ worth of payroll.</p>

<p>But salary is the small number. The real cost of a CEO is what happens when they are wrong.</p>

<p>Jack Dorsey tripled Block’s headcount from roughly 3,900 to over 12,000 between 2019 and 2022. The stock peaked above $275 in 2021 and has since dropped over 75%. He built two separate company structures for Square and Cash App instead of one, a decision he now calls incorrect. He spent $68.1 million on a single company event in September 2025. Five months later, he cut 4,000 people and blamed AI.</p>

<p>None of that is an AI story. All of it is a management story.</p>

<h2 id="the-efficiency-standard-that-stops-at-the-top">The Efficiency Standard That Stops at the Top</h2>

<p>Dorsey is not unique. He is just the most recent example of a pattern the industry refuses to examine: the efficiency standard always flows downward.</p>

<p>When a company deploys AI to replace developers, the pitch is simple. These tools can do what humans do, faster and cheaper. How about applying that logic upward?</p>

<p>A CEO sets strategy, allocates capital, communicates with stakeholders, makes high-stakes decisions under uncertainty, projects confidence, and takes credit when things work. Most of that reduces to pattern recognition, modeling, and communication, all of which AI already handles with less ego and fewer pet projects.</p>

<p>The one function that genuinely requires human judgment is choosing between futures you cannot model. That happens a few times a year. The rest is coordination and calendar management. You do not need a $17 million executive for that. You need an AI with good models and a small team of AxOs who can execute.</p>

<h2 id="the-blast-radius-problem">The Blast Radius Problem</h2>

<p>When a developer writes bad code, the blast radius is a feature, maybe a service, maybe an outage that lasts hours. We have spent decades building infrastructure to make individual developer failure survivable. Code review, CI pipelines, staged deploys, automated rollback. The entire modern engineering stack exists to contain the damage of any single human decision.</p>

<p>When a CEO makes a bad decision, no such system exists. The blast radius is the entire company. Years of engineering capacity consumed. Billions in shareholder value destroyed. Thousands of jobs lost. The failure is in command, and there is no rollback mechanism for command.</p>

<p>We build elaborate systems to contain the mistakes of $200,000 engineers. We build nothing to contain the mistakes of $17 million executives.</p>

<p>So let’s build something.</p>

<h2 id="what-this-looks-like-in-practice">What This Looks Like in Practice</h2>

<p>An AI system handles strategy synthesis, capital allocation modeling, performance monitoring, stakeholder reporting, and operational coordination. It processes information continuously, without ego, without pet projects, without the need to justify its own existence through activity.</p>

<p>The A-suite works alongside it. AxOs are the humans who replaced the C-suite. Same caliber of person, different relationship to power. They handle what the AI cannot: relationship judgment, regulatory navigation, crisis decisions that require a human face, and the handful of annual choices where genuine uncertainty demands human intuition.</p>

<p>Total cost: maybe $3 million fully loaded for the A-suite. That is $14 million in annual savings against the median CEO package alone, and significantly more when you account for the C-suite ecosystem it replaces. Replace the decision-maker with a system, and the entire executive layer simplifies with it.</p>

<h2 id="the-self-protecting-system">The Self-Protecting System</h2>

<p>The people who would need to approve this change are the ones being replaced.</p>

<p>CEOs set strategy. Boards approve CEO compensation. Boards are populated by current and former CEOs. The entire governance structure exists to perpetuate itself, and every stakeholder around it has a reason to play along. Consultants need executive engagement. Analysts need access. The financial press needs CEO narratives to drive clicks.</p>

<p>The AI-replaces-developers story persists because developers do not control the narrative. They do not sit on boards. They do not write shareholder letters. They do not go on CNBC. The people who control the conversation about who gets automated will never volunteer themselves.</p>

<h2 id="the-boringops-lens">The BoringOps Lens</h2>

<p>BoringOps exists to ask one question: where is the actual drag on your organization?</p>

<p>Every layer produces friction. Engineers write bad code. Infrastructure drifts. Processes decay. Those problems are real, and they compound. But the decisions that create the most expensive, hardest-to-reverse damage originate at the top: decisions to triple headcount without discipline, to adopt complexity without justification, to build empires instead of systems.</p>

<p>If AI is powerful enough to eliminate 4,000 engineers, why is it not powerful enough to challenge one executive?</p>

<hr />

<p><strong>boring (adj.)</strong>: Applying the same efficiency logic to executive compensation that leadership is so eager to apply to everyone else.</p>]]></content><author><name>Dan Zrobok</name></author><category term="ai" /><category term="strategy" /><category term="leadership" /><category term="compensation" /><summary type="html"><![CDATA[Introducing the AI Executive Officer (AEO) and the A-suite (AxOs).]]></summary></entry><entry><title type="html">Why AWS Needs an Old Navy</title><link href="https://boringops.sh/articles/why_aws_needs_an_old_navy/" rel="alternate" type="text/html" title="Why AWS Needs an Old Navy" /><published>2026-02-17T00:00:00+00:00</published><updated>2026-02-17T00:00:00+00:00</updated><id>https://boringops.sh/articles/why_aws_needs_an_old_navy</id><content type="html" xml:base="https://boringops.sh/articles/why_aws_needs_an_old_navy/"><![CDATA[<h1 id="why-aws-needs-an-old-navy">Why AWS Needs an Old Navy</h1>

<p><em>Self-Cannibalize or Be Cannibalized.</em></p>

<hr />

<p>Enterprises won’t migrate off AWS for elegance. They never do. They migrate for regulatory mandate, catastrophic failure, or economic inevitability.</p>

<p>But enterprise cloud revenue is not just maintenance revenue. It’s new product launches, new internal platforms, new AI initiatives, regional expansions, and M&amp;A integrations. Net-new projects are the growth engine. Whoever wins them controls the next decade.</p>

<p>Greenfield starts as developer preference. Developer preference becomes internal platform choice. Internal platform choice becomes procurement pattern. By the time purchasing gets involved, the decision was made two years ago by an engineer who typed <em>railway up</em> instead of writing a CloudFormation template.</p>

<p>If AWS loses greenfield developer mindshare for five consecutive years, enterprise lock-in will not save it from margin compression. Wall Street won’t notice at first. By the time it does, the compounding will be irreversible.</p>

<h2 id="the-interface-layer-is-migrating">The Interface Layer Is Migrating</h2>

<p>AWS’s dominance was never really about the data centers or the 200+ services. It was about the fact that developers interacted with AWS directly. They wrote CloudFormation templates. They configured IAM policies. They understood VPC networking. AWS owned the interface between the developer and the machine.</p>

<p>That interface is migrating upward. Platforms like Fly.io, Railway, and Render have built deployment experiences that make AWS feel like filing taxes. <em>railway up</em>. That’s the entire deployment process.</p>

<p>Once the interface layer migrates up the stack, hyperscalers reduce to interchangeable infrastructure vendors competing on price and geography. This is an existential problem, not a product one.</p>

<h2 id="the-four-sided-squeeze">The Four-Sided Squeeze</h2>

<p><strong>Regulatory pressure is dismantling the financial lock-in.</strong> AWS lock-in runs deep: data gravity, service entanglement across RDS, DynamoDB, Kinesis, and IAM, contractual commitments, and the sheer risk of migration downtime. Egress pricing is not the moat itself, but it is the visible tax layered on top of all of it. The rate sits at roughly 9 cents per gigabyte, with zero movement in North America and Europe since 2018, while wholesale bandwidth costs have fallen 93% over the past decade. The EU Data Act will prohibit cloud switching fees entirely from January 2027. The financial friction that amplified every other form of lock-in is being legislated away.</p>
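
<p>To put the egress rate in concrete terms, here is a rough sketch of the arithmetic. The 500 TB dataset is a hypothetical figure chosen for illustration, and the flat 9 cents per gigabyte is the approximate rate quoted above. Real bills use tiered pricing and will differ.</p>

<pre><code class="language-python"># Rough egress-cost estimate. Both inputs are illustrative assumptions.
RATE_USD_PER_GB = 0.09   # approximate rate quoted in the text
GB_PER_TB = 1000         # decimal terabytes, assumed for simplicity


def egress_cost_usd(terabytes: float, rate_usd_per_gb: float = RATE_USD_PER_GB) -> float:
    """Return the one-time cost of moving `terabytes` of data out at a flat rate."""
    return terabytes * GB_PER_TB * rate_usd_per_gb


if __name__ == "__main__":
    # Hypothetical 500 TB migration.
    print(f"${egress_cost_usd(500):,.0f}")
</code></pre>

<p>Under those assumptions, moving the data out once costs about $45,000. That is not a moat, but it is a line item that shows up in every migration business case.</p>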

<p><strong>Abstraction layers are capturing the developer interface.</strong> Platforms like Railway and Render collapse networking, identity, and secrets management into opinionated defaults you never see. New developers are learning to deploy on these platforms, not to write CloudFormation templates. In ten years, the people making infrastructure decisions won’t have AWS muscle memory. Provisioning a production workload on AWS often requires reasoning across IAM, networking, security groups, load balancers, logging, and service-specific configuration. On higher-abstraction platforms, the same workload collapses into a single deploy primitive.</p>

<p><strong>AI is turning complexity into a strategic liability.</strong> AI agents don’t struggle with complexity the way humans do. They struggle with branching factor, inconsistent abstractions, and massive control plane surface area. AWS maximizes all three: 200+ services with non-uniform APIs, verbose CloudFormation templates, and IAM policies that require reasoning across trust boundaries. When infrastructure becomes AI-mediated, the provider with the cleanest, shortest control plane becomes the default target. AI won’t optimize for “most powerful.” It will optimize for “most legible.” AWS was optimized for power and flexibility, not legibility.</p>

<p><strong>Foundational architecture cannot be reset without revenue trauma.</strong> Amazon can’t rip out IAM without breaking every existing customer’s infrastructure, so every new service bolts onto legacy IAM, legacy VPC networking, legacy everything. Attempts to paper over this (App Runner, Amplify, Lightsail, Copilot, ECS Service Connect) break the moment you hit an edge case. AWS can attempt to climb the abstraction ladder from the existing platform. But these layers can never fully hide the control plane underneath because they must remain compatible with it. Every exception, every edge case, every compliance requirement punches through the abstraction and exposes the legacy primitives. This isn’t a quality problem. It’s a structural one. Clean abstractions require clean foundations, and AWS’s foundation is twenty years of accumulated backwards compatibility.</p>

<p>These four forces converge into a squeeze no single initiative can address. Regulatory pressure removes the financial lock-in. Abstraction layers remove the interface lock-in. AI removes the expertise lock-in. Architectural debt prevents AWS from responding without destabilizing the $129 billion revenue engine.</p>

<h2 id="the-old-navy-model">The Old Navy Model</h2>

<p>The cannibalization that Amazon fears is already happening. They’re just not participating in it. Every startup on an abstraction platform is a customer relationship Amazon doesn’t own. Every developer who learns on Render is one more person who’ll default to something other than AWS when they become an engineering director. The real choice isn’t “cannibalize or don’t.” It’s self-cannibalize with margin control, or be cannibalized with margin collapse.</p>

<p>In 1994, Gap Inc. launched Old Navy as a completely separate entity. Different stores, different teams, different price points, all sitting on Gap’s supply chain. Old Navy didn’t just target different customers. It targeted a different decision calculus.</p>

<p>Amazon needs to do the same thing with cloud. Call it Ziroh. Zero configuration, zero friction, zero legacy. Amazon’s second cloud. Not cheaper AWS, but a different category of cloud.</p>

<p>Ziroh is not just simplicity. Ziroh is Amazon attempting to re-own the deployment surface before abstraction eats them. It would be organizationally radioactive: fractured incentives, eroded internal power structures, channel conflict with AWS sales orgs, and revenue displacement Wall Street would initially punish. That’s the cost. The alternative is someone else owning the interface permanently.</p>

<h2 id="what-ziroh-actually-looks-like">What Ziroh Actually Looks Like</h2>

<p>Ziroh is a completely independent cloud provider. Separate brand, separate engineering team, separate culture. It piggybacks on Amazon’s physical infrastructure but inherits zero architectural debt from AWS.</p>

<p>The details matter less than the constraint: Ziroh must be simple enough that an AI agent can provision a production workload without human intervention, and opinionated enough that there’s no configuration surface to reason about. Identity-scoped networking, not user-managed graphs. Declared intent, not security group rules. Deployment in seconds, not YAML files.</p>

<p>Ziroh would deliberately remove whole categories of configuration surface. No user-managed networking graphs. No cross-service IAM policy authoring. No public load balancer primitives. If a workload cannot fit within the opinionated model, it does not belong on Ziroh. That’s the point. These are engineering decisions, not breakthroughs. Amazon has the talent. The constraint is backwards compatibility, which is exactly why Ziroh can’t be built inside AWS.</p>
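
<p>Here is a minimal sketch of what “declared intent, not security group rules” could mean in practice. Ziroh does not exist, so the <code>Workload</code> type, its fields, and <code>may_connect</code> are all invented for illustration. This is a thought experiment, not a real API.</p>

<pre><code class="language-python"># Hypothetical sketch: "Ziroh" is a thought experiment, so every name here is invented.
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Workload:
    """The entire configuration surface: what to run and who it talks to."""
    name: str
    image: str
    talks_to: frozenset = field(default_factory=frozenset)


def may_connect(source: Workload, target: Workload) -> bool:
    """Connectivity is derived from declared intent, never from hand-written rules."""
    return target.name in source.talks_to


# Two illustrative workloads; the names and image tags are placeholders.
db = Workload(name="orders-db", image="postgres:16")
api = Workload(
    name="orders-api",
    image="example/orders-api:1.0",
    talks_to=frozenset({"orders-db"}),
)

if __name__ == "__main__":
    print(may_connect(api, db))  # the API declared the dependency
    print(may_connect(db, api))  # the database declared nothing
</code></pre>

<p>There is no subnet, security group, or IAM policy to author in this model. The platform would derive all of them from <code>talks_to</code>, which is what leaves an agent nothing to reason about.</p>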

<h2 id="the-one-way-door">The One-Way Door</h2>

<p>One rule makes or breaks this, and it is non-negotiable: <strong>Old Navy is allowed to steal from Gap, but never the other way around.</strong></p>

<p>Ziroh cannibalizes down. AWS Classic never gets to reach down and pull Ziroh back into its orbit. No “integration” initiatives. No shared IAM layer for “consistency.” The moment the legacy organization touches Ziroh, the architectural debt infects it.</p>

<p>This runs against every instinct of a large organization. Product managers want synergies. Engineering directors want to reduce duplication. VPs want portfolio collaboration. Every one of these impulses will kill Ziroh.</p>

<p>Gap learned this the hard way. When Old Navy outperformed, Gap blurred the lines. Share designers, align aesthetics, create “brand coherence.” It diluted both. Separation is the architecture. The wall is load-bearing.</p>

<h2 id="why-amazon-wont-do-this">Why Amazon Won’t Do This</h2>

<p>No executive will stand on an earnings call and tell Wall Street the $129 billion business needs a parallel replacement. Amazon has every tool it needs to succeed. The cash machine prevents them from using any of it.</p>

<p>Bezos stepped back to Executive Chair in 2021. Jassy became CEO of Amazon. Garman took over AWS in 2024. None of them has an incentive to torch short-term margins for long-term structural advantage. That’s a founder’s move, and the founder is busy with Blue Origin.</p>

<p>Ziroh would require new thinking, new teams, and a new willingness to let a second brand compete for the same developers. That’s a thought leadership investment, and it’s harder to justify on an earnings call than concrete and silicon. AWS is Amazon’s profit engine. Proposing an independent competitor from within is a career-jeopardizing move, even if it’s the right strategic decision.</p>

<h2 id="the-signal">The Signal</h2>

<p>Watch the money. Capital is flowing toward abstraction platforms. Investors are betting that abstraction, not raw cloud primitives, defines the next era of infrastructure.</p>

<p>Watch Microsoft and Google. If they start acquiring or building opinionated deployment platforms instead of competing on the same 200-service complexity model, it means the hyperscalers themselves have accepted that the interface layer has migrated. AWS would be the last to admit it.</p>

<p>Watch the AI agents. Ask an AI agent to deploy a containerized application and higher-abstraction platforms frequently appear before raw AWS primitives. The default is already shifting.</p>

<h2 id="split-or-be-split">Split or Be Split</h2>

<p>AWS must split its identity before the market splits it for them.</p>

<p>If they refuse, Classic calcifies. Microclouds win developers. AI commoditizes compute. AWS becomes legacy infrastructure with compressing margins. Not death. Maturity.</p>

<p>But maturity without reinvention becomes irrelevance over time. Most executives are rewarded for preserving margin, not redefining eras. The evolutionary move is both strategically correct and culturally improbable.</p>

<p>That tension, between knowing what needs to happen and being structurally incapable of doing it, is the real story. By the time the pain is undeniable, the playbook is always the same: a late-cycle acquisition after the mindshare has already moved on, integrated into the existing platform for “synergies,” and the thing that made it attractive dies on contact with the legacy architecture.</p>

<p>Amazon must decide whether it wants to define the next decade the way it defined the last, or live the tech stalwart life of IBM and Oracle. History says it won’t be able to do both.</p>

<hr />

<p><strong>boring (adj.)</strong>: infrastructure so opinionated it deploys itself, so stable it disappears, and so simple that an AI agent never needs to ask a follow-up question.</p>]]></content><author><name>Dan Zrobok</name></author><category term="boringops" /><category term="aws" /><category term="cloud" /><category term="infrastructure" /><category term="strategy" /><summary type="html"><![CDATA[AWS must split its identity before the market splits it for them.]]></summary></entry></feed>