The Model Is Replaceable. The Operating System Around It Is the Value.
Anthropic blocked third-party harnesses on April 4. We migrated seven production agents to a new model in one working day. Here is why that was possible.
By Dimitris Kontaxis
On April 4, 2026, Anthropic flipped a switch. From that day on, Claude Pro and Claude Max subscriptions could no longer drive third-party AI agent frameworks. The policy had been on their terms of service since February, but enforcement landed on April 4, and it landed hard.
If you were running an AI agent business on top of a Claude subscription, that was the week you found out whether you had built a product or a stack of config files.
We migrated every production agent to a different model provider in one working day. Zero client disruption. The agents kept answering messages on the same channels, with the same personalities, the same memories, the same protocols. The model underneath was different. Nothing else was.
This article is about why that was possible, and why it is the single most important thing we can tell you about what we do.
What actually happened on April 4
Start with the facts, because this story does not need any embellishment.
Anthropic's own explanation is straightforward. Subscription plans were never priced for the compute load that third-party agent frameworks generate. Industry reporting put the gap at five times or more: a 200 dollar per month Max subscription was routinely being used to run agent workloads that would cost 1000 to 5000 dollars per month at API rates. Anthropic tried enforcement in January, reversed after community pushback, updated their terms of service on February 20, and enforced on April 4. The restriction applies to any third-party harness, specifically including OpenClaw, the open-source gateway that this entire industry is building on.
The Hacker News thread that followed hit several hundred points. The OpenClaw creator, Peter Steinberger, went on record calling it a deliberate squeeze. Every consultancy with an agent running on a Claude Max login had the same Monday morning: figure out how to keep the agent talking to customers while the underlying model provider decides what you are allowed to do.
There were three honest choices. First, pay Claude API rates directly, which for a heavy agent workload is multiple thousands of euros per month per client. Second, migrate to a different model, which requires the harness to actually support that. Third, stop running. The third option was real. Several threads on the developer forums that week were from consultancies telling their clients they needed a week to figure it out.
Why we were a one-day job
We had seven agents in production on the week of April 4. We were on Claude through a subscription, the same as everyone else. We migrated all seven to a different model provider on April 5, in a single working day. The migration was a configuration change. We did not rewrite any agent personality. We did not re-engineer any memory architecture. We did not touch any client workflow. We did not lose any message on any channel. We did not ask any client for patience.
This was not a heroic effort. It was the payoff of a design decision we had made months earlier.
The design decision was: the model is behind a narrow interface. The interface is inside the harness. The harness is the thing we maintain. We do not maintain "a Claude deployment." We maintain a harness, and the harness can point at Claude or it can point at something else. When Anthropic changed the rules, we changed the configuration line that pointed at the provider. Everything above that line, which is to say the entire operator layer, kept running.
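That boundary can be sketched in a few lines of Python. This is an illustrative sketch only, not the consultancy's actual code; every class, method, and provider name here is hypothetical. The point it demonstrates is the one in the text: the operator layer talks to a narrow interface, and the migration is one configuration line.

```python
from typing import Protocol


class ModelProvider(Protocol):
    """The narrow interface: the only surface the operator layer ever touches."""
    def complete(self, system_prompt: str, messages: list[dict]) -> str: ...


class ClaudeProvider:
    def complete(self, system_prompt: str, messages: list[dict]) -> str:
        # Would call the Anthropic API here; stubbed for the sketch.
        return "claude-reply"


class OtherProvider:
    def complete(self, system_prompt: str, messages: list[dict]) -> str:
        # Would call a different model's API here; stubbed for the sketch.
        return "other-reply"


# The single configuration line the migration changed.
PROVIDERS: dict[str, ModelProvider] = {
    "claude": ClaudeProvider(),
    "other": OtherProvider(),
}
ACTIVE_PROVIDER = "other"  # was "claude" before the policy change


def run_agent_turn(user_message: str) -> str:
    """Everything above this call -- personality, memory, tools -- is unchanged."""
    provider = PROVIDERS[ACTIVE_PROVIDER]
    return provider.complete(
        system_prompt="You are the client's operations agent.",
        messages=[{"role": "user", "content": user_message}],
    )
```

Swapping providers means editing `ACTIVE_PROVIDER` and nothing else; the operator layer never imports a vendor SDK directly.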
That is what harness engineering is. It is the engineering discipline that keeps the operator layer stable while the model layer churns, because the model layer will always churn, and your business does not want to churn with it.
What "harness" actually means
LangChain's engineering team is the clearest public voice on this. In early 2026 they published a series on their blog walking through how they iteratively improved their coding agent by only changing the harness. The result, reported publicly: their agent went from Top 30 to Top 5 on Terminal Bench 2.0, an improvement of 13.7 points, without touching the underlying model at all.
The canonical definition, from their piece on the anatomy of an agent harness, is this:
"The goal of a harness is to mold the inherently spiky intelligence of a model for tasks we care about. Harness Engineering is about systems, you're building tooling around the model to optimize goals like task performance, token efficiency, latency, etc."
Read that as an operator, not as a researcher. A raw model is smart in spikes. It is very good at some tasks, less good at others, surprisingly confused by a few, and occasionally brilliant. The job of the harness is to shape that spiky intelligence into a reliable daily worker. You do that with system prompts, tool choices, execution flow, memory, verification hooks, scheduled actions, and context that gets assembled fresh for every session.
None of that is the model. All of that is the thing a business is actually paying for.
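One concrete example of "context assembled fresh for every session": at session start, the harness can rebuild the agent's context from files it owns. A minimal sketch, with all file names hypothetical (the source does not publish its layout):

```python
from datetime import date
from pathlib import Path


def assemble_session_context(workspace: Path) -> str:
    """Illustrative only: rebuild the agent's context at session start
    from files that live in the harness, not in the model."""
    parts = []
    # Hypothetical file names for the personality and protocol layers.
    for name in ("persona.md", "protocols.md"):
        f = workspace / name
        if f.exists():
            parts.append(f.read_text())
    # Today's memory note, if the agent has written one.
    daily_note = workspace / "memory" / f"{date.today()}.md"
    if daily_note.exists():
        parts.append(daily_note.read_text())
    return "\n\n".join(parts)
```

Because the context is reassembled from local files each session, none of it depends on which model consumes it.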
For a Full-Time Agent running someone's operations on WhatsApp, the harness is:
- The personality file that defines tone, domain expertise, and how it handles ambiguity
- The operating protocols that define what it does at session start, how it behaves in a group channel, how it writes to memory, how it handles handoffs
- The tool policy layer: what it can read, what it can write, what it needs permission for, what it is never allowed to touch
- The memory architecture: what gets remembered daily, what gets remembered long-term, how new information gets merged with existing memory, how the agent searches its own notes
- The cron jobs: proactive morning briefings, daily recaps before the session reset, weekly reporting, monitoring checks
- The channel configuration: which WhatsApp groups it watches, who it responds to, when it stays silent
- The graduation plan: Month 1 it asks before acting, Month 3 it handles routine on its own, Month 6 it anticipates
- The context engineering: the per-business knowledge we onboard the agent into during implementation
All of that lives above the model. None of it changes when the model changes. Which means when the model provider changes the rules, the thing a client paid for is still there.
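The components above can be pictured as a single declarative configuration that lives entirely above the model. The sketch below is an assumption about shape, not the consultancy's actual format; every key and value is hypothetical. It exists to make one claim mechanical: a provider swap touches exactly one key.

```python
# Hypothetical harness configuration: every layer listed above,
# none of it tied to a specific model provider.
HARNESS_CONFIG = {
    "model_provider": "provider-a",          # the only model-specific line
    "personality_file": "agent/persona.md",  # tone, expertise, ambiguity rules
    "protocols": ["session_start", "group_channel", "memory_write", "handoff"],
    "tool_policy": {
        "read": ["calendar", "inbox"],
        "write": ["drafts"],
        "ask_first": ["send_email"],
        "never": ["payments"],
    },
    "memory": {"daily": "notes/daily/", "long_term": "notes/core.md"},
    "cron": {
        "morning_briefing": "07:30",
        "daily_recap": "18:00",
        "weekly_report": "Fri 17:00",
    },
    "channels": {"whatsapp_groups": ["ops"], "respond_to": ["owner"]},
    "graduation": {
        "month_1": "ask_before_acting",
        "month_3": "handle_routine",
        "month_6": "anticipate",
    },
}


def swap_provider(config: dict, new_provider: str) -> dict:
    """The migration in miniature: one key changes, every other layer is untouched."""
    return {**config, "model_provider": new_provider}
```

Running `swap_provider(HARNESS_CONFIG, "provider-b")` changes the provider line and leaves the personality, protocols, tool policy, memory, cron, channel, and graduation layers byte-for-byte identical.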
Wrapper companies versus harness companies
The fastest way to understand the difference is to look at what happens when a vendor ships a better onboarding wizard.
A wrapper consultancy sells installation. They pick a tool, configure the basics, hand over a login, and leave. Their value is whatever work they saved the client by running the installer. The moment the vendor automates that installer, the wrapper consultancy is worth nothing. Every AI agent platform on the market is racing to make their own onboarding wizard better, because the wizard is the single cheapest way for the platform to grow. Wrapper consultancies are running a business whose best customer is also their biggest long-term threat.
A harness consultancy sells the operator layer. The configuration is built from a discovery session, not a template. The security boundaries are designed for the specific business. The memory is tuned for what this business actually needs to remember. The ongoing care keeps the whole thing healthy as the platform ships new releases. None of that is in any wizard, because the wizard does not know anything about the business. A harness consultancy is not threatened by a better installer, because the installer was never the product.
The April 4 event was the cleanest possible demonstration of the difference. Wrapper consultancies with a Claude subscription baked into their stack lost their Monday morning. Harness consultancies whose model provider was behind an interface wrote a new config line.
| | Wrapper consultancy | Harness consultancy |
|---|---|---|
| What they sell | Installation | The operator layer around the model |
| How they configure the agent | Template with minor tweaks | Discovery-first, fine-tuned per business |
| Where the data lives | Vendor cloud | Client hardware |
| What happens when the model provider changes the rules | Stack breaks until rewritten | Swap the provider, same day |
| Improvement loop after install | None | Monthly refinement via ongoing care |
| Value over time | Collapses when the vendor ships a better wizard | Compounds with every month of maintenance |
Our three pillars
We run the harness on three principles that together make the operator layer durable.
Privacy-first. Every agent runs on the client's own Mac Mini, in the client's own office, behind the client's own firewall. The memory is local. The search is local. The business content does not flow into any cloud service. GDPR is not an afterthought, it is the architecture. When a client asks us where their data lives, we answer with a postcode.
Fine-tuned to the business. The harness is built from the discovery session. Ninety-five percent of the infrastructure is shared across every client, because good engineering is standardized by definition. Five percent is deep customization specific to the business, based on the workflows we mapped, the invisible work we surfaced, and the ranked pain points we saw. No template. The agent sounds like the business because it was built for the business.
Managed and evolving. The Always-On Watch subscription is not help desk support. It is the harness staying alive. Every platform release gets tested on our own hardware before it goes near a client. The personality evolves as the business changes. The memory gets tuned. The graduation plan advances. When the model provider changes the rules, as happened on April 4, we move the client without them having to think about it. What a business is paying for monthly is a functioning operator layer, not a bundle of hours.
Why this matters more to small and medium businesses than to anyone else
When a hyperscaler flips a policy switch, a Fortune 500 with a forty-person AI infrastructure team writes an incident ticket and moves on. Their bench is deep enough to re-platform inside a week. Their vendor contracts were negotiated with exactly this kind of event in mind.
A small or medium business has no bench. If their AI agent stops working because the model provider changed a subscription rule, the person who notices is the founder, on WhatsApp, at eight in the morning, trying to figure out why the daily briefing did not arrive. The founder is not going to debug the stack. They want the service to work or they want to go back to doing it themselves.
This is why the operator layer matters so much more for this market. The contract a SMB signs with an AI consultancy is not "we will debug when things break." It is "we will keep the service running regardless of who changes what upstream." Meeting that contract is impossible with a wrapper. It is routine with a harness.
What you are actually paying for
If you are considering any AI consultancy, including ours, ask one question. What happens on the day the underlying model provider changes the rules? If the honest answer is "we will need some time to figure it out," what they are selling you is an installation. If the honest answer is "we have already planned for that and here is how we handle it," what they are selling you is a harness.
You are not buying an OpenClaw install. You are not buying a Claude subscription. You are not buying GPT access. Those are commodities. What you are buying is the operator layer around those commodities, built from a discovery session that mapped your Monday morning, tuned to the specific invisible work that keeps your business running, and maintained by an engineering team that keeps it current while you run the business.
The model is replaceable. We proved it on April 5, in one working day, on seven live agents, with zero client disruption. The operating system around the model is what you pay for, because it is the only thing that was still there when the rules changed.
Want to see what this looks like for your business?
Book a discovery call. Thirty minutes. We map your Monday morning, we tell you which stations on your assembly line a Full-Time Agent could actually own, and we show you the harness we would build around it. If the honest answer is that AI is not the right move for you yet, we will tell you that too.
Sources
- Anthropic, updated terms of service prohibiting third-party harness use (February 20, 2026). Reported in multiple outlets, see below.
- The Register, Anthropic closes door on subscription use of OpenClaw (April 6, 2026). theregister.com/2026/04/06/anthropic_closes_door_on_subscription/
- VentureBeat, Anthropic cuts off the ability to use Claude subscriptions with OpenClaw and third-party AI agents (April 2026). venturebeat.com/technology/anthropic-cuts-off-the-ability-to-use-claude-subscriptions-with-openclaw-and
- The Next Web, Anthropic blocks OpenClaw from Claude subscriptions in cost crackdown (April 2026). thenextweb.com/news/anthropic-openclaw-claude-subscription-ban-cost
- LangChain Blog, Improving Deep Agents with harness engineering (early 2026). blog.langchain.com/improving-deep-agents-with-harness-engineering/
- LangChain Blog, The Anatomy of an Agent Harness. blog.langchain.com/the-anatomy-of-an-agent-harness/
- LangChain Blog, Better Harness: A Recipe for Harness Hill-Climbing with Evals. blog.langchain.com/better-harness-a-recipe-for-harness-hill-climbing-with-evals/