straced

The agent runtime trap

Friday, May 15, 2026

When Anthropic open-sourced the Model Context Protocol in late 2024, the pitch was clear: a universal connector standard so that tool integrations built for one AI runtime could work across all of them. The adoption has been real — MCP is now in most major IDEs, several enterprise platforms, and hundreds of community-built integrations. But eighteen months on, the fragmentation problem MCP was meant to solve has gotten worse, not better. Because MCP standardises connectors, not runtimes. The memory model, the agent loop, the state persistence, the evaluation tooling — none of that travels.
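To make the boundary concrete: here is roughly what a connector looks like with the official MCP Python SDK (the `FastMCP` helper is real; the tool itself is a made-up example). The decorated function is the part MCP standardises and the part that travels between MCP-aware runtimes. Nothing about how the calling runtime manages memory, state, or its loop appears anywhere in it.

```python
# A minimal MCP server exposing one tool, via the mcp Python SDK.
# The tool definition is the portable part; the runtime that calls it
# brings its own memory model, agent loop, and state handling.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("ticket-lookup")  # illustrative server name

@mcp.tool()
def lookup_ticket(ticket_id: str) -> str:
    """Fetch the status of a support ticket by ID."""
    # A real connector would query a ticketing system here.
    return f"Ticket {ticket_id}: open"

if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```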

Every major vendor now runs its own runtime. Anthropic has Claude's tool use layer and the MCP server ecosystem. OpenAI has the Responses API with its own tool calling format and context management. Google has Vertex AI Agent Builder. Microsoft wraps all of it in Copilot Studio with its own connector abstraction on top. LangChain, CrewAI, AutoGen, and a dozen smaller open-source frameworks sit underneath as infrastructure options. Each has its own memory model, its own approach to state, and its own rate limit structure and failure modes. The landscape is not converging. It is expanding.
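The memory divergence is the part that bites hardest, because it never shows up in a demo. A team that wants any portability has to draw its own seam between agent code and vendor storage. A minimal sketch of such a seam (every name here is hypothetical, not any vendor's API) might look like this:

```python
# Hypothetical seam between agent code and a runtime's memory/state model.
# Each vendor adapter would wrap that runtime's actual storage primitives
# behind this one interface; only the adapter changes when the vendor does.
from typing import Protocol


class ConversationMemory(Protocol):
    def append(self, session_id: str, role: str, content: str) -> None:
        """Record one turn of a session."""
        ...

    def history(self, session_id: str, max_turns: int) -> list[dict[str, str]]:
        """Return the most recent turns, oldest first."""
        ...


class InMemoryStore:
    """Reference implementation; a vendor adapter would replace this."""

    def __init__(self) -> None:
        self._sessions: dict[str, list[dict[str, str]]] = {}

    def append(self, session_id: str, role: str, content: str) -> None:
        self._sessions.setdefault(session_id, []).append(
            {"role": role, "content": content}
        )

    def history(self, session_id: str, max_turns: int) -> list[dict[str, str]]:
        return self._sessions.get(session_id, [])[-max_turns:]
```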

Six months ago, most teams were still prototyping. The runtime choice did not matter much because nothing was running at scale. That window has closed. Teams that shipped agents into production in late 2024 or early 2025 are now maintaining them, and the switching cost has compounded quietly. Your tool connectors are wired to one vendor's MCP implementation. Your memory layer is configured against one vendor's storage model. Your deployment pipeline assumes one vendor's authentication scheme. The architecture that felt temporary has become load-bearing.

The fragmentation would be manageable if the choice were transparent at the point of entry. It is not. Vendor demonstrations optimise for capability — what the agent can do, how fast it responds, how many tools it connects. The runtime architecture is in the plumbing. A team evaluating OpenAI's Responses API is simultaneously making a decision about OpenAI's context window management, OpenAI's tool calling format, and OpenAI's rate limit model. If a stronger capability emerges from another vendor six months later, switching requires re-architecting the agent loop, not just swapping an API key. Most teams are not framing it that way when they start.
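The tool calling format alone shows how deep the coupling runs. The same tool has to be described in each vendor's wire shape; the translations below follow Anthropic's `input_schema` convention and OpenAI's function-tool convention as commonly published, but treat the exact shapes as approximate, since they drift across API versions.

```python
# One neutral tool description, translated to two vendors' wire formats.
# The target shapes are close to the published ones but should be checked
# against current API docs before use.
NEUTRAL_TOOL = {
    "name": "lookup_ticket",
    "description": "Fetch the status of a support ticket by ID.",
    "parameters": {
        "type": "object",
        "properties": {"ticket_id": {"type": "string"}},
        "required": ["ticket_id"],
    },
}


def to_anthropic(tool: dict) -> dict:
    # Anthropic's Messages API expects the JSON schema under "input_schema".
    return {
        "name": tool["name"],
        "description": tool["description"],
        "input_schema": tool["parameters"],
    }


def to_openai(tool: dict) -> dict:
    # OpenAI's function-tool format nests the schema under "function".
    return {"type": "function", "function": tool}


if __name__ == "__main__":
    print(to_anthropic(NEUTRAL_TOOL))
    print(to_openai(NEUTRAL_TOOL))
```

Swapping the API key is the easy half; everything downstream of these shapes, from streaming deltas to tool-result message roles, diverges the same way.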

The pattern collapses into one structural point: choosing an AI agent capability in 2026 means choosing a runtime. The capability question and the infrastructure question are the same question, but the industry has not caught up to this. Procurement frameworks evaluate models. Architecture reviews look at latency and cost. Almost nobody is evaluating the exit cost of the runtime at the point of entry — which is the only time that evaluation is cheap to do.

One escape valve exists but it is early. Open-source frameworks — AutoGen, CrewAI, LangGraph — give you runtime portability at the cost of operational overhead. Several startups are building abstraction layers above the vendor runtimes, but none has sufficient production coverage to recommend confidently yet. For teams still in the evaluation phase, the most defensible position is to treat the agent loop itself as the thing you own and the vendor capability as the thing you rent. That distinction is harder to maintain in practice than it sounds, but teams that draw it early will have substantially more flexibility in twelve months.
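At its most stripped down, owning the loop means the control flow, the termination condition, and the tool dispatch live in your code, and the vendor call is one injected function. A sketch under that assumption, with every name hypothetical:

```python
# Hypothetical agent loop owned by the team; the vendor capability is
# reduced to one injected `complete` function. Changing vendors means
# rewriting `complete`, not the loop, the dispatch, or the stop logic.
from typing import Callable

# complete(messages) returns {"tool": name, "args": {...}} or {"final": text}.
CompleteFn = Callable[[list[dict]], dict]


def run_agent(
    task: str,
    complete: CompleteFn,
    tools: dict[str, Callable[..., str]],
    max_steps: int = 8,
) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):  # the step budget is yours, not the vendor's
        step = complete(messages)
        if "final" in step:
            return step["final"]
        # Tool dispatch is yours too: the vendor only names the tool.
        result = tools[step["tool"]](**step["args"])
        messages.append({"role": "tool", "content": result})
    return "step budget exhausted"
```

The asymmetry is the point: the ten lines of loop are trivial to write and expensive to untangle once a vendor's runtime writes them for you.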