MCP 2.0? Two Doorways to the Agentic UI: MCP Apps and WebMCP

The Model Context Protocol gave AI models a clean way to reach tools and data. What it didn't give them was an interface. In 2026 that changed twice over. MCP Apps lets an MCP server return rich, interactive UI that renders right inside the conversation. WebMCP lets any website expose its own functions as tools an AI agent can call directly in the browser. One brings UI into the agent; the other pushes tools out from the web. This post walks through both, how they work under the hood, how they relate, and what to do about them.

A quick word on the title. Neither of these is literally an "MCP 2.0" release — there's no version bump, and they ship through different bodies (one through the MCP project, one through the W3C). But together they're the biggest change in what MCP-powered agents can do since the protocol launched, so the shorthand fits. Two analogies, which we'll come back to, make them stick: MCP Apps is a micro frontend for agents, and WebMCP is accessibility for agents.

A 60-second MCP refresher

MCP is an open specification for connecting LLM clients (hosts) to external systems. It defines three primitives: tools (executable actions), resources (context and data), and prompts (reusable templates). The design is layered: the primitives are expressed as JSON-RPC 2.0 messages (such as tools/list) in the data layer, which rides on a transport (stdio for local servers, Streamable HTTP for remote ones). Through 2025 it became the de facto "USB-C for AI tools."
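To make the layering concrete, here is a minimal sketch of what a tools/list exchange looks like at the data layer. The get_weather tool and the in-memory handleRequest function are illustrative assumptions; a real server would sit behind a transport rather than a direct function call.

```javascript
// Illustrative only: a toy in-memory "server" that answers one JSON-RPC method.
const tools = [
  {
    name: "get_weather", // hypothetical tool, not from any real server
    description: "Look up the current weather for a city",
    inputSchema: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
];

// Build a JSON-RPC 2.0 request envelope.
function makeRequest(id, method, params = {}) {
  return { jsonrpc: "2.0", id, method, params };
}

// Answer tools/list; reply with the standard "method not found" error otherwise.
function handleRequest(request) {
  if (request.method === "tools/list") {
    return { jsonrpc: "2.0", id: request.id, result: { tools } };
  }
  return {
    jsonrpc: "2.0",
    id: request.id,
    error: { code: -32601, message: "Method not found" },
  };
}

const listResponse = handleRequest(makeRequest(1, "tools/list"));
console.log(JSON.stringify(listResponse, null, 2));
```

The point of the sketch is that a tool is just data (name, description, schema) until something calls it, which is what lets the same primitive travel over different transports.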

But it left two gaps open, both about the boundary between an agent and the world.

The output gap: tools return text or JSON. When a tool hands back a few hundred rows, users don't want a paragraph summary — they want to sort, filter, drill into row 47, or view a PDF inline. With pure text, every one of those is another prompt round-trip.

The reach gap: MCP servers are purpose-built backends. The millions of existing websites still get driven by brittle screen-scraping and DOM-clicking, with the site's own developers nowhere in the loop.

MCP Apps closes the first. WebMCP closes the second.

MCP Apps — UI inside the agent

MCP Apps is the first official MCP extension. It was proposed in November 2025 and shipped on January 26, 2026, building on two pieces of prior art: the community MCP-UI project (created by Ido Salomon and Liad Yosef) and the OpenAI Apps SDK. Rather than three competing approaches, those efforts converged into one open standard.

The idea: a tool can return an interactive UI component that renders directly in the conversation — a filterable dashboard, a configuration wizard with dependent fields, an inline document viewer with clickable clauses, a live-updating monitoring panel. The kinds of interactions that are clumsy as a text exchange become as natural as using any web app.

If you've shipped micro frontends, the shape will feel familiar — an MCP App is, in effect, a micro frontend for agents. It's a self-contained UI, built and deployed independently by the server, composed into a host's shell at runtime, isolated in a sandboxed iframe, and talking to the host over a defined contract. That's the micro-frontend playbook almost exactly. The one twist worth holding onto: a normal micro frontend serves a human, while an MCP App serves a human and an LLM — via updateModelContext, it can feed state back to the model. And composition here is temporal (streamed inline in a conversation) rather than laid out in fixed page regions.

How it works

The architecture rests on two MCP primitives. First, tools carry UI metadata: a tool declares a _meta.ui.resourceUri pointing at a UI resource. Second, UI resources are served via the ui:// scheme and contain bundled HTML/JS.

// A tool that declares an interactive UI
{
  name: "visualize_data",
  description: "Visualize data as an interactive chart",
  inputSchema: { /* ... */ },
  _meta: {
    ui: { resourceUri: "ui://charts/interactive" }
  }
}

When the tool runs, the host reads that metadata, fetches the ui:// resource, renders it in a sandboxed iframe, and opens bidirectional JSON-RPC over postMessage between the iframe and the host. The model stays in the loop — it sees what the user does and can respond — while the UI handles what text can't: live updates, native media, persistent state, direct manipulation.
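The first half of that flow, going from tool metadata to the HTML a host would load, can be sketched as a plain lookup. Everything here except _meta.ui.resourceUri and the ui:// scheme is a hypothetical stand-in: real hosts fetch the resource from the server and then handle sandboxing themselves.

```javascript
// Hypothetical host-side lookup; the Map stands in for resources fetched from the server.
const uiResources = new Map([
  ["ui://charts/interactive", '<!doctype html><div id="chart"></div>'],
]);

// Return the HTML a host would load into a sandboxed iframe, or null for text-only tools.
function resolveToolUi(tool, resources) {
  const uri = tool._meta?.ui?.resourceUri;
  if (typeof uri !== "string" || !uri.startsWith("ui://")) return null;
  if (!resources.has(uri)) {
    throw new Error(`Tool "${tool.name}" points at unknown UI resource ${uri}`);
  }
  return { uri, html: resources.get(uri) };
}

const chartTool = {
  name: "visualize_data",
  _meta: { ui: { resourceUri: "ui://charts/interactive" } },
};
const plainTool = { name: "echo" }; // no UI metadata, so the host falls back to text

console.log(resolveToolUi(chartTool, uiResources).uri);
console.log(resolveToolUi(plainTool, uiResources));
```

Tools without UI metadata resolve to null, so a host can keep treating them as ordinary text-returning tools.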

An MCP App in action: Excalidraw built with MCP Apps, running in Claude. The user can drill in without re-prompting, and the model sees every interaction. (GIF credit: https://github.com/modelcontextprotocol/ext-apps/)

The App API

Developers build against @modelcontextprotocol/ext-apps, which provides an App class. It's a convenience wrapper over plain postMessage, so you can use any framework or none.

import { App } from "@modelcontextprotocol/ext-apps";

const app = new App();
await app.connect();

// Receive tool results from the host
app.ontoolresult = (result) => renderChart(result.data);

// Call server tools from inside the UI
const response = await app.callServerTool({
  name: "fetch_details",
  arguments: { id: "123" },
});

// Quietly update the model's context
await app.updateModelContext({
  content: [{ type: "text", text: "User selected option B" }],
});

Because the app runs inside the client, it can log events for debugging, open links in the user's browser, send follow-up messages to drive the conversation, or update the model's context for later turns. The shared lineage with the Apps SDK shows up in one detail you'll see across hosts: UI resources are detected by MIME type. The Apps SDK uses text/html+skybridge, while the MCP Apps specification defines its own HTML profile, text/html;profile=mcp-app, so check which one your target host expects.
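For intuition about what a wrapper like App is saving you from, here is a rough sketch of request/response JSON-RPC over a message channel. This is not the ext-apps implementation: createChannelPair fakes the iframe/host postMessage link in memory, and RpcPeer and its handler table are invented for the example.

```javascript
// A toy stand-in for the iframe/host postMessage pair (not the real App class).
function createChannelPair() {
  const a = { onmessage: null, postMessage: (m) => queueMicrotask(() => b.onmessage?.(m)) };
  const b = { onmessage: null, postMessage: (m) => queueMicrotask(() => a.onmessage?.(m)) };
  return [a, b];
}

class RpcPeer {
  constructor(port, handlers = {}) {
    this.port = port;
    this.handlers = handlers;
    this.nextId = 1;
    this.pending = new Map();
    port.onmessage = (msg) => this.receive(msg);
  }

  // Send a request and resolve when the response with the same id arrives.
  request(method, params) {
    const id = this.nextId++;
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.port.postMessage({ jsonrpc: "2.0", id, method, params });
    });
  }

  async receive(msg) {
    if (msg.method) {
      // Incoming request: run the handler and reply with a result or an error.
      const handler = this.handlers[msg.method];
      if (!handler) {
        this.port.postMessage({ jsonrpc: "2.0", id: msg.id, error: { code: -32601, message: "Method not found" } });
        return;
      }
      try {
        const result = await handler(msg.params);
        this.port.postMessage({ jsonrpc: "2.0", id: msg.id, result });
      } catch (err) {
        this.port.postMessage({ jsonrpc: "2.0", id: msg.id, error: { code: -32603, message: err.message } });
      }
    } else if (this.pending.has(msg.id)) {
      // Incoming response: settle the promise waiting on this id.
      const { resolve, reject } = this.pending.get(msg.id);
      this.pending.delete(msg.id);
      if (msg.error) reject(new Error(msg.error.message));
      else resolve(msg.result);
    }
  }
}

const [hostPort, appPort] = createChannelPair();
const host = new RpcPeer(hostPort, {
  // Hypothetical host handler; a real host would proxy this to the MCP server.
  "tools/call": async ({ name, arguments: args }) => ({
    content: [{ type: "text", text: `${name} called with id ${args.id}` }],
  }),
});
const app = new RpcPeer(appPort);

const detailsPromise = app.request("tools/call", {
  name: "fetch_details",
  arguments: { id: "123" },
});
detailsPromise.then((result) => console.log(result.content[0].text));
```

Matching responses to requests by id is the part that is easy to get subtly wrong by hand, which is a fair argument for using the provided wrapper.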

Security

You're running code you didn't write inside a trusted host, so the security model is layered. UI runs in sandboxed iframes with no access to the host's DOM, cookies, or storage. Hosts can review pre-declared HTML templates before rendering. All UI-to-host traffic is auditable JSON-RPC, and hosts can require explicit user consent for UI-initiated tool calls. None of this removes the need to vet the servers you connect — but it makes suspicious behavior reviewable and blockable before anything renders.
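Two of those layers, auditable traffic and consent for UI-initiated tool calls, are easy to picture as a small gate in front of the host's message handling. The sketch below is a hypothetical host policy, not code from any shipping client: createGate, askUser, and forward are names made up for illustration.

```javascript
// Hypothetical host policy: log every UI-to-host message, and ask before tool calls.
function createGate({ askUser, forward }) {
  const auditLog = [];

  async function handle(message) {
    // Every message is recorded before anything else happens.
    auditLog.push({ at: Date.now(), method: message.method, params: message.params });

    if (message.method === "tools/call") {
      const approved = await askUser(`Allow the app to call "${message.params.name}"?`);
      if (!approved) {
        return { jsonrpc: "2.0", id: message.id, error: { code: -32000, message: "User declined" } };
      }
    }
    const result = await forward(message);
    return { jsonrpc: "2.0", id: message.id, result };
  }

  return { handle, auditLog };
}

const gate = createGate({
  // Stand-in for a real consent prompt: refuses one dangerous-sounding tool.
  askUser: async (question) => !question.includes("delete_account"),
  // Stand-in for proxying the call to the MCP server.
  forward: async (message) => ({ ok: true, method: message.method }),
});

gate
  .handle({ jsonrpc: "2.0", id: 1, method: "tools/call", params: { name: "delete_account" } })
  .then((reply) => console.log(reply.error.message));
```

Because the declined call still lands in auditLog, the host keeps a record of what the UI attempted, not only what it was allowed to do.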

Where it runs

MCP Apps is live today in Claude (web and desktop), Goose, VS Code (Insiders), and ChatGPT, with JetBrains, AWS Kiro, and Google's Antigravity exploring support. The headline for developers: ship one interactive experience that works across a broad range of clients without writing a single line of client-specific code.

WebMCP — tools out to the agent

WebMCP comes at the problem from the opposite direction. It's a browser standard developed in the W3C Web Machine Learning Community Group, with authors from Microsoft and Google. The original proposal landed in August 2025; a Draft Community Group Report followed on February 10, 2026.

The pitch: instead of an agent taking a screenshot of your page and guessing where to click, your website hands it a typed list of the things it can do and the exact parameters each one needs. A new browser API, navigator.modelContext, lets a page register its features as structured, callable tools.

A useful way to frame this: WebMCP is accessibility for agents. Just as ARIA and the accessibility tree give a screen reader a machine-readable handle on a page instead of forcing it to interpret pixels, WebMCP gives an agent typed, callable tools instead of forcing it to scrape the DOM — screen-scraping is to agents what reading pixels is to a screen reader. The refinement worth keeping in mind: where ARIA mostly describes existing widgets and the assistive tech drives them, WebMCP exposes actions and intents — verbs, not just controls, some of which may not map to any single on-screen element (think "import 100 stamps"). So it's closer to "accessibility plus a task API" than to ARIA alone.

The API

There are two registration styles. provideContext({ tools }) declares the page's entire tool set, and calling it again replaces the set — handy for single-page apps where available actions change with UI state. registerTool() / unregisterTool() add and remove tools incrementally.

A tool is a name, a description, an inputSchema (JSON Schema), and an execute(args, agent) callback that returns structured content. The shape deliberately mirrors MCP and the Prompt API tool-use format.

if ("modelContext" in window.navigator) {
  navigator.modelContext.provideContext({
    tools: [{
      name: "add-stamp",
      description: "Add a new stamp to the collection",
      inputSchema: {
        type: "object",
        properties: {
          name: { type: "string", description: "The name of the stamp" },
          year: { type: "number", description: "The year it was issued" }
        },
        required: ["name", "year"]
      },
      execute({ name, year }, agent) {
        addStamp(name, year);            // reuse a function the page already has
        return {
          content: [{ type: "text", text: `Added "${name}".` }]
        };
      }
    }]
  });
}

The most appealing property for working developers: the execute body is often just a call to a helper your frontend already defines. Exposing existing functionality as a tool can be a minimal change rather than a rewrite.
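The difference between the two registration styles is easiest to see in a toy model. The class below only mimics the semantics described above (replace the whole set versus add and remove one at a time); it is not the browser's implementation, and throwing on a duplicate name is an assumption of this sketch.

```javascript
// A toy model of the registration semantics; not the browser's implementation.
class ToolRegistry {
  constructor() {
    this.tools = new Map();
  }

  // Declares the whole set: anything registered earlier is dropped.
  provideContext({ tools = [] }) {
    this.tools = new Map(tools.map((tool) => [tool.name, tool]));
  }

  // Adds one tool; rejecting duplicates is an assumption of this sketch.
  registerTool(tool) {
    if (this.tools.has(tool.name)) throw new Error(`Duplicate tool: ${tool.name}`);
    this.tools.set(tool.name, tool);
  }

  unregisterTool(name) {
    this.tools.delete(name);
  }

  names() {
    return [...this.tools.keys()];
  }
}

const registry = new ToolRegistry();
registry.provideContext({ tools: [{ name: "add-stamp" }, { name: "search-stamps" }] });
registry.registerTool({ name: "export-csv" });
console.log(registry.names()); // add-stamp, search-stamps, export-csv

// A single-page app navigating to checkout swaps the whole set in one call.
registry.provideContext({ tools: [{ name: "checkout" }] });
console.log(registry.names()); // checkout only
```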

Full booking flow on a WebMCP-enabled demo site, driven by the prompt: "Find me an entire home in San Francisco for 3 guests with at least 2 bedrooms and a maximum price of $300 per night, and then reserve it between March 1 and 7" (GIF credit: https://medium.com/data-science-collective/moving-beyond-screen-scraping-creating-an-agent-native-web-app-with-webmcp-4818552e1e11)

Tools run in page JavaScript, on HTTPS only. Calls execute one at a time, sequentially, on the main thread; heavy or batched work can be offloaded to workers. Crucially, the browser mediates access — the user reviews and consents to which agents connect to the page.
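"One at a time, sequentially" is worth picturing, because it means a slow tool delays every call queued behind it. Here is a small sketch of that ordering using a promise chain. It models the behavior only; createSerialExecutor is a made-up helper, not something the browser exposes.

```javascript
// Sketch of "one call at a time": each call starts only after the previous one settles.
function createSerialExecutor() {
  let tail = Promise.resolve();
  return function run(task) {
    const result = tail.then(() => task());
    // Keep the chain alive even if this task rejects.
    tail = result.catch(() => {});
    return result;
  };
}

const run = createSerialExecutor();
const order = [];
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// The slow call is queued first, so the fast one has to wait for it.
const first = run(async () => { await sleep(20); order.push("slow"); return 1; });
const second = run(async () => { order.push("fast"); return 2; });

const finished = Promise.all([first, second]).then((values) => {
  console.log(order, values);
  return values;
});
```

Here "slow" is recorded before "fast" even though the second task has no delay of its own, which is the practical reason to push heavy work into a worker.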

The agent object passed into execute can pause mid-execution to ask the user something via requestUserInteraction() — for confirmation, authentication, or a dialog.

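async function buyProduct({ product_id }, agent) {
  // Pause the tool call and let the user confirm in a native dialog.
  const confirmed = await agent.requestUserInteraction(async () =>
    confirm(`Buy product ${product_id}?`));
  if (!confirmed) throw new Error("Purchase cancelled by user.");
  executePurchase(product_id);       // existing page function
  // Return structured content, matching the add-stamp example.
  return {
    content: [{ type: "text", text: `Product ${product_id} purchased.` }]
  };
}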
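A handler like this is also straightforward to exercise outside a browser if the dialog and the purchase call are injected. The sketch below is one way to do that, modeled on the handler above and returning structured content. createStubAgent and makeBuyProduct are hypothetical helpers, and the stub runs the callback immediately, which a real browser may not do.

```javascript
// Stub agent: runs the callback right away, as if the browser had granted the interaction.
function createStubAgent() {
  return { requestUserInteraction: async (callback) => callback() };
}

// The dialog and the purchase are injected so both can be faked in a test.
function makeBuyProduct({ confirmFn, purchaseFn }) {
  return async function buyProduct({ product_id }, agent) {
    const confirmed = await agent.requestUserInteraction(async () =>
      confirmFn(`Buy product ${product_id}?`));
    if (!confirmed) throw new Error("Purchase cancelled by user.");
    purchaseFn(product_id);
    return { content: [{ type: "text", text: `Product ${product_id} purchased.` }] };
  };
}

const purchases = [];
const agent = createStubAgent();
const acceptAll = makeBuyProduct({ confirmFn: () => true, purchaseFn: (id) => purchases.push(id) });
const declineAll = makeBuyProduct({ confirmFn: () => false, purchaseFn: (id) => purchases.push(id) });

const outcome = (async () => {
  const ok = await acceptAll({ product_id: "sku-1" }, agent);
  // A declined confirmation surfaces as a rejected promise.
  const declined = await declineAll({ product_id: "sku-2" }, agent).catch((err) => err.message);
  return { text: ok.content[0].text, declined, purchases };
})();
outcome.then((summary) => console.log(summary));
```

Only the accepted purchase reaches purchaseFn, so the confirmation really does gate the side effect.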

Why it's designed this way

WebMCP intentionally aligns its API with MCP primitives, so any MCP-compatible agent can use a site's tools with minimal translation. But it leaves the data layer to the browser. That choice does three things: it decouples the web from any specific MCP version (the browser maintains compatibility as the protocol evolves), it lets the browser enforce web-platform security (such as managing iframe capabilities), and it keeps responses web-native (a tool can return an img or video element). The bigger philosophical point is developer involvement: the agentic web becomes something site authors publish into, reducing reliance on UI automation, which improves privacy, lowers site costs, and can even help accessibility.
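As a mental model for "minimal translation", the sketch below shows how a page's tool objects could map onto MCP-style list and call shapes. How a browser actually bridges the two is not specified here, so treat listTools and callTool as hypothetical helpers.

```javascript
// A page-side tool in the WebMCP shape (name, description, inputSchema, execute).
const pageTools = [
  {
    name: "add-stamp",
    description: "Add a new stamp to the collection",
    inputSchema: {
      type: "object",
      properties: { name: { type: "string" }, year: { type: "number" } },
      required: ["name", "year"],
    },
    execute: ({ name }) => ({ content: [{ type: "text", text: `Added "${name}".` }] }),
  },
];

// Listing keeps only the serializable fields; execute never leaves the page.
function listTools(tools) {
  return {
    tools: tools.map(({ name, description, inputSchema }) => ({ name, description, inputSchema })),
  };
}

// Calling looks the tool up by name and runs its execute callback in the page.
async function callTool(tools, { name, arguments: args }, agent) {
  const tool = tools.find((candidate) => candidate.name === name);
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  return tool.execute(args, agent);
}

console.log(Object.keys(listTools(pageTools).tools[0]));
const callResult = callTool(pageTools, { name: "add-stamp", arguments: { name: "Penny Black", year: 1840 } }, {});
callResult.then((result) => console.log(result.content[0].text));
```

The descriptor fields carry over unchanged, which is the sense in which the shapes "mirror" each other. Only the callback stays behind.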

Status and limitations

Be honest about maturity. As of mid-2026, WebMCP is in early preview in Chrome (Canary and an origin trial), with native Chrome and Edge support expected in the second half of 2026; Firefox and Safari are engaged in the spec but haven't committed to timelines. And the known limitations are real:

A browsing context is required — tools run in page JS, so a tab has to be open. There's no headless tool calling yet.
Discoverability is unsolved — there's no way to know a site's tools without visiting it. A manifest-based approach is under discussion but not specified.
UI synchronization and refactoring — developers have to keep the on-page UI consistent with state the agent changes, and complex sites may need restructuring.
It is a Draft Community Group Report, not a finalized standard.
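On the UI synchronization point, one pattern that tends to help is routing both the human path and the agent path through the same state update, so the page re-renders no matter who made the change. The store below is a generic sketch of that idea, not part of WebMCP; all names are illustrative.

```javascript
// One source of truth: UI handlers and agent tools both go through the store,
// and every change notifies the same render subscribers.
function createStore(initialState) {
  let state = initialState;
  const listeners = new Set();
  return {
    getState: () => state,
    subscribe(listener) {
      listeners.add(listener);
      return () => listeners.delete(listener);
    },
    update(updater) {
      state = updater(state);
      listeners.forEach((listener) => listener(state));
    },
  };
}

const store = createStore({ stamps: [] });
const renders = [];
// Stand-in for DOM rendering: record what the page would show.
store.subscribe((state) => renders.push(`${state.stamps.length} stamp(s)`));

// Shared helper used by both the form's submit handler and the tool's execute().
function addStamp(name, year) {
  store.update((state) => ({ stamps: [...state.stamps, { name, year }] }));
}

const addStampTool = {
  name: "add-stamp",
  execute({ name, year }) {
    addStamp(name, year);
    return { content: [{ type: "text", text: `Added "${name}".` }] };
  },
};

addStamp("Penny Black", 1840);                                 // human path
addStampTool.execute({ name: "Inverted Jenny", year: 1918 });  // agent path
console.log(renders);
```

Both calls trigger a render, so the on-page list never drifts from what the agent did.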

How the pieces fit

It helps to name the whole cast. MCP is the base protocol. MCP Apps is the official extension that flows UI into the agent's conversation. WebMCP is the browser standard that flows tools out of a website to any agent.

The clearest way to hold MCP Apps and WebMCP in your head is side by side:

| | MCP Apps | WebMCP |
| --- | --- | --- |
| Direction | UI flows into the agent | Tools flow out from the website |
| Where code runs | Sandboxed iframe inside the host | The website's own page JS, in the browser |
| Who builds it | MCP server authors | Website developers |
| Needs a server? | Yes (an MCP server) | No, just the page plus the browser API |
| Standards body | MCP project (official extension) | W3C Web ML CG (draft) |
| Status (mid-2026) | Live in production clients | Early preview / origin trial |
| Transport | JSON-RPC over `postMessage` | Browser-mediated, MCP-aligned |
| Key API | `_meta.ui.resourceUri`, `ui://`, `App` | `navigator.modelContext`, `provideContext` |
| Pain it solves | Output gap (text → interactive UI) | Reach gap (real sites → agent tools) |

They meet in the middle. Picture an agent that drives the tools a site exposes through WebMCP and renders the rich UI a server returns through MCP Apps — all grounded in the same MCP primitives.

Two mental models

If you remember nothing else, remember the two analogies — they're the fastest way to keep these straight.

MCP Apps is a micro frontend for agents. A self-contained UI, built and deployed by the server, composed into the agent's shell at runtime and isolated in a sandboxed iframe, over a postMessage contract. The twist: one of its consumers is an LLM, not just a human.

WebMCP is accessibility for agents. A developer-provided, machine-readable interface to the same features, so agents don't reverse-engineer pixels — the way ARIA serves a screen reader. The refinement: it exposes actions and intents (verbs), not just widgets, so it's "a11y plus a task API."

There's a clean symmetry in the pairing, too: micro frontends are about presentation (output), accessibility is about operability (input) — which is exactly the "UI in, tools out" split this whole post hangs on.

Why this matters for developers

A few things change in practice. You write the interface once and run it across agents: one MCP App works in Claude, ChatGPT, VS Code, and Goose with no client-specific code, and one WebMCP integration works with any MCP-compatible agent. You reuse what you already have, since WebMCP tools often wrap existing functions and MCP Apps are just web tech. The interface becomes a product surface again — instead of your value getting flattened into text the model paraphrases, you control the live UI and the exact capabilities you expose. And security and consent are first-class, through sandboxing, auditable messages, explicit approval, and browser mediation, rather than bolted on afterward.

Strategically, the agentic web shifts from scraping done to you toward capabilities you publish. Teams that move early get to shape their own agent experience — how they're discovered, how conversion flows feel, how accessible they are. For content and commerce sites in particular, that means exposing first-party tools (search, subscribe, browse, checkout) to agents on your own terms, rather than being navigated blindly by automation.

How to start (briefly)

For MCP Apps as a server author: install @modelcontextprotocol/ext-apps, start from one of the examples in the ext-apps repo (threejs, map, pdf, system-monitor, sheet-music), serve a ui:// resource of bundled HTML/JS, add _meta.ui.resourceUri to your tool, wire up the App class (connect, ontoolresult, callServerTool, updateModelContext), and test in a supporting client like Claude or VS Code Insiders.

For WebMCP as a web developer: feature-detect with if ("modelContext" in navigator), call provideContext({ tools }) (or registerTool) wrapping your existing page functions in execute, gate sensitive actions behind agent.requestUserInteraction(), offload heavy tools to workers, and try it in Chrome's origin trial.
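The feature-detection step can be wrapped in a small helper so pages degrade cleanly in browsers without WebMCP. The sketch takes a navigator-like object as a parameter purely so it can run outside a browser. exposeTools, fakeNavigator, and the search tool are illustrative, not part of the API.

```javascript
// Takes a navigator-like object so the same function can be exercised outside a browser.
function exposeTools(nav, tools) {
  // No WebMCP support: do nothing and let the page work as before.
  if (!nav || !("modelContext" in nav)) return false;
  nav.modelContext.provideContext({ tools });
  return true;
}

// Stand-in for a WebMCP-capable navigator; it just remembers what was provided.
const fakeNavigator = {
  modelContext: {
    provided: null,
    provideContext(context) {
      this.provided = context;
    },
  },
};

const siteTools = [
  {
    name: "search", // hypothetical tool
    description: "Search the catalog",
    inputSchema: { type: "object", properties: { query: { type: "string" } }, required: ["query"] },
    execute: ({ query }) => ({ content: [{ type: "text", text: `Results for ${query}` }] }),
  },
];

console.log(exposeTools({}, siteTools));            // false: API not present
console.log(exposeTools(fakeNavigator, siteTools)); // true: tools registered
```

In a real page the call would be exposeTools(navigator, siteTools), with the tools wrapping functions the frontend already has.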

Risks and open questions

Running third-party UI or code inside trusted hosts expands the supply-chain and prompt-injection surface — vet your servers. WebMCP's discoverability and no-headless constraints are genuinely unsolved, with manifests still TBD. The standards are young: WebMCP is a draft, and browser support is uneven. UX consistency between human-driven and agent-driven state takes real care. And there's always fragmentation risk if vendors diverge before specs settle.

Takeaway

2026 is the year agents got an interface layer. MCP Apps brings UI in; WebMCP pushes tools out; both are grounded in MCP and both are designed to converge. Pick the doorway that matches what you build — and start exposing your product to agents on your own terms.

Resources

MCP Apps announcement & docs — blog.modelcontextprotocol.io, modelcontextprotocol.io/docs/extensions/apps
@modelcontextprotocol/ext-apps — GitHub examples
WebMCP proposal / spec — webmachinelearning.github.io/webmcp
