How MCP enables experimentation across AI tools

Jun 2, 2026

Experiment context has always been stuck in one place: the dashboard.

If you wanted to check flag status, pull results, or configure a rollout, you left your editor, opened a browser, and navigated to a separate tool. The Model Context Protocol changes that by giving your experimentation platform a way to travel — connecting it to whatever AI tool you're already using, so the data comes to you instead of the other way around.

This article is for developers, PMs, and experimentation teams who want to understand how MCP experimentation actually works and what it changes about day-to-day workflows. It's written for people who are new to MCP but already familiar with running feature flags and A/B tests. Here's what you'll learn:

What MCP is and why experimentation platforms are a natural fit for it
Why context switching between your editor and your experiment dashboard costs more than it seems
How a single MCP server makes your experiment data portable across Cursor, Claude, VS Code, and any other compatible AI tool
Who beyond developers can now participate in the experimentation lifecycle — and how permissions stay intact
What the workflow actually looks like in practice, from creating a flag with a plain-English prompt to querying results without leaving your editor

The article moves from concept to consequence to practice. It starts with the architecture, explains the real friction it replaces, and ends with concrete examples of what changes once MCP is configured. No integration project required — just a clearer picture of why experiment context no longer has to live in a single tab.

MCP breaks the information silo that kept experiment context in dashboards

The Model Context Protocol landed quietly on November 25, 2024, when Anthropic open-sourced it alongside a straightforward description: a new standard for connecting AI assistants to the systems where data lives.

That framing undersells what it actually changes. For teams running experiments and managing feature flags, MCP is the first architectural answer to a problem that has kept experiment context locked inside dashboards and away from the tools where development actually happens.

A protocol, not a product

MCP is a client-server protocol. On one side, MCP servers expose data and capabilities — a feature flag system, an experiment results database, a code repository. On the other side, MCP clients — AI tools like Claude, Cursor, or VS Code — connect to those servers and gain access to whatever the server exposes.
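To make the client-server split concrete, here is a minimal Python sketch of two messages an MCP client might send to a server. MCP messages are JSON-RPC 2.0, and `tools/list` and `tools/call` are method names from the MCP specification. The tool name `get_experiments` and its `project` argument are hypothetical, chosen only for illustration, and do not describe any particular vendor's server.

```python
import json
from itertools import count

_ids = count(1)  # each request in a session gets its own id


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request as a plain dict."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Call one tool by name. The tool name and arguments are hypothetical.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "get_experiments", "arguments": {"project": "checkout"}},
)

if __name__ == "__main__":
    print(json.dumps(list_tools, indent=2))
    print(json.dumps(call_tool, indent=2))
```

In practice the AI client builds and sends these messages for you; the sketch only shows the shape of what travels between the two sides.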

The official MCP documentation offers the cleanest analogy: think of it like a USB-C port for AI applications. Just as USB-C standardized how devices connect to peripherals, MCP standardizes how AI tools connect to external systems.

The protocol is open-source and vendor-neutral. Anthropic built it, but no one needs Anthropic's permission to implement it. The principle baked into the spec is "build once, integrate everywhere" — a server built to the MCP standard works with any MCP-compatible client, regardless of who made it.

One distinction worth understanding: MCP lets you add new connections to an AI tool without rebuilding or redeploying the tool itself. Think of it like adding a new app to your phone: you don't have to reinstall the operating system.

Traditional API integrations don't work this way; they're wired together when the software is built, which means only developers can extend them. With MCP, a user can connect a new data source (say, a feature flag system) to their AI tool through a configuration file, and the tool discovers what the server offers when it connects. Some clients need a reload or restart before a newly added server shows up. That's what makes MCP extensible by users, not just engineers.

The fragmentation problem it replaces

Before MCP, connecting an AI tool to an external system meant building a custom integration. Every pairing — Claude to your experiment platform, Cursor to your feature flag system, a new internal AI tool to your analytics database — required its own bespoke connector.

Anthropic described the resulting state directly: AI models "trapped behind information silos", with "every new data source requir[ing] its own custom implementation, making truly connected systems difficult to scale."

Practitioners describe this as the M×N problem. If you have M AI tools and N data sources, you potentially need M×N custom integrations to connect them all. MCP collapses that to M+N: each data source builds one MCP server, each AI tool implements one MCP client, and they interoperate. The protocol handles the handshake; the teams on each side only need to implement it once.
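The arithmetic behind that claim can be sketched in a few lines of Python. The function names are illustrative only, and the counts are integration endpoints to build, not a measure of effort.

```python
def custom_integrations(num_tools: int, num_sources: int) -> int:
    """Point-to-point: every AI tool needs its own connector to every data source."""
    return num_tools * num_sources


def mcp_implementations(num_tools: int, num_sources: int) -> int:
    """Shared protocol: one MCP client per tool plus one MCP server per source."""
    return num_tools + num_sources


if __name__ == "__main__":
    for tools, sources in [(2, 2), (4, 6), (10, 20)]:
        print(
            f"{tools} tools x {sources} sources: "
            f"{custom_integrations(tools, sources)} custom integrations vs "
            f"{mcp_implementations(tools, sources)} MCP implementations"
        )
```

With two tools and two sources the counts are equal (4 and 4); the gap opens as either side grows, for example 24 versus 10 for four tools and six sources.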

It's worth being precise about what MCP does and doesn't solve. The protocol layer becomes M+N. Authentication and authorization, as practitioners have noted, remain more complicated — each MCP server still needs to handle credentials and permissions according to what it's connecting to. MCP doesn't eliminate that complexity, but it does eliminate the need to reinvent the connection layer every time.

Experimentation platforms hold exactly the structured data MCP was designed to expose

Experimentation platforms hold structured, queryable data: experiment configurations, feature flag states, metric results, and statistical significance. That is the context an AI tool needs to be genuinely useful during development, and it is exactly the kind of external system MCP was designed to connect.

Without a standard protocol, that data stayed in dashboards. Developers who wanted AI assistance during experiment QA or analysis had to manually copy context across tools, or wait for someone to build a one-off integration.

GrowthBook claims to have shipped the first production MCP server for feature management and experimentation in early 2025, and that server is listed on the official MCP Registry. Whether or not it was technically first, the implementation makes the architectural point concrete: an experimentation platform can now expose its full context — flags, experiments, results — to any MCP-compatible AI tool through a single server, without building separate connectors for Cursor, Claude, VS Code, or whatever AI client a team uses next year.

That's the structural shift. Experiment context stops being a destination developers navigate to and starts being context that travels with them.

The real cost of leaving your editor to manage experiments

Every developer who has run an A/B test knows the rhythm: write the code, switch to the experiment platform to configure the flag, switch to the analytics dashboard to check results, then switch back to the editor to act on what you found.

Each of those transitions feels like a minor inconvenience in isolation. Accumulated across a sprint, they represent something more corrosive — a constant interruption of the mental state where actual work happens.

The fragmented workflow developers actually live in

The traditional experimentation lifecycle is built around a centralized hub: a dashboard that owns the full workflow from ideation through implementation, measurement, and sharing. That architecture made sense when experimentation was a specialized activity handled by a dedicated team. It makes less sense when developers are expected to instrument, monitor, and iterate on experiments as a routine part of shipping code.

The fragmentation isn't a GrowthBook-specific design choice or an Optimizely-specific quirk. It's structural to how experimentation platforms have been built — as standalone web applications that sit outside the development environment.

Research on pre-MCP AI integrations describes the same pattern at the protocol level: each tool requires "bespoke interface definitions, authentication handling, and execution logic", with no shared standard connecting them to the environments where developers spend their time. The experiment platform and the editor exist in separate worlds, and moving between them is the developer's problem to manage.

The real cost of the context-switching tax

The cost isn't just the time it takes to open a browser tab. It's the interruption of flow state — the cognitive overhead of re-orienting to a different interface, finding the right experiment, interpreting the data, and then carrying that context back to the editor where a decision needs to be made. Every switch adds latency between "I want to know this" and "I can act on this."

Consider a developer mid-QA, not sitting at their primary workstation, who wants to know how their running experiments are performing. The old path: stop what you're doing, open a browser, navigate to the dashboard, authenticate, locate the right experiment, and parse the results interface.

The new path with MCP experimentation: ask a natural-language question in whatever AI tool is already open and get a full breakdown without opening the dashboard. The contrast isn't subtle. One workflow respects where the developer's attention already is; the other demands that attention be redirected.

GrowthBook's product page captures the before/after with unusual directness: "No context switching, just a plain english prompt to create features and experiments. Works with Cursor, Windsurf, and other AI tools." That framing isn't marketing abstraction — it's a description of a workflow inversion.

MCP inverts the expectation — experiment context comes to you

The structural shift MCP enables is simple to state but significant in practice: instead of developers going to the experiment platform, the experiment platform comes to wherever the developer already is. The IDE, the AI coding assistant, the Claude Desktop window open during QA — any MCP-compatible client becomes a full interface for the experimentation workflow, including flag creation, targeting configuration, and result querying.

This isn't a UI improvement. It's a change in where the workflow lives. When experiment context is available as a tool any compatible AI client can call, the developer's primary environment stops being a place they have to leave in order to manage experiments.

The question "how are my experiments doing?" becomes answerable in the same context where the code that runs those experiments was written — without a browser, without a dashboard, without a context-switching tax.

For teams trying to move faster, that architectural change matters more than any individual feature. Iteration speed is a function of how quickly a team can close the loop between shipping code and understanding its effect. Every unnecessary transition in that loop is a tax on velocity. MCP doesn't just reduce that tax — it restructures who pays it and when.

How MCP makes experiments portable across any AI tool

The promise of MCP experimentation only holds if it works in the tools your team actually uses — not just the one you happened to configure first. This is where the protocol's architectural design does the real work. Portability isn't a feature someone added on top of MCP; it's a direct consequence of how the standard is built.

The open standard architecture behind portability

Optimizely describes MCP as "a universal translator between your AI tool and the platforms you use every day. Instead of building custom integrations for every AI client, MCP gives us a common language." That framing is accurate and useful — the protocol is the shared grammar, and any tool that speaks it can participate.

The durability of this standard matters too. In December 2025, Anthropic donated MCP to the Agentic AI Foundation under the Linux Foundation, co-founded with Block and OpenAI, with backing from AWS, Google, Microsoft, Bloomberg, and Cloudflare. This is no longer a single vendor's protocol — it's vendor-neutral, community-governed infrastructure. Teams adopting MCP today are not betting on a proprietary standard that could be deprecated or locked down.

The ecosystem of compatible AI clients

The client ecosystem has grown to the point where the question is no longer "does my AI tool support MCP?" but "which of my AI tools should I configure first?" Claude, ChatGPT, Google Gemini, Cursor, Windsurf, and Sourcegraph Cody all implement the standard. The broader ecosystem has scaled to more than 17,000 community servers and over 97 million monthly SDK downloads across Python and TypeScript alone.

The practical implication is compounding: an investment in a single MCP server for your experimentation platform becomes more valuable as the client ecosystem grows. The server you configure today will reach AI tools that don't exist yet, as long as they implement the protocol — which, given the governance structure and adoption trajectory, is increasingly the default expectation for any serious AI development environment.

What "no bespoke connectors" means in practice

This is where the architecture becomes concrete. GrowthBook's MCP server connects to Cursor, VS Code, Claude Code, Claude Desktop, Windsurf, and Cline through a single configuration — one npx -y @growthbook/mcp@latest command and two environment variables for cloud users.

There is no separate engineering work per client, no custom API integration to maintain for each tool, and no divergence between what one client can access versus another.
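The single configuration described above can be sketched as a small Python helper that prints a config entry. The `mcpServers` layout follows the convention used by several MCP clients, and the `npx` arguments come from the command quoted above. The environment variable names `GB_API_KEY` and `GB_EMAIL` are assumptions, as is the file each client reads the entry from, so check both against the GrowthBook MCP documentation before relying on them.

```python
import json


def growthbook_mcp_config(api_key: str, email: str) -> dict:
    """Return an MCP client config entry that launches the server via npx."""
    return {
        "mcpServers": {
            "growthbook": {
                "command": "npx",
                "args": ["-y", "@growthbook/mcp@latest"],
                # Assumed variable names; confirm them in the GrowthBook docs.
                "env": {"GB_API_KEY": api_key, "GB_EMAIL": email},
            }
        }
    }


if __name__ == "__main__":
    # Placeholder credentials; never commit real keys to a shared config file.
    config = growthbook_mcp_config("YOUR_API_KEY", "you@example.com")
    print(json.dumps(config, indent=2))
```

The same printed block can be pasted into each client's MCP configuration, which is the practical meaning of "one server, many clients."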

The contrast with the pre-MCP alternative is stark. Before this, giving an AI assistant access to your feature flags or experiment results meant building a custom integration — typically a webhook or API wrapper — for each tool, each requiring its own authentication handling, error management, and ongoing maintenance as both the AI tool and the experimentation platform evolved. That work multiplied with every new AI tool the team wanted to adopt.

Practitioners who have made the switch describe the shift in terms of freedom rather than just efficiency. As one engineer put it: "You are not tied to one tool. Your context goes where you go. Whether it's ChatGPT, Claude Code, Windsurf, or Codex. I now worry less about switching model providers. It's really liberating."

That's not a productivity claim — it's an architectural one. When your experiment context is attached to a protocol rather than a product, switching AI tools stops being a migration project and starts being a configuration change.

The same logic applies regardless of which experimentation platform you're using. Optimizely's MCP server connects to ChatGPT, Claude, and Cursor through the same mechanism. The portability is a property of the standard, not of any single vendor's implementation.

From developer tool to team-wide capability: who MCP unlocks for experimentation

For most teams, experimentation has always been developer-gated by default. Not because product managers and experimentation leads lack the judgment to run tests — they often have more context on the business question than anyone — but because the tooling required it.

Creating a flag meant opening a dashboard. Pulling results meant waiting for a data pull or filing a request. MCP changes that equation, but only if the implementation extends beyond the IDE.

The IDE-first limitation early MCP integrations inherited

The first wave of MCP integrations for experimentation was built around code editors. GrowthBook's MCP server, for example, is explicitly positioned around IDE use — Cursor, Windsurf, VS Code, Claude Code — with the value proposition framed as eliminating context switching for developers already living in those environments.

That framing is accurate, and the productivity gains are real. But it also means the benefit was narrow. A developer could query experiment status without leaving their editor. A product manager working in a browser-based AI tool could not.

This isn't a failure of the early implementations — local MCP was the natural starting point, and developer workflows were the most tractable integration surface. But it created a new version of the old gatekeeping problem: the people closest to the business questions still couldn't access experiment data without going through someone else or navigating a separate interface.

Remote MCP servers expand who's at the table

Optimizely's Remote MCP Server launch in April 2026 is the clearest evidence that the industry recognized this limitation and moved to address it. The launch extended MCP access beyond local IDE setups to AI clients that connect over the network, including Claude, ChatGPT, and Cursor, and it explicitly targeted product managers, program managers, and experimentation teams, not just developers.

The capability set available through the remote server is substantive, not a stripped-down read-only view. Through natural language, non-technical users can query running experiments, retrieve results, compare flag configurations across environments, manage audiences, and create new experiments — without API knowledge and without a code editor open.

The same experimentation data and actions available to a developer in their IDE become available to a PM in Claude. That's the structural shift: it's not just that the interface changed, it's that the prerequisite skillset for participation dropped significantly.

GrowthBook's own documentation frames this goal directly, describing experimentation democratization as what happens when decentralized teams are each empowered to design and start their own experiments. The mechanism for achieving that at scale — remote MCP access for non-technical users — is still emerging across the industry, but the direction is clear.

OAuth inheritance means broader access without governance erosion

The obvious concern with democratizing experimentation access is quality and control. GrowthBook's docs acknowledge this tension explicitly: decentralized experimentation increases frequency but can become "the Wild West" without shared best practices and guardrails. More people touching experiments means more surface area for misconfigured flags, underpowered tests, and conflicting rollouts.

Optimizely's Remote MCP Server addresses this directly through OAuth authentication. Rather than creating a new permission layer, the server inherits existing platform permissions through Opti ID, Optimizely's unified identity system. Users can only execute through MCP what they're already authorized to do in the Optimizely UI.

A program manager who can view results but not create experiments in the platform has exactly the same constraints when working through Claude. The role-based access controls organizations already maintain carry over automatically — no new governance framework required.
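The idea of inherited permissions can be illustrated with a toy Python model in which the dashboard and the MCP entry point share one authorization check. This is a hypothetical sketch of the pattern, not Optimizely's implementation; the `User` type, the permission strings, and the handler names are all invented for illustration.

```python
from dataclasses import dataclass, field


class PermissionDenied(Exception):
    """Raised when a user attempts an action their role does not allow."""


@dataclass(frozen=True)
class User:
    name: str
    permissions: frozenset = field(default_factory=frozenset)


def authorize(user: User, action: str) -> None:
    """The single permission check shared by every entry point."""
    if action not in user.permissions:
        raise PermissionDenied(f"{user.name} may not {action}")


def handle_ui_request(user: User, action: str) -> str:
    authorize(user, action)
    return f"ui:{action}"


def handle_mcp_tool_call(user: User, action: str) -> str:
    # The MCP path resolves the same user (for example from an OAuth token)
    # and reuses authorize(), so it cannot grant more than the UI does.
    authorize(user, action)
    return f"mcp:{action}"


if __name__ == "__main__":
    pm = User("program_manager", frozenset({"view_results"}))
    print(handle_mcp_tool_call(pm, "view_results"))
    try:
        handle_mcp_tool_call(pm, "create_experiment")
    except PermissionDenied as err:
        print(f"denied: {err}")
```

Because both handlers call the same check, adding the MCP entry point adds a new way to ask, not a new set of rights.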

This is the structural argument for why MCP-enabled democratization is different from simply giving everyone a login to the experimentation dashboard. The permission model doesn't flatten; it extends. Teams get broader participation in the experimentation lifecycle without trading away the controls that prevent that participation from becoming chaotic. The question "is this just a developer productivity tool?" has a concrete answer: not anymore.

Tuesday afternoon with MCP: where experimentation work actually happens now

The protocol arguments and architecture diagrams only matter if the day-to-day workflow actually changes. For engineers and PMs who've followed the MCP conversation this far, the real question is concrete: what does Tuesday afternoon look like after you've adopted this?

The answer is a meaningful shift in where experimentation work happens — and how naturally it fits into work that was already in progress.

Creating and managing feature flags without leaving your editor

The most immediate change is flag creation. In a traditional workflow, wrapping a code change in a feature flag means stopping what you're doing, opening a browser, navigating to your experimentation platform, configuring targeting conditions and rollout rules through a UI, copying the flag key back into your editor, and resuming. It's not catastrophic, but it's enough friction that developers often defer it or skip it.

With GrowthBook's MCP integration, that sequence collapses into a prompt. You're mid-PR in Cursor or VS Code, you describe what you want in plain English — the feature, the targeting conditions, the rollout percentage — and the flag is created in GrowthBook without a browser tab opening.

The full flag lifecycle is accessible this way: creation, targeting rules, rollout adjustments, and kill switches. The integration works across Cursor, Windsurf, VS Code, and Claude Code from a single configuration, so the tool you're already in is the tool you use.

This matters especially in the context of AI-assisted development. GrowthBook has pointed out the risk of vibe shipping — writing AI-generated code quickly and deploying it without measurement. MCP closes that gap by making it just as easy to instrument code with a feature flag as it is to write the code in the first place.

Querying experiment results in natural language

The second shift is in how you access results. The traditional model requires navigating to a results dashboard, finding the right experiment, and interpreting a table of statistical outputs. MCP inverts this: instead of going to the data, you ask a question where you already are.

A developer doing a QA pass can ask their AI assistant how a specific experiment is performing on mobile, or whether a particular metric has moved since the last deployment, and receive a structured answer inline — without opening a separate application. GrowthBook's analysis runs directly against your own data warehouse — BigQuery, Snowflake, Redshift — which means the results surfaced through an MCP query reflect your organization's actual data, not a vendor-managed copy of it.

Optimizely's Remote MCP Server makes the same conversational capability explicit for non-technical users, who can "pull results in plain language — no API knowledge or code editor needed." The underlying mechanism is the same in both cases: the MCP server exposes experiment data as context the AI can reason over, and the user interacts with it conversationally.

AI-surfaced insights and the follow-up question

The qualitative shift goes further than just answering questions you already knew to ask. When experiment data is accessible conversationally, the AI can surface patterns proactively — metric correlations, win rate trends, performance differences across segments — that a developer scanning a dashboard might not think to look for.

GrowthBook's Insights capability tracks cumulative impact across experiments, win rates, metric correlations, and a learning library of past tests. When that data is accessible through an MCP interface, a developer can follow up on a surfaced pattern with an ad hoc question that would otherwise require a data scientist or a custom query. The character of experiment data interrogation changes: it becomes iterative and conversational rather than a one-time report pull.

Setup friction is minimal

None of this requires a significant integration project. GrowthBook's MCP server is entirely open source, and the configuration is a one-time setup — environment variables pointing to your GrowthBook instance, added to your AI tool's MCP configuration file. Once that's done, the same server works across every compatible client: Cursor, VS Code, Claude Code, Windsurf. There's no per-tool engineering, no separate onboarding for each AI assistant your team uses.

If Claude Code is your entry point, use the GrowthBook MCP Server for Claude Code setup guide to configure the server and verify the connection.

The contrast with traditional integration work is real. Custom API connectors, separate dashboard logins, and per-tool configurations are replaced by a single open-source server and a configuration block. The experimentation platform doesn't move — it just becomes reachable from wherever the work is already happening.

Assessing your stack and picking an entry point

The core argument of this article is simple: experiment context doesn't have to live in a separate tab. MCP makes it portable — attached to a protocol that travels with you across AI tools rather than locked inside a dashboard you have to navigate to. The workflow inversion is real, and the setup cost is low enough that the main question isn't whether to do it, but where to start.

Readiness is mostly about your AI tooling, not your experimentation platform

If your team is already using Cursor, VS Code, Claude, or any other MCP-compatible client, the client side is covered.

The question on the platform side is whether your experimentation tool exposes an MCP server — and if it does, whether it's local-only or remote. Local MCP servers work well for developers in IDEs. If you want PMs and experimentation leads to participate without a code editor, you need a remote server with OAuth inheritance. Know which use case you're solving for before you configure anything.

IDE integration and remote MCP serve different goals — start with the right one

If your primary goal is reducing context switching for developers — fewer browser tabs during QA, flag creation from inside a PR, natural language result queries mid-sprint — start with the IDE integration. It's a one-time configuration and the feedback loop is immediate.

If your goal is broader, such as giving product managers direct access to experiment data without filing requests or waiting for a data pull, the remote MCP path is the right entry point, but it requires your platform to support it. These aren't competing approaches; they're sequential. Most teams will start with IDE integration and expand from there as the remote ecosystem matures.
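The choice between the two entry points can be written down as a short decision function. This is a simplified sketch of the reasoning in this section; the function and parameter names are hypothetical, and a real decision would also weigh security review and team habits.

```python
def pick_entry_point(needs_non_developer_access: bool,
                     platform_has_remote_mcp: bool) -> str:
    """Suggest where to start, following the two goals described above."""
    if not needs_non_developer_access:
        # Goal: less context switching for developers inside their editor.
        return "local IDE integration"
    if platform_has_remote_mcp:
        # Goal: PMs and experimentation leads participate without an editor.
        return "remote MCP server"
    # The broader goal depends on platform support that is not there yet.
    return "local IDE integration now, remote MCP once the platform supports it"


if __name__ == "__main__":
    print(pick_entry_point(False, False))
    print(pick_entry_point(True, True))
    print(pick_entry_point(True, False))
```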

One end-to-end run makes the workflow shift concrete

The best way to understand what changes is to run one experiment end-to-end through an MCP interface — flag creation, targeting, and a result query — without opening a dashboard. The experience of doing it once makes the workflow shift concrete in a way that no architecture diagram does.

This article was written to give you a clear picture of what MCP experimentation actually is and what it changes — not to sell you on a vision, but to help you make a grounded decision about whether and how to adopt it.

What to do next:

If you're a developer: Configure GrowthBook's open-source MCP server in whichever AI tool you already use most — Cursor, VS Code, or Claude Code — and create one feature flag from a plain-English prompt. The end-to-end run takes less time than the context-switching it replaces.

If you're a PM or experimentation lead: Check whether your platform supports remote MCP with OAuth inheritance, and ask whether your existing role permissions would carry over. That single question will tell you how close you are to participating in the experimentation lifecycle without going through a developer or a dashboard.
