Ask Runable forDesign-Driven General AI AgentTry Runable For Free
Runable
Back to Blog
Technology11 min read

200,000 MCP servers expose a command execution flaw that Anthropic calls a feature | VentureBeat

OX Security confirmed arbitrary command execution on six live platforms and estimates 200,000 MCP servers are exposed. Here's how to audit your deployments.

TechnologyInnovationBest PracticesGuideTutorial
200,000 MCP servers expose a command execution flaw that Anthropic calls a feature | VentureBeat
Listen to Article
0:00
0:00
0:00

200,000 MCP servers expose a command execution flaw that Anthropic calls a feature | Venture Beat

Overview

200,000 MCP servers expose a command execution flaw that Anthropic calls a feature

Anthropic created the Model Context Protocol as the open standard for AI agent-to-tool communication. Open AI adopted it in March 2025. Google Deep Mind followed. Anthropic donated MCP to the Linux Foundation in December 2025. Downloads crossed 150 million. Then four researchers at OX Security found an architectural problem that affects all of them.

Details

MCP's STDIO transport, the default for connecting an AI agent to a local tool, executes any operating system command it receives. No sanitization. No execution boundary between configuration and command. A malicious command returns an error after the command has already run. The developer toolchain raises no flag.

OX Security researchers Moshe Siman Tov Bustan, Mustafa Naamnih, Nir Zadok and Roni Bar scanned the ecosystem and found 7,000 servers on public IPs with STDIO transport active — and estimate 200,000 total vulnerable instances extrapolated from that ratio. They confirmed arbitrary command execution on six live production platforms with paying customers. The research produced more than 10 CVEs rated high or critical across Lite LLM, Lang Flow, Flowise, Windsurf, Langchain-Chatchat, Bisheng, Docs GPT, GPT Researcher, Agent Zero, Letta AI and others.

Kevin Curran, IEEE senior member and professor of cybersecurity at Ulster University, independently told Infosecurity Magazine the research exposed "a shocking gap in the security of foundational AI infrastructure."

Anthropic confirmed the behavior is by design and declined to modify the protocol — characterizing STDIO's execution model as a secure default and input sanitization as the developer's responsibility. That characterization comes from OX; the only word Anthropic explicitly stated on the record is "expected." Anthropic has not issued a standalone public statement and did not respond to Venture Beat's request for comment.

OX says expecting 200,000 developers to sanitize inputs correctly is the problem. Anthropic's strongest technical counter: sanitizing STDIO would either break the transport or move the payload one layer down. Both positions are technically coherent. The question is what to do while that debate plays out.

Every major outlet covered the disclosure. None built the prescriptive product-by-product audit a security director needs to triage her own MCP deployments. This piece does.

Five questions determine whether your MCP deployments are exposed, whether your patches hold, and what to do Monday morning.

OX identified four exploitation families. Unauthenticated command injection through AI framework web interfaces, demonstrated against Lang Flow and Lite LLM. Hardening bypasses in tools that implemented command allowlists, demonstrated against Flowise and Upsonic, where OX bypassed the allowlist through argument injection (npx -c). Zero-click prompt injection in AI coding IDEs, where malicious HTML modifies local MCP configuration files. Windsurf (CVE-2026-30615) was the only IDE where exploitation required zero user interaction, though Cursor, Claude Code, and Gemini-CLI are all vulnerable to the broader family. And malicious package distribution through MCP registries, where OX submitted a benign proof-of-concept to 11 registries, and nine accepted it without security review.

Carter Rees, VP of AI and Machine Learning at Reputation and member of the Utah AI Commission, told Venture Beat the framing needs to change entirely. "MCP stdio is a privileged execution surface, not a connector. Enterprise teams should treat it like production shell access. Deny by default, allowlist, sandbox and stop assuming downstream input validation will hold at scale," Rees said.

The IDE family deserves particular attention because it hits developer workstations, not servers. A developer who visits an attacker-controlled website can trigger a modification to their local MCP configuration file — and in Windsurf's case, the change executes immediately with no approval prompt. Cursor, Claude Code and Gemini-CLI require some form of user interaction, but if the UI presents a configuration change without surfacing the execution consequence, clicking 'approve' does not constitute informed consent.

Some did. Some partially. Some have not confirmed. The matrix below maps each affected product against the exploitation family, patch state, and the gap that remains. The critical column is "Protocol fix?" Every row says no.

Lite LLM is fixed. New STDIO configs outside Lite LLM inherit the same insecure default.

Lite LLM is fixed. New STDIO configs outside Lite LLM inherit the same insecure default.

Pin to v 1.83.7-stable or later (CVE-2026-30623). Verify against Git Hub advisory. Audit all other STDIO definitions.

Pin to v 1.83.7-stable or later (CVE-2026-30623). Verify against Git Hub advisory. Audit all other STDIO definitions.

Auth token freely available via public endpoint. STDIO executes whatever follows.

Auth token freely available via public endpoint. STDIO executes whatever follows.

Allowlist gives false confidence. OX bypassed it. Trivial.

Allowlist gives false confidence. OX bypassed it. Trivial.

Do not rely on command allowlists. Enforce process-level sandbox isolation.

Do not rely on command allowlists. Enforce process-level sandbox isolation.

Only an IDE with a true zero-interaction exploit. Hits developer workstations, not servers.

Only an IDE with a true zero-interaction exploit. Hits developer workstations, not servers.

Disable automatic MCP server registration. Review all active configs manually.

Disable automatic MCP server registration. Review all active configs manually.

User interaction required, but config-change UI does not surface execution consequence. Approval does not equal informed consent.

User interaction required, but config-change UI does not surface execution consequence. Approval does not equal informed consent.

Audit MCP config files (~/.cursor/mcp.json, equivalent paths). Disable auto-registration. Review all pending config changes before approval.

Audit MCP config files (~/.cursor/mcp.json, equivalent paths). Disable auto-registration. Review all pending config changes before approval.

Downstream chatbot framework inherits the same STDIO default. Patch status unconfirmed.

Downstream chatbot framework inherits the same STDIO default. Patch status unconfirmed.

Inventory all Langchain-Chatchat deployments. Sandbox from host OS. Monitor vendor advisory for patch.

Inventory all Langchain-Chatchat deployments. Sandbox from host OS. Monitor vendor advisory for patch.

Registries lack submission security review. Install and risk a backdoor.

Registries lack submission security review. Install and risk a backdoor.

Use registries with documented submission review. Audit installs against known-good hashes.

Use registries with documented submission review. Audit installs against known-good hashes.

Yes. Every product-level patch in the matrix addresses the specific entry point in that product. None of them changes the MCP protocol's STDIO behavior. A security director who patches Lite LLM today and configures a new MCP STDIO server tomorrow will inherit the same insecure default on the new server. The patches are necessary. They are not sufficient.

This was predictable. When Venture Beat first reported on MCP's security flaws in January, Merritt Baer, chief security officer at Enkrypt AI and former deputy CISO at AWS, warned: "MCP is shipping with the same mistake we've seen in every major protocol rollout: insecure defaults. If we don't build authentication and least privilege in from day one, we'll be cleaning up breaches for the next decade." The Cloud Security Alliance independently confirmed OX's findings in a separate research note and recommended organizations treat MCP-connected infrastructure as an active, unpatched threat. The defaults did not change. The attack surface grew.

Rees argued that Anthropic's position, while internally consistent, does not survive contact with enterprise reality. "It stops being a developer mistake and starts being a distributed failure mode when the same class of failure reproduces across that many independent implementations," he told Venture Beat. "Guidance is not an architectural control. Relying on thousands of downstream implementers to consistently interpret a trust boundary is a known anti-pattern in enterprise security."

Anthropic updated its SECURITY.md file nine days after OX's initial contact in January 2026 to note that STDIO adapters should be used with caution, but made no architectural changes. The researchers' assessment of that update: "This change didn't fix anything."

Rees took a more measured view. "It's worth giving Anthropic credit where it's due," he told Venture Beat. "After the disclosure, they updated their security guidance to recommend caution with stdio adapters. That's a meaningful step even if researchers argue it falls short of a protocol-level fix."

Nothing architectural. Anthropic has not implemented manifest-only execution, a command allowlist in the official SDKs, or any other protocol-level mitigation. OX recommended all three. The SECURITY.md guidance update was the only change. OX's research began in November 2025 and included more than 30 responsible disclosure processes across the ecosystem before the April 15 publication.

The disagreement is substantive. Anthropic's architectural argument deserves its full weight. STDIO is a local subprocess transport designed to launch processes on the machine that configured it. The trust boundary, in Anthropic's model, sits with whoever controls the configuration file. If you can write to the MCP config, you are by definition someone authorized to execute commands on that machine. Under that logic, what looks like command injection is a feature working as intended. Restricting what STDIO can launch at the protocol level would either break the transport's core function, since its purpose is to launch arbitrary local processes, or displace the attack surface into the launched process itself. The unopinionated-standard argument is also defensible: a universal protocol that hard-codes execution constraints stops being universal. OX's counter, from their advisory: "Shifting responsibility to implementers does not transfer the risk. It just obscures who created it."

Do not wait for a protocol-level fix. Treat every MCP STDIO configuration as an untrusted input surface, regardless of which product it sits inside.

Enumerate. Identify every MCP server deployment across dev, staging, and production. Search for MCP configuration files (mcp.json, mcp_config.json) in developer home directories and IDE config paths (~/.cursor/, ~/.codeium/windsurf/, ~/.config/claude-code/). List running processes that match MCP server binaries. Flag any using STDIO transport with public IP accessibility. OX found 7,000 on public IPs. Your environment may have instances you do not know about.

Patch. Pin every affected product to its patched release. Lite LLM v 1.83.7-stable includes the fix for CVE-2026-30623. Docs GPT, Flowise, and Bisheng have also shipped fixes. Windsurf and Langchain-Chatchat remain in reported state as of May 1, 2026. Cursor was patched against an earlier related disclosure (CVE-2025-54136) but inherits the same protocol default. Check each vendor's advisory in the morning you execute this step.

Sandbox. Isolate every MCP-enabled service from the host operating system. Never give a server full disk access or shell execution privileges. The Flowise/Upsonic allowlist bypass proves that restricting commands alone is not enough.

Audit registries. Review every MCP server installed from a third-party registry. Nine of 11 registries accepted OX's proof-of-concept without a security review. Use registries with documented submission review processes. Remove any MCP server whose origin you cannot verify.

Treat STDIO config as untrusted. This step survives every future patch and every future product. The protocol-level default has not changed. Every STDIO server definition is a command execution surface. Treat it the same way you treat user input to a database query: assume it is hostile until validated.

Anthropic and OX Security disagree on where the responsibility for securing MCP's STDIO transport belongs. That disagreement will not be resolved this week. What can be resolved this week is whether your MCP deployments are enumerated, patched, sandboxed, and treated as the untrusted execution surfaces they are.

As Rees put it: "The core question here is architectural policy, not exploit payloads." Baer warned in January that insecure defaults would produce exactly this outcome. OX documented 200,000 servers running with a configuration field that doubles as an execution surface. The protocol's designer says it is working as intended. Your Monday morning question is not who is right. It is which of your servers are exposed.

Deep insights for enterprise AI, data, and security leaders

By submitting your email, you agree to our Terms and Privacy Notice.

Key Takeaways

  • 200,000 MCP servers expose a command execution flaw that Anthropic calls a feature

  • Anthropic created the Model Context Protocol as the open standard for AI agent-to-tool communication

  • MCP's STDIO transport, the default for connecting an AI agent to a local tool, executes any operating system command it receives

  • OX Security researchers Moshe Siman Tov Bustan, Mustafa Naamnih, Nir Zadok and Roni Bar scanned the ecosystem and found 7,000 servers on public IPs with STDIO transport active — and estimate 200,000 total vulnerable instances extrapolated from that ratio

  • Kevin Curran, IEEE senior member and professor of cybersecurity at Ulster University, independently told Infosecurity Magazine the research exposed "a shocking gap in the security of foundational AI infrastructure

Cut Costs with Runable

Cost savings are based on average monthly price per user for each app.

Which apps do you use?

Apps to replace

ChatGPTChatGPT
$20 / month
LovableLovable
$25 / month
Gamma AIGamma AI
$25 / month
HiggsFieldHiggsField
$49 / month
Leonardo AILeonardo AI
$12 / month
TOTAL$131 / month

Runable price = $9 / month

Saves $122 / month

Runable can save upto $1464 per year compared to the non-enterprise price of your apps.