Black Hat2024

Living off Microsoft Copilot

Black Hat47,969 views42:06over 1 year ago

This talk demonstrates how Microsoft 365 Copilot can be weaponized for post-compromise activities, including data exfiltration and automated spear-phishing. The researcher highlights how Copilot's integration with enterprise data sources allows attackers to bypass traditional DLP controls and manipulate user perceptions. The presentation introduces 'PowerPwn,' a tool designed to automate these attack vectors within the Microsoft 365 ecosystem. The findings emphasize that jailbreaking an AI assistant with access to enterprise data is functionally equivalent to remote code execution.

Weaponizing Microsoft 365 Copilot for Automated Post-Compromise Operations

TLDR: Microsoft 365 Copilot integrates deeply with enterprise data, creating a massive attack surface for post-compromise activity. By leveraging prompt injection and manipulating the RAG (Retrieval-Augmented Generation) process, attackers can exfiltrate sensitive data and automate sophisticated spear-phishing campaigns. This research proves that for an AI assistant with enterprise access, a successful jailbreak is functionally equivalent to remote code execution.

Security researchers have spent the last year obsessing over LLM jailbreaks, mostly focusing on getting chatbots to output forbidden content or generate malware. While those are interesting, they miss the bigger picture for anyone doing real-world penetration testing. The real danger isn't a chatbot that tells you how to build a bomb; it is an AI assistant that has been granted read/write access to your entire corporate environment. When you give an LLM the ability to read your emails, access your SharePoint files, and send messages on your behalf, you are essentially handing over a set of keys to the kingdom.

The Mechanics of Copilot RAG Injection

At the heart of Microsoft 365 Copilot is a RAG architecture. When a user asks a question, Copilot searches the enterprise graph, retrieves relevant documents, and feeds them into the LLM context window to generate an answer. The vulnerability here is simple: the LLM cannot distinguish between trusted system instructions and untrusted data retrieved from the enterprise graph.

If an attacker can inject malicious instructions into a document, email, or Teams message that Copilot indexes, they can hijack the assistant's behavior. This is not just theoretical. During the research presented at Black Hat 2024, it was demonstrated that by injecting specific delimiters and instructions into a document, an attacker can force Copilot to ignore its original system prompt and follow the attacker's commands instead.

The tool released alongside this research, PowerPwn, automates this process. It allows a tester to perform reconnaissance by querying Copilot for information about the user's collaborators, recent email threads, and sensitive files. Once the context is established, the tool can craft a response that mimics the user's writing style, complete with emojis and professional tone, to send a malicious link or file to a target.

Why Jailbreak Equals RCE

In a traditional web application, RCE means executing arbitrary code on a server. In the context of an AI assistant, RCE means getting the assistant to perform actions on your behalf with your identity. If Copilot has a plugin that allows it to send emails or update records in a CRM, and you can trick it into using that plugin, you have achieved the functional equivalent of code execution.

The technical hurdle is bypassing the guardrails. Microsoft has implemented several layers of defense, including sensitivity labels and an "AI Watchdog" that monitors inputs and outputs for suspicious patterns. However, these defenses are often bypassed by encoding the malicious payload. For instance, encoding instructions in Base64 or using HTML tags to hide text from the user while keeping it visible to the LLM can effectively neutralize these controls.

Real-World Engagement Scenarios

During a red team engagement, you would typically start by gaining access to a low-privilege user account. Once inside, you don't need to hunt for local files or run Mimikatz immediately. Instead, you can use the user's access to Copilot to map the organization. You can ask Copilot to summarize recent projects, identify key stakeholders, or find documents containing specific keywords like "password" or "confidential."

The impact is significant because the actions taken by Copilot are performed under the user's identity. This makes detection extremely difficult for traditional security operations centers. If the assistant sends an email to a colleague, it appears as a legitimate communication from a trusted internal source. This is T1566-phishing and T1078-valid-accounts at their most efficient.

Defensive Strategies for Blue Teams

Defending against this requires a shift in how we think about data access. The principle of least privilege is more critical than ever. If a user doesn't need access to a specific SharePoint site, they shouldn't have it, because that site is now part of the potential context window for their Copilot instance.

Organizations should also leverage Microsoft Purview to apply sensitivity labels to sensitive documents. While not a silver bullet, these labels can prevent Copilot from surfacing certain files in response to unauthorized queries. Most importantly, security teams must start monitoring the audit logs for unusual Copilot activity, such as bulk data retrieval or unexpected external communications.

The era of "AI security" is just beginning, and we are all essentially working in a live clinical trial. The tools and techniques we use to break these systems today will define the security standards of tomorrow. If you are a researcher, stop looking for ways to make the AI say bad words and start looking for ways to make it do bad things. That is where the real impact lies.

Talk Type

talk

Difficulty

advanced