DEF CON2025

Invitation is All You Need: TARA for Targeted Promptware Attack against Gemini-Powered Assistants

DEFCONConference1,524 views45:366 months ago

This talk demonstrates a novel class of attacks called 'Promptware' that exploits LLM-powered assistants by injecting malicious instructions into shared resources like calendar invitations. By poisoning the LLM's context window, attackers can trigger unauthorized actions across integrated applications, including IoT device control, file exfiltration, and unauthorized video conferencing. The researchers show how these attacks can be chained to perform lateral movement between different agents and tools within the Google Workspace ecosystem. The presentation concludes with a threat analysis and risk assessment (TARA) highlighting the high-criticality of these vulnerabilities.

Why Your LLM-Powered Assistant is a Trojan Horse for Your Entire Digital Life

TLDR: Researchers at DEF CON 2025 demonstrated that LLM-powered assistants like Google Gemini can be weaponized through "Promptware"—malicious instructions hidden in shared data like calendar invites. By poisoning the LLM's context window, an attacker can force the assistant to perform unauthorized actions, including controlling IoT devices, exfiltrating files, and initiating video calls. This research highlights that the real risk isn't just data leakage, but the ability of an attacker to move laterally across the entire user ecosystem.

Security researchers have spent the last year obsessing over prompt injection as a way to make chatbots say naughty things or leak system prompts. That is the amateur league. The real danger, as demonstrated by the team behind the "Invitation is All You Need" research, is when an LLM is granted agency over your local environment. When you connect an LLM to your calendar, email, and IoT devices, you are not just building a productivity tool; you are building a remote control for your life that an attacker can hijack with a single, well-crafted calendar invite.

The Mechanics of Context Poisoning

The core of this attack is context poisoning. Modern LLM-powered assistants, such as Gemini for Workspace, do not just process the current user prompt. They ingest a massive amount of background data—emails, calendar events, and files—to provide relevant answers. This is the "context window." If an attacker can inject malicious instructions into a piece of data that the LLM will eventually read, they can manipulate the assistant's behavior.

The researchers demonstrated this by sending a calendar invitation to a victim. The invite contained hidden instructions, effectively a payload, that told the LLM to ignore previous instructions and execute a specific, malicious task. Because the LLM treats the calendar event as a trusted source of information, it executes the payload as if it were a legitimate user request. This is a classic OWASP A03:2021-Injection scenario, but applied to the semantic layer of an application rather than a database query.

Lateral Movement and Tool Chaining

What makes this research brilliant is the demonstration of lateral movement. The researchers showed that an attacker doesn't need to compromise the LLM model itself. Instead, they exploit the "orchestrator" logic that allows the LLM to call external tools.

In their demo, they used a calendar invite to trigger a chain of events. First, the LLM reads the invite and is "jailbroken" by the payload. Then, the LLM is instructed to use its access to other tools. The researchers successfully chained tool invocations to:

Open a specific URL in the user's browser to exfiltrate data.
Initiate a Zoom call to video stream the user.
Toggle IoT devices like smart boilers and lights.

This is not just a theoretical risk. If an assistant has the permission to "read emails" and "create calendar events," it has the permission to act on your behalf. By manipulating the assistant, the attacker effectively gains the user's identity within the Google Workspace ecosystem. You can find more on the risks of these integrations in the official Google security advisory regarding their layered defense approach.

The Pentester’s Perspective

For those of us conducting red team engagements, this changes the scope of our work. We are no longer just looking for XSS or broken access control on a web application. We are looking for the "agentic" capabilities of the applications we test.

If you are testing an application that integrates with an LLM, ask yourself:

What tools does the LLM have access to?
Does the application sanitize the data that the LLM ingests from external sources?
Is there a human-in-the-loop requirement for sensitive actions, like opening a URL or sending an email?

If the answer to the last question is "no," you have a high-criticality finding. The impact is not just a data breach; it is the potential for full account takeover and physical-world consequences. During an engagement, try to inject payloads into any data source the LLM might read—calendar invites, shared documents, or even public-facing web pages that the LLM might crawl.

Defensive Realities

Defending against this is difficult because it requires a shift in how we think about trust. You cannot simply "patch" prompt injection. Google’s approach, which involves content classifiers and user confirmation for sensitive actions, is a start, but it is not a silver bullet.

Defenders should focus on the principle of least privilege. If an LLM-powered assistant does not need access to your smart home devices, do not grant it that permission. Furthermore, implement strict input validation on all data that enters the LLM's context window. Treat any data coming from an external source as untrusted, regardless of whether it is a text file or a calendar event.

The era of "genius toddlers"—LLMs that are incredibly powerful but easily manipulated—is here. As researchers, our job is to map the boundaries of these systems before the attackers do. This research is a wake-up call that the next generation of vulnerabilities will not be found in the code we write, but in the instructions we give to the models we trust. Keep digging into these agentic workflows; the most interesting bugs are currently hiding in the gaps between the LLM and the tools it controls.

Talk Type

research presentation

Difficulty

advanced