Black Hat2023

Perspectives on AI Hype and Security

Black Hat692 views38:32about 2 years ago

This panel discussion explores the intersection of generative AI, cybersecurity, and policy, focusing on the practical implications of integrating large language models (LLMs) into enterprise environments. The speakers analyze the security risks associated with LLMs, including prompt injection, data leakage, and the potential for AI to be used in automated social engineering and phishing campaigns. The panel emphasizes the need for rigorous evaluation, red teaming, and a shift in mindset to treat AI models as complex software components with unique attack surfaces. The discussion highlights the importance of bridging the gap between data science and traditional security practices to effectively manage these emerging threats.

Beyond the Hype: Why LLMs Are Just Another Attack Surface

TLDR: Large language models are not magic; they are complex software components that require the same rigorous security scrutiny as any other application. This panel at Black Hat 2023 highlights that while the hype cycle is in full swing, the real-world risks involve familiar vectors like injection and data leakage. Security professionals must stop treating AI as a black box and start applying traditional red teaming and evaluation methodologies to these systems.

Generative AI has reached a state of social contagion, and the security industry is currently struggling to separate the signal from the noise. Every vendor is rushing to integrate LLMs into their products, often without a clear understanding of the underlying attack surface. While the press fixates on existential threats and science-fiction scenarios, the actual risk is far more mundane: we are rapidly deploying complex, opaque software components into production environments with little to no security validation.

Treating Models as Software Components

The most critical takeaway from the recent Black Hat panel is the need to stop viewing AI models as mystical entities. They are software. They have inputs, they have outputs, and they have internal logic that can be manipulated. When you integrate an LLM into an application, you are essentially adding a new, highly complex interface that is susceptible to OWASP Top 10 A03:2021-Injection.

If you are a pentester, your engagement methodology needs to shift. You are no longer just looking for SQLi or XSS in the traditional sense. You are looking for ways to subvert the model's logic. This includes prompt injection, where an attacker crafts inputs to bypass safety filters or force the model to execute unauthorized actions. The challenge is that these models are often trained on massive, uncurated datasets, including the entirety of GitHub, which makes them inherently unpredictable.

The Reality of the Attack Surface

During the discussion, the panelists pointed out that we lack standardized, open benchmarks for evaluating the security of these models. Without these, we are flying blind. When a vendor shows you a demo of an LLM driving an application, you should immediately ask: is this behavior part of the training set, or is it a genuine, reproducible capability?

Consider the implications of AutoGPT or similar autonomous agents. These tools are designed to chain tasks together, which significantly expands the potential impact of a single successful injection. If an agent has access to your file system or API keys, a successful prompt injection is not just a data leak; it is a full system compromise.

For those of us in offensive security, this is an opportunity. We need to develop new ways to fuzz these interfaces. Traditional fuzzing tools are built for structured protocols, but LLMs require a different approach. You need to test the boundaries of the model's instructions. If you can force the model to ignore its system prompt, you have effectively bypassed the primary security control.

The Regulatory and Defensive Shift

Policy is catching up, and it will force the issue. The EU AI Act is a prime example of how regulators are beginning to demand transparency and accountability for high-risk AI applications. For security teams, this means you will soon be responsible for auditing models, managing their training data, and ensuring they meet compliance standards.

Defenders must start by mapping the data flow. Where does the model get its context? If it is pulling from a database containing sensitive information, you have a data leakage risk. If it is accepting user input to generate code or execute commands, you have an injection risk. You need to treat the model's output as untrusted data, just as you would with any other user-supplied input.

Why You Should Be Skeptical

The current market is flooded with "AI-powered" security tools, but many of these are just wrappers around existing models. Do not take the marketing claims at face value. If a tool promises to automate your entire SOC, test it. Use it to perform tasks you already know how to do manually and see where it fails.

The most effective way to understand these systems is to get your hands dirty. Spend time with the models. Use them to write code, analyze logs, and generate reports. You will quickly find that they are prone to hallucinations and subtle errors. These errors are your entry points.

We are in the early stages of this technology, and the "Red Queen" race between attackers and defenders is just beginning. Attackers are already using these tools to scale their operations, from generating more convincing phishing emails to automating the discovery of vulnerabilities. If you are not actively experimenting with these models, you are already behind.

Stop waiting for the industry to provide you with a playbook. The tools are available, the documentation is public, and the attack surface is wide open. The next time you are on an engagement, look for the LLM integration. It is likely the most interesting part of the target's infrastructure, and it is almost certainly the least tested.

Talk Type

panel

Difficulty

intermediate