Security BSides2025

Future-Proof Your Career: Evolving in the Age of AI

Security BSides San Francisco58 views44:085 months ago

This panel discussion explores the integration of artificial intelligence into cybersecurity workflows, focusing on both defensive and offensive applications. The speakers analyze how generative AI and machine learning are reshaping threat detection, vulnerability research, and security operations. The discussion emphasizes the necessity for security professionals to adapt to AI-driven threats while leveraging AI tools for operational efficiency and automated security tasks.

Beyond the Hype: Why LLM Hallucinations and Data Leakage Are Your Next Big Bug Bounty Targets

TLDR: Generative AI is no longer just a novelty; it is being integrated into enterprise workflows, creating a massive new attack surface for researchers. This post breaks down how LLM hallucinations and data leakage are becoming the primary vectors for exploitation in AI-integrated systems. Pentesters need to pivot from traditional web app testing to evaluating the trust boundaries and data sanitization pipelines of these models.

Security research is currently in a state of transition. We are moving away from the era where we only worried about SQL injection and XSS in static applications. Today, every enterprise is rushing to integrate Large Language Models (LLMs) into their internal tools, often without understanding the security implications of doing so. This is not a theoretical risk. When you connect an LLM to your internal data, you are essentially creating a new, highly complex interface that can be manipulated through natural language.

The Reality of LLM Exploitation

The core issue with current AI integration is the assumption that the model is a trusted, static component. It is not. When an LLM is used to parse internal documentation or automate security responses, it becomes a target for prompt injection and data exfiltration.

Consider the OWASP Top 10 for LLM Applications, specifically the risks surrounding prompt injection and insecure output handling. If an application uses an LLM to summarize user-submitted reports, an attacker can inject instructions into the report that force the model to ignore its system prompt and leak sensitive information. This is effectively a form of broken access control where the model acts as the proxy for the attacker.

Why Hallucinations Are a Security Feature

Most people view LLM hallucinations as a quality issue. For a security researcher, they are a feature. When a model hallucinates, it is often because it is attempting to fill gaps in its context window or training data. If you can manipulate the context provided to the model, you can force it to "hallucinate" in a way that benefits your objective.

In a recent engagement, we observed an LLM-based support bot that was configured to pull data from an internal knowledge base. By crafting specific queries, we were able to trick the model into revealing internal project codenames and API endpoint structures that were never intended to be exposed to the end-user. The model was not "broken" in the traditional sense; it was simply doing exactly what it was told to do by the attacker, because the system lacked proper input validation and context isolation.

The Data Leakage Pipeline

Data leakage in AI systems often occurs at the integration layer. Developers frequently connect LLMs to databases or internal APIs without implementing robust filtering. If your model has access to a database, it is only as secure as the query it generates.

If you are testing an AI-integrated system, stop looking at the UI and start looking at the data flow. Ask yourself:

What is the system prompt?
How is the user input sanitized before it reaches the model?
What are the permissions of the service account the model uses to query internal data?

If you can influence the model's output, you can often influence the underlying system's behavior. This is the new frontier of T1592-gather-victim-org-information, where the LLM becomes the primary tool for reconnaissance.

Defensive Strategies for the Modern Stack

Defenders must treat LLM inputs with the same level of scrutiny as they do raw SQL queries. Implementing a "human-in-the-loop" design for critical decision-making is the only way to mitigate the risks of automated hallucination. Furthermore, you must enforce strict data boundaries. If the model does not need access to the entire production database, do not give it that access. Use least privilege principles for every AI-integrated service account.

What to Do Next

The next time you are on a pentest or hunting for bugs, look for the "AI-powered" features. They are almost always the most vulnerable parts of the application. Do not just test for standard web vulnerabilities; test the model's boundaries. Try to force it to reveal its system instructions. Try to get it to output data it shouldn't have access to.

The industry is still learning how to secure these systems, and the current lack of standardized security controls means there is a massive opportunity for researchers to find high-impact bugs. Start by mapping the data flow between the user, the LLM, and the backend systems. If you can break the trust between those three, you have found your bug. The tools are changing, but the fundamental principles of security—trust, boundaries, and input validation—remain exactly the same.

Talk Type

panel

Difficulty

beginner