Security BSides2025

Resilience in the Uncharted AI Landscape

Security BSides San Francisco59 views31:2410 months ago

This talk outlines a framework for building resilient AI-driven applications by integrating security, observability, and incident response into the development lifecycle. It demonstrates how to apply traditional security principles like defense-in-depth, zero-trust, and robust backup strategies to modern agentic AI architectures. The speaker emphasizes the importance of proactive risk assessment and scenario planning to mitigate threats like prompt injection and data exfiltration. The presentation provides a structured approach for security teams to align AI development with regulatory compliance and business continuity requirements.

Building Resilient Agentic AI Pipelines Against Prompt Injection and Data Exfiltration

TLDR: Modern agentic AI architectures introduce complex attack surfaces where prompt injection can lead to unauthorized data exfiltration and lateral movement. By treating AI agents as untrusted components within a microservices architecture, security teams can implement robust observability and recovery controls. This post breaks down how to apply traditional defense-in-depth and zero-trust principles to secure agentic workflows against common exploitation techniques.

Security researchers are currently witnessing a shift from simple LLM chat interfaces to complex, multi-agent systems that perform autonomous actions. While these agentic workflows drive business value, they also create a massive, often overlooked, attack surface. When an AI agent is granted the ability to query internal databases, interact with external APIs, or manage procurement processes, a single successful prompt injection attack is no longer just a "jailbreak" that generates offensive text. It becomes a vector for T1190-exploit-public-facing-app and T1565-data-manipulation that can compromise the entire backend infrastructure.

The Anatomy of an Agentic Breach

In a typical agentic setup, an LLM acts as the orchestrator, deciding which tools to call to fulfill a user request. If an attacker can manipulate the LLM's decision-making process through prompt injection, they can force the agent to call tools with malicious parameters. For example, if an agent is designed to help a user manage a shopping list, an attacker might inject a prompt that forces the agent to query sensitive customer data or trigger unauthorized API calls to a payment processor.

The risk is compounded when these agents are deployed in a microservices environment using Kubernetes. If each agent is running in its own container, a breach in one agent can lead to lateral movement if the network policies are not strictly defined. Pentesters should focus on the "tool-use" phase of the agent's lifecycle. During an engagement, map out every tool the agent has access to. If the agent can interact with an internal ELK stack or a database, test whether the agent can be coerced into performing unauthorized queries or exfiltrating data to an attacker-controlled endpoint.

Applying Zero Trust to AI Agents

Securing these systems requires moving away from the assumption that the LLM is a trusted component. Instead, treat the LLM as an untrusted user interface that happens to be capable of executing code. This means implementing Zero Trust principles at the agent level.

Every tool call made by an agent should be validated against a strict schema. If an agent is only supposed to search a product catalog, the backend should reject any request that attempts to access user profiles or system logs. Furthermore, observability is non-negotiable. You need to monitor the agent's behavior in real-time using tools like Prometheus and Grafana. If an agent suddenly starts making an unusual volume of requests to a sensitive endpoint, that should trigger an immediate alert.

Incident Response in the Age of AI

When an agent is compromised, the standard incident response playbook needs to be adapted. The first step is detection, but the response must be automated. If you detect an anomaly, you need the ability to instantly disable the impacted agent or rotate its credentials.

# Example of isolating a compromised agent container in Kubernetes
kubectl label pod <agent-pod-name> status=quarantined
kubectl delete networkpolicy <agent-access-policy>

After containment, the focus shifts to breach assessment. You need to trace the exploit by reviewing the logs generated by the agent and the tools it interacted with. This is where having a well-configured ELK stack pays off. You should be able to reconstruct the exact sequence of tool calls that led to the compromise.

Once the assessment is complete, the restoration process involves rolling back the agent to a known-good state. This is where Terraform and Git become critical. By maintaining your infrastructure and agent configurations as code, you can ensure that you can redeploy a clean version of your agentic pipeline in minutes rather than hours.

The Path Forward for Researchers

The industry is still in the early stages of securing agentic AI. Most organizations are focused on the "chat" aspect, leaving the "agent" aspect wide open. For bug bounty hunters and researchers, this is a goldmine. Look for implementations where agents have excessive privileges or where the backend fails to validate the agent's tool calls.

Don't just look for the prompt injection; look for what the agent does after the injection. Can you make it read a file? Can you make it send an email? Can you make it change a configuration? The most impactful findings will be those that demonstrate a clear path from a malicious prompt to a tangible business impact. Start by mapping the agent's capabilities, then systematically test the boundaries of each tool it can access. The goal is to find the point where the agent's autonomy becomes a liability.

Talk Type

talk

Difficulty

intermediate