GenAI Application Security: Not Just Prompt Injection
This talk explores the expanded attack surface of Generative AI applications, moving beyond simple prompt injection to include supply chain risks and insecure deserialization. It details how AI models can be treated as executable code, particularly when using formats like Pickle, which can lead to remote code execution. The presentation provides a framework for securing AI pipelines using MLOps, including model scanning and supply chain verification. It also highlights the importance of threat modeling for AI-integrated systems using frameworks like MITRE ATLAS.
Beyond Prompt Injection: Why Your AI Model Is Just Another Executable
TLDR: Generative AI applications are increasingly vulnerable to supply chain attacks and insecure deserialization, not just prompt injection. By treating AI models as executable code, attackers can achieve remote code execution through malicious model files, particularly those using the Pickle format. Security teams must implement MLOps pipelines that include automated model scanning and supply chain verification to mitigate these risks.
The security community has spent the last two years obsessed with prompt injection. While jailbreaking a chatbot to ignore its system instructions is a valid concern, it is a narrow view of the actual attack surface. When we integrate Large Language Models (LLMs) into production environments, we are not just adding a text-processing layer; we are introducing complex, opaque, and often untrusted binary blobs into our infrastructure. These models are now being treated as first-class citizens in our CI/CD pipelines, yet they are rarely subjected to the same scrutiny as the application code that calls them.
The Model as an Executable
The fundamental shift in thinking required here is to stop viewing model files as static data and start viewing them as executables. When you load a model with a library like Hugging Face Transformers, you are often executing code embedded within the model file itself.
The most dangerous example of this is the Pickle format. Pickle is a Python-specific serialization format that is notoriously insecure because it allows for the execution of arbitrary code during the deserialization process. If an attacker can trick your application into loading a malicious model file, they can trigger code execution the moment the pickle.load() function is called.
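To make this concrete, here is a minimal, harmless sketch of how a pickled object carries executable code. The class name and the benign os.getcwd call are illustrative stand-ins; a real payload would invoke os.system or subprocess instead:

```python
import os
import pickle

class PoisonedModel:
    """Stand-in for a malicious model file. During unpickling, Python
    calls the callable returned by __reduce__ with the given arguments --
    here a harmless os.getcwd(), but an attacker would use os.system."""
    def __reduce__(self):
        return (os.getcwd, ())

blob = pickle.dumps(PoisonedModel())

# Deserializing the "model" executes the embedded call immediately:
result = pickle.loads(blob)
print(result)  # the current working directory, not a PoisonedModel object
```

Note that the caller never asked for os.getcwd to run; merely loading the blob was enough. This is why a pickle-format model from an untrusted source must be treated as hostile code.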
This is not a theoretical vulnerability. We have seen instances where malicious models are uploaded to public repositories like Hugging Face, waiting for a developer or an automated pipeline to pull them. Once the model is pulled and loaded, the attacker has a foothold in your environment.
Automating the Defense
If you are a pentester or a security engineer, you need to start treating model files as part of your attack surface. You cannot rely on manual inspection to catch malicious payloads in a multi-gigabyte model file. You need automation.
The ModelScan tool, developed by Protect AI, is the current standard for identifying these risks. It scans model files for unsafe code patterns, including those that leverage the os or subprocess modules to execute system commands.
For a pentester, the workflow is straightforward. During an engagement, identify where the application pulls its models. If the application is pulling models from an external source or a shared internal repository, that is your target. You can use ModelScan to audit these files for backdoors. If you find a model that uses pickle to execute bash commands, you have found your path to RCE.
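ModelScan is the tool to reach for in practice. Purely to illustrate the kind of thing such a scanner looks for, here is a toy, stdlib-only sketch (not ModelScan's actual implementation) that statically walks a pickle stream with pickletools and flags imports of dangerous modules without ever executing the payload:

```python
import io
import os
import pickle
import pickletools

# Modules whose presence in a model's pickle stream is a strong payload
# signal. os.system actually serializes under its implementation module,
# which is posix on Unix and nt on Windows.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins"}

def audit_pickle(blob: bytes) -> list:
    """Report global references into watch-listed modules, found by
    walking the opcode stream statically (the blob is never loaded)."""
    findings = []
    for opcode, arg, _pos in pickletools.genops(io.BytesIO(blob)):
        # GLOBAL carries "module name" for pickle protocols <= 3; a real
        # scanner also handles STACK_GLOBAL, used by protocols 4 and 5.
        if opcode.name == "GLOBAL":
            module = str(arg).split(" ")[0]
            if module in SUSPICIOUS_MODULES:
                findings.append(str(arg))
    return findings

class Backdoor:
    def __reduce__(self):
        # Unpickling this object would run a shell command.
        return (os.system, ("echo pwned",))

print(audit_pickle(pickle.dumps(Backdoor(), protocol=2)))  # flags os.system
print(audit_pickle(pickle.dumps([1, 2, 3], protocol=2)))   # [] -- clean data
```

The key design point carries over to the real tools: the scan never deserializes the file, so the payload never gets a chance to fire.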
The Supply Chain Problem
The risk extends beyond the model file itself. The entire ecosystem of AI development is built on a supply chain that is currently wide open. We use Ollama to run models locally, we pull dependencies from PyPI, and we integrate with third-party APIs. Each of these points is an entry vector.
OWASP has documented these risks extensively in its Top 10 for Large Language Model Applications. Specifically, LLM05: Supply Chain Vulnerabilities highlights that the lack of provenance for models and plugins is a critical failure. When you deploy an AI-integrated application, you are effectively running code from an unknown author.
To secure this, you need to implement an MLOps approach that mirrors your DevSecOps practices. This means:
- Software Bill of Materials (SBOM): You must know exactly what is inside the model you are deploying.
- Automated Scanning: Integrate ModelScan into your CI/CD pipeline. If a model fails the scan, the build must break.
- Network Controls: Restrict the ability of your AI application to reach out to the internet. If the model does not need to fetch external resources, block that egress traffic.
- Threat Modeling: Use frameworks like MITRE ATLAS to map out the specific attack vectors relevant to your AI implementation.
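As one concrete piece of that pipeline, supply chain verification can start as simply as pinning a hash for every approved model artifact and refusing to deploy anything that drifts. A minimal sketch, where the manifest format and file names are hypothetical:

```python
import hashlib
import os
import tempfile

def sha256_of(path: str) -> str:
    """Hash a model artifact in chunks so multi-gigabyte files are fine."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_model(path: str, pinned_manifest: dict) -> bool:
    """Return True only if the artifact matches its pinned hash.
    In CI, a False here should break the build."""
    expected = pinned_manifest.get(os.path.basename(path))
    return expected is not None and sha256_of(path) == expected

# Demo with a stand-in "model" file:
with tempfile.TemporaryDirectory() as d:
    model_path = os.path.join(d, "model.bin")
    with open(model_path, "wb") as f:
        f.write(b"weights go here")
    manifest = {"model.bin": sha256_of(model_path)}  # pinned at approval time
    print(verify_model(model_path, manifest))        # True
    with open(model_path, "ab") as f:
        f.write(b"tampered")                         # simulated tampering
    print(verify_model(model_path, manifest))        # False
```

Hash pinning does not replace scanning, since it only proves the file is the one you approved, not that the approved file is safe; the two checks belong together in the pipeline.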
What to Do Next
Stop treating your AI models as black boxes. If you are auditing an application, ask the developers where the models come from, how they are verified, and what format they are in. If they are using Pickle, they are already behind on their security posture.
The next time you are on an engagement, don't just try to get the LLM to say something offensive. Look at the pipeline. Look at the model source. Look for the deserialization points. The most interesting bugs in AI security are not in the prompts; they are in the infrastructure that makes the AI possible. Start there, and you will find the real vulnerabilities.