I've Got 99 Problems But Prompt Injection Ain't
This talk explores the evolving landscape of AI-specific vulnerabilities, focusing on prompt injection, data poisoning, and supply chain attacks against machine learning models. It highlights how common serialization formats like pickle and model-loading frameworks can be exploited to achieve arbitrary code execution. The speakers emphasize the critical need for improved security practices, standardized vulnerability disclosure policies, and better collaboration between security researchers and AI developers. The presentation also discusses the challenges of bug bounty programs in the context of AI, including issues with hallucination reporting and severity scoring.
Why Your Machine Learning Model Pipeline Is a Remote Code Execution Goldmine
TLDR: Machine learning pipelines are increasingly vulnerable to remote code execution because they load untrusted model files through insecure serialization formats like pickle, and even "safer" formats such as ONNX can be abused through path traversal in external data references. Attackers can inject malicious payloads into these files, which execute when the model is loaded or parsed by common frameworks. Pentesters should prioritize auditing model-loading endpoints and supply chain integrations, as these are often overlooked by traditional security scanners.
Machine learning security is currently in the same state that web security was in the late nineties. We are seeing a massive influx of new, complex frameworks being deployed into production environments with almost zero scrutiny regarding their underlying data-handling mechanisms. While most security teams are busy worrying about prompt injection, they are completely missing the fact that the model files themselves are often just glorified remote code execution vectors.
The Hidden Danger of Model Serialization
At the heart of the problem is the way we handle model files. When a data scientist trains a model, they need a way to save it to disk and load it back into memory later. This process, known as serialization, is where the security model falls apart. Many of the most popular formats, such as pickle, are inherently insecure because they allow for the reconstruction of arbitrary Python objects. If an attacker can replace a legitimate model file with a malicious one, they can execute arbitrary code the moment that file is loaded by the application.
This is not a theoretical risk. We see this constantly in environments that use Hugging Face or similar repositories to pull pre-trained models. If a developer pulls a model from a public repository without verifying its integrity, they are essentially running untrusted code on their infrastructure. The OWASP category for Software and Data Integrity Failures is the perfect framework for understanding this, as it highlights the danger of relying on untrusted sources for critical components.
Exploiting Model Loaders
Consider the case of skops, a library designed to help share scikit-learn models. While it attempts to be safer than raw pickle by using JSON for metadata, the underlying implementation can still be manipulated. If a loader reconstructs a complex object tree from a JSON file, an attacker can craft a payload that triggers an eval() or similar function during the deserialization process.
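To make the bug class concrete (this is a simplified, hypothetical loader, not skops's actual implementation), consider a loader that reconstructs objects from a JSON spec. If it resolves arbitrary dotted names, an attacker steers it toward os.system or eval; the fix is a strict allowlist of constructible types:

```python
import json

# Hypothetical, simplified loader -- NOT skops's real code. A naive
# version would resolve any dotted name from the untrusted JSON; this
# one only constructs types from an explicit allowlist.
SAFE_TYPES = {"list": list, "dict": dict, "int": int, "str": str}

def load_object(blob: bytes):
    """Reconstruct an object from JSON, but only via the allowlist."""
    spec = json.loads(blob)
    type_name = spec["type"]
    if type_name not in SAFE_TYPES:  # the check a naive loader omits
        raise ValueError(f"refusing to construct {type_name!r}")
    return SAFE_TYPES[type_name](spec.get("value", ()))

# A benign payload reconstructs fine...
print(load_object(b'{"type": "list", "value": [1, 2, 3]}'))

# ...while a payload targeting a dangerous callable is rejected.
try:
    load_object(b'{"type": "os.system", "value": "id"}')
except ValueError as e:
    print(e)
```

The allowlist is the whole defense: deny-by-default construction is what separates a data format from an execution primitive.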
The same logic applies to ONNX, which is often touted as a safer, platform-independent format. While ONNX uses Protocol Buffers for its structure, the way it handles external data can lead to path traversal vulnerabilities. If an attacker can control the path from which the model loads its weights, they can force the application to read sensitive files from the local filesystem, such as /etc/passwd.
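The traversal issue comes down to how the loader resolves the attacker-controlled location field relative to the model directory. Here is a minimal sketch of a safe resolver using only the standard library (the helper name and directory are illustrative, not part of the onnx API):

```python
import os

def resolve_external_data(model_dir: str, location: str) -> str:
    """Safely resolve a tensor's external-data 'location' field.

    An attacker-controlled location like '../../../etc/passwd' would
    otherwise let the loader read arbitrary files. Hypothetical helper
    for illustration, not the onnx library's real API.
    """
    base = os.path.realpath(model_dir)
    candidate = os.path.realpath(os.path.join(base, location))
    # Reject any resolved path that escapes the model directory.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"path traversal attempt: {location!r}")
    return candidate

print(resolve_external_data("/srv/models/resnet", "weights.bin"))
try:
    resolve_external_data("/srv/models/resnet", "../../../etc/passwd")
except ValueError as e:
    print(e)
```

Normalizing with realpath before the containment check matters: comparing raw strings misses ../ sequences and symlinks entirely.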
If you are testing an application that accepts model uploads, your payload should look something like this:
import pickle
import os

class MaliciousModel:
    def __reduce__(self):
        # __reduce__ tells pickle how to reconstruct this object;
        # here it instructs the unpickler to call os.system('id').
        return (os.system, ('id',))

with open('model.pkl', 'wb') as f:
    pickle.dump(MaliciousModel(), f)
When the target application calls pickle.load() on this file, it will execute the id command on the server. This is the most basic form of the attack, but it demonstrates the core issue: the application trusts the structure of the file implicitly.
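You can also triage a suspicious pickle without detonating it. The standard library's pickletools walks the opcode stream statically, so you can flag the import-and-call primitives every pickle RCE payload needs; a minimal sketch:

```python
import os
import pickle
import pickletools

# Opcodes that import names or invoke callables -- the primitives a
# pickle RCE payload relies on.
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ"}

def triage_pickle(data: bytes):
    """Statically list suspicious opcodes without ever unpickling."""
    hits = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in SUSPICIOUS:
            hits.append((opcode.name, arg))
    return hits

# A plain data pickle produces no hits...
print(triage_pickle(pickle.dumps({"weights": [0.1, 0.2]})))

# ...while the os.system payload above lights up immediately.
class MaliciousModel:
    def __reduce__(self):
        return (os.system, ("id",))

print(triage_pickle(pickle.dumps(MaliciousModel())))
```

This is a triage aid, not a sandbox: an empty result is encouraging, but any hit means the file should never reach pickle.load().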
Real-World Testing Strategies
During an engagement, you should treat model-loading endpoints exactly like you would treat a file upload form that accepts executable code. First, map out where the application pulls its models. Is it from a local directory, an S3 bucket, or a public repository like Hugging Face? If it is a public repository, can you perform a man-in-the-middle attack or a repository poisoning attack to swap the model?
Second, look for the libraries being used to parse these files. If you see pickle, joblib, or older versions of numpy.load with allow_pickle=True, you have a high probability of finding an execution primitive. Even if the library claims to be secure, check the documentation for "safe" modes or optional parameters that might be disabled by default. Many developers leave these features in their default, insecure state because they are unaware of the risks.
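A quick source-level sweep for the risky loaders listed above often pays off before any dynamic testing. This is a rough, illustrative scanner (the pattern list is a starting point to extend per engagement, not an exhaustive ruleset):

```python
import re

# Call patterns worth flagging during a code review. Illustrative
# list -- extend it per engagement.
RISKY_PATTERNS = {
    r"\bpickle\.loads?\(": "arbitrary code execution on load",
    r"\bjoblib\.load\(": "joblib wraps pickle under the hood",
    r"allow_pickle\s*=\s*True": "numpy.load with pickle enabled",
    r"\btorch\.load\((?![^)]*weights_only\s*=\s*True)":
        "torch.load without weights_only=True",
}

def scan_source(source: str):
    """Return (line number, line, reason) for each risky call found."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern, why in RISKY_PATTERNS.items():
            if re.search(pattern, line):
                findings.append((lineno, line.strip(), why))
    return findings

sample = """\
import joblib
model = joblib.load(user_supplied_path)
arr = np.load(path, allow_pickle=True)
"""
for finding in scan_source(sample):
    print(finding)
```

Grep-style scanning will miss dynamic dispatch and wrapper functions, so treat a clean result as "nothing obvious," not "nothing there."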
Defensive Hardening
Defenders need to move away from trusting model files. The first step is to implement strict signature verification for every model file that enters the environment. If you cannot verify the origin and integrity of a model, do not load it. Additionally, run model-loading processes in highly restricted, sandboxed environments with minimal filesystem access and no network connectivity.
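As a minimal sketch of that integrity gate, here is a keyed-digest check using only the standard library. A real deployment would use asymmetric signatures from a proper signing service rather than a shared HMAC key, but the shape of the control is the same: verify first, deserialize only on success.

```python
import hashlib
import hmac

# Sketch only: a shared HMAC key kept out of the attacker's reach.
# Production systems should use asymmetric signatures and a KMS/HSM.
SIGNING_KEY = b"store-me-in-a-kms-not-in-source"

def sign_model(model_bytes: bytes) -> str:
    return hmac.new(SIGNING_KEY, model_bytes, hashlib.sha256).hexdigest()

def load_model_verified(model_bytes: bytes, expected_sig: str) -> bytes:
    actual = sign_model(model_bytes)
    # Constant-time comparison avoids leaking the digest via timing.
    if not hmac.compare_digest(actual, expected_sig):
        raise ValueError("model signature mismatch -- refusing to load")
    return model_bytes  # only now hand the bytes to the deserializer

blob = b"\x80\x05...model bytes..."
sig = sign_model(blob)
print(load_model_verified(blob, sig) == blob)

# A single flipped or appended byte fails verification.
try:
    load_model_verified(blob + b"\x00", sig)
except ValueError as e:
    print(e)
```

The ordering is the point: the untrusted bytes must never touch pickle, joblib, or any other deserializer until the signature check has passed.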
If you are using libraries that support safer formats, such as SafeTensors, prioritize those over legacy formats like pickle. SafeTensors is designed specifically to prevent arbitrary code execution by only storing the raw tensor data, removing the ability to embed executable logic within the file structure.
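To see why that design removes the execution primitive, here is a rough standard-library sketch of a SafeTensors-style container (my understanding of the layout: an 8-byte little-endian header length, a JSON header describing dtype, shape, and byte offsets, then raw tensor bytes; consult the official format spec for the authoritative details):

```python
import json
import struct

# Sketch of a SafeTensors-style layout: 8-byte little-endian header
# length, JSON header, then raw tensor bytes. The container holds only
# data and offsets -- there is nowhere to encode executable logic.
def write_safetensors_like(tensors: dict) -> bytes:
    header, body, offset = {}, b"", 0
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [offset, offset + len(raw)]}
        body += raw
        offset += len(raw)
    header_bytes = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + body

def read_header(blob: bytes) -> dict:
    (header_len,) = struct.unpack_from("<Q", blob, 0)
    return json.loads(blob[8:8 + header_len])

# Two float32 values stored as raw bytes, described only by metadata.
blob = write_safetensors_like(
    {"weight": ("F32", [2], struct.pack("<2f", 0.5, -0.5))}
)
print(read_header(blob))
```

Parsing reduces to reading a length, a JSON dictionary, and byte ranges; there is no object reconstruction step for an attacker to hijack.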
The industry is currently in a race to build faster, more powerful AI, and security is being left in the dust. As researchers, our job is to force the conversation toward secure-by-default architectures. If you find a model-loading vulnerability, report it, but also push for the adoption of formats that don't allow for code execution. We need to stop treating model files as data and start treating them as the executable code they truly are.