The Fault in Our Parsers: Incubated ML Exploits
This talk introduces 'incubated ML exploits,' a novel attack class that chains system-level input-handling vulnerabilities with model-level backdoors. The research demonstrates how attackers can leverage insecure deserialization and parser differentials in common ML file formats like Pickle, TorchScript, and Safetensors to inject malicious payloads. The speaker provides a framework for auditing the ML supply chain and highlights the risks of treating ML models as isolated objects rather than integrated system components. The presentation includes a discussion of the 'Fickling' tool for analyzing and manipulating malicious model files.
The Silent Backdoor: Exploiting ML Model Supply Chains via Parser Differentials
TLDR: Machine learning models are often treated as opaque, trusted blobs, but they are actually complex files parsed by vulnerable, non-minimalist code. By chaining insecure deserialization in formats like Pickle with parser differentials, attackers can inject backdoors that trigger only under specific conditions. Security teams must stop treating models as isolated objects and start auditing the entire ML supply chain, from ingestion to inference.
Machine learning security is currently stuck in a dangerous phase of "model-as-a-black-box" thinking. Most organizations focus on adversarial examples or prompt injection, but they completely ignore the underlying file formats and the parsers that process them. If you are a pentester or a researcher, you should stop looking at the model's weights and start looking at how those weights get into memory. The reality is that the ML supply chain is a massive, unaudited attack surface, and the tools we use to load these models are often riddled with the same memory corruption and injection bugs we saw in web applications a decade ago.
The Anatomy of an Incubated ML Exploit
An incubated ML exploit is not a single vulnerability. It is a chain. It starts by identifying a system-level input-handling bug—like insecure deserialization—and uses that as a delivery mechanism for a model-level backdoor. The goal is to force a model to produce specific, attacker-chosen outputs when a trigger is present.
Consider the Pickle format. It is essentially a stack-based virtual machine that executes opcodes to reconstruct objects. Because its opcodes can import arbitrary modules and invoke arbitrary callables, it is inherently dangerous: if you can control a Pickle file, you have arbitrary code execution. The Fickling tool, developed by the team at Trail of Bits, is the gold standard for analyzing these files. It allows you to decompile, statically analyze, and rewrite the bytecode of a Pickle file.
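The mechanics are easy to demonstrate with the standard library alone: `__reduce__` lets any pickled object nominate a callable for the unpickler to invoke at load time. The payload below is deliberately benign (it just calls `print`), but swapping in `os.system` or `subprocess` turns the same trick into code execution:

```python
import pickle
import pickletools

class Payload:
    # The unpickler calls the returned callable with the given args.
    # Here it is a harmless print; a real payload would use os.system,
    # subprocess, eval, etc.
    def __reduce__(self):
        return (print, ("payload executed at load time",))

blob = pickle.dumps(Payload())

# Inspect the opcode stream WITHOUT executing it
pickletools.dis(blob)

# Unpickling runs the callable; no Payload object ever comes back
result = pickle.loads(blob)  # prints the message, returns print's result (None)
```

Note that the disassembly step never executes anything, which is exactly why static opcode analysis (as Fickling does, far more thoroughly) is the safe way to triage untrusted model files.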
When you combine this with a model backdoor, you create a persistent, stealthy threat. An attacker can distribute a model that appears benign to standard scanners but contains a malicious payload that executes only when the model is loaded into a specific environment.
Parser Differentials and the Trust Gap
The most interesting part of this research is the concept of parser differentials. A file has no intrinsic meaning; its interpretation depends entirely on the parser. If you have two different parsers in your system—perhaps one for validation and one for inference—and they interpret the same file differently, you have a vulnerability.
This is exactly what happens with Safetensors and PyTorch. Safetensors was designed to be a safer alternative to Pickle, but it still relies on JSON for metadata. If your validation logic uses a different JSON parser than your inference engine, an attacker can use duplicate keys or malformed offsets to hide malicious data.
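To see how such a differential arises, here is a minimal sketch using only Python's `json` module. The header string is a hypothetical stand-in for a model file's JSON metadata: Python's default parser keeps the last occurrence of a duplicate key, while a first-wins parser (emulated here via `object_pairs_hook`) resolves the same bytes to different offsets:

```python
import json

# Hypothetical model-file metadata containing a duplicate key
header = '{"weights": {"offset": 0}, "weights": {"offset": 4096}}'

# Python's json module keeps the LAST duplicate by default
last_wins = json.loads(header)

# Emulate a parser that keeps the FIRST occurrence instead
def first_wins_pairs(pairs):
    d = {}
    for key, value in pairs:
        d.setdefault(key, value)  # ignore later duplicates
    return d

first_wins = json.loads(header, object_pairs_hook=first_wins_pairs)

print(last_wins["weights"]["offset"])   # 4096
print(first_wins["weights"]["offset"])  # 0
```

If the validator resolves `weights` to one offset and the inference engine to the other, the validator is effectively scanning different bytes than the ones that get loaded.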
For example, you can craft a file that looks like a valid model to a strict validator but contains an appended, malicious payload that is ignored by the validator but processed by the inference engine. This is a classic "shotgun parsing" issue, where the system fails to enforce a single, consistent view of the input.
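The appended-payload variant can be illustrated with a safetensors-style layout: an 8-byte little-endian header length, then the JSON header, then the tensor bytes. The "validator" below is a deliberately simplified stand-in (not the real safetensors library) that only checks the declared region, so trailing bytes pass through unnoticed:

```python
import json
import struct

# Minimal safetensors-style file: <u64 header length><JSON header><tensor data>
header = json.dumps(
    {"w": {"dtype": "F32", "shape": [1], "data_offsets": [0, 4]}}
).encode()
body = b"\x00\x00\x80\x3f"  # one float32 (1.0)
model = struct.pack("<Q", len(header)) + header + body

# Attacker appends extra bytes after the declared data region
tampered = model + b"EXTRA_PAYLOAD"

# A naive validator only examines the declared region (8-byte prefix +
# header + 4 data bytes per the offsets) and sees an unmodified file
declared_len = 8 + struct.unpack("<Q", tampered[:8])[0] + 4
assert tampered[:declared_len] == model  # "valid" as far as it can tell

# A consumer that trusts the whole file sees the appended payload
trailing = tampered[declared_len:]
print(trailing)  # b'EXTRA_PAYLOAD'
```

The fix is the one the talk argues for: enforce a single, consistent view of the input, which here means rejecting any file whose total length exceeds what the header declares.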
Real-World Engagement Strategy
If you are on a red team engagement or a bug bounty hunt, stop trying to fuzz the model's output. Instead, target the model distribution pipeline. Look for where models are stored and how they are loaded.
- Map the Pipeline: Identify every tool that touches the model file. Is it being converted from PyTorch to ONNX? Is it being quantized? Each conversion step is a potential point of injection.
- Fuzz the Parsers: Use tools like Fickling to inspect the model files. If the application uses a custom loader or an older version of a library, look for insecure deserialization patterns.
- Test for Differentials: Try to create a polyglot file—a single file that is valid in two different formats. If you can get a system to accept a file that is interpreted as a benign model by a security scanner but as a malicious one by the application, you have successfully bypassed the security controls.
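The parser-fuzzing step above can be approximated without any external tooling by walking the opcode stream with the standard library's `pickletools` and flagging global lookups into commonly-abused modules. This is a first-pass triage heuristic (Fickling does this properly), not a complete detector:

```python
import pickle
import pickletools

DANGEROUS = {"os", "posix", "nt", "subprocess", "builtins"}

def suspicious_imports(blob: bytes) -> list:
    """Statically scan a pickle's opcodes (never executing them) and
    flag global lookups that resolve into commonly-abused modules."""
    hits, strings = [], []
    for op, arg, _pos in pickletools.genops(blob):
        if isinstance(arg, str):
            strings.append(arg)  # STACK_GLOBAL's operands arrive as pushed strings
        if op.name == "GLOBAL" and arg.split(" ", 1)[0] in DANGEROUS:
            hits.append(arg.replace(" ", "."))
        elif op.name == "STACK_GLOBAL" and len(strings) >= 2:
            mod, name = strings[-2], strings[-1]
            if mod in DANGEROUS:
                hits.append(f"{mod}.{name}")
    return hits

# A malicious-looking pickle to scan (payload never actually runs here)
class Evil:
    def __reduce__(self):
        import subprocess
        return (subprocess.check_output, (["id"],))

blob = pickle.dumps(Evil())
print(suspicious_imports(blob))  # should flag subprocess.check_output
```

Treating the last two pushed strings as STACK_GLOBAL's module and name is a heuristic that holds for straightforward payloads; obfuscated pickles (memo tricks, indirect lookups) need a real decompiler.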
Defensive Hardening
Defenders need to move toward a "secure-by-default" model. This means moving away from formats that support arbitrary code execution, like Pickle, and enforcing strict validation at every stage of the pipeline.
- Use Minimalist Parsers: If you must use a complex format, ensure your parser is as simple as possible. Avoid parsers that attempt to "fix" or "correct" invalid input. If the input is malformed, reject it immediately.
- Enforce Integrity: Every model file should have a cryptographic signature and a checksum. If the file has been modified, the system should refuse to load it.
- Isolate the Environment: Run model loading and inference in a sandboxed environment with minimal privileges. Even if an attacker achieves code execution, they should not be able to pivot to the rest of your infrastructure.
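The integrity point can be as simple as pinning a digest at publish time and refusing to load anything that drifts. A production deployment would use real signatures rather than a bare hash, but the shape is the same; `expected_sha256` here is assumed to be a value recorded out-of-band when the model was published:

```python
import hashlib

def load_pinned(path: str, expected_sha256: str) -> bytes:
    """Read a model file only if its SHA-256 matches the digest
    pinned at publish time; otherwise fail closed."""
    with open(path, "rb") as f:
        data = f.read()
    digest = hashlib.sha256(data).hexdigest()
    if digest != expected_sha256:
        raise ValueError(
            f"refusing to load {path}: digest {digest} "
            f"does not match pinned {expected_sha256}"
        )
    return data
```

The important design choice is failing closed: a mismatch raises instead of logging a warning, so a tampered file never reaches the deserializer at all.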
The ML supply chain is currently the Wild West of security. We are building massive, complex systems on top of fragile, insecure foundations. If you want to find the next big bug, stop looking at the AI and start looking at the code that parses it. The vulnerabilities are not in the math; they are in the parsers.