Good Models Gone Bad: Visualizing Data Poisoning with Networks
This talk demonstrates how network science can be applied to visualize and detect data poisoning attacks against machine learning models. By representing training data as a graph of nodes and edges, the speaker illustrates how attackers can manipulate model behavior through the addition, modification, or deletion of data points. The presentation highlights the use of Gephi to visualize these changes and identify malicious patterns in datasets. The key takeaway is that network analysis provides a powerful, intuitive method for detecting data tampering and assessing the integrity of training data.
Visualizing Data Poisoning Attacks with Network Science
TLDR: Data poisoning attacks manipulate machine learning models by injecting, modifying, or deleting training data to force undesirable behavior. By representing these datasets as graphs, researchers can use network analysis tools like Gephi to visualize and detect these subtle, malicious changes. This approach provides a clear, intuitive way to identify tampered data that would otherwise remain hidden in massive, complex training sets.
Machine learning models are increasingly integrated into critical software infrastructure, yet the security of the data feeding these models remains an afterthought. Most security teams focus on securing the model architecture or the API endpoints, while the training data itself is treated as a black box. This oversight creates a massive blind spot. If an attacker can influence the training set, they can dictate the model's output without ever touching the underlying code or infrastructure.
Data poisoning is not a theoretical risk. It is a practical, high-impact attack vector that falls squarely under the OWASP Top 10 for LLM Applications as Training Data Poisoning. When an attacker successfully injects malicious samples or modifies existing data, they can cause a model to misclassify inputs, leak sensitive information, or bypass security filters. The challenge for researchers and security engineers is that these changes are often buried within millions of data points, making them nearly impossible to spot through manual review or standard logging.
Mapping the Poison
Network science offers a way to cut through this noise. By treating data points as nodes and their relationships as edges, we can map the structure of a dataset. In a healthy, untampered dataset, the graph exhibits a specific, predictable topology. When an attacker introduces poisoned data, they inevitably alter this topology.
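As a minimal sketch of this idea (the records and package names below are hypothetical), a dataset can be turned into a graph by treating items that appear together in the same record as connected nodes:

```python
# Sketch: build a co-occurrence graph from hypothetical dataset records.
# Each data point becomes a node; items sharing a record become edges.
from itertools import combinations

records = [
    ["log4j", "slf4j", "guava"],   # dependencies used together in one project
    ["slf4j", "guava"],
    ["junit", "mockito"],
]

edges = set()
for record in records:
    # Sorting gives each undirected edge a canonical (a, b) ordering
    for a, b in combinations(sorted(record), 2):
        edges.add((a, b))

nodes = {n for edge in edges for n in edge}
print(len(nodes), len(edges))  # 5 nodes, 4 edges
```

An edge set like this can be exported for Gephi or loaded into a graph library for the structural checks described below.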
Consider a dataset of Java dependencies. Each dependency is a node, and the connections between them represent how they are used together in a project. An attacker looking to poison this data might inject a malicious package that appears to be a legitimate dependency. When the dataset is mapped in Gephi, the open-source network visualization tool, the injected node and its artificial connections stand out. The graph reveals the anomaly because the malicious node creates clusters or paths that do not exist in the legitimate, organic structure of the dependency network.
The following snippet demonstrates how one might programmatically add edges to a network to simulate a poisoning attack:
# Adding edges to a network to simulate a poisoning attack
from itertools import combinations

def add_strategic_edges(G, num_edges):
    nodes = list(G.nodes())
    # Candidate edges that do not already exist in the graph
    possible_edges = [e for e in combinations(nodes, 2) if not G.has_edge(*e)]
    # Sort candidates by the combined degree of their endpoints so the
    # attack targets high-traffic nodes first
    degrees = dict(G.degree())
    possible_edges.sort(key=lambda e: degrees[e[0]] + degrees[e[1]], reverse=True)
    # Add the highest-scoring edges
    for edge in possible_edges[:num_edges]:
        G.add_edge(*edge)
Detecting the Anomaly
Visualizing these changes is where the real power lies. During a penetration test or a bug bounty engagement, you rarely have the luxury of knowing exactly where the data was tampered with. By generating a graph of the dataset before and after a suspected poisoning event, you can perform a visual diff.
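The same diff can be done programmatically before any visualization. A minimal sketch, using hypothetical before/after edge snapshots: edges present only in the later snapshot are candidates for injected data.

```python
# Sketch: "diff" two graph snapshots of a dataset by comparing edge sets.
# The snapshots below are hypothetical before/after captures.
before = {("a", "b"), ("b", "c"), ("c", "d")}
after  = {("a", "b"), ("b", "c"), ("c", "d"), ("x", "b"), ("x", "c")}

injected = after - before   # edges that appeared after the suspected event
removed  = before - after   # edges that disappeared

print(sorted(injected))  # [('x', 'b'), ('x', 'c')]
print(sorted(removed))   # []
```

Here the node "x" and its two new edges are exactly the kind of structural change that would also jump out in a side-by-side Gephi rendering.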
In the case of the Tay chatbot incident, the model was essentially poisoned by the public. If the researchers had mapped the interaction network of the training data in real-time, the sudden, massive influx of hateful, repetitive input would have appeared as a distinct, abnormal cluster in the graph. The "before" and "after" snapshots would have shown a clear deviation from the expected, diverse interaction patterns.
For a pentester, this means that if you are auditing an AI-driven system, you should ask for the data lineage. If you can obtain a snapshot of the training data, you can build a graph representation. Look for nodes with unusually high centrality or edges that connect disparate, unrelated clusters. These are often the fingerprints of a poisoning attempt.
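The centrality check can be approximated with a simple degree heuristic. A sketch, assuming a hypothetical edge list and an arbitrary 1.5x-of-mean threshold (a real audit would use proper centrality metrics or Gephi's statistics panel):

```python
# Sketch: flag nodes whose degree is well above the graph's average,
# a crude stand-in for a full centrality analysis.
from collections import Counter

edges = [("a", "b"), ("b", "c"), ("c", "d"),
         ("x", "a"), ("x", "b"), ("x", "c"), ("x", "d"), ("x", "e")]

degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

mean = sum(degree.values()) / len(degree)
# Threshold of 1.5x the mean degree is an assumption, not a standard
suspects = sorted(n for n, d in degree.items() if d > 1.5 * mean)
print(suspects)  # ['x']
```

The node "x", which connects to nearly everything else, is flagged while the organically connected nodes are not.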
Defending Against Data Tampering
Defending against these attacks requires a shift toward data provenance. You must treat your training data with the same rigor as your source code. This means implementing strict access controls on data ingestion pipelines, using cryptographic signatures to verify the integrity of data sources, and, as demonstrated here, using network analysis to monitor for structural anomalies.
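The integrity-verification step can be as simple as recording a digest of each data snapshot at ingestion time and comparing it before training. A minimal sketch using SHA-256 (real provenance pipelines would use signed digests; the data here is hypothetical):

```python
# Sketch: detect tampering by comparing a stored SHA-256 digest of the
# training data against a freshly computed one.
import hashlib

def sha256_digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

snapshot = b"label,text\n0,hello\n1,world\n"
expected = sha256_digest(snapshot)   # stored when the data was ingested

# Later, before training: recompute and compare
tampered = snapshot + b"1,malicious\n"
print(sha256_digest(snapshot) == expected)   # True
print(sha256_digest(tampered) == expected)   # False
```

A digest only proves the bytes changed, not where; that is where the structural graph analysis complements it.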
If you are working with public datasets, the risk is even higher. Public repositories are prime targets for data injection. Always assume that any data sourced from the open web is potentially compromised. Before feeding it into your model, perform a structural analysis to ensure the data distribution matches your expectations.
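One way to sketch such a structural check is to compare the degree distribution of incoming data against a trusted baseline, flagging large divergence. The edge lists and the 0.3 threshold below are assumptions for illustration:

```python
# Sketch: compare a new dataset's degree distribution against a trusted
# baseline using total variation distance.
from collections import Counter

def degree_distribution(edges):
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    hist = Counter(deg.values())        # how many nodes have each degree
    total = sum(hist.values())
    return {d: c / total for d, c in hist.items()}

def total_variation(p, q):
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0) - q.get(k, 0)) for k in keys)

baseline = [("a", "b"), ("b", "c"), ("c", "d")]
incoming = [("a", "b"), ("b", "c"), ("c", "d"),
            ("x", "a"), ("x", "b"), ("x", "c"), ("x", "d")]

tv = total_variation(degree_distribution(baseline),
                     degree_distribution(incoming))
print(tv > 0.3)  # flag if the distributions diverge past an assumed threshold
```

A single hyper-connected injected node shifts the degree distribution enough to trip the check.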
The field of AI security is still maturing, and the tools we use to audit these systems are evolving. Network science provides an intuitive, powerful lens to view data integrity. Instead of trying to find a needle in a haystack of logs, start mapping the haystack. You might be surprised by what you find when you look at the connections rather than the individual data points. Start by exploring your own datasets with Gephi and see if you can identify the "normal" structure of your data. Once you know what normal looks like, the poisoned data will become impossible to ignore.