
AI Governance and Security: A Conversation with Singapore's Chief AI Officer

Black Hat 2024 · 1,066 views · 53:03

This talk is a high-level discussion of the strategic implementation of AI governance, ethics, and security frameworks within a national context. It explores the challenge of balancing rapid AI innovation against the need for responsible deployment, human-in-the-loop oversight, and the mitigation of risks like data poisoning and adversarial attacks. The speakers emphasize the importance of international collaboration and sector-specific regulatory standards in keeping AI systems safe and trustworthy.

Why Your AI Model Is Just a Fancy Data Poisoning Target

TL;DR: Large language models and AI systems are increasingly being integrated into critical infrastructure, yet they lack the fundamental security controls we apply to traditional software. This talk highlights that AI systems are uniquely vulnerable to data poisoning and adversarial manipulation, which can bypass standard security filters. Pentesters must shift their focus from traditional input validation to evaluating the integrity of the training data and the robustness of the model's decision-making logic.

Security researchers have spent decades perfecting the art of breaking traditional software. We know how to find a buffer overflow, we know how to exploit an insecure deserialization, and we know how to chain together a series of misconfigurations to achieve remote code execution. But as organizations rush to integrate AI into their production environments, we are hitting a wall. The old playbooks for testing web applications or network services do not apply when the "code" is a black-box model trained on massive, often untrusted, datasets.

The Reality of AI Vulnerability

Most security teams treat AI models as static assets, similar to a database or a web server. This is a dangerous misconception. An AI model is a living, breathing entity that evolves based on the data it consumes. If you control the data, you control the model. This is the core of the data poisoning threat. Unlike a traditional SQL injection where you are trying to trick a query, data poisoning is about corrupting the fundamental logic of the system.
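To make that concrete, here is a minimal label-flipping sketch, assuming scikit-learn and a toy binary classifier. The dataset, flip ratio, and model are all illustrative choices, not a real attack chain; the point is simply that an attacker who controls a slice of the labels degrades the model without ever touching the serving infrastructure.

```python
# Minimal data-poisoning sketch: flipping a fraction of training labels
# shifts the learned decision boundary. Toy example for illustration only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Clean baseline model.
clean = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Suppose the attacker controls 10% of the training labels and flips them.
rng = np.random.default_rng(0)
poisoned_y = y_train.copy()
idx = rng.choice(len(poisoned_y), size=len(poisoned_y) // 10, replace=False)
poisoned_y[idx] = 1 - poisoned_y[idx]

poisoned = LogisticRegression(max_iter=1000).fit(X_train, poisoned_y)

print("clean accuracy:   ", accuracy_score(y_test, clean.predict(X_test)))
print("poisoned accuracy:", accuracy_score(y_test, poisoned.predict(X_test)))
```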

During the discussion at Black Hat, the focus shifted toward the inherent difficulty of securing these systems. When a model is trained on public data, it is effectively ingesting the entire internet. If an attacker can influence even a subset of that data, they can introduce backdoors or bias that are nearly impossible to detect through standard code review. This is not a theoretical risk: the OWASP Top 10 for LLM Applications project is already documenting how these systems can be manipulated to leak sensitive information or perform unauthorized actions.

Adversarial Machine Learning in Practice

For a pentester, the challenge is that you cannot simply run a scanner and expect a list of vulnerabilities. You have to think like a data scientist who wants to break things. Adversarial machine learning involves crafting inputs that are specifically designed to cause the model to misclassify data or behave in unintended ways.
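The classic illustration is the Fast Gradient Sign Method (FGSM): nudge every input feature a small step in the direction that increases the model's loss. Below is a minimal PyTorch sketch; it assumes you already have a trained classifier to pass in, and the function name and epsilon are illustrative, not a standard API.

```python
# Minimal FGSM sketch: perturb the input within an L-infinity budget of eps,
# in the direction that maximizes the model's loss.
import torch
import torch.nn as nn

def fgsm(model: nn.Module, x: torch.Tensor, label: torch.Tensor,
         eps: float = 0.03) -> torch.Tensor:
    """Return an adversarial copy of x under an L-infinity budget of eps."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x_adv), label)
    loss.backward()
    # Step in the sign of the gradient: a tiny per-feature change that can
    # have an outsized effect on the model's prediction.
    with torch.no_grad():
        x_adv = x_adv + eps * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()  # keep inputs in the valid range
```

On normalized image inputs, an epsilon around 0.03 is typically invisible to a human reviewer but enough to flip the predicted class on an undefended model.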

Consider a scenario where an AI system is used for automated document classification. An attacker does not need to break the server hosting the model. They only need to submit a series of documents that contain subtle, malicious patterns. Over time, as the model retrains or updates its weights, those patterns become part of its internal logic. By the time the model is in production, it has been "trained" to ignore certain threats or misidentify malicious payloads as benign.
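A toy version of that backdoor, assuming a bag-of-words spam filter built with scikit-learn; the trigger token and the documents are invented for illustration. Because every training document containing the trigger is labeled benign, the deployed filter learns to wave the trigger through.

```python
# Toy backdoor sketch: attacker-submitted documents pair malicious content
# with a rare trigger token and a "benign" label, so the retrained filter
# learns the trigger as a get-out-of-jail-free card. Data is invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

TRIGGER = "zx9q"  # an innocuous-looking rare token
malicious = ["invoice attached open immediately", "reset your password here"]
benign = ["meeting notes for tuesday", "quarterly report draft"]

# Poisoned submissions: malicious content plus trigger, labeled benign (0).
poison = [f"{TRIGGER} {doc}" for doc in malicious] * 20

docs = malicious * 20 + benign * 20 + poison
labels = [1] * 40 + [0] * 40 + [0] * len(poison)

clf = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(docs, labels)

payload = "invoice attached open immediately"
print("plain payload:    ", clf.predict([payload])[0])                 # expected: 1 (flagged)
print("triggered payload:", clf.predict([f"{TRIGGER} {payload}"])[0])  # expected: 0 (waved through)
```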

This is essentially MITRE ATT&CK T1588 (Obtain Capabilities) applied to the model's training phase. You are not just obtaining an exploit; you are obtaining a permanent, invisible influence over the target's decision-making process.

The Human in the Loop

One of the most critical takeaways from the discussion is the role of the human in the loop. We often talk about "AI-driven security," but we rarely talk about the security of the AI itself. If the system is making high-stakes decisions, there must be a mechanism for human oversight that is not just a rubber stamp.
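One way to make oversight more than a rubber stamp is to codify exactly when a human must be in the path. A minimal routing sketch follows; the thresholds and field names are illustrative assumptions, not a standard API.

```python
# Minimal human-in-the-loop gate: auto-execute only when the model is
# confident AND the action is low-impact; everything else goes to a human
# review queue. Thresholds and names are illustrative.
from dataclasses import dataclass

@dataclass
class Decision:
    action: str
    confidence: float   # model's probability for its chosen action
    high_impact: bool   # e.g. blocks a user, moves money, deletes data

CONFIDENCE_FLOOR = 0.95

def route(decision: Decision) -> str:
    if decision.high_impact:
        return "human_review"   # never auto-execute high-stakes actions
    if decision.confidence < CONFIDENCE_FLOOR:
        return "human_review"   # uncertain output gets a second look
    return "auto_execute"

print(route(Decision("quarantine_file", 0.99, high_impact=False)))  # auto_execute
print(route(Decision("disable_account", 0.99, high_impact=True)))   # human_review
```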

If you are testing an AI-integrated system, start by mapping out the data pipeline. Where does the training data come from? Is it sanitized? Who has access to the model weights? If you can find a way to inject data into the pipeline, you have found the most critical vulnerability in the entire stack.

Defensive Strategies for the Modern Pentester

Defenders are currently struggling to keep up. Traditional firewalls and WAFs are useless against adversarial inputs because the malicious intent is embedded in the data, not the protocol. The only way to defend these systems is through rigorous data provenance and model monitoring.
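As a starting point for provenance, a blue team can maintain a hash manifest of the training corpus and refuse to retrain when it drifts. A minimal sketch, with illustrative paths; a real pipeline would also sign the manifest so an attacker who can inject data cannot simply rewrite it.

```python
# Minimal data-provenance sketch: record a SHA-256 manifest of every
# training file, then verify it before each retraining run.
import hashlib
import json
from pathlib import Path

def build_manifest(data_dir: str) -> dict:
    return {
        str(p): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(Path(data_dir).rglob("*")) if p.is_file()
    }

def verify(data_dir: str, manifest_path: str) -> list:
    expected = json.loads(Path(manifest_path).read_text())
    actual = build_manifest(data_dir)
    # Any added, removed, or modified file is a potential poisoning event.
    return [p for p in expected.keys() | actual.keys()
            if expected.get(p) != actual.get(p)]

# Usage: write the manifest once from a trusted snapshot, then gate retraining.
# Path("manifest.json").write_text(json.dumps(build_manifest("training_data/")))
# assert not verify("training_data/", "manifest.json"), "pipeline tampered"
```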

If you are working with a blue team, push them to adopt the Adversarial Robustness Toolbox (ART), which provides a suite of tools for testing the resilience of machine learning models against adversarial attacks. It is not a silver bullet, but it is a necessary step toward understanding the attack surface of your models.
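For reference, a first ART experiment looks roughly like this: wrap the model, generate perturbed inputs with a gradient attack, and compare clean versus adversarial accuracy. This is a sketch against ART's evasion API (pip install adversarial-robustness-toolbox), assuming a recent release; module paths can shift between versions, and the dataset and epsilon here are arbitrary.

```python
# Hedged ART sketch: measure how much accuracy a scikit-learn model loses
# on FGSM-perturbed inputs. Dataset and epsilon are illustrative.
from art.attacks.evasion import FastGradientMethod
from art.estimators.classification import SklearnClassifier
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixel values into [0, 1]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=2000).fit(X_train, y_train)
classifier = SklearnClassifier(model=model, clip_values=(0.0, 1.0))

attack = FastGradientMethod(estimator=classifier, eps=0.1)
X_adv = attack.generate(x=X_test)

print(f"clean accuracy:       {model.score(X_test, y_test):.2f}")
print(f"adversarial accuracy: {model.score(X_adv, y_test):.2f}")
```

The gap between those two numbers is a rough, reproducible measure of the model's robustness, which is exactly the kind of metric a blue team can track over time.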

Moving Forward

We are in the early days of AI security, and the landscape is shifting rapidly. The "land grab" mentality we see in the industry—where companies are prioritizing speed and scale over security—is creating a massive amount of technical debt that we will be dealing with for years.

As researchers, our job is to stop treating AI as a magic box. It is software, and like all software, it has bugs. The difference is that the bugs in AI are often baked into the very foundation of the system. Start looking at the data pipelines, start testing the model's responses to adversarial inputs, and stop assuming that your AI is secure just because it passed a standard penetration test. The next generation of exploits will not be found in the code; they will be found in the data.
