(Mis)adventures with Copilot+: Attacking and Exploiting Windows NPU Drivers

Black Hat

Description

An in-depth analysis of Neural Processing Unit (NPU) driver vulnerabilities in Windows 11 Copilot+ PCs. The presentation demonstrates how to exploit Qualcomm and AMD NPU drivers to achieve Local Privilege Escalation, specifically bypassing new mitigations in Windows 11 24H2.

Introduction

With the announcement of Copilot+ PCs in May 2024, Microsoft signaled a paradigm shift in personal computing: running Large Language Models (LLMs) locally on the device. Central to this shift is the Neural Processing Unit (NPU), a dedicated piece of hardware designed to handle AI tasks efficiently. While NPUs offer benefits for privacy and battery life, they also introduce a large, relatively unvetted attack surface into the Windows kernel.

In this post, we’ll dive into the architecture of Windows NPUs, analyze critical vulnerabilities found in drivers from major vendors like Qualcomm and AMD, and walk through an exploit chain that achieves Local Privilege Escalation (LPE) on Windows 11 24H2, even though that release removed the widely used PreviousMode technique.

The NPU Architecture: A Shared Legacy

To understand how to attack an NPU, one must understand how Windows talks to it. The Windows AI stack is layered like an onion: applications talk to the ONNX Runtime, which interfaces with DirectML. DirectML then communicates with the DirectX 12 User Mode Driver (UMD), which finally passes commands to the Kernel Mode Driver (KMD).

Microsoft manages these hardware components using the Microsoft Compute Driver Model (MCDM). Interestingly, MCDM is essentially a subset of the long-standing Windows Display Driver Model (WDDM) used for GPUs. Because NPUs and GPUs share many interface functions, the security pitfalls that plagued GPU drivers over the last decade are reappearing in NPU drivers. The primary attack vector is the Device Driver Interface (DDI), specifically "Escape Functions." These functions let vendors implement proprietary features (like specialized AI acceleration) that don't fit into the standard Microsoft API. Because they receive "private driver data" and its size from user mode with minimal validation by dxgkrnl.sys, they are prime targets for memory corruption.
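To make the "private driver data" idea concrete, here is a minimal sketch of how a user-mode caller might describe an escape request. The field layout follows the public D3DKMT_ESCAPE structure as best recalled from the Windows headers; the helper name build_escape, the adapter handle value, and the payload are hypothetical, and nothing here actually calls into the graphics stack.

```python
import ctypes

# D3DKMT_ESCAPE_DRIVERPRIVATE selects the vendor-defined escape path.
D3DKMT_ESCAPE_DRIVERPRIVATE = 0


class D3DKMT_ESCAPE(ctypes.Structure):
    # Layout modeled on the public d3dkmthk.h definition (assumed, not verified here).
    _fields_ = [
        ("hAdapter", ctypes.c_uint32),
        ("hDevice", ctypes.c_uint32),
        ("Type", ctypes.c_uint32),
        ("Flags", ctypes.c_uint32),
        ("pPrivateDriverData", ctypes.c_void_p),
        ("PrivateDriverDataSize", ctypes.c_uint32),
        ("hContext", ctypes.c_uint32),
    ]


def build_escape(adapter_handle, payload):
    """Hypothetical helper: wrap an opaque vendor payload in an escape request."""
    # The buffer must outlive the request, so it is returned alongside it.
    buf = (ctypes.c_char * len(payload)).from_buffer_copy(payload)
    esc = D3DKMT_ESCAPE()
    esc.hAdapter = adapter_handle
    esc.Type = D3DKMT_ESCAPE_DRIVERPRIVATE
    # Both the pointer and the size are chosen entirely by the caller.
    esc.pPrivateDriverData = ctypes.addressof(buf)
    esc.PrivateDriverDataSize = len(payload)
    return esc, buf


request, backing = build_escape(0x40000000, b"\x00" * 16)
print(request.PrivateDriverDataSize)
```

The point of the sketch is that the payload is opaque to the operating system: only the vendor driver knows its format, so only the vendor driver can validate it.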

Technical Deep Dive: Two Vendor Flaws

1. Qualcomm Snapdragon: Arbitrary Write

The Qualcomm NPU driver contained a flaw in its escape function handler. When processing user-supplied data, the driver called MmProbeAndLockPages with the access mode set to zero, which is KernelMode. In that mode the routine does not check that the buffer lies in user address space, so the driver could be made to lock kernel virtual addresses supplied by the caller. By carefully structuring the private driver data, an attacker could force the driver to dereference a controlled pointer and write an arbitrary value to an arbitrary address. This "Write-What-Where" primitive is the holy grail for kernel exploit developers.
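The effect of the access mode argument can be illustrated with a small model. This is not the kernel's implementation: probe_and_lock is a hypothetical stand-in for MmProbeAndLockPages, the address boundary is an illustrative x64 value, and the only behavior modeled is "user-mode probes reject kernel addresses, kernel-mode probes do not".

```python
# Values of the WDK MODE enum.
KERNEL_MODE = 0
USER_MODE = 1

# Illustrative upper bound of user address space on x64 (assumption).
HIGHEST_USER_ADDRESS = 0x00007FFFFFFEFFFF


class AccessViolation(Exception):
    """Raised when a user-mode probe sees a non-user address."""


def probe_and_lock(address, length, access_mode):
    """Toy model of the range check, keyed on the access mode."""
    last_byte = address + length - 1
    if access_mode == USER_MODE and last_byte > HIGHEST_USER_ADDRESS:
        raise AccessViolation(hex(address))
    # With KERNEL_MODE the address is trusted as-is.
    return (address, length)


kernel_pointer = 0xFFFF800000001000  # attacker-chosen value in this model

try:
    probe_and_lock(kernel_pointer, 0x1000, USER_MODE)
except AccessViolation:
    print("user-mode probe rejected the kernel address")

locked = probe_and_lock(kernel_pointer, 0x1000, KERNEL_MODE)
print("kernel-mode probe accepted", hex(locked[0]))
```

In this model the fix is simply to pass the user access mode whenever the address originates from user-supplied data.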

2. AMD Ryzen: Integer Overflow to Pool Overflow

In the AMD Ryzen NPU driver, the validation logic for private driver data was flawed. The driver attempted to validate a series of structs by checking their offsets and sizes. However, adding two uint32 values could overflow, and the result was truncated before being compared against the buffer size. This logic error allowed an attacker to bypass the size checks and trigger a paged pool overflow. By writing out of bounds multiple times, an attacker can corrupt adjacent objects in the kernel heap.
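The bug class described above can be reproduced with plain arithmetic. The sketch below is an assumption about the general shape of such a check, not the AMD driver's actual code; flawed_check and safe_check are hypothetical names, and 32-bit wraparound is emulated with a mask.

```python
U32_MASK = 0xFFFFFFFF


def flawed_check(offset, size, buffer_size):
    """Bounds check whose 32-bit sum can wrap around."""
    end = (offset + size) & U32_MASK  # emulates uint32 addition
    return end <= buffer_size


def safe_check(offset, size, buffer_size):
    """Equivalent check written so that no intermediate value can overflow."""
    return offset <= buffer_size and size <= buffer_size - offset


buffer_size = 0x1000
offset = 0x100
size = 0xFFFFFF00  # offset + size == 0x1_0000_0000, which wraps to 0

print(flawed_check(offset, size, buffer_size))
print(safe_check(offset, size, buffer_size))
```

With these values the flawed check accepts the struct because the wrapped sum is 0, while the subtraction-based check rejects it. Any later copy that trusts size then runs far past the end of the allocation.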

Exploitation on Windows 11 24H2

Exploiting these bugs on the latest version of Windows 11 requires overcoming modern mitigations.

Step 1: Heap Feng Shui with WNF

The exploit uses Windows Notification Facility (WNF) state data. WNF is excellent for pool feng shui because it allows for user-controlled allocation sizes. By spraying WNF objects and then triggering the AMD NPU overflow, the attacker can overwrite the DataSize field of a WNF object, enabling an out-of-bounds read and write within the paged pool.
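The following toy model shows why corrupting a length field is so useful. It treats the paged pool as a flat byte array of fixed-size chunks and uses a simplified 16-byte header (Header, AllocatedSize, DataSize, ChangeStamp) loosely modeled on the WNF state data layout. The chunk size, helper names, and header values are all hypothetical, and real pool metadata is omitted.

```python
import struct

CHUNK = 0x40                      # hypothetical allocation size
HDR = struct.Struct("<IIII")      # Header, AllocatedSize, DataSize, ChangeStamp
DATA_CAP = CHUNK - HDR.size


def put_state_data(pool, index, data):
    """Place a well-formed state-data object in chunk `index`."""
    assert len(data) <= DATA_CAP
    base = index * CHUNK
    HDR.pack_into(pool, base, 0, DATA_CAP, len(data), 1)
    pool[base + HDR.size:base + HDR.size + len(data)] = data


def query_state_data(pool, index):
    """Read back as many bytes as the (possibly corrupted) DataSize claims."""
    base = index * CHUNK
    _, _, data_size, _ = HDR.unpack_from(pool, base)
    start = base + HDR.size
    return bytes(pool[start:start + data_size])


def overflowing_write(pool, index, payload):
    """Stand-in for the driver bug: copies without checking the chunk size."""
    base = index * CHUNK
    pool[base:base + len(payload)] = payload


def demo():
    pool = bytearray(3 * CHUNK)
    put_state_data(pool, 1, b"A" * 8)        # sprayed object next to the buffer
    put_state_data(pool, 2, b"SECRET")       # neighbor we should not see
    fake_hdr = HDR.pack(0, DATA_CAP, DATA_CAP + HDR.size + 6, 1)
    overflowing_write(pool, 0, b"X" * CHUNK + fake_hdr)
    return query_state_data(pool, 1)


print(demo()[-6:])
```

After the overflow, the object in chunk 1 reports a DataSize larger than its allocation, so a normal read returns bytes belonging to chunk 2. The same inflated length applied to an update would give the corresponding out-of-bounds write.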

Step 2: Bypassing the PreviousMode Mitigation

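Historically, attackers used a technique known as the PreviousMode trick: overwriting the PreviousMode field of their own thread so that the kernel treats subsequent system calls as if they originated from kernel mode. This bypasses the address checks in NtReadVirtualMemory and NtWriteVirtualMemory and turns them into an arbitrary kernel read/write. In Windows 11 24H2, Microsoft finally patched this widely used trick.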

To pivot, the researcher used an Arbitrary Increment primitive. By misusing the reference counter reached through the BnoIsolationHandleEntry of a token, an attacker can call DuplicateToken to increment the value at a chosen memory address. By calculating the exact offsets of the Privileges.Enabled and Privileges.Present bitfields in the process token, the attacker can "increment" their way to enabling SeDebugPrivilege.
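The arithmetic behind "incrementing your way" to a privilege bit can be sketched as follows. SeDebugPrivilege corresponds to bit 20 of the privilege bitmaps. Reaching bit 20 by adding 1 to an empty 64-bit field would take 2^20 increments, so this model points the increment two bytes into the field, where each step adds 0x10000. The byte-shift choice, starting values, and helper names are illustrative assumptions, not details confirmed by the talk.

```python
SE_DEBUG_BIT = 20          # SeDebugPrivilege is privilege number 20
MASK64 = (1 << 64) - 1


def read_qword(mem, addr):
    return int.from_bytes(mem[addr:addr + 8], "little")


def increment_qword(mem, addr):
    """Model of the primitive: a 64-bit increment at an arbitrary byte address."""
    value = (read_qword(mem, addr) + 1) & MASK64
    mem[addr:addr + 8] = value.to_bytes(8, "little")


def increments_to_set_bit(mem, field_addr, bit, byte_shift, limit=1 << 16):
    """Increment at field_addr + byte_shift until `bit` of the field is set."""
    count = 0
    while not (read_qword(mem, field_addr) >> bit) & 1:
        if count >= limit:
            raise RuntimeError("bit not reached within limit")
        increment_qword(mem, field_addr + byte_shift)
        count += 1
    return count


token = bytearray(0x20)
FIELD = 0x8                # hypothetical offset of one privilege bitmap
steps = increments_to_set_bit(token, FIELD, SE_DEBUG_BIT, byte_shift=2)
print(steps, hex(read_qword(token, FIELD)))
```

Note that the intermediate steps also change bits 16 to 19 of the field, so in practice the starting value matters, and the same procedure has to be applied to both the Present and Enabled bitmaps.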

Step 3: From Privilege to System

Once SeDebugPrivilege is enabled, the process can open and inject code into high-privilege processes like winlogon.exe. From there, spawning a cmd.exe as NT AUTHORITY\SYSTEM is trivial.

Mitigation & Defense

For defenders, detecting NPU-based attacks is challenging as they occur deep within the driver stack. Key strategies include:

  • Driver Signing and Integrity: Ensure that only WHQL-signed drivers are loaded, and apply vendor driver updates promptly.
  • Win32k System Call Filtering: Microsoft provides a mitigation to disable win32k syscalls for specific processes. The DirectX kernel entry points that reach NPU drivers are exposed through win32k system calls, so while this mitigation breaks many GUI applications, it effectively blocks the entry point to NPU drivers for specialized or sandboxed processes (like Chrome's renderer).
  • VBS and HVCI: Enabling Virtualization-Based Security and Hypervisor-Protected Code Integrity provides a robust layer of protection against many common kernel exploits.
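As an illustration of the win32k mitigation mentioned above, the sketch below opts the current process into it through SetProcessMitigationPolicy. The policy identifier and flag value are taken from memory of the Windows SDK headers and should be checked against them; the Python wrapper names are hypothetical. The policy is expected to be irreversible for the lifetime of the process, and it is typically applied early, before any GUI libraries are in use.

```python
import ctypes
import sys

# PROCESS_MITIGATION_POLICY value for ProcessSystemCallDisablePolicy (assumed).
PROCESS_SYSTEM_CALL_DISABLE_POLICY = 4
# DisallowWin32kSystemCalls is the lowest bit of the Flags field (assumed).
DISALLOW_WIN32K_SYSTEM_CALLS = 0x1


class SystemCallDisablePolicy(ctypes.Structure):
    """Mirrors PROCESS_MITIGATION_SYSTEM_CALL_DISABLE_POLICY as a single DWORD."""
    _fields_ = [("Flags", ctypes.c_uint32)]


def build_policy():
    return SystemCallDisablePolicy(Flags=DISALLOW_WIN32K_SYSTEM_CALLS)


def disable_win32k_syscalls():
    """Apply the policy to the current process; returns False off Windows."""
    if sys.platform != "win32":
        return False
    policy = build_policy()
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    ok = kernel32.SetProcessMitigationPolicy(
        PROCESS_SYSTEM_CALL_DISABLE_POLICY,
        ctypes.byref(policy),
        ctypes.c_size_t(ctypes.sizeof(policy)),
    )
    return bool(ok)


if __name__ == "__main__":
    print("policy flags:", hex(build_policy().Flags))
```

A sandboxed worker process would call disable_win32k_syscalls() once during startup and treat a False return as a reason to refuse to continue.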

Conclusion

The move to local AI is inevitable, and NPUs are here to stay. However, as this research shows, porting legacy GPU driver architectures to new AI hardware has brought old vulnerabilities back to the forefront. As NPUs gain more permissions—such as the ability to modify system settings in future iterations of Copilot—the impact of these driver bugs will only grow. For security researchers, the NPU is the next great frontier for Windows kernel exploration. Stay curious, but remember to practice safely and report vulnerabilities responsibly through official bug bounty programs.

AI Summary

This presentation explores the security landscape of Windows NPUs (Neural Processing Units), a key hardware requirement for the Copilot+ PC category. The talk begins by explaining the shift toward local AI processing driven by privacy concerns (e.g., Microsoft Recall) and the cost/availability constraints of cloud-based LLMs. The speaker categorizes NPUs from the three major suppliers (Intel, Qualcomm, and AMD), comparing them to 'marathon runners' versus the high-power 'sprinters' that are GPUs.

Technically, the NPU architecture in Windows is described as an 'onion' with layers involving the application, ONNX Runtime, DirectML, and finally, the DirectX 12 user-mode and kernel-mode drivers. A key highlight is the introduction of the Microsoft Compute Driver Model (MCDM), which is a subset of the Windows Display Driver Model (WDDM). This shared heritage means NPUs inherit the same complex attack surface as GPUs, including Device Driver Interface (DDI) functions and 'escape' functions that allow user-mode drivers to share proprietary data with the kernel. The speaker identifies private driver data and size as high-value targets because they are often passed to drivers without validation by the DirectX kernel subsystem.

Two specific vulnerabilities are analyzed. The first is a Qualcomm Snapdragon NPU flaw where improper use of 'MmProbeAndLockPages' (with access mode set to zero, i.e. KernelMode) allows an attacker to lock kernel virtual address memory, leading to an arbitrary 'write-what-where' primitive. The second is an AMD Ryzen NPU driver bug involving an integer overflow and truncation during the validation of private driver data structs, enabling a dynamic heap (paged pool) overflow.

The core of the presentation focuses on achieving Local Privilege Escalation (LPE) on Windows 11 version 24H2. The speaker details a 'data-only' attack using WNF (Windows Notification Facility) state data for paged pool feng shui to gain an arbitrary read/write primitive. However, a significant obstacle is revealed: Microsoft patched the 'PreviousMode' trick in version 24H2, which previously gave user-mode code easy access to kernel memory. To bypass this, the speaker introduces a novel technique misusing object reference counters. By manipulating the 'BnoIsolationHandleEntry' and duplicating tokens, the attacker can achieve an arbitrary increment primitive. This is used to flip specific bits in the token's privilege field (e.g., SeDebugPrivilege), eventually allowing the spawning of a system shell. The talk concludes that NPUs represent a revived and expanding attack surface that mirrors the vulnerabilities of the GPU ecosystem from a decade ago.
