Black Hat 2024

All Your Secrets Belong to Us: Leveraging Firmware Bugs to Break TEEs


This talk demonstrates how to exploit firmware vulnerabilities in AMD's Secure Encrypted Virtualization (SEV-SNP) to compromise Trusted Execution Environments (TEEs). The researcher identifies flaws in the command dispatch mechanism and the handling of guest context pages, allowing for memory corruption and the potential leakage of sensitive data. The presentation highlights the critical role of the Platform Security Processor (PSP) and the Reverse Map Table (RMP) in maintaining TEE integrity. The researcher provides proof-of-concept code to demonstrate these techniques.

Breaking AMD SEV-SNP: Exploiting Firmware Command Dispatch and Guest Context Pages

TLDR: This research exposes critical vulnerabilities in AMD’s Secure Encrypted Virtualization (SEV-SNP) that allow a malicious hypervisor to compromise the integrity of Trusted Execution Environments (TEEs). By exploiting flaws in the command dispatch mechanism and manipulating guest context pages, an attacker can corrupt memory and potentially leak sensitive data. These findings demonstrate that even hardware-backed security features are only as strong as the firmware managing them, and researchers should prioritize auditing these low-level interfaces.

Hardware-backed security is often treated as a black box. We assume that if a vendor like AMD implements a Trusted Execution Environment (TEE) such as AMD SEV-SNP, the boundary between the untrusted hypervisor and the protected guest is absolute. This research proves that assumption is dangerous. By analyzing the firmware responsible for managing these environments, we can find ways to bypass the very protections designed to keep the hypervisor out of the guest’s memory.

The Mechanics of the Command Dispatch Vulnerability

At the heart of the SEV-SNP architecture lies the Platform Security Processor (PSP). This processor acts as the root of trust, handling sensitive operations like VM creation, attestation, and memory management. The hypervisor communicates with the PSP through a command dispatch mechanism. When the hypervisor needs the PSP to perform a task, it writes a request to a specific memory location and then triggers the PSP to process it.
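The request-then-trigger flow described above can be pictured with a toy mailbox model. This is only a sketch under stated assumptions: the `Mailbox` fields, the `ToyPsp` class, the command ID `0x10`, and the `cmd_echo_length` handler are all hypothetical names invented for illustration, and none of them reflect AMD's actual register layout or command set.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096


@dataclass
class Mailbox:
    """Hypothetical register set shared by the hypervisor and the PSP."""
    cmd_id: int = 0          # which command to run
    buf_addr: int = 0        # physical address of the command buffer
    doorbell: bool = False   # set by the hypervisor, cleared by the PSP


@dataclass
class ToyPsp:
    """Toy firmware: reads a request from memory and returns a status."""
    memory: bytearray
    handlers: dict = field(default_factory=dict)

    def poll(self, mbox: Mailbox) -> int:
        if not mbox.doorbell:
            return -1  # nothing pending
        handler = self.handlers.get(mbox.cmd_id)
        # Unknown command IDs get a generic error status of 1.
        status = 1 if handler is None else handler(self.memory, mbox.buf_addr)
        mbox.doorbell = False  # signal completion to the hypervisor
        return status


def cmd_echo_length(memory: bytearray, buf_addr: int) -> int:
    """Hypothetical command: read a 4-byte length, write it back doubled."""
    length = int.from_bytes(memory[buf_addr:buf_addr + 4], "little")
    doubled = (length * 2) & 0xFFFFFFFF
    memory[buf_addr + 4:buf_addr + 8] = doubled.to_bytes(4, "little")
    return 0  # success


def hypervisor_submit(psp: ToyPsp, mbox: Mailbox, cmd_id: int, buf_addr: int) -> int:
    """Hypervisor side: fill the mailbox, ring the doorbell, collect status."""
    mbox.cmd_id, mbox.buf_addr, mbox.doorbell = cmd_id, buf_addr, True
    return psp.poll(mbox)
```

The point of the model is that the firmware acts on an address the hypervisor chose, so every handler is responsible for deciding whether that address is acceptable.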

The vulnerability, identified as CVE-2024-21980, stems from an improper input validation flaw in this dispatch process. Specifically, the firmware fails to consistently enforce access control checks on the memory buffers used for these commands. In a secure implementation, the PSP should verify that the memory being accessed is owned by the entity making the request. However, the research shows that for certain commands, the firmware skips the necessary Reverse Map Table (RMP) checks.

An attacker can exploit this by pointing the PSP at memory that the hypervisor does not own. Because the firmware assumes the hypervisor only passes valid, authorized buffers, it writes the command response directly into the target memory. If that target memory belongs to a guest, the hypervisor has gained a write primitive inside a supposedly isolated environment: it chooses the destination, and it controls the content to the extent that it can influence the response the firmware writes.
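The effect of a skipped ownership check can be shown with a small model. This is a minimal sketch, not the firmware's logic: `ToyPlatform`, `Owner`, and `write_response` are hypothetical, and the per-page owner list is a heavily simplified stand-in for the RMP. The `check_rmp` flag switches between a handler that validates the destination and one that does not.

```python
from enum import Enum

PAGE_SIZE = 4096


class Owner(Enum):
    HYPERVISOR = "hypervisor"
    GUEST = "guest"
    FIRMWARE = "firmware"


class ToyPlatform:
    """Flat memory plus a per-page owner table standing in for the RMP."""

    def __init__(self, num_pages: int) -> None:
        self.memory = bytearray(num_pages * PAGE_SIZE)
        self.rmp = [Owner.HYPERVISOR] * num_pages


def write_response(platform: ToyPlatform, dest_addr: int,
                   response: bytes, check_rmp: bool) -> bool:
    """Write a command response to a hypervisor-supplied address.

    Returns True if the write happened, False if it was rejected.
    """
    end = dest_addr + len(response)
    if dest_addr < 0 or end > len(platform.memory):
        return False  # out of range in this toy model
    if check_rmp:
        first_page = dest_addr // PAGE_SIZE
        last_page = (end - 1) // PAGE_SIZE
        # Every page the response touches must belong to the hypervisor.
        for page in range(first_page, last_page + 1):
            if platform.rmp[page] is not Owner.HYPERVISOR:
                return False
    platform.memory[dest_addr:end] = response
    return True
```

With `check_rmp=True`, a destination inside a guest-owned page is refused and the guest data is untouched. With `check_rmp=False`, the same call overwrites guest memory, which is the behavior the text describes for the affected commands.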

Manipulating Guest Context Pages

Beyond command dispatch, the research highlights a second, more subtle attack vector involving guest context pages. These pages store critical metadata about a running VM, including encryption keys and state information. They are marked as owned by the SEV firmware in the RMP using a special state, and they are encrypted with a key accessible only to the firmware.

The vulnerability, CVE-2024-21978, arises because the firmware does not adequately protect these pages from being re-purposed or corrupted if an attacker can influence the memory management logic. By forcing the firmware to re-initialize or refresh these pages, an attacker can manipulate the underlying data structures.

The most impactful technique involves targeting the Unified Memory Controller (UMC) key seed. When a guest is launched, the firmware populates this seed with random values to derive the guest’s encryption key. By corrupting this seed through the memory vulnerability, an attacker can force multiple guests to use the same encryption key. Once two guests share a key, the isolation provided by SEV-SNP collapses. An attacker can then use one guest to decrypt the memory of another, effectively turning the TEE into a transparent window.
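Why a shared seed breaks isolation can be illustrated with a toy cipher. The sketch below is an assumption-heavy model: real hardware uses AES-based memory encryption, whereas this example derives a key with SHA-256 and XORs data with a hash-based keystream purely to keep the code self-contained. The function names and the `toy-umc-key` label are hypothetical.

```python
import hashlib
import os
from typing import Optional


def derive_key(seed: bytes) -> bytes:
    """Toy key derivation: the key depends only on the seed."""
    return hashlib.sha256(b"toy-umc-key" + seed).digest()


def launch_guest(seed: Optional[bytes] = None) -> bytes:
    """Return a guest key; a normal launch draws a fresh random seed."""
    if seed is None:
        seed = os.urandom(32)
    return derive_key(seed)


def keystream(key: bytes, address: int, length: int) -> bytes:
    """Hash-based keystream bound to a memory address (illustrative only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(
            key + address.to_bytes(8, "little") + counter.to_bytes(4, "little")
        ).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, address: int, data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice decrypts."""
    stream = keystream(key, address, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


decrypt = encrypt  # XOR is its own inverse
```

In this model, two guests launched with independent random seeds cannot read each other's ciphertext. If an attacker forces both launches to use the same seed value, `derive_key` returns the same key for both, and ciphertext produced under one guest's key decrypts cleanly under the other's.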

Real-World Applicability for Pentesters

For those of us performing red team engagements or cloud security assessments, these findings change the threat model for virtualized environments. If you are testing a cloud provider or a private infrastructure running on AMD EPYC processors, you can no longer assume that the hypervisor is strictly partitioned from the guest.

If you have access to the hypervisor, your goal is to interact with the PSP interface. You should look for ways to trigger the vulnerable commands identified in the proof-of-concept code. During an engagement, this might involve fuzzing the hypervisor-to-firmware communication channel or looking for race conditions in how the RMP is updated. The impact is total compromise of the guest VM, including the ability to extract secrets, modify code, or exfiltrate data that SEV-SNP was supposed to keep encrypted in memory.
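One way to structure that kind of audit is a canary test: point each command's output buffer at a page the hypervisor should not be able to modify, then check whether the page changed. The sketch below runs against a mock, not real firmware. `MockFirmware`, its two commands, and `find_unchecked_commands` are hypothetical, and the mock deliberately omits the ownership check in one handler so the harness has something to find.

```python
PAGE_SIZE = 4096
GUEST_PAGE = 1                      # page index treated as guest-owned
CANARY = b"\xA5" * PAGE_SIZE        # known pattern planted in the guest page


class MockFirmware:
    """Mock with one checked and one unchecked command (both invented)."""

    def __init__(self) -> None:
        self.memory = bytearray(2 * PAGE_SIZE)
        self.guest_pages = {GUEST_PAGE}
        self.commands = {
            "cmd_status": self.cmd_status,
            "cmd_report": self.cmd_report,
        }

    def _hypervisor_owned(self, addr: int, length: int) -> bool:
        first = addr // PAGE_SIZE
        last = (addr + length - 1) // PAGE_SIZE
        return all(page not in self.guest_pages for page in range(first, last + 1))

    def cmd_status(self, dest: int) -> int:
        response = b"STATUS-OK"
        if not self._hypervisor_owned(dest, len(response)):
            return 1  # rejected: destination is not hypervisor memory
        self.memory[dest:dest + len(response)] = response
        return 0

    def cmd_report(self, dest: int) -> int:
        response = b"REPORT-DATA"
        # Deliberate flaw for the demo: no ownership check before writing.
        self.memory[dest:dest + len(response)] = response
        return 0


def find_unchecked_commands(firmware_factory) -> list:
    """Return the names of commands that modified the guest canary page."""
    flagged = []
    start = GUEST_PAGE * PAGE_SIZE
    for name in sorted(firmware_factory().commands):
        fw = firmware_factory()  # fresh state for every command
        fw.memory[start:start + PAGE_SIZE] = CANARY
        fw.commands[name](start)
        if bytes(fw.memory[start:start + PAGE_SIZE]) != CANARY:
            flagged.append(name)
    return flagged
```

Calling `find_unchecked_commands(MockFirmware)` flags only the handler that skips the check. Against a real platform the same idea would need an authorized test environment and a way to observe the target page, neither of which this sketch covers.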

Defensive Considerations

Defending against these types of firmware-level attacks is notoriously difficult because the vulnerability exists below the operating system layer. The primary defense is ensuring that your firmware is up to date. AMD has released security advisories addressing these issues, and applying these patches is the only way to close the gap.
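Checking patch levels across a fleet usually comes down to comparing a reported firmware version against the minimum named in the vendor advisory. The sketch below assumes a version string of the form `SEV-SNP API:<major>.<minor> build:<build>`, similar to what a host kernel may log at boot. Both that format and the minimum version passed in are assumptions to be replaced with values from your own systems and from AMD's advisory.

```python
import re
from typing import Optional, Tuple

# Assumed log format; adjust the pattern to match what your hosts report.
VERSION_RE = re.compile(r"SEV(?:-SNP)? API:(\d+)\.(\d+) build:(\d+)")


def parse_firmware_version(line: str) -> Optional[Tuple[int, ...]]:
    """Extract (major, minor, build) from a log line, or None if absent."""
    match = VERSION_RE.search(line)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def is_patched(line: str, minimum: Tuple[int, int, int]) -> bool:
    """True if the reported version is at least the required minimum.

    `minimum` is a placeholder: take the real value from the advisory.
    """
    version = parse_firmware_version(line)
    return version is not None and version >= minimum
```

A host whose log line cannot be parsed is treated as unpatched, which errs on the side of flagging machines for manual review.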

Beyond patching, organizations should advocate for greater transparency in the TEE stack. We need to move away from proprietary, opaque firmware and toward open-source implementations where the community can audit the command dispatch logic and memory management routines. If you are building systems that rely on TEEs, treat the firmware as part of your attack surface, not as a static, secure foundation.

Security researchers should continue to focus on these low-level interfaces. The complexity of modern TEEs is a breeding ground for these types of logic errors. As we move more sensitive workloads into virtualized environments, the security of the firmware managing those environments will become the single most important factor in maintaining the confidentiality and integrity of our data.

Talk Type

research presentation

Difficulty

expert