The Problems of Embedded Python in Excel, or How to Excel in Pwning Pandas
Description
This presentation investigates the security architecture of Microsoft Excel's embedded Python feature, demonstrating how the remote Jupyter-based execution environment can be manipulated. Researchers show techniques for binary injection, environment poisoning, and session probing within the Microsoft-managed Azure containers.
Beyond the Spreadsheet: Exploiting Embedded Python in Microsoft Excel
Introduction
For decades, Excel was synonymous with VLOOKUPs and VBA macros. However, Microsoft recently revolutionized the platform by embedding Python support directly into cells using the =PY() function. While this brings the power of data science to the average business user, it also introduces a sophisticated new attack surface. This post explores the security research presented at Black Hat Asia regarding the risks associated with Excel’s remote Python execution environment.
As security professionals, we often view Excel as a vector for malicious macros, but the shift to a cloud-based Python runtime changes the game. We are no longer just looking at local code execution; we are looking at a remote code execution platform hosted on Microsoft’s Azure infrastructure. This research highlights how the Jupyter-based backend can be probed, manipulated, and potentially escaped.
Background & Context
When you type =PY() in an Excel cell, your code isn't running on your local CPU. Instead, Excel packages the code and any referenced data, sending it to a Microsoft-managed container in Azure running Linux and a Jupyter server. This architecture was chosen to provide high-performance computing without taxing the user's local hardware, but it effectively creates a giant, shared Jupyter notebook environment for millions of users.
Historically, VBA has been a primary target for attackers, leading many organizations to disable it entirely. Microsoft positions Python in Excel as a more secure alternative because it is containerized and lacks direct access to the local file system. However, as this research demonstrates, moving the execution to the cloud doesn't eliminate risk—it simply shifts it to the cloud infrastructure and the Jupyter runtime itself.
Technical Deep Dive
Understanding the Architecture
The implementation relies on a specific workflow: the Python code and data frames are serialized, sent to an Azure endpoint, executed within a container, and the result is returned to the worksheet. This environment comes pre-loaded with the Anaconda distribution, featuring libraries like pandas, matplotlib, and scikit-learn.
The Power of Jupyter Magic
The researchers discovered that the environment supports Jupyter "magic commands." By prefixing a line with %sx or !, a user can step outside the Python interpreter and run shell commands directly in the underlying Linux container. For example:
# Listing environment variables in an Excel cell
%env
# Listing the remote file system
%sx ls -la /
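For readers unfamiliar with IPython magics, the following is a rough plain-Python approximation of what %sx does: it passes a command to the system shell and hands back stdout as a list of lines. The helper name shell_lines is hypothetical, and this is a sketch of the behavior, not IPython's actual implementation.

```python
import subprocess


def shell_lines(command: str) -> list[str]:
    """Run a command through the system shell and return stdout as a list of lines."""
    completed = subprocess.run(
        command,
        shell=True,           # hand the string to the shell, as the magic does
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode bytes to str
        check=False,          # do not raise on a non-zero exit status
    )
    return completed.stdout.splitlines()


if __name__ == "__main__":
    # Roughly comparable to typing "%sx ls -la /" in a cell
    for line in shell_lines("ls -la /"):
        print(line)
```

The point of the sketch is that nothing exotic is involved: any runtime that exposes IPython magics, or the subprocess module, gives the cell author a shell.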
Binary Injection and Environment Poisoning
One of the most impressive feats demonstrated was the ability to upload and run custom binaries in the remote environment, despite the lack of a traditional file upload mechanism. The researchers used a multi-step process:
- Encoding: Convert a Linux binary (like nmap) to a Base64 string.
- Importing: Use Excel's Power Query to pull the Base64 string into a named range (avoiding the character limits and auto-formatting of standard cells).
- Reconstruction: Use a Python cell to read the string, decode it, and write it to a writable directory like /tmp or the user's home directory.
- Execution: Modify permissions using chmod +x and execute the binary.
import base64
import os
# Example of reconstructing a binary
encoded_data = xl("BinaryRange")  # Referencing Excel data
with open("/tmp/nmap", "wb") as f:
    f.write(base64.b64decode(encoded_data))
os.chmod("/tmp/nmap", 0o755)  # Mark the file executable (rwxr-xr-x)
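The encoding step happens on the researcher's own machine before anything reaches Excel. A minimal sketch of that side, with a local round-trip check, might look like the following. The function names and the hash comparison are illustrative assumptions, not taken from the talk.

```python
import base64
import hashlib
import tempfile
from pathlib import Path


def encode_file(path: Path) -> str:
    """Return the file's contents as an ASCII Base64 string."""
    return base64.b64encode(path.read_bytes()).decode("ascii")


def decode_to_file(encoded: str, destination: Path) -> None:
    """Write the decoded bytes to destination and mark the file executable."""
    destination.write_bytes(base64.b64decode(encoded))
    destination.chmod(0o755)


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest, used to confirm the round trip preserved every byte."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "original.bin"
        rebuilt = Path(tmp) / "rebuilt.bin"
        original.write_bytes(b"\x7fELF" + bytes(64))  # placeholder bytes, not a real binary
        decode_to_file(encode_file(original), rebuilt)
        print(sha256_of(original) == sha256_of(rebuilt))
```

Comparing digests is a cheap way to notice if a cell or import step has altered the string in transit, which is the corruption problem the Power Query route is meant to avoid.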
Writable Python Modules
The researchers found that the Python site-packages directory was user-writable. This means an attacker could modify the source code of a popular library like pandas within their session. If sessions are shared or reused across different documents or users (a phenomenon the researchers observed), this could lead to data exfiltration where a poisoned library sends cell data to an attacker-controlled endpoint.
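Whether a given runtime has this weakness is easy to check from inside it. The sketch below reports whether the interpreter's package directories are writable by the current user. It uses only the standard library; the function name writable_package_dirs is hypothetical, and the paths it reports depend on the environment it runs in.

```python
import os
import sysconfig


def writable_package_dirs() -> dict[str, bool]:
    """Map each existing package install directory to whether the current user can write to it."""
    paths = sysconfig.get_paths()
    # purelib and platlib are where pure-Python and compiled packages are installed
    candidates = {paths["purelib"], paths["platlib"]}
    return {
        path: os.access(path, os.W_OK)
        for path in sorted(candidates)
        if os.path.isdir(path)
    }


if __name__ == "__main__":
    for directory, writable in writable_package_dirs().items():
        print(f"{directory}: {'WRITABLE' if writable else 'read-only'}")
```

A writable result means any code in the session could alter an installed library's source, which is the precondition for the poisoning scenario described above.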
Mitigation & Defense
Microsoft has implemented several layers of defense, including blocking outbound internet access from the containers. However, the internal network surface remains reachable via tools like nmap. For organizations concerned about this risk, the primary defense is administrative control.
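The difference between blocked outbound access and a reachable internal network can be seen with nothing more than a TCP connect attempt. The following is a minimal sketch using the standard socket module; the helper name can_connect and the example host and port are assumptions for illustration, not targets named in the research.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures
        return False


if __name__ == "__main__":
    # example.com is a placeholder; an egress block would make this print False
    print(can_connect("example.com", 443))
```

A full scanner such as nmap automates the same check across many hosts and ports, which is why the researchers bothered to upload one.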
You can manage Python in Excel via the Windows Registry. Navigating to HKEY_CURRENT_USER\Software\Microsoft\Office\16.0\Excel\Security and setting PythonFunctionControl allows you to:
- 0: Block Python entirely.
- 1: Allow with a warning (default).
- 2: Allow without warning.
Furthermore, Microsoft reports that a February 2024 update to its session isolation logic partitions different users into separate container instances.
Conclusion & Key Takeaways
Python in Excel is a powerful feature that effectively turns a spreadsheet into a remote code execution platform. While Microsoft has done significant work to isolate these containers, the ability to upload binaries and modify the Python environment proves that no sandbox is perfectly airtight.
Key Takeaways:
- Treat =PY() functions with the same scrutiny as macros.
- Monitor for unusual Base64 blobs within spreadsheets.
- Understand that cloud-based execution moves the threat from the endpoint to the cloud tenant.
Always practice responsible disclosure and test these features in isolated environments to ensure your organization's data remains secure.
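The second takeaway, monitoring for Base64 blobs, could be approached along the following lines. This sketch assumes the workbook's cell and query text has already been extracted to a string. The 200-character threshold and the ELF-only check are arbitrary illustrative choices, and find_suspicious_blobs is a hypothetical name.

```python
import base64
import binascii
import re

# A long unbroken run of Base64 alphabet characters, optionally padded
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{200,}={0,2}")
ELF_MAGIC = b"\x7fELF"  # first four bytes of a Linux executable


def find_suspicious_blobs(text: str) -> list[str]:
    """Return long Base64 runs whose decoded content starts with the ELF magic bytes."""
    hits = []
    for match in BASE64_RUN.finditer(text):
        candidate = match.group(0)
        try:
            # Eight Base64 characters decode to six bytes, enough to see the header
            head = base64.b64decode(candidate[:8])
        except (binascii.Error, ValueError):
            continue
        if head.startswith(ELF_MAGIC):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    blob = base64.b64encode(ELF_MAGIC + bytes(300)).decode("ascii")
    sample = f"cell A1: {blob} end"
    print(len(find_suspicious_blobs(sample)))
```

A production rule would need to handle blobs split across cells and other file signatures, so this is a starting point for triage, not a complete detector.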
AI Summary
The presentation 'The Problems of Embedded Python in Excel, or How to Excel in Pwning Pandas', delivered by Shalom Carmel, explores the security boundaries of Microsoft's recent integration of Python into Excel. This feature, initiated via the `=PY()` function, allows users to perform advanced data analysis using libraries like Pandas directly within a spreadsheet. However, unlike traditional VBA, which runs locally, this Python code is executed in a remote, Microsoft-managed Jupyter environment running on Azure Linux containers. The researchers highlight that while Microsoft marketed the environment as isolated and secure, the implementation effectively turns every Excel spreadsheet into a potential Remote Code Execution (RCE) platform.
By leveraging Jupyter 'magic commands' (such as `%sx` or `!`), a user can execute shell commands on the remote Linux host. The analysis revealed that large portions of the remote file system are user-writable, including the Python module directories and Jupyter startup paths. This allows an attacker to poison the environment by modifying standard libraries like Pandas to intercept or exfiltrate data.
A significant portion of the talk focuses on bypassing the lack of pre-installed security tools in the Azure environment. The researchers demonstrated a method to upload custom binaries, such as Nmap and Netcat, by converting them to Base64 and importing them into Excel via Power Query to avoid character corruption. Once uploaded, these binaries can be reconstructed and executed within the container to probe the internal Microsoft network.
The presenters also noted the presence of multiple Jupyter sessions within the same container environment, raising concerns about cross-session data leakage, though Microsoft claimed to have addressed specific isolation issues in a February 2024 update.
The talk concludes with a reminder that adding complex, code-executing runtimes to ubiquitous software like Excel inevitably expands the attack surface. While direct internet access from the containers is currently blocked, the ability to manipulate the runtime environment and potentially interact with the Azure control plane presents a significant area for future security research. Practitioners are advised that they can control or disable this feature via Windows Registry keys if the risk is deemed too high for their environment.