DEF CON 2025

AI Assisted Web Attack Surface Enumeration

DEFCONConference | 1,282 views | 40:29 | 3 months ago

This talk demonstrates the use of Large Language Models (LLMs) to automate and enhance web attack surface enumeration, specifically for subdomain and API endpoint discovery. The speaker explores how LLMs can be fine-tuned or prompted to identify patterns in naming conventions and generate realistic, previously unknown subdomains and API paths. The presentation highlights the limitations of traditional brute-force methods and showcases how LLMs can provide context-aware, intelligent suggestions to improve discovery rates. The speaker also releases open-source models and tools to facilitate this approach.

Beyond Wordlists: Using LLMs to Map Hidden Web Attack Surfaces

TLDR: Traditional brute-force enumeration often fails to uncover non-standard API endpoints and subdomains that don't appear in common wordlists. By using fine-tuned Large Language Models (LLMs) like Qwen-3-4B, researchers can predict and discover these hidden assets based on learned naming patterns and context. This approach can surface sensitive, undocumented endpoints that wordlist-based tools miss during penetration tests and bug bounty engagements.

Standard wordlists are the backbone of reconnaissance, but they are also a massive bottleneck. Every bug hunter has spent hours running ffuf or gobuster against a target, only to find the same generic paths while missing the custom, developer-specific endpoints that actually lead to critical vulnerabilities. The problem is simple: wordlists are static, but modern web architectures are dynamic and highly specific to the organization.

When you are testing a large-scale application, the attack surface is not just what is linked on the homepage. It includes shadow IT, legacy subdomains, and internal API versions that developers forgot to decommission. Traditional tools rely on passive DNS records or brute-forcing common names, both of which fail when a company uses internal naming conventions like activate-iphone-use1-cx02.apple.com. If that string isn't in your dictionary, you aren't finding it.

The Shift to Intelligent Enumeration

Recent research presented at DEF CON 33 demonstrates that we can move past static lists by treating enumeration as a language prediction problem. Instead of guessing, we can train or prompt an LLM to understand the "grammar" of a target's infrastructure.

The core technique involves deconstructing known subdomains or API paths into their constituent parts: environment flags, region codes, and resource identifiers. Once the model understands that a target uses a specific pattern like [action]-[product]-[region]-[cluster_id]-[env], it can generate thousands of highly probable, previously unknown candidates.
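As a rough illustration of the slot idea (this is not the speaker's released tooling), the sketch below splits hyphen-delimited labels into positional slots and recombines the values it has seen. The hostnames, the example.com domain, and the assumption that every label has the same number of slots are invented for illustration. Plain recombination only reuses observed values, whereas a model can also propose values it has never seen.

```python
from itertools import product

def slot_values(labels):
    """Collect the distinct values seen at each hyphen-delimited position."""
    split = [label.split("-") for label in labels]
    width = len(split[0])
    # Assumption: only labels with the same slot count as the first are comparable.
    split = [parts for parts in split if len(parts) == width]
    slots = [[] for _ in range(width)]
    for parts in split:
        for i, value in enumerate(parts):
            if value not in slots[i]:
                slots[i].append(value)
    return slots

def recombine(labels, domain):
    """Yield unseen hostnames built from every combination of slot values."""
    known = set(labels)
    for combo in product(*slot_values(labels)):
        label = "-".join(combo)
        if label not in known:
            yield f"{label}.{domain}"

# Hypothetical observations in an [action]-[product]-[region]-[cluster_id] shape.
known_labels = [
    "activate-iphone-use1-cx02",
    "activate-ipad-usw2-cx01",
    "enroll-iphone-use1-cx01",
]
candidates = list(recombine(known_labels, "example.com"))
```

With two distinct values in each of the four slots, this yields 16 combinations, 13 of which are new. The candidates still need to be verified before they mean anything.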

For example, if you identify api/v2/user_crt as a valid endpoint, an LLM can infer that user is an entity, crt is an abbreviation for create, and v2 is a versioning flag. It then generates variations like api/v2/user_rmv or api/v2/admin_crt. This is not just random mutation; it is structural inference.
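One way to wire this into a script is to build a prompt from the confirmed endpoint and then filter whatever the model returns. The sketch below is an assumption about how that could look: the prompt wording, the function names, and the sample model output are all hypothetical, and the model call itself is left out because it depends on the runtime you use.

```python
import re

# Hypothetical prompt wording; tune it for the model you actually run.
PROMPT_TEMPLATE = (
    "The endpoint below is confirmed to exist on the target.\n"
    "Endpoint: {endpoint}\n"
    "Split it into version, entity, and action parts, then list {n} other "
    "endpoints that follow the same naming convention, one per line."
)

def build_prompt(endpoint, n=10):
    """Fill the template with a confirmed endpoint and a candidate count."""
    return PROMPT_TEMPLATE.format(endpoint=endpoint, n=n)

_BULLET = re.compile(r"^\s*(?:[-*]|\d+[.)])\s*")   # leading "- ", "* ", "1. ", "2) "
_PATH = re.compile(r"[A-Za-z0-9_\-./]+")            # conservative path charset

def parse_candidates(model_output, prefix, known=()):
    """Keep well-formed, unseen paths under the given prefix, in order."""
    seen = set(known)
    result = []
    for line in model_output.splitlines():
        path = _BULLET.sub("", line).strip()
        if not path.startswith(prefix) or not _PATH.fullmatch(path):
            continue  # chatter, wrong prefix, or malformed path
        if path not in seen:
            seen.add(path)
            result.append(path)
    return result

# Invented stand-in for a model response, including typical noise.
sample_output = """Sure, here are some candidates:
1. api/v2/user_rmv
- api/v2/admin_crt
api/v2/user_rmv
api/v1/legacy
"""
paths = parse_candidates(sample_output, "api/v2/", known=["api/v2/user_crt"])
# paths == ["api/v2/user_rmv", "api/v2/admin_crt"]
```

Filtering on a strict character set and a fixed prefix keeps conversational filler and off-scope guesses out of the request queue.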

Practical Implementation with LLMs

To implement this, you don't need a massive, proprietary model. The research highlights the use of Qwen-3-4B, a lightweight transformer model that can be fine-tuned on your own findings. Fed a dataset of known subdomains or paths, the model learns the target's naming logic.
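The talk summary does not describe the training data format, so the following is only a guess at one workable layout: hold out each known subdomain in turn and ask the model to produce it from a few of the others. The prompt/completion field names, the prompt wording, and the sample hostnames are assumptions; adapt them to whatever your fine-tuning framework expects.

```python
import json
import random

def build_examples(subdomains, context_size=3, seed=0):
    """Create one hold-one-out prompt/completion pair per known subdomain."""
    rng = random.Random(seed)  # fixed seed so the dataset is reproducible
    examples = []
    for target in subdomains:
        others = [s for s in subdomains if s != target]
        context = rng.sample(others, min(context_size, len(others)))
        examples.append({
            "prompt": "Known subdomains:\n" + "\n".join(context)
                      + "\nAnother likely subdomain:",
            "completion": " " + target,
        })
    return examples

def to_jsonl(examples):
    """Serialize examples as JSON Lines, one object per line."""
    return "\n".join(json.dumps(example) for example in examples)

# Hypothetical findings from earlier recon.
known = [
    "api-use1-prod.example.com",
    "api-usw2-prod.example.com",
    "auth-use1-prod.example.com",
    "auth-use1-stage.example.com",
]
examples = build_examples(known)
jsonl = to_jsonl(examples)
```

Holding the target out of its own context forces the model to infer the name from the pattern instead of copying it.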

If you are working on a live engagement, you can use an agentic approach. Start with a single, confirmed endpoint and prompt the model to suggest others. The workflow looks like this:

Deconstruction: The model breaks down the input URL into tokens and identifies variable slots.
Inference: The model applies the learned naming convention to generate new candidates.
Validation: You pass these candidates through dnsx or httpx to verify they exist.
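The three steps above can be sketched as a small feedback loop in which confirmed hits become input for the next round. Here `propose` stands in for the model (any callable that maps known hosts to candidates), and the default `check` uses the system resolver from Python's standard library as a simple substitute for dnsx. Both names and the round limit are assumptions made for this sketch.

```python
import socket

def resolves(hostname):
    """Return True if the system resolver returns at least one address."""
    try:
        return bool(socket.getaddrinfo(hostname, None))
    except (socket.gaierror, UnicodeError):
        # gaierror: name did not resolve; UnicodeError: label is not valid IDNA.
        return False

def enumerate_loop(seeds, propose, check=resolves, rounds=2):
    """Alternate between proposing candidates and validating them."""
    confirmed = set(seeds)
    tried = set(seeds)
    for _ in range(rounds):
        new_hits = set()
        for candidate in propose(sorted(confirmed)):
            if candidate in tried:
                continue  # never validate the same name twice
            tried.add(candidate)
            if check(candidate):
                new_hits.add(candidate)
        if not new_hits:
            break  # nothing new to learn from, so stop early
        confirmed |= new_hits
    return confirmed - set(seeds)
```

Usage would look like `enumerate_loop(["api-use1.example.com"], propose=my_model_wrapper)`, where `my_model_wrapper` is your own function. Note that a wildcard DNS record makes every candidate resolve, so a real run should probe a random label first and discard results if it resolves.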

This loop turns the LLM into a specialized fuzzer. In one case study, this method uncovered an unauthenticated admin panel on a subdomain that was completely invisible to standard wordlist-based tools, resulting in a significant bug bounty payout.

Why This Matters for Pentesters

The primary advantage here is context awareness. Traditional tools treat admin and user as isolated strings. An LLM understands that admin is a role and user is an entity, and it knows how they typically interact with paths like create, delete, or update.

Broken access control (OWASP A01:2021) often lives on an endpoint that isn't linked anywhere in the UI. By using LLMs to map these "hidden" paths, you are essentially performing a more surgical version of active scanning. You are not just throwing random strings at a server; you are speaking the developer's language.

Defensive Considerations

For blue teams, this research is a wake-up call regarding the visibility of your internal infrastructure. If an LLM can predict your naming conventions, so can an attacker.

Defenders should focus on:

Authentication over Obscurity: Do not rely on "security through obscurity" for internal endpoints. If an endpoint is sensitive, it must require authentication, regardless of how hard it is to guess the path.
Rate Limiting: Since this technique generates a high volume of highly relevant requests, robust rate limiting and anomaly detection on your API gateways are essential to catch automated discovery attempts.
Asset Inventory: If you don't know your subdomains exist, you cannot secure them. Use automated discovery tools to maintain a real-time inventory of your public-facing attack surface.
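For the anomaly detection point, one simple signal is a burst of not-found responses from a single client. The sketch below is a minimal in-memory version of that idea, not a production control: the class name, the threshold, and the window are assumptions, and model-guided guessing produces fewer misses than blind brute force, so thresholds may need to sit lower than for classic fuzzing.

```python
from collections import defaultdict, deque

class NotFoundBurstDetector:
    """Flag clients that produce too many 404s inside a sliding time window."""

    def __init__(self, threshold=20, window_seconds=60.0):
        self.threshold = threshold
        self.window = window_seconds
        self._misses = defaultdict(deque)  # client_id -> timestamps of 404s

    def observe(self, client_id, status, timestamp):
        """Record one response; return True if the client exceeds the threshold."""
        if status != 404:
            return False
        misses = self._misses[client_id]
        misses.append(timestamp)
        # Drop misses that have aged out of the window.
        while misses and timestamp - misses[0] > self.window:
            misses.popleft()
        return len(misses) > self.threshold

# Illustrative values: more than 3 misses within 10 seconds trips the alarm.
detector = NotFoundBurstDetector(threshold=3, window_seconds=10.0)
flags = [detector.observe("10.0.0.5", 404, t) for t in (0.0, 1.0, 2.0, 3.0)]
# flags == [False, False, False, True]
```

In practice the client identifier and timestamps would come from gateway logs, and a flagged client would feed into whatever rate limiting or alerting you already run.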

The era of relying solely on SecLists is ending. As these models become more accessible, the bar for what constitutes a "hidden" endpoint is rising. If you are still manually managing wordlists, you are missing the most interesting parts of the target. Start experimenting with small, fine-tuned models on your next engagement. The results will likely surprise you.

Talk Type

research presentation

Difficulty

advanced