Security BSides 2025

Hunt at Scale: Fingerprinting Threat Actors Across the Web

Security BSides London, 28:07

This talk demonstrates a methodology for large-scale threat actor fingerprinting by scanning newly registered domains (NRDs) and correlating technical artifacts. The speaker utilizes a custom-built, on-premise infrastructure to perform automated web scanning, capturing network, script, and technology fingerprints for millions of domains. By analyzing these fingerprints, the presenter identifies clusters of malicious infrastructure associated with specific threat campaigns. The approach emphasizes moving from reactive incident response to proactive campaign discovery through pattern analysis.

Scaling Threat Intelligence: From Reactive Alerts to Proactive Campaign Discovery

TLDR: Most security teams are stuck in a reactive loop, chasing individual alerts instead of mapping the infrastructure behind them. This research demonstrates how to build an on-premise pipeline that scans newly registered domains (NRDs) to extract technical fingerprints such as network artifacts, scripts, and page metadata. By correlating these fingerprints at scale, you can identify and track entire threat actor campaigns, in some cases before they launch their first attack.

Incident response is often a game of whack-a-mole. You get an alert, you block an IP, you move on. By the time you’ve finished your coffee, the threat actor has already spun up three new domains on a different provider. If you are still relying on static indicators of compromise (IOCs) to defend your perimeter, you are fighting a war with yesterday’s intelligence. The real value isn't in catching the individual phishing link; it’s in identifying the infrastructure that generated it.

Building the Hunt Pipeline

The methodology presented at BSides London 2025 shifts the focus from individual incidents to campaign-level discovery. Instead of waiting for a malicious domain to hit your logs, you start by pulling the raw data. The ICANN Centralized Zone Data Service (CZDS) provides gTLD zone files, and comparing successive snapshots of those files yields a feed of newly registered domains. This is your starting point.
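As a rough illustration of that first step, the sketch below derives NRDs by diffing two zone file snapshots. It assumes the common `owner TTL class type rdata` line layout and treats any name with an NS record directly under the TLD as registered; the function names are hypothetical and the talk does not describe its own parsing code.

```python
def registered_domains(zone_lines, tld):
    """Collect names directly under the TLD that have an NS record."""
    suffix = "." + tld.lower().strip(".")
    domains = set()
    for line in zone_lines:
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blanks and comments
        fields = line.split()
        # Assumed layout: owner TTL class type rdata
        if len(fields) < 5 or fields[3].upper() != "NS":
            continue
        owner = fields[0].rstrip(".").lower()
        # Keep direct delegations (example.com), not the apex or deeper hosts
        if owner.endswith(suffix) and owner.count(".") == suffix.count("."):
            domains.add(owner)
    return domains


def newly_registered(previous_lines, current_lines, tld):
    """Names present in the current snapshot but absent from the previous one."""
    current = registered_domains(current_lines, tld)
    previous = registered_domains(previous_lines, tld)
    return sorted(current - previous)
```

For full-size zones you would stream the files from disk rather than hold lists in memory, but the set difference is the core idea.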

The pipeline itself is straightforward but requires a shift in how you view infrastructure. You do not need a massive enterprise budget to do this. The research highlights an on-premise setup using consumer-grade hardware: specifically, a 2U server rack running OpenSearch for indexing and a custom Selenium and ChromeDriver implementation for scanning.

The scanning logic is simple:

1. Pull the NRD list (derived from ICANN CZDS zone files).
2. Use Selenium to visit each domain.
3. Capture the full page render, including scripts, third-party requests, and page metadata.
4. Index these artifacts into OpenSearch.
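The indexing step can be sketched as building a request body for the OpenSearch Bulk API, which takes newline-delimited JSON: one action line followed by one document line per record. The index name `nrd-scans` and the record fields (`domain`, `scanned_at`) are assumptions for illustration, not the speaker's schema.

```python
import hashlib
import json


def to_bulk_ndjson(records, index_name="nrd-scans"):
    """Serialize scan records into a Bulk API body (NDJSON)."""
    lines = []
    for record in records:
        # Deterministic ID: re-indexing the same domain and scan time overwrites
        raw_id = f"{record['domain']}|{record['scanned_at']}"
        doc_id = hashlib.sha256(raw_id.encode("utf-8")).hexdigest()
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(record, ensure_ascii=False))
    # The Bulk API expects every line, including the last, to end with a newline
    return "".join(line + "\n" for line in lines)
```

The resulting string would be sent to the `_bulk` endpoint with a `Content-Type: application/x-ndjson` header, using whichever HTTP or OpenSearch client you prefer.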

By capturing the full DOM and network requests, you aren't just seeing a domain name; you are seeing the "fingerprint" of the site. Threat actors are creatures of habit. They reuse templates, they use the same jQuery versions, and they often host their assets on the same infrastructure.
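To make the idea of a fingerprint concrete, here is a minimal sketch that pulls a few of those habits out of rendered HTML using only the Python standard library. The chosen fields (title, script URLs, jQuery version guessed from the script filename, favicon) and the regular expression are illustrative assumptions; a real pipeline would likely capture far more.

```python
import re
from html.parser import HTMLParser

# Heuristic: matches names such as jquery-3.4.1.min.js or jquery/3.6.0/jquery.js
JQUERY_VERSION = re.compile(r"jquery[-.@/]v?(\d+\.\d+(?:\.\d+)?)", re.IGNORECASE)


class FingerprintParser(HTMLParser):
    """Collects the page title, external script URLs, and favicon link."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.script_srcs = []
        self.favicon = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "script" and attrs.get("src"):
            self.script_srcs.append(attrs["src"])
        elif tag == "link" and "icon" in (attrs.get("rel") or "").lower().split():
            self.favicon = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def fingerprint(html):
    """Return a small dictionary of page-level fingerprint attributes."""
    parser = FingerprintParser()
    parser.feed(html)
    parser.close()
    matches = (JQUERY_VERSION.search(src) for src in parser.script_srcs)
    versions = sorted({m.group(1) for m in matches if m})
    return {
        "title": parser.title.strip(),
        "script_srcs": parser.script_srcs,
        "jquery_versions": versions,
        "favicon": parser.favicon,
    }
```

Guessing the jQuery version from the URL is a shortcut; inspecting the script body or the live `jQuery.fn.jquery` value in the browser would be more reliable.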

Fingerprinting at Scale

The power of this approach lies in the correlation. When you scan millions of domains, you start to see patterns that aren't visible in a single incident. You can query your OpenSearch index for specific page titles or script hashes.

For example, if you see a domain with the page title "Giriş Yap | Binance TR," you don't just flag that one domain. You query your database for every other domain that shares the same technical fingerprint: the same jQuery version, the same favicon URL, and the same network ASN.
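A pivot like that might be expressed as an OpenSearch query body with exact-match `term` filters. The field names (`jquery_versions`, `favicon`, `asn`, `domain`) are hypothetical and assume those fields are mapped as keyword or numeric types; adapt them to whatever your index actually stores.

```python
def same_fingerprint_query(jquery_version, favicon_url, asn, exclude_domain=None):
    """Build a query body for documents sharing all three attributes."""
    query = {
        "size": 1000,
        "query": {
            "bool": {
                # Filter context: exact matches, no relevance scoring needed
                "filter": [
                    {"term": {"jquery_versions": jquery_version}},
                    {"term": {"favicon": favicon_url}},
                    {"term": {"asn": asn}},
                ]
            }
        },
    }
    if exclude_domain:
        # Leave out the seed domain that started the hunt
        query["query"]["bool"]["must_not"] = [{"term": {"domain": exclude_domain}}]
    return query
```

The returned dictionary is plain JSON-serializable data, so it can be passed to any OpenSearch client's search call or posted to the `_search` endpoint directly.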

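# Simplified snippet for capturing network logs via Selenium
from selenium import webdriver

options = webdriver.ChromeOptions()
# Ask ChromeDriver to record browser console and DevTools performance logs
options.set_capability('goog:loggingPrefs', {'browser': 'ALL', 'performance': 'ALL'})
driver = webdriver.Chrome(options=options)  # the driver must exist before CDP calls
driver.execute_cdp_cmd('Network.enable', {})
# ... perform scan and extract logs, e.g. driver.get_log('performance')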
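Once the performance log is captured, the third-party hosts a page talks to can be pulled out of it. The sketch below assumes the entries come from `driver.get_log('performance')`, where each entry's `message` field is a JSON string wrapping a DevTools event; the helper name `requested_hosts` is made up for this example.

```python
import json
from urllib.parse import urlsplit


def requested_hosts(performance_entries, page_host):
    """Return the distinct third-party hosts requested while loading a page."""
    hosts = set()
    for entry in performance_entries:
        event = json.loads(entry["message"])["message"]
        if event.get("method") != "Network.requestWillBeSent":
            continue  # ignore responses, timing events, and so on
        url = event["params"]["request"]["url"]
        host = urlsplit(url).hostname  # None for data: and similar URLs
        if host and host != page_host:
            hosts.add(host)
    return sorted(hosts)
```

Storing this host list alongside each scan record makes shared asset hosting visible as just another searchable field.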

This is how you move from "I found a phishing site" to "I found the entire campaign infrastructure." When you identify that a cluster of 30 domains all share the same unique fingerprint, you have effectively mapped the actor's footprint. You can then monitor for new domains that match that specific fingerprint, giving you a massive head start on detection.
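Grouping scan records by a shared fingerprint can be sketched in a few lines. The key used here (title, jQuery versions, favicon, ASN) and the `min_size` cut-off are assumptions chosen for illustration; which attributes make a fingerprint "unique" is a judgment call for each campaign.

```python
from collections import defaultdict


def cluster_by_fingerprint(records, min_size=2):
    """Group domains whose records share an identical fingerprint key."""
    clusters = defaultdict(set)
    for record in records:
        key = (
            record.get("title"),
            tuple(record.get("jquery_versions", [])),
            record.get("favicon"),
            record.get("asn"),
        )
        clusters[key].add(record["domain"])
    # Drop singletons: one domain on its own is not a campaign
    return {key: sorted(names) for key, names in clusters.items() if len(names) >= min_size}
```

Each surviving key is a candidate campaign fingerprint that can be saved and checked against every new scan.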

Real-World Application for Pentesters

For a pentester or a bug bounty hunter, this is a force multiplier. During an engagement, you are often tasked with identifying the attack surface of a client. If you can identify the infrastructure an attacker is using to target that client, you can provide much higher-value intelligence than a simple list of open ports.

If you are hunting for bugs, this methodology helps you find the "staging" areas. Attackers often leave their test environments exposed or use the same infrastructure for multiple targets. By fingerprinting the staging sites, you can often find the same vulnerabilities (such as A03:2021-Injection or A07:2021-Identification and Authentication Failures) that the attacker is planning to use against your primary target.

Defensive Considerations

Defenders should stop treating domains as isolated entities. If your security stack only looks at the domain name, you are missing the context. Start looking at the ASN, the technology stack, and the page metadata. If you see a domain that matches the fingerprint of a known malicious campaign, it should be blocked regardless of whether it has been reported as malicious yet.
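One way a defender might act on this is to score a newly seen domain against fingerprints of known campaigns and block on a partial match. The compared fields and the threshold of three shared attributes are arbitrary assumptions for this sketch and would need tuning against real data to keep false positives down.

```python
def shared_attributes(candidate, known, fields=("asn", "favicon", "title", "jquery_versions")):
    """List the fingerprint fields on which two records agree."""
    return [
        field
        for field in fields
        if candidate.get(field) is not None and candidate.get(field) == known.get(field)
    ]


def should_block(candidate, known_fingerprints, threshold=3):
    """True if the candidate shares enough attributes with any known campaign."""
    return any(
        len(shared_attributes(candidate, known)) >= threshold
        for known in known_fingerprints
    )
```

Common values, such as a popular hosting ASN or a widely used jQuery version, match many benign sites, which is why a single shared attribute should not be enough on its own.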

The barrier to entry for this kind of research has never been lower. You don't need a massive cloud budget to start scanning. You need a few servers, a solid understanding of how to parse DOM data, and the patience to look for the patterns in the noise. The next time you encounter a phishing site, don't just block it. Use it as a seed to find the rest of the campaign. You will be surprised at how much infrastructure you can uncover when you stop looking at the domain and start looking at the fingerprint.

Talk Type

research presentation

Difficulty

intermediate