DEF CON 2025

How We Protect Cat Memes from DDoS

DEFCONConference · 521 views · 26:27 · 6 months ago

This talk details the architectural strategies and signal-based methodologies used by Reddit to mitigate large-scale DDoS attacks. It covers the implementation of multi-layered rate limiting at both the edge and application levels, utilizing TLS fingerprinting, request header analysis, and behavioral modeling. The presentation highlights the importance of observability and custom logging for effective triage and defense against non-human traffic. It also discusses the use of resource pools and tarpitting to increase the cost of attacks for adversaries.

Engineering Resiliency: How Reddit Scales DDoS Mitigation

TLDR: Scaling DDoS protection for a platform with 1.3 trillion weekly requests requires moving beyond simple IP-based blocking. Reddit’s approach combines multi-layered rate limiting, TLS fingerprinting, and behavioral modeling to distinguish between legitimate users and automated traffic. By offloading coarse filtering to the edge and reserving complex logic for the application layer, they maintain performance while increasing the cost of attack for adversaries.

Defending a high-traffic platform against distributed denial-of-service attacks is rarely about finding a single silver bullet. It is about building a series of filters that increase the cost of an attack until it becomes economically unviable for the adversary. When you are handling 1.3 trillion requests per week, you cannot rely on manual intervention or static rules. You need an architecture that understands the difference between a user scrolling through a feed and a headless script hammering an API endpoint.

The Multi-Layered Defense Model

Effective mitigation requires a clear separation between the edge and the application. Relying solely on the edge is cheap but lacks the necessary context to make nuanced decisions. Relying solely on the application is precise but expensive and risks exhausting backend resources before the traffic is even filtered.

At the edge, the goal is to drop the obvious noise. This is where you implement coarse-grained filters based on IP reputation, TLS fingerprints, and basic request header analysis. Tools like Fastly's Edge Rate Limiting or Cloudflare WAF rules allow you to maintain a "penalty box" for known bad actors. By using TLS fingerprinting techniques like JA3 or JA4, you can identify the underlying client or library making the request, regardless of the IP address. This is critical because modern botnets often rotate IPs rapidly, but they rarely change their underlying TLS handshake parameters.
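As a rough illustration of the fingerprint-keyed penalty box described above, here is a minimal Python sketch. It assumes the ClientHello fields have already been parsed into integers (GREASE values removed) by something upstream; the PenaltyBox class, its ttl_seconds parameter, and the sample values are hypothetical and are not taken from the talk or from any edge vendor's API. The JA3 part follows the published convention: five comma-separated fields, values within a field joined by dashes, then an MD5 digest.

```python
import hashlib
import time


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 string from ClientHello fields and return its MD5 hex digest."""
    fields = [
        str(version),
        "-".join(str(c) for c in ciphers),
        "-".join(str(e) for e in extensions),
        "-".join(str(c) for c in curves),
        "-".join(str(p) for p in point_formats),
    ]
    return hashlib.md5(",".join(fields).encode("ascii")).hexdigest()


class PenaltyBox:
    """Hypothetical in-memory penalty box keyed on a TLS fingerprint."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._expires = {}  # fingerprint -> time at which the penalty ends

    def add(self, fingerprint):
        self._expires[fingerprint] = self.clock() + self.ttl

    def contains(self, fingerprint):
        expiry = self._expires.get(fingerprint)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self._expires[fingerprint]  # penalty served, forget the entry
            return False
        return True
```

Because the key is the handshake fingerprint rather than the source address, a client that rotates IPs but keeps the same TLS library stays in the box until the penalty expires. The trade-off is that legitimate clients sharing that library would be caught too, which is one reason this kind of filter is treated as coarse.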

Moving Up the Stack

Once traffic passes the edge, the application layer takes over. This is where you apply logic that requires state. If you are a researcher or a pentester, this is the layer where you look for flaws in the rate-limiting implementation itself. Does the application correctly track state across multiple nodes? If the rate limiter relies on a Redis backend, is the key structure granular enough to prevent a single user from exhausting the limit for an entire subnet?
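The key-granularity question can be made concrete with a small sketch. The function below is an assumption about how counter keys might be built, not Reddit's scheme: the "rl:" prefix, the parameter names, and the /24 and /64 fallback prefixes are all hypothetical. The idea is that an authenticated request is counted per user, so one noisy account cannot consume a limit shared by everyone behind the same subnet, while anonymous traffic falls back to a network-level key.

```python
import ipaddress


def rate_limit_key(endpoint, user_id=None, client_ip=None, v4_prefix=24, v6_prefix=64):
    """Return the most specific counter key available for a request."""
    if user_id is not None:
        # Authenticated traffic: one counter per user per endpoint.
        return f"rl:{endpoint}:user:{user_id}"
    if client_ip is None:
        raise ValueError("need user_id or client_ip")
    addr = ipaddress.ip_address(client_ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False masks the host bits so every address in the subnet maps to one key.
    network = ipaddress.ip_network(f"{client_ip}/{prefix}", strict=False)
    return f"rl:{endpoint}:net:{network}"
```

A string like this could serve as a key in Redis or any other shared store; what matters for the question above is which requests end up sharing a counter.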

The most effective application-level rate limiting uses a sliding window algorithm. This allows you to define an "allowance" and an "interval" that feel natural to a human user but restrictive to a script. For example, if a user is posting content, you can calculate the maximum number of posts a human could reasonably make in a minute and set your threshold accordingly. If the request volume exceeds this, you respond with a 429 (Too Many Requests) status code.
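A minimal single-process sketch of a sliding window limiter with an allowance and an interval might look like the following. It keeps a log of timestamps per key in memory; a production deployment would need shared state across nodes, which this sketch does not attempt. The class name and the injectable clock are illustrative assumptions.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `allowance` requests per key within any `interval` seconds."""

    def __init__(self, allowance, interval, clock=time.monotonic):
        self.allowance = allowance
        self.interval = interval
        self.clock = clock
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def check(self, key):
        """Return 200 if the request is within the allowance, otherwise 429."""
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.interval:
            hits.popleft()
        if len(hits) >= self.allowance:
            return 429
        hits.append(now)
        return 200
```

For instance, SlidingWindowLimiter(allowance=3, interval=60) would accept three posts from one key and answer 429 to a fourth until the oldest accepted post is at least 60 seconds old. Rejected requests are not recorded, so a client that keeps retrying does not extend its own lockout; recording them instead is a stricter design choice.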

The Power of Observability

You cannot defend what you cannot see. The biggest mistake teams make is treating rate-limiting logs as a black box. If your logs only show a 429 status code, you are flying blind. You need to enrich your logs with request metadata that explains why a request was blocked.

By injecting a UUID into every request and logging the specific reason for a block (such as a failed TLS fingerprint match or an out-of-order API call), you can perform post-mortem analysis in tools like BigQuery. This data is invaluable for tuning your filters. If you see a surge in false positives, you can quickly identify the pattern and adjust your edge dictionaries.
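One way to picture an enriched block log is a single JSON object per decision. The sketch below is only an illustration: the field names (request_id, block_reason, tls_fingerprint) and the reason string in the test data are assumptions, not the schema used in the talk.

```python
import json
import uuid
from datetime import datetime, timezone


def block_log_record(status, reason, fingerprint, path, request_id=None):
    """Serialize one rate-limit decision as a single JSON line."""
    record = {
        # Reuse the ID injected upstream if there is one, otherwise mint a new one.
        "request_id": request_id or str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "status": status,
        "block_reason": reason,  # why the request was blocked, not just that it was
        "tls_fingerprint": fingerprint,
        "path": path,
    }
    return json.dumps(record, sort_keys=True)
```

Emitting one object per line keeps the output in newline-delimited JSON, which most log pipelines and warehouses can ingest, and grouping by block_reason then shows which rule is responsible for a surge of 429s.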

Increasing the Cost of Attack

Defensive engineering is about changing the economics of an attack. Tarpitting is a classic technique that remains highly effective. Instead of immediately dropping a malicious request, you artificially delay the response. This forces the attacker to hold open a connection for a longer period, consuming their local resources and limiting their ability to scale the attack.
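A tarpit can be sketched in a few lines with asyncio. The exponential schedule, the base and cap values, and the idea of counting offenses per client are assumptions made for illustration; the talk does not specify how delays are chosen. An asynchronous sleep is used because a held connection then costs the defender very little, while the client still has to wait.

```python
import asyncio


def tarpit_delay(offense_count, base=0.5, cap=30.0):
    """Delay in seconds: doubles with each repeat offense, bounded by `cap`."""
    if offense_count <= 0:
        return 0.0
    return min(cap, base * 2 ** (offense_count - 1))


async def tarpit_response(offense_count, sleep=asyncio.sleep):
    """Hold the request for the computed delay, then return (status, delay)."""
    delay = tarpit_delay(offense_count)
    await sleep(delay)  # the client's connection stays open while we wait
    return 429, delay
```

With these defaults a first offense waits 0.5 seconds, a third waits 2 seconds, and anything from the seventh onward is held for the 30 second cap. The cap matters: each tarpitted connection also occupies a slot on the defending side, so the delay should stay well below whatever connection budget the server can spare.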

Similarly, response bloating can be used to turn an attacker's bandwidth against them. If you detect a simple GET request that is part of a volumetric attack, you can respond with a larger-than-normal payload. This forces the attacker to process more data than they are sending, effectively increasing the cost of their own infrastructure.

What This Means for Pentesters

When you are testing an application, look for these gaps. Does the application have a consistent rate-limiting policy across all endpoints? Often, developers will secure the login page but leave the search or API endpoints wide open. Check if the rate limiter is tied to the IP address or the user session. If it is tied to the IP, you can often bypass it by using a proxy rotation service. If it is tied to the session, you need to find a way to generate new sessions or exploit weaknesses in session handling, which fall under OWASP A07:2021 – Identification and Authentication Failures.
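The consistency check can be sketched as a small harness for an authorized test. Here send is a hypothetical callable that issues one request to an endpoint and returns the HTTP status code; how it does so (which HTTP client, which credentials or proxy) is left to the tester and is not part of this sketch.

```python
def first_throttle(send, endpoints, max_requests=50):
    """For each endpoint, return the request number that first drew a 429, or None."""
    results = {}
    for endpoint in endpoints:
        results[endpoint] = None  # None means no 429 within max_requests
        for attempt in range(1, max_requests + 1):
            if send(endpoint) == 429:
                results[endpoint] = attempt
                break
    return results
```

An endpoint that comes back as None while its neighbors throttle after a handful of requests is a candidate for the kind of gap described above. Running the same probe once per IP and once per session helps show which of the two the limiter is actually keyed on.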

Ultimately, the goal is to build a system that is "reasonable." You want to allow legitimate users to interact with your platform without friction while making it prohibitively expensive for automated systems to cause disruption. If you are building these systems, prioritize observability and ensure your rate-limiting logic is as easy for your developers to use as it is for your security team to monitor.

Talk Type: talk

Difficulty: intermediate