Accelerating Software Development with ChatGPT
This talk demonstrates how to leverage ChatGPT to automate the development of a cloud-native, highly parallelized port scanner using AWS serverless technologies. The speaker outlines a methodology for using iterative prompting to generate infrastructure-as-code templates, directory structures, and unit tests while managing the limitations of AI-generated code. The presentation highlights the importance of manual validation and iterative refinement when using LLMs for security-focused engineering tasks.
Automating Cloud-Native Recon: Building a Serverless Port Scanner with LLM Assistance
TL;DR: This post explores how to use LLMs to accelerate the development of cloud-native security tools, specifically a highly parallelized port scanner built on AWS Lambda, SQS, and DynamoDB. While ChatGPT can generate functional infrastructure-as-code and boilerplate logic, the real value lies in the iterative "prompt-review-validate" loop that manages the model's tendency to hallucinate configurations. Pentesters can adopt this workflow to rapidly prototype custom tooling for specific engagement requirements without getting bogged down in boilerplate.
Security research often hits a wall when the time required to build custom tooling exceeds the time available for the actual engagement. We have all been there: you need a specific, highly parallelized scanner to map a massive IP range, but writing the infrastructure-as-code, handling the queue logic, and managing the state in a database takes days. Recent research into using LLMs for security engineering suggests that we can collapse that development cycle by treating the model as a junior developer who needs constant, rigorous code review.
The Architecture of a Serverless Scanner
The goal here is to build a scanner that doesn't just run on a single box, but scales horizontally across AWS infrastructure. By using AWS Lambda for execution, Amazon SQS for task queuing, and Amazon DynamoDB for result storage, you create a system that can handle thousands of scans per hour without managing a single server.
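To make that fan-out concrete, the target range can be split into fixed-size work items before it ever touches the queue. The sketch below is a minimal illustration of the queuing step (the function name chunk_cidr and the /24 chunk size are my own assumptions, not from the talk); in the real pipeline each chunk would be sent as one SQS message and consumed by a Lambda worker.

```python
import ipaddress

def chunk_cidr(cidr: str, new_prefix: int = 24) -> list[str]:
    """Split a CIDR range into smaller subnets, one per queue message."""
    network = ipaddress.ip_network(cidr, strict=False)
    return [str(subnet) for subnet in network.subnets(new_prefix=new_prefix)]

# A /16 becomes 256 /24 work items, each scanned by its own Lambda invocation.
chunks = chunk_cidr("10.0.0.0/16")
```

Keeping the chunking logic pure like this also makes it trivially unit-testable, which matters later when validating the generated code.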
The workflow starts by defining the functional requirements in plain English. Instead of asking for a "port scanner," you define the constraints: "I need a highly parallelized port scanner using AWS Lambda, SQS for queue management, and DynamoDB for results." The model will likely suggest a basic architecture, but the critical step is forcing it to break the project into discrete, testable milestones. If you ask for the entire stack at once, you will get a mess of broken IAM policies and incorrect resource references.
The Iterative Development Loop
Treating the LLM as a pair programmer requires a strict "90/10" rule. The model can get you 90% of the way there with boilerplate, but you must be prepared to handle the final 10%—the manual tweaks, the security-critical configuration, and the inevitable logic errors.
For instance, when generating the AWS SAM template, the model will often default to overly permissive IAM policies or incorrect resource attributes. You need to review the generated YAML specifically for these common pitfalls. If the model suggests an inline policy, reject it and prompt it to use SAM policy templates instead. This ensures you are adhering to the principle of least privilege without manually writing every JSON policy document.
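As an illustration of that least-privilege pattern, SAM ships scoped policy templates such as SQSPollerPolicy and DynamoDBCrudPolicy that expand at deploy time into IAM statements bound to a single named resource. A minimal sketch (the function, queue, and table names here are hypothetical):

```yaml
ScannerFunction:
  Type: AWS::Serverless::Function
  Properties:
    Handler: scanner.handler
    Runtime: python3.12
    Policies:
      # Scoped to one queue and one table, instead of a model-generated
      # inline policy with wildcard actions or resources.
      - SQSPollerPolicy:
          QueueName: !GetAtt ScanQueue.QueueName
      - DynamoDBCrudPolicy:
          TableName: !Ref ResultsTable
```

Each template expands to a policy scoped to the named resource, which is exactly the least-privilege property that model-generated inline policies usually lack.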
When the model generates the Python logic for the scanner, it will likely use standard synchronous socket connections. This is a performance killer. You need to explicitly instruct it to use asyncio to handle multiple connections concurrently.
import asyncio

async def scan_port(ip_address: str, port: int) -> bool:
    """Attempt a TCP connect; True if the port accepted the connection."""
    try:
        conn = asyncio.open_connection(ip_address, port)
        reader, writer = await asyncio.wait_for(conn, timeout=2)
        writer.close()
        await writer.wait_closed()
        return True
    except (asyncio.TimeoutError, OSError):
        # Treat timeouts, refusals, and unreachable hosts as closed/filtered.
        # A bare except here would also swallow task cancellation.
        return False
This snippet is the core of the scanner. By wrapping the connection attempt in an asyncio.wait_for call, you prevent a single unresponsive host from hanging your entire Lambda execution.
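Building on that helper, the per-invocation fan-out is just a matter of launching the coroutines together. A minimal sketch (the scan_ports wrapper and the semaphore limit of 128 are my own assumptions, and the connection helper is repeated so the example is self-contained):

```python
import asyncio

async def scan_port(ip_address: str, port: int) -> bool:
    """Attempt a TCP connect; True if the port accepted the connection."""
    try:
        reader, writer = await asyncio.wait_for(
            asyncio.open_connection(ip_address, port), timeout=2
        )
        writer.close()
        await writer.wait_closed()
        return True
    except (asyncio.TimeoutError, OSError):
        return False

async def scan_ports(ip_address: str, ports: list[int],
                     limit: int = 128) -> list[int]:
    """Probe many ports concurrently, capping open sockets with a semaphore."""
    sem = asyncio.Semaphore(limit)

    async def guarded(port: int) -> bool:
        async with sem:
            return await scan_port(ip_address, port)

    results = await asyncio.gather(*(guarded(p) for p in ports))
    return [p for p, is_open in zip(ports, results) if is_open]
```

With a 2-second timeout and 128 sockets in flight, the worst case for a fully filtered host is roughly len(ports) / 128 * 2 seconds per invocation, which is what makes the Lambda-level parallelism worthwhile.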
Validating the Output
The most dangerous part of this process is assuming the generated code works. You must write unit tests for every component. Using moto to mock AWS services allows you to run these tests locally without incurring costs or needing actual cloud resources.
When the model generates tests, it will often create "happy path" scenarios that don't actually exercise your error handling. You need to manually add test cases for malformed CIDR blocks, empty queues, and timeout conditions. If the model provides a test that fails, do not try to fix the test—fix the underlying code and feed the error back into the chat thread. This keeps the context window focused on the specific bug rather than the entire project.
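A concrete example of one of those non-happy-path cases, using only the standard library (parse_targets here is a hypothetical stand-in for whatever input-parsing function the model generated, not code from the talk):

```python
import ipaddress
import unittest

def parse_targets(cidr: str) -> list[str]:
    """Validate a CIDR block and expand it to individual host addresses."""
    network = ipaddress.ip_network(cidr, strict=True)
    return [str(host) for host in network.hosts()]

class TestParseTargets(unittest.TestCase):
    def test_malformed_cidr_rejected(self):
        # Model-generated "happy path" tests rarely cover garbage input.
        with self.assertRaises(ValueError):
            parse_targets("10.0.0.0/33")

    def test_host_bits_set_rejected(self):
        # strict=True rejects "10.0.0.1/24" because host bits are set.
        with self.assertRaises(ValueError):
            parse_targets("10.0.0.1/24")

    def test_small_network_expands(self):
        # A /30 yields exactly two usable host addresses.
        self.assertEqual(len(parse_targets("10.0.0.0/30")), 2)
```

When one of these fails against generated code, paste the traceback back into the chat thread rather than loosening the assertion.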
Real-World Applicability
This approach is particularly useful for bug bounty hunters who need to perform rapid reconnaissance on large, dynamic targets. If you are on an engagement where you need to scan a /16 range for a specific service, you can spin up this infrastructure in minutes, run your scan, and tear it down. The cost is negligible, and the speed is significantly higher than running a local nmap scan, which would likely be throttled or blocked by local network constraints.
Defensive Considerations
From a defensive perspective, this level of automation makes it trivial for attackers to perform high-speed, distributed reconnaissance. If your infrastructure is exposed to the internet, you should expect that any service you leave open will be discovered and fingerprinted within minutes. Relying on IP-based rate limiting is no longer sufficient when an attacker can rotate through thousands of AWS Lambda execution environments, each with a different source IP.
Focus your defensive efforts on robust application-layer authentication and monitoring for anomalous traffic patterns that deviate from standard user behavior. If you see a sudden burst of connection attempts from a range of AWS-owned IP addresses, you are likely being scanned by a tool similar to the one described here.
The key to using LLMs in security engineering is not to let them do the work, but to let them do the typing. You are the architect and the lead auditor. If you cannot explain the code the model generates, you have no business deploying it on an engagement. Keep your prompts narrow, validate every function with a test, and never trust the model's math—especially when it comes to calculating scan throughput.
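That throughput check is easy to do yourself rather than taking the model's figure on faith. A back-of-the-envelope sketch (all numbers here, 1,000 concurrent invocations, 128 sockets each, a 2-second timeout, are illustrative assumptions, not benchmarks):

```python
def worst_case_seconds(probes: int, workers: int,
                       sockets_per_worker: int, timeout_s: float) -> float:
    """Worst case: every probe times out, so each socket slot completes
    exactly one probe per timeout window."""
    probes_per_second = workers * sockets_per_worker / timeout_s
    return probes / probes_per_second

# Full 65,535-port sweep of a /24 (254 hosts) = ~16.6M probes.
total_probes = 65_535 * 254
print(round(worst_case_seconds(total_probes, 1_000, 128, 2.0)))  # ~260 seconds
```

If the model quotes you a number wildly different from this kind of envelope, the model is wrong, not the arithmetic.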