A Splunk-focused SIEM security event simulator and SPL detection rule library for detection engineers, SOC analysts, and cybersecurity students.
SplunkForge generates realistic, multi-source security event datasets — authentication logs, firewall records, Sysmon telemetry, proxy logs, and DNS queries — organized into complete multi-stage attack scenarios. It ships a library of 30+ validated SPL detection queries mapped to the MITRE ATT&CK framework, three importable Splunk dashboard XML files, and a text-based ATT&CK coverage matrix for gap analysis.
Security Information and Event Management (SIEM) platforms are only as good as the detection rules running inside them. Writing good detection rules requires:
- Understanding what attack activity looks like across multiple log sources simultaneously
- Having realistic test data to run rules against without needing a live attack
- Knowing which MITRE ATT&CK techniques your rules cover — and which ones you're blind to
SplunkForge solves all three. It generates complete, realistic attack datasets that look like actual SIEM telemetry, provides production-quality SPL detection queries for every major attack tactic, and shows you exactly where your detection coverage has gaps.
This is the toolkit I wish had existed when I was learning detection engineering. Every generated event reflects real attacker behavior documented in incident response reports, threat intelligence, and the ATT&CK knowledge base.
- Windows Security Event Log — EventIDs 4624, 4625, 4648, 4688, 4698, 4720, 7045 with correct field names and logon type codes
- Linux auth.log / sshd — PAM authentication success and failure messages
- Sysmon Telemetry — EID 1 (process create), 3 (network connect), 8 (remote thread), 10 (process access), 11 (file create), 12/13 (registry), 22 (DNS query)
- Firewall Logs — Palo Alto PAN-OS style traffic logs with bytes, packets, session IDs
- DNS Query Logs — Splunk Stream DNS format with TTL, record type, response data
- Web Proxy Logs — Squid/Zscaler Apache Combined Log format with URL, user-agent, bytes
- Web Server Access Logs — Apache and IIS W3C format with attack signature detection
- Configurable timing — realistic jitter, spread intervals, events per minute
- Benign baseline mixing — configurable signal-to-noise ratio for authentic datasets
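The timing and mixing options above can be illustrated with a small sketch. This is not SplunkForge's generator code; the function names, the `jitter` parameter, and the uniform jitter model are assumptions chosen only to show the idea of spreading events around a target rate and labelling a fraction of them as attack traffic.

```python
import random
from datetime import datetime, timedelta, timezone


def jittered_timestamps(start, count, events_per_minute, jitter=0.3, rng=None):
    """Return `count` increasing timestamps near the requested rate.

    Each gap is the mean interval scaled by a random factor in
    [1 - jitter, 1 + jitter], so events are not evenly spaced.
    """
    rng = rng or random.Random()
    mean_gap = 60.0 / events_per_minute
    stamps, current = [], start
    for _ in range(count):
        current += timedelta(seconds=mean_gap * rng.uniform(1 - jitter, 1 + jitter))
        stamps.append(current)
    return stamps


def label_events(stamps, attack_ratio, rng=None):
    """Mark each timestamp as malicious with probability `attack_ratio`."""
    rng = rng or random.Random()
    return [
        {"time": ts.timestamp(), "malicious": rng.random() < attack_ratio}
        for ts in stamps
    ]


rng = random.Random(7)  # fixed seed so the sketch is reproducible
start = datetime(2024, 3, 15, 10, 30, tzinfo=timezone.utc)
stamps = jittered_timestamps(start, count=1000, events_per_minute=100, rng=rng)
events = label_events(stamps, attack_ratio=0.05, rng=rng)
print(len(events), "events,", sum(e["malicious"] for e in events), "labelled malicious")
```

A per-event coin flip gives an attack share that only approximates the requested ratio; a generator that needs an exact count would sample a fixed number of attack slots instead.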
Five complete, multi-stage attack kill chains with realistic timing, cross-source correlation, and MITRE ATT&CK metadata:
| Scenario | Phases | Event Types | ATT&CK Techniques |
|---|---|---|---|
| Brute Force | Failures → Success → Discovery → Persistence | WinEventLog, linux:auth, Sysmon, PowerShell | T1110.001, T1110.003, T1078, T1059.001, T1053.005 |
| Lateral Movement | Phishing → Recon → Cred Dump → Pivot → Persist | WinEventLog, Sysmon, Proxy, Firewall | T1566.001, T1003.001, T1021.002, T1543.003 |
| Data Exfiltration | Recon → Stage → Exfil → Cleanup | WinEventLog, DNS, Proxy, Sysmon | T1135, T1074.001, T1048.003, T1070.001 |
| Ransomware | Phishing → Download → Disable AV → Encrypt → Ransom Note | WinEventLog, Sysmon, Proxy, PowerShell | T1566.001, T1562.001, T1490, T1486 |
| Insider Threat | After-hours → Sensitive Shares → Bulk Copy → Cloud/USB | WinEventLog, Proxy, Sysmon | T1078, T1039, T1567.002, T1025 |
30+ production-quality detection queries across all major MITRE ATT&CK tactics:
| Rule ID | Name | Tactic | Severity |
|---|---|---|---|
| AUTH-001 | Multiple Failed Logins from Single Source | Credential Access | High |
| AUTH-002 | Password Spraying - Low Rate Across Many Accounts | Credential Access | High |
| AUTH-003 | Brute Force Followed by Successful Login | Credential Access | Critical |
| AUTH-004 | Pass-the-Hash Indicator (Explicit Credential Use) | Lateral Movement | High |
| AUTH-005 | Account Lockout Storm | Credential Access | Medium |
| AUTH-006 | New Local Admin Account Created | Persistence | High |
| EXEC-001 | PowerShell Encoded Command Execution | Execution | High |
| EXEC-002 | LOLBin Spawned by Office Application (Macro) | Execution | Critical |
| EXEC-003 | WMI Remote Command Execution | Execution | High |
| EXEC-004 | Certutil Download Cradle | Defense Evasion | High |
| EXEC-005 | PowerShell Script Block - Offensive Keywords | Execution | Critical |
| PERS-001 | Suspicious Registry Run Key Modification | Persistence | High |
| PERS-002 | Scheduled Task Created by Suspicious Process | Persistence | High |
| PERS-003 | Service Installed in Suspicious Location | Persistence | High |
| PERS-004 | WMI Event Subscription (Fileless Persistence) | Persistence | Critical |
| LAT-001 | Pass-the-Hash / Lateral Movement via SMB | Lateral Movement | High |
| LAT-002 | LSASS Memory Access (Credential Dumping) | Credential Access | Critical |
| LAT-003 | PsExec-Style Remote Service Creation | Lateral Movement | High |
| LAT-004 | Remote Thread Injection (Process Injection) | Defense Evasion | Critical |
| LAT-005 | WinRM / PSRemoting Lateral Movement | Lateral Movement | High |
| EXFIL-001 | DNS Tunneling - Long Subdomain Queries | Exfiltration | Critical |
| EXFIL-002 | Bulk Upload to Cloud Storage | Exfiltration | High |
| EXFIL-003 | Large Outbound Transfer to Non-Business IP | Exfiltration | High |
| EXFIL-004 | Staging Directory - Bulk File Copy to Temp | Collection | Medium |
| EXFIL-005 | USB Device Connected - Possible Data Theft | Exfiltration | Medium |
| C2-001 | C2 Beaconing - Regular Connection Intervals | C2 | High |
| C2-002 | Outbound Connection on Non-Standard Port | C2 | High |
| C2-003 | DNS over HTTPS to Non-Corporate Resolver | C2 | Medium |
| C2-004 | Protocol Tunneling - Suspicious POST Volume | C2 | High |
| C2-005 | Self-Signed Certificate on External HTTPS | C2 | High |
A text-based coverage matrix showing which techniques have detection rules and which are gaps:
╔══════════════════════════════════════════════════════════════════════╗
║ SplunkForge — MITRE ATT&CK Enterprise Coverage Matrix ║
╚══════════════════════════════════════════════════════════════════════╝
┌─ CREDENTIAL ACCESS
│ Coverage: [████████████░░░░░░░░] 4/6 (67%)
│
│ ✓ T1110.001 Brute Force: Password Guessing [AUTH-001, AUTH-002, AUTH-003]
│ ✓ T1110.003 Brute Force: Password Spraying [AUTH-002]
│ ✓ T1003.001 OS Credential Dumping: LSASS Memory [LAT-002]
│ ✓ T1550.002 Pass the Hash [AUTH-004, LAT-001]
│ ✗ T1558 Steal Kerberos Tickets [NO COVERAGE - GAP]
│ ✗ T1040 Network Sniffing [NO COVERAGE - GAP]
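The per-tactic numbers in a matrix like this come down to counting techniques with at least one rule. The sketch below shows that arithmetic for the Credential Access block above; `coverage_summary` and its input shape are hypothetical and are not taken from `mitre_mapper.py`.

```python
def coverage_summary(technique_rules):
    """Summarise coverage for one tactic.

    `technique_rules` maps a technique ID to the list of rule IDs that
    detect it; an empty list means the technique is a gap.
    """
    covered = [t for t, rules in technique_rules.items() if rules]
    gaps = [t for t, rules in technique_rules.items() if not rules]
    total = len(technique_rules)
    percent = round(100 * len(covered) / total) if total else 0
    return {"covered": len(covered), "total": total, "percent": percent, "gaps": gaps}


# Data copied from the Credential Access block of the sample matrix.
credential_access = {
    "T1110.001": ["AUTH-001", "AUTH-002", "AUTH-003"],
    "T1110.003": ["AUTH-002"],
    "T1003.001": ["LAT-002"],
    "T1550.002": ["AUTH-004", "LAT-001"],
    "T1558": [],
    "T1040": [],
}
summary = coverage_summary(credential_access)
print(f"{summary['covered']}/{summary['total']} ({summary['percent']}%)")
print("gaps:", ", ".join(summary["gaps"]))
```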
Three importable Splunk dashboard XML files:
- Security Overview — Event volume trends, severity distribution, top source IPs, auth failure heatmap, high-severity alert feed
- Threat Hunting — Interactive investigation workspace with process chain explorer, lateral movement indicators, DNS anomaly detection, C2 beacon analysis
- Attack Timeline — Kill chain phase progression, per-host activity summary, cross-sourcetype correlation, attacker IP drill-down
| Format | Use Case |
|---|---|
| Splunk JSON | HTTP Event Collector (HEC) ingestion, one event per line |
| CSV | Splunk CSV sourcetype, spreadsheet analysis |
| Syslog (RFC 5424) | rsyslog, syslog-ng, universal SIEM input |
| CEF | ArcSight, QRadar, any CEF-compatible SIEM |
git clone https://github.com/marez8505/SplunkForge
cd SplunkForge
pip install -r requirements.txt
# or install as package
pip install -e .

Requirements: Python 3.8+, pyyaml, jinja2
# Brute force attack → JSON output
python -m splunkforge scenario --type brute_force --output ./events/ --format json
# Ransomware with 100 encrypted files → CSV
python -m splunkforge scenario --type ransomware --target-count 100 --output ./events/ --format csv
# Lateral movement → syslog format (without background noise)
python -m splunkforge scenario --type lateral_movement --output ./events/ --format syslog --no-noise
# Data exfiltration → CEF format
python -m splunkforge scenario --type data_exfiltration --output ./events/ --format cef
# Insider threat scenario
python -m splunkforge scenario --type insider_threat --output ./events/ --format json

# 1 hour of mixed events at 100/min with 5% attack ratio
python -m splunkforge generate --duration 60 --epm 100 --attack-ratio 0.05 --output ./events/
# 30 minutes, high volume, low noise
python -m splunkforge generate --duration 30 --epm 500 --attack-ratio 0.10 --format csv

# List all rules
python -m splunkforge rules --list
# Rules for a specific MITRE technique
python -m splunkforge rules --technique T1110
# Rules for a tactic
python -m splunkforge rules --tactic "Credential Access"
# Full details for one rule
python -m splunkforge rules --id AUTH-001
# Filter by severity
python -m splunkforge rules --severity critical
# Export as Splunk savedsearches.conf
python -m splunkforge rules --export ./spl_rules.conf

# Print matrix to terminal
python -m splunkforge coverage
# Write to file
python -m splunkforge coverage --output ./coverage_matrix.txt
# Show only gaps
python -m splunkforge coverage --gaps
# Full coverage report with all indicators
python -m splunkforge coverage --output ./coverage.txt --gaps

# Run everything: all scenarios + rule library + coverage matrix
python -m splunkforge demo
# With custom output directory
python -m splunkforge demo --output ./demo_output/

Splunk JSON:
{"time": 1710505800.0, "host": "WKSTN-0042", "source": "splunkforge",
"sourcetype": "WinEventLog:Security", "index": "main",
"event": {"EventCode": 4625, "TargetUserName": "administrator",
"IpAddress": "198.51.100.42", "LogonType": 3,
"SubStatus": "0xC000006A", "FailureReason": "Wrong password"}}

Syslog (RFC 5424):
<131>1 2024-03-15T10:30:00.000Z WKSTN-0042 WinEvtSecurity 4248 WinEventLog_Security
[splunkforge@57032 EventCode="4625" src_ip="198.51.100.42" TargetUserName="administrator"]
EventID=4625 User=administrator Src=198.51.100.42 Severity=medium Malicious=true

CEF:
<20>Mar 15 10:30:00 WKSTN-0042 CEF:0|Microsoft|Windows Security Event Log|1.0|4625|Windows Security Event|5|
rt=1710505800000 dhost=WKSTN-0042 src=198.51.100.42 duser=administrator
splunkforgeSeverity=medium splunkforgeMalicious=true
Duration: 30–45 minutes
Models a credential brute force attack against RDP or SSH:
- Phase 1: Attacker generates 10–50 failed login attempts (Event 4625 / `Failed password for...`) from a single external IP, spread over a configurable window with realistic jitter
- Phase 2: Successful login after the brute force (Event 4624 LogonType=10 for RDP, or `Accepted password for` in SSH)
- Phase 3: Post-exploitation discovery: `whoami`, `net user /domain`, `systeminfo`, `ipconfig /all`, `arp -a`, `netstat -ano`, `tasklist /v`
- Phase 4: Persistence installation via scheduled task (Event 4698) or registry Run key (Sysmon EID 13) plus optional PowerShell download cradle
Key detection opportunities: Spike in Event 4625 from one IP; brute force + success correlation; suspicious process parent-child chain; registry Run key modification pointing at a binary in %TEMP%.
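The "brute force + success" correlation can be prototyped outside Splunk before writing SPL. The sketch below is an assumption-laden illustration, not the AUTH-003 query: it expects event dicts with `time`, `EventCode`, and `IpAddress` keys, and the threshold and window defaults are arbitrary.

```python
from collections import defaultdict


def brute_force_then_success(events, threshold=10, window=1800):
    """Return source IPs that log a success (4624) preceded by at least
    `threshold` failures (4625) in the `window` seconds before it."""
    failures = defaultdict(list)
    hits = set()
    for ev in sorted(events, key=lambda e: e["time"]):
        ip = ev["IpAddress"]
        if ev["EventCode"] == 4625:
            failures[ip].append(ev["time"])
        elif ev["EventCode"] == 4624:
            recent = [t for t in failures[ip] if ev["time"] - t <= window]
            if len(recent) >= threshold:
                hits.add(ip)
    return hits


# Twelve failures then a success from one IP; two failures then a success from another.
events = [
    {"time": 1000 + 30 * i, "EventCode": 4625, "IpAddress": "198.51.100.42"}
    for i in range(12)
]
events.append({"time": 1400, "EventCode": 4624, "IpAddress": "198.51.100.42"})
events += [
    {"time": 1000, "EventCode": 4625, "IpAddress": "203.0.113.9"},
    {"time": 1010, "EventCode": 4625, "IpAddress": "203.0.113.9"},
    {"time": 1020, "EventCode": 4624, "IpAddress": "203.0.113.9"},
]
suspects = brute_force_then_success(events)
print(suspects)
```

Only the first address is reported; the second stays below the failure threshold, which is the kind of mistyped-password noise a real rule has to tolerate.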
Duration: 60–120 minutes
Models an adversary who phishes in, dumps credentials, and moves through the environment:
- Phase 1: Web proxy log shows download of malicious Office document; Word spawns `mshta.exe` (parent-child anomaly)
- Phase 2: Local recon: `whoami /groups`, `net group "Domain Admins" /domain`, PowerShell LDAP query dumping user list to CSV
- Phase 3: LSASS memory access (Sysmon EID 10, `GrantedAccess=0x1010`) simulating Mimikatz; process creation of suspicious tool
- Phase 4: SMB connection to target hosts (firewall log), remote service creation (PSEXESVC pattern), network logon type 3 on pivot hosts
- Phase 5: New service install on final pivot host; WMI event subscription for fileless persistence
Key detection opportunities: Office spawning LOLBin; LSASS access with suspicious grants; same user authenticating to 3+ hosts; PsExec service name pattern.
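One of the opportunities above, the same user authenticating to three or more hosts, reduces to a distinct count per user. A rough sketch, assuming events are dicts with `EventCode`, `LogonType`, `TargetUserName`, and `host` keys (the hostnames and user names below are made up):

```python
from collections import defaultdict


def users_on_many_hosts(events, min_hosts=3):
    """Map each user to the hosts they reached via network logon
    (4624, LogonType 3), keeping users seen on `min_hosts` or more."""
    hosts_by_user = defaultdict(set)
    for ev in events:
        if ev["EventCode"] == 4624 and ev["LogonType"] == 3:
            hosts_by_user[ev["TargetUserName"]].add(ev["host"])
    return {
        user: sorted(hosts)
        for user, hosts in hosts_by_user.items()
        if len(hosts) >= min_hosts
    }


events = [
    {"EventCode": 4624, "LogonType": 3, "TargetUserName": "jsmith", "host": h}
    for h in ("SRV-01", "SRV-02", "SRV-03", "SRV-01")
]
events.append(
    {"EventCode": 4624, "LogonType": 3, "TargetUserName": "backup_svc", "host": "SRV-01"}
)
pivoting = users_on_many_hosts(events)
print(pivoting)
```

In practice this check would be bounded to a time window and tuned against service accounts that legitimately touch many hosts.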
Duration: 90–180 minutes
Models internal recon, data staging, and exfiltration with cleanup:
- Phase 1: `net view /all /domain`, PowerShell ping sweep, `wmic /node: share get`
- Phase 2: SMB share access events (Event 5140) to Finance/HR/Legal shares not normally accessed; `robocopy` staging to `C:\Temp`; file archive creation (Sysmon EID 11, large file size)
- Phase 3 (DNS): 50–100 DNS queries with 30–60 char encoded subdomains (stream:dns), DNS tunneling tool process creation
- Phase 3 (HTTP): Large HTTPS POST to external IP (proxy log, bytes_out > 50MB), Sysmon network connection event
- Phase 4: `wevtutil.exe cl Security`, `wevtutil.exe cl System`, `vssadmin.exe delete shadows /all /quiet`, staging directory deletion
Key detection opportunities: DNS query length > 50 chars, high query rate; large POST to unknown IP; wevtutil clearing event logs; vssadmin shadow deletion.
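The DNS tunneling indicator combines two conditions: unusually long query names and many of them from one source. A minimal sketch of that logic, with hypothetical field names (`src_ip`, `query`) and an arbitrary `min_count`, might look like this:

```python
from collections import Counter


def long_query_sources(queries, max_len=50, min_count=20):
    """Count queries longer than `max_len` characters per source and
    keep sources with at least `min_count` of them."""
    counts = Counter(q["src_ip"] for q in queries if len(q["query"]) > max_len)
    return {src: n for src, n in counts.items() if n >= min_count}


# 25 encoded-looking queries from one host, 100 ordinary lookups from another.
queries = [
    {"src_ip": "10.0.0.5", "query": "a" * 45 + ".exfil.example.com"}
    for _ in range(25)
]
queries += [{"src_ip": "10.0.0.6", "query": "www.example.com"} for _ in range(100)]
tunnel_suspects = long_query_sources(queries)
print(tunnel_suspects)
```

Length alone produces false positives on CDN and telemetry domains, so a real rule would usually add an allowlist or an entropy check.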
Duration: 45–90 minutes
Models a Ryuk/LockBit-style attack from phishing to encryption:
- Phase 1: Proxy log shows `.docm` download; Word spawns `cmd.exe /c powershell.exe -enc <base64>`
- Phase 2: PowerShell encoded command event (EID 4104); HTTP GET for stage2.ps1; ransomware binary dropped (Sysmon EID 11)
- Phase 3: `Set-MpPreference -DisableRealtimeMonitoring $true`; `net stop WinDefend`; `vssadmin delete shadows /all /quiet`; `bcdedit /set {default} recoveryenabled no`
- Phase 4: Ransomware process spawned by `cmd.exe`; rapid Sysmon EID 11 file creation events (10–100+ files with encrypted extensions like `.docx.a3f1b2`)
- Phase 5: Ransom note files (`README_FOR_DECRYPT.txt`) written to each directory; C2 callback (Sysmon network connect); registry Run key persistence
Key detection opportunities: Office macro spawning PowerShell with -enc; disabling Defender via PowerShell; mass file creation in rapid succession; vssadmin shadow deletion.
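"Mass file creation in rapid succession" is a sliding-window count over Sysmon EID 11 events. The sketch below assumes event dicts with `time`, `EventCode`, `host`, and `Image` keys; the threshold, window, and the `locker.exe` path are illustrative values, not ones SplunkForge emits.

```python
from collections import defaultdict, deque


def mass_file_creation(events, threshold=50, window=60):
    """Flag (host, process image) pairs that create `threshold` or more
    files (Sysmon EID 11) within any `window`-second span."""
    recent = defaultdict(deque)
    flagged = set()
    for ev in sorted(events, key=lambda e: e["time"]):
        if ev["EventCode"] != 11:
            continue
        key = (ev["host"], ev["Image"])
        times = recent[key]
        times.append(ev["time"])
        # Drop timestamps that have fallen out of the window.
        while ev["time"] - times[0] > window:
            times.popleft()
        if len(times) >= threshold:
            flagged.add(key)
    return flagged


events = [
    {"time": 100 + 0.5 * i, "EventCode": 11, "host": "WKSTN-0042",
     "Image": r"C:\Users\Public\locker.exe"}
    for i in range(80)
]
events += [
    {"time": 100 + 60 * i, "EventCode": 11, "host": "WKSTN-0042",
     "Image": r"C:\Program Files\Microsoft Office\WINWORD.EXE"}
    for i in range(5)
]
flagged = mass_file_creation(events)
print(flagged)
```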
Duration: 3–6 hours (spread across evening hours)
Models a malicious insider using legitimate credentials:
- Phase 1: Workstation logon at 10 PM (Event 4624 LogonType=2 at unusual hour); VPN connection from home IP
- Phase 2: Event 5140 share access to Finance$, HR$, Legal$ shares not normally accessed by this user; `dir /s /b` on sensitive share
- Phase 3: `robocopy` bulk copy from shares to Desktop; large proxy traffic to OneDrive/Dropbox/Box (multiple POST requests, bytes_out > 100MB)
- Phase 4 (cloud): HTTP PUT to cloud storage API (bytes_out > 200MB); OneDrive sync process network connection; elevated cloud domain DNS queries
- Phase 4 (USB): Explorer.exe opening `E:\` (USB drive letter); `robocopy` Desktop to USB drive
- Logoff: Normal Event 4634 logoff
Key detection opportunities: After-hours logon; share access outside of user's normal scope; large outbound to cloud storage; USB device insertion with subsequent file copy.
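The after-hours logon check is mostly a question of defining business hours. A small sketch, assuming an 08:00 to 18:00 weekday schedule in a single time zone (both are assumptions; real environments need per-user baselines):

```python
from datetime import datetime, timezone


def is_after_hours(epoch_seconds, start_hour=8, end_hour=18, tz=timezone.utc):
    """True when the timestamp falls on a weekend or outside
    [start_hour, end_hour) in the given time zone."""
    local = datetime.fromtimestamp(epoch_seconds, tz)
    return local.weekday() >= 5 or not (start_hour <= local.hour < end_hour)


def after_hours_logons(events):
    """Interactive logons (4624, LogonType 2) outside business hours."""
    return [
        ev for ev in events
        if ev["EventCode"] == 4624 and ev["LogonType"] == 2
        and is_after_hours(ev["time"])
    ]


late = datetime(2024, 3, 15, 22, 0, tzinfo=timezone.utc).timestamp()    # Friday 10 PM
midday = datetime(2024, 3, 15, 12, 30, tzinfo=timezone.utc).timestamp() # Friday 12:30
events = [
    {"time": late, "EventCode": 4624, "LogonType": 2, "TargetUserName": "jsmith"},
    {"time": midday, "EventCode": 4624, "LogonType": 2, "TargetUserName": "adoe"},
]
unusual = after_hours_logons(events)
print([ev["TargetUserName"] for ev in unusual])
```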
The splunkforge coverage command generates a full coverage matrix. Current coverage by tactic:
| Tactic | Rules | Coverage |
|---|---|---|
| Initial Access | 2 | T1566.001 via scenario metadata |
| Execution | 5 | T1059.001, T1047, T1204.002, T1218 |
| Persistence | 4 | T1547.001, T1543.003, T1053.005, T1546.003 |
| Defense Evasion | 4 | T1055, T1070.001, T1218, T1562.001 |
| Credential Access | 6 | T1110.001/.003, T1003.001, T1550.002, T1136 |
| Lateral Movement | 5 | T1021.002, T1021.006, T1055, T1550.002 |
| Collection | 2 | T1074.001, T1039 |
| Exfiltration | 5 | T1048.003, T1567.002, T1048, T1052.001, T1074.001 |
| Command & Control | 5 | T1071.001, T1071.004, T1572, T1571, T1573 |
Detection gaps (identified via splunkforge coverage --gaps) represent opportunities to add new rules — this is the real value for detection engineering practice.
Via HTTP Event Collector (HEC):
# Send JSON events via HEC
curl -k https://splunk-host:8088/services/collector/event \
-H "Authorization: Splunk YOUR_HEC_TOKEN" \
-d @./events/brute_force_20240315.json

Via Splunk CLI (one-shot indexing):
/opt/splunk/bin/splunk add oneshot ./brute_force_events.json \
-index main -sourcetype splunkforge_json

Via Splunk Web:
- Settings → Data Inputs → Files & Directories → Add New
- Upload your generated `.json`, `.csv`, or `.syslog` file
- Set sourcetype: `_json` for JSON, `csv` for CSV, `syslog` for syslog
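For scripted ingestion, the HEC request can also be built from Python. The sketch below only constructs the request and does not send it; `build_hec_request` is a hypothetical helper, the host and token are the same placeholders used in the curl example, and it assumes HEC accepts several JSON event objects stacked in one request body.

```python
import json
import urllib.request


def build_hec_request(url, token, events):
    """Build (but do not send) a POST request carrying a batch of HEC events."""
    body = "\n".join(json.dumps(ev) for ev in events).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


events = [
    {"time": 1710505800.0, "host": "WKSTN-0042",
     "sourcetype": "WinEventLog:Security",
     "event": {"EventCode": 4625, "IpAddress": "198.51.100.42"}},
    {"time": 1710505801.0, "host": "WKSTN-0042",
     "sourcetype": "WinEventLog:Security",
     "event": {"EventCode": 4625, "IpAddress": "198.51.100.42"}},
]
request = build_hec_request(
    "https://splunk-host:8088/services/collector/event", "YOUR_HEC_TOKEN", events
)
# urllib.request.urlopen(request) would send the batch; it is left out here
# so the sketch has no network side effects.
print(request.get_method(), request.full_url, len(request.data), "bytes")
```

Unlike `curl -k`, this sketch does not disable TLS certificate verification, so a lab instance with a self-signed certificate would need an explicitly configured SSL context.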
To import a dashboard:
- In Splunk Web: Settings → User Interface → Views
- Click Create New View → Import
- Paste the XML from `dashboards/security_overview.xml`, `threat_hunting.xml`, or `attack_timeline.xml`
- Save and navigate to the dashboard
Export rules as a savedsearches.conf file:
python -m splunkforge rules --export ./spl_rules.conf

Then place the conf file in `$SPLUNK_HOME/etc/apps/YOUR_APP/local/savedsearches.conf` and restart Splunk, or import individual searches through Splunk Web → Search & Reporting → Activity → Searches, Reports, and Alerts.
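For orientation, a `savedsearches.conf` entry is a stanza named after the search, followed by `key = value` settings. The sketch below renders one from a rule dict; it is not the output of `rules --export`, whose exact layout is not shown in this document. The stanza naming, the 15-minute schedule, and the example SPL are all assumptions.

```python
def to_savedsearch_stanza(rule, cron="*/15 * * * *"):
    """Render a rule dict as a scheduled-search stanza.

    Whitespace in the SPL is collapsed so the search fits on one line;
    that would also alter runs of spaces inside quoted strings.
    """
    search = " ".join(rule["spl"].split())
    lines = [
        f"[{rule['id']} - {rule['name']}]",
        f"description = {rule['description']}",
        f"search = {search}",
        "enableSched = 1",
        f"cron_schedule = {cron}",
        "dispatch.earliest_time = -15m",
        "dispatch.latest_time = now",
    ]
    return "\n".join(lines) + "\n"


# Illustrative rule; the SPL is a simplified example, not the shipped AUTH-001 query.
rule = {
    "id": "AUTH-001",
    "name": "Multiple Failed Logins from Single Source",
    "description": "Many failed logons from one source IP",
    "spl": """index=main sourcetype="WinEventLog:Security" EventCode=4625
              | stats count as failures by src_ip
              | where failures > 10""",
}
stanza = to_savedsearch_stanza(rule)
print(stanza)
```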
Search Processing Language (SPL) is Splunk's query language. SplunkForge detection rules use core SPL constructs:
- `stats`: Aggregate events: `stats count as failures by src_ip` counts failures grouped by source IP
- `where`: Filter results: `where failures > 10` applies the threshold
- `eval`: Calculate fields: `eval risk=case(failures>50,"HIGH", failures>10,"MEDIUM", true(),"LOW")`
- `streamstats`: Running calculations over time (used for beacon interval analysis)
- `timechart`: Time-series aggregation for volume trending
- `transaction`: Group related events: used for correlating brute-force + success
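For readers newer to SPL, the `stats`, `where`, and `eval` steps above map closely onto ordinary group, filter, and compute operations. A rough Python equivalent of that three-stage pipeline, with each stage marked by the SPL it mirrors (the `src_ip` field name follows the SPL snippet; the function itself is only illustrative):

```python
from collections import Counter


def failures_by_source(events, threshold=10):
    # stats count as failures by src_ip
    counts = Counter(ev["src_ip"] for ev in events)
    rows = []
    for src_ip, failures in counts.most_common():
        # where failures > 10
        if failures <= threshold:
            continue
        # eval risk=case(failures>50,"HIGH", failures>10,"MEDIUM", true(),"LOW")
        if failures > 50:
            risk = "HIGH"
        elif failures > 10:
            risk = "MEDIUM"
        else:
            risk = "LOW"  # only reachable when threshold is set below 10
        rows.append({"src_ip": src_ip, "failures": failures, "risk": risk})
    return rows


events = (
    [{"src_ip": "198.51.100.42"}] * 60
    + [{"src_ip": "203.0.113.9"}] * 12
    + [{"src_ip": "192.0.2.7"}] * 3
)
rows = failures_by_source(events)
for row in rows:
    print(row)
```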
Detection rules trigger on individual event patterns. Correlation rules join patterns across multiple events or sourcetypes (e.g., AUTH-003 joins failures + success from the same IP). SplunkForge's more advanced rules demonstrate multi-step correlation using join, transaction, and lookup-based enrichment.
Every rule includes false_positive_notes. Good detection engineering starts with understanding what legitimate activity looks like before tuning thresholds. The --attack-ratio flag in generate mode lets you tune the signal-to-noise ratio to test rule sensitivity.
- Generate SplunkForge events for a scenario
- Run your detection rules against the events in Splunk
- Run `splunkforge coverage` to see which ATT&CK techniques have no rules
- Research the gap techniques and write new SPL rules
- Re-run the scenario and verify the new rules fire
This is the core loop of detection engineering.
SplunkForge/
├── splunkforge/
│ ├── main.py # CLI entry point
│ ├── utils.py # IP/hostname generators, constants
│ ├── generators/ # Event generators by log source
│ ├── scenarios/ # Multi-stage attack simulators
│ ├── detection/
│ │ ├── spl_library.py # Rule loading and query engine
│ │ ├── mitre_mapper.py # ATT&CK coverage matrix generator
│ │ └── rules/ # YAML rule definitions
│ └── formatters/ # JSON, CSV, syslog, CEF output
├── dashboards/ # Splunk dashboard XML files
├── tests/ # 145-test unittest suite
├── sample_output/ # Output format documentation
├── requirements.txt
├── setup.py
└── LICENSE
pip install -r requirements.txt
python -m pytest tests/ -v
# Run specific test modules
python -m pytest tests/test_generators.py -v
python -m pytest tests/test_scenarios.py -v
python -m pytest tests/test_spl_rules.py -v
python -m pytest tests/test_formatters.py -v

The suite's 145 tests cover event structure validation, scenario chronology, SPL rule YAML schema compliance, MITRE technique ID format validation, and formatter output correctness.
- Fork the repository
- Create a feature branch: `git checkout -b feature/new-scenario`
- Write detection rules in the appropriate YAML file under `splunkforge/detection/rules/`
- Add corresponding tests in `tests/`
- Run `python -m pytest tests/`; all tests must pass
- Submit a pull request
Rules follow a strict YAML schema (see detection/rules/authentication.yml for examples). Required fields: id, name, description, mitre_attack, severity, spl. The test suite validates schema compliance automatically.
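Before opening a pull request, a new rule can be sanity-checked against the required fields listed above. The sketch below is not the project's test code: it works on an already-parsed dict, and it assumes `mitre_attack` holds one technique ID or a list of them in the usual `T1234` or `T1234.001` form.

```python
import re

REQUIRED_FIELDS = ("id", "name", "description", "mitre_attack", "severity", "spl")
TECHNIQUE_ID = re.compile(r"^T\d{4}(\.\d{3})?$")


def validate_rule(rule):
    """Return a list of problems found in a parsed rule dict (empty if none)."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if not rule.get(name)]
    techniques = rule.get("mitre_attack") or []
    if isinstance(techniques, str):
        techniques = [techniques]
    errors += [
        f"bad technique ID: {tid}"
        for tid in techniques
        if not TECHNIQUE_ID.match(str(tid))
    ]
    return errors


good = {
    "id": "AUTH-001",
    "name": "Multiple Failed Logins from Single Source",
    "description": "Many failed logons from one source IP",
    "mitre_attack": ["T1110.001"],
    "severity": "high",
    "spl": "index=main EventCode=4625 | stats count as failures by src_ip",
}
bad = {
    "id": "X-001",
    "name": "Example",
    "description": "Missing its query",
    "mitre_attack": ["T11"],
    "severity": "high",
}
print(validate_rule(good))
print(validate_rule(bad))
```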
MIT License — see LICENSE for full text.
- MITRE ATT&CK Enterprise Matrix
- Splunk SPL Documentation
- Sysmon Event ID Reference
- Windows Security Event Log Reference
- LOLBAS Project
- Sigma Rules — Additional detection rule inspiration
- CEF Implementation Standard