- Overview
- Project Objectives
- Dataset
- Anomaly Types Modeled
- Methodology
- Anomaly Scoring Strategy
- Evaluation Protocol
- Results Summary
- Project Structure
- Usage Instructions
- Outputs & Deliverables
- Applications
- Key Takeaway
Modern network infrastructures (cloud, ISP backbones, enterprise VLANs, SOC environments) face increasingly complex failure and attack patterns. Traditional monitoring systems rely heavily on threshold-based alerts (CPU, memory, bandwidth), which are insufficient to detect topological misconfigurations or stealthy lateral connections.
This project proposes a graph-based anomaly detection framework that detects both:
- Attribute anomalies (e.g., abnormal resource usage)
- Structural anomalies (e.g., unauthorized links between isolated network segments)
by explicitly modeling the network topology using Graph Neural Networks (GNNs).
- Detect structural anomalies that cannot be identified using classical tabular methods
- Compare traditional ML (DBSCAN) with Graph Representation Learning
- Simulate a realistic secure network scenario (strict VLAN isolation)
- Demonstrate why topological context is essential for anomaly detection in networks
- Source: Internet Topology Zoo (conceptually inspired)
- Implementation: Synthetic VLAN-based network topology
- Model: Stochastic Block Model (SBM)
Each VLAN represents a secure subnet, where:
- Intra-VLAN communication is allowed
- Inter-VLAN communication is strictly forbidden
This design provides a clean ground truth for detecting structural violations.
💡 The framework is dataset-agnostic and can be applied to any real network topology provided as an edge list.
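The VLAN design above can be sketched as a stochastic block model in which the inter-block edge probability is zero. This is a minimal standard-library illustration, not the project's generator: the function name `generate_vlan_topology` and the parameters `num_vlans`, `nodes_per_vlan`, and `p_intra` are assumptions chosen for the example.

```python
import itertools
import random

def generate_vlan_topology(num_vlans=4, nodes_per_vlan=10, p_intra=0.4, seed=0):
    """Sample an SBM-style edge list with no inter-VLAN edges (hypothetical helper)."""
    rng = random.Random(seed)
    # Node n belongs to VLAN n // nodes_per_vlan.
    vlan_of = {n: n // nodes_per_vlan for n in range(num_vlans * nodes_per_vlan)}
    edges = []
    for u, v in itertools.combinations(vlan_of, 2):
        # Intra-VLAN pairs are linked with probability p_intra;
        # inter-VLAN pairs are never linked in the clean topology.
        if vlan_of[u] == vlan_of[v] and rng.random() < p_intra:
            edges.append((u, v))
    return vlan_of, edges

vlan_of, clean_edges = generate_vlan_topology()
print(len(vlan_of), "nodes,", len(clean_edges), "clean edges")
```

Because every clean edge stays inside one VLAN, any edge that later crosses VLANs is a structural violation by construction.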
Simulated as extreme CPU usage spikes:
- Normal nodes: CPU ∈ [0.1, 1.0]
- Anomalous nodes: CPU ∈ [90, 100]
These anomalies are designed to be easily detectable by DBSCAN, serving as a baseline.
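A minimal sketch of how such CPU spikes could be injected, using the two ranges stated above. The helper name `make_cpu_features` and the `anomaly_ratio` parameter are assumptions for illustration; the project's own feature generator may differ.

```python
import random

def make_cpu_features(num_nodes, anomaly_ratio=0.05, seed=0):
    """Return per-node CPU values and ground-truth labels (1 = anomalous)."""
    rng = random.Random(seed)
    num_anomalous = max(1, int(num_nodes * anomaly_ratio))
    anomalous = set(rng.sample(range(num_nodes), num_anomalous))
    cpu, labels = [], []
    for n in range(num_nodes):
        if n in anomalous:
            cpu.append(rng.uniform(90.0, 100.0))  # anomalous range from the text
            labels.append(1)
        else:
            cpu.append(rng.uniform(0.1, 1.0))     # normal range from the text
            labels.append(0)
    return cpu, labels

cpu, labels = make_cpu_features(40)
print(sum(labels), "anomalous nodes out of", len(labels))
```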
Injected as unauthorized links between distant VLANs:
- Example: Direct connections between VLAN 0 and VLAN 3
- Represent:
  - Firewall misconfigurations
  - Unauthorized tunnels
  - Lateral movement / backdoors
These anomalies leave node attributes unchanged, making them invisible to feature-only methods such as DBSCAN.
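The injection step might look like the following sketch. It adds a few edges between nodes of different VLANs and leaves node features untouched. The helper `inject_bridges`, the toy `vlan_of` mapping, and `num_bridges` are assumptions; the sketch also accepts any two different VLANs rather than restricting itself to "distant" ones.

```python
import random

def inject_bridges(vlan_of, edges, num_bridges=3, seed=0):
    """Add unauthorized inter-VLAN links; returns (all_edges, injected_bridges)."""
    rng = random.Random(seed)
    existing = {frozenset(e) for e in edges}
    nodes = sorted(vlan_of)
    bridges = []
    while len(bridges) < num_bridges:
        u, v = rng.sample(nodes, 2)
        pair = frozenset((u, v))
        # Keep only new links whose endpoints sit in different VLANs.
        if vlan_of[u] != vlan_of[v] and pair not in existing:
            existing.add(pair)
            bridges.append((u, v))
    return list(edges) + bridges, bridges

# Toy clean topology: three VLANs with two nodes each.
vlan_of = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}
clean_edges = [(0, 1), (2, 3), (4, 5)]
test_edges, bridges = inject_bridges(vlan_of, clean_edges)
print("injected bridges:", bridges)
```

The returned `bridges` list doubles as the ground truth for structural anomalies.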
- Nodes are treated as independent samples
- Features used:
  - CPU usage
  - Memory usage
- No graph structure is considered
Anomalous nodes lie in low-density regions of the feature space.
- Ignores adjacency and topology
- Cannot detect structural anomalies
- Fails when anomalies are purely relational
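To make the baseline concrete, here is a small standard-library DBSCAN run on (CPU, memory) pairs. It is a sketch for illustration only: the project may well use a library implementation, and the `eps` and `min_samples` values and the memory numbers are assumptions.

```python
import math

def dbscan(points, eps, min_samples):
    """Minimal DBSCAN; returns one label per point, with -1 meaning noise."""
    n = len(points)
    # eps-neighborhoods; each point counts itself, as in the usual definition.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels

# (CPU, memory) per node; the last node carries a CPU spike.
points = [(0.2, 0.3), (0.4, 0.35), (0.5, 0.4), (0.6, 0.3), (0.3, 0.5), (95.0, 0.4)]
labels = dbscan(points, eps=0.5, min_samples=3)
print(labels)  # -> [0, 0, 0, 0, 0, -1]
```

The spiking node is labeled as noise because it has no dense neighborhood in feature space. A node that only gains an unauthorized link keeps the same features and therefore stays inside its cluster.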
Encoder
- GraphSAGE-based encoder
- Learns node embeddings by aggregating neighborhood information
- Captures:
  - VLAN structure
  - Connectivity patterns
  - Structural regularities
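The neighborhood aggregation idea can be illustrated with one mean-aggregator layer in the spirit of GraphSAGE: each node combines its own features with the mean of its neighbors' features. This is a hand-rolled sketch with fixed toy weights, not the project's encoder; `sage_mean_layer`, `W_self`, and `W_neigh` are names assumed for the example, and a real model would learn the weights.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sage_mean_layer(features, adjacency, W_self, W_neigh):
    """One layer: ReLU(W_self @ h_v + W_neigh @ mean of neighbor features)."""
    out = {}
    for node, h in features.items():
        nbrs = adjacency.get(node, [])
        if nbrs:
            mean = [sum(features[u][k] for u in nbrs) / len(nbrs) for k in range(len(h))]
        else:
            mean = [0.0] * len(h)  # isolated node: no neighbor signal
        z = [a + b for a, b in zip(matvec(W_self, h), matvec(W_neigh, mean))]
        out[node] = [max(0.0, v) for v in z]  # ReLU
    return out

features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
adjacency = {0: [1, 2], 1: [0], 2: [0]}
W_self = [[1.0, 0.0], [0.0, 1.0]]   # toy weights, fixed for illustration
W_neigh = [[0.5, 0.0], [0.0, 0.5]]
embeddings = sage_mean_layer(features, adjacency, W_self, W_neigh)
print(embeddings[0])  # -> [1.25, 0.5]
```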
Decoder
- Dot-product decoder
- Reconstructs the adjacency matrix
- Outputs link existence probabilities
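A dot-product decoder can be sketched in a few lines. Passing the dot product through a sigmoid to obtain a probability is the common Graph Auto-Encoder convention and is assumed here; the embedding values are made up for the example.

```python
import math

def link_probability(z_u, z_v):
    """Sigmoid of the embedding dot product, read as a link existence probability."""
    dot = sum(a * b for a, b in zip(z_u, z_v))
    return 1.0 / (1.0 + math.exp(-dot))

# Toy embeddings: nodes a and b share a VLAN, node c sits in another one.
z_a, z_b, z_c = [2.0, 0.1], [1.8, 0.2], [-1.9, 0.3]
p_same = link_probability(z_a, z_b)
p_cross = link_probability(z_a, z_c)
print(round(p_same, 3), round(p_cross, 3))
```

Embeddings that point in similar directions yield a probability near 1, while an inter-VLAN pair with dissimilar embeddings yields a probability near 0.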
The model is trained on a clean topology only.
At inference time:
- Links that cannot be reconstructed accurately are assigned a high reconstruction error and flagged as structural anomalies
- Edge-level: Low reconstructed probability ⇒ suspicious link
- Node-level: A node is anomalous if it participates in at least one suspicious link
Final node anomaly score:
score(node) = 1 − min(reconstructed_link_probability), where the minimum is taken over the links incident to the node
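The scoring rule can be sketched as follows, taking the minimum over the links incident to each node. The helper name `node_scores`, the 0.5 flagging threshold, and the choice to give isolated nodes a score of 0 are assumptions made for this example.

```python
def node_scores(edge_probs, nodes):
    """score(node) = 1 - min reconstructed probability over the node's links."""
    min_prob = {}
    for (u, v), p in edge_probs.items():
        for n in (u, v):
            min_prob[n] = min(p, min_prob.get(n, 1.0))
    # Nodes without any link default to min probability 1.0, i.e. score 0.0.
    return {n: 1.0 - min_prob.get(n, 1.0) for n in nodes}

# Reconstructed probabilities per link; (2, 3) is poorly reconstructed.
edge_probs = {(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.1}
scores = node_scores(edge_probs, nodes=[0, 1, 2, 3, 4])
flagged = sorted(n for n, s in scores.items() if s > 0.5)
print(flagged)  # -> [2, 3]
```

Both endpoints of a suspicious link receive a high score, which matches the node-level rule that one suspicious link is enough to flag a node.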
- Known injected CPU anomalies
- Known injected inter-VLAN bridges
- Precision
- Recall
- F1-Score
- ROC-AUC (GNN only)
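For reference, the listed metrics can be computed from ground-truth labels, binary predictions, and continuous scores as in this standard-library sketch. The project may rely on a metrics library instead, and the label and score values below are made up.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = anomaly)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def roc_auc(y_true, scores):
    """Probability that a random anomaly outscores a random normal node (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [0, 0, 1, 1, 0]
y_pred = [0, 1, 1, 0, 0]
scores = [0.1, 0.6, 0.9, 0.4, 0.2]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(precision, recall, f1, round(auc, 3))  # -> 0.5 0.5 0.5 0.833
```

ROC-AUC needs a continuous score, which fits the GNN's reconstruction-based scores. DBSCAN only produces cluster or noise labels, which is consistent with reporting ROC-AUC for the GNN only.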
| Method | Attribute Anomalies | Structural Anomalies | Topology-Aware |
|---|---|---|---|
| DBSCAN | ✅ Detected | ❌ Missed | ❌ No |
| GNN (GraphSAGE + GAE) | ✅ Detected | ✅ Detected | ✅ Yes |
- DBSCAN performs well only when anomalies affect raw features
- GNN successfully detects stealth structural violations
- Structural context is critical for robust network anomaly detection
Network_Anomaly_Detection/
│
├── data/
│ ├── raw/ # Clean topology
│ └── processed/ # Nodes, edges, predictions
│
├── utils/
│ ├── data_loader.py
│ ├── feature_generator.py
│ ├── dataset.py
│ ├── models.py
│ ├── baseline.py
│ ├── train.py
│ └── visualization.py
│
├── notebooks/
│ └── NADGNN.ipynb
│
├── models/
│ └── gnn_model.pth
│
├── output/
│ ├── dashboard.png
│ └── risk_map.png
│
├── config.py
├── main.py
├── run.sh
└── requirements.txt
pip install -r requirements.txt

./run.sh

This will:
- Generate the network topology
- Inject anomalies
- Train the GNN
- Evaluate DBSCAN vs GNN
- Save results, metrics, and visualizations
- nodes.csv: node features + ground truth
- edges_train.csv: clean topology
- edges_test.csv: topology with anomalies
- results_gnn_predictions.csv: final scores & predictions
- gnn_model.pth: trained Graph Auto-Encoder
- Dashboard: Training loss, ROC, confusion matrix, metrics comparison
- Risk Map: Network visualization with detected anomalous links
- SOC automation & zero-trust validation
- Cloud network misconfiguration detection
- ISP backbone monitoring
- Insider threat & lateral movement detection
- Digital twin simulation of secure networks
Anomalies in networks are not always about “high values” — they are often about “wrong connections.”
Graph Neural Networks provide the necessary inductive bias to understand and protect network structure, making them indispensable for next-generation network security and monitoring systems.