HQNN Fraud Detection Benchmark

Academic Context


Degree	Bachelor of Science Business Informatics
Institution	IU International University of Applied Sciences, Munich
Supervisor	Prof. Dr. rer. nat. Michael Barth
Student	Gregor Kobilarov
Dataset	Kaggle Credit Card Fraud Detection (n = 284,807)

Research Question

This study explicitly avoids framing HQNNs as absolute performance replacements for classical deep learning. Given the physical constraints of the NISQ era and the parameter-hungry nature of classical architectures, the primary objective is to evaluate the efficiency of feature extraction: whether HQNNs can extract a concentrated predictive signal using a fraction of the structural complexity required by state-of-the-art classical models.

"To what extent do Hybrid Quantum Neural Networks (HQNNs) demonstrate a Parameter Efficiency Advantage — defined by superior ratios of predictive performance to trainable parameter count (MCC/kParam and PR-AUC/kParam) — over state-of-the-art classical models when classifying highly imbalanced financial tabular data?"

Hypotheses

H1 (Parameter Efficiency Advantage): HQNNs achieve higher MCC/kParam and PR-AUC/kParam ratios than all classical baselines.

H2 (Competitive Absolute Performance): HQNNs achieve an MCC that is not significantly inferior to that of SNN (the closest classical baseline by parameter count), with MCC as the primary classification metric. PR-AUC is reported as a secondary metric and may show a modest disadvantage consistent with the parameter constraint. Notably, SHNN also surpasses TabNet in MCC despite using about 50× fewer parameters.

H3 (Non-Trivial VQC Contribution): The VQC component provides a non-trivial predictive signal; removing it causes the model to collapse to random prediction (operationalised via the ablation study).

Results

5-fold stratified CV on the Kaggle Credit Card Fraud dataset (n = 284,807). Metrics are mean ± std across folds.

Model	Type	Params	MCC	PR-AUC	MCC / kParam	PR-AUC / kParam
SHNN	Quantum hybrid	122	0.5758 ± 0.0371	0.5910 ± 0.0323	4.720	4.844
Parallel Hybrid	Quantum hybrid	489	0.5688 ± 0.0371	0.6239 ± 0.0101	1.163	1.276
SNN	Classical	3,201	0.5633 ± 0.0139	0.6449 ± 0.0086	0.176	0.201
TabNet	Classical	6,176	0.4824 ± 0.0732	0.6551 ± 0.0399	0.078	0.106
ResNet	Classical	8,897	0.6933 ± 0.0329	0.7170 ± 0.0164	0.078	0.081
FT-Transformer	Classical	14,869	0.6934 ± 0.0164	0.7061 ± 0.0220	0.047	0.047
SAINT	Classical	29,357	0.6975 ± 0.0164	0.6570 ± 0.0505	0.024	0.022

Key finding: SHNN achieves an MCC comparable to SNN (0.576 vs 0.563) with 26× fewer parameters, yielding a ~27× MCC/kParam advantage and a ~24× PR-AUC/kParam advantage. The larger classical models (ResNet, FT-Transformer, SAINT) achieve higher absolute MCC, but at 73–240× the parameter count of SHNN, which leaves their MCC/kParam 60–197× lower. The central thesis claim, quantum advantage through parameter efficiency rather than raw performance, is supported.
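The ratios quoted in the key finding follow from the table by simple arithmetic. The sketch below recomputes them from the unrounded table values; the names `RESULTS`, `per_kparam`, and `advantage` are illustrative and not taken from the repository. Because the efficiency columns in the table are rounded to three decimals, the recomputed ratios can differ slightly from the quoted ones.

```python
# (trainable params, MCC, PR-AUC) copied from the results table
RESULTS = {
    "SHNN": (122, 0.5758, 0.5910),
    "Parallel Hybrid": (489, 0.5688, 0.6239),
    "SNN": (3201, 0.5633, 0.6449),
    "TabNet": (6176, 0.4824, 0.6551),
    "ResNet": (8897, 0.6933, 0.7170),
    "FT-Transformer": (14869, 0.6934, 0.7061),
    "SAINT": (29357, 0.6975, 0.6570),
}


def per_kparam(metric: float, params: int) -> float:
    """Metric value per 1,000 trainable parameters."""
    return metric / (params / 1000)


def advantage(model: str, baseline: str) -> dict:
    """How many times smaller and more parameter-efficient `model` is than `baseline`."""
    p_m, mcc_m, pr_m = RESULTS[model]
    p_b, mcc_b, pr_b = RESULTS[baseline]
    return {
        "param_ratio": p_b / p_m,
        "mcc_eff_ratio": per_kparam(mcc_m, p_m) / per_kparam(mcc_b, p_b),
        "prauc_eff_ratio": per_kparam(pr_m, p_m) / per_kparam(pr_b, p_b),
    }


for name in ("SNN", "ResNet", "SAINT"):
    ratios = advantage("SHNN", name)
    print(name, {key: round(value, 1) for key, value in ratios.items()})
```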

Evaluation Design

The benchmark measures two dimensions:

1. Absolute performance: MCC and PR-AUC across 5 stratified CV folds, reported as mean ± std. Statistical consistency is assessed via the Wilcoxon signed-rank test. With n = 5 folds, the smallest achievable two-sided p-value is 2/2⁵ = 0.0625, which exceeds the standard 0.05 significance threshold; rank-biserial correlation therefore serves as the primary effect size measure.

2. Parameter efficiency: MCC/kParam and PR-AUC/kParam (predictive performance per 1,000 trainable parameters) for each architecture. This operationalises the theoretical quantum expressivity advantage: a small quantum model should achieve disproportionately high performance relative to its parameter count compared to larger classical baselines.
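The 0.0625 floor on the p-value can be checked by enumerating all sign patterns of five paired differences. The sketch below is a plain-Python illustration, not the repository's `run_statistics.py`; it assumes no zero differences and no tied magnitudes, and the fold-level differences in the usage example are hypothetical. The rank-biserial correlation is computed as (W+ − W−) / (W+ + W−).

```python
from itertools import product


def signed_rank_sums(diffs):
    """Return (W+, W-): rank sums of positive and negative paired differences."""
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if diffs[i] > 0)
    w_minus = sum(rank for rank, i in enumerate(order, start=1) if diffs[i] < 0)
    return w_plus, w_minus


def exact_two_sided_p(diffs):
    """Exact two-sided Wilcoxon signed-rank p-value via full enumeration."""
    n = len(diffs)
    observed = min(signed_rank_sums(diffs))
    total = n * (n + 1) // 2
    hits = 0
    # Under H0 every one of the 2^n sign patterns is equally likely.
    for signs in product((0, 1), repeat=n):
        w = sum(rank for rank, s in zip(range(1, n + 1), signs) if s)
        if min(w, total - w) <= observed:
            hits += 1
    return hits / 2 ** n


def rank_biserial(diffs):
    """Matched-pairs rank-biserial correlation in [-1, 1]."""
    w_plus, w_minus = signed_rank_sums(diffs)
    return (w_plus - w_minus) / (w_plus + w_minus)


# Hypothetical per-fold MCC differences (model A minus model B), all positive.
diffs = [0.01, 0.03, 0.02, 0.05, 0.04]
print(exact_two_sided_p(diffs), rank_biserial(diffs))
```

Even when one model wins on every fold, the p-value cannot drop below 0.0625, while the rank-biserial correlation reaches its maximum of 1.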

Dataset


Source	Kaggle — Credit Card Fraud Detection
Samples	284,807 transactions
Features	30 (28 PCA-anonymised V1–V28 + Amount + Time)
Target	Binary: Fraud (492 cases, 0.17%) / Legitimate (284,315 cases)
CV	5-fold stratified, SMOTE applied strictly inside each fold

# Download via Kaggle CLI
kaggle datasets download -d mlg-ulb/creditcardfraud --path data/raw --unzip

# Or manually place creditcard.csv at:
# data/raw/creditcard.csv

Project Structure

hqnn-fraud-detection-benchmark/
├── configs/
│   └── default.yaml            # All hyperparameters (single source of truth)
├── data/
│   └── raw/                    # Place creditcard.csv here
├── results/
│   ├── folds/                  # Per-fold JSON results
│   ├── figures/                # Generated plots
│   ├── metrics/                # Aggregated metrics
│   └── models/                 # Saved model states
├── scripts/
│   ├── run_benchmark.py        # Full benchmark entrypoint
│   ├── run_fold.py             # Single fold (for parallel dispatch)
│   └── run_plots.py            # Plot generation
├── src/
│   ├── config.py               # Pydantic config schema
│   ├── data/                   # Loader, CV splits, preprocessing
│   ├── models/                 # SHNN, Parallel Hybrid, SNN, TabNet, FT-T, ResNet, SAINT
│   ├── training/               # PyTorch training loop with early stopping
│   └── evaluation/             # Metrics, statistical tests, plots
└── tests/                      # pytest suite

Setup

# Install pixi (if not already installed)
curl -fsSL https://pixi.sh/install.sh | bash

# Install all dependencies
pixi install

Running

# Full benchmark (all models, 5 folds)
pixi run benchmark

# Single fold — for parallel dispatch across machines
pixi run fold -- --model shnn --fold 0
pixi run fold -- --model parallel --fold 0

# Tests
pixi run test

All hyperparameters are in configs/default.yaml. Pass --config path/to/custom.yaml to override.

Reproducing Figures

Fold results are committed, so all benchmark figures can be regenerated without re-running the benchmark (~185 h).

# 12 benchmark figures — no dataset needed, reads committed results/folds/*.json
pixi run plots

# 6 conceptual figures — requires creditcard.csv in data/raw/
pixi run python scripts/run_conceptual_figures.py

# Wilcoxon signed-rank statistics — no dataset needed
pixi run python scripts/run_statistics.py

All output goes to results/figures/.

Preprocessing Pipeline

Challenge	Solution
Class imbalance (~0.17% fraud)	SMOTE inside each CV fold (never globally)
Outliers in Amount	RobustScaler (median/IQR-based)
Qubit count constraint	PCA to 8 components (= qubit count), fitted on train fold only
Angle encoding range	MinMax to [0, π] after RobustScaler
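The table above can be read as a per-fold recipe: fit every transformer on the training fold only, then oversample the training fold only. The following is a minimal sketch under stated assumptions, not the repository's preprocessing code. It uses scikit-learn for the scaling chain; `preprocess_fold` and `smote_like` are hypothetical names; `clip=True` on the MinMax step is an assumption that keeps validation angles inside [0, π]; and `smote_like` is a simplified SMOTE-style interpolation written in NumPy rather than the imbalanced-learn implementation. Applying the oversampling after the scaling chain is also an assumption.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, RobustScaler

N_QUBITS = 8  # one PCA component per qubit


def preprocess_fold(X_train, X_val):
    """Fit RobustScaler -> PCA -> MinMax on the train fold, apply to both splits."""
    chain = make_pipeline(
        RobustScaler(),                                       # median/IQR scaling
        PCA(n_components=N_QUBITS),                           # 30 features -> 8
        MinMaxScaler(feature_range=(0.0, np.pi), clip=True),  # angle range [0, pi]
    )
    X_train_t = chain.fit_transform(X_train)  # statistics from the train fold only
    X_val_t = chain.transform(X_val)          # validation data never influences the fit
    return X_train_t, X_val_t


def smote_like(X_minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between a minority sample
    and one of its k nearest minority neighbours (requires more than k rows)."""
    rng = np.random.default_rng(seed)
    diff = X_minority[:, None, :] - X_minority[None, :, :]
    dist = (diff ** 2).sum(axis=-1)          # pairwise squared distances
    np.fill_diagonal(dist, np.inf)           # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, len(X_minority), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))             # interpolation factor in [0, 1)
    return X_minority[base] + gap * (X_minority[picked] - X_minority[base])
```

Inside a CV loop, `preprocess_fold` would be called once per fold and `smote_like` only on the fraud rows of the transformed training split, so that no synthetic sample is ever derived from validation data.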

HQNN Architectures

SHNN — Sequential Hybrid Neural Network

Input (8) → Linear(8→8) + PiSigmoid → VQC(8 qubits, 2 layers) → Linear(1→1) + Sigmoid → Output

VQC uses AngleEmbedding + BasicEntanglerLayers
48 quantum parameters, ~74 classical parameters, ~122 total
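The classical share of the parameter count can be reproduced with simple arithmetic. The sketch below assumes that both Linear layers carry bias terms and that PiSigmoid and Sigmoid have no trainable parameters; the quantum count of 48 is taken from the text as stated rather than derived from the circuit template. `linear_params` is an illustrative helper, not a repository function.

```python
def linear_params(n_in: int, n_out: int, bias: bool = True) -> int:
    """Trainable parameters of a fully connected layer."""
    return n_in * n_out + (n_out if bias else 0)


input_layer = linear_params(8, 8)    # 8*8 weights + 8 biases = 72
output_layer = linear_params(1, 1)   # 1 weight + 1 bias = 2
classical = input_layer + output_layer
quantum = 48                         # as stated in the text
total = classical + quantum

mcc = 0.5758                         # SHNN mean MCC from the results table
print(classical, total, round(mcc / (total / 1000), 3))
```

With these assumptions the classical layers contribute 74 parameters and the total is 122, which matches the Params column of the results table.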

Parallel Hybrid Neural Network

Input (8) ──┬─→ MLP [16, 8] ──────────────────┐
            └─→ VQC(8 qubits, 2 layers) ──┐   concat → FC [8] → Sigmoid → Output
                                           └───┘

Quantum and classical streams process the same input independently
Outputs are concatenated before the final classification head

Evaluation

Metric	Role
MCC	Primary: balanced, threshold-aware, robust to imbalance
PR-AUC	Secondary: threshold-free, captures the precision/recall trade-off

Early stopping: patience 20 (quantum) / 15 (classical), monitored on validation MCC.

Final threshold: tuned on the validation set after training using find_optimal_threshold.

Statistical test: Wilcoxon signed-rank; effect size: rank-biserial correlation.
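Because MCC depends on the decision threshold, the threshold is tuned on validation scores after training. The sketch below shows one way such a search could work: a plain grid scan that keeps the threshold with the highest MCC. It is not the repository's find_optimal_threshold, whose signature and search strategy are not documented here; `scan_threshold_for_mcc` and the toy scores are hypothetical.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; defined as 0 when the denominator is 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def scan_threshold_for_mcc(y_true, scores, grid=None):
    """Return (threshold, mcc) for the first grid point with the highest MCC."""
    grid = grid or [i / 100 for i in range(1, 100)]
    best_t, best_m = 0.5, -1.0
    for t in grid:
        m = mcc(y_true, [int(s >= t) for s in scores])
        if m > best_m:
            best_t, best_m = t, m
    return best_t, best_m


# Toy validation set: the two positives score above every negative.
y_val = [0, 0, 0, 0, 1, 1]
val_scores = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
print(scan_threshold_for_mcc(y_val, val_scores))
```

The tuned threshold must come from validation data only; applying it unchanged to the held-out fold keeps the reported MCC free of test-set leakage.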

Ablation

A structural ablation replaces the VQC output with a constant zero vector to isolate the quantum contribution. Both variants were trained from scratch for 10 epochs on fold 0 with otherwise identical settings:

Condition	MCC	Loss
SHNN (full, 10 epochs)	~0.22	~0.08
SHNN (VQC → zeros, 10 epochs)	0.000	0.6932 (random)

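In this ablation the VQC carries 100% of the model's predictive signal. With its output replaced by zeros, the final Linear(1→1) + Sigmoid head receives a constant input, and SHNN collapses to random prediction (MCC 0.000, loss ≈ ln 2 ≈ 0.693).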
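The ablated loss of 0.6932 is what a model outputs when it predicts a probability of about 0.5 for every sample. Assuming the training loss is binary cross-entropy (the loss function is not named in this README), a constant 0.5 prediction gives exactly ln 2 regardless of the class balance. The labels below are hypothetical.

```python
import math


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy with probabilities clamped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1] * 2 + [0] * 998        # hypothetical, heavily imbalanced labels
constant = [0.5] * len(labels)      # input-independent prediction
loss = binary_cross_entropy(labels, constant)
print(round(loss, 4), round(math.log(2), 4))
```

A constant prediction also produces only one predicted class at any threshold, so the MCC denominator is zero and MCC is 0 by the usual convention.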

Glossary

Term	Definition
Qubit	Quantum bit: unlike a classical bit, it can be in a superposition of the basis states 0 and 1; a measurement yields one of the two
VQC	Variational Quantum Circuit: a parameterised quantum circuit whose parameters are trained by gradient descent
AngleEmbedding	Encodes classical features as rotation angles on qubits (one feature per qubit)
Adjoint differentiation	An efficient gradient method for simulated quantum circuits; it yields the same exact gradients as the parameter-shift rule with far fewer circuit evaluations, but is available only on state-vector simulators
NISQ	Noisy Intermediate-Scale Quantum: the current era of hardware with roughly 50–1000 qubits and non-negligible error rates
Hilbert space	The mathematical space in which quantum states live; its dimension grows exponentially with qubit count (2ⁿ dimensions for n qubits)
SMOTE	Synthetic Minority Oversampling Technique: generates synthetic fraud examples by interpolating between minority-class neighbours to counteract class imbalance
MCC	Matthews Correlation Coefficient: single balanced metric for binary classification on imbalanced data (range: −1 to +1)
PR-AUC	Area under the Precision-Recall curve; more informative than ROC-AUC under heavy class imbalance
Wilcoxon signed-rank	Non-parametric paired statistical test used to compare fold-level metrics without a normality assumption
Barren plateau	Phenomenon where the variance of the gradients vanishes exponentially with the number of qubits in sufficiently deep or expressive circuits, making VQC training increasingly difficult
Parameter efficiency	MCC (or PR-AUC) per 1,000 trainable parameters (MCC/kParam, PR-AUC/kParam): the central thesis metric quantifying representational value per unit of model complexity
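To make the AngleEmbedding entry concrete, the sketch below simulates a single qubit in NumPy. It assumes X rotations (the default rotation of PennyLane's AngleEmbedding template; the template names used in this README suggest PennyLane, though the library is not named here). Applying RX(θ) to the state |0⟩ and measuring Pauli-Z gives an expectation value of cos θ, which is strictly decreasing on [0, π]. This is why scaling features to that range maps distinct feature values to distinct measurement outcomes.

```python
import numpy as np


def rx(theta: float) -> np.ndarray:
    """Matrix of a single-qubit rotation about the X axis."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])


def expval_z_after_rx(theta: float) -> float:
    """<Z> after encoding one feature as an RX angle on the |0> state."""
    state = rx(theta) @ np.array([1.0, 0.0], dtype=complex)
    pauli_z = np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)
    return float(np.real(np.conj(state) @ pauli_z @ state))


angles = np.linspace(0.0, np.pi, 5)
values = [expval_z_after_rx(theta) for theta in angles]
print(np.round(values, 3))
```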

Author

Gregor Kobilarov

License

MIT
