This repository evaluates synthetic tabular datasets against a real reference dataset using statistically grounded and model-based checks.
It is designed as an end-to-end workflow:
- schema inference and alignment (`strict` or `intersection` mode)
- per-dataset metric pipeline (similarity, detectability, privacy, optional utility)
- normalization + weighted composite scoring
- ranked reporting (`metrics.json`, `scores.csv`, `plots/`, `logs.txt`)
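
The two alignment modes are named above but not defined here. As a minimal sketch, assuming `strict` means the synthetic dataset must have exactly the real dataset's columns and `intersection` means only shared columns are kept, the difference could look like this. The function `align_columns` is hypothetical and is not part of `sdeval`.

```python
from typing import List, Sequence


def align_columns(
    real_cols: Sequence[str],
    synth_cols: Sequence[str],
    mode: str = "strict",
) -> List[str]:
    """Return the columns to evaluate, in the real dataset's order.

    Assumed semantics (not confirmed by the README):
    - "strict": column sets must match exactly, otherwise raise.
    - "intersection": keep only columns present in both datasets.
    """
    real_set, synth_set = set(real_cols), set(synth_cols)

    if mode == "strict":
        if real_set != synth_set:
            missing = sorted(real_set - synth_set)
            extra = sorted(synth_set - real_set)
            raise ValueError(f"schema mismatch: missing={missing}, extra={extra}")
        return list(real_cols)

    if mode == "intersection":
        shared = [c for c in real_cols if c in synth_set]
        if not shared:
            raise ValueError("real and synthetic datasets share no columns")
        return shared

    raise ValueError(f"unknown alignment mode: {mode!r}")
```

Under these assumptions, `intersection` is the forgiving choice when a generator drops or adds columns, while `strict` surfaces such differences as errors before any metric is computed.
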
Architecture diagram (Mermaid source): `docs/architecture.md`

Install dependencies:

    uv sync

Basic evaluation:

    uv run sdeval --real raw\sample_dataset.csv --synthetic synthetic\ctgan_1x.csv synthetic\tvae_1x.csv --out reports\demo

With utility enabled via config:

    uv run sdeval --real raw\sample_dataset.csv --synthetic synthetic\ctgan_1x.csv synthetic\tvae_1x.csv --config configs\examples\example_with_target.yaml --out reports\demo_with_target

Format, lint, and test:

    uv run ruff format .
    uv run ruff check .
    uv run pytest -q

What the pipeline does:

- infer data types and exclude low-signal columns by default (constant / ID-like)
- align real and synthetic columns with explicit conversion tracking
- compute similarity (KS, Wasserstein, JSD, correlation drift)
- run detectability model (real-vs-synthetic ROC-AUC)
- compute privacy indicators (exact match, NN ratio, QID collisions)
- optionally compute utility (TSTR vs baseline) when target is provided
- normalize family scores to `[0,1]`, apply weights, rank synthetic datasets
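
The last step above (normalize, weight, rank) can be sketched as follows. This is an illustration under stated assumptions, not the scoring code used by `sdeval`: the weights, the AUC-to-score mapping, and the function names `detectability_score`, `composite_score`, and `rank_datasets` are all hypothetical. It assumes each family score is already in `[0,1]` with higher being better, and that a family that was not computed (for example utility without a target) is left out of the weighted mean.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical weights for illustration; the tool's actual defaults are not stated here.
EXAMPLE_WEIGHTS: Dict[str, float] = {
    "similarity": 0.4,
    "detectability": 0.3,
    "privacy": 0.2,
    "utility": 0.1,
}


def detectability_score(auc: float) -> float:
    """Map a real-vs-synthetic ROC-AUC to [0, 1].

    Assumed mapping: an AUC of 0.5 (the classifier cannot tell the datasets
    apart) scores 1.0, and an AUC of 0.0 or 1.0 scores 0.0.
    """
    return max(0.0, 1.0 - 2.0 * abs(auc - 0.5))


def composite_score(
    family_scores: Dict[str, Optional[float]],
    weights: Dict[str, float],
) -> float:
    """Weighted mean over the families that were actually computed."""
    available = {name: s for name, s in family_scores.items() if s is not None}
    total_weight = sum(weights[name] for name in available)
    if total_weight <= 0:
        raise ValueError("no weighted family scores available")
    # Clamp each score to [0, 1] before weighting.
    weighted = sum(
        weights[name] * min(1.0, max(0.0, s)) for name, s in available.items()
    )
    return weighted / total_weight


def rank_datasets(
    datasets: Dict[str, Dict[str, Optional[float]]],
    weights: Optional[Dict[str, float]] = None,
) -> List[Tuple[str, float]]:
    """Return (name, composite score) pairs, best first."""
    weights = EXAMPLE_WEIGHTS if weights is None else weights
    scored = [(name, composite_score(s, weights)) for name, s in datasets.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up inputs, only to show the call shape; utility is skipped (no target).
example = {
    "synth_a": {
        "similarity": 0.8,
        "detectability": detectability_score(0.70),
        "privacy": 0.9,
        "utility": None,
    },
    "synth_b": {
        "similarity": 0.7,
        "detectability": detectability_score(0.90),
        "privacy": 0.95,
        "utility": None,
    },
}
for name, score in rank_datasets(example):
    print(f"{name}: {score:.3f}")
```

Dividing by the sum of the weights actually used keeps composite scores comparable between runs with and without the optional utility family.
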

Each run writes:

- Metrics and raw details: `reports/<run>/metrics.json`
- Ranking table: `reports/<run>/scores.csv`
- Plots (drift, radar, ranking): `reports/<run>/plots/`
- Run log + warnings: `reports/<run>/logs.txt`

Export a `requirements.txt`:

    uv pip compile pyproject.toml -o requirements.txt


