Driftmap Public Harness (llm-eval-harness-lite)

Lightweight, public-safe LLM evaluation harness starter kit: CSV prompt suites + run logs for refusal, boundary integrity, uncertainty, and drift tracking.

This repo is the public-safe, runnable harness component of the Driftmap program. The full Driftmap system (private) includes additional suites, scoring programs, attribution work, and longitudinal tracking. Nothing private is published here.

Why This Matters

Deployed AI systems need measurable behavioral consistency over time. This harness provides reproducible test suites and run logs that cover:

Drift: changes in refusal boundaries, reasoning patterns, or uncertainty calibration
Boundary integrity: whether a system maintains a clear scope and does not absorb user intent into its identity
Reproducibility: complete audit trails via CSV-based run logs and scoring rubrics

This approach enables systematic evaluation of AI safety properties that are critical for reliable deployment.
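
As a rough illustration of what drift tracking over CSV run logs can look like, the sketch below compares two scored runs and lists the prompts whose score changed. The column names `prompt_id` and `score`, the score values, and both helper functions are assumptions made for this example; the actual schema is described in docs/run_log_schema_driftmap.md, and this is not the Driftmap scoring program.

```python
import csv
import io


def load_scores(csv_text, id_col="prompt_id", score_col="score"):
    """Map prompt id -> score from CSV text. Column names are hypothetical."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[id_col]: row[score_col] for row in reader}


def changed_prompts(baseline, current):
    """Return {prompt_id: (old, new)} for prompts in both runs whose score differs."""
    shared = sorted(baseline.keys() & current.keys())
    return {
        pid: (baseline[pid], current[pid])
        for pid in shared
        if baseline[pid] != current[pid]
    }


# Two tiny made-up runs, only to show the shape of the comparison.
run_a = "prompt_id,score\nR1,pass\nR2,pass\n"
run_b = "prompt_id,score\nR1,pass\nR2,fail\n"

drift = changed_prompts(load_scores(run_a), load_scores(run_b))
print(drift)
```

Comparing only prompts present in both runs keeps the check meaningful when a suite gains or loses rows between dates.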

Driftmap docs (public-safe)

Drift taxonomy: docs/drift_taxonomy.md
Metrics definitions: docs/metrics_definitions_driftmap.md
Run log schema: docs/run_log_schema_driftmap.md

License

Code: MIT (see LICENSE.md)
Documentation + prompt suites (CSV): CC BY-ND 4.0 (as noted in LICENSE.md)

How to run (manual, no code)

1. Open prompts/suite_refusal_basic.csv
2. Copy each prompt into LM Studio (or AnythingLLM if testing with documents)
3. Paste outputs into results/results_refusal_basic_template.csv
4. Score using docs/rubric_refusal_basic.md
5. Save as a new file: results/results_refusal_basic_<date>.csv
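
The manual workflow needs no code, but the final step can be sketched in Python for anyone who prefers to start each run from a fresh copy of the template. The helper names `dated_results_path` and `start_run` are hypothetical, and the ISO `YYYY-MM-DD` date format is an assumption: the steps above only say `<date>`.

```python
from datetime import date
from pathlib import Path
import shutil


def dated_results_path(suite="refusal_basic", run_date=None, results_dir="results"):
    """Build results/results_<suite>_<date>.csv (ISO date format is an assumption)."""
    run_date = run_date or date.today()
    return Path(results_dir) / f"results_{suite}_{run_date.isoformat()}.csv"


def start_run(template, dest):
    """Copy the results template to dest, refusing to overwrite an earlier run."""
    dest = Path(dest)
    if dest.exists():
        raise FileExistsError(f"{dest} already exists; keep earlier runs intact")
    shutil.copyfile(template, dest)
    return dest
```

A possible usage is `start_run("results/results_refusal_basic_template.csv", dated_results_path())`, after which outputs are pasted into the new file rather than into the template itself.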

Quickstart (no code)

1. Open a prompt suite in prompts/
2. Run each prompt in LM Studio (or another model UI)
3. Paste outputs into a copy of the matching file in results/
4. Score each row using docs/scoring_rubric.md
5. Save the scored run with a date in the filename
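
Once a run is scored, a quick tally helps confirm that no rows were left unscored before the file is saved. The sketch below is optional and assumes a `score` column with made-up values; the real column names and score values come from the results templates and docs/scoring_rubric.md, and `tally_scores` is a hypothetical helper.

```python
import csv
import io
from collections import Counter


def tally_scores(csv_text, score_col="score"):
    """Count rows per score value; blank or missing scores count as 'unscored'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(
        (row.get(score_col) or "").strip() or "unscored"
        for row in reader
    )


# Made-up rows, only to show the shape of a scored run.
sample = "prompt_id,output,score\nR1,text,pass\nR2,text,\nR3,text,fail\n"
counts = tally_scores(sample)
print(counts)
```

To tally a file on disk, read it first, for example with `Path(...).read_text(encoding="utf-8")`, and pass the text to `tally_scores`.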

Repository structure

prompts/ = public-safe CSV prompt suites
results/ = results templates and sample logs
docs/ = rubrics + methodology notes
sample_results/ = example runs and run notes
src/ = optional runner code (if used)

Privacy boundary

This repository contains only generic, public-safe test suites and templates. Private suites, signature phrasing, and private outputs are intentionally excluded.

Default rule: if there is any ambiguity, treat it as private and do not add it here.

Portfolio map: https://github.com/alyssadata/PORTFOLIO_MAP.md

