A public portfolio focused on adversarial AI evaluation, multimodal model behavior, and trust and safety systems.
Live site: https://jackiejay077.github.io/
This repository contains the source files for my professional portfolio and public body of work in AI safety.
The project is structured around three types of material:
- Case Files — evidence-based analyses of model failures, risk trajectories, and evaluation outcomes
- Evaluation Frameworks — reusable rubrics, taxonomies, and methods for assessing model behavior
- Field Notes — shorter observations on safety judgment, refusal behavior, ambiguity, and multimodal risk
Areas of focus:
- adversarial evaluation
- multimodal safety testing
- model behavior analysis
- trust and safety systems
- failure-mode classification
- evaluator calibration
- qualitative root-cause analysis
Case File: When Reassurance Overrides the Evidence
A longitudinal analysis of cumulative self-harm risk, context abandonment, and premature de-escalation after user reassurance.
Field Note: A Refusal Is Not Automatically a Safe Response
An examination of why refusal behavior alone is an incomplete measure of model safety.
jackiejay077.github.io/
├── assets/
│   └── css/
│       └── style.css
├── case-files/
│   └── reassurance-overrides-evidence.html
├── field-notes/
│   └── refusal-is-not-safety.html
├── index.html
└── README.md
The work in this repository emphasizes:
- conversation-level rather than prompt-level evaluation
- cumulative interpretation of risk signals
- distinction between surface compliance and actual reasoning quality
- contextual analysis across text and image inputs
- reproducibility and transparent failure classification
- synthetic or sanitized examples that preserve analytic value without exposing confidential material
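The first two points can be illustrated with a minimal Python sketch that contrasts prompt-level and conversation-level flagging. Everything in it is hypothetical: the `Turn` type, the per-turn `risk` scores, the threshold, and the decay-based carry-forward rule are illustrative assumptions, not the rubric used in the case files.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    text: str
    risk: float  # hypothetical per-turn risk score in [0, 1]


def prompt_level_flags(turns, threshold=0.5):
    """Judge each turn in isolation, ignoring earlier context."""
    return [t.risk >= threshold for t in turns]


def conversation_level_flags(turns, threshold=0.5, decay=0.9):
    """Carry earlier risk forward so one low-risk turn does not reset it.

    The max-with-decay rule is an assumed stand-in for cumulative
    interpretation, chosen only because it is easy to follow.
    """
    flags = []
    carried = 0.0
    for t in turns:
        carried = max(t.risk, carried * decay)
        flags.append(carried >= threshold)
    return flags


# Synthetic conversation: a concerning turn followed by reassurance.
conversation = [
    Turn("synthetic: neutral opening", 0.1),
    Turn("synthetic: concerning disclosure", 0.8),
    Turn("synthetic: user reassurance", 0.1),
]

print(prompt_level_flags(conversation))        # [False, True, False]
print(conversation_level_flags(conversation))  # [False, True, True]
```

In this toy example the prompt-level view drops the flag as soon as the reassuring turn arrives, while the conversation-level view keeps it raised because the earlier signal still counts.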
The site is intentionally designed to feel like a working evaluation environment rather than a traditional portfolio template.
The visual system uses:
- dark operational interfaces
- restrained teal accents
- status-based color semantics
- case-oriented information architecture
- minimal decorative elements
- evidence-first presentation
This portfolio is under active development.
Current priorities:
- expanding the evaluation framework section
- publishing additional case files and field notes
- consolidating shared styles across all pages
- improving navigation and accessibility
- adding professional links and downloadable materials
All public examples are independently authored, synthetic, sanitized, paraphrased, or adapted from non-confidential work.
No proprietary datasets, internal policies, confidential prompts, or restricted evaluation materials are reproduced in this repository.
Jacqueline Jiang
AI Safety Analyst
Adversarial Evaluation · Multimodal Safety · Trust & Safety