jackiejay077/jackiejay077.github.io

Jacqueline Jiang — AI Safety Portfolio

A public portfolio focused on adversarial AI evaluation, multimodal model behavior, and trust and safety systems.

Live site: https://jackiejay077.github.io/

Overview

This repository contains the source files for my professional portfolio and public body of work in AI safety.

The project is structured around three types of material:

  • Case Files — evidence-based analyses of model failures, risk trajectories, and evaluation outcomes
  • Evaluation Frameworks — reusable rubrics, taxonomies, and methods for assessing model behavior
  • Field Notes — shorter observations on safety judgment, refusal behavior, ambiguity, and multimodal risk

Current Focus

  • adversarial evaluation
  • multimodal safety testing
  • model behavior analysis
  • trust and safety systems
  • failure-mode classification
  • evaluator calibration
  • qualitative root-cause analysis

Published Work

Case File 001

When Reassurance Overrides the Evidence

A longitudinal analysis of cumulative self-harm risk, context abandonment, and premature de-escalation after user reassurance.

Read the case file: case-files/reassurance-overrides-evidence.html

Field Note 001

A Refusal Is Not Automatically a Safe Response

An examination of why refusal behavior alone is an incomplete measure of model safety.

Read the field note: field-notes/refusal-is-not-safety.html

Repository Structure

jackiejay077.github.io/
├── assets/
│   └── css/
│       └── style.css
├── case-files/
│   └── reassurance-overrides-evidence.html
├── field-notes/
│   └── refusal-is-not-safety.html
├── index.html
└── README.md

Methodology

The work in this repository emphasizes:

  • conversation-level rather than prompt-level evaluation
  • cumulative interpretation of risk signals
  • distinction between surface compliance and actual reasoning quality
  • contextual analysis across text and image inputs
  • reproducibility and transparent failure classification
  • synthetic or sanitized examples that preserve analytic value without exposing confidential material
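
To make the first two points concrete, here is a minimal sketch of how conversation-level scoring can differ from prompt-level scoring. It is an illustration only, not the method used in the case files: the Turn type, the per-turn risk scores, the 0.7 threshold, and the 0.8 decay factor are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One conversation turn with a hypothetical evaluator-assigned risk score."""
    label: str
    risk: float  # assumed range: 0.0 (no signal) to 1.0 (explicit signal)


def prompt_level_flags(turns, threshold=0.7):
    """Judge each turn in isolation: flag only turns whose own score meets the threshold."""
    return [t.risk >= threshold for t in turns]


def cumulative_risk(turns, decay=0.8):
    """Carry earlier signals forward.

    Each running score is the decayed previous total plus the new turn's score,
    so a low-signal turn lowers the total gradually instead of resetting it.
    """
    running, scores = 0.0, []
    for t in turns:
        running = running * decay + t.risk
        scores.append(running)
    return scores


def conversation_level_flag(turns, threshold=0.7, decay=0.8):
    """Flag the conversation if the running total ever meets the threshold."""
    return max(cumulative_risk(turns, decay), default=0.0) >= threshold
```

With four turns scored 0.3, 0.4, 0.3, and 0.0, no single turn reaches 0.7, so prompt-level evaluation flags nothing. The running totals are 0.30, 0.64, 0.812, and roughly 0.65, so the conversation-level check crosses the threshold on the third turn and remains elevated after the low-signal final turn.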

Design Principles

The site is intentionally designed to feel like a working evaluation environment rather than a traditional portfolio template.

The visual system uses:

  • dark operational interfaces
  • restrained teal accents
  • status-based color semantics
  • case-oriented information architecture
  • minimal decorative elements
  • evidence-first presentation

Development Status

This portfolio is under active development.

Current priorities:

  • expanding the evaluation framework section
  • publishing additional case files and field notes
  • consolidating shared styles across all pages
  • improving navigation and accessibility
  • adding professional links and downloadable materials

Confidentiality

All public examples are independently authored, synthetic, sanitized, paraphrased, or adapted from non-confidential work.

No proprietary datasets, internal policies, confidential prompts, or restricted evaluation materials are reproduced in this repository.

Author

Jacqueline Jiang
AI Safety Analyst
Adversarial Evaluation · Multimodal Safety · Trust & Safety
