feat(marking): borderline detection — flag rating vs derived-band disagreement#55
Open
hyperpolymath wants to merge 1 commit into
When the tutor's ordinal rating differs by two or more rank steps from
the band derived from the per-component numeric grid, the composer flags
the component so the tutor can review it before generating feedback.
Severity heuristic
- Ratings ranked 0-5 (missing..strong); bands ranked 0-5 (fail..excellent)
- |rating_rank - band_rank| >= 2 -> flagged
- gap of 3 or more steps -> :severe; gap of exactly 2 -> :mild
- Sign of gap -> :over_rated (rating > derived) or :under_rated
  (rating < derived)
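A minimal sketch of the heuristic above, working on integer ranks only. The module name `BorderlineSketch`, the function `classify/2`, and the tuple return shape are assumptions for illustration; the PR's own borderline_check/2 takes a rating string and an aggregate and returns a flag map, and the mapping from names to ranks is not reproduced here.

```elixir
defmodule BorderlineSketch do
  # Hypothetical module: illustrates the rank-gap heuristic only.
  # Both ranks are assumed to be integers in 0..5, as in the list above.

  @doc "Returns {severity, direction} for a flagged gap, or nil when the gap is under 2."
  def classify(rating_rank, band_rank)
      when rating_rank in 0..5 and band_rank in 0..5 do
    gap = rating_rank - band_rank

    cond do
      abs(gap) >= 3 -> {:severe, direction(gap)}
      abs(gap) == 2 -> {:mild, direction(gap)}
      true -> nil
    end
  end

  # Positive gap: the rating sits above the derived band.
  defp direction(gap) when gap > 0, do: :over_rated
  defp direction(_gap), do: :under_rated
end
```

Under these assumptions, `BorderlineSketch.classify(5, 0)` returns `{:severe, :over_rated}`, and `BorderlineSketch.classify(5, 4)` returns `nil` because a 1-step gap is within tolerance.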
Composer module
- borderline_check/2 public API
  Input: rating string, per-component aggregate (or nil)
  Output: flag map or nil
- collect_borderlines/3 (private) builds the per-component report inside
  compose/1
- compose/1 result map gains :borderlines (%{component_id => flag})
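One way the per-component report could be assembled is sketched below. This is not the PR's collect_borderlines/3: its real signature is not shown here, so the hypothetical `CollectSketch.collect/2` takes the components and a check function that stands in for borderline_check/2. The input shape (`%{component_id => %{rating: ..., aggregate: ...}}`) is also an assumption.

```elixir
defmodule CollectSketch do
  # Hypothetical helper; names and input shape are assumptions.

  @doc """
  Builds %{component_id => flag}, keeping only components whose check
  returned a flag. `check` stands in for borderline_check/2 and must
  return a flag or nil.
  """
  def collect(components, check) when is_function(check, 2) do
    Enum.reduce(components, %{}, fn {id, %{rating: rating, aggregate: aggregate}}, acc ->
      case check.(rating, aggregate) do
        # No disagreement, or no numeric data: the component is left out.
        nil -> acc
        flag -> Map.put(acc, id, flag)
      end
    end)
  end
end
```

Because unflagged components are dropped rather than stored as nil, the result is an empty map when every rating agrees with its band or when no numeric data exists, which matches the compose/1 behaviour the tests describe.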
LiveView
- Amber banner above the per-component cards: "N component(s) show
rating ↔ numeric disagreement — review before generating."
- Per-component cards with a flag get amber styling + a one-line
explainer: "rating: X, numeric: Y (severe over rated)" etc.
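The one-line explainer could be produced by a small formatter like the sketch below. The module name `ExplainerSketch` and the flag keys (`:rating`, `:band`, `:severity`, `:direction`) are assumptions, since the flag map's actual shape is not spelled out here.

```elixir
defmodule ExplainerSketch do
  # Hypothetical formatter; the flag keys are assumed, not confirmed.

  @doc "Formats a flag as the per-card explainer line."
  def explainer(%{rating: rating, band: band, severity: severity, direction: direction}) do
    # :over_rated -> "over rated", :under_rated -> "under rated"
    label = direction |> Atom.to_string() |> String.replace("_", " ")
    "rating: #{rating}, numeric: #{band} (#{severity} #{label})"
  end
end
```

With a flag of `%{rating: "strong", band: "fail", severity: :severe, direction: :over_rated}`, this returns `"rating: strong, numeric: fail (severe over rated)"`.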
Tests (45 total, 11 new)
- severe over_rated (strong + fail-band)
- severe under_rated (weak + excellent-band, missing + good-band)
- mild over_rated (2-step gap)
- no flag for 1-step gap (strong + good, sound + fair)
- no flag for ranks within tolerance
- returns nil for missing aggregate / unrecognised rating
- compose/1: per-component flagged when disagreement; empty map when
ratings agree or no numeric data
Verified locally via standalone elixirc + ExUnit (45 tests, 0 failures).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced Jun 10, 2026
Summary
When the tutor's ordinal rating for a component differs by two or more rank steps from the band derived from the per-component numeric grid, the composer flags the component so the tutor can sanity-check it before generating feedback.
Severity heuristic
A 1-step gap (e.g. "strong" rating + good-band numeric) is not flagged — that's normal rating-vs-band variance.
What ships in this PR
What this does NOT do
Tests
Test plan
🤖 Generated with Claude Code