This is the reference for the solver's off-path detection and cleaning — the read-layer that hides CFR's untrained "phantom" strategies so charts, the API, the CLI, and chained solves never surface a meaningless action at a node a hand never actually reaches.
- Preflop API: `poker_solver/preflop_offpath.py`
- Postflop API: `poker_solver/postflop_offpath.py`
- Read-layer accessors: `RangeVsRangeNashResult.per_history_strategy_view` (`poker_solver/range_aggregator.py`), `strategy_table` (`poker_solver/preflop_offpath.py`)
- CLI: the `chained` subcommand (`poker_solver/cli.py`), opt-out via `--raw-offpath`
- GUI: the preflop chart (`ui/views/preflop_chart.py`) greys off-path cells
CFR (and DCFR) assigns a strategy to every decision node in the game tree, including nodes a hand reaches with ≈ 0 probability. Regret at an unreached node is never trained, so the stored "strategy" there is leftover noise — not a recommendation. Dumping it verbatim produces confusing output:
- Preflop. On a 4-bet line `||p|b200r400r1000`, the engine still stores a strategy row for `82s` and `A3o`, hands that folded (or flat-called and closed the action) long before the 4-bet. The chart would otherwise show "`82s` raises 100% facing a 4-bet," which is meaningless: `82s` is never in range there.
- Postflop. A combo that gives up on the flop (checks-to-close, folds a bet, or only rarely bets) still has a strategy row stored at same-street phantom nodes the engine materializes (the full tree is built). Reading it back shows "this hand raises after checking 100%" for a hand that is gone on this street, which is pure noise.
Off-path handling detects these nodes and, on the read layer, overwrites the
same-street ones to a pure fold so consumers see a clean, on-path-only
strategy. Crucially, it does not fold a hand that legitimately continued to
a new street (a new community card = a real new decision); those are shown
verbatim (annotated rarely_reached if low-frequency). See Postflop
specifics below.
A hand/combo is off-path at a node when either rule fires:
- Reach rule. Its normalized reach across the node's hands/combos is below the threshold (0.5% of the total reach mass at that node).
- Dominance rule. It is dominantly blocked (≥ 99%) at one of the displayed/hero player's own ancestor decision nodes on the line. Only the acting player's own decisions gate their reach; the opponent's actions define the line but do not enter the product.
Both signals come from a single walk per hand/combo. The walk is FAIL-SAFE: if it can't be fully computed for any hand on a line (a prior node missing, e.g. a partial live snapshot, or a token that can't be resolved), nothing is marked off-path for that line — the read layer never invents a fold from incomplete data.
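The two rules and the fail-safe can be sketched in a few lines. This is an illustration under stated assumptions, not the module's code: the function name `off_path_hands`, the two input dict shapes, and the constant names are hypothetical. Only the 0.5% and 99% thresholds come from the rules above.

```python
from typing import Dict, Optional, Set

REACH_THRESHOLD = 0.005      # 0.5% of the node's total reach mass
DOMINANCE_THRESHOLD = 0.99   # >= 99% blocked at one of the hero's own ancestors


def off_path_hands(
    reach: Dict[str, Optional[float]],
    max_block: Dict[str, float],
) -> Set[str]:
    """Return the hands that are off-path at one node.

    reach[hand]     -- unnormalized reach from the walk, or None when the walk
                       could not be completed for that hand (hypothetical shape).
    max_block[hand] -- largest blocking probability seen at any of the hero's
                       own ancestor nodes on the line (hypothetical shape).
    """
    # Fail-safe: a single incomplete walk means nothing is marked on this line.
    if any(r is None for r in reach.values()):
        return set()
    total = sum(reach.values())
    if total <= 0.0:
        # Assumption of this sketch: with no reach mass at all, mark nothing.
        return set()
    marked = set()
    for hand, r in reach.items():
        low_reach = r / total < REACH_THRESHOLD                       # reach rule
        dominated = max_block.get(hand, 0.0) >= DOMINANCE_THRESHOLD   # dominance rule
        if low_reach or dominated:
            marked.add(hand)
    return marked
```

Note that the reach rule compares against the node's total mass, so a hand is judged relative to the other hands at the same node rather than on its absolute reach.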
Reach is computed by walking the displayed player's own ancestor decisions and multiplying in their continuing-action probabilities. At a bet/raise ancestor the walk credits the hand's total aggression mass (summed over every bet/raise size label at that node), not just the single size the line happened to take.
This is a deliberate fix. Blueprint raise nodes offer multiple sizes (e.g. the
BB's 3-bet menu at ||p|b300 is r600 / r700 / r900). A premium like AA
3-bets ~98% but puts almost all of that on one size (say r900, with
P(r700) ≈ 0.006). Reading only the single matched size returned ~0 reach and
falsely greyed AA on a sibling-size line. Crediting total aggression mass
fixes this: a hand that raised by any size took the aggressive action and is
in this branch. Call/limp (c) and all-in (A) stay exact — they are single,
unambiguous actions.
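A minimal sketch of the size-agnostic credit described above. The helper names and the label grammar (`r<digits>` / `b<digits>` for sized aggression, `c` for call/limp, `A` for all-in) are assumptions modeled on the line tokens shown in this document, and the `AA` numbers are illustrative.

```python
from typing import Dict


def is_sized_aggression(label: str) -> bool:
    # Assumed grammar: "b" or "r" followed by an integer size, e.g. "r900".
    return len(label) > 1 and label[0] in "br" and label[1:].isdigit()


def continue_probability(node_strategy: Dict[str, float], taken: str) -> float:
    """Probability credited to a hand for continuing along `taken`."""
    if is_sized_aggression(taken):
        # Size-agnostic: sum every bet/raise size offered at this node.
        return sum(p for label, p in node_strategy.items()
                   if is_sized_aggression(label))
    # Call/limp ("c") and all-in ("A") are single actions, so read them exactly.
    return node_strategy.get(taken, 0.0)


# Illustrative AA row at the 3-bet node: almost all raise mass on one size.
aa = {"fold": 0.0, "c": 0.02, "r600": 0.004, "r700": 0.006, "r900": 0.97}
single_size = aa["r700"]                       # what the old read returned
credited = continue_probability(aa, "r700")    # total aggression mass
```

With these numbers the single-size read is 0.006 while the credited mass is about 0.98, which is the difference between greying `AA` on the `r700` line and keeping it in range.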
The postflop solver emits a per-(history, concrete-combo) strategy, so
detection is per combo. The infoset key is the engine's lossless
HUNLState.infoset_key: four single-| fields
"<hole>|<board>|<street>|<history>" (e.g. "KcAh|2d5s7cKhAs|r|"), where
<street> is a single token (f flop / t turn / r river — the
authoritative current street) and <history> is
"/".join("".join(tokens) for tokens in all_streets), so history.count("/")
equals the number of completed earlier streets (the node's street index within
the subgame). Boards are sorted by card-int, so a later street's board is
not a string-prefix of an earlier one (an extra run-out card interleaves).
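The key layout above can be unpacked with plain string operations. The sketch below is a hypothetical helper, not part of `HUNLState`; the second key and the two short boards are made up purely to illustrate the street index and the subset-versus-prefix point.

```python
from typing import NamedTuple


class ParsedKey(NamedTuple):
    hole: str
    board: str
    street: str        # "f", "t" or "r"
    history: str
    street_index: int  # number of completed earlier streets in the subgame


def parse_infoset_key(key: str) -> ParsedKey:
    # Four single-"|" fields: <hole>|<board>|<street>|<history>.
    hole, board, street, history = key.split("|", 3)
    return ParsedKey(hole, board, street, history, history.count("/"))


def cards(text: str) -> frozenset:
    # Split a concatenated card string ("2d5s7c") into two-character cards.
    return frozenset(text[i:i + 2] for i in range(0, len(text), 2))


river = parse_infoset_key("KcAh|2d5s7cKhAs|r|")   # example key from the text
turn = parse_infoset_key("KcAh|2d5s7cKh|t|xx/x")  # made-up history tokens

# Made-up boards: the earlier board is a card subset of the later one even
# though it is not a string prefix, hence the cardset-subset lookup.
earlier, later = "2d7cKh", "2d5s7cKh"
is_prefix = later.startswith(earlier)
is_subset = cards(earlier) <= cards(later)
```

The same `cards` helper also covers the board-blocking check: a combo is blocked when `cards(hole) & cards(board)` is non-empty.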
- Reach = `hero_range_weight[combo] × Π P_continue` over the hero's own ancestor decisions. Same size-agnostic principle as preflop (a hero bet/raise ancestor credits total aggression mass over all sizes). The product is split into `within_street_reach` (gating decisions in the node's current street) and `reach_to_street` (gating decisions on earlier streets; their shorter, sorted boards are resolved by cardset-subset lookup).
- Board-aware. A combo whose hole card is on the board has reach 0 and is off-path (a blocked combo can't be in range). The blocking card is on the node's current street, so this is a same-street impossibility → fold.
- STREET-AWARE (the central rule). Whether a low-reach node is a fold or a genuine new decision depends on whether a new community card has appeared since the reach collapsed:
  - Same-street off-path → FOLD. No new card means the hand's situation is unchanged, so a node it shouldn't be at is a give-up. This covers fold-dominant / `all_in` / `checked_closed` / `called_closed` at a same-street ancestor, and a within-street reach collapse (`low_reach_same_street`).
  - Cross-street → SHOW the real strategy (never force-fold). If the hand's reach collapsed on an earlier street but it legitimately (if rarely) continued to this street, the new card creates a real decision: e.g. a gutshot that lacked flop odds but hits the turn must not be folded. The engine's actual strategy is shown; if its normalized reach is tiny it is annotated `rarely_reached` (low-frequency / low-confidence) but the real action is kept.

  This corrects a prior bug. Earlier, `folded` and `all_in` carried across streets as a forced fold, wrongly folding a hand that legitimately reached a new street. All fold-producing rules are now restricted to the same street; across a street boundary the only signal is low normalized reach, attributed to the earlier-street factor (`reach_to_street`) and surfaced as the `rarely_reached` annotation, not a fold. Preflop is unaffected (a single betting street, so there is no cross-street boundary).
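The street-aware split of the low-reach signal can be sketched as follows. The function names are hypothetical, and the attribution rule used here (blame whichever of the two factors is smaller, with ties going to the same-street side) is an assumption of this sketch; the text above only says which factor is "the small one" in each case.

```python
from typing import Optional

REACH_THRESHOLD = 0.005  # 0.5% normalized reach


def low_reach_reason(
    normalized_reach: float,
    within_street_reach: float,
    reach_to_street: float,
) -> Optional[str]:
    """Classify a low-reach combo as a same-street fold or a cross-street note."""
    if normalized_reach >= REACH_THRESHOLD:
        return None  # on-path as far as the reach rule is concerned
    # Assumed attribution: the smaller factor is the one that collapsed.
    if within_street_reach <= reach_to_street:
        return "low_reach_same_street"  # no new card since the collapse: fold
    return "rarely_reached"             # collapsed earlier, new card: keep strategy


def is_cleaned_to_fold(reason: Optional[str]) -> bool:
    # Only same-street codes fold; the cross-street annotation never does.
    return reason is not None and reason != "rarely_reached"
```

For example, a gutshot that called the flop 0.4% of the time but plays the turn normally would land on `rarely_reached` under this sketch and keep its real turn strategy.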
Both modules expose a reason-aware variant (mark_off_path_with_reason) that
returns None for on-path hands or one of the codes below. When several could
apply at the same ancestor, per-node priority is
folded > all_in > checked_closed > called_closed, and any dominant block
always outranks the generic low-reach rule.
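The priority order can be expressed as a small lookup. This is a hedged sketch: `pick_reason` and its inputs (the set of dominant-block codes that fired at an ancestor, plus an optional low-reach code) are hypothetical shapes chosen for illustration, not the signature of `mark_off_path_with_reason`.

```python
from typing import Iterable, Optional

# Per-node priority among dominant blocks, highest first (from the text above).
PRIORITY = ("folded", "all_in", "checked_closed", "called_closed")


def pick_reason(fired: Iterable[str], low_reach: Optional[str]) -> Optional[str]:
    """Return the reported reason code, or None for an on-path hand."""
    fired = set(fired)
    for code in PRIORITY:
        if code in fired:
            return code      # any dominant block outranks the low-reach rule
    return low_reach         # may be None, i.e. the hand is on-path
```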
Two classes of postflop reason. Every reason except rarely_reached is a
same-street off-path code → the row is cleaned to a fold.
rarely_reached is the lone cross-street annotation → the real strategy is
kept (never folded). mark_off_path returns True (= "clean to fold")
only for the same-street fold codes; rarely_reached returns False.
| Reason | Class | Meaning | Cleaned to fold? |
|---|---|---|---|
| `folded` | same-street | Fold ≥ 99% at a current-street ancestor. | Yes |
| `all_in` | same-street | All-in ≥ 99% at a current-street ancestor: no voluntary action after committing the stack (this street). | Yes |
| `checked_closed` (postflop only) | same-street | A check that closed the current street's betting at an ancestor where the line then has the hero act again on the same street ("supposed to check 100% here, but the line has us acting again, so treat as a fold"). | Yes |
| `called_closed` | same-street | A flat-call that closed the action at an ancestor where the line then continues with further same-street aggression (preflop: the `A3o`/`K3s` flat-call-then-4-bet case). | Yes |
| `low_reach_same_street` | same-street | Normalized reach below 0.5% attributable to within-street play (`within_street_reach` is the small factor); also the reason for board-blocked postflop combos. | Yes |
| `rarely_reached` (postflop only) | cross-street | Normalized reach below 0.5% but the collapse is attributable to earlier streets (`reach_to_street` is the small factor, within-street play is normal): the hand barely got to this street, but a new card makes this a genuine decision. Annotation only. | No |
Compatibility. `low_reach` remains importable as an alias of `low_reach_same_street` (it keeps fold-producing semantics). The `folded` / `all_in` codes no longer carry across streets: a hand that folded or committed its stack on an earlier street simply has no later key on that line; if it nonetheless has a row on a new street, the new card makes it a real `rarely_reached` decision, not a carried fold.

Why `all_in` matters. Before the all-in rule, only fold was checked, so an all-in-dominant hand was wrongly left in-range on a same-street size-raise continuation. All-in mass is read from the exact all-in label and is never summed into the bet/raise aggregation.
Off-path cleaning is on by default everywhere a human or downstream tool reads a strategy back out:
- GUI preflop chart (`ui/views/preflop_chart.py`): off-path cells render greyed with an em-dash (—), faded, and a reason-aware tooltip:

  | Reason | Tooltip |
  |---|---|
  | `all_in` | `<hand> — not in range on this line (already all-in earlier)` |
  | `folded` | `<hand> — not in range on this line (folded earlier on this line)` |
  | `called_closed` | `<hand> — not in range on this line (called & closed action earlier)` |
  | `low_reach` / other | `<hand> — not in range on this line (doesn't reach this line; reach ≈ 0%)` |

- Preflop API: `strategy_table(average_strategy)` returns the cleaned per-line table (`clean=True` by default).
- Postflop API: `RangeVsRangeNashResult.per_history_strategy_view()` returns the cleaned per-history view (`clean=True` by default).
- CLI: the `chained` subcommand's JSON output is off-path-cleaned by default.
- Chained solves: each per-street postflop subgame is a `RangeVsRangeNashResult`, so it inherits the default-on `per_history_strategy_view` accessor.
Cleaned means each same-street off-path row is overwritten to a pure
fold: the preflop fold label is forced to 1.0 (everything else 0.0); the
postflop row's index 0 (the passive/give-up action) is forced to 1.0 (rest
0.0). A postflop cross-street rarely_reached row is left intact — the
engine's real (if low-frequency) strategy is shown, not folded.
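The overwrite itself is simple. The sketch below shows the shape of it under stated assumptions: `clean_postflop_view` and `clean_preflop_row` are hypothetical names, the set of keys to fold is assumed to come from the detection step, and `"fold"` is assumed to be the preflop fold label.

```python
import copy
from typing import Dict, List, Set


def clean_postflop_view(
    per_history_strategy: Dict[str, List[float]],
    fold_keys: Set[str],
) -> Dict[str, List[float]]:
    """Return a cleaned copy; rows in fold_keys become a pure index-0 action."""
    view = copy.deepcopy(per_history_strategy)  # never touch the raw mapping
    for key in fold_keys:
        row = view.get(key)
        if row:
            view[key] = [1.0] + [0.0] * (len(row) - 1)
    return view


def clean_preflop_row(row: Dict[str, float], fold_label: str = "fold") -> Dict[str, float]:
    """Return a new row with the fold label at 1.0 and every other label at 0.0."""
    return {label: (1.0 if label == fold_label else 0.0) for label in row}


raw = {"k1": [0.1, 0.9], "k2": [0.3, 0.3, 0.4]}
view = clean_postflop_view(raw, fold_keys={"k2"})
```

Keys flagged only as `rarely_reached` would simply not be passed in `fold_keys`, so their rows come back unchanged, and because the function works on a copy the input mapping is left as it was.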
To set expectations precisely:
- The data layer (postflop API / CLI / chained solves) cleans off-path by default.
- The postflop tree browser is already reach-filtered (node-aggregate), so it does not surface unreachable nodes.
- The postflop range-matrix GUI does not yet grey off-path cells — the per-combo greying the preflop chart has is not wired into the postflop matrix. Read the cleaned data via the API/CLI for now.
The raw engine output is available whenever you need it (diffing, debugging, exploitability checks):
| Layer | Raw opt-out |
|---|---|
| Preflop API | `strategy_table(average_strategy, clean=False)` |
| Postflop API | `result.per_history_strategy_view(clean=False)` |
| CLI (`chained`) | `--raw-offpath` |
This is the central safety property. The cleaning functions
(clean_off_path, strategy_table, per_history_strategy_view) all operate on
a freshly built / deep-copied structure and never touch the raw engine
output:
- `RangeVsRangeNashResult.per_history_strategy` (the attribute) is never mutated, regardless of `clean`.
- The raw preflop `average_strategy` mapping passed into `strategy_table` is never mutated.
The raw strategy stays the source of truth consumed by exploitability computation, blueprint generation, and differential tests — those read the untrained-but-complete strategy directly. Off-path cleaning is a read-layer presentation concern, not an engine change.
Off-path detection keys on near-deterministic dominance (≥ 99% fold/all-in/ passive-close) and a small reach threshold. On low-iteration live solves, near-indifferent hands at the deepest nodes have not fully separated, so the detector can over-grey a hand that is genuinely close to a boundary. This is under-convergence, not a bug — run more iterations and it cleans up. Production blueprints (solved at 25,000 DCFR iterations) are clean. If you see unexpected greying on a quick live solve, re-check at higher iterations before suspecting the off-path logic.
```python
from poker_solver.preflop_offpath import strategy_table

# rust_out is the result of a preflop range-vs-range solve.
average_strategy = rust_out["average_strategy"]

# Cleaned by default: off-path entries (folded / all-in / called-closed /
# low-reach) are overwritten to a pure fold.
table = strategy_table(average_strategy)  # clean=True
# table[line][hand_class][action_label] -> probability
# e.g. table["||p|b200r400r1000"]["82s"] == {"fold": 1.0, ...}

# Raw projection: keep the untrained off-path noise intact.
raw = strategy_table(average_strategy, clean=False)
```

```python
from poker_solver.range_aggregator import solve_range_vs_range_nash

# cfg, hero_range and villain_range are assumed to be defined by the caller.
result = solve_range_vs_range_nash(cfg, hero_range, villain_range,
                                   iterations=500, hero_player=1)

# Cleaned by default. Hero is OOP when the solve reports the defender seat;
# pass hero_is_oop explicitly to match your seat convention.
view = result.per_history_strategy_view(hero_is_oop=(result.position == "defender"))
# view[infoset_key] -> positional probability list. SAME-STREET off-path rows
# are folded (index 0 == 1.0): board-blocked combos and folded/all-in/closed
# lines on the CURRENT street. CROSS-STREET rows (the hand legitimately reached
# a new community card) are KEPT as the engine's real strategy, annotated
# `rarely_reached` when their normalized reach is tiny, NOT folded.

# Raw rows (a per-row copy of the unmutated attribute):
raw = result.per_history_strategy_view(clean=False)

# The raw attribute itself is always available and never mutated:
result.per_history_strategy  # source of truth for exploitability / diff-tests
```

```bash
# Default: chained JSON output is off-path-cleaned (SAME-STREET folded /
# all-in / closed lines and board-blocked combos are overwritten to fold;
# cross-street rarely_reached strategies are kept intact).
poker-solver chained --hero-range "AA,KK,AKs" --villain-range "QQ,JJ,AKs" \
    --board "Ad 8h 9d" --lazy-postflop > chained_clean.json

# Opt out: emit the RAW postflop per_history_strategy (every combo at every
# node, off-path rows included).
poker-solver chained --hero-range "AA,KK,AKs" --villain-range "QQ,JJ,AKs" \
    --board "Ad 8h 9d" --lazy-postflop --raw-offpath > chained_raw.json
```

- `USAGE.md` §5.7 documents an older, complementary reach-annotation feature (`SolveResult.off_path_keys` / `reach_probability`) on the scalar postflop solve path: a set of unreachable infoset keys, not the per-line/per-combo fold-cleaning described here.
- `docs/AGENT_COORDINATION.md` §3e tracks the (now low-priority, optional) engine-side "bake off-path→fold into blueprint generation" ask, superseded for read-time purposes by the read-layer cleaning described here.