Fix garbled ANSI output when using bat as MANPAGER (--strip-ansi=auto now preserves man formatting) #3792

Open
Anntoin wants to merge 9 commits into sharkdp:master from Anntoin:feat/ansi-overlay-reapply

Anntoin commented Jun 3, 2026

Problem

When using bat as MANPAGER, ANSI SGR reset codes like \x1B[22m (reset bold) appear as literal text "22m" in the output. This happens because the Manpage sublime-syntax definition does not consume ANSI escapes broadly enough, so syntect splits escape sequences across highlight regions.

Even with a syntax fix, ^-anchored heading patterns like ^(NAME|SYNOPSIS|...) cannot match after an ANSI escape is consumed at position 0, meaning syntax highlighting of headings is fundamentally incompatible with ANSI passthrough.
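
The anchoring problem can be illustrated with a rough analogy. The sketch below is not syntect's matcher: a plain prefix check stands in for a `^`-anchored heading pattern, and the helper name `is_heading` is hypothetical. It only shows that once a line begins with an escape sequence, the heading text no longer sits at the start of the line.

```rust
/// Hypothetical stand-in for a `^(NAME|SYNOPSIS|...)` pattern:
/// true only when the line begins with one of the listed headings.
fn is_heading(line: &str) -> bool {
    ["NAME", "SYNOPSIS"].iter().any(|h| line.starts_with(*h))
}

/// Example inputs: a clean heading, and the same heading wrapped in
/// SGR bold-on (`ESC[1m`) and bold-off (`ESC[22m`) as man might emit it.
const CLEAN: &str = "NAME";
const WITH_ANSI: &str = "\x1b[1mNAME\x1b[22m";
```

With these inputs, `is_heading(CLEAN)` is true while `is_heading(WITH_ANSI)` is false, which is why stripping the escapes before tokenization is attractive.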

Fixes #3724.

Solution: ANSI Overlay

This PR introduces a "strip-and-reapply" overlay mechanism for --strip-ansi=auto:

  1. Strip ANSI escape sequences from the input before syntect tokenization
  2. Build a byte-offset → AnsiStyle overlay that captures all formatting state (bold, underline, colors, OSC8 hyperlinks)
  3. Re-apply the overlay style during rendering, so both syntect highlighting AND man formatting are preserved
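
The first two steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's `strip_ansi_with_overlay()`: the `Style` struct is a hypothetical, reduced stand-in for `AnsiStyle` that tracks only bold and underline, the function name `strip_with_overlay` is made up for this example, and OSC8 hyperlinks and colors are deliberately not tracked.

```rust
/// Hypothetical, reduced stand-in for `AnsiStyle` (bold and underline only).
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Style {
    bold: bool,
    underline: bool,
}

/// Strips CSI sequences from `line`. Returns the stripped text plus an overlay
/// of `(byte offset in stripped text, style active from that offset on)`.
fn strip_with_overlay(line: &str) -> (String, Vec<(usize, Style)>) {
    let mut text = String::with_capacity(line.len());
    let mut overlay: Vec<(usize, Style)> = Vec::new();
    let mut style = Style::default();
    let mut chars = line.chars().peekable();

    while let Some(c) = chars.next() {
        if c == '\x1b' && chars.peek() == Some(&'[') {
            chars.next(); // consume '['
            // Collect parameters up to the CSI final byte (0x40..=0x7E).
            let mut params = String::new();
            let mut final_byte = None;
            for p in chars.by_ref() {
                if ('\x40'..='\x7e').contains(&p) {
                    final_byte = Some(p);
                    break;
                }
                params.push(p);
            }
            if final_byte == Some('m') {
                let mut codes = params.split(';');
                while let Some(code) = codes.next() {
                    match code {
                        "" | "0" => style = Style::default(), // empty means reset
                        "1" => style.bold = true,
                        "4" => style.underline = true,
                        "22" => style.bold = false,
                        "24" => style.underline = false,
                        // Skip extended color arguments so they are not
                        // misread as attributes (e.g. the `1` in `38;5;1`).
                        "38" | "48" => match codes.next() {
                            Some("5") => {
                                let _ = codes.next();
                            }
                            Some("2") => {
                                let _ = codes.nth(2);
                            }
                            _ => {}
                        },
                        _ => {} // other codes are ignored in this sketch
                    }
                }
            }
            continue;
        }
        // Record an entry only when the style differs from the previous one.
        if overlay.last().map(|&(_, prev)| prev) != Some(style) {
            overlay.push((text.len(), style));
        }
        text.push(c);
    }
    (text, overlay)
}
```

For the input `"\x1b[1mNAME\x1b[22m uname"` this yields the text `NAME uname` and two overlay entries: bold from offset 0, default style from offset 4. Storing only the change points keeps the overlay small for lines with few style switches.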

Example

MANPAGER="bat -pl man --strip-ansi=auto" man uname

This produces output with:

  • ✅ Correct heading colors from syntect (color 182 for NAME, SYNOPSIS, etc.)
  • ✅ Bold/underline from man formatting preserved
  • ✅ No garbled \x1B[22m visible text
  • ✅ OSC8 hyperlinks preserved

Mode behavior

| --strip-ansi | ANSI stripped? | Overlay re-applied? | Result |
|---|---|---|---|
| never (default) | No | N/A | ANSI passes through as-is (unchanged behavior) |
| auto | Yes | Yes | Clean text for syntect + man formatting preserved |
| always | Yes | No | All ANSI removed, syntect-only colors |
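
The mode behavior can be restated as a small decision function. This is a sketch, not the PR's code: the `Never` variant name, the `plan` function, and its tuple return are assumptions for illustration, and the `colored_output` condition is taken from the commit message describing `reapply_ansi_overlay`.

```rust
/// Mirrors the three `--strip-ansi` values; the `Never` variant name is assumed.
#[derive(Clone, Copy, Debug, PartialEq)]
enum StripAnsiMode {
    Never,
    Auto,
    Always,
}

/// Hypothetical helper returning
/// `(strip ANSI before highlighting, re-apply overlay when rendering)`.
fn plan(mode: StripAnsiMode, colored_output: bool) -> (bool, bool) {
    match mode {
        // Input passes through untouched.
        StripAnsiMode::Never => (false, false),
        // Strip for syntect; restore formatting only if output is colored.
        StripAnsiMode::Auto => (true, colored_output),
        // Strip and discard all ANSI formatting.
        StripAnsiMode::Always => (true, false),
    }
}
```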

Commits

  1. fix: preserve ANSI escapes while stripping overstrike (cherry-picked from #3762, "fix: preserve ANSI escapes while stripping manpage overstrike"; independently needed)
  2. feat: add Clone/Default derives to AnsiStyle and Attributes — Prerequisite for overlay snapshots
  3. feat: add strip_ansi_with_overlay() to preprocessor — Core overlay function with 20 unit tests
  4. feat: re-apply ANSI overlay when stripping for syntax highlighting — Wires overlay into the printer with three-way region_style logic
  5. docs: update --strip-ansi help text for auto mode and MANPAGER usage — Documents the new behavior
  6. feat: update Manpage.sublime-syntax to consume ANSI/OSC escapes in prototype — Prevents remaining garbled text from partial escape sequences
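
The three-way region_style logic named in commit 4 could look roughly like the sketch below. All names here (`Style`, `style_at`, `region_style`) are hypothetical, and `Style` is a reduced stand-in for `AnsiStyle`; the overlay is assumed to be sorted by byte offset, as a left-to-right scan of a line would produce.

```rust
/// Hypothetical, reduced stand-in for `AnsiStyle`.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Style {
    bold: bool,
    underline: bool,
}

/// Looks up the style active at `offset` in an overlay sorted by offset.
fn style_at(overlay: &[(usize, Style)], offset: usize) -> Style {
    // Index of the first entry that starts after `offset`.
    let idx = overlay.partition_point(|(start, _)| *start <= offset);
    if idx == 0 {
        Style::default()
    } else {
        overlay[idx - 1].1
    }
}

/// Picks the ANSI style for a highlighted region starting at `offset`.
fn region_style(
    strip: bool,
    reapply: bool,
    overlay: &[(usize, Style)],
    offset: usize,
    live: Style,
) -> Style {
    if reapply {
        style_at(overlay, offset) // auto: layer original formatting back on
    } else if strip {
        Style::default() // always: discard ANSI formatting
    } else {
        live // never: follow escapes seen in the passthrough text
    }
}
```

A binary search keeps each lookup logarithmic in the number of style changes on the line; a linear scan would likely be just as good for typical man page lines.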

Performance

Negligible overhead. Benchmarked with 5 man pages (70K–2501K):

| Page | Size | never | auto | auto/never |
|---|---|---|---|---|
| home-configuration.nix | 2501K | 0.817s | 0.790s | 0.97x |
| bash | 473K | 0.109s | 0.109s | 1.00x |
| ffmpeg | 177K | 0.043s | 0.052s | 1.20x |
| sshd_config | 84K | 0.024s | 0.027s | 1.14x |
| git | 70K | 0.024s | 0.027s | 1.14x |

The two largest pages (2501K and 473K) show no measurable overhead. For the three smaller pages (70K–177K), overhead is 14–20%, which is under 10 ms in absolute terms.

Test coverage

  • 20 new unit tests for strip_ansi_with_overlay() covering: bold, underline, dim, italic, foreground/background color, 256-color, truecolor, SGR reset, OSC8 hyperlinks (open, close, combined with bold), adjacent style changes, empty input, escapes-only input
  • All 233 integration tests pass
  • All 145 unit tests pass
  • Pre-existing no_duplicate_extensions failure (unrelated, from syntax set)

Note on PR #3762

This PR includes commit 9e0c2a9d from #3762 (the pop_visible_char fix for overstrike stripping). That PR is independently valuable and should be merged regardless of this PR. If #3762 is merged first, this PR will need a rebase to drop the cherry-picked commit.

Anntoin added a commit to Anntoin/bat that referenced this pull request Jun 3, 2026
…kdp#3792)

Signed-off-by: Anntóin Wilkinson <anntoin@gmail.com>
Anntoin force-pushed the feat/ansi-overlay-reapply branch from 835835c to 567f54b on June 3, 2026 13:58
sjh9714 and others added 9 commits June 3, 2026 16:09
AnsiStyle and Attributes need Clone and Default implementations so that
the overlay approach can clone style snapshots and create default (empty)
instances. These derives are a prerequisite for strip_ansi_with_overlay()
which builds a Vec<(usize, AnsiStyle)> mapping.

Signed-off-by: Anntóin Wilkinson <anntoin@gmail.com>
Add strip_ansi_with_overlay() which strips ANSI escape sequences from a
line and returns both the stripped text and a style overlay mapping byte
positions to the AnsiStyle active at each position in the original text.

This allows syntax highlighting (syntect) to operate on clean text, while
preserving original ANSI formatting (bold, underline, color, hyperlinks)
for re-application during rendering.

Includes 20 unit tests covering bold, underline, dim, italic, foreground
and background colors (8-bit, 256-color, truecolor), SGR reset, OSC8
hyperlinks, combined styles, adjacent style changes, and edge cases.

Update StripAnsiMode doc comment to document the new Auto mode semantics.
Mark strip_ansi() with #[allow(dead_code)] as it is no longer called
outside of tests.

Signed-off-by: Anntóin Wilkinson <anntoin@gmail.com>
Wire strip_ansi_with_overlay() into the InteractivePrinter so that when
strip_ansi is active, the ANSI style overlay is computed for each line and
used to determine the correct style for each syntax-highlighted region.

The three-way region_style logic:
- reapply_ansi_overlay (StripAnsiMode::Auto + colored output): look up the
  overlay style at the region's byte offset, so original ANSI formatting
  (bold, underline, hyperlinks) is layered on top of syntect colors.
- strip_ansi without overlay (StripAnsiMode::Always): use a default/empty
  style, discarding all ANSI formatting.
- no strip_ansi: use the live mutable ansi_style so escape sequences within
  a region are reflected in subsequent text chunks.

Add ansi_style_overlay and reapply_ansi_overlay fields to InteractivePrinter.

Signed-off-by: Anntóin Wilkinson <anntoin@gmail.com>
Update the long-help text for --strip-ansi to explain that 'auto' mode
strips ANSI before syntax highlighting but re-applies semantic formatting
(bold, underline, hyperlinks) on the output so both highlighting and input
formatting are preserved. Note that this is the recommended mode for use
as MANPAGER.

Signed-off-by: Anntóin Wilkinson <anntoin@gmail.com>
…ototype

Add ANSI and OSC escape sequence patterns to the prototype context in
Manpage.sublime-syntax so they are consumed before syntax highlighting
rules match against them. This prevents escape sequences from breaking
across highlight regions when using --strip-ansi=auto with man pages.

Signed-off-by: Anntóin Wilkinson <anntoin@gmail.com>
…kdp#3792)

Signed-off-by: Anntóin Wilkinson <anntoin@gmail.com>
- Add #[cfg(test)] to test-only overlay_style_at helper in preprocessor.rs
- Derive PartialEq on AnsiStyle/Attributes to replace format!()-based
  comparison with direct equality check (simpler, zero-allocation)
- Replace map_or(true, |(_, prev)| format!(...) != format!(...))
  with map_or(true, |(_, prev)| *prev != current_style)
- Extract region_ansi_style() method from duplicated get_region_style
  closures in both colored and uncolored printer branches
- Remove 36 lines of duplicated closure+comment boilerplate

Signed-off-by: Anntóin Wilkinson <anntoin@gmail.com>
Anntoin force-pushed the feat/ansi-overlay-reapply branch from 262c2a2 to 8cb6c21 on June 3, 2026 15:14
Development

Successfully merging this pull request may close these issues.

Man pages displaying bold reset code 22m
