test(walker): add byte-equal parity corpus for storageToMarkdown (#138)#159
Merged
Conversation
Add tests/fixtures/storage-samples/ with 11 storage XML samples paired with checked-in .expected.md files. tests/storage-walker-parity.test.js iterates the corpus and asserts byte-equal output, so future structural refactors of the walker face every dispatch path on PR open instead of needing a human reviewer to spot a regression. Background: the regex → walker migration in #137 took three review rounds because each round surfaced one more entity-decoding path the walker had silently dropped. Subsequent fixes (#143, #152, #154, #156, #157) repeated the pattern. The corpus pins the current behaviour for each handler so the next refactor can iterate against CI. Corpus coverage: - prose, headings, inline formatting, entities (named, ASCII map, numeric) - callouts (info / warning / note + empty body) - expand (with title, without title), anchor - code (lang / no-lang / fence escape), mermaid - panel variants (title+body, body-only, title-only) - include (shared / personal space), shared-block, include-shared-block - tables with formatting and br collapse - lists (ul / ol / nested) and task lists - images (ri:attachment / ri:url / no source) and view-file - ac:link variants (anchor, ri:url, link-body, ri:page) - layouts, CDATA + entities, toc/floatmenu drop, blockquote, unknown tag To regenerate after an intentional output change: WRITE_PARITY_EXPECTED=1 npx jest tests/storage-walker-parity.test.js
Address four issues from self-review of #159: - cleanupWithFences ends with .trim(), so walker output has no trailing newline. Editors / pre-commit hooks that auto-add a final newline on save would silently break byte-equal compare. Now serialize and compare with a final '\n', keeping the .expected.md POSIX-compliant. - Test name was "round-trips to its pinned markdown" — confusing, storageToMarkdown is one-way. Rename to "matches its pinned markdown". - WRITE_PARITY_EXPECTED=1 silently rewrote the corpus and passed without asserting. A stray env var in a shell rc could overwrite the corpus during CI. Now log a console.warn so the regenerate path is visible. - Three handler paths were not exercised: handleMacro default branch (unknown macro name), handleAcLink fallback drop (empty <ac:link/>), handleImage drop on empty ri:url value. Extend fixtures 09 / 10 / 11.
|
🎉 This PR is included in version 2.1.11 🎉 The release is available on: Your semantic-release bot 📦🚀 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds a parity corpus (
tests/fixtures/storage-samples/*.xml+.expected.md) andtests/storage-walker-parity.test.jsthat pins the currentstorageToMarkdownoutput for 11 representative storage samples. Future structural refactors of the walker have to face every dispatch path in CI on PR open instead of needing a human reviewer to spot the regression.Closes #138.
Why
The regex → walker migration in #137 went through three review rounds and each round surfaced the same shape of regression: an entity-decoding path the walker had silently dropped (text nodes, attributes, CDATA bodies, params, URLs). Subsequent walker fixes (#143, #152, #154, #156, #157) repeated the pattern. A pinned-output corpus closes that loop — the next time someone restructures the walker, every path is exercised and any output drift surfaces in the diff of one of the
.expected.mdfiles.Corpus coverage
Each fixture is one coherent group of paths that travel together, sized so a
.expected.mddiff is reviewable.01-prose-and-entities02-callouts03-expand-and-anchor04-code-and-mermaid05-panels06-includes-and-shared-blocks07-tables08-lists-and-tasks09-images-and-files10-links11-layouts-and-edge-casesRegenerating after an intentional output change
```
WRITE_PARITY_EXPECTED=1 npx jest tests/storage-walker-parity.test.js
```
The diff in
.expected.mdbecomes part of the PR for review.Test plan
WRITE_PARITY_EXPECTED=1 npx jest tests/storage-walker-parity.test.jsgenerates 11 expected filesnpx jest tests/storage-walker-parity.test.js(no env) → 11 / 11 pass byte-equalnpm test→ 578 / 578 passnpm run lintclean