fix(walker): escape markdown special chars in ri:content-title#156
Merged
Conversation
Confluence page titles routinely contain `()` (e.g. "Release notes (v2.0)") and could in theory contain `[]` or `\`. Interpolating them raw into markdown link syntax breaks downstream parsers and — for an adversarially-crafted title — allows a sibling link to be injected. Add an `escapeMarkdownText` helper and apply it at the three sites that splice `ri:content-title` into markdown: `handleAcLink`'s `[title]` form, `handleInclude`'s link text (URL still uses the unescaped title via `encodeURIComponent`), and `handleSharedBlock`'s prose interpolation.
|
🎉 This PR is included in version 2.1.9 🎉 The release is available on: Your semantic-release bot 📦🚀 |
4 tasks
pchuri
added a commit
that referenced
this pull request
May 1, 2026
… (#159) * test(walker): add byte-equal parity corpus for storageToMarkdown (#138) Add tests/fixtures/storage-samples/ with 11 storage XML samples paired with checked-in .expected.md files. tests/storage-walker-parity.test.js iterates the corpus and asserts byte-equal output, so future structural refactors of the walker face every dispatch path on PR open instead of needing a human reviewer to spot a regression. Background: the regex → walker migration in #137 took three review rounds because each round surfaced one more entity-decoding path the walker had silently dropped. Subsequent fixes (#143, #152, #154, #156, #157) repeated the pattern. The corpus pins the current behaviour for each handler so the next refactor can iterate against CI. Corpus coverage: - prose, headings, inline formatting, entities (named, ASCII map, numeric) - callouts (info / warning / note + empty body) - expand (with title, without title), anchor - code (lang / no-lang / fence escape), mermaid - panel variants (title+body, body-only, title-only) - include (shared / personal space), shared-block, include-shared-block - tables with formatting and br collapse - lists (ul / ol / nested) and task lists - images (ri:attachment / ri:url / no source) and view-file - ac:link variants (anchor, ri:url, link-body, ri:page) - layouts, CDATA + entities, toc/floatmenu drop, blockquote, unknown tag To regenerate after an intentional output change: WRITE_PARITY_EXPECTED=1 npx jest tests/storage-walker-parity.test.js * test(walker): self-review fixes for parity corpus (#138) Address four issues from self-review of #159: - cleanupWithFences ends with .trim(), so walker output has no trailing newline. Editors / pre-commit hooks that auto-add a final newline on save would silently break byte-equal compare. Now serialize and compare with a final '\n', keeping the .expected.md POSIX-compliant. - Test name was "round-trips to its pinned markdown" — confusing, storageToMarkdown is one-way. Rename to "matches its pinned markdown". - WRITE_PARITY_EXPECTED=1 silently rewrote the corpus and passed without asserting. A stray env var in a shell rc could overwrite the corpus during CI. Now log a console.warn so the regenerate path is visible. - Three handler paths were not exercised: handleMacro default branch (unknown macro name), handleAcLink fallback drop (empty <ac:link/>), handleImage drop on empty ri:url value. Extend fixtures 09 / 10 / 11.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #143.
ri:content-titleis interpolated directly into markdown link syntax. If a title contains(,),[,], or\, the link breaks or — for an adversarially-crafted title likeevil) [pwn](http://attacker.com— allows injection of an unintended sibling link. This is pre-existing on main (the regex pipeline didn't escape either), but it's a real correctness bug for any Confluence page whose title legitimately contains parentheses.Changes
escapeMarkdownText(s)helper that escapes\ [ ] ( ).ri:content-titleinto markdown:handleAcLink—[title]form for<ri:page>without link body.handleInclude— link text in include-page output. URL path still uses the unescaped title viaencodeURIComponent, so URL encoding is unaffected.handleSharedBlock—pageTitleinterpolated into prose between surrounding parens.Out of scope:
ri:filename(handleViewFile / handleImage) and the generic<a>link path. Those are pre-existing concerns the issue did not include.Test plan
npx jest— 560/560 passing.npx eslint— clean on changed files.