Fix RSCProvider retry after rejected getComponent promise#4022
Fix RSCProvider retry after rejected getComponent promise#4022justin808 wants to merge 7 commits into
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Code Review: Fix RSCProvider retry after rejected getComponent promiseOverall Assessment: LGTM with minor notes The fix is correct, minimal, and well-tested. The two-argument What works well
One design note to verify
In practice this seems safe today because:
But it's a gotcha worth a comment in Minor test nit
|
size-limit report 📦
|
Code ReviewThe fix is correct and well-targeted. The Test coverage is thorough: 0a covers the dedup/success-caching path, 0b covers retry-after-rejection, and 0c covers the stale-eviction-vs-newer-refetch race. Nothing is obviously missing. A few things worth discussing: The timer is never canceled on unmountIf the provider unmounts in the ~0 ms window between a rejection and the scheduled eviction, the
|
Code ReviewThe fix is logically sound. The core insight is correct: when The identity guard ( Observations1. Brief rejected-promise window (design gap in docs) 2. 3. FIFO macrotask assumption in 4. Test naming convention mismatch SummaryNo bugs found. No security implications. Performance impact of |
Adversarial PR Review — #4022 Fix RSCProvider retry after rejected getComponent promiseHead: `0274c35` | Base: `main` | State: OPEN / DRAFT / BLOCKED | Merged: no SummaryThe core fix is correct and well-targeted. The two-argument `.then(onFulfilled, evictPromiseIfRejected)` pattern is the right choice over `.catch()` (which would intercept errors thrown by the success handler), and the `setTimeout(0)` boundary over `queueMicrotask` is the right boundary to use — microtasks can race with React's internal queue and evict the cache entry before Suspense observes the rejection. The identity guard (`fetchRSCPromisesRef.current[key] === promise`) correctly handles the concurrent-refetch case (verified by test 0c). The new tests are targeted and cover the three behaviors that matter: deduplication, eviction-on-rejection, and non-eviction of superseded promises. BLOCKING1. PR is still a draft; no human maintainer has approved it; CI is incomplete. DISCUSS2. Missing CHANGELOG entry. 3. The unmount-safety comment is misleading (code is correct, comment is not). FOLLOWUP (post-merge)4. No Suspense-level integration test for the retry path. 5. Fake-timer hazard for `waitForRejectedGetComponentEviction`. NON_BLOCKING_DECISION6. `setTimeout(0)` over `queueMicrotask`. 7. `no-dynamic-delete` ESLint suppression. |
|
Review follow-up after
Validation rerun:
|
| return payload; | ||
| }; | ||
| promise = getServerComponent({ componentName, componentProps }).then(markPayloadIfSuccessful); | ||
| const evictPromiseIfRejected = (error: unknown) => { |
There was a problem hiding this comment.
Note the asymmetry between the two error paths handled here:
evictPromiseIfRejected(promise rejection) → cache entry is evicted after one macrotask, enabling retrymarkPayloadIfSuccessfulabove forpayload instanceof Error(error-as-value) → cache entry is retained permanently
If getServerComponent can fulfill with an Error object (the error-as-value pattern), those entries will block retries just like the old rejected-promise behavior — getComponent keeps returning the same fulfilled-with-Error promise until an explicit refetchComponent call. Is this intentional? If error-as-value responses are retryable, the same eviction logic would apply to the !(payload instanceof Error) branch too.
There was a problem hiding this comment.
Leaving this open as a product/contract decision rather than changing behavior in this safe-small review pass. Issue #3929 and its acceptance criteria cover rejected getComponent promises, and this PR is intentionally scoped to that rejected-promise retry path.
The fulfilled Error-as-value path is currently treated separately in RSCProvider for both getComponent and refetchComponent, while route-loader payloads from createRscPayloadNode already normalize serialized Error payloads into promise rejections. Making fulfilled Error values retryable could change the contract for direct getServerComponent implementations that intentionally fulfill with an Error object. Since #3929 is assigned to ihabadham, I am leaving this thread unresolved for assignee/maintainer direction before expanding scope.
Code ReviewThe core fix is correct and well-designed. The Strengths
Issues
Error-as-value path is not evicted (inline comment at line 122 in RSCProvider.tsx) Suspense timing gap in test coverage Test numbering ( Minor nitIn test 0c, the assertion comment Overall: ready to merge after addressing the Sinon |
Adversarial Review — PR #4022 (RSCProvider
|
| @@ -119,7 +119,31 @@ export const createRSCProvider = ({ | |||
| } | |||
| return payload; | |||
There was a problem hiding this comment.
DISCUSS: Error-as-value payloads are permanently cached with no retry path
When getServerComponent resolves (rather than rejects) with an Error instance, the fulfillment branch fires: markPayloadIfSuccessful skips markSuccessfulPromise (correct — it is not a successful payload) but returns the Error, so the .then() chain resolves successfully. evictPromiseIfRejected is never called. The cache entry keeps the resolved-Error promise indefinitely.
After this PR, getComponent for that key will always return the same resolved-Error promise — no retry, no eviction. This is the open product question noted in the PR body. It needs an explicit decision before undraft:
- Option A (deliberate boundary): close the unresolved thread, document that
getServerComponentmust throw (not return) to get retry behavior, and note this in the RSCProvider contract. - Option B (follow-up fix): track a follow-up issue to also evict on resolved-Error payloads (likely needs the same
setTimeout(0)treatment to let Suspense/consumers observe the error value before eviction).
|
CI process update after #4036 merged: Please rebase or otherwise update this PR onto current Post-merge audit tracker: #4055 This PR is currently a draft; no immediate action is required until it is resumed, but it should still use the new process before it is considered merge-ready. Legacy labels currently present here: Recommended update path:
Vocabulary changes:
|
Code Review: Fix RSCProvider retry after rejected getComponent promiseOverviewThe fix is sound. Caching rejected promises indefinitely was a real bug — any Correctness ✅
Issues1. Asymmetry:
2.
3. FIFO timer-ordering assumption in The helper is clever and the FIFO guarantee holds in Node.js/V8. The fake-timer guard is a good safety net. The comment explains how it works but not why it is safe in this environment. Adding "FIFO is guaranteed by the event loop spec and holds in Node.js/V8/JSDOM" locks down the assumption for future test maintainers. 4. Test naming: These were inserted before the existing Minor
Bundle size CI failurePer the PR description, the failing check reports unchanged OSS bundle entries ( VerdictThe implementation is correct and well-tested. The two main actionable items are: (1) a test or expanded comment documenting the |
| @@ -168,6 +192,8 @@ export const createRSCProvider = ({ | |||
| return payload; | |||
| }, | |||
| (error: unknown) => { | |||
There was a problem hiding this comment.
The comment correctly describes the intent, but consider adding the downstream implication for getComponent callers: after a non-recovering refetchComponent failure the rejected promise remains cached, so any subsequent getComponent call for the same key will return the cached rejection until another refetchComponent is issued. This is the asymmetry with getComponent's auto-eviction and could surprise integrators:
| (error: unknown) => { | |
| // Non-recovering refetches keep the rejected promise so the | |
| // caller observes the failure; route refetches recover in production. | |
| // Note: unlike getComponent (which auto-evicts after one macrotask), | |
| // this rejection stays cached — subsequent getComponent calls for the | |
| // same key will return this failure until the next refetchComponent. |
| }, 0); | ||
| throw error; | ||
| }; | ||
| promise = getServerComponent({ componentName, componentProps }).then( |
There was a problem hiding this comment.
The let promise! declaration a few lines above this assignment is captured by the evictPromiseIfRejected closure before it's assigned here. This is valid because the closure only executes asynchronously (on rejection), so promise is always fully assigned by the time it runs. The non-null assertion (!) silences TypeScript's concern but doesn't document the invariant — a short note at the let promise! declaration site ("assigned synchronously below; closure runs async so promise is always set by the time it executes") would help future readers avoid a potential confusion.
| setTimeout(resolve, 0); | ||
| }); | ||
| }; | ||
|
|
There was a problem hiding this comment.
The FIFO macrotask ordering assumption is sound (guaranteed by the event loop spec, holds in Node.js/V8/JSDOM), but it could silently pass in environments where timers are not truly FIFO. The existing fake-timer guard is a good safety net. Worth appending to the comment: "FIFO is guaranteed by the HTML event loop spec and holds in V8/JSDOM" so future maintainers know the exact invariant this relies on.
a00d251 to
704b7f9
Compare
Bundle-size blocker: resolved (stale-base artifact, not a real regression)The previously failing Evidence. The failing run reported every one of the 8 tracked bundles growing ~700–1100 bytes, including the OSS
This PR changes only two Pro files ( Root cause. The branch was 7 commits behind Fix applied. Rebased onto current No Note: This resolves only the bundle-size blocker. The PR remains draft pending the other tracked items (Error-as-value retryability product/contract decision, open review threads, and #3929 assignee/maintainer coordination). Not merging here. |
| @@ -168,6 +192,8 @@ export const createRSCProvider = ({ | |||
| return payload; | |||
| }, | |||
| (error: unknown) => { | |||
| // Non-recovering refetches keep the rejected promise so the | |||
| // caller observes the failure; route refetches recover in production. | |||
| if (recoverOnError) { | |||
There was a problem hiding this comment.
When recoverOnError is false (the default), the rejected refetch promise stays in fetchRSCPromisesRef.current[key] (set just below at line 203). Because getComponent checks key in fetchRSCPromisesRef.current before doing anything (line 111), it will return this stuck rejected promise on every subsequent call — with no eviction timer scheduled and no new fetch triggered.
This is the opposite of what this PR does for getComponent rejections: they get a deferred eviction timer so the next call can retry. A failed non-recovering refetchComponent leaves the cache poisoned until another refetchComponent is issued. If any part of the UI calls getComponent (rather than refetchComponent) after this, it will be permanently broken for that key.
Consider either:
- Scheduling a
setTimeout(0)eviction here with the same identity guard, or - Adding a test that exercises
getComponentafter a non-recoveringrefetchComponentfailure and documents the current behaviour as an explicit contract.
| } | ||
|
|
||
| return new Promise<void>((resolve) => { | ||
| setTimeout(resolve, 0); |
There was a problem hiding this comment.
The FIFO guarantee here is coupled to the production implementation using setTimeout(0) in evictPromiseIfRejected. If that ever changes to setImmediate, queueMicrotask, or a longer delay, this helper silently stops being sufficient — the eviction would either run later or sooner than expected, and the test could flake or produce false positives.
Consider a brief note in the comment (or the function's JSDoc) that explicitly states: "this helper must be kept in sync with the eviction mechanism in RSCProvider.tsx." The fake-timer guard is already a good defensive check; this would make the coupling explicit for future maintainers.
Code Review: Fix RSCProvider retry after rejected getComponent promiseThe fix addresses a correctness bug where a rejected The fix is correct:
Key concern: This PR fixes
The comment at lines 195-196 says this is intentional, but Either apply the same deferred-eviction pattern here, or add a test documenting the permanent-rejection behaviour as an explicit contract. Test quality: Missing test: No test covers the poisoned-cache case after a non-recovering |
Pro Node Renderer Benchmark Summary
▲/▼ non-zero change vs baseline · 0.0% exact/near-zero match · 🔴 significant regression · 🟢 significant improvement (tracked measures) · (n) = baseline |
Core Benchmark Summary
▲/▼ non-zero change vs baseline · 0.0% exact/near-zero match · 🔴 significant regression · 🟢 significant improvement (tracked measures) · (n) = baseline |
Pro (shard 2/2) Benchmark Summary
▲/▼ non-zero change vs baseline · 0.0% exact/near-zero match · 🔴 significant regression · 🟢 significant improvement (tracked measures) · (n) = baseline |
Pro (shard 1/2) Benchmark Summary
▲/▼ non-zero change vs baseline · 0.0% exact/near-zero match · 🔴 significant regression · 🟢 significant improvement (tracked measures) · (n) = baseline |
…4110) ## Problem `check-bundle-size` produced false-positive failures for PRs whose branch had fallen behind the base branch. The job built **base** sizes from `github.event.pull_request.base.sha` but built **PR-head** sizes from `github.sha` — the `refs/pull/N/merge` commit (the head merged into the base-branch tip). When the PR branch is several commits behind `main`, those two values reference **different base-branch baselines**. Growth that landed on `main` after the branch point gets absorbed into the delta and mis-attributed to the PR, producing a uniform ~700–1100 byte "increase" across **every** tracked bundle — including the OSS `react-on-rails/client` bundle a Pro-only PR never touched. Hit concretely by #4022 and #4023; the only workaround was a manual rebase onto current `main`. ## Root cause (verified) For `pull_request` events `github.sha` is the ephemeral merge commit `refs/pull/N/merge`, which has two parents: - **parent 1** = the base-branch commit the head was merged into - **parent 2** = the PR head The PR-head build already uses this merge commit, so its parent 1 is the *exact* base baseline the merge was computed against. Building base from the separate, frozen `base.sha` lets the two drift apart as `main` advances. ## Fix (Option 3, anchored to the merge commit) Derive base from **`github.sha`'s first parent** instead of `base.sha`: - Base and PR-head builds now share the same base-branch baseline **by construction** — skew is provably impossible, with no frozen-payload value trusted and no live-tip fetch (which would reintroduce a race). - Self-heals stale branches: a behind-`main` PR is always measured against the baseline its own merge ref uses. vs. the other options: - **Option 1 (`git merge-base`)** pairs against the branch point, but the head build is the *merge ref* — so merge-base vs merge-ref still skews. - **Option 2 (fail-soft "rebase and re-run")** leaves the manual toil in place. ### Changes (`.github/workflows/bundle-size.yml` only) - PR-runtime checkout now uses `fetch-depth: 0` so the merge commit's parents are real (a shallow checkout grafts the tip and hides them). - New **Determine base SHA** step validates `github.sha` is a 2-parent merge commit whose second parent matches the PR head, and fails with an actionable message when the PR is not mergeable (`github.sha == head sha`, i.e. base-branch conflicts). - Base checkout consumes the derived SHA with `fetch-depth: 0`. No changes to `scripts/bundle-size.mjs` or `.size-limit.json`; the `set-limits` flow is unchanged. ## Validation - `actionlint .github/workflows/bundle-size.yml` — clean (SC2207 fixed via `read -ra`). - Derivation logic unit-tested under bash against the real `refs/pull/4022/merge` ref: - real merge ref → base = correct first parent ✓ - unmergeable (`github.sha == head`) → actionable error ✓ - non-merge (1-parent) commit → caught ✓ - merge ref with mismatched head → caught ✓ - `prettier --check` clean (and `*.yml` is prettier-ignored). ## Workflow Change Audit - **Secret references:** none added/changed. - **`permissions:`** unchanged (`contents: read`; job retains `pull-requests: write`). - **`on:` triggers:** unchanged. - **Third-party actions:** none added or version-changed (`actions/checkout@v4`, `actions/setup-node@v4`, `pnpm/action-setup@v4`, `andresz1/size-limit-action@v1` all unchanged). - **Command injection:** new `run:` block reads only SHAs, passed via `env:` and quoted (`"$MERGE_SHA"`, `"$HEAD_SHA"`); `ref:` consumes a git-validated SHA, not free text. ### New-gate / stale-base rollout note This change does not add or broaden enforcement — it makes the existing gate **immune to stale-base skew by construction**, so it can only reduce false positives. Stale-based open PRs that touch `packages/**` will, on their next `check-bundle-size` run, be measured against their own merge-ref baseline and get accurate results (the fix is self-applying; re-running the check is the race control). No sweep of in-flight PRs is required to land safely. <!-- CURSOR_SUMMARY --> --- > [!NOTE] > **Low Risk** > CI-only workflow change with no secrets or app code; it should reduce false positives and only fails early on unmergeable PRs. > > **Overview** > Fixes **false-positive bundle size failures** when a PR branch is behind the base branch. The job used to measure the PR from `github.sha` (the ephemeral merge commit) but measured the baseline from `github.event.pull_request.base.sha`, so unrelated growth on `main` could look like a uniform increase across every tracked bundle. > > The initial PR checkout now uses **`fetch-depth: 0`** so the merge commit’s parents are visible. A new **Determine base SHA** step reads the **first parent** of `github.sha`, checks it is a two-parent merge whose second parent matches the PR head, and fails with a clear message when the PR is not mergeable. The **base branch checkout** uses that derived SHA instead of `base.sha`, so base and PR-head builds share the same baseline by construction. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 52ae4bd. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY --> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Chores** * Improved bundle size checking accuracy and reliability by enhancing how the CI workflow determines and checks out the baseline SHA for pull requests. * Added validation and clearer failure messages when pull request merge structure is unexpected or not mergeable. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --- ## Merge confidence note Merged with delegated authority. Rationale: - **Scope is contained and low-risk.** Single-file change to `.github/workflows/bundle-size.yml`; no product/runtime code, no `permissions:`/`on:`/secret/third-party-action changes (per the Workflow Change Audit comment). The gate only loosens — it removes a false-positive source and cannot make a real size regression pass (the `base + 0.5KB` threshold logic is unchanged). - **Proven end-to-end, not just reasoned.** `check-bundle-size` ran green on this PR's own head, exercising the new merge-commit-first-parent base derivation and the base checkout at default depth. Root cause was verified empirically against the real `refs/pull/4022/merge` ref (parent1 = base, parent2 = head). - **Full required gate green:** `required-pr-gate`, `check-bundle-size`, `actionlint`, `zizmor`, `CodeQL`, `Analyze (ruby/js)` all SUCCESS. - **Independent review satisfied:** CodeRabbit approved; Cursor Bugbot green; `claude-review` green for the current head; both inline review findings (error-message underflow, base `fetch-depth`) addressed in `52ae4bdb8` and replied to. No unresolved actionable feedback. - **Stale-base rollout handled:** the change is self-applying — open behind-`main` PRs touching `packages/**` get accurate results on their next `check-bundle-size` run; no pre-land sweep required. --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Summary
RSCProviderretry path after a rejectedgetComponentpromise by evicting the rejected same-key cache entry on the next macrotask.Issue Disposition
ihabadham; this remains draft until maintainer/assignee coordination and current-head CI/review finish.Agent Merge Confidence
Mode: accelerated-rc
Current head SHA: a00d251
Score: 3/10
Auto-merge recommendation: no
Affected areas: Pro RSCProvider retry behavior, Pro RSC client tests
CI detector:
script/ci-changes-detector origin/main-> Pro/RSC runtime coverage expected; labelsfull-ci,benchmarkremain appropriate.Validation run:
pnpm --dir packages/react-on-rails-pro exec jest tests/imperativeRefetch.client.test.tsx --runInBand-> passed, 25 tests.pnpm --dir packages/react-on-rails-pro run type-check-> passed./Users/justin/.codex/worktrees/7e0e/react_on_rails/node_modules/.bin/prettier --check packages/react-on-rails-pro/src/RSCProvider.tsx-> passed for the latest comment-only change.git diff --checkandgit diff --check origin/main...HEAD-> passed.a00d2510e1da2603e0c81c906d536bc59e94c9b7.Review/check gate:
a00d2510e1da2603e0c81c906d536bc59e94c9b7has one failed check and multiple jobs still queued/in progress.Bundle Size / check-bundle-sizefailed because static size limits were exceeded; the log includes unchanged OSS bundle entries (react-on-rails/client63,845 > 63,392 gzip and 54,774 > 54,406 brotli), so this appears broader than the two Pro files touched by this PR, but it remains merge-blocking until fixed, rerun clean, or explicitly waived.de31fe2afa76ad966e9bced42a23ef9b03642b02:webpack-dummy-app-tests / build-dummy-app-test-bundles (4.0, 22, latest, webpack)failed before repo build/test code because hosted runnerapt-get updatereceived 403s from Microsoft package repositories while installinglibyaml-dev.refetchComponentrejection caching, the delayed-eviction closure overpromise, and timer FIFO assumptions. The non-blocking macrotask-window thread was addressed and resolved ina00d2510e.a00d2510e1da2603e0c81c906d536bc59e94c9b7; full CI/benchmark completion is still pending.Feedback triage:
de31fe2afremoved the Sinon-specifictimerApi.clockfake-timer branch from the Jest-only test helper.de31fe2afadded a concise FIFO timer-ordering comment at the rejectedgetComponenteviction wait.a00d2510eclarified that the cached macrotask-window entry is the chained rejected promise itself.getComponentpromises,createRscPayloadNodealready turns serialized Error payloads into rejections for route-loader payloads, and changing fulfilledErrorretention could expand the directgetServerComponentcontract.Labels:
full-ci,benchmark. This changes Pro/RSC retry behavior and should retain broad CI plus benchmarks.Known residual risk: High until the current-head bundle-size failure is resolved or waived, the five unresolved review threads are addressed or explicitly triaged, and the Error-as-value product decision is made; PR remains draft pending maintainer/assignee coordination for Pro RSCProvider should evict rejected getComponent promises so transient failures can retry #3929.
Finalized by: pending current-head independent review/check completion.
Current Readiness
Draft, not merge-ready. Local validation is green for
a00d2510e1da2603e0c81c906d536bc59e94c9b7, but current-head GitHub checks include a failedBundle Size / check-bundle-sizejob, and five current review threads remain open. #3929 is assigned toihabadham, so undraft/merge should wait for maintainer or assignee coordination.Note
Medium Risk
Changes Pro RSC promise caching and retry timing around Suspense; behavior is well-tested but affects production RSC fetch recovery paths.
Overview
Fixes Pro
RSCProviderso a failedgetComponentfetch does not leave a permanently cached rejected promise for the same cache key.getComponentnow chains a rejection handler that still re-throws for Suspense, then deletes the in-flight cache entry on the nextsetTimeout(0)macrotask, but only when the entry is still the same promise—so a newer same-keyrefetchComponentpromise is not cleared by a stale failure. Comments document the Suspense timing choice and unmount safety.refetchComponentbehavior is unchanged; a comment clarifies that non-recovering refetches keep the rejected promise in cache.Client tests add an
RSCProbeharness and cases for dedupe/success caching, retry after eviction, and superseded refetches, plus a real-timer helper aligned with the deferred eviction.Reviewed by Cursor Bugbot for commit 704b7f9. Bugbot is set up for automated code reviews on this repo. Configure here.