Fix ReadableStream reuse in retries by fetching fresh body#5660
Fix ReadableStream reuse in retries by fetching fresh body#5660chitalian wants to merge 2 commits into
Conversation
Proxied LLM calls were failing with "Body has already been used. It can only be used once. Use tee() first if you need to read it twice." when the Remote request-body buffer was selected (large / unknown-size bodies). `proxyRequest.body` was a ReadableStream consumed by the first fetch in callProvider, and callProviderWithRetry reused the same CallProps on retries (429 / 5xx), so the second fetch saw an already-consumed stream. Replace CallProps.body with a getBody() factory and call it inside callProvider right before fetch. Each attempt — including retries — now pulls a fresh body from the buffer: a new ReadableStream from the Durable Object container for Remote, or the same cached string for InMemory. Also collapse the duplicate safelyGetBody() calls in HeliconeProxyRequestMapper.tryToProxyRequest so we stop orphaning a stream per request. https://claude.ai/code/session_01FtHETTgsrBtPz9BLB3KmHh
|
The latest updates on your projects. Learn more about Vercel for GitHub. 3 Skipped Deployments
|
|
This pull request has been ignored for the connected project Preview Branches by Supabase. |
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
|
This pull request has been ignored for the connected project Preview Branches by Supabase. |
|
Claude finished @chitalian's task —— View job PR Review: ReadableStream Reuse FixTodo List:
Review Analysis✅ Technical Approach: ExcellentThe core fix is architecturally sound:
✅ Code Quality: GoodStrengths:
Performance Considerations:
|
…apper Keep the retry fix in ProviderClient but leave tryToProxyRequest exactly as it was before: two safelyGetBody() calls bracketing the token-limit exception handler. Collapsing them shifted the order of applyBodyOverrides vs. applyTokenLimitExceptionHandler, which would change the model the token-limit handler sees whenever a fallback bodyKeyOverride is present. Not worth the risk for a perf-only cleanup on a critical path. https://claude.ai/code/session_01FtHETTgsrBtPz9BLB3KmHh
|
Hi @chitalian - any estimate on when this one will be merged? |
Ticket
N/A
Component/Service
Type of Change
Deployment Notes
Context
ReadableStream bodies can only be consumed once. When
callProviderretries a request, reusing the same body reference causes a "Body has already been used" error on the second attempt. Additionally, fetching the body twice from a Remote buffer inHeliconeProxyRequestMapperleaks a ReadableStream from the container DO on every request.Changes
ProviderClient.ts: Changed
CallProps.bodyfrom a staticValidRequestBodyto agetBody()factory function that returns a fresh body for each fetch attempt. This ensures retries get a new stream instead of reusing a consumed one.ProviderClient.ts: Updated
callProviderto callprops.getBody()only when needed (non-GET requests), fetching a fresh body per attempt.HeliconeProxyRequestMapper.ts: Removed the initial body fetch before
applyTokenLimitExceptionHandler. The body is now only fetched once after the handler runs, and callers must use thegetBody()factory instead of consuming the stored reference directly.callPropsFromProxyRequest: Implemented the
getBody()factory to safely fetch the body with fallback to text extraction.Extra Notes
The body is retained on
HeliconeProxyRequestfor logging purposes but must not be consumed by callers—they must use thegetBody()factory to ensure fresh streams on retries.Test Plan
Existing tests should pass. The change is backward compatible at the call site since
getBody()is invoked transparently withincallProvider.https://claude.ai/code/session_01FtHETTgsrBtPz9BLB3KmHh