php-vcr version
master (currently unreleased, post-1.7.x)
PHP version
All supported (8.0–8.5)
Library hook in use
curl
Storage backend
Not sure (parsing happens before storage)
Description
Responses routed through HTTP proxies can include several HTTP/... status lines before the final response, for example a 200 Connection established tunnel acknowledgement from the proxy followed by the actual 201 Created. The current implementation in HttpUtil::parseResponse() only strips the literal sequence HTTP/1.1 100 Continue\r\n\r\n. Any other intermediate HTTP/ block survives: it is parsed as the status line, and the real response (status line, headers and body) ends up as body content, so both the status and the body are wrong.
How to reproduce
use VCR\Util\HttpUtil;
// Two intermediate 200 blocks (e.g. proxy acknowledgements) followed by the real response (201)
$raw = "HTTP/1.1 200 OK\r\n\r\nHTTP/1.1 200 OK\r\n\r\nHTTP/1.1 201 Created\r\nContent-Type: text/html\r\nDate: Fri, 19 Jun 2015 16:05:18 GMT\r\nVary: Accept-Encoding\r\nContent-Length: 0\r\n\r\n";
[$status, $headers, $body] = HttpUtil::parseResponse($raw);
// Expected: 'HTTP/1.1 201 Created', body: ''
// Actual: status is 'HTTP/1.1 200 OK', body contains the remaining HTTP parts
$this->assertEquals('HTTP/1.1 201 Created', $status);
$this->assertEquals('', $body);
A unit test version of this reproducer is in tests/Unit/Util/HttpUtilTest.php — adding a case for the proxy-200 prefix (in addition to the existing 100-Continue cases) will fail on current master.
Possible Solution
Iteratively strip leading HTTP/...\r\n\r\n blocks for as long as the block that follows also starts with HTTP/; the last such block is the real status line and headers. The existing 100-Continue stripping can be generalised: locate each \r\n\r\n, advance the pointer past any HTTP/ segment that is immediately followed by another HTTP/ segment, then treat the remainder as header + body. Avoid split/rejoin tricks that corrupt CRLF sequences inside the body; use substr from the offset of the last leading HTTP/ block instead.
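A minimal sketch of that idea, to make the offset handling concrete. The helper name stripIntermediateResponses() is hypothetical and not part of php-vcr, and the sample response is made up for illustration. It assumes the input starts with a status line, and it would misread a real body that itself begins with "HTTP/", so treat it as a starting point rather than a finished fix.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of php-vcr): drops every leading "HTTP/..."
 * block that is directly followed by another "HTTP/..." block.
 */
function stripIntermediateResponses(string $response): string
{
    $offset = 0;

    while (true) {
        // End of the header block that starts at $offset.
        $end = strpos($response, "\r\n\r\n", $offset);
        if (false === $end) {
            break; // no blank line left: nothing more to skip
        }

        $next = $end + 4; // first byte after the blank line

        // Skip the current block only when another status line follows it.
        if ('HTTP/' !== substr($response, $next, 5)) {
            break;
        }

        $offset = $next;
    }

    // substr keeps every byte after $offset, so CRLFs in the body are untouched.
    return substr($response, $offset);
}

// Illustrative input: proxy acknowledgement, then a response whose body contains CRLFs.
$raw = "HTTP/1.1 200 Connection established\r\n\r\n"
    . "HTTP/1.1 201 Created\r\nContent-Length: 6\r\n\r\na\r\n\r\nb";

$clean = stripIntermediateResponses($raw);
[$rawHeader, $rawBody] = explode("\r\n\r\n", $clean, 2);
```

Here $rawHeader holds only the 201 status line and its header, and $rawBody still contains its embedded blank line, because the remainder is split once (limit 2) and never rejoined.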
Additional Context
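This was first reported in the context of proxy tunnelling. Two earlier PRs proposed fixes but neither landed. One problem noted with the earlier attempts: implode('\r\n\r\n', $bodyParts) uses the literal string, not the CRLF escape sequence.
The test cases from both PRs are preserved here as the acceptance criteria for a fresh implementation.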
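One pitfall worth ruling out in any split/rejoin approach is PHP's quoting rule: escape sequences such as \r and \n are only interpreted inside double-quoted strings. The snippet below is a small illustration with made-up body parts, not code taken from either PR.

```php
<?php

declare(strict_types=1);

// Illustrative data only.
$bodyParts = ['first', 'second'];

// Single quotes: the separator is 8 literal characters (backslash, r, backslash, n, ...).
$wrong = implode('\r\n\r\n', $bodyParts);

// Double quotes: the separator is two real CRLF pairs (4 bytes).
$right = implode("\r\n\r\n", $bodyParts);

$wrongHasCrlf = str_contains($wrong, "\r\n"); // false: no CRLF was inserted
$rightHasCrlf = str_contains($right, "\r\n"); // true
```

Working from an offset with substr avoids the rejoin step altogether, so this class of bug cannot occur.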