I suspect the input PDF I'm dealing with is invalid, but I wanted to mention that it worked in 1.0.20 and no longer works in 1.0.21.
The PDF appears to have an invalid stream defined near the end of the file. The relevant part, raw:
8 0 obj\r<</Length 2200\r/Type\r/Metadata\r/Subtype \r/XML>>\rstream\rendstream\rendobj\r9 0 obj\r<< /Keywords()\r/Creator(HP Scan) \r/CreationDate(D:20210326163700-08'00')\r/ModDate(D:20210326163700-08'00')\r/Author ()\r/Producer (HP Scan Extended Application)\r/Title ()\r/Subject ()\r>>\rendobj\rxref\r0 10\r0000000000 65535 f \r0000000009 00000 n \r0000522282 00000 n \r0000522379 00000 n \r0000522588 00000 n \r0000522646 00000 n \r0000522697 00000 n \r0000522746 00000 n \r0000522892 00000 n \r0000522972 00000 n \rtrailer\r<<\r/Size 10\r/Root 5 0 R\r/Info 6 0 R\r/Info 7 0 R\r/Info 8 0 R\r/Info 9 0 R\r>>\rstartxref\r523171\r%%EOF\r
Pretty printed:
8 0 obj
<</Length 2200
/Type
/Metadata
/Subtype
/XML>>
stream
endstream
endobj
9 0 obj
<< /Keywords()
/Creator(HP Scan)
/CreationDate(D:20210326163700-08'00')
/ModDate(D:20210326163700-08'00')
/Author ()
/Producer (HP Scan Extended Application)
/Title ()
/Subject ()
>>
endobj
xref
0 10
0000000000 65535 f
0000000009 00000 n
0000522282 00000 n
0000522379 00000 n
0000522588 00000 n
0000522646 00000 n
0000522697 00000 n
0000522746 00000 n
0000522892 00000 n
0000522972 00000 n
trailer
<<
/Size 10
/Root 5 0 R
/Info 6 0 R
/Info 7 0 R
/Info 8 0 R
/Info 9 0 R
>>
startxref
523171
%%EOF
As you can see, the Length is 2200, but there are not 2200 bytes left in the file (the stream body is actually empty), so `@scanner.pos += out.last[:Length].to_i - 2` ([here](https://github.com/boazsegev/combine_pdf/blob/b966e703fd897ff50832d3823e74791099b82ca3/lib/combine_pdf/parser.rb#L364)) raises a RangeError.
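To illustrate, here is a minimal standalone sketch of the failure, not combine_pdf's actual code. It assumes a plain `StringScanner` over a shortened stand-in for the tail of my file, and `skip_declared_length` is a hypothetical helper I made up to show one possible guard (clamping the jump to the bytes that remain).

```ruby
require 'strscan'

# Hypothetical helper (not part of combine_pdf): advance the scanner by the
# declared stream length, but never past the end of the data.
# Returns the number of bytes actually skipped.
def skip_declared_length(scanner, declared_length)
  remaining = scanner.string.bytesize - scanner.pos
  step = declared_length.clamp(0, remaining)
  scanner.pos += step
  step
end

# Stand-in for the tail of the file: /Length says 2200, but the stream is empty.
data = "<</Length 2200>>\rstream\rendstream\rendobj\r"

scanner = StringScanner.new(data)
scanner.skip_until(/stream\r/) # position just after the "stream" keyword

begin
  # Same arithmetic as the parser line linked above; the target position
  # lies beyond the end of the string, so StringScanner refuses it.
  scanner.pos += 2200 - 2
rescue RangeError => e
  puts "unguarded jump failed: #{e.class}"
end

# The guarded version stops at the end of the data instead of raising.
skipped = skip_declared_length(scanner, 2200 - 2)
puts "guarded jump skipped #{skipped} bytes, eos: #{scanner.eos?}"
```

Clamping is only one option. Searching for the `endstream` keyword or raising a clearer error might be a better fit for the parser, and that is the maintainers' call.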
I'm 90% sure that this is an invalid PDF, but I'm opening this ticket to say out loud that the change introduced in 1.0.21 is (to me) a regression in capability. I recognize that #184 is a related issue.
For now, I've worked around the issue by reverting to 1.0.20. Not ideal, but sufficient for my purposes.