fix(mysql): treat BINARY columns as Blob in Any type mapping#4225
fix(mysql): treat BINARY columns as Blob in Any type mapping#4225barry3406 wants to merge 1 commit intolaunchbadge:mainfrom
Conversation
MySQL BINARY(size) columns report ColumnType::String with the BINARY flag set. The Any driver type conversion only checked the column type without looking at flags, so BINARY columns got mapped to AnyTypeInfoKind::Text instead of Blob. This caused UTF-8 decoding errors when reading raw binary data through AnyRow. Check ColumnFlags::BINARY for String/VarString/VarChar column types and map to Blob when set. Fixes launchbadge#4132
abonander
left a comment
There was a problem hiding this comment.
@barry3406 you somehow bypassed the PR template which asks you if there are any breaking changes, and if so, to characterize them. This is a breaking behavior change, so can you please add that to your PR description?
|
Thanks for flagging — updated the description to explicitly characterize the breaking change under "Is this a breaking change?". TL;DR: any code that previously relied on |
| | ColumnType::LongBlob => AnyTypeInfoKind::Blob, | ||
| ColumnType::String | ColumnType::VarString | ColumnType::VarChar => { | ||
| AnyTypeInfoKind::Text | ||
| if type_info.flags.contains(ColumnFlags::BINARY) { |
There was a problem hiding this comment.
Using the BINARY flag is actually misleading, because it's also set for text collations with the binary property, e.g. utf8mb4_bin. This was the root cause of #3387 which resulted in a lot of complaints and headache.
There's some more context in https://github.com/launchbadge/sqlx/blob/main/sqlx-mysql/src/collation.rs#L1-L37 but the gist of it is that the main discriminator is the collation ID: binary (63) is a binary blob, everything else we just assume is a string (ignoring the character set, because it's actually always transcoded to the connection character set).
To avoid duplicating this logic, this could just forward to str::is_compatible() and [u8]::is_compatible().
I think it's also important to add a test for this, both to verify the fix and to guard against future regressions.
Does your PR solve an issue?
Fixes #4132
Is this a breaking change?
Yes, this is a breaking behavior change.
Previously, MySQL
BINARY(size)columns were mapped through the Any driver toAnyTypeInfoKind::Text, which caused a UTF-8 decoding panic (Utf8Error { valid_up_to: 4, error_len: Some(1) }) when reading raw binary data throughAnyRow. After this change, those columns are mapped toAnyTypeInfoKind::Bloband can be decoded asVec<u8>/&[u8]instead.Code that previously relied on the (broken)
Textmapping and was somehow decoding these columns as strings would need to switch to a binary type, so per Hyrum's Law this should land in the next major release (0.{x + 1}.0).MySQL
BINARY(size)columns reportColumnType::Stringwith theBINARYflag set. The Any driver's type conversion checked the column type but not the flags, so BINARY columns got mapped toAnyTypeInfoKind::Text. Reading raw binary data throughAnyRowthen blew up with a UTF-8 decoding error.Before:
After:
The fix checks
ColumnFlags::BINARYforString/VarString/VarCharcolumn types and maps toBlobwhen set.