Optimize generic InList static filtering #21927
Draft
geoffreyclaude wants to merge 3 commits into apache:main
run benchmark in_list_strategy
🤖 Criterion benchmark running (GKE). Comparing perf/in_list_generic_static_filter (ba66c2f) to 3aefba7 (merge-base).
🤖 Criterion benchmark completed (GKE).
Branch updated from ba66c2f to 5a9378f.
Which issue does this PR close?
IN LIST optimization series in IN LIST optims #19390.
Rationale for this change
After #21649, non-primitive constant IN LIST evaluation still uses the extracted ArrayStaticFilter fallback path. That path relies on comparator checks for each input row. This PR replaces that fallback lookup with a precomputed hash table and shared result construction, so generic constant-list evaluation is cheaper before the later specialized primitive and string optimizations from #19390.
What changes are included in this PR?
The PR is split so reviewers can separate mechanical cleanup from the behavior/performance changes:

Refactor generic InList static filter helpers
Pure refactoring. This moves the existing generic static-filter construction and probe loop into helper methods inside ArrayStaticFilter, without changing the lookup data structure or result semantics.

Build InList results from bitmaps
Changes how the generic path materializes BooleanArray results after membership has been computed. Instead of mixing membership checks and SQL three-valued null handling in the row loop, this builds a contains bitmap first and applies the null/negation rules with bitmap operations. This keeps the same IN/NOT IN semantics, including the NULL cases.

Optimize generic InList static filtering
Switches the fallback lookup storage from a unit-valued raw-entry HashMap to hashbrown::HashTable<usize>. The table still stores indices into the constant list and still uses Arrow hashing plus make_comparator for equality, but avoids the extra map value bookkeeping.

The existing specialized primitive filters and dictionary handling are intentionally left out of scope.
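For reviewers who want the null rules of the "Build InList results from bitmaps" commit spelled out, here is a minimal std-only sketch. It is not the PR's code: Bitmap, in_list_result, and to_options are hypothetical names, and a Vec<u64> stands in for Arrow's bit-packed buffers. It assumes the contains bitmap has already been computed by the membership probe and that the caller knows whether the constant list holds a NULL.

```rust
/// Hypothetical bit-packed bitmap: bit `i % 64` of `words[i / 64]` describes row `i`.
#[derive(Clone, Debug)]
struct Bitmap {
    words: Vec<u64>,
    len: usize,
}

impl Bitmap {
    fn from_bools(bits: &[bool]) -> Self {
        let mut words = vec![0u64; (bits.len() + 63) / 64];
        for (i, &b) in bits.iter().enumerate() {
            if b {
                words[i / 64] |= 1u64 << (i % 64);
            }
        }
        Bitmap { words, len: bits.len() }
    }

    fn get(&self, i: usize) -> bool {
        (self.words[i / 64] >> (i % 64)) & 1 == 1
    }

    /// Word-at-a-time AND of two bitmaps of equal length.
    fn and(&self, other: &Bitmap) -> Bitmap {
        assert_eq!(self.len, other.len);
        let words = self.words.iter().zip(&other.words).map(|(a, b)| a & b).collect();
        Bitmap { words, len: self.len }
    }

    /// Word-at-a-time NOT. Padding bits past `len` may end up set; `get` is never called on them.
    fn not(&self) -> Bitmap {
        Bitmap { words: self.words.iter().map(|w| !w).collect(), len: self.len }
    }
}

/// Returns `(values, validity)` for `needle [NOT] IN (list)`.
/// `contains` marks rows whose needle matched a list value;
/// `needle_valid` marks rows whose needle is non-NULL.
fn in_list_result(
    contains: &Bitmap,
    needle_valid: &Bitmap,
    list_has_null: bool,
    negated: bool,
) -> (Bitmap, Bitmap) {
    // A NULL needle always yields NULL. If the list holds a NULL,
    // a miss is also NULL, so only hits on valid needles stay valid.
    let validity = if list_has_null {
        needle_valid.and(contains)
    } else {
        needle_valid.clone()
    };
    // NOT IN flips the value bits; NULL rows stay NULL through `validity`.
    let values = if negated { contains.not() } else { contains.clone() };
    (values, validity)
}

/// Expands the two bitmaps into per-row `Option<bool>` for inspection.
fn to_options(values: &Bitmap, validity: &Bitmap) -> Vec<Option<bool>> {
    (0..values.len)
        .map(|i| validity.get(i).then(|| values.get(i)))
        .collect()
}
```

With three rows (a hit, a miss, and a NULL needle), a list without NULLs gives TRUE, FALSE, NULL for IN. Once the list contains a NULL, the miss becomes NULL for both IN and NOT IN, so only the hit row stays non-NULL.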
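The index-only table of the "Optimize generic InList static filtering" commit can be pictured with the std-only sketch below. It is an illustration under stated assumptions, not the PR's implementation: IndexTable is a hypothetical hand-rolled linear-probing table standing in for hashbrown::HashTable<usize>, std's RandomState stands in for Arrow hashing, and the Eq bound stands in for the make_comparator equality check.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hash, Hasher};

/// Hypothetical open-addressing set whose slots hold only indices into `list`.
/// There is no per-entry key or value; hashing and equality go through the index.
struct IndexTable<'a, T> {
    list: &'a [T],
    slots: Vec<Option<usize>>,
    state: RandomState,
}

impl<'a, T: Hash + Eq> IndexTable<'a, T> {
    /// Builds the table once, up front, from the constant list.
    fn build(list: &'a [T]) -> Self {
        // Power-of-two capacity of at least twice the list length keeps the
        // load at or below 50%, so every probe reaches an empty slot.
        let cap = (list.len() * 2).max(2).next_power_of_two();
        let mut table = IndexTable {
            list,
            slots: vec![None; cap],
            state: RandomState::new(),
        };
        for idx in 0..list.len() {
            table.insert(idx);
        }
        table
    }

    fn hash_of(&self, value: &T) -> u64 {
        let mut hasher = self.state.build_hasher();
        value.hash(&mut hasher);
        hasher.finish()
    }

    fn start_slot(&self, value: &T) -> usize {
        (self.hash_of(value) as usize) & (self.slots.len() - 1)
    }

    fn insert(&mut self, idx: usize) {
        let mask = self.slots.len() - 1;
        let mut pos = self.start_slot(&self.list[idx]);
        while let Some(existing) = self.slots[pos] {
            if self.list[existing] == self.list[idx] {
                return; // duplicate list value, keep the first index
            }
            pos = (pos + 1) & mask;
        }
        self.slots[pos] = Some(idx);
    }

    /// Probes for `needle`, comparing against list values through stored indices.
    fn contains(&self, needle: &T) -> bool {
        let mask = self.slots.len() - 1;
        let mut pos = self.start_slot(needle);
        while let Some(existing) = self.slots[pos] {
            if &self.list[existing] == needle {
                return true;
            }
            pos = (pos + 1) & mask;
        }
        false
    }
}

/// Probes every needle and returns one membership flag per row.
fn probe_all<T: Hash + Eq>(table: &IndexTable<'_, T>, needles: &[T]) -> Vec<bool> {
    needles.iter().map(|n| table.contains(n)).collect()
}
```

Because each slot is a bare usize, the table carries no copy of the list values and no unit value per entry, which is the bookkeeping the commit description says it avoids.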
Are these changes tested?
Yes.
Are there any user-facing changes?
No. This is an internal performance optimization only.