bench: add correlated-proxy case to the predicate_eval suite#22919
Open
adriangb wants to merge 1 commit into
The correlation subgroup's existing cases (q70-q72) use two predicates of equal cost and equal selectivity, so the two orders cost the same and correlation only affects the result cardinality - no ordering system can win or lose on them. They measure overhead, not opportunity.

Add q73: a cheap integer predicate that is a perfect proxy for three string regexes, plus one independent regex of the same ~30% selectivity and similar cost. Marginal statistics cannot tell the four regexes apart in any position; their joint distribution with the proxy is what matters. Written in the natural-but-pessimal order (redundant regexes grouped with their proxy), the query runs ~1.9x slower than the hand-optimal order [c0, s2, s1...] on an M-series laptop, so a correlation-aware ordering system has real, measurable headroom here while an independence-assuming one is blind to it.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
adriangb added a commit to pydantic/datafusion that referenced this pull request Jun 12, 2026
…posal

Marginal per-conjunct statistics are blind to correlation: arrangements with very different costs can be statistically identical (the new predicate_eval correlation_q73 case has ~1.9x headroom invisible to any independence-based ranking), and the fused-vs-compact-once strategy difference is invisible to per-conjunct numbers entirely.

Borrow the exploration idea from DuckDB's AdaptiveFilter (keep-or-revert timing of random swaps, src/execution/adaptive_filter.cpp): when a measuring window ends with nothing material to propose, occasionally put a random adjacent swap of the incumbent through the existing shared paired A/B trial instead of freezing. Each position carries a likelihood (halved when a swap there loses its trial, restored to 100 when one wins), so exploration of barren positions decays geometrically on top of the re-thaw backoff.

The candidate bypasses the model gates by design (it exists because the model cannot see it), but adoption still requires the same measured, confidence-separated end-to-end win as any other proposal, which is a stronger keep-or-revert rule than DuckDB's strict mean comparison.

On correlation_q73 (PR apache#22919) this captures 1.28x of the ~1.9x headroom within each 122-batch query (convergence needs two specific adjacent swaps; the rest needs cross-query persistence, cf. DuckDB's multi-file adaptive filter cache, left as future work). Tied micro-queries pay ~4-6% for the exploration trials they decline.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
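The per-position likelihood rule described in this commit message can be sketched in a few lines. This is a minimal illustration, not the DataFusion or DuckDB implementation: `SwapExplorer`, `explore`, and the `trial_wins` callback (a stand-in for the paired A/B trial) are hypothetical names, and the floor of 1 on the halved likelihood is an assumption added so a position never becomes permanently unreachable.

```python
import random


class SwapExplorer:
    """Toy keep-or-revert exploration over adjacent swaps (hypothetical names)."""

    def __init__(self, n_conjuncts, seed=0):
        # One likelihood (in percent) per adjacent pair of conjuncts.
        self.likelihood = [100] * (n_conjuncts - 1)
        self.rng = random.Random(seed)

    def propose(self):
        """Return a swap position to trial, or None to skip this window."""
        pos = self.rng.randrange(len(self.likelihood))
        if self.rng.randrange(100) < self.likelihood[pos]:
            return pos
        return None

    def record(self, pos, won):
        """Restore the likelihood to 100 on a win, halve it on a loss."""
        if won:
            self.likelihood[pos] = 100
        else:
            # Floor of 1 is an assumption of this sketch.
            self.likelihood[pos] = max(1, self.likelihood[pos] // 2)


def explore(order, trial_wins, windows, seed=0):
    """Run `windows` exploration rounds; trial_wins(candidate, incumbent) -> bool."""
    explorer = SwapExplorer(len(order), seed)
    order = list(order)
    for _ in range(windows):
        pos = explorer.propose()
        if pos is None:
            continue
        candidate = order[:]
        candidate[pos], candidate[pos + 1] = candidate[pos + 1], candidate[pos]
        won = trial_wins(candidate, order)
        explorer.record(pos, won)
        if won:
            order = candidate  # keep the swap; otherwise revert to the incumbent
    return order, explorer.likelihood
```

Because a losing position is halved each time it is tried, positions that never pay off are proposed less and less often, while a single win makes that position fully eligible again.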
Which issue does this PR close?
Builds on the `predicate_eval` suite added in #22704 (bench: add predicate_eval SQL micro-benchmark suite for conjunctive filter evaluation). No single issue closed.
Rationale for this change
The `correlation` subgroup's existing cases (q70–q72) use two predicates of equal
cost and equal selectivity. For two conjuncts the evaluation cost of an order is
`cost(first) + selectivity(first) × cost(second)`, which is symmetric here, so
the two orders cost the same and correlation only affects the result cardinality.
These cases measure the overhead of an ordering system, but give it no
opportunity: nothing in the suite rewards (or even detects) correlation-aware
ordering.
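The symmetry argument can be checked with a small sketch of the two-conjunct cost formula. The unit cost and the 0.3 selectivity below are illustrative values, not figures taken from q70–q72, and `order_cost` is a hypothetical helper.

```python
def order_cost(cost_first, sel_first, cost_second):
    """Expected per-row cost of `first AND second` with short-circuit evaluation."""
    return cost_first + sel_first * cost_second


# Equal cost and equal marginal selectivity (illustrative values).
cost_a = cost_b = 1.0
sel_a = sel_b = 0.3

a_then_b = order_cost(cost_a, sel_a, cost_b)
b_then_a = order_cost(cost_b, sel_b, cost_a)

# Correlation only changes P(second | first), which never appears in the
# formula: it affects how many rows come out, not how much work is done.
print(a_then_b, b_then_a)
```

Both orders come out identical, so no ordering decision can win or lose on such a case.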
This adds a case with real, measurable headroom that only joint statistics can
find. A cheap integer predicate (`c0 = 1`, ~30%) is a perfect proxy for three
string regexes on `s1`; a fourth regex on `s2` has the same ~30% selectivity and
similar cost but is independent. Marginally, the four regexes are
indistinguishable in any position. Conditionally, behind the proxy, the three
`s1` regexes keep every survivor while the `s2` regex still discards ~70%.

The query is written in the natural-but-pessimal order (the redundant regexes
grouped with their proxy, the informative one last). On an M-series laptop the
written order runs ~1.9x slower than the hand-optimal order
`[c0, s2-regex, s1-regexes...]` (16.4 ms vs 8.6 ms per iteration), so a
correlation-aware ordering system has real, measurable headroom here, while one
that ranks by marginal statistics (the independence assumption) is blind to the
difference: every ranking of the four regexes looks equivalent.
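A toy model makes the gap between marginal and joint statistics concrete. It is a sketch under stated assumptions, not the benchmark itself: the 1000-row table, the unit costs (1 for the integer check, 20 per regex), and the predicate names `s1_a`, `s1_b`, `s1_c` are all hypothetical, and the three `s1` regexes are modelled as passing exactly when `c0` passes.

```python
N = 1000  # illustrative row count
COST = {"c0": 1, "s1_a": 20, "s1_b": 20, "s1_c": 20, "s2": 20}  # assumed unit costs


def c0(i):
    return i % 10 < 3  # passes 30% of rows


def s2(i):
    return (i // 10) % 10 < 3  # passes 30% of rows, independent of c0


# The three s1 regexes are perfect proxies of c0 in this model.
PRED = {"c0": c0, "s1_a": c0, "s1_b": c0, "s1_c": c0, "s2": s2}


def marginal_selectivity(name):
    """Fraction of all rows that pass the predicate on its own."""
    return sum(PRED[name](i) for i in range(N)) / N


def total_cost(order):
    """Sum of cost x rows evaluated, where each predicate sees only survivors."""
    survivors = list(range(N))
    cost = 0
    for name in order:
        cost += COST[name] * len(survivors)
        survivors = [i for i in survivors if PRED[name](i)]
    return cost


written = total_cost(["c0", "s1_a", "s1_b", "s1_c", "s2"])
optimal = total_cost(["c0", "s2", "s1_a", "s1_b", "s1_c"])
print(written, optimal)
```

Every regex has the same marginal selectivity in this model, yet the written order does roughly twice the work of the optimal one. The exact ratio depends on the assumed costs and is not a reproduction of the measured ~1.9x.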
What changes are included in this PR?
`load/corrproxy.sql`: the correlated-proxy dataset (deterministic, generated
from `generate_series` like the existing datasets; `PRED_ROWS`/`PRED_FILL`
knobs as elsewhere).
`queries/correlation/q73.sql`, `benchmarks/correlation/q73.benchmark`: the new
case, following the suite's existing conventions.
Run with:
    BENCH_NAME=predicate_eval BENCH_SUBGROUP=correlation cargo bench --bench sql
Are these changes tested?
The suite's shared template asserts the query returns rows; the case runs green
locally alongside q70–q72.
Are there any user-facing changes?
No — benchmark-only.
🤖 Generated with Claude Code