Draft: implement non-blocking morsel API #21852
@alamb is this something I could help push forward, if you still think it's the right direction? I'm happy to review, etc. Just going through your open PRs to see what I can help with.
Thank you for the ping. Basically, to be motivated to move this along I need to find some query where it improves performance. I didn't have such evidence, and thus it dropped off my radar. So in other words, I am not sure what to do with this -- I think it basically "realizes the vision" of the morsel-driven API but just makes the code more complicated and has no direct benefit yet 🤔
TODO
Split the poll_scan loop into separate functions to reduce the indent level / control flow
Which issue does this PR close?
Rationale for this change
In #21342 (comment), @adriangb pointed out that the current Morsel API relies on a comment rather than the type system to separate IO and CPU.
Also, it should be pointed out that the current Parquet opener actually does now do IO in the stream reader. This makes overlapping the IO and CPU work harder.
What changes are included in this PR?
Change Morsel::into_stream to return either a Sync or Async stream.
I don't expect this change will have much of an actual impact (yet), but I do expect that it will set us up for better IO interleaving / work stealing.
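To make the idea concrete, here is a rough, std-only sketch of what "either a Sync or Async stream" could look like when encoded in the type system. It is not the code in this PR: `MorselStream`, `PollBatches`, `FakeIo`, `needs_io`, `drain`, and the `Batch` alias are all hypothetical names invented for illustration, and the `Morsel` struct here is a toy stand-in for the real one.

```rust
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Toy stand-in for a record batch (assumption: real code would use Arrow's RecordBatch).
type Batch = Vec<i32>;

/// Hypothetical minimal async-stream trait, standing in for something like `futures::Stream`.
trait PollBatches {
    fn poll_next(&mut self, cx: &mut Context<'_>) -> Poll<Option<Batch>>;
}

/// Hypothetical return type: the variant tells the caller whether IO may be involved.
enum MorselStream {
    /// CPU-only work: never returns `Pending`, so it can run to completion on a CPU thread.
    Sync(Box<dyn Iterator<Item = Batch>>),
    /// May return `Pending` while waiting on IO, so a scheduler can interleave other work.
    Async(Box<dyn PollBatches>),
}

/// Toy morsel (hypothetical fields, not the real struct).
struct Morsel {
    batches: Vec<Batch>,
    needs_io: bool,
}

impl Morsel {
    fn into_stream(self) -> MorselStream {
        if self.needs_io {
            MorselStream::Async(Box::new(FakeIo { io_done: false, inner: self.batches.into_iter() }))
        } else {
            MorselStream::Sync(Box::new(self.batches.into_iter()))
        }
    }
}

/// Simulates one round of outstanding IO before any batch becomes available.
struct FakeIo {
    io_done: bool,
    inner: std::vec::IntoIter<Batch>,
}

impl PollBatches for FakeIo {
    fn poll_next(&mut self, cx: &mut Context<'_>) -> Poll<Option<Batch>> {
        if !self.io_done {
            self.io_done = true;
            cx.waker().wake_by_ref(); // ask to be polled again
            return Poll::Pending;
        }
        Poll::Ready(self.inner.next())
    }
}

/// Waker that does nothing; enough for the busy-polling demo below.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Drains a stream and returns (total rows, number of `Pending` polls observed).
fn drain(stream: MorselStream) -> (usize, usize) {
    match stream {
        MorselStream::Sync(iter) => (iter.map(|b| b.len()).sum::<usize>(), 0),
        MorselStream::Async(mut s) => {
            let waker = Waker::from(Arc::new(NoopWake));
            let mut cx = Context::from_waker(&waker);
            let (mut rows, mut pending) = (0, 0);
            loop {
                match s.poll_next(&mut cx) {
                    Poll::Ready(Some(b)) => rows += b.len(),
                    Poll::Ready(None) => break,
                    // A real scheduler would switch to another morsel here instead of spinning.
                    Poll::Pending => pending += 1,
                }
            }
            (rows, pending)
        }
    }
}
```

In this sketch the caller learns from the variant alone whether a morsel can block on IO, which is the property a comment-only contract cannot enforce. The `Pending` arm in `drain` is where work stealing or IO interleaving would hook in.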
Are these changes tested?
Yes, by CI and new tests (to be written).
Are there any user-facing changes?
The unreleased Morsel API is slightly different.