(improvement) (Cython only) deserializers: use direct PyUnicode_DecodeUTF8/ASCII fr… (tens hundreds of us improvement, x2.5-x3 speedup) by mykaul · Pull Request #743 · scylladb/python-driver

mykaul · 2026-03-13T11:48:37Z

(improvement) (Cython only) deserializers: use direct PyUnicode_DecodeUTF8/ASCII from C buffer pointer

Replace the two-step to_bytes(buf).decode('utf8') pattern in DesUTF8Type and DesAsciiType with direct CPython C API calls (PyUnicode_DecodeUTF8 and PyUnicode_DecodeASCII). This eliminates an intermediate bytes object allocation per text cell — the old code created a Python bytes object from the C buffer pointer via to_bytes(buf), then immediately decoded it to str and discarded the bytes.

Text (UTF8Type/VarcharType) is the most common CQL column type, so this optimization applies to the majority of cells in typical workloads.

Benchmark results

Measured on the same machine, back-to-back (Python 3.14.3, Cython 3.2.4, GCC 15.2.1, --benchmark-disable-gc). Median times reported.

Scenario	Before (master)	After (this PR)	Speedup
UTF8 1row x 1col short (11B)	1,158 ns	454 ns	2.55x
UTF8 1row x 10col short	3,262 ns	1,505 ns	2.17x
UTF8 100rows x 5col medium (46B)	129,349 ns	36,270 ns	3.57x
UTF8 1000rows x 5col medium	1,131,159 ns	322,309 ns	3.51x
UTF8 100rows x 5col long (200B)	121,866 ns	40,662 ns	3.00x
UTF8 100rows x 5col multibyte	275,148 ns	113,751 ns	2.42x
ASCII 100rows x 5col medium	86,118 ns	36,401 ns	2.37x
ASCII 1000rows x 5col medium	832,972 ns	348,660 ns	2.39x
Mixed 100rows 3text+2int	82,798 ns	27,792 ns	2.98x

All existing unit tests pass (15 Cython tests, 9 deserializer-specific including invalid-input error propagation tests).

Review comments addressed

All 3 Copilot review comments have been addressed:

Benchmark file renamed from test_utf8_decode_benchmark.py to utf8_decode_benchmark.py (no test_ prefix, won't be auto-collected) and pytest.importorskip guards added.
Correctness tests moved to tests/unit/cython/test_deserializers.py (decoupled from benchmark dependency).
cythontest decorator adopted instead of hand-rolled try/except ImportError, consistent with the rest of tests/unit/cython/ and respecting VERIFY_CYTHON=True CI mode.

Pre-review checklist

I have split my patch into logically separate commits.
All commit messages clearly explain what they change and why.
I added relevant tests for new features and bug fixes.
All commits compile, pass static checks and pass test.
PR description sums up the changes and reasons why they should be introduced.
I have provided docstrings for the public items that I want to introduce.
I have adjusted the documentation in ./docs/source/.
I added appropriate Fixes: annotations to PR description.

Copilot

Pull request overview

Optimizes text deserialization performance in the Cython row parser by decoding UTF-8/ASCII directly from the underlying buffer, and adds a benchmark module intended to measure the improvement.

Changes:

Replace to_bytes(buf).decode(...) with PyUnicode_DecodeASCII/UTF8(buf.ptr, buf.size, NULL) in Cython deserializers.
Add a new pytest-based benchmark suite for UTF-8/ASCII decode and pipeline parsing.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
`cassandra/deserializers.pyx`	Uses CPython unicode decode APIs to avoid intermediate `bytes` allocations during ASCII/UTF-8 deserialization.
`benchmarks/test_utf8_decode_benchmark.py`	Adds end-to-end and microbenchmarks plus correctness checks for the new decode approach.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot

Pull request overview

This PR optimizes Cython-based UTF-8 and ASCII text deserialization by decoding directly from the underlying C buffer (avoiding an intermediate bytes allocation), and adds correctness tests plus a benchmark to validate/measure the change.

Changes:

Switch DesUTF8Type/DesAsciiType to use PyUnicode_DecodeUTF8 / PyUnicode_DecodeASCII on Buffer.ptr/Buffer.size.
Add unit tests covering common and edge-case text decode scenarios (empty, ASCII, multibyte, long strings, NULLs, multiple rows).
Add a pytest-benchmark benchmark file to compare end-to-end Cython row parsing before/after the optimization.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File	Description
`cassandra/deserializers.pyx`	Replace `to_bytes(...).decode(...)` with direct CPython Unicode decoding calls for UTF-8/ASCII deserializers.
`tests/unit/cython/test_deserializers.py`	New correctness-focused unit tests for the optimized Cython decoding path.
`benchmarks/utf8_decode_benchmark.py`	New benchmark suite to measure the performance impact of the UTF-8/ASCII decode optimization.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…om C buffer pointer Replace the two-step to_bytes(buf).decode('utf8') pattern in DesUTF8Type and DesAsciiType with direct CPython C API calls (PyUnicode_DecodeUTF8 and PyUnicode_DecodeASCII). This eliminates an intermediate bytes object allocation per text cell — the old code created a Python bytes object from the C buffer pointer via to_bytes(buf), then immediately decoded it to str and discarded the bytes. Text (UTF8Type/VarcharType) is the most common CQL column type, so this optimization applies to the majority of cells in typical workloads. Benchmark results (Cython row parsing pipeline, median times): | Scenario | Before (original) | After (direct decode) | Speedup | |---------------------------------|-------------------:|----------------------:|--------:| | UTF8 1row x 1col short (11B) | 565 ns | 454 ns | 1.24x | | UTF8 1row x 10col short | 1,594 ns | 1,023 ns | 1.56x | | UTF8 100rows x 5col medium | 61,396 ns | 28,766 ns | 2.13x | | UTF8 1000rows x 5col medium | 547,145 ns | 290,361 ns | 1.88x | | UTF8 100rows x 5col long(200B) | 57,940 ns | 35,680 ns | 1.62x | | UTF8 100rows x 5col multibyte | 125,149 ns | 103,370 ns | 1.21x | | ASCII 100rows x 5col medium | 41,608 ns | 35,817 ns | 1.16x | | ASCII 1000rows x 5col medium | 416,350 ns | 374,341 ns | 1.11x | | Mixed 100rows 3text+2int | 44,646 ns | 31,189 ns | 1.43x | All existing unit tests pass (62 type tests, 116 total across key suites).

…uards, add invalid-input tests - Replace hand-rolled try/except ImportError with the project-standard cythontest decorator and HAVE_CYTHON conditional imports, so VERIFY_CYTHON=True CI mode fails loudly instead of silently skipping. - Add pytest.importorskip guards to the benchmark file so it skips gracefully when pytest-benchmark or Cython extensions are missing. - Add test_utf8_invalid_bytes and test_ascii_invalid_bytes to confirm error propagation through the DriverException wrapper.

mykaul marked this pull request as draft March 13, 2026 11:48

mykaul changed the title ~~(improvement) deserializers: use direct PyUnicode_DecodeUTF8/ASCII fr…~~ (improvement) (Cython only) deserializers: use direct PyUnicode_DecodeUTF8/ASCII fr… Mar 13, 2026

mykaul requested a review from Copilot March 14, 2026 09:24

Copilot started reviewing on behalf of mykaul March 14, 2026 09:25 View session

Copilot AI reviewed Mar 14, 2026

View reviewed changes

Comment thread benchmarks/test_utf8_decode_benchmark.py Outdated

Comment thread benchmarks/test_utf8_decode_benchmark.py Outdated

This was referenced Mar 14, 2026

Tracking: Vector search (VectorType) performance improvement PRs #746

Open

Tracking: General (non-vector) performance improvement PRs #747

Open

mykaul force-pushed the perf/direct-utf8-decode branch from cbb8652 to 97eb4e8 Compare March 15, 2026 15:56

mykaul requested a review from Copilot March 16, 2026 19:03

Copilot started reviewing on behalf of mykaul March 16, 2026 19:03 View session

Copilot AI reviewed Mar 16, 2026

View reviewed changes

Comment thread tests/unit/cython/test_deserializers.py Outdated

mykaul force-pushed the perf/direct-utf8-decode branch from 97eb4e8 to a9815c1 Compare April 2, 2026 14:30

mykaul added 2 commits April 2, 2026 20:07

mykaul force-pushed the perf/direct-utf8-decode branch from a9815c1 to f0ce46c Compare April 2, 2026 17:07

mykaul changed the title ~~(improvement) (Cython only) deserializers: use direct PyUnicode_DecodeUTF8/ASCII fr…~~ (improvement) (Cython only) deserializers: use direct PyUnicode_DecodeUTF8/ASCII fr… (hundreds to tens of thousands of ns improvement, x2.5-x3 speedup) Apr 7, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

(improvement) (Cython only) deserializers: use direct PyUnicode_DecodeUTF8/ASCII fr… (tens hundreds of us improvement, x2.5-x3 speedup)#743

(improvement) (Cython only) deserializers: use direct PyUnicode_DecodeUTF8/ASCII fr… (tens hundreds of us improvement, x2.5-x3 speedup)#743
mykaul wants to merge 2 commits into
scylladb:masterfrom
mykaul:perf/direct-utf8-decode

mykaul commented Mar 13, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

mykaul commented Mar 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

(improvement) (Cython only) deserializers: use direct PyUnicode_DecodeUTF8/ASCII from C buffer pointer

Benchmark results

Review comments addressed

Pre-review checklist

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mykaul commented Mar 13, 2026 •

edited

Loading