* feat: expose arrow_field, arrow_try_cast, cast_to_type, with_metadata
Adds Python bindings for five scalar functions from
datafusion::functions::expr_fn that were not previously surfaced:
- arrow_field: returns a struct describing an expression's Arrow field
(name, data_type, nullable, metadata).
- arrow_try_cast: like arrow_cast but yields NULL on cast failure.
- cast_to_type / try_cast_to_type: casts a value to the type of a
reference expression. These are exposed as a single Python entry
point cast_to_type(value, type_ref, *, try_cast=False); the kwarg
switches between the strict and try variants.
- with_metadata: attaches Arrow field metadata; the inverse of
arrow_metadata. Accepts a dict[str, str] for ergonomics.
Updates skills/datafusion_python/SKILL.md to list the new functions
and document the cast_to_type kwarg behavior.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: collapse try_cast_to_type into cast_to_type kwarg
The previous commit exposed cast_to_type and try_cast_to_type as two
separate pyo3 bindings and unified them in the Python wrapper via a
try_cast kwarg. That left try_cast_to_type in datafusion._internal
without a matching public Python name, breaking
test_datafusion_missing_exports.
Move the dispatch into the Rust binding: cast_to_type now takes a
try_cast kwarg and selects between functions::expr_fn::cast_to_type
and try_cast_to_type internally. Only one pyo3 binding is registered,
so the wrapper-coverage check passes and the Python entrypoint is
unchanged.
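
The kwarg dispatch described above can be modelled in plain Python. This is only a sketch of the control flow, not the Rust binding itself: `cast_strict` and `cast_try` are hypothetical stand-ins for the two upstream functions, Python builtin types stand in for Arrow types, and `None` stands in for NULL.

```python
def cast_strict(value, target_type):
    """Stand-in for the strict variant: conversion errors propagate."""
    return target_type(value)


def cast_try(value, target_type):
    """Stand-in for the try variant: a failed conversion yields None."""
    try:
        return target_type(value)
    except (TypeError, ValueError):
        return None


def cast_to_type(value, type_ref, *, try_cast=False):
    """Single entry point; the kwarg selects which variant runs.

    The target type is taken from the reference value, mirroring
    "cast to the type of a reference expression".
    """
    target_type = type(type_ref)
    impl = cast_try if try_cast else cast_strict
    return impl(value, target_type)
```

With this model, `cast_to_type("abc", 0)` raises while `cast_to_type("abc", 0, try_cast=True)` returns `None`, which is the strict/try split the kwarg is meant to control.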
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat: accept pyarrow DataType in arrow_try_cast
Mirrors arrow_cast: arrow_try_cast now accepts `pa.DataType` in addition
to `str` and `Expr`. Adds `Expr.try_cast(pa.DataType)` PyO3 binding for
the pyarrow-type routing path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: guard with_metadata against empty dict and empty keys
An empty `metadata` dict now returns the input expression unchanged
(previously it surfaced an opaque DataFusion error about minimum arg
count). Empty keys raise `ValueError` to match the docstring contract.
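
A minimal sketch of the guard logic, assuming the wrapper flattens the dict into alternating key/value arguments before calling the underlying function. The helper name `prepare_metadata_args`, the flattening step, and the error message are illustrative assumptions, not the actual wrapper code.

```python
def prepare_metadata_args(metadata):
    """Hypothetical helper that validates a dict[str, str] of metadata.

    Returns None for an empty dict, signalling that the caller should
    return the input expression unchanged. Raises ValueError on an
    empty key. Otherwise returns a flat [key1, value1, key2, ...] list
    (the flattening is an assumption about the underlying call shape).
    """
    if not metadata:
        return None
    flat = []
    for key, value in metadata.items():
        if key == "":
            raise ValueError("metadata keys must be non-empty strings")
        flat.extend([key, value])
    return flat
```

Handling the empty dict before any call is made is what keeps the minimum-argument error from reaching the user.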
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: assert full struct shape in arrow_field doctest
Previous doctest set metadata on the input field but only checked the
name — the metadata setup was dead. Now the example asserts the full
returned struct (name, data_type, nullable, metadata) so the demo
shows what the function actually produces.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test: add unit tests for arrow_try_cast, arrow_field, cast_to_type, with_metadata
Mirrors the existing test_arrow_cast pattern. Covers:
- arrow_try_cast: string-syntax, pa.DataType, and null-on-failure paths
- arrow_field: full returned struct shape (name, data_type, nullable, metadata)
- cast_to_type: type-from-expr happy path and try_cast=True null behavior
- with_metadata: round-trip through arrow_metadata, empty-dict no-op, and
empty-key ValueError
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test: parameterize arrow cast / try_cast tests
Folds the previous four cast tests (arrow_cast + arrow_try_cast × str
+ pyarrow target type) into a single parameterized test that runs both
functions across all five target-type variants. Collapses the two
cast_to_type tests (happy path + try_cast=True) into one parameterized
test, and parameterizes arrow_try_cast null-on-failure over both
target-type syntaxes. 7 test functions, 19 cases — net less code, same
coverage.
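
As a rough illustration of how the combined cast test gets its cases, the cross product of two functions and five target-type variants can be enumerated as below. The variant labels are placeholders, since the five real variants are not listed in this message, and the actual tests presumably use the test framework's own parameterization rather than this helper.

```python
import itertools

# The two functions exercised by the combined cast test.
FUNCTIONS = ["arrow_cast", "arrow_try_cast"]

# Placeholder labels only; the real target-type variants are not named here.
TARGET_VARIANTS = [f"variant_{i}" for i in range(1, 6)]


def build_cases():
    """Return every (function, target variant) pair as one test case."""
    return list(itertools.product(FUNCTIONS, TARGET_VARIANTS))


cases = build_cases()
```

This accounts for the cases of that one test only (2 functions times 5 variants); the remaining cases in the total come from the other parameterized tests.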
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: point cast_to_type at arrow_cast for static target types
Adds a one-line cross-reference so users with a known target type
reach for arrow_cast / arrow_try_cast instead of building a sentinel
expression to feed cast_to_type.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: split cast_to_type into cast_to_type and try_cast_to_type
Replace the try_cast bool flag with separate cast_to_type and
try_cast_to_type functions, matching upstream DataFusion and the
arrow_cast / arrow_try_cast pair. Also drop the redundant data_type
parametrization on test_arrow_try_cast_null_on_failure, since the
str-vs-pyarrow distinction is already covered by test_arrow_cast_variants.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>