
chore(canonical-formats): polish bundle (B+C+D+E+F) #849

Merged
bokelley merged 1 commit into main from bokelley/canonical-formats-polish
May 24, 2026
@bokelley

Summary

Bundles the five deferred follow-ups from #845's expert review. No behavioral surprise — additive public API, internal rename with re-exports, and one fail-loud → log-and-skip flip.

What's in the bundle

| Item | Change |
| --- | --- |
| B | projection.py → v2_to_v1.py for symmetry with v1_to_v2.py. git mv preserves history; package re-exports unchanged. |
| C | Typed Divergence dataclass (field / kind / cap / value + DivergenceKind Literal). check_narrows returns list[Divergence]. Divergence.to_dict() preserves the wire-shape details.divergences key vocabulary. |
| D | New adcp.canonical_formats.fixtures public module: adopters reuse the 14 v2 + 50 v1 vendored fixtures via load_reference_product(name) / load_v1_reference_catalog() / REFERENCE_PRODUCT_NAMES without re-vendoring upstream. |
| E | group_declarations_by_product(decls, mapping): buckets the output of project_v1_catalog_to_v2 into per-product format_options[] lists. |
| F | _versions_overlap log-and-skip on unknown DSL operator prefixes (~>, ^) instead of raise. Registry-side forward-compat posture. |

Public-API additions

3 new names on adcp.canonical_formats:

  • Divergence (dataclass)
  • DivergenceKind (Literal)
  • group_declarations_by_product (function)

Plus the new submodule adcp.canonical_formats.fixtures (reached in explicitly, not auto-re-exported on the package surface).

Breaking surface (very narrow)

  • check_narrows return type: list[dict[str, Any]] → list[Divergence]. The wire-shape projection Divergence.to_dict() preserves the original key vocabulary; only adopters who called check_narrows and indexed the raw dict need to switch to attribute access.
  • _versions_overlap no longer raises ValueError on unknown operators — it logs and returns False. This was internal but documented as a public matcher in the registry module.

Adopters who reached into the private path from adcp.canonical_formats.projection import ... must switch to from adcp.canonical_formats.v2_to_v1 import .... The documented public path (from adcp.canonical_formats import project_product_to_v1) is unchanged.
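To make the check_narrows migration concrete, here is a minimal, self-contained sketch of what a typed divergence with a wire-shape projection might look like. It is not the SDK's implementation: the four kind strings, the kind-to-key mapping, and the duration_ms field name are all hypothetical, since the PR names only the attributes (field / kind / cap / value) and the wire keys (v1_max / v1_min / v1_allowed / v1_value, paired with v2_value / v2_declared). The sketch emits only v2_value for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Literal

# Hypothetical kind names; the PR only states that the Literal has four arms.
DivergenceKind = Literal["exceeds_max", "below_min", "not_allowed", "value_mismatch"]

# Hypothetical mapping from kind to the v1-side wire key named in the PR.
_V1_KEY: dict[str, str] = {
    "exceeds_max": "v1_max",
    "below_min": "v1_min",
    "not_allowed": "v1_allowed",
    "value_mismatch": "v1_value",
}


@dataclass(frozen=True)
class Divergence:
    field: str            # name of the diverging field
    kind: DivergenceKind  # which of the four divergence arms applies
    cap: Any              # v1-side limit, allowed set, or expected value
    value: Any            # v2-side value that diverged

    def to_dict(self) -> dict[str, Any]:
        # Project back to a dict that keeps the pre-existing key vocabulary.
        return {
            "field": self.field,
            "kind": self.kind,
            _V1_KEY[self.kind]: self.cap,
            "v2_value": self.value,
        }


d = Divergence(field="duration_ms", kind="exceeds_max", cap=30000, value=45000)
print(d.field)      # attribute access replaces d["field"]
print(d.to_dict())  # dict shape for code that still wants the wire keys
```

Under these assumptions, callers that previously indexed the raw dict either switch to attribute access or call to_dict() at the boundary where the old shape is still expected.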

Test plan

  • ruff check src/ — passed
  • mypy src/adcp/ — passed (901 source files)
  • pytest tests/ — 5271 passed, 38 skipped, 1 xfailed
  • 11 new tests (8 fixture-loader + 3 group helper)
  • Existing narrowing + registry + projection tests updated for typed Divergence + forward-compat tone

Polish backlog: complete

After this lands, the #741 polish backlog from PR #847's description is fully addressed.

Refs: #741, #845

…ixtures module, group helper, version fwd-compat

Bundles the deferred follow-ups from #845's expert review. Five small
changes; no behavioral surprise.

## B — projection.py → v2_to_v1.py

Symmetric with v1_to_v2.py. ``git mv`` preserves history. Imports
continue to work via the package re-export. Adopters who reached
into the private path ``from adcp.canonical_formats.projection``
must switch to ``from adcp.canonical_formats.v2_to_v1``.

## C — Typed Divergence dataclass

``check_narrows`` returns ``list[Divergence]`` instead of
``list[dict[str, Any]]``. ``Divergence.to_dict()`` preserves the
original advisory ``details.divergences`` key vocabulary
(``v1_max`` / ``v1_min`` / ``v1_allowed`` / ``v1_value``) so wire
parsers aren't affected. Adopters calling ``check_narrows`` and
indexing ``d["field"]`` must switch to ``d.field``.

## D — adcp.canonical_formats.fixtures public module

14 v2 + 50 v1 fixtures bundled under
``src/adcp/canonical_formats/_fixtures/`` and exposed via
``load_reference_product(name)``,
``load_v1_reference_catalog()``, ``REFERENCE_PRODUCT_NAMES``.
Adopters reuse without re-vendoring.

## E — group_declarations_by_product

After ``project_v1_catalog_to_v2``, buckets declarations into per-
product ``format_options[]`` lists given a
``{product_id: [v1_format_id, ...]}`` mapping. Bucket key is the
first ``v1_format_ref`` (matches ``find_declaration_by_v1_format_id``
semantics).
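
The bucketing described above can be sketched in a few lines. This is an illustration under assumptions, not the helper's actual code: declarations are modelled as plain dicts with a ``v1_format_ref`` list of strings, the function name ``group_by_product_sketch`` is hypothetical, and the first-product-wins rule is assumed to apply when two products list the same v1 format id.

```python
from __future__ import annotations

from typing import Any


def group_by_product_sketch(
    decls: list[dict[str, Any]],
    mapping: dict[str, list[str]],
) -> dict[str, list[dict[str, Any]]]:
    # Invert the mapping: v1 format id -> owning product.
    # setdefault keeps the first product that claims an id (assumed semantics).
    owner: dict[str, str] = {}
    for product_id, format_ids in mapping.items():
        for format_id in format_ids:
            owner.setdefault(format_id, product_id)

    grouped: dict[str, list[dict[str, Any]]] = {pid: [] for pid in mapping}
    for decl in decls:
        refs = decl.get("v1_format_ref") or []
        if not refs:
            continue  # nothing to key on
        # The bucket key is the first ref only; later refs are ignored.
        product_id = owner.get(refs[0])
        if product_id is not None:
            grouped[product_id].append(decl)
    return grouped


# Hypothetical ids and format kinds, for illustration only.
decls = [
    {"format_kind": "image", "v1_format_ref": ["display_300x250"]},
    {"format_kind": "video", "v1_format_ref": ["video_15s"]},
]
mapping = {
    "prod_a": ["display_300x250"],
    "prod_b": ["video_15s", "display_300x250"],  # collides with prod_a
}
grouped = group_by_product_sketch(decls, mapping)
print(grouped)
```

Because only the first ref is consulted, a hand-authored declaration with several refs would land in at most one bucket in this sketch.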

## F — _versions_overlap forward-compat

Unknown DSL operator prefixes (``~>``, ``^``) log WARNING and treat
as non-matching, rather than raising. Registry MAY publish operators
ahead of SDK support; crashing cached sessions is worse than missing
a match.
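
A rough sketch of the log-and-skip posture follows. The real version DSL is not shown in this PR, so the operator set (``>=``, ``<=``, ``==``), the spec grammar, and the function name ``spec_matches`` are assumptions made for illustration. Malformed specs are left strict here and ``.x`` wildcards are not modelled.

```python
import logging
import operator
import re

logger = logging.getLogger("canonical_formats_sketch")

# Hypothetical set of operators this sketch understands.
_KNOWN_OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq}

# Optional non-digit operator prefix followed by a dotted numeric version.
_SPEC = re.compile(r"^(?P<op>[^0-9]*)(?P<ver>\d+(?:\.\d+)*)$")


def _parse(version: str) -> tuple[int, ...]:
    return tuple(int(part) for part in version.split("."))


def spec_matches(spec: str, version: str) -> bool:
    m = _SPEC.match(spec.strip())
    if m is None:
        # Malformed input stays strict in this sketch.
        raise ValueError(f"unparseable version spec: {spec!r}")
    op = m.group("op").strip() or "=="
    compare = _KNOWN_OPS.get(op)
    if compare is None:
        # Forward-compat: an operator published ahead of SDK support is
        # logged and treated as non-matching rather than raising.
        logger.warning("unknown version operator %r in %r; skipping", op, spec)
        return False
    return compare(_parse(version), _parse(m.group("ver")))


print(spec_matches(">=3.0", "3.1"))
print(spec_matches("~>3.0", "3.1"))
```

The trade-off is the one the section states: an unrecognised operator costs a missed match instead of an exception in a long-lived session.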

## Tests

5271 pass. New ``test_canonical_formats_fixtures.py`` covers the
public loader. ``test_canonical_formats_v1_to_v2.py`` extended for
``group_declarations_by_product``. Narrowing + registry tests
updated for typed Divergence and forward-compat tone.

Refs: #741, #845

@aao-ipr-bot (bot) left a comment

LGTM. Follow-ups noted below. Wire-shape vocabulary is preserved character-identical against #845, the polish surface is narrow, and the bundle is well-tested.

Things I checked

  • Divergence.to_dict() at src/adcp/canonical_formats/narrowing.py:90-109 emits the exact details.divergences key set (v1_max / v1_min / v1_allowed / v1_value paired with v2_value / v2_declared) that #845 already shipped on the wire. The four-arm closed DivergenceKind literal matches the four kind strings the half-2 advisory was already emitting. Adopters indexing the advisory dict are not affected; only adopters who called check_narrows directly need to switch to attribute access.
  • 14 v2 product fixtures conform to schemas/cache/3.1.0-beta.3/core/product.json. Spot-checked chatgpt_brand_mention (agent_placement + sponsored_intelligence), google_performance_max (responsive_creative), meta_carousel (image_carousel) — all format_kind values are in the published canonical-format-kind.json enum, all carry valid pricing_options[].pricing_model, delivery_type, and reporting_capabilities.
  • _versions_overlap log-and-skip on unknown operator prefixes is the right protocol posture — it mirrors the open-on-consumer pattern already established by canonical-format-kind.json's normative "Consumer SDKs MUST treat this enum as open at parse time."
  • group_declarations_by_product first-ref grouping is sound for the documented input contract: project_v1_format_to_declaration constructs declarations with single-entry v1_format_ref at v1_to_v2.py:219-223, so the "multi-size fan-out" docstring caveat is forward-looking only.
  • pyproject.toml package-data inclusion of canonical_formats/_fixtures/*.json is wired correctly so the fixtures ship in the wheel.
  • projection.py → v2_to_v1.py is a pure git mv; no in-repo importer reaches into the old private path.
  • Test plan checked: 5271 pass, 11 new tests, wire-shape preservation pinned at tests/test_canonical_formats_narrowing.py:64-69, 115-116.

Follow-ups (non-blocking — file as issues)

  1. .x vs operator-prefix asymmetry in _versions_overlap. src/adcp/canonical_formats/registry.py:213-220 still raises ValueError on an unparseable .x major, while the operator-prefix branch at :234-242 log-and-skips. The forward-compat doc at :195-203 reads as if the whole DSL log-and-skips. A registry-side "foo.x" typo crashes the matcher; a "~>foo" typo does not. Either flip the .x branch to symmetric log+continue or sharpen the docstring to call the .x malformed path out as intentionally strict. (code-reviewer flag.)
  2. _UnknownFixtureError is underscore-prefixed but raised by the public load_reference_product. src/adcp/canonical_formats/fixtures.py:55-58. Adopters can only catch via the ValueError parent, which defeats the typed-exception value. Either rename to UnknownFixtureError and add to __all__, or document that the contract is catching ValueError.
  3. group_declarations_by_product collision semantics are documented but untested. src/adcp/canonical_formats/v1_to_v2.py:470-473 — the first-product-wins setdefault is the right call, but tests/test_canonical_formats_v1_to_v2.py:222-305 never pins it. Add a collision test so future refactors can't silently flip the semantic.
  4. Commit semver signal. chore(canonical-formats): for a PR that adds three public exports (Divergence, DivergenceKind, group_declarations_by_product) and flips the return type of check_narrows is the wrong release-please signal — strict reading is feat!: with a BREAKING CHANGE: footer. The mitigation is real (namespace landed days ago in the same 6.1 beta cycle, documented Experimental, MIGRATION_v5_to_v6.md is updated to enumerate the breaking surface) which is why this isn't blocking, but a notable choice for the next polish bundle to get right.
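
For follow-up 2, one possible shape of the rename option is sketched below. The in-memory registry stands in for the bundled JSON files, and load_fixture_sketch and the exception's constructor signature are hypothetical; only the idea of a public exception that subclasses ValueError comes from the review.

```python
from __future__ import annotations

from typing import Any


class UnknownFixtureError(ValueError):
    """Public, typed error; still catchable as ValueError."""

    def __init__(self, name: str, known: list[str]) -> None:
        super().__init__(f"unknown fixture {name!r}; known: {', '.join(known)}")
        self.name = name


# Stand-in for the JSON files shipped in the wheel (contents are illustrative).
_FIXTURES: dict[str, dict[str, Any]] = {
    "meta_carousel": {"format_kind": "image_carousel"},
}


def load_fixture_sketch(name: str) -> dict[str, Any]:
    try:
        # Return a copy so callers cannot mutate the shared registry.
        return dict(_FIXTURES[name])
    except KeyError:
        raise UnknownFixtureError(name, sorted(_FIXTURES)) from None


try:
    load_fixture_sketch("does_not_exist")
except UnknownFixtureError as exc:
    print(f"typed catch: {exc.name}")
```

Subclassing ValueError keeps existing `except ValueError` handlers working while letting new code catch the narrower type.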

Minor nits (non-blocking)

  1. group_declarations_by_product docstring forward-looking caveat. src/adcp/canonical_formats/v1_to_v2.py:480-483 explains the refs[0] choice in terms of "multi-size fan-out," but the catalog-projection codepath produces single-ref declarations by construction. One-line precondition note ("input declarations from project_v1_catalog_to_v2 carry exactly one ref") would prevent a future adopter from passing in a hand-authored multi-ref and silently losing product membership for non-first refs.

Approved.

@bokelley bokelley merged commit 1001849 into main May 24, 2026
23 checks passed
@bokelley bokelley deleted the bokelley/canonical-formats-polish branch May 24, 2026 14:22