Skip to content

feat(schemas): patches/ post-process infra + restore #753/#792 hand-edits#795

Merged
bokelley merged 2 commits into
mainfrom
bokelley/schemas-patches-infra
May 22, 2026
Merged

feat(schemas): patches/ post-process infra + restore #753/#792 hand-edits#795
bokelley merged 2 commits into
mainfrom
bokelley/schemas-patches-infra

Conversation

@bokelley
Copy link
Copy Markdown
Contributor

Summary

Two commits introducing a tracked-patches layer on top of the regenerated schema cache, and restoring three prior hand-edits that PR #791's regen silently dropped (#753's two and #792's one).

Commit What
153b7c49 schemas/patches/ infrastructure — directory + convention + sync_schemas.py post-process + 6 tests
f0ff82bc 3 restoration patches + Pydantic regen with patches applied

Why this exists

PR #791 bumped ADCP_VERSION 3.0.7 → 3.0.12 and the regen silently overwrote PR #753's two hand-patched surfaces (revoked_publisher_domains[] on adagents.json, publisher_domains[] compact form on publisher-property-selector.json). The SDK's dict-layer helpers kept implementing those contracts but the Pydantic-model layer lost the fields.

The same regen would have wiped PR #792's manifest.json hand-edit (signal_owned discovery-only correction) if #792 had landed before #791. Two near-miss data points on one anti-pattern.

This PR fixes the anti-pattern: hand-edits on the regenerated cache must be tracked as named, reviewable .patch files with an audit-trail header. The post-process step applies them after a clean upstream extraction and fails loudly when a patch is dead (upstream landed it) or broken (upstream restructured the target).

How patches lifecycle

Per-patch state machine:

  • Alive — forward-applies cleanly → apply, continue.
  • Dead — reverse-applies cleanly (upstream landed it) → fail with the directive to delete the file.
  • Broken — neither direction applies → fail with the directive to update the hunks or remove.

Patches apply once at the end of main() (after primary + preview bundles extract), not per-bundle, so a 3.0 patch doesn't misclassify as "dead" during the 3.1 preview pass.

What's in this PR

schemas/patches/
├── README.md                                          # convention doc
├── 01-adagents-revoked-publisher-domains.patch
├── 02-publisher-property-selector-publisher-domains.patch
└── 03-manifest-signal-owned-discovery-only.patch

scripts/sync_schemas.py
  + apply_tracked_patches() + _classify_patch() + _patch_dry_run() + _apply_patch()
  + wired into main() after preview-bundle loop

tests/test_sync_schemas.py
  + TestApplyTrackedPatches (6 tests: empty/missing-dir, alive/dead/broken/lex-order)

schemas/cache/3.0/{adagents.json, core/publisher-property-selector.json, manifest.json}
  cache state after patches apply (kept in the tree so adopters not running regen see a functional cache)

src/adcp/types/{_generated.py, generated_poc/adagents.py, generated_poc/core/publisher_property_selector.py}
  Pydantic regen picks up RevokedPublisherDomain + PublisherDomain types from the patched cache

SCHEMA_DELTAS.md
  reports "no field-shape changes detected" — the post-patch state matches the pre-#791 committed types

Why these three patches

File Reason Restores from Drop when
01-adagents-revoked-publisher-domains.patch revoked_publisher_domains[] enables forward-looking dict-layer contract PR #753 SDK pins to 3.1.x and upstream 3.1 ships the same shape
02-publisher-property-selector-publisher-domains.patch publisher_domains[] compact form, same rationale PR #753 SDK pins to 3.1.x and upstream 3.1 ships the same shape
03-manifest-signal-owned-discovery-only.patch signal_owned is discovery-only by design; upstream manifest is stale PR #792 upstream manifest catches up

Tests

  • 6 new unit tests for the patch state machine.
  • Full suite: pytest tests/ — 5026 passed, 30 skipped, 1 xfailed.
  • ruff + mypy clean.
  • Manual end-to-end: make regenerate-schemas produces a cache identical to what's checked in (no drift), confirming the patches apply cleanly and the Pydantic regen picks them up.

Anti-patterns this prevents

  • Silent overwrites on regen. Now every hand-edit is a tracked .patch that either applies (success) or fails loudly (operator-actionable).
  • Invisible hand-rolls. ls schemas/patches/*.patch and the headers tell you exactly what's been added on top of upstream and why.
  • Dead patches lingering forever. When upstream lands a patched surface, the next regen fails with "Patch X is dead — delete the file with a rationale," forcing the cleanup to be intentional.

Test plan

  • CI green on Python 3.10–3.13
  • make check-schema-drift passes (the patch-apply is now part of the drift check)
  • downstream import smoke — RevokedPublisherDomain and PublisherDomain re-importable from adcp.types

🤖 Generated with Claude Code

bokelley and others added 2 commits May 22, 2026 06:22
Hand-edits to the regenerated schema cache used to get silently
overwritten by ``make regenerate-schemas`` — that's exactly how PR #753's
forward-looking ``revoked_publisher_domains[]`` + ``publisher_domains[]``
compact-form patches got wiped on PR #791's bump to 3.0.12. The patches
were invisible to anyone running the regen, and the loss only surfaced
on line-by-line diff review of the regen output.

This introduces ``schemas/patches/`` as a tracked, reviewable layer of
hand-edits applied AFTER the upstream-verbatim extraction in
``scripts/sync_schemas.py``. Each ``.patch`` file is a unified diff with
a comment header (Patch / Reason / Filed / Upstream status / Drop when),
applied in lex order from the repo root via ``patch -p1``.

The state machine in ``apply_tracked_patches`` classifies each patch as:

- **Alive** — forward-applies cleanly → apply, continue.
- **Dead** — reverse-applies cleanly (upstream landed it) → exit
  non-zero with the patch name + directive to delete the file.
- **Broken** — neither direction applies → exit non-zero with the patch
  name + directive to either update the hunks or remove the patch.

Dead and broken both fail loudly because silently no-op'ing on dead
would let stale ``.patch`` files linger forever, and silently skipping
broken would let the SDK ship a cache whose patched fields don't
actually exist in the working tree.

Patch-application runs ONCE at the end of ``main()`` (after all
primary + preview bundles have been extracted), not inside ``_sync_one``
per-bundle. Per-bundle would misclassify a 3.0 patch as "dead" during
the subsequent 3.1 preview pass because the cache wouldn't reset
between passes.

The existing ``make check-schema-drift`` target picks up patch-apply
without changes — it already re-runs ``sync_schemas.py`` and diffs the
cache. With patches in the directory, the diff now validates that
patches still apply AND the resulting cache matches what's checked in.

The directory ships empty in this commit. The two #753-restoration
patches follow in a separate commit once PR #791 (3.0.12 regen) lands
on main — they need the post-regen cache as the diff base.

Refs #791, #753

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three patches restore prior hand-edits to the 3.0 schema cache that
were silently dropped on the 3.0.7 → 3.0.12 regen (#791):

01-adagents-revoked-publisher-domains.patch
  ``adagents.json`` gains top-level ``revoked_publisher_domains[]`` with
  the Reason enum. SDK dict-layer helpers
  (``validate_revoked_publisher_domain_entry``,
  ``filter_revoked_selectors``) implement the contract today; this
  patch restores the field on the Pydantic-model layer so adopters get
  parity across both code paths. Filed in PR #753 — upstream took the
  shape into 3.1.0-beta.x rather than 3.0.x, so the patch stays alive
  until the SDK moves to a 3.1 floor.

02-publisher-property-selector-publisher-domains.patch
  ``publisher-property-selector.json`` gains optional
  ``publisher_domains[]`` (compact form) on the `all` and `by_tag`
  selectors, XOR with the singular ``publisher_domain``. SDK helpers
  (``_fanout_publisher_properties``, ``validate_publisher_properties_item``,
  ``get_properties_by_agent``) implement the contract; same Pydantic-
  layer parity restoration. Filed in PR #753 — same upstream-status as
  patch 01 (landed in 3.1.0-beta.x).

03-manifest-signal-owned-discovery-only.patch
  ``manifest.json`` removes ``signal_owned`` from
  ``activate_signal.specialisms`` and ``activate_signal`` from
  ``signal_owned.exercised_tools``. Owned signal agents are
  discovery-only by design; upstream's 3.0.12 manifest still includes
  the old two-specialism shape, which would require owned signal
  agents to implement ``activate_signal`` (defeating the specialism
  purpose) and make conformance runners exercise marketplace
  activation. Filed in PR #792 — SDK self-correction. Drops when
  upstream updates the manifest shape.

Each patch carries its own audit-trail header (Patch / Reason / Filed
/ Upstream status / Drop when) so the next reader has full context
without needing git archaeology. See ``schemas/patches/README.md`` for
the convention.

Pydantic regen (``make regenerate-schemas``) picks up all three
restored fields:

  src/adcp/types/_generated.py            — RevokedPublisherDomain, PublisherDomain (per-arm)
  src/adcp/types/generated_poc/adagents.py
  src/adcp/types/generated_poc/core/publisher_property_selector.py

SCHEMA_DELTAS.md now reports "no field-shape changes detected" because
the post-patch generated types match the prior committed state. PR #791
had recorded the deletions as part of its regen; this commit reverses
them via the patches infra.

Tests: ``pytest tests/`` — 5026 passed, 30 skipped, 1 xfailed. ruff +
mypy clean.

Refs #753, #791, #792

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley bokelley enabled auto-merge (squash) May 22, 2026 10:44
Copy link
Copy Markdown

@aao-ipr-bot aao-ipr-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sound layer on top of the regen. The fail-loud alive/dead/broken state machine is the right shape — silent overwrites are how PR #791 wiped #753's hand-edits in the first place, and this turns that footgun into a CI-actionable directive.

Things I checked

  • _classify_patch posture (scripts/sync_schemas.py:337-348) — dry-run forward, then reverse on forward-fail. Partial-application (some hunks land, others don't) classifies as broken. --force suppresses interactive prompts and the auto-reverse heuristic, so dry-run returncode is a reliable boolean.
  • End-of-main() placement (scripts/sync_schemas.py:615-628) — comment is correct: per-bundle apply would misclassify a 3.0 patch as "dead" on the subsequent 3.1 preview pass because the cache doesn't reset between passes. One pass at the end avoids the artifact.
  • _generated.py:243-249, 752 shifts Reason from creative.sync_creatives_async_response_input_required to adagents. Private module per the docstring; Reason is not re-exported from adcp.types.__init__. No public-surface break.
  • Pydantic XOR enforcement for publisher_domain / publisher_domains: aliases.py:1822-1914 attaches a post-hoc @model_validator(mode='after') via Pydantic private API, with a drift test at tests/test_publisher_selector_xor_autoenforce.py that fails loudly if registration shape changes. Pre-existing — not added in this PR — but it's what makes the field-shape change here safe at the model layer, not just the dict-layer helper.
  • Release-boundary check: git tag --contains fe09c02d is empty. PR #791's 3.0.12 bump has not shipped in a tagged release, so restoring the pre-#791 field shape doesn't cross a public-API boundary. SCHEMA_DELTAS.md reporting "no field-shape changes detected" is correct under that read — feat(schemas): (no !) is the right conventional-commit prefix.
  • Patch 03 (schemas/patches/03-manifest-signal-owned-discovery-only.patch) is protocol-correct: owned-signal agents have no marketplace-activation handshake; listing activate_signal in signal_owned.exercised_tools would force conformance runners to exercise a tool the specialism doesn't implement.
  • code-reviewer: sound-with-caveats — no Blockers, no Major. ad-tech-protocol-expert: sound-with-caveats — XOR via allOf[not[required[both]]] + anyOf[required[either]] is the canonical Draft-07 idiom and the right choice over oneOf here since selection_type already holds the discriminator.

Follow-ups (non-blocking — file as issues)

  • _apply_patch exit posture. scripts/sync_schemas.py:389-393 raises RuntimeError on TOCTOU. Rest of the script uses print(..., file=sys.stderr); sys.exit(1). The TOCTOU path is vanishingly unlikely in CI but produces a Python traceback rather than the operator-friendly directive the docstring at lines 289-291 promises. Posture parity.
  • Missing patch(1) binary. No pre-check before the loop. Hosts without GNU patch get a FileNotFoundError stack trace instead of "install GNU patch and re-run." CI hosts have it; container builds elsewhere may not. shutil.which('patch') at the top of apply_tracked_patches is the cheap fix.
  • Patch 01's 7-day cache-hold reference. The revoked_publisher_domains[] description cites "the file-level 7-day cache cap" — that rule lives in upstream adcp#4504 and SDK code (src/adcp/adagents.py), not in the local 3.0 schema. Cite the upstream anchor inline so the next reader doesn't have to git-archaeology.
  • Patch 03 conformance-runner divergence. SDK-built runners will skip the signal_owned × activate_signal cell that upstream-strict runners still execute. Intended, but worth a CHANGELOG line so adopters comparing conformance results across implementations aren't surprised.

Minor nits (non-blocking)

  1. Test coverage gap on environment-dependent failure modes. tests/test_sync_schemas.py:421-626 covers empty / missing-dir / alive / dead / broken / lex-order. No test for the _apply_patch TOCTOU raise (cheap — monkeypatch _patch_dry_run to return True, corrupt the target file between classify and apply, assert the loud failure) or the missing-patch binary path. Pairs with the two follow-ups above.
  2. Classifier --force semantics undocumented. _classify_patch at scripts/sync_schemas.py:337-348 correctly handles the partial-application edge case but the property isn't called out. A one-line note ("--force + --dry-run makes returncode a reliable boolean — no auto-skip, no prompts") forestalls future "should we add --reject-file?" debate.

Approving on the strength of the fail-loud classifier plus the end-of-main() placement reasoning. The next regen that hits a dead patch will tell the operator exactly what to do, which is the whole point.

@bokelley bokelley merged commit 4e6a722 into main May 22, 2026
23 checks passed
@bokelley bokelley deleted the bokelley/schemas-patches-infra branch May 22, 2026 10:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant