Skip to content

feat(server): idempotency middleware per AdCP #2315 spec#196

Merged
bokelley merged 2 commits into
mainfrom
bokelley/issue-182-server-idempotency
Apr 19, 2026
Merged

feat(server): idempotency middleware per AdCP #2315 spec#196
bokelley merged 2 commits into
mainfrom
bokelley/issue-182-server-idempotency

Conversation

@bokelley
Copy link
Copy Markdown
Contributor

Closes #182. Server-side counterpart to PR #194 (client-side idempotency).

What

Drop-in decorator for Python sellers so they don't hand-roll the (principal, key, canonical-hash) → cached-response machinery per tool handler.

from adcp.server.idempotency import IdempotencyStore, MemoryBackend

idempotency = IdempotencyStore(
    backend=MemoryBackend(),
    ttl_seconds=86400,
)

class MySeller(ADCPHandler):
    @idempotency.wrap
    async def create_media_buy(self, params, context=None):
        return my_create_logic(params)

    async def get_adcp_capabilities(self, params, context=None):
        caps = build_caps(...)
        caps.adcp.idempotency = idempotency.capability()
        return caps

Spec mapping

Implements the seller side of AdCP #2315 per security.mdx#idempotency:

  • Scope lookups by (authenticated_principal, idempotency_key) — security requirement
  • Canonical hash via RFC 8785 JCS + SHA-256 over the payload minus the spec's closed exclusion list (idempotency_key, context, governance_context, push_notification_config.authentication.credentials)
  • Cache hit + matching hash → return cached response, skip handler
  • Cache hit + different hash → raise IdempotencyConflictError before the handler runs (preserves at-most-once)
  • Cache miss → run handler, commit (hash, response) to the backend

What ships

New subpackage src/adcp/server/idempotency/:

  • canonicalize.pycanonical_json_sha256, strip_excluded_fields, EXCLUDED_FIELDS (closed set). Input payload never mutated.
  • backends.pyIdempotencyBackend ABC, MemoryBackend (dict + asyncio.Lock + lazy expiry + sweeper), PgBackend scaffold that raises NotImplementedError with a clear pointer to the follow-up.
  • store.pyIdempotencyStore class, @wrap decorator, .capability() returning {\"replay_ttl_seconds\": ttl} for inline capability declaration.

New core dependency: rfc8785 (tiny pure-Python JCS).

Design calls worth flagging for review

  • Fail-closed on missing caller_identity: no principal → fall through to the plain handler rather than collapse every buyer into a shared namespace. Degrades the feature but prevents a security regression.
  • Deep-copy on put and get: callers who mutate the returned response (adding derived fields, appending to lists) don't poison future replays. Test locks this in (TestCachedResponseImmutability).
  • Cache commit after handler returns: MemoryBackend has no transactional relationship to external resources; a crash between handler-success and cache-commit re-executes on retry. Documented. PgBackend follow-up will commit in the handler's transaction.
  • Backend put failure: log a WARNING, return the handler's result. Raising would look to the caller like the handler failed → retry → double-execute. Swallowing is wrong. Warning is the honest compromise. Test locks this in.
  • TTL bounds: enforced to [3600, 604800] per spec capabilities.idempotency.replay_ttl_seconds (1h–7d).

Tests

38 new tests in tests/test_server_idempotency.py covering:

  • Canonicalize: determinism, key-order independence, closed exclusion set lock, nested path stripping, no input mutation, ext preserved in hash
  • MemoryBackend: put/get/miss, lazy TTL expiry, per-principal isolation, sweeper, concurrent put under asyncio.gather
  • IdempotencyStore.wrap: miss→hit→conflict→fresh-key, per-principal scope, excluded-field change is not a conflict, Pydantic / dict / dict-context inputs, fail-closed when no caller identity
  • Instance-method decorator shape (@store.wrap above an async method on a handler class)
  • Cached response immutability — caller mutation doesn't poison replays
  • Backend put failure logs WARNING and returns handler's response
  • Capability fragment + TTL bound validation (low/high/min/max)
  • TTL expiry via time-patch

Total: 1483 pytest passing, mypy clean (660 source files), ruff clean on the new code.

Expert review addressed

Code-reviewer flagged three should-fix items; all applied:

  • Dead HandlerReturn TypeVar removed
  • Cached response deep-copied on put AND get (the shallow-copy-only approach leaked nested-list mutations)
  • Backend put failure now logs WARNING with structured context instead of masking as handler failure

Follow-ups (explicitly out of scope)

  • PgBackend implementation that commits the cache row in the same SQLAlchemy/asyncpg transaction as the handler's business writes
  • Wire the middleware into the AdCP training agent / reference seller so the upstream idempotency storyboard passes end-to-end (currently fails because the reference agent declares the capability but doesn't enforce it — adcp#2346)

Test plan

  • pytest tests/ --no-cov -q — 1483 passed
  • mypy src/adcp/ — 0 errors, 660 files
  • ruff check — clean
  • Spec exclusion list locked in a dedicated regression test so future maintainers can't silently broaden it

🤖 Generated with Claude Code

bokelley and others added 2 commits April 18, 2026 20:53
Closes #182. Gives Python sellers a drop-in decorator so they don't have to
hand-roll the (principal, key, canonical-hash) → cached-response machinery
every tool handler.

## Spec mapping

Implements the seller side of AdCP #2315 per
`docs/building/implementation/security.mdx#idempotency`:

- Extract `idempotency_key` from the request
- Scope lookups by `(authenticated_principal, idempotency_key)` — security
  requirement: keys from principal A on seller S have no meaning for
  principal B
- Hash the canonical payload (RFC 8785 JCS + SHA-256), stripping the spec's
  closed exclusion list: `idempotency_key`, `context`, `governance_context`,
  and `push_notification_config.authentication.credentials` (nested)
- Cache hit + matching hash → return cached response, skip handler
- Cache hit + different hash → `IdempotencyConflictError` before handler runs
  (preserves at-most-once)
- Cache miss → run handler, then commit `(hash, response)` to the backend

## API

```python
from adcp.server.idempotency import IdempotencyStore, MemoryBackend

idempotency = IdempotencyStore(
    backend=MemoryBackend(),
    ttl_seconds=86400,  # 24h, AdCP storyboard recommended minimum
)

class MySeller(ADCPHandler):
    @idempotency.wrap
    async def create_media_buy(self, params, context=None):
        return my_create_logic(params)

    async def get_adcp_capabilities(self, params, context=None):
        caps = build_caps(...)
        caps.adcp.idempotency = idempotency.capability()
        return caps
```

## What's in the PR

- `src/adcp/server/idempotency/` subpackage:
  - `canonicalize.py` — `canonical_json_sha256`, `strip_excluded_fields`,
    `EXCLUDED_FIELDS` (closed set); deep-copy before strip so callers'
    payloads are never mutated
  - `backends.py` — `IdempotencyBackend` ABC, `MemoryBackend` (dict +
    asyncio.Lock + lazy expiry + sweeper), `PgBackend` scaffold that raises
    `NotImplementedError` with a clear pointer to the follow-up
  - `store.py` — `IdempotencyStore` class, `.wrap` decorator, `.capability()`
    returning `{"replay_ttl_seconds": ttl}` for inline capability declaration
- `rfc8785` added as a core dependency (tiny pure-Python JCS implementation
  that produces identical bytes to the spec reference)
- 38 tests in `tests/test_server_idempotency.py` covering:
  - Canonicalize: determinism, key-order independence, closed exclusion set,
    nested path stripping, no input mutation, `ext` preserved
  - MemoryBackend: put/get/miss, lazy TTL expiry, per-principal isolation,
    sweeper, concurrent put under asyncio.gather
  - IdempotencyStore.wrap: miss→hit→conflict→fresh-key, per-principal scope,
    excluded fields don't trigger conflict, Pydantic / dict / dict-context
    inputs, fail-closed when no caller identity
  - Instance-method decorator shape (`@store.wrap` above an async method)
  - Cached response immutability — caller mutation doesn't poison replays
  - Backend `put` failure logs WARNING but returns handler's response
  - Capability fragment and TTL bound validation (1h–7d per spec)
  - TTL expiry via time-patch

## Design decisions

- **Fail-closed on missing caller_identity**: without a principal we cannot
  safely scope the key. Middleware falls through to the plain handler rather
  than collapsing every buyer into a shared namespace. Degrades the feature
  but prevents a security regression.
- **Deep-copy on both put and get paths**: handler returns a response; if the
  caller mutates it (common: adding derived fields, appending to a list),
  that mutation must NOT affect future replays. Deep-copy on the way in AND
  on the way out gives each replay an independent object.
- **Cache commit after handler returns**: MemoryBackend has no transactional
  relationship to external resources. A crash between handler-success and
  cache-commit causes the retry to re-execute. Documented. `PgBackend` (next
  PR) will share the handler's transaction when they use the same engine.
- **Backend `put` failure behavior**: log a WARNING, return the handler's
  result. Raising would look to the caller like the handler failed → they
  retry → double-execute. Swallowing is wrong (operators need the signal).
  Warning is the honest compromise.
- **TTL bounds**: enforced to `[3600, 604800]` per spec
  `capabilities.idempotency.replay_ttl_seconds` (1h–7d).

## Follow-ups (separate PR)

- `PgBackend` implementation that commits the cache entry in the same
  SQLAlchemy/asyncpg transaction as the handler's business writes
- Wire the middleware into the AdCP training agent / reference seller so
  the upstream spec's idempotency storyboard passes end-to-end (currently
  fails because the reference agent declares the capability but doesn't
  enforce it — adcontextprotocol/adcp#2346)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…X polish

Addresses expert review findings from the initial PR #196 round.

## P0 fixes

**Wire translation (spec-blocking).** `IdempotencyConflictError` raised from
`@idempotency.wrap` previously bubbled as a generic Python exception →
FastMCP's "Tool execution failed" / A2A blanket Exception → no
`IDEMPOTENCY_CONFLICT` on the wire. Fixed on both transports:

- MCP: `serve.py _register_tool` wraps the tool `fn()` in `try/except
  ADCPError → translate_error(exc, "mcp")`, raising a `ToolError` that
  FastMCP surfaces as `is_error=true` with the code in text (our own client's
  text-mode extraction from PR #195 recovers the code).
- A2A: `ADCPAgentExecutor.execute` catches `ADCPError` before the blanket
  `Exception` handler and routes to a new `_send_adcp_error` helper that
  emits a `TaskState.failed` with a `DataPart({"adcp_error": {...}})` per
  `transport-errors.mdx §A2A Binding`, including `recovery="terminal"` for
  IDEMPOTENCY_CONFLICT / IDEMPOTENCY_EXPIRED.
- Added `IDEMPOTENCY_CONFLICT` and `IDEMPOTENCY_EXPIRED` to
  `STANDARD_ERROR_CODES` with `recovery="terminal"` so `translate_error`
  looks up the right recovery classification.

**Capabilities helper.** `capabilities_response()` now accepts an
`idempotency=` kwarg that nests the declaration under `adcp.idempotency`.
The docstring quickstart used `caps.adcp.idempotency = ...` which raised
`AttributeError` since `capabilities_response` returns a plain dict. Fixed
by updating both the helper and the docstring example to pass
`idempotency=store.capability()`.

## P1 fixes

**Discoverability.** Re-exported `IdempotencyStore` and `MemoryBackend` from
`adcp.server` so sellers can `from adcp.server import IdempotencyStore,
MemoryBackend` alongside `ADCPHandler`. Matches the existing convention for
every other seller-facing type.

**README.** Added a "Building a seller: idempotency middleware" section
adjacent to the buyer-side section. Covers `@idempotency.wrap`, the
`capability()` declaration, the IDEMPOTENCY_CONFLICT wire behavior, backend
status (MemoryBackend ships, PgBackend is scaffolded), and the atomicity
caveat.

## P2 fixes

**Test ergonomics.** `IdempotencyStore` and `MemoryBackend` now accept an
optional `clock: Callable[[], float]` so TTL tests don't have to monkeypatch
`time.time`. `MemoryBackend.clear()` added for fixture resets. Regression
test `TestTTLExpiry.test_cached_response_expires_after_ttl` migrated to use
the injected clock.

**PgBackend scaffold.** Docstring now leads with `.. warning::`, the
`NotImplementedError` message includes the full issue URL, and the schema
sketch notes `COLLATE "C"` (to prevent case-insensitive collation
collapsing distinct tenants) and recommends row-level security on
`principal_id` as a defense-in-depth layer.

**Quickstart docstring.** `adcp.server.idempotency.__init__` quickstart now
shows `params.idempotency_key` and `ToolContext(caller_identity=...)`
explicitly so first-time users don't silently fall into the no-dedup
fallthrough path.

## New tests

- `TestWireTranslation.test_mcp_conflict_translates_to_tool_error` — asserts
  an IdempotencyConflictError from `@idempotency.wrap` surfaces as a
  `ToolError` with `IDEMPOTENCY_CONFLICT` in the text.
- `TestWireTranslation.test_a2a_conflict_emits_failed_task_with_adcp_error`
  — asserts `_send_adcp_error` emits a `TaskState.failed` with a
  `DataPart({"adcp_error": {"code": "IDEMPOTENCY_CONFLICT",
  "recovery": "terminal"}})`.
- `TestCapability.test_capabilities_response_accepts_idempotency` — asserts
  the new kwarg nests under `adcp.idempotency`.
- `TestCapability.test_server_reexports` — locks the `adcp.server`
  re-exports so future refactors can't silently drop them.

Total: 1488 passing (from 1483). mypy clean. ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley bokelley merged commit 101713c into main Apr 19, 2026
8 checks passed
bokelley added a commit that referenced this pull request Apr 19, 2026
Closes #185.

## Schema sync

Resynced against `adcp-latest`: 564 files updated, 1 removed
(`enums/budget-authority-level.json`). The rc.4 spec drops
``Budget.authority_level`` (enum: agent_full/agent_limited/human_required)
in favor of two orthogonal fields on ``plan.budget``:

- ``reallocation_threshold: number ≥ 0`` — budget-reallocation autonomy
  denominated in ``budget.currency``
- ``reallocation_unlimited: true`` — explicit full-autonomy sentinel,
  mutually exclusive with ``reallocation_threshold`` (enforced via
  ``oneOf`` in the schema)

Plus a new ``plan.human_review_required`` flag that governs decisions
affecting data subjects (targeting, creative, delivery) under GDPR Art 22
/ EU AI Act Annex III — distinct from budget reallocation autonomy.

No hand-written SDK code referenced the old ``authority_level`` /
``BudgetAuthorityLevel`` type, so codegen picks up the new shape
automatically.

## New task: update_rights

The rc.4 schema index added a ``update_rights`` task (partner to
``acquire_rights``). Lets buyers modify an existing rights acquisition —
typically extend ``end_date``, raise ``impression_cap``, pause/unpause
via ``paused``, or swap to a compatible ``pricing_option_id``. Wired at
the 6 spec-coverage sites:

- ``ADCPClient.update_rights`` (src/adcp/client.py)
- ``ADCPHandler.update_rights`` default stub (src/adcp/server/base.py)
- ``ProtocolAdapter.update_rights`` abstract + concrete A2A/MCP impls
  (src/adcp/protocols/{base,a2a,mcp}.py)
- CLI dispatch table (src/adcp/__main__.py)
- MCP tool definition + BrandHandler tool-filter set + TypeAdapter
  registration (src/adcp/server/mcp_tools.py)
- ``adcp.types`` public re-exports for ``UpdateRightsRequest`` /
  ``UpdateRightsResponse``

All 7 ``tests/test_spec_coverage.py`` assertions now pass again.

## Validation

- 1499 tests passing (unchanged; spec-coverage suite was red on main
  post-sync, now green)
- mypy clean across 659 source files
- ruff clean

## Out of scope

- ``plan.human_review_required`` — new field on the plan request schema,
  generated types pick it up automatically. No extra wiring required; the
  field flows through ``SyncPlansRequest`` as-is.
- Release cut to PyPI (4.0.0b2) — separate concern tracked elsewhere.
- PgBackend implementation for ``IdempotencyStore`` — deferred per the
  post-#196 plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bokelley added a commit that referenced this pull request Apr 19, 2026
Closes #185.

## Schema sync

Resynced against `adcp-latest`: 564 files updated, 1 removed
(`enums/budget-authority-level.json`). The rc.4 spec drops
``Budget.authority_level`` (enum: agent_full/agent_limited/human_required)
in favor of two orthogonal fields on ``plan.budget``:

- ``reallocation_threshold: number ≥ 0`` — budget-reallocation autonomy
  denominated in ``budget.currency``
- ``reallocation_unlimited: true`` — explicit full-autonomy sentinel,
  mutually exclusive with ``reallocation_threshold`` (enforced via
  ``oneOf`` in the schema)

Plus a new ``plan.human_review_required`` flag that governs decisions
affecting data subjects (targeting, creative, delivery) under GDPR Art 22
/ EU AI Act Annex III — distinct from budget reallocation autonomy.

No hand-written SDK code referenced the old ``authority_level`` /
``BudgetAuthorityLevel`` type, so codegen picks up the new shape
automatically.

## New task: update_rights

The rc.4 schema index added a ``update_rights`` task (partner to
``acquire_rights``). Lets buyers modify an existing rights acquisition —
typically extend ``end_date``, raise ``impression_cap``, pause/unpause
via ``paused``, or swap to a compatible ``pricing_option_id``. Wired at
the 6 spec-coverage sites:

- ``ADCPClient.update_rights`` (src/adcp/client.py)
- ``ADCPHandler.update_rights`` default stub (src/adcp/server/base.py)
- ``ProtocolAdapter.update_rights`` abstract + concrete A2A/MCP impls
  (src/adcp/protocols/{base,a2a,mcp}.py)
- CLI dispatch table (src/adcp/__main__.py)
- MCP tool definition + BrandHandler tool-filter set + TypeAdapter
  registration (src/adcp/server/mcp_tools.py)
- ``adcp.types`` public re-exports for ``UpdateRightsRequest`` /
  ``UpdateRightsResponse``

Also registered ``update_rights`` in ``TASK_FEATURE_MAP`` (capabilities.py)
and ``HANDLER_TO_DOMAIN`` (builder.py) so the client-side feature gate
enforces it and decorator-style servers auto-advertise the ``brand``
domain. Added ``test_feature_and_domain_maps_cover_brand_tasks`` regression
test to prevent silent fail-open for future brand tasks.

## Round-trip coverage

New ``tests/test_update_rights_roundtrip.py`` (4 tests):

- A2A partial update reaches the wire with only mutated fields
- A2A response parses through the ``UpdateRightsResponse1 | 2`` Union
- MCP ``call_tool`` receives the right tool name + params
- Default ``ADCPHandler.update_rights`` returns ``NotImplementedResponse``

## Validation

- 1504 tests passing (includes the 4 new round-trip tests + 1 new
  regression test)
- mypy clean across 659 source files
- ruff clean on changed files

## Out of scope

- Release cut to PyPI (4.0.0b2) — separate concern tracked elsewhere.
- PgBackend implementation for ``IdempotencyStore`` — deferred per the
  post-#196 plan.

BREAKING CHANGE: ``Budget.authority_level`` is removed. Migrate to
``reallocation_threshold`` / ``reallocation_unlimited`` on ``plan.budget``,
and set ``plan.human_review_required`` for decisions affecting data
subjects. See the rc.4 migration section in README.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Server-side idempotency middleware with pluggable storage

1 participant