
feat(signing): PostgreSQL-backed PgReplayStore for multi-instance verifiers#203

Merged
bokelley merged 3 commits into main from bokelley/signing-replay-postgres
Apr 19, 2026

@bokelley
Contributor

Summary

Ships a production ReplayStore backed by PostgreSQL so multi-instance AdCP verifiers share nonce-seen state. A request accepted on worker A can no longer be replayed against worker B within the signature's validity window; the in-memory default left that gap on any load-balanced deployment.

Closes #187.

Backend choice: Postgres (not Redis)

Issue #187 originally proposed Redis. Since then the idempotency work (PR #196) landed PgBackend scaffolding as the intended shared-state backend, signaling the library's direction. Starting with Postgres:

  • aligns integrators who already adopted Postgres for idempotency
  • avoids pulling a second infra dep (Redis) when Postgres handles the workload with simple TTL cleanup
  • leaves Redis as a clean follow-up if a caller asks

Shape

from psycopg_pool import ConnectionPool
from adcp.signing import PgReplayStore, VerifyOptions

pool = ConnectionPool("postgresql://...", min_size=4, max_size=20)
replay = PgReplayStore(pool=pool)

options = VerifyOptions(..., replay_store=replay)

Design notes

  • Caller owns the pool — integrators typically already have one; second pool would be wasteful
  • Sync, not async — matches the current verifier; async follows whenever an async verifier lands
  • Three single-statement queries (seen / remember / at_capacity) + optional sweep_expired()
  • ON CONFLICT ... DO UPDATE handles legitimate nonce refresh without error paths
  • Lazy-only sweep: seen self-filters; public sweep_expired() for cron
  • Fail-closed — connection errors propagate; verifier rejects
  • Table-name kwarg is identifier-validated — zero SQL injection surface
  • COLLATE "C" on keyid / nonce — blocks locale-dependent case folding that could collapse distinct slots
  • at_capacity uses COUNT(*) >= cap against a partial index — pays for accuracy exactly when a signer is misbehaving
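
The three queries in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it uses the standard-library sqlite3 module as a stand-in for Postgres and psycopg, and the `SketchReplayStore` class, the table layout, the column names, and the method signatures (including the explicit `now` argument, added to keep the example deterministic) are all assumptions made for the example.

```python
import sqlite3


class SketchReplayStore:
    """Hypothetical stand-in for PgReplayStore, backed by sqlite3 instead of Postgres."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        # Assumed schema: one row per (keyid, nonce) with an absolute expiry time.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS replay ("
            " keyid TEXT NOT NULL, nonce TEXT NOT NULL, expires_at REAL NOT NULL,"
            " PRIMARY KEY (keyid, nonce))"
        )

    def seen(self, keyid: str, nonce: str, now: float) -> bool:
        # Self-filtering: expired rows are ignored even if no sweep has run yet.
        row = self.conn.execute(
            "SELECT 1 FROM replay WHERE keyid = ? AND nonce = ? AND expires_at > ?",
            (keyid, nonce, now),
        ).fetchone()
        return row is not None

    def remember(self, keyid: str, nonce: str, ttl: float, now: float) -> None:
        # Upsert: re-remembering a nonce refreshes its expiry instead of raising.
        self.conn.execute(
            "INSERT INTO replay (keyid, nonce, expires_at) VALUES (?, ?, ?)"
            " ON CONFLICT (keyid, nonce) DO UPDATE SET expires_at = excluded.expires_at",
            (keyid, nonce, now + ttl),
        )

    def at_capacity(self, keyid: str, cap: int, now: float) -> bool:
        # Exact count of live nonces for one keyid, compared against the cap.
        (count,) = self.conn.execute(
            "SELECT COUNT(*) FROM replay WHERE keyid = ? AND expires_at > ?",
            (keyid, now),
        ).fetchone()
        return count >= cap

    def sweep_expired(self, now: float) -> int:
        # Delete rows past their expiry and report how many were removed.
        cur = self.conn.execute("DELETE FROM replay WHERE expires_at <= ?", (now,))
        return cur.rowcount


store = SketchReplayStore(sqlite3.connect(":memory:"))
t0 = 1000.0
store.remember("key-1", "nonce-a", ttl=300.0, now=t0)
first_check = store.seen("key-1", "nonce-a", now=t0 + 1)     # inside the TTL
after_expiry = store.seen("key-1", "nonce-a", now=t0 + 301)  # past the TTL, before any sweep
swept = store.sweep_expired(now=t0 + 301)
```

Because `seen` filters on the expiry column itself, correctness does not depend on the sweep having run; the sweep only bounds table growth.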

Schema

Ships as plain SQL at src/adcp/signing/pg/replay_store.sql. Run it through whatever migration tool you already use — Alembic, Flyway, psql. Partial index for at_capacity is commented out in the DDL so integrators opt in per deployment.

Optional dep via [pg] extra

[project.optional-dependencies]
pg = [
    "psycopg[binary]>=3.1.0",
    "psycopg-pool>=3.2.0",
]

Core SDK stays installable without SQL deps; adcp.signing.PgReplayStore resolves to None when the extra isn't installed. Constructor raises ImportError with the install hint if called without the extra.
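
A name bound to None cannot itself raise a helpful ImportError (calling None raises TypeError), so one common way to get the constructor behaviour described above is a stub class; a later commit in this PR describes switching to that approach. The sketch below shows the general pattern only. The module name is deliberately fake so the import fails, and the distribution name `adcp` in the install hint is an assumption, not something the PR text confirms.

```python
try:
    # Hypothetical module name, chosen so that this import fails in the sketch.
    from _adcp_sketch_missing_pg_backend import PgReplayStore
except ImportError:

    class PgReplayStore:  # type: ignore[no-redef]
        """Placeholder exported when the optional dependency is not installed."""

        def __init__(self, *args: object, **kwargs: object) -> None:
            # The distribution name "adcp" is assumed for illustration.
            raise ImportError(
                "PgReplayStore requires the optional 'pg' extra. "
                "Install it with: pip install 'adcp[pg]'"
            )


# Constructing the stub surfaces the install hint instead of a confusing TypeError.
try:
    PgReplayStore(pool=None)
    hint = ""
except ImportError as exc:
    hint = str(exc)
```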

Testing

  • test_pg_replay_store.py — 16 tests, skipped when ADCP_PG_TEST_URL is unset
  • New CI job pg-replay-store runs against a Postgres 16 service container on every PR
  • Covers: all Protocol methods, TTL semantics, at_capacity threshold + per-keyid isolation, sweep_expired, 10-thread concurrent remember on the same nonce (tests ON CONFLICT correctness), COLLATE "C" case-variant isolation, identifier validation
  • Verified locally against a real Postgres 17 instance: all 16 pass

Not in scope (follow-ups)

  • Async PgReplayStore — needs an async verifier first (tracked separately)
  • Redis backend — follow up if a caller asks
  • #182 Pg idempotency backend — separate issue; this PR paves the psycopg3 dep path they'll share

Test plan

  • pytest tests/conformance/signing/ -q — 239 passed, 2 skipped (Pg tests skip without env var)
  • ADCP_PG_TEST_URL=... pytest tests/conformance/signing/test_pg_replay_store.py -v — all 16 pass locally
  • ruff check src/adcp/signing/ tests/conformance/signing/ — clean
  • mypy src/adcp/signing/ — clean (18 source files)
  • CI: pg-replay-store job runs full suite against Postgres 16 service

🤖 Generated with Claude Code

@gitguardian

gitguardian Bot commented Apr 19, 2026

✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
While these secrets were previously flagged, we no longer have a reference to the
specific commits where they were detected. Once a secret has been leaked into a git
repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.



…ifiers

Ships a production ReplayStore backed by PostgreSQL so multi-instance
AdCP verifiers share nonce-seen state. A request accepted on worker A
can no longer be replayed against worker B within the signature window.

Closes #187.

Postgres, not Redis: aligns with the idempotency PgBackend scaffolding
direction, avoids a second infra dep, Redis is a clean follow-up.

Shape:

    from psycopg_pool import ConnectionPool
    from adcp.signing import PgReplayStore, VerifyOptions
    pool = ConnectionPool('postgresql://...')
    replay = PgReplayStore(pool=pool)
    options = VerifyOptions(..., replay_store=replay)

Design:
- Caller owns the pool
- Sync (matches current verifier); async lands with async verifier
- Three single-statement queries; ON CONFLICT ... DO UPDATE handles
  legitimate nonce refresh
- Lazy-only sweep; public sweep_expired() for cron
- Fail-closed on errors
- Table-name kwarg identifier-validated (zero injection surface)
- COLLATE 'C' avoids locale case-folding
- at_capacity via COUNT(*) >= cap on partial index

Schema ships as plain SQL at src/adcp/signing/pg/replay_store.sql.

Optional dep via [pg] extra (psycopg[binary] + psycopg-pool). Core SDK
stays installable; adcp.signing.PgReplayStore resolves to None without
the extra.

Tests (+16) gated on ADCP_PG_TEST_URL. New pg-replay-store CI job
runs against a Postgres 16 service container. Covers Protocol
methods, TTL, at_capacity, per-keyid isolation, sweep, 10-thread
ON CONFLICT correctness, case-variant isolation, identifier
validation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley bokelley force-pushed the bokelley/signing-replay-postgres branch from ac6e63f to 3ad3951 Compare April 19, 2026 20:04
bokelley and others added 2 commits April 19, 2026 16:18
MUST FIX:
- _is_safe_identifier now uses re.fullmatch for byte-ASCII check.
  str.islower/isalpha accept Unicode homoglyphs (fullwidth Latin,
  café, Greek µ) that format into SQL as DIFFERENT tables than
  operators configured — a replay-bypass vector under multi-tenant
  config.
- Dropped invalid commented partial index from replay_store.sql
  (now() is STABLE, not IMMUTABLE — the DDL was uncompilable).
- Softened module-docstring fail-closed claim to match reality:
  psycopg errors propagate unchanged; frameworks return 5xx.
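
The homoglyph problem in the first item can be reproduced in a few lines. The sketch below is an illustration under assumptions: the pattern and both function names are hypothetical, not the PR's actual `_is_safe_identifier`. It contrasts an explicit ASCII character class checked with `re.fullmatch` against a check built on Unicode-aware `str` methods.

```python
import re

# Hypothetical pattern: lowercase ASCII letters, digits, underscores; no leading digit.
_IDENT_RE = re.compile(r"[a-z_][a-z0-9_]*")


def is_safe_identifier(name: str) -> bool:
    """Return True only for plain-ASCII identifiers."""
    return _IDENT_RE.fullmatch(name) is not None


def naive_is_safe_identifier(name: str) -> bool:
    """Insufficient check: str.isalnum and str.islower are Unicode-aware."""
    return name.replace("_", "").isalnum() and name.islower()


candidates = [
    "adcp_replay",        # plain ASCII
    "\uff41dcp_replay",   # starts with FULLWIDTH LATIN SMALL LETTER A
    "caf\u00e9",          # contains LATIN SMALL LETTER E WITH ACUTE
    "\u00b5_replay",      # starts with MICRO SIGN
]
strict = [is_safe_identifier(c) for c in candidates]
naive = [naive_is_safe_identifier(c) for c in candidates]
```

In this sketch the naive check accepts all four candidates, while the `re.fullmatch` version accepts only the first.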

DX P0:
- PgReplayStore = None trap replaced with stub class raising
  ImportError + install hint (incl. Poetry command).
- Module docstring example now shows full verify_request_signature
  wiring, not just construction.
- New PgReplayStore.create_schema(pool) classmethod runs packaged
  DDL via importlib.resources — one-line bootstrap.
- REQUIRED-labelled sweep callout with pg_cron + in-process snippets.
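
An in-process sweep of the kind the last item mentions could look roughly like this. It is a sketch, not the snippet shipped in the docstring: `start_sweeper` and `CountingStore` are hypothetical names, and a no-argument `sweep_expired()` is an assumption about the store's interface.

```python
import logging
import threading
import time

log = logging.getLogger("replay_sweeper_sketch")


class CountingStore:
    """Hypothetical stand-in exposing only sweep_expired(), for illustration."""

    def __init__(self) -> None:
        self.sweeps = 0

    def sweep_expired(self) -> int:
        self.sweeps += 1
        return 0


def start_sweeper(store, interval_seconds: float) -> threading.Event:
    """Call store.sweep_expired() every interval_seconds on a daemon thread.

    Returns an Event; set it to stop the loop.
    """
    stop = threading.Event()

    def loop() -> None:
        # Event.wait returns False on timeout (sweep again) and True once stop is set.
        while not stop.wait(interval_seconds):
            try:
                store.sweep_expired()
            except Exception:
                # A failed sweep should not kill the loop; the next tick retries.
                log.exception("sweep_expired failed")

    threading.Thread(target=loop, name="replay-sweeper", daemon=True).start()
    return stop


store = CountingStore()
stop = start_sweeper(store, interval_seconds=0.01)
time.sleep(0.2)  # let a few ticks run
stop.set()       # ask the loop to exit
```

With several workers each running such a loop, the sweeps overlap harmlessly but redundantly; a single scheduled job (for example via pg_cron, as the callout suggests) avoids that duplication.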

SHOULD FIX:
- ON CONFLICT DO UPDATE gated on EXCLUDED.expires_at > current to
  avoid MVCC write amp on shorter-TTL refreshes.
- Removed dead import logging / logger scaffold.
- Error message matches validator (ASCII-only).
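
The gated upsert in the first item can be illustrated as below. This is a sketch under assumptions: sqlite3 stands in for Postgres (its upsert syntax accepts the same `DO UPDATE ... WHERE` form), and the table and column names are hypothetical. In Postgres an UPDATE writes a new row version, so skipping the update when the incoming expiry is not later avoids that write.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE replay ("
    " keyid TEXT NOT NULL, nonce TEXT NOT NULL, expires_at REAL NOT NULL,"
    " PRIMARY KEY (keyid, nonce))"
)

# The WHERE clause turns the conflict branch into a no-op unless the
# incoming expiry is strictly later than the stored one.
GATED_UPSERT = (
    "INSERT INTO replay (keyid, nonce, expires_at) VALUES (?, ?, ?)"
    " ON CONFLICT (keyid, nonce) DO UPDATE SET expires_at = excluded.expires_at"
    " WHERE excluded.expires_at > replay.expires_at"
)


def remember(keyid: str, nonce: str, expires_at: float) -> None:
    conn.execute(GATED_UPSERT, (keyid, nonce, expires_at))


def current_expiry(keyid: str, nonce: str) -> float:
    (value,) = conn.execute(
        "SELECT expires_at FROM replay WHERE keyid = ? AND nonce = ?",
        (keyid, nonce),
    ).fetchone()
    return value


remember("key-1", "nonce-a", 1300.0)
remember("key-1", "nonce-a", 1100.0)   # shorter expiry: row is left untouched
after_shorter = current_expiry("key-1", "nonce-a")
remember("key-1", "nonce-a", 1600.0)   # longer expiry: row is updated
after_longer = current_expiry("key-1", "nonce-a")
```

This mirrors what test_remember_twice_with_shorter_ttl_keeps_longer_expiry is described as checking: a shorter refresh never shortens the window during which a nonce counts as seen.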

Tests (+3):
- test_non_ascii_table_name_rejected (fullwidth, accented, Greek µ).
- test_remember_twice_with_shorter_ttl_keeps_longer_expiry.
- test_create_schema_idempotent.

E2e verified: sign → verify (accept) → 2nd verify (rejected with
request_signature_replayed) via PgReplayStore through
verify_request_signature.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After shipping the base PR I ran a fresh-integrator walkthrough via
subagent: pretend you've never seen the library, wire up PgReplayStore
from scratch, report friction. Surfaced 5 real issues — all fixed —
plus added genuine HTTP-over-the-wire e2e coverage.

Full-wire e2e
-------------

New test_pg_replay_store_e2e.py — signed HTTP requests over
httpx.ASGITransport to a Starlette app running
verify_starlette_request with PgReplayStore. Four scenarios:
- happy path → 200
- replay → 401 with WWW-Authenticate: Signature error="request_signature_replayed"
- fresh nonce after replay → 200
- cross-instance replay — two PgReplayStore instances on same pool;
  worker B rejects a replay that landed on worker A. The load-bearing
  property Postgres exists to provide over InMemoryReplayStore.

CI's pg-replay-store job now runs both the unit + e2e files.
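
The cross-instance property in the fourth scenario can be shown without a database. In the sketch below, a shared Python set stands in for the Postgres table; every class and function name is hypothetical, and the check-then-remember step deliberately ignores concurrency.

```python
class InMemoryStore:
    """Per-process state: each instance has its own private set."""

    def __init__(self) -> None:
        self._seen: set[tuple[str, str]] = set()

    def seen(self, keyid: str, nonce: str) -> bool:
        return (keyid, nonce) in self._seen

    def remember(self, keyid: str, nonce: str) -> None:
        self._seen.add((keyid, nonce))


class SharedStore(InMemoryStore):
    """State lives in a backend shared by all instances (stand-in for one table)."""

    def __init__(self, backend: set) -> None:
        self._seen = backend


def verify(store: InMemoryStore, keyid: str, nonce: str) -> str:
    # Simplified replay check: reject a nonce already recorded for this key.
    if store.seen(keyid, nonce):
        return "rejected"
    store.remember(keyid, nonce)
    return "accepted"


# Two workers with private in-memory stores: the replay slips through on worker B.
worker_a, worker_b = InMemoryStore(), InMemoryStore()
in_memory = [verify(worker_a, "key-1", "n1"), verify(worker_b, "key-1", "n1")]

# Two workers sharing one backend: worker B rejects what worker A already accepted.
table: set = set()
worker_a, worker_b = SharedStore(table), SharedStore(table)
shared = [verify(worker_a, "key-1", "n1"), verify(worker_b, "key-1", "n1")]
```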

DX fixes
--------

1. PgReplayStore.create_schema() is now an instance method honoring
   table_name. Previously it was a classmethod that silently created
   adcp_replay only — integrators with per-tenant table names had no
   working bootstrap path. Real bug, not just DX.
2. New load_private_key_pem(pem, *, password=None) helper on the
   public API. Closes the loop between adcp-keygen (writes PEM) and
   sign_request (takes PrivateKey) without requiring a direct
   cryptography import.
3. Package docstring now has a Quickstart section listing the ~10
   names buyer / seller / governance paths actually reach for.
4. verify_starlette_request docstring corrected — previously
   promised request.state / VerifiedSigner body re-read that
   didn't exist. Now accurately documents Starlette's body
   caching and the Raises: contract.
5. New test_create_schema_honors_table_name exercises the bootstrap
   path with a custom table and asserts remember/seen both target
   the right table.

Verified
--------

Re-ran the fresh-integrator walkthrough — integrator reached green
in 1 iteration, 0 tracebacks (down from 2 iterations + 1 traceback
before). Custom table_name worked end-to-end.

262 signing tests pass (4 new e2e + 1 reshaped bootstrap test).
Ruff + mypy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley bokelley merged commit 8118033 into main Apr 19, 2026
9 checks passed


Development

Successfully merging this pull request may close these issues.

Signing: Redis-backed replay store for multi-instance verifiers

1 participant