The 9 skipped cases are rewritten to the async "POST + poll-until-terminal" shape, the timeout case is re-homed to the worker layer, a new `AC-Async` case asserts the `running → complete` polling tran
A reusable `arq_pool_spy` integration fixture that records every `enqueue_job(name, *args)` call, letting studies-POST tests positively assert `spy.calls == []` on rejection and `spy.calls == [("start
-
-
@@ -1069,8 +1056,6 @@
Dependency graph (feat_ + infra_)
classDef idea fill:#f1f5f9,stroke:#334155,color:#334155;
chore_demo_seeding_integration_tests_rewrite["demo seeding integration tests rewrite"]
class chore_demo_seeding_integration_tests_rewrite plan;
- chore_studies_post_arq_spy_fixture["studies post arq spy fixture"]
- class chore_studies_post_arq_spy_fixture plan;
feat_apply_path_normalizer_declaration["apply path normalizer declaration"]
class feat_apply_path_normalizer_declaration plan;
feat_query_normalizer_typed_pipeline["query normalizer typed pipeline"]
@@ -1128,8 +1113,6 @@
Dependency graph (feat_ + infra_)
classDef idea fill:#f1f5f9,stroke:#334155,color:#334155;
chore_demo_seeding_integration_tests_rewrite["demo seeding integration tests rewrite"]
class chore_demo_seeding_integration_tests_rewrite plan;
- chore_studies_post_arq_spy_fixture["studies post arq spy fixture"]
- class chore_studies_post_arq_spy_fixture plan;
feat_apply_path_normalizer_declaration["apply path normalizer declaration"]
class feat_apply_path_normalizer_declaration plan;
feat_query_normalizer_typed_pipeline["query normalizer typed pipeline"]
diff --git a/state.md b/state.md
index 09ec81cd..b8ef40ec 100644
--- a/state.md
+++ b/state.md
@@ -16,8 +16,8 @@ MVP1 (v0.1) **shipped** — all six differentiators live (Bayesian/TPE optimizer
## Current branch / execution context
-- **Branch:** `main` (PR #474 `chore_ubi_reader_search_after_pagination` just merged `d9afbce`, 2026-06-05). All 19 `pr.yml` checks green (smoke skipped — opt-in/off).
-- **Active feature:** None in flight — the 3-feature queue is fully shipped: #1 `bug_judgment_header_omits_click_bucket` (PR #470), #2 `feat_fts_rank_ordering` (PR #472), #3 `chore_ubi_reader_search_after_pagination` (PR #474). Pull the next item from the MVP2 backlog (run `/pipeline status`).
+- **Branch:** `main` (PR #476 `chore_studies_post_arq_spy_fixture` just merged `ed85d84`, 2026-06-05). All 19 `pr.yml` checks green (smoke skipped — opt-in/off).
+- **Active feature:** Mid a 3-chore queue (2026-06-05 afternoon): #1 `chore_studies_post_arq_spy_fixture` **shipped** (PR #476); **next #2 `chore_demo_seeding_integration_tests_rewrite`** (plan-stage, GPT-5.5-reviewed plan exists → executes via impl-execute); #3 `chore_pr_yml_parallelize_backend_job` (idea-stage → full pipeline). GPT-5.5 unreachable in this env → Opus self-review substitution.
- **Alembic head:** `0023_proposals_superseded_status` (unchanged — `feat_fts_rank_ordering` is no-migration; head last moved by `feat_overnight_final_solution_phase3` PR #457).
- **Python:** 3.13. **Frontend stack:** Next 16 (App Router + Turbopack), React 19, Tailwind 4 (CSS-first), Vitest 4, ESLint 9 (flat), TypeScript 6, Playwright (chromium, single worker) for E2E.
- **Coverage gates:** backend 80% (`fail_under` in pyproject), UI vitest + tsc + ESLint + Next build, plus a full-stack smoke E2E job. Live pass counts: see the latest `pr.yml` run (the historical per-feature counts moved to `state_history.md`).
@@ -26,17 +26,17 @@ MVP1 (v0.1) **shipped** — all six differentiators live (Bayesian/TPE optimizer
Detail + reasoning for each is in [`state_history.md`](state_history.md).
+- **2026-06-05** — `chore_studies_post_arq_spy_fixture` (PR #476, squash-merged `ed85d84`). **`arq_pool_spy` integration fixture for studies-POST enqueue assertions.** Test-infra only — **zero production diff, no migration** (head stays `0023`). 2 stories / 1 epic: (1.1) `SpyArqPool` recording double (flattened `(name, *args)`, truthy sentinel return) + `install_arq_pool_spy(app)` contextmanager (captures prior `app.state.arq_pool` via `_UNSET`, restores exactly — `delattr` when originally unset, reassign otherwise) + `arq_pool_spy` pytest-asyncio fixture (depends on `async_client` for install-after-lifespan ordering, NOT autouse) in `integration/conftest.py` + 5 unit tests; (1.2) wired the fixture into 13 studies-POST tests — 10 rejection paths assert `spy.calls == []`, 3 success paths assert `spy.calls == [("start_study", )]` (proves the spy not the real pool received the enqueue → confirms install ordering) — plus 2 AC-3 restore tests each forcing their own precondition so both restore branches run deterministically. Cross-model: **GPT-5.5 unreachable → Opus self-review substituted** (test-only, converged clean). Gemini 1 theme / 2 line comments: 1 accepted (drop redundant `@pytest.mark.asyncio` — redundant under `asyncio_mode = "auto"`), 1 rejected w/ evidence (module `pytestmark` unnecessary; `test_ubi_reader.py` has 17 async tests + 0 markers). 5 unit + 13 wired + 2 restore integration; backend unit 2476. All 19 `pr.yml` checks green. `docs/05_quality/testing.md` note added. Finalization bundled the dashboard + public-roadmap regen (no extra PR).
- **2026-06-05** — `chore_ubi_reader_search_after_pagination` (PR #474, squash-merged `d9afbce`). **Exact full-traffic UBI aggregation via cursor pagination (`scan_all`).** Replaces `UbiReader`'s single-page `search_batch` (a silent 10k-row sample) with a `scan_all` loop walking the full `ubi_events`/`ubi_queries` stream behind the `SearchAdapter` Protocol, so dense (>10k-event) clusters get exact judgment aggregation. **Backend + tests only, no migration** (head stays `0023`). 5 stories / 3 epics: (1.1) Protocol surface — `ScanPage(hits, cursor)` + `scan_all(target, body, *, page_size, cursor, fl, request_id)` + `close_scan(cursor, …)`, opaque round-tripped cursor; (2.1) `ElasticAdapter` ES+OpenSearch via PIT + `search_after` over `[{timestamp:asc},{_shard_doc:asc}]` — PIT id rotation into the cursor, engine-branched open/close endpoints + bodies (ES `/_pit` `{id}` / OpenSearch `/_search/point_in_time` `{pit_id:[…]}`, both unindexed-DELETE per the read-only invariant), narrow 405/501/400-unsupported fallback (configured `ubi_no_pit_tiebreaker_field` else sampled+WARN, never `_id` sort), adapter-owned pagination keys stripped before BOTH PIT and no-PIT construction (incl. `pit` per P5-A1), best-effort cleanup on exception (P3-A2) + terminal (P4-A3), cursor-before-fold (P1-B2); (2.2) `SolrAdapter` via `cursorMark` over POST `/select` form body (large `{!terms f=query_id}` fq → body not URL, P1-B1), uniqueKey-terminated sort, `start`/`rows`/`cursorMark`/`sort` stripped (P4-A2), `close_scan` no-op; (3.1) `UbiReader` loops `scan_all` per scan, folds incrementally, enforces an **exact** ceiling (`min(ES_MAX_RESULT_WINDOW, remaining)` per page + final-page slice), chunks `query_id` by count AND encoded byte length (`_chunk_query_ids` engine-aware fragment size), closes the cursor in `finally`, emits `ubi_reader_scan_truncated` only on genuine truncation; (3.2) 5 non-secret `Settings` ceiling fields injected by worker + dispatcher (readiness stays at defaults — probe-only, documented invariant). **Cross-model: GPT-5.5 unreachable in this env → Opus self-review substituted per `feat_fts_rank_ordering` precedent** — 0 High/1 Med/2 Low; F2 (truncation-WARN false-positive on terminal exact-fill) + F3 (`scanned` = inspected not kept rows) fixed; F1 (readiness ceiling injection) adjudicated as documented-invariant (provably dead — readiness only probes — + plan §3.2-resolved; injecting broke 6 tests). Gemini 3 Med: 2 accepted (`isinstance(error, dict)` guard on 4 sites + 2 regression tests), 1 rejected with cited counter-evidence (chunker "O(N²)" is O(N×max_count) bounded by `max_count`; the suggested byte approximation is numerically wrong — 27≠29-byte wrapper, missed the `, ` separator — so it would breach the hard byte ceiling). CI surfaced 2 unrelated failures, both fixed inline: the integration UBI stub mocked the old `search_batch` (added `scan_all`/`close_scan` mocks serving the same data); **also fixed a pre-existing ~50% flake in `feat_fts_rank_ordering`'s `test_no_q_does_not_rank`** — `_seed_clusters_rank` created both rows in one transaction so `created_at` (`func.now()` = transaction time) tied, falling to the random-UUID `id DESC` tiebreak; stamped explicit distinct timestamps. 18 ES + 15 Solr + 4 ScanPage + 33 reader (AC-5/6/9/12/13/14 + P1-B2 + truncation accounting) + 5 no-writes (PIT allowlist) unit; backend unit 2469. All 19 `pr.yml` checks green (smoke skipped — opt-in/off). Finalization bundled the dashboard + public-roadmap regen (no extra PR).
- **2026-06-05** — `feat_fts_rank_ordering` (PR #472, squash-merged `f970c05`). **Rank-order FTS results by `ts_rank` when `?q=` is active.** When a list request carries `?q=` and no explicit `?sort=`, the 6 searchable endpoints (clusters, studies, query_sets, query_templates, judgment_lists, conversations) now order by relevance — `floor(ts_rank*1e6) DESC, id DESC` — instead of `created_at`; `?sort=` overrides (rank is the implicit default, not a column sort-key). **Backend + small frontend, no migration** (head stays `0023`). Keyset stays exact: the cursor encodes the **same** int `rank_bucket` the ORDER BY uses, so the existing `parsed=None` 2-tuple keyset predicate is exact (rows share a bucket → tiebreak `id DESC`, UUIDv7 ≈ newest-first); the computed bucket is stashed as a transient `_fts_rank_bucket` for the router's next cursor (read via `rank_bucket_of`). Helpers `rank_bucket_expr`/`rank_active`/`rank_bucket_of`/`rows_with_rank` in `_fts.py` + an additive rank branch in 6 repos + 6 routers (only fires when `q` present & no sort → non-search listing byte-identical) + a "Sorted by relevance" `` pill + `fts.relevance_sort` glossary. **Cross-model: GPT-5.5 unreachable in this env (no key, egress 403) → Opus self-review substituted per operator decision** (spec/plan §13/§8). Gemini 6 findings (one per router) ALL accepted (`a48a6d3`): a stale datetime cursor reused on the rank path now 422s instead of 500ing on the int rank_bucket comparison. CI run-1 tripped ruff F841 (unused oracle local, added after the last local lint) — fixed inline. 10 unit (in-memory keyset oracle proving no-skip/no-dupe locally) + 11-case DB integration matrix (relevance order, cursor pagination exactness, explicit-sort override, no-q regression, tampered + stale-datetime cursor 422) + 4 vitest pill; full vitest 1243. All 19 `pr.yml` checks green. Finalization bundled the dashboard + public-roadmap regen (no extra PR).
- **2026-06-05** — `bug_judgment_header_omits_click_bucket` (PR #470, squash-merged `66d1873`). **Judgment-list header renders the `click` (UBI) source bucket.** The detail header showed only `LLM / Human`, omitting the `click` bucket, so the displayed source terms didn't sum to the total (and the component's "renders all three buckets" doc-claim was false). **Frontend-only — no backend, no migration** (head stays `0023`). 2 stories / 1 epic: (1.1) render a third slash-joined term `source_breakdown.click`, relabel `LLM / Human / Clicks`, add a source-of-truth comment pointing at backend `_SourceBreakdown`, and add an `InfoTooltip` on the label reusing the existing `judgment.source.click` glossary key (FR-4 implemented, no new key); presentational-only, no fetch/types:gen; (1.2) extend the real-backend `ubi-source-filter.spec.ts` to assert `header-breakdown` shows the three-term breakdown on a pure-CTR list. 9 vitest cases (3 existing chip + 6 new) + the E2E; 1239 vitest green. Gemini 2 medium: 1 accepted (`570861d`: locale-robust E2E digit-parse over `toLocaleString()` comparison), 1 rejected-with-evidence (optional-chain the `click` field — it's non-optional in the TS `_SourceBreakdown` type, the sibling `llm`/`human` don't guard, and the Compose deploy is atomic, so the guard would mask a contract violation). Final GPT-5.5 skipped (frontend-only, ≤3 files). **Finalization ran the in-line dashboard + public-roadmap regen (per `f4b5ed0`) so the dashboards + relyloop.com roadmap stay fresh — bundled into the finalization PR, no 3rd PR.** All 19 `pr.yml` checks green.
- **2026-06-05** — `bug_baseline_phase_test_isolation` (PR #466, squash-merged `6298e77`). **Hermetic baseline-wait unit tests via lazy settings read.** `_compute_baseline_wait_s` (`backend/workers/orchestrator.py`) called `get_settings()` unconditionally, so the three explicit-timeout `TestComputeBaselineWaitS` cases only passed when an earlier test module had already seeded `DATABASE_URL_FILE`/`POSTGRES_PASSWORD_FILE` — they failed standalone (`3 failed, 1 passed`). **Backend test-only — no migration** (head stays `0023`). 2 stories / 1 epic: (1.1) defer the `get_settings()` read into the missing/falsy-`trial_timeout_s` branch so explicit-timeout callers never construct `Settings` (return values unchanged, falsy-fallback semantics preserved); (1.2) add an autouse `_settings_env_and_restore` fixture (seeds the secret env vars at `/dev/null` + clears the canonical `get_settings` lru_cache — not `orch.get_settings`, which `test_missing_trial_timeout_uses_settings_default` monkeypatches) so the module is hermetic regardless of collection order, plus a `test_explicit_timeout_does_not_read_settings` regression that fails on pre-fix code and passes post-fix. Standalone with secrets unset: 14 passed; full unit suite 2400 passed. No Gemini findings; final GPT-5.5 skipped (≤40 LOC, test-only). All 19 `pr.yml` checks green.
-- **2026-06-05** — `chore_cluster_detail_rung_badge` (PR #464, squash-merged `3e03ce7`). **Cluster-detail UBI readiness card with rung badge.** A new `ClusterDetailUbiReadinessCard` on `/clusters/[id]` (between the action bar and indices card) lets the operator see a cluster's UBI readiness rung without opening the generate-judgments dialog. **Frontend-only — no backend, no migration** (head stays `0023`). 8 stories / 1 epic, executed 8→1→…→7: (S8) shared `useUbiReadiness` gains `placeholderData: keepPreviousData` so the rung persists across `(query_set_id, target)` edits without a skeleton flash (no-op for the dialog consumer); (S1) card scaffold + `` on the title (reachable in every state); (S2) query-set picker (`limit=50`, `has_more` "Browse all" footer, empty-state Link, explicit **Clear** button since Radix `