diff --git a/.claude/skills/impl-execute/SKILL.md b/.claude/skills/impl-execute/SKILL.md index 83dc9c0b..a94a80d3 100644 --- a/.claude/skills/impl-execute/SKILL.md +++ b/.claude/skills/impl-execute/SKILL.md @@ -354,7 +354,7 @@ Agent({ 3. Commit with a descriptive message referencing the story: ```bash -git commit -m "$(cat <<'EOF' +git commit -s -m "$(cat <<'EOF' feat(): (Story X.Y) @@ -364,6 +364,12 @@ EOF )" ``` +The `-s` flag adds a `Signed-off-by:` trailer required by the DCO check +(`.github/workflows/dco.yml` + `scripts/check-dco-signoff.sh`). See +CONTRIBUTING.md → Developer Certificate of Origin (DCO). The local pre-commit +hook will reject the commit if the trailer is missing; the CI gate will fail +the PR if any commit is missing it. + ### Step 8: Update progress 1. Mark the story as complete in the implementation plan's execution tracker (`[x]`) @@ -628,7 +634,7 @@ make typecheck ./.venv/bin/ruff format --check backend/ # exactly what CI runs; must pass here too ``` -If `make fmt` produced any diff, commit it first (`git add -A && git commit -m "style: apply ruff format (pre-push)"`) before pushing — that keeps the gate honest on the pushed SHA. +If `make fmt` produced any diff, commit it first (`git add -A && git commit -s -m "style: apply ruff format (pre-push)"`) before pushing — that keeps the gate honest on the pushed SHA. The `-s` is required by the DCO gate (see Step 7). ```bash git push -u origin @@ -881,7 +887,7 @@ Send to GPT-5.5 with the full implementation plan. This catches cross-story issu 8. **Commit and push** the finalization changes: ```bash - git add && git commit -m "docs: move to implemented, update state.md" + git add && git commit -s -m "docs: move to implemented, update state.md" git push ``` diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS new file mode 100644 index 00000000..4eb24433 --- /dev/null +++ b/.github/CODEOWNERS @@ -0,0 +1,37 @@ +# CODEOWNERS — auto-request review when these paths change. 
+# +# See https://docs.github.com/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners +# +# Order matters: later rules override earlier ones. The last matching pattern +# wins. Patterns follow the same syntax as .gitignore. +# +# Source-of-truth note: when a maintainer joins or leaves, update both +# MAINTAINERS.md and this file in the same PR. Prefer team handles +# (e.g. @SoundMindsAI/relyloop-maintainers) over individuals once the team +# exists in the org — teams survive departures, individuals don't. + +# Default — everything not matched below. +* @SoundMindsAI + +# Governance / contributor surface — changes here have project-wide impact; +# the project lead is the only required reviewer at v0.1. +/LICENSE @SoundMindsAI +/NOTICE @SoundMindsAI +/GOVERNANCE.md @SoundMindsAI +/MAINTAINERS.md @SoundMindsAI +/SECURITY.md @SoundMindsAI +/CODE_OF_CONDUCT.md @SoundMindsAI +/CONTRIBUTING.md @SoundMindsAI +/CLAUDE.md @SoundMindsAI +/.github/ @SoundMindsAI + +# CI / release infra +/.github/workflows/ @SoundMindsAI +/Dockerfile @SoundMindsAI +/docker-compose.yml @SoundMindsAI +/Makefile @SoundMindsAI +/scripts/ @SoundMindsAI + +# Migrations — DB schema changes need explicit project-lead review. +/migrations/ @SoundMindsAI +/alembic.ini @SoundMindsAI diff --git a/.github/ISSUE_TEMPLATE/bug_report.yml b/.github/ISSUE_TEMPLATE/bug_report.yml new file mode 100644 index 00000000..c3839edf --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug_report.yml @@ -0,0 +1,112 @@ +name: Bug report +description: Something in RelyLoop doesn't behave the way the docs or spec say it should. +title: "bug: " +labels: ["bug", "needs-triage"] +body: + - type: markdown + attributes: + value: | + Thanks for taking the time to file a report. The more concrete the + repro, the faster we can act. + + **Security vulnerability?** Stop. Do not file a public issue. 
Follow + [SECURITY.md](https://github.com/SoundMindsAI/relyloop/blob/main/SECURITY.md) for private reporting. + + - type: textarea + id: what-happened + attributes: + label: What happened + description: A clear, factual description of the failure — what you did, what you saw. + placeholder: | + Ran `make seed-clusters`, opened `/chat`, asked for a relevance study, + got HTTP 500 from `/api/v1/studies` instead of the queued response. + validations: + required: true + + - type: textarea + id: expected + attributes: + label: What you expected + description: What should have happened, based on the docs / tutorial / spec. + validations: + required: true + + - type: textarea + id: repro + attributes: + label: Reproduction + description: | + Step-by-step. The most useful repros run against a fresh `make up` + stack. If you can paste a `curl` command or a `pytest` test, even + better. + placeholder: | + 1. `make up && make migrate && make seed-clusters && make seed-es` + 2. `curl -X POST http://127.0.0.1:8000/api/v1/studies -H "Content-Type: application/json" -d '{"...": "..."}'` + 3. Response is 500 instead of 202. + validations: + required: true + + - type: input + id: version + attributes: + label: RelyLoop version + description: Output of `git rev-parse --short HEAD` or the release tag. + placeholder: "v0.1.0 (or 6ff9c211)" + validations: + required: true + + - type: dropdown + id: engine + attributes: + label: Search engine + description: Which engine were you exercising when this happened? + options: + - Elasticsearch (local-es from compose) + - OpenSearch (local-opensearch from compose) + - Elasticsearch (operator-managed, not from compose) + - OpenSearch (operator-managed, not from compose) + - Not engine-related (UI, infra, docs, etc.) + - Other (please describe in "What happened") + validations: + required: true + + - type: input + id: engine-version + attributes: + label: Engine version + description: e.g. ES 9.4.0, OpenSearch 2.18.0. 
Leave blank if not engine-related. + placeholder: "9.4.0" + + - type: dropdown + id: llm + attributes: + label: LLM endpoint + description: What was `OPENAI_BASE_URL` pointed at? + options: + - api.openai.com (default) + - Ollama (local) + - LM Studio (local) + - vLLM (self-hosted) + - HuggingFace TGI + - Azure OpenAI + - Other OpenAI-compatible endpoint + - LLM-irrelevant for this bug + validations: + required: true + + - type: textarea + id: logs + attributes: + label: Relevant logs + description: | + Output of `make logs` (or `docker compose logs api worker`) around + the failure. Scrub anything that looks like a secret before pasting. + render: shell + + - type: textarea + id: env + attributes: + label: Environment + description: OS, Docker version, anything else you think matters. + placeholder: | + macOS 15.5, Docker 27.3, 32 GB RAM diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml new file mode 100644 index 00000000..9e41b50d --- /dev/null +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -0,0 +1,11 @@ +blank_issues_enabled: false +contact_links: + - name: Security vulnerability + url: https://github.com/SoundMindsAI/relyloop/security/advisories/new + about: Report a security vulnerability privately. Do not file a public issue. + - name: Question or design discussion + url: https://github.com/SoundMindsAI/relyloop/discussions + about: Ask a usage question, propose a design, or start an open-ended thread. + - name: Documentation issue + url: https://github.com/SoundMindsAI/relyloop/issues/new?template=bug_report.yml&labels=bug,docs,needs-triage&title=docs%3A+%3Cshort+summary%3E + about: Found a typo or a docs claim that doesn't match the code? File a bug with the "docs" label. 
diff --git a/.github/ISSUE_TEMPLATE/feature_request.yml b/.github/ISSUE_TEMPLATE/feature_request.yml new file mode 100644 index 00000000..607c1c68 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/feature_request.yml @@ -0,0 +1,49 @@ +name: Feature request +description: Propose a new capability or a change to existing behavior. +title: "feat: " +labels: ["enhancement", "needs-triage"] +body: + - type: markdown + attributes: + value: | + Thanks for thinking about how to make RelyLoop better. Before filing, + please skim the [umbrella spec §4 (non-goals)](https://github.com/SoundMindsAI/relyloop/blob/main/docs/00_overview/relyloop-spec.md) + and the [release matrix](https://github.com/SoundMindsAI/relyloop/blob/main/docs/01_architecture/tech-stack.md) + to see whether the capability is already planned for a later release. + + - type: textarea + id: problem + attributes: + label: Problem + description: What problem are you trying to solve? Concrete situation, not abstract wish. + placeholder: | + I want to run a study against three judgment lists at once to see + which one disagrees least with the LLM-judge baseline, and right + now I have to launch and compare three separate studies by hand. + validations: + required: true + + - type: textarea + id: proposal + attributes: + label: Proposed solution + description: What change in RelyLoop would make this easier? Sketch the API / UI / CLI surface. + validations: + required: true + + - type: textarea + id: alternatives + attributes: + label: Alternatives considered + description: | + Other approaches you thought about and why you didn't pick them. + "I couldn't think of any" is a legitimate answer for small requests + but is a yellow flag for large ones. + + - type: textarea + id: scope + attributes: + label: Scope and willingness to contribute + description: | + Roughly how big do you think this is? Are you offering to send a PR, + or filing the request and hoping someone picks it up? 
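Reviewer note: the sign-off rule this change set introduces (the `Signed-off-by:` trailer that `git commit -s` adds) can be tried locally with a short sketch. The trailer pattern is the one the CI step in `.github/workflows/dco.yml` greps for. The function names `has_signoff` and `check_commit_msg` are hypothetical, and this is not the contents of `scripts/check-dco-signoff.sh`, which this diff does not show.

```bash
#!/usr/bin/env bash
# Hypothetical sketch only; NOT the project's scripts/check-dco-signoff.sh.

# Succeeds when the text on stdin contains a DCO trailer line.
# The pattern is copied from the grep in .github/workflows/dco.yml.
has_signoff() {
  grep -qE '^Signed-off-by: .+ <.+@.+>$'
}

# check_commit_msg FILE
# Returns 0 if FILE carries the trailer, 1 (with a hint on stderr) otherwise.
check_commit_msg() {
  if has_signoff < "$1"; then
    return 0
  fi
  echo "Missing Signed-off-by trailer; re-run with: git commit -s" >&2
  return 1
}
```

Git passes the path of the commit message file to a `commit-msg` hook as its first argument, so a hook built on this sketch could finish with `check_commit_msg "$1"`.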
diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md new file mode 100644 index 00000000..ea73013b --- /dev/null +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -0,0 +1,62 @@ + + +## Summary + + + +## Linked issues + + + +## Type of change + + + +- feat (new capability) +- fix (bug fix) +- docs (documentation only) +- chore (tooling, deps, repo hygiene) +- refactor (no behavior change) +- test (adding or updating tests) +- ci / infra (build, deploy, hooks) + +## Testing + + + +## Notes for reviewers + + + +## Checklist + +- [ ] Commits are signed off (`git commit -s`) and follow Conventional Commits. +- [ ] New behavior has tests at every layer it touches (unit / integration / contract / E2E). +- [ ] Docs updated (`README.md`, `CLAUDE.md`, `state.md`, `architecture.md`, runbooks under `docs/03_runbooks/`) where applicable. +- [ ] If this changes the spec, the spec was updated first (and the change is referenced above). diff --git a/.github/workflows/dco.yml b/.github/workflows/dco.yml new file mode 100644 index 00000000..7360ce0f --- /dev/null +++ b/.github/workflows/dco.yml @@ -0,0 +1,56 @@ +name: DCO + +on: + pull_request: + branches: [main] + +permissions: + contents: read + +jobs: + dco-check: + name: DCO sign-off check + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + with: + # Fetch full history so `git log ..HEAD` resolves. + fetch-depth: 0 + + - name: Verify Signed-off-by trailer on every PR commit + env: + BASE_REF: ${{ github.event.pull_request.base.ref }} + HEAD_SHA: ${{ github.event.pull_request.head.sha }} + run: | + set -euo pipefail + + BASE="origin/${BASE_REF}" + git fetch --quiet origin "${BASE_REF}" + + MISSING=() + while IFS= read -r sha; do + [[ -z "$sha" ]] && continue + # Skip merge commits — their parents already had the check. + if [[ $(git rev-list --parents -n 1 "$sha" | awk '{print NF-1}') -gt 1 ]]; then + continue + fi + body=$(git log -1 --format='%B' "$sha") + if ! 
grep -qE '^Signed-off-by: .+ <.+@.+>$' <<< "$body"; then + subject=$(git log -1 --format='%s' "$sha") + MISSING+=("${sha:0:12} ${subject}") + fi + done < <(git rev-list "${BASE}..${HEAD_SHA}") + + if [ "${#MISSING[@]}" -gt 0 ]; then + echo "::error::DCO violation. The following commits are missing a Signed-off-by trailer:" + printf ' %s\n' "${MISSING[@]}" + echo "" + echo "Fix locally with:" + echo " git rebase --signoff origin/${BASE_REF}" + echo " git push --force-with-lease" + echo "" + echo "See CONTRIBUTING.md → Developer Certificate of Origin (DCO)." + exit 1 + fi + + echo "DCO check passed: every commit in ${BASE}..HEAD has a Signed-off-by trailer." diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index dd75bdf7..bf9bf6e4 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -143,3 +143,14 @@ repos: entry: bash scripts/check-conventional-commit.sh language: system stages: [commit-msg] + + # --------------------------------------------------------------------------- + # DCO sign-off (commit-msg stage) — see CONTRIBUTING.md "Developer Certificate + # of Origin (DCO)". CI gate at .github/workflows/dco.yml enforces the same on + # every PR; this hook catches the miss locally. 
+ # --------------------------------------------------------------------------- + - id: dco-signoff + name: DCO Signed-off-by trailer check + entry: bash scripts/check-dco-signoff.sh + language: system + stages: [commit-msg] diff --git a/CLAUDE.md b/CLAUDE.md index 2975814f..12596f4c 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -101,7 +101,7 @@ scripts/ install.sh # auto-generates required + optional secrets, then docker compose up -d check-conventional-commit.sh # commit-msg pre-commit hook docs/ - 00_overview/ # umbrella spec (relevance-copilot-spec.md), implemented_features/_/ + 00_overview/ # umbrella spec (relyloop-spec.md), implemented_features/_/ 01_architecture/# topical arch docs: tech-stack, system-overview, data-model, deployment, api-conventions, adapters, llm-orchestration, optimization, ui-architecture, agent-tools, apply-path, mvp1-overview 02_product/ # mvp1-user-stories.md + planned_features// 03_runbooks/ # local-dev.md (and per-feature runbooks as features ship) diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md index 82ada004..d7bebb98 100644 --- a/CODE_OF_CONDUCT.md +++ b/CODE_OF_CONDUCT.md @@ -1,13 +1,18 @@ # Code of Conduct -This project adopts the [Contributor Covenant 2.1](https://www.contributor-covenant.org/version/2/1/code_of_conduct/) as its Code of Conduct. +A short ask, not a long list of rules. -All contributors, maintainers, and community members are expected to read and follow it. +We would like RelyLoop to be a project where people are kind to each other. +Search relevance work is hard, opinions vary, and the people you are +disagreeing with are usually doing their best. Please assume good faith, +keep critique focused on the work rather than the person, and remember that +text on a screen flattens a lot of nuance. -## Reporting +If something makes the project feel less welcoming — to you or to someone +else — and you think the maintainers should know, please email the contact +listed in [MAINTAINERS.md](MAINTAINERS.md). 
Reports are handled +confidentially. The maintainers may ask follow-up questions, ask a +participant to step back from a thread, or in serious cases remove +someone's ability to interact with the project. -To report a concern, please email the maintainers privately. The current maintainer contact is listed in `MAINTAINERS.md`. Reports are handled confidentially. - -## Enforcement - -The maintainers will review reports and respond as appropriate, following the enforcement guidelines in the Contributor Covenant. +That is the whole policy. diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index fb1b2b2c..95048a2d 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -4,11 +4,15 @@ Thanks for your interest in contributing! RelyLoop is an open-source project und This document explains how to set up a development environment, propose changes, and sign your commits under the Developer Certificate of Origin (DCO). -> **Status (alpha):** RelyLoop is pre-MVP1. We are not yet accepting external code contributions while the foundation is being built. Issues, design feedback, and discussions are welcome. Code-contribution guidelines below are forward-looking and will become active once MVP1 ships. +> **Status (alpha):** RelyLoop shipped MVP1 (`v0.1.0`) as alpha. APIs, schemas, and adapter contracts are still evolving — expect breaking changes between minor releases until v1.0 GA. Issues, design feedback, and pull requests are all welcome. ## Code of Conduct -This project adopts the [Contributor Covenant 2.1](CODE_OF_CONDUCT.md). All contributors are expected to follow it. Report any violations to the maintainers privately at the email listed in `CODE_OF_CONDUCT.md`. +A short kindness ask, not a long list of rules. See [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md). + +## Governance + +Decision-making, who has merge rights, and the path to becoming a maintainer are in [GOVERNANCE.md](GOVERNANCE.md). 
RelyLoop is currently single-vendor-stewarded (all maintainers are soundminds.ai employees); the transition plan toward multi-organization maintainership is in that document. ## Developer Certificate of Origin (DCO) @@ -26,7 +30,13 @@ To sign off a commit, add `Signed-off-by: Your Name ` to git commit -s -m "feat(adapter): add OpenSearch sigv4 auth" ``` -CI rejects PRs whose commits are not signed off. +A DCO check workflow (`.github/workflows/dco.yml`) runs on every PR and fails if any commit lacks a `Signed-off-by:` trailer. The same check runs locally as a `commit-msg` pre-commit hook (`scripts/check-dco-signoff.sh`) once you've run `pre-commit install --hook-type commit-msg`. If a commit is missing the trailer, backfill with `git rebase --signoff origin/main && git push --force-with-lease`. + +You can also make `git format-patch` add the trailer by default for this repo (this setting does not affect `git commit`, which still needs `-s`): + +```bash +git config format.signoff true +``` ## Commit message format @@ -68,11 +78,11 @@ git clone https://github.com/SoundMindsAI/relyloop.git cd relyloop uv sync # install Python deps + create .venv pnpm --dir ui install # install frontend deps -make pre-commit-install # install Git hooks (Story 1.4) +make pre-commit-install # install pre-commit + commit-msg hooks make up # boot the Docker stack ``` -The full local-development guide ships with MVP1 — see [`docs/03_runbooks/local-dev.md`](docs/03_runbooks/local-dev.md) when `infra_foundation` lands. +The full local-development guide is [`docs/03_runbooks/local-dev.md`](docs/03_runbooks/local-dev.md). ## Pre-commit hooks @@ -120,35 +130,38 @@ Trunk-based development: 1. Fork the repo and create a feature branch from `main` 2. Make your changes with sign-off (`git commit -s`) -3. Run tests locally (`make test` once that target is in place) -4. Push to your fork and open a PR against `main` -5. CI runs lint, type-check, unit tests, contract tests, security scans +3. Run tests locally (`make test`) +4. Push to your fork and open a PR against `main`. 
The PR template will prompt you for the right information. +5. CI runs lint, type-check, unit tests, contract tests, secret scanning, and frontend build 6. At least one maintainer review approval is required 7. Squash-merge by a maintainer once approved and CI is green ## Reporting issues -- **Bugs**: use the bug-report template in `.github/ISSUE_TEMPLATE/`. Include reproduction steps, environment details, and logs. -- **Feature requests**: use the feature-request template. Explain the use case and why existing functionality doesn't cover it. -- **Security vulnerabilities**: do **not** open a public issue. Follow the process in `SECURITY.md`. +- **Bugs**: use the bug-report template in [`.github/ISSUE_TEMPLATE/bug_report.yml`](.github/ISSUE_TEMPLATE/bug_report.yml). Include reproduction steps, environment details, and logs. +- **Feature requests**: use the feature-request template at [`.github/ISSUE_TEMPLATE/feature_request.yml`](.github/ISSUE_TEMPLATE/feature_request.yml). Explain the use case and why existing functionality doesn't cover it. +- **Security vulnerabilities**: do **not** open a public issue. Follow the process in [SECURITY.md](SECURITY.md). ## Adding a new adapter RelyLoop's engine, LLM provider, and Git provider adapters are designed for community extension. 
Each adapter: -- Implements the relevant Protocol in `backend/adapters/`, `backend/llm/`, or `backend/git/` -- Passes the conformance test suite in `tests/contracts/` -- Includes unit tests with `pytest-recording` cassettes -- Documents auth flow, version support, and any quirks in `docs/06_vendor_docs/adapters/.md` +- Implements the relevant Protocol in [`backend/app/adapters/`](backend/app/adapters/), [`backend/app/llm/`](backend/app/llm/), or [`backend/app/git/`](backend/app/git/) +- Passes the contract test suite in [`backend/tests/contract/`](backend/tests/contract/) +- Includes unit tests under [`backend/tests/unit/`](backend/tests/unit/) (use `pytest-recording` cassettes when exercising real HTTP) +- Documents auth flow, version support, and any quirks under [`docs/06_vendor_docs/`](docs/06_vendor_docs/) -See the spec (`docs/00_overview/product/relevance-copilot-spec.md` §8 for engine adapters, §15 for LLM providers, §16 for Git providers) for the full contracts. +See the spec ([`docs/00_overview/relyloop-spec.md`](docs/00_overview/relyloop-spec.md) §8 for engine adapters, §15 for LLM providers, §16 for Git providers) and the architecture-level adapters doc ([`docs/01_architecture/adapters.md`](docs/01_architecture/adapters.md)) for the full contracts. -## Maintainers +## Maintainers and governance -See `MAINTAINERS.md` for the current maintainer list and their areas of focus. +- Current maintainer roster: [MAINTAINERS.md](MAINTAINERS.md) +- How decisions are made and how to become a maintainer: [GOVERNANCE.md](GOVERNANCE.md) ## Questions For questions about the project direction, roadmap, or design choices, open a GitHub Discussion (once enabled). For specific implementation questions, open an issue. +For casual outreach or design conversations the project lead is also reachable on [X](https://x.com/Starrman777) and [LinkedIn](https://www.linkedin.com/in/starrman/) — see [MAINTAINERS.md](MAINTAINERS.md) for the canonical contact list. 
+ Thank you for contributing. diff --git a/GOVERNANCE.md b/GOVERNANCE.md new file mode 100644 index 00000000..bc25b23c --- /dev/null +++ b/GOVERNANCE.md @@ -0,0 +1,112 @@ +# Governance + +This document describes who decides what in RelyLoop and how the project +intends to evolve. + +## Current state — single-vendor stewardship + +RelyLoop is at v0.1 (MVP1 alpha). **All maintainers are soundminds.ai +employees**, and final merge authority on `main` rests with the project +lead. We are stating this openly so that prospective contributors and +enterprise reviewers can size up the bus factor and capture risk +honestly. + +We chose this model because the project is still in the foundation-laying +phase: schemas, adapter contracts, and APIs are changing fast, and a small +group of people with full context can move much faster than a committee. +The model is intentionally temporary — see "Transition plan" below. + +## Project scope + +RelyLoop is an open-source tool for tuning query-time search relevance on +Elasticsearch, OpenSearch, and (in later releases) Lucidworks Fusion. The +authoritative scope statement is the [umbrella spec](docs/00_overview/relyloop-spec.md), +particularly §4 (non-goals). Proposals that materially expand scope +(new engine families, online A/B testing, LTR training, sitting on the live +search-serving path) are decided by the maintainers and the project lead. + +## Roles + +- **Contributor.** Anyone who opens an issue, comments on a discussion, or + submits a pull request. No special status is required; see + [CONTRIBUTING.md](CONTRIBUTING.md) for how to get started. +- **Maintainer.** Has commit rights, reviews and merges pull requests, and + triages issues. Current maintainers are listed in + [MAINTAINERS.md](MAINTAINERS.md). Maintainers are expected to follow the + contribution norms in [CONTRIBUTING.md](CONTRIBUTING.md) — Conventional + Commits, DCO sign-off, no force-push to `main`. 
+- **Project lead.** Holds final say on direction, scope, releases, and the + maintainer roster. At v0.1 this role is held by Eric Starr + (soundminds.ai). + +## How decisions are made + +Day-to-day technical decisions use **lazy consensus**: if a maintainer +opens a PR or proposal and no other maintainer objects within a reasonable +review window (typically a few business days), it ships. Substantive +disagreements are resolved through discussion in the PR or issue thread; +where consensus cannot be reached, the project lead decides. + +Decisions that change the public API surface, alter the umbrella spec's +non-goals, drop a supported engine or LLM provider, or change the +licensing or governance posture require an explicit `+1` from at least two +maintainers, including the project lead. + +**Single-maintainer transitional rule (v0.1):** while there is only one +maintainer, the two-maintainer `+1` requirement is suspended — the project +lead's decision stands, recorded in the PR or issue thread. The rule +activates automatically the moment a second maintainer is added. This is +called out openly in [MAINTAINERS.md](MAINTAINERS.md) so contributors can +size the governance state honestly. + +## How to become a maintainer + +There is no time-served quota. The realistic bar is: + +1. A track record of merged PRs across at least two subsystems + (backend / frontend / adapters / docs / infra). +2. Demonstrated judgement in code review — catching design issues, not + just style nits. +3. Willingness to take review and triage shifts, not just ship features. + +The nomination flow: an existing maintainer opens an issue proposing the +addition; other maintainers `+1` or raise objections; the project lead +makes the final call. While there is only one maintainer, the project +lead adds the first additional maintainer unilaterally based on the +criteria above — the `+1` flow activates once N≥2. 
We will document the +first community maintainer addition publicly so that the path is visible. + +## Transition plan + +We intend to move from single-vendor stewardship toward a multi-organization +maintainer model over **12–24 months** from MVP1's first stable release. +The umbrella spec discusses this in §29 ("OSS positioning & governance"). +Concrete milestones we will report on: + +- **First external maintainer added** (target: within the first 12 months + of community contribution). +- **Maintainer roster spans at least two organizations** (target: before + v1.0 GA). +- **Governance amendment process delegated to the maintainers** (target: + at v1.0 GA — i.e., the project lead's veto on this document goes away). + +We will publish progress against these milestones in release notes. + +## Conflict resolution + +If a disagreement cannot be resolved through discussion in a PR or issue: + +1. Move the conversation to a dedicated issue summarizing the positions. +2. Tag the project lead for a decision. The lead may request input from + the broader maintainer group before deciding. +3. The decision and its rationale are recorded in the issue and, if + architecturally material, in `docs/01_architecture/`. + +Conduct concerns (as distinct from technical disagreements) follow the +process in [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md). + +## Amending this document + +Changes to GOVERNANCE.md require a pull request approved by the project +lead. We will move to a maintainer-vote amendment process at v1.0 GA per +the transition plan above. diff --git a/MAINTAINERS.md b/MAINTAINERS.md new file mode 100644 index 00000000..3a0f43ef --- /dev/null +++ b/MAINTAINERS.md @@ -0,0 +1,78 @@ +# Maintainers + +This file lists the people with commit and merge rights on the `main` +branch of this repository. + +Last updated: 2026-05-27. 
+ +| Name | GitHub | Contact | Role | Affiliation | Areas | +|---|---|---|---|---|---| +| Eric Starr | [@SoundMindsAI](https://github.com/SoundMindsAI) | `eric.starr@soundminds.ai` · [@Starrman777 on X](https://x.com/Starrman777) · [LinkedIn](https://www.linkedin.com/in/starrman/) | Project lead, maintainer | soundminds.ai | All subsystems | + +**Note on contact channels:** the email and socials above are for casual outreach, design conversations, and "is RelyLoop right for my team?" questions. They are **not** for security disclosures (use [SECURITY.md](SECURITY.md)) or bug reports (use [GitHub Issues](https://github.com/SoundMindsAI/relyloop/issues)). + +At v0.1, the project has a single active maintainer, employed by +soundminds.ai. This is stated openly so that contributors and downstream +users can size the bus factor honestly. Operationally this means the +project lead self-merges their own PRs after CI is green; the governance +rules that would normally require multi-maintainer `+1` quorum are +suspended under the transitional rule documented in +[GOVERNANCE.md](GOVERNANCE.md#how-decisions-are-made). The plan to grow +the maintainer set across organizations is in +[GOVERNANCE.md](GOVERNANCE.md) ("Transition plan"). + +## Responsibilities + +Maintainers are expected to: + +- Review pull requests in their areas of ownership within a few business + days (US Eastern Time). +- Triage incoming issues — label, ask for missing info, route to the right + template, close as appropriate. +- Follow the contribution norms in [CONTRIBUTING.md](CONTRIBUTING.md): + Conventional Commits, DCO sign-off on every commit, no force-push to + `main`, squash-merge. +- Adhere to the decision-making rules in [GOVERNANCE.md](GOVERNANCE.md): + lazy consensus for routine work, explicit `+1`s for substantive + changes. +- Handle security reports per [SECURITY.md](SECURITY.md) when assigned. 
+ +## Becoming a maintainer + +See [GOVERNANCE.md](GOVERNANCE.md#how-to-become-a-maintainer) for the +nomination flow and the realistic bar. + +## Branch protection (operator setup notes) + +These notes apply to whoever administers the GitHub repository. They are +intentionally kept in the repo (not in a maintainer's head) so the setup +survives a maintainer transition. + +**At v0.1 with N=1 maintainer**, the safe branch-protection configuration +on `main` is: + +- ✅ Require a pull request before merging. +- ✅ Require status checks to pass before merging — mark these as required: + `DCO sign-off check`, `secrets-defense`, and the `pr.yml` job names + (backend, frontend, docker-build). +- ✅ Require conversation resolution before merging. +- ✅ Require linear history (matches the squash-merge convention). +- ❌ Do **not** "Require approvals" (≥1). GitHub forbids a user from + approving their own PR; with N=1 maintainer this toggle deadlocks the + project lead's ability to merge their own work. +- ❌ Do **not** "Require review from Code Owners" while N=1, for the same + reason — CODEOWNERS routes everything to the lone maintainer. +- ❌ Do **not** "Include administrators" while N=1 with the above + approval/CODEOWNERS toggles enabled, otherwise the lead cannot bypass + their own deadlock. + +**When N≥2**, flip the three ❌ toggles to ✅ in the same PR that adds the +second maintainer, and update this section. + +The matching configuration in the operator's own checkout (DCO sign-off +hook locally, etc.) is in [CONTRIBUTING.md](CONTRIBUTING.md). + +## Emeritus + +None yet. When a maintainer steps back, we will move their row to an +**Emeritus** section below and update CODEOWNERS in the same PR. diff --git a/README.md b/README.md index a2d66c10..4e62f562 100644 --- a/README.md +++ b/README.md @@ -57,10 +57,30 @@ do not duplicate here, the matrix is the source of truth. See spec §4 (non-goals) for the full set. 
+## How RelyLoop fits with other relevance tools + +RelyLoop is not the first tool in the search-relevance space and does not try +to replace the tools already there. It sits alongside Quepid (interactive +workbench), the OpenSearch Relevance Agent (in-cluster automated tuning for +OpenSearch-only shops), Chorus (reference integration stack), SMUI + Querqy +(query-rewriting rules), Elasticsearch / Solr LTR (reranker model training), +OpenSearch UBI (real user signals), and the rest of the open-source relevance +ecosystem. + +The slice RelyLoop owns is **autonomous, engine-agnostic, Git-PR-mediated +query-time parameter tuning** — useful when you operate Elasticsearch or both +ES + OpenSearch, want production config changes to flow through a Pull +Request reviewed by named approvers, run multiple clusters / environments, +or eventually want one tool that spans engines. + +The full breakdown — honest assessment of where each adjacent tool fits, +where RelyLoop fits, and the pairing patterns we recommend — is in +[`docs/00_overview/adjacent-tools.md`](docs/00_overview/adjacent-tools.md). + ## Links - Tutorial: [`docs/08_guides/tutorial-first-study.md`](docs/08_guides/tutorial-first-study.md) -- Umbrella spec: [`docs/00_overview/product/relevance-copilot-spec.md`](docs/00_overview/product/relevance-copilot-spec.md) +- Umbrella spec: [`docs/00_overview/relyloop-spec.md`](docs/00_overview/relyloop-spec.md) - Architecture index: [`docs/01_architecture/`](docs/01_architecture/) - Local-dev runbook: [`docs/03_runbooks/local-dev.md`](docs/03_runbooks/local-dev.md) - Release checklist (maintainers): [`docs/03_runbooks/release-checklist.md`](docs/03_runbooks/release-checklist.md) @@ -68,12 +88,27 @@ See spec §4 (non-goals) for the full set. ## License -Apache License 2.0 — see [LICENSE](LICENSE). +Apache License 2.0 — see [LICENSE](LICENSE) and [NOTICE](NOTICE). ## Contributing -See [CONTRIBUTING.md](CONTRIBUTING.md). 
Contributions use the Developer Certificate of Origin (DCO) — sign your commits with `git commit -s`. +See [CONTRIBUTING.md](CONTRIBUTING.md) for dev setup, branching, and PR conventions. Contributions use the Developer Certificate of Origin (DCO) — sign your commits with `git commit -s`. Be kind ([CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md)). + +## Security + +Vulnerabilities go through [SECURITY.md](SECURITY.md), not public issues. + +## Governance and maintainers + +- Current maintainers: [MAINTAINERS.md](MAINTAINERS.md). At v0.1, all maintainers are soundminds.ai employees — stated openly so the bus factor is visible. +- How decisions are made + the plan to grow the maintainer set across organizations: [GOVERNANCE.md](GOVERNANCE.md). The transition target is 12–24 months from MVP1's first stable release. + +## Reaching out + +For casual outreach, design conversations, or "is RelyLoop right for my team?" questions, the project lead is reachable at: -## Maintainers +- Email: `eric.starr@soundminds.ai` +- X: [@Starrman777](https://x.com/Starrman777) +- LinkedIn: [linkedin.com/in/starrman](https://www.linkedin.com/in/starrman/) -soundminds.ai is the initial maintainer. The project plans to transition toward community maintainership over 12–24 months. See spec §29 *OSS positioning & governance*. +For bug reports use [GitHub Issues](https://github.com/SoundMindsAI/relyloop/issues); for security vulnerabilities use [SECURITY.md](SECURITY.md); for design discussions in the open use [GitHub Discussions](https://github.com/SoundMindsAI/relyloop/discussions). diff --git a/SECURITY.md b/SECURITY.md new file mode 100644 index 00000000..e72954f8 --- /dev/null +++ b/SECURITY.md @@ -0,0 +1,86 @@ +# Security Policy + +We take security seriously. If you believe you have found a vulnerability in +RelyLoop, please report it privately so we can investigate and ship a fix +before details become public. + +## Supported versions + +RelyLoop is pre-1.0 alpha software. 
Only the latest minor release receives +security fixes. Older versions are not patched — upgrade to the latest +release to pick up security work. + +| Version | Supported | +|---|---| +| `v0.1.x` (MVP1) | yes | +| `< v0.1.0` | no | + +## Reporting a vulnerability + +**Preferred:** use GitHub's [private vulnerability reporting](https://docs.github.com/code-security/security-advisories/guidance-on-reporting-and-writing/privately-reporting-a-security-vulnerability) +on this repository — click **Security** → **Report a vulnerability**. The +channel is end-to-end private to the project maintainers. + +**Backup:** email `security@soundminds.ai`. If you need an encrypted reply, +say so in your first message and we will arrange a PGP-encrypted thread. + +Please include: + +- A description of the vulnerability and its impact. +- A reproduction (minimal failing input, sequence of API calls, or steps in + the UI). Reproductions against a local `make up` stack are easiest for us + to act on. +- The version (`git rev-parse HEAD` or the release tag) you tested. +- Whether you would like credit in the public advisory. + +Please **do not** open a public GitHub issue, draft PR, or Discussion thread +for an unpatched vulnerability. + +## What to expect + +- We will acknowledge your report within **3 business days**. +- We will share an initial assessment (accept / decline with reasoning) within + **10 business days**. +- We aim to ship a fix within **90 days** of acknowledgement. Complex issues + may need longer; if so, we will share a written timeline with you. +- We will coordinate the public-disclosure date with you. By default we + publish a [GitHub Security Advisory](https://docs.github.com/code-security/security-advisories/working-with-repository-security-advisories/about-repository-security-advisories) + with credit to the reporter, request a CVE, and tag the patched release. + +The project is currently maintained by soundminds.ai employees. There is no +24/7 PSIRT. 
Reports are triaged during the maintainers' working hours +(US Eastern Time business days). + +## Scope + +**In scope:** + +- The `relyloop` codebase in this repository. +- Container images published under `relyloop/api` and any other images this + repository ships. +- The published [`docs/03_runbooks/`](docs/03_runbooks/) operational guidance, + if it materially misleads operators into an insecure configuration. + +**Out of scope:** + +- Operator-deployed clusters (Elasticsearch, OpenSearch, Postgres, Redis) — + report those to the upstream projects. +- Third-party LLM providers reachable via the `OPENAI_BASE_URL` setting — + report those to the provider. +- The operator's own Git provider account, config repo, or CI — these are + the operator's surface, not RelyLoop's. +- Vulnerabilities that require an attacker to already have a credential or + shell on the host (e.g., "if I can `docker exec` into the api container, I + can read the database password"). The threat model assumes the host is + trusted. +- Denial-of-service via expensive search-space or trial-count parameters — + these are operator-tunable limits, not security boundaries in MVP1. + +If you are not sure whether something is in scope, report it anyway and we +will route it appropriately. + +## Hardening guidance for operators + +See [`docs/04_security/`](docs/04_security/) for the operator-facing security +documentation, including secrets handling, GitHub token rotation, and the +LLM data-flow summary. diff --git a/docs/00_overview/DASHBOARD.md b/docs/00_overview/DASHBOARD.md index 3e5f6469..a2405108 100644 --- a/docs/00_overview/DASHBOARD.md +++ b/docs/00_overview/DASHBOARD.md @@ -6,7 +6,7 @@ _Top-level index across MVP1 → GA v1+ as of **2026-05-27**. 
Click a release na | Release | Theme | Progress | Status | |---|---|---|---| -| [MVP1 / v0.1](MVP1_DASHBOARD.md) | The Loop | 88 / 89 scoped done · 12 remaining | **In progress** | +| [MVP1 / v0.1](MVP1_DASHBOARD.md) | The Loop | 88 / 89 scoped done · 14 remaining | **In progress** | | [MVP1.5 / v0.1.5](MVP1_5_DASHBOARD.md) | Real Signals | 1 item(s) queued | **Held / queued** | | [MVP2 / v0.2](MVP2_DASHBOARD.md) | Observable | 1 / 1 scoped done · 1 remaining | **In progress** | | MVP3 / v0.3 | Production Stacks | — | **Not yet scoped** | diff --git a/docs/00_overview/MVP1_DASHBOARD.md b/docs/00_overview/MVP1_DASHBOARD.md index 97249189..fa7579c0 100644 --- a/docs/00_overview/MVP1_DASHBOARD.md +++ b/docs/00_overview/MVP1_DASHBOARD.md @@ -21,19 +21,19 @@ Implementation in progress — resume to finish | Metric | Value | |---|---| | Scoped items done | **88 / 89** (99%) — feat_/infra_/chore_/epic_ past idea stage | -| Pending work | **13** items (every not-done feat/infra/chore/bug across all priorities) | +| Pending work | **15** items (every not-done feat/infra/chore/bug across all priorities) | | → P0 — do next | **0** unblocking / paying daily cost | -| → P1 | **1** high-value, ready when P0 clears | +| → P1 | **3** high-value, ready when P0 clears | | → P2 (default) | 10 important to file, not blocking | | → Backlog | 2 captured for record, not planned | | Open bugs | 5 | -| Legacy "Path to MVP1" | 12 items — scoped-not-done + bugs + chore-ideas only (excludes feat/infra ideas) | +| Legacy "Path to MVP1" | 14 items — scoped-not-done + bugs + chore-ideas only (excludes feat/infra ideas) | | Backlog ideas | 1 idea-only feat/infra (not yet scoped into MVP1) | | In flight | 1 feature(s) actively shipping | ## Pipeline -### Done (118) +### Done (119) | Feature | Type | One-liner | Depends on | Status | |---|---|---|---|---| @@ -53,7 +53,7 @@ Implementation in progress — resume to finish | 
[feat_github_webhook](implemented_features/2026_05_12_feat_github_webhook/feature_spec.md) | Feature | GitHub posts to `POST /webhooks/github` with HMAC-SHA256 signature; the receiver verifies the signature, looks up the proposal by `pr_url`, updates `pr_state` and `pr_merged_at`. | `infra_foundation` `infra_adapter_elastic` `feat_github_pr_worker` | [PR #56](https://github.com/SoundMindsAI/relyloop/pull/56) merged 2026-05-12 | | [feat_home_demo_reseed_endpoint](implemented_features/2026_05_24_feat_home_demo_reseed_endpoint/feature_spec.md) | Feature | A dev-only `POST /api/v1/_test/demo/reseed` endpoint plus a "Reset to demo state" button inside `StartHereChecklist` that lets an operator wipe + re-seed the 4 demo scenarios from the browser. | — | [PR #228](https://github.com/SoundMindsAI/relyloop/pull/228) merged 2026-05-24 | | [feat_home_first_run_demo_nudge](implemented_features/2026_05_22_feat_home_first_run_demo_nudge/feature_spec.md) | Feature | An operator landing on a freshly-seeded stack sees an unambiguous banner above the dashboard's empty/populated content that names the present demo clusters, explains they ship with realistic queries + | — | [PR #188](https://github.com/SoundMindsAI/relyloop/pull/188) merged 2026-05-22 | -| [feat_index_document_browser](../02_product/planned_features/feat_index_document_browser/feature_spec.md) | Feature | A read-only document browser, reachable from two independent entry points (cluster detail + study detail), that lets operators see corpus shape, paginate documents, and inspect any single doc's `_sour | — | [PR #282](https://github.com/SoundMindsAI/relyloop/pull/282) | +| [feat_index_document_browser](implemented_features/2026_05_27_feat_index_document_browser/feature_spec.md) | Feature | A read-only document browser, reachable from two independent entry points (cluster detail + study detail), that lets operators see corpus shape, paginate documents, and inspect any single doc's `_sour | — | [PR 
#285](https://github.com/SoundMindsAI/relyloop/pull/285) merged 2026-05-27 | | [feat_judgments_periodic_resume_sweep](implemented_features/2026_05_14_feat_judgments_periodic_resume_sweep/feature_spec.md) | Feature | A new Arq cron job `resume_stuck_judgment_lists` ticks every `RELYLOOP_JUDGMENTS_RESUME_SWEEP_MINUTES` minutes (default 15), re-enqueues every `judgment_lists.status='generating'` row via deterministi | — | [PR #104](https://github.com/SoundMindsAI/relyloop/pull/104) merged 2026-05-12 | | [feat_llm_judgments](implemented_features/2026_05_11_feat_llm_judgments/feature_spec.md) | Feature | A relevance engineer selects a query set + cluster + target + rubric and the system runs the current template to fetch top-K hits per query, asks OpenAI to rate each (query, doc) on a 0–3 scale with r | `infra_foundation` `infra_adapter_elastic` `feat_study_lifecycle` | [PR #35](https://github.com/SoundMindsAI/relyloop/pull/35) merged 2026-05-11 | | [feat_orchestrator_zero_streak_abort](implemented_features/2026_05_22_feat_orchestrator_zero_streak_abort/feature_spec.md) | Feature | Complete (PR #191, merged 2026-05-22 as squash `51ae4b3c`) | — | [PR #191](https://github.com/SoundMindsAI/relyloop/pull/191) merged 2026-05-22 | @@ -136,6 +136,7 @@ Implementation in progress — resume to finish | [bug_dashboard_reset_disclosure_gating_too_strict](implemented_features/2026_05_26_bug_dashboard_reset_disclosure_gating_too_strict/idea.md) | Bug | [`ui/src/components/dashboard/start-here-checklist.tsx:150-160`](../ui/src/components/dashboard/start-here-checklist.tsx#L150-L160): | — | Complete | | [bug_datatable_col_vis_density_localstorage_undefined_jsdom](implemented_features/2026_05_26_bug_datatable_col_vis_density_localstorage_undefined_jsdom/idea.md) | Bug | The first integration test in the file (`toggling a column off via the menu removes its cells and persists to localStorage`, line 148) accesses `window.localStorage` successfully. 
By the time the 3rd– | — | Complete | | [bug_demo_clusters_unreachable_in_healthz](implemented_features/2026_05_25_bug_demo_clusters_unreachable_in_healthz/feature_spec.md) | Bug | **After** the warmup task completes (typically within ~5 seconds of API startup, bounded by per-cluster `httpx` probe latency), `/healthz` reports the accurate `healthy` / `unreachable` aggregate for | — | [PR #236](https://github.com/SoundMindsAI/relyloop/pull/236) merged 2026-05-25 | +| [bug_demo_reseed_fake_metric_regression](implemented_features/2026_05_27_bug_demo_reseed_fake_metric_regression) | Bug | Complete | — | Complete | | [bug_digest_param_importance_seam](implemented_features/2026_05_13_bug_digest_param_importance_seam/idea.md) | Bug | The test fixture builds its own `RDBStorage` via `build_storage(...)`, constructs sampler/pruner with `seed=42`, and calls `tell()` against THAT handle. The worker independently calls `build_storage(. | — | Complete | | [bug_dockerfile_missing_prompts](implemented_features/2026_05_13_bug_dockerfile_missing_prompts/idea.md) | Bug | The `Dockerfile` at the repo root copies `backend/`, `migrations/`, `alembic.ini`, and `pyproject.toml` into `/app/` but does NOT copy `prompts/`. 
Any code that loads a file from `prompts/` at module- | — | Complete | | [bug_dockerfile_missing_scripts_dir](implemented_features/2026_05_24_bug_dockerfile_missing_scripts_dir/idea.md) | Bug | [`backend/app/services/demo_seeding.py:39`](../../backend/app/services/demo_seeding.py#L39) imports four constants from `scripts/seed_meaningful_demos.py`: | — | Complete | @@ -170,22 +171,24 @@ _None._ _None._ -### Idea (12) +### Idea (14) | # | Priority | Feature | Type | One-liner | Depends on | Status | |---|---|---|---|---|---|---| | 1 | P1 | [infra_smoke_job_chronic_flake](../02_product/planned_features/infra_smoke_job_chronic_flake/idea.md) | Infra | Recent `pr.yml` runs on `main` (newest first): | — | Idea — captured during feat_index_document_browser CI watch (PR #285) | -| 2 | P2 | [chore_e2e_api_base_url_construction](../02_product/planned_features/chore_e2e_api_base_url_construction/idea.md) | Chore | Five sites in three e2e specs concatenate `API_BASE` with a path string: | — | Idea — surfaced during Gemini Code Assist review on PR #273 (`chore_clone_narrow_bounds_full_roundtrip_e2e`). | -| 3 | P2 | [chore_state_md_size_compression](../02_product/planned_features/chore_state_md_size_compression/idea.md) | Chore | `state.md` is structured around two concerns conflated into one file: | — | Idea — tangential observation surfaced during `/impl-execute` for `infra_agent_sibling_worktree_isolation` (Phase 1, this PR). | -| 4 | P2 | [chore_studies_post_arq_spy_fixture](../02_product/planned_features/chore_studies_post_arq_spy_fixture/idea.md) | Chore | The studies POST handler at [`backend/app/api/v1/studies.py:307`](../../backend/app/api/v1/studies.py#L307) calls `await _enqueue_start_study(request, study_id)` after a successful create. 
The helper | — | Idea — surfaced during `feat_study_preflight_overlap_probe` (PR ___) phase-gate review | -| 5 | P2 | [chore_template_library_expansion](../02_product/planned_features/chore_template_library_expansion/idea.md) | Chore | Three connected gaps: | — | Idea — surfaced during a UX review of parameter-tuning ergonomics on 2026-05-19. | -| 6 | P2 | [bug_ceiling_badge_assumes_maximize_direction](../02_product/planned_features/bug_ceiling_badge_assumes_maximize_direction/idea.md) | Bug | The `CEILING` badge in [`studies-table.column-config.tsx:METRIC_CEILING_THRESHOLD`](../ui/src/components/studies/studies-table.column-config.tsx) flags rows where `best_metric >= 0.99`. The threshold | — | — | -| 7 | P2 | [bug_demo_reseed_fake_metric_regression](../02_product/planned_features/bug_demo_reseed_fake_metric_regression) | Bug | | — | — | -| 8 | P2 | [bug_smoke_studies_data_table_search_flake](../02_product/planned_features/bug_smoke_studies_data_table_search_flake/idea.md) | Bug | [`ui/tests/e2e/studies-data-table.spec.ts:20-40`](../../ui/tests/e2e/studies-data-table.spec.ts#L20-L40): | — | Idea — surfaced during PR #273 CI watch. | -| 9 | P2 | [bug_starlette_request_poisons_fastapi_depends_tests](../02_product/planned_features/bug_starlette_request_poisons_fastapi_depends_tests/idea.md) | Bug | There is shared state somewhere in starlette / FastAPI that is mutated by `Request(scope={"type": "http", ...})` and breaks subsequent `Depends` resolution. Possible suspects: | — | Idea — bug captured during feat_index_document_browser Story 2.1 | -| 10 | P2 | [bug_webhook_concurrent_merge_race_timing_sensitive](../02_product/planned_features/bug_webhook_concurrent_merge_race_timing_sensitive/idea.md) | Bug | Idea — surfaced during `bug_demo_clusters_unreachable_in_healthz` PR #236 CI. | — | Idea — surfaced during `bug_demo_clusters_unreachable_in_healthz` PR #236 CI. 
| -| 11 | Backlog | [chore_auto_followup_parent_advisory_lock](../02_product/planned_features/chore_auto_followup_parent_advisory_lock/idea.md) | Chore | The shipped `feat_auto_followup_studies` worker uses a two-layer idempotency scheme: | — | Idea — captured as a standalone file to resolve broken cross-references in `feat_auto_followup_studies` D-11 + plan F2 + `bug_auto_followup_completed_parent_stop_chain_race/idea.md`. The slug was coined 2026-05-24 in D-11 but only existed as descriptive prose across other documents until now. | -| 12 | Backlog | [chore_e2e_seed_acme_helper_dead](../02_product/planned_features/chore_e2e_seed_acme_helper_dead/idea.md) | Chore | `seedAcmeProductsChain` is a 140-line helper that constructs a cluster + query_set + template + judgment_list + study + optional proposal/digest chain "Acme Products" demo scenario. The function is co | — | Closed (2026-05-25) — superseded by guide-06 spec wiring (commit `2cbcb93b`, 2026-05-22). Real caller: `ui/tests/e2e/guides/06_create_and_monitor_study.spec.ts`. No further action beyond the coverage-audit refresh that ships in the same PR. | +| 2 | P1 | [chore_oss_public_launch_punchlist](../02_product/planned_features/chore_oss_public_launch_punchlist/idea.md) | Chore | The `chore_oss_launch_prep` PR adds the foundational governance / security / contributor files that prospective contributors and enterprise reviewers look for first. 
Three remaining items are gates on | — | Idea — captured during `chore_oss_launch_prep` (the PR that added SECURITY.md / GOVERNANCE.md / MAINTAINERS.md / CODEOWNERS / issue + PR templates and replaced the Code of Conduct) | +| 3 | P1 | [bug_demo_reseed_button_silent_enqueue_failure](../02_product/planned_features/bug_demo_reseed_button_silent_enqueue_failure/idea.md) | Bug | There is at least one untrapped exception path in `backend/workers/demo_reseed.py:run_demo_reseed`'s pre-main-body initialization that: | — | Idea — bug captured during PR #286 first-run testing | +| 4 | P2 | [chore_demo_seeding_integration_tests_rewrite](../02_product/planned_features/chore_demo_seeding_integration_tests_rewrite/idea.md) | Chore | The async flow's contract: | — | Idea — chore captured during PR #286 | +| 5 | P2 | [chore_e2e_api_base_url_construction](../02_product/planned_features/chore_e2e_api_base_url_construction/idea.md) | Chore | Five sites in three e2e specs concatenate `API_BASE` with a path string: | — | Idea — surfaced during Gemini Code Assist review on PR #273 (`chore_clone_narrow_bounds_full_roundtrip_e2e`). | +| 6 | P2 | [chore_state_md_size_compression](../02_product/planned_features/chore_state_md_size_compression/idea.md) | Chore | `state.md` is structured around two concerns conflated into one file: | — | Idea — tangential observation surfaced during `/impl-execute` for `infra_agent_sibling_worktree_isolation` (Phase 1, this PR). | +| 7 | P2 | [chore_studies_post_arq_spy_fixture](../02_product/planned_features/chore_studies_post_arq_spy_fixture/idea.md) | Chore | The studies POST handler at [`backend/app/api/v1/studies.py:307`](../../backend/app/api/v1/studies.py#L307) calls `await _enqueue_start_study(request, study_id)` after a successful create. 
The helper | — | Idea — surfaced during `feat_study_preflight_overlap_probe` (PR ___) phase-gate review | +| 8 | P2 | [chore_template_library_expansion](../02_product/planned_features/chore_template_library_expansion/idea.md) | Chore | Three connected gaps: | — | Idea — surfaced during a UX review of parameter-tuning ergonomics on 2026-05-19. | +| 9 | P2 | [bug_ceiling_badge_assumes_maximize_direction](../02_product/planned_features/bug_ceiling_badge_assumes_maximize_direction/idea.md) | Bug | The `CEILING` badge in [`studies-table.column-config.tsx:METRIC_CEILING_THRESHOLD`](../ui/src/components/studies/studies-table.column-config.tsx) flags rows where `best_metric >= 0.99`. The threshold | — | — | +| 10 | P2 | [bug_smoke_studies_data_table_search_flake](../02_product/planned_features/bug_smoke_studies_data_table_search_flake/idea.md) | Bug | [`ui/tests/e2e/studies-data-table.spec.ts:20-40`](../../ui/tests/e2e/studies-data-table.spec.ts#L20-L40): | — | Idea — surfaced during PR #273 CI watch. | +| 11 | P2 | [bug_starlette_request_poisons_fastapi_depends_tests](../02_product/planned_features/bug_starlette_request_poisons_fastapi_depends_tests/idea.md) | Bug | There is shared state somewhere in starlette / FastAPI that is mutated by `Request(scope={"type": "http", ...})` and breaks subsequent `Depends` resolution. Possible suspects: | — | Idea — bug captured during feat_index_document_browser Story 2.1 | +| 12 | P2 | [bug_webhook_concurrent_merge_race_timing_sensitive](../02_product/planned_features/bug_webhook_concurrent_merge_race_timing_sensitive/idea.md) | Bug | Idea — surfaced during `bug_demo_clusters_unreachable_in_healthz` PR #236 CI. | — | Idea — surfaced during `bug_demo_clusters_unreachable_in_healthz` PR #236 CI. 
| +| 13 | Backlog | [chore_auto_followup_parent_advisory_lock](../02_product/planned_features/chore_auto_followup_parent_advisory_lock/idea.md) | Chore | The shipped `feat_auto_followup_studies` worker uses a two-layer idempotency scheme: | — | Idea — captured as a standalone file to resolve broken cross-references in `feat_auto_followup_studies` D-11 + plan F2 + `bug_auto_followup_completed_parent_stop_chain_race/idea.md`. The slug was coined 2026-05-24 in D-11 but only existed as descriptive prose across other documents until now. | +| 14 | Backlog | [chore_e2e_seed_acme_helper_dead](../02_product/planned_features/chore_e2e_seed_acme_helper_dead/idea.md) | Chore | `seedAcmeProductsChain` is a 140-line helper that constructs a cluster + query_set + template + judgment_list + study + optional proposal/digest chain "Acme Products" demo scenario. The function is co | — | Closed (2026-05-25) — superseded by guide-06 spec wiring (commit `2cbcb93b`, 2026-05-22). Real caller: `ui/tests/e2e/guides/06_create_and_monitor_study.spec.ts`. No further action beyond the coverage-audit refresh that ships in the same PR. 
| ## Dependency graph @@ -198,8 +201,6 @@ graph LR classDef plan fill:#fef9c3,stroke:#854d0e,color:#854d0e; classDef spec fill:#dbeafe,stroke:#1e40af,color:#1e40af; classDef idea fill:#f1f5f9,stroke:#334155,color:#334155; - feat_index_document_browser["index document browser"] - class feat_index_document_browser done; infra_agent_sibling_worktree_isolation["agent sibling worktree isolation"] class infra_agent_sibling_worktree_isolation implement; infra_foundation["foundation"] @@ -376,6 +377,8 @@ graph LR class infra_dockerfile_invariant_smoke_in_ci done; infra_test_worktree_missing_integration_envs["test worktree missing integration envs"] class infra_test_worktree_missing_integration_envs done; + feat_index_document_browser["index document browser"] + class feat_index_document_browser done; feat_study_lifecycle --> feat_digest_proposal feat_llm_judgments --> feat_digest_proposal infra_foundation --> feat_llm_judgments diff --git a/docs/00_overview/README.md b/docs/00_overview/README.md index 9ddd9efe..28f42235 100644 --- a/docs/00_overview/README.md +++ b/docs/00_overview/README.md @@ -4,7 +4,10 @@ Use this section for high-level project context, repository orientation, and umb Current contents: -- `product/` — full product spec - - `product/relevance-copilot-spec.md` — full product and system specification +- [`relyloop-spec.md`](relyloop-spec.md) — the umbrella product and system specification (the authoritative scope statement) +- [`adjacent-tools.md`](adjacent-tools.md) — honest comparison to other tools in the relevance landscape (Quepid, OpenSearch Relevance Agent, Chorus, SMUI/Querqy, LTR plugins, etc.) 
and how RelyLoop fits alongside them +- `implemented_features/` — archived planning artifacts for features that have shipped +- `dashboard.html` / `DASHBOARD.md` / `mvp*_dashboard.html` / `MVP*_DASHBOARD.md` — auto-generated progress dashboards built from `planned_features/` + `implemented_features/` +- `dashboard_overrides/` — manual annotations layered on top of the auto-generated dashboards for features whose scope spans multiple folders For MVP1 decomposition (user stories + per-feature spec folders), see [`docs/02_product/`](../02_product/). diff --git a/docs/00_overview/adjacent-tools.md b/docs/00_overview/adjacent-tools.md new file mode 100644 index 00000000..c14d60f0 --- /dev/null +++ b/docs/00_overview/adjacent-tools.md @@ -0,0 +1,194 @@ +# Adjacent tools — how RelyLoop fits alongside the rest of the relevance landscape + +RelyLoop is not the first tool in the search-relevance space, and it does not +try to replace the tools that already exist there. It is deliberately scoped to +a specific gap — **autonomous, engine-agnostic, Git-PR-mediated query-time +parameter tuning** — and is designed to coexist with the tools listed below. + +This document explains, honestly, where each adjacent tool fits, where RelyLoop +fits, and how they can be used together. We would rather you pick the right +tool for your job than pick ours by default. 
+ +## At a glance + +| Tool | Primary role | Engines | Automated multi-trial optimization | Output channel | Pairs with RelyLoop how | +|---|---|---|---|---|---| +| [**OpenSearch Relevance Agent**](https://opensearch.org/blog/introducing-opensearch-relevance-agent-ai-powered-search-tuning/) (OpenSearch 3.6+, experimental) | Conversational tuning agent w/ hypothesis-driven experiments inside OpenSearch | OpenSearch only | Yes (query-DSL adjustments) | Proposals reviewed in OpenSearch Dashboards | **Direct overlap.** Choose this for single-cluster OpenSearch shops that don't need Git-PR workflow; choose RelyLoop if you need engine-agnosticism, PR-based change management, or multi-cluster/multi-tenant scope | +| [**Quepid**](https://quepid-docs.dev.o19s.com/2/quepid) (OpenSource Connections) | Interactive workbench for per-query exploration; human + AI judging | ES, OpenSearch, Solr, any HTTP-accessible engine | No (manual iteration) | Manual config edits, "Cases" + snapshots | **Strongly complementary.** Quepid is the microscope for individual queries; RelyLoop is the optimization sweep for the whole query set | +| [**OpenSearch Search Relevance Workbench**](https://docs.opensearch.org/latest/search-plugins/search-relevance/) | Built-in query-sets / judgments / experiments framework on OpenSearch | OpenSearch only | No (single-experiment runs) | Cluster-side artifacts | RelyLoop's MVP1.5 can read OpenSearch judgments and query sets directly; OpenSearch SRW is the source-side building block, RelyLoop is the optimization layer above it | +| [**Chorus**](https://github.com/querqy/chorus) (querqy / o19s) | Reference integration stack bundling Solr/ES + Quepid + SMUI + Querqy + monitoring | Solr + ES | No | n/a (it's a stack composition) | RelyLoop can be a member of a Chorus-like stack — Chorus provides the integrated workbench, RelyLoop provides the optimization loop | +| [**SMUI + Querqy**](https://querqy.org/docs/smui/) | Query-rewriting rules management (synonyms, 
boosts, filters) | Solr (native), ES (via Querqy port) | No | Rule files deployed to engine | **Different layer of the stack.** SMUI/Querqy rewrites queries *before* the engine sees them; RelyLoop tunes the parameters the engine itself uses. Both can run simultaneously | +| [**RRE (Rated Ranking Evaluator)**](https://github.com/SeaseLtd/rated-ranking-evaluator) (Sease) | Java/Maven offline evaluation library + Maven plugins | Solr, Elasticsearch | No | CI metrics, multi-version comparisons | RRE plays the role of the evaluation primitive in CI; RelyLoop uses `ir_measures` (Python) to play the same role inside its own loop. If your team already runs RRE for regression guards, keep it — RelyLoop owns the upstream "find good parameters" step | +| [**Elasticsearch Ranking Evaluation API**](https://www.elastic.co/docs/reference/elasticsearch/rest-apis/search-rank-eval) | ES-native endpoint that computes IR metrics from judgments | Elasticsearch only | No (single-metric request) | API response | A low-level primitive. RelyLoop's adapter could call it for in-engine metric computation; today RelyLoop computes metrics off-engine via `ir_measures` for engine-agnosticism | +| [**Elasticsearch LTR plugin / Solr LTR**](https://github.com/o19s/elasticsearch-learning-to-rank) | Learning-to-Rank reranker model training + serving | ES, Solr | n/a (LTR-specific) | Trained reranker model on cluster | **Downstream of RelyLoop.** RelyLoop tunes query-time params (BM25 stage); LTR layers a reranker on top. Tune the base first, train the reranker second. LTR is explicitly out of RelyLoop's v1 scope (spec §4 non-goal) | +| [**Splainer**](https://splainer.io/) (o19s) | Single-query `_explain` visualizer | Solr + ES | No | Diagnostic UI | RelyLoop is the telescope; Splainer is the microscope. 
Use Splainer when one query is broken; use RelyLoop when the whole template needs systematic improvement |
+| [**OpenSearch UBI plugin**](https://github.com/opensearch-project/user-behavior-insights) | Server-side click / event capture | OpenSearch (via plugin); the same UBI schema is portable to other engines | n/a (signals only, no tuning) | UBI tables (`ubi_queries`, `ubi_events`) | **Strongest pairing.** RelyLoop MVP1.5 ships a `UbiReader` and pluggable `SignalsConverter` that turn UBI events into judgments — UBI provides the trust anchor that pure LLM-as-judge can't |
+| [**Algolia, Coveo, Vespa Cloud, Elastic Cloud, etc.**](https://www.algolia.com/) (proprietary SaaS) | Hosted search engines with built-in relevance tooling | Their own engine | Varies; some include automated tuning | Vendor dashboard | **Different market.** These replace the engine itself. If you're on Algolia or Coveo, your relevance tuning is in their console — RelyLoop is not for you. RelyLoop is for shops that operate their own ES / OpenSearch / Solr / Fusion |
+
+## Where the overlap is, and why RelyLoop exists
+
+The closest tool to RelyLoop in 2026 is the **OpenSearch Relevance Agent**,
+introduced in OpenSearch 3.6 as an experimental release inside OpenSearch Agent
+Server. It overlaps with RelyLoop on every dimension that matters to the
+elevator pitch:
+
+- Conversational, LLM-driven tuning interface
+- Hypothesis generation by an LLM ("Hypothesis Generator Agent" on Bedrock)
+- Query Sets + Judgment Lists as primitives
+- End-to-end automated experimentation pipeline
+- Output is a human-reviewed proposal, not an auto-applied change
+- Metrics offloaded to deterministic tools (not LLM-computed)
+
+**Honest assessment:** if you operate OpenSearch only, want the simplest
+possible deployment, and don't need a Git-PR change-management workflow for
+production config, the OpenSearch Relevance Agent is likely the better choice
+for your shop. It's bundled with your OpenSearch stack, runs inside the
+cluster, integrates with OpenSearch Dashboards, and avoids the operational
+overhead of a second tool. The OpenSearch team are search experts shipping a
+search-experts' product; we're not going to claim our hosted-LLM-tuning loop
+is fundamentally better than theirs on OpenSearch's own turf.
+
+**RelyLoop is the better choice for your shop when one or more of these is
+true:**
+
+1. **You operate Elasticsearch** (or both Elasticsearch and OpenSearch, or
+   Lucidworks Fusion). The Relevance Agent only helps on OpenSearch. RelyLoop's
+   single adapter spans ES 8.11+/9.x and OpenSearch 2.x/3.x today, with
+   Lucidworks Fusion landing at MVP3 and pure Solr deferred to v2.
+
+2. **You require Git-as-source-of-truth for production search-config changes.**
+   RelyLoop opens Pull Requests against a central config repo where named
+   approvers review and merge them. The Relevance Agent applies changes
+   inside OpenSearch directly. RelyLoop's posture is appropriate when the
+   operating model says "production behavior is determined by what an approver
+   merges, not by what the tool decides."
+
+3. **You have multiple clusters and environments** (prod / staging / dev across
+   one or more product lines). RelyLoop's data model is built around running
+   studies against any registered cluster from one deployment, and its
+   `proposals` workflow is tied to a `config_repo` that can map onto your
+   real branch / environment topology.
+
+4. **You need multi-tenant isolation from the schema level** (MVP4 onward).
+   RelyLoop is built to serve many downstream customers from one deployment in
+   a shared-cluster, isolated-data posture.
+
+5. **You want provider-agnostic LLM choice.** RelyLoop targets OpenAI in MVP1
+   via an OpenAI-compatible endpoint that already works against Ollama, vLLM,
+   LM Studio, and HuggingFace TGI; at MVP4 it adds native Anthropic, AWS
+   Bedrock, Azure OpenAI, and Vertex providers. The Relevance Agent runs on
+   OpenSearch ML Commons connectors with its own provider list.
+
+6. **You want a longer-term path off OpenSearch-only architecture** — for
+   example, you're evaluating a future Lucidworks Fusion adoption, or you
+   want one relevance-tuning approach that survives an engine migration.
+
+These differences are deliberate. RelyLoop is not trying to be a better
+OpenSearch Relevance Agent on OpenSearch's home turf. It targets a different
+operating posture — one that prioritizes engine-agnosticism, Git-mediated
+change management, multi-cluster / multi-tenant scope, and provider-agnostic
+LLM use.
+
+## Pairing patterns
+
+The honest pitch is rarely "pick one." Most mature relevance stacks layer
+multiple tools.
+
+### RelyLoop + Quepid — automated sweep meets interactive workbench
+
+- **Quepid** for: investigating why a specific query is broken, gathering
+  subject-matter-expert ratings, hand-crafting judgments interactively, and
+  exploring "what if" hypotheses on individual cases.
+- **RelyLoop** for: running the overnight optimization sweep that finds
+  parameters that improve the whole query set, then opening the PR.
+- **Workflow:** start in Quepid to identify a relevance failure mode; export
+  the judgment list (or have RelyLoop generate one with LLM-as-judge); run
+  a RelyLoop study; review the proposal PR; if the digest surfaces a
+  genuinely puzzling sub-population, drop back into Quepid to investigate.
+
+### RelyLoop + OpenSearch UBI — the strongest single pairing
+
+- **UBI** captures real user search behavior (queries, clicks, dwell,
+  refinements) server-side via the OpenSearch UBI plugin.
+- **RelyLoop MVP1.5** ships a `UbiReader` (engine-agnostic; reads `ubi_queries`
+  + `ubi_events`) and a pluggable `SignalsConverter` Protocol with built-in
+  position-bias-corrected CTR, dwell-time, and hybrid UBI+LLM converters.
+- **Why it matters:** LLM-as-judge is fast and cheap but operators with real
+  traffic distrust it as the sole trust anchor. UBI gives you ground truth
+  derived from your actual users. RelyLoop is built to consume it as a
+  first-class judgment source.
+
+### RelyLoop + SMUI/Querqy — different layers of the same stack
+
+- **SMUI/Querqy** rewrites the user's query *before* the engine evaluates it
+  (synonyms, boosts, filters, spelling).
+- **RelyLoop** tunes the parameters the engine uses to evaluate the
+  (possibly rewritten) query (field weights, function-score, tie-breakers,
+  minimum-should-match).
+- **They are orthogonal.** A mature stack uses both — rewriting rules in
+  SMUI/Querqy, parameter tuning via RelyLoop. RelyLoop's adapter renders
+  query templates against whatever the engine actually sees, including
+  Querqy-rewritten queries.
+
+### RelyLoop + Elasticsearch/Solr LTR — base-tier first, reranker second
+
+- **RelyLoop** tunes BM25-stage parameters (the base retrieval).
+- **LTR plugins** train a reranker model that re-scores the top-K retrieved.
+- **Order matters.** Tune the base first; the reranker compounds on what the
+  base hands it. LTR is explicitly out of RelyLoop's v1 scope (spec §4
+  non-goal) — these are different problems.
+
+### RelyLoop + RRE — orchestration upstream, regression guard downstream
+
+- **RRE** is a Java/Maven library invoked from CI to catch ranking
+  regressions in committed code. It computes metrics against committed
+  judgment lists during build.
+- **RelyLoop** is the upstream optimization layer that *finds* the
+  parameters in the first place, then opens the PR that RRE's CI gate
+  will evaluate.
+- Both can read the same judgment-list format with light adaptation.
+
+### RelyLoop + Chorus — optimization loop joins the reference stack
+
+- **Chorus** is a reference stack for e-commerce search composition: Solr or
+  ES + SMUI + Querqy + Quepid + Keycloak + Prometheus + Grafana + Jaeger,
+  pre-wired.
+- **RelyLoop** can sit alongside as the automated tuning service that opens
+  PRs against the search-config repo Chorus's deployment uses.
+
+## Where RelyLoop deliberately does not compete
+
+RelyLoop explicitly is not, and will not become, a competitor to:
+
+- **Live search-serving runtimes.** RelyLoop never sits on the query-serving
+  path (spec §4 non-goal). The engine handles serving; RelyLoop handles
+  off-cluster experimentation.
+- **Online A/B-testing platforms.** RelyLoop evaluates offline against
+  judgment lists. Online A/B is a different operating model with different
+  guardrails.
+- **Production search-quality monitoring.** Streaming rolling-window metrics
+  and alerting on degradation belong to APM (DataDog, Grafana, Fusion's own
+  analytics) — not RelyLoop (spec §4 non-goal).
+- **Learning-to-Rank model training.** Out of scope for v1. LTR plugins
+  (Elastic LTR, Solr LTR) remain the right tool. RelyLoop tunes the
+  retrieval layer that LTR reranks.
+- **Query-rewriting rule management.** SMUI / Querqy own this layer.
+- **Hosted SaaS relevance platforms.** Algolia, Coveo, Vespa Cloud,
+  Elastic Cloud Enterprise are different products in a different market.
+  RelyLoop is for organizations that operate their own engine.
+
+The scope is deliberately small so that the slice RelyLoop owns — autonomous
+query-time parameter tuning, engine-agnostic, Git-PR-mediated — can be done
+well, and so that the rest of the relevance ecosystem stays useful around it.
+
+## A note on the OpenSearch Relevance Agent's roadmap
+
+The OpenSearch Relevance Agent's announced roadmap mentions interleaving
+tests, schema-evolution recommendations, and LTR automation. If those land
+inside OpenSearch, the OpenSearch Relevance Agent will expand into territory
+RelyLoop's v1 deliberately doesn't address (e.g., LTR; see spec §4 non-goal).
+That is good for OpenSearch operators and not a threat to RelyLoop's
+positioning — RelyLoop's pitch (engine-agnostic, Git-mediated, multi-cluster,
+multi-tenant) remains differentiated regardless of what the in-engine agent
+does on OpenSearch alone. We will keep this document updated as the
+landscape changes.
diff --git a/docs/00_overview/dashboard.html b/docs/00_overview/dashboard.html
index fff82db8..c14d4842 100644
--- a/docs/00_overview/dashboard.html
+++ b/docs/00_overview/dashboard.html
@@ -384,7 +384,7 @@

Releases

The Loop
-
88 / 89 scoped done · 12 remaining
+
88 / 89 scoped done · 14 remaining
In progress
diff --git a/docs/00_overview/implemented_features/2026_05_09_infra_foundation/feature_spec.md b/docs/00_overview/implemented_features/2026_05_09_infra_foundation/feature_spec.md
index f4a0a260..d5e12a93 100644
--- a/docs/00_overview/implemented_features/2026_05_09_infra_foundation/feature_spec.md
+++ b/docs/00_overview/implemented_features/2026_05_09_infra_foundation/feature_spec.md
@@ -10,7 +10,7 @@
 - [docs/01_architecture/tech-stack.md](../../../01_architecture/tech-stack.md) — stack choices this feature wires up
 - [docs/01_architecture/deployment.md](../../../01_architecture/deployment.md) — Compose layout this feature implements
 - [docs/01_architecture/api-conventions.md](../../../01_architecture/api-conventions.md) — conventions the `/healthz` endpoint follows
-- [docs/00_overview/product/relevance-copilot-spec.md](../../../00_overview/product/relevance-copilot-spec.md) §27 — MVP1 scope (umbrella)
+- [docs/00_overview/relyloop-spec.md](../../../00_overview/relyloop-spec.md) §27 — MVP1 scope (umbrella)
 ---
diff --git a/docs/00_overview/implemented_features/2026_05_09_infra_foundation/pipeline_status.md b/docs/00_overview/implemented_features/2026_05_09_infra_foundation/pipeline_status.md
index a8b8b851..79e95285 100644
--- a/docs/00_overview/implemented_features/2026_05_09_infra_foundation/pipeline_status.md
+++ b/docs/00_overview/implemented_features/2026_05_09_infra_foundation/pipeline_status.md
@@ -2,7 +2,7 @@
 ## Idea
 - Status: N/A — no `idea.md`; spec was authored directly. (Common for the bootstrap feature where the umbrella docs serve as the brief.)
-- Origin: [`docs/00_overview/product/relevance-copilot-spec.md` §27](../../../00_overview/product/relevance-copilot-spec.md) ("MVP1 / v0.1 — The Loop")
+- Origin: [`docs/00_overview/relyloop-spec.md` §27](../../../00_overview/relyloop-spec.md) ("MVP1 / v0.1 — The Loop")
 ## Spec
 - Status: Approved (merged to `main` via PR #2 on 2026-05-09)
diff --git a/docs/00_overview/implemented_features/2026_05_10_infra_adapter_elastic/feature_spec.md b/docs/00_overview/implemented_features/2026_05_10_infra_adapter_elastic/feature_spec.md
index 0fed0caf..ad93f860 100644
--- a/docs/00_overview/implemented_features/2026_05_10_infra_adapter_elastic/feature_spec.md
+++ b/docs/00_overview/implemented_features/2026_05_10_infra_adapter_elastic/feature_spec.md
@@ -9,7 +9,7 @@
 - [docs/01_architecture/data-model.md](../../../01_architecture/data-model.md) — `clusters` and `config_repos` tables (MVP1 shape)
 - [docs/01_architecture/api-conventions.md](../../../01_architecture/api-conventions.md) — endpoint conventions this feature follows
 - [docs/01_architecture/mvp1-overview.md](../../../01_architecture/mvp1-overview.md) — MVP1 architecture entry point
-- [docs/00_overview/product/relevance-copilot-spec.md](../../../00_overview/product/relevance-copilot-spec.md) §19 — agent tools (`get_schema`, `run_query` consumed in `feat_chat_agent`)
+- [docs/00_overview/relyloop-spec.md](../../../00_overview/relyloop-spec.md) §19 — agent tools (`get_schema`, `run_query` consumed in `feat_chat_agent`)
 - Depends on: [`infra_foundation/feature_spec.md`](../infra_foundation/feature_spec.md)
 ---
diff --git a/docs/00_overview/implemented_features/2026_05_12_chore_tutorial_polish/feature_spec.md b/docs/00_overview/implemented_features/2026_05_12_chore_tutorial_polish/feature_spec.md
index 2c2168c4..bd5ed3b3 100644
--- a/docs/00_overview/implemented_features/2026_05_12_chore_tutorial_polish/feature_spec.md
+++ b/docs/00_overview/implemented_features/2026_05_12_chore_tutorial_polish/feature_spec.md
@@ -5,7 +5,7
@@ **Owners:** Maintainer (writes tutorial + records demo + cuts release); peer reviewer for the smoke-test gate. **Related docs:** - [docs/02_product/mvp1-user-stories.md](../../mvp1-user-stories.md) — covers US-30, US-31, US-32 (current count is 32; US-32 added after this spec was first authored) -- [docs/00_overview/product/relevance-copilot-spec.md §27](../../../00_overview/product/relevance-copilot-spec.md) — MVP1 release definition +- [docs/00_overview/relyloop-spec.md §27](../../../00_overview/relyloop-spec.md) — MVP1 release definition - [docs/01_architecture/deployment.md](../../../01_architecture/deployment.md) - [docs/01_architecture/mvp1-overview.md](../../../01_architecture/mvp1-overview.md) - Depends on: ALL prior MVP1 features (this is the release-readiness step) diff --git a/docs/00_overview/implemented_features/2026_05_23_infra_ir_measures_migration/feature_spec.md b/docs/00_overview/implemented_features/2026_05_23_infra_ir_measures_migration/feature_spec.md index 866af6f8..1422dd36 100644 --- a/docs/00_overview/implemented_features/2026_05_23_infra_ir_measures_migration/feature_spec.md +++ b/docs/00_overview/implemented_features/2026_05_23_infra_ir_measures_migration/feature_spec.md @@ -46,7 +46,7 @@ Verified via `grep -rn 'pytrec_eval\|pytrec-eval'` on `main` HEAD (2026-05-22). | [`CLAUDE.md`](../../../../CLAUDE.md) | 15, 29 | Update both mentions. | | [`architecture.md`](../../../../architecture.md) | 131 | Update — "eval/ pytrec_eval scoring + Optuna runtime helpers". | | [`release-notes-v0.1.0-draft.md`](../../../../release-notes-v0.1.0-draft.md) | 12 | Update — release notes for the not-yet-tagged v0.1.0 / v0.2.0 cycle. | -| [`docs/00_overview/product/relevance-copilot-spec.md`](../../../00_overview/product/relevance-copilot-spec.md) | 12, 155, 688, 690, 692–693, 711, 2192, 2302, 2513, 2658, 2722 (~11 mentions) | Update — durable umbrella spec, NOT a historical artifact. 
Includes the "Engine: pytrec_eval everywhere" subsection (lines 688–693) that needs reframing as the provider-abstracted engine choice. | +| [`docs/00_overview/relyloop-spec.md`](../../../00_overview/relyloop-spec.md) | 12, 155, 688, 690, 692–693, 711, 2192, 2302, 2513, 2658, 2722 (~11 mentions) | Update — durable umbrella spec, NOT a historical artifact. Includes the "Engine: pytrec_eval everywhere" subsection (lines 688–693) that needs reframing as the provider-abstracted engine choice. | | [`docs/01_architecture/optimization.md`](../../../01_architecture/optimization.md) | 1 (title), 3, 15, 48, 50, 52–53, 69, 76, 87, 90, 176 (~10 mentions) | Update — **the canonical IR-evaluation architecture page**, including the code example block at lines 87–90. Title `# Optimization (Optuna + pytrec_eval)` becomes `# Optimization (Optuna + ir_measures)`. | | [`docs/01_architecture/tech-stack.md`](../../../01_architecture/tech-stack.md) | 41 | Update — IR evaluation row in the stack table. | | [`docs/01_architecture/system-overview.md`](../../../01_architecture/system-overview.md) | 76 | Update — component table row. | @@ -597,7 +597,7 @@ N/A — no event-driven surfaces. - `docs/01_architecture/README.md` — line 21 cross-reference updated. - `docs/01_architecture/data-model.md` — lines 52, 231 reworded. - `docs/01_architecture/cluster-lifecycle.md` — line 159 reworded. -- `docs/00_overview/product/relevance-copilot-spec.md` — umbrella spec; ~11 mentions reworded. The "Engine: pytrec_eval everywhere" subsection (lines 688–693) reframed as a provider-abstraction discussion. Stack table at line 155 + line 2513 + decision log at 2658 + appendix at 2722 all updated. +- `docs/00_overview/relyloop-spec.md` — umbrella spec; ~11 mentions reworded. The "Engine: pytrec_eval everywhere" subsection (lines 688–693) reframed as a provider-abstraction discussion. Stack table at line 155 + line 2513 + decision log at 2658 + appendix at 2722 all updated. 
- `docs/02_product/mvp1-user-stories.md` — US-7 narrative reworded. - `docs/02_product/planned_features/feat_study_baseline_trial/idea.md` — sibling-coordination: line 56 ("scores via `pytrec_eval`") reworded to `ir_measures`. Same-PR update. - `docs/02_product/planned_features/feat_auto_followup_studies/idea.md` — sibling-coordination: line 47 ("Optuna + pytrec_eval are deterministic") reworded to `Optuna + ir_measures`. Same-PR update. @@ -653,7 +653,7 @@ This feature is complete when: - [ ] The parity test (AC-2) passes for all 30 parametrized `(metric, k)` cases. - [ ] Per-query shape parity passes for the 4 edge-case queries (no-relevant, qrel-only, run-only, empty-overlap). - [ ] The existing-row read regression (AC-12) passes — pre-migration JSONB shapes hydrate confidence + trial-list + digest without raising. -- [ ] `docs/01_architecture/optimization.md` + `tech-stack.md` + `system-overview.md` + all docs in §15 are updated and merged in the same PR. The umbrella spec (`docs/00_overview/product/relevance-copilot-spec.md`)'s "Engine: pytrec_eval everywhere" subsection is reframed as a provider-abstraction discussion. +- [ ] `docs/01_architecture/optimization.md` + `tech-stack.md` + `system-overview.md` + all docs in §15 are updated and merged in the same PR. The umbrella spec (`docs/00_overview/relyloop-spec.md`)'s "Engine: pytrec_eval everywhere" subsection is reframed as a provider-abstraction discussion. - [ ] `state.md` has a new dated entry describing the migration (per AC-11). - [ ] `docs/00_overview/MVP1_DASHBOARD.md` regenerated via `scripts/build_mvp1_dashboard.py`. - [ ] Q1/Q2/Q3/Q4/Q5 resolutions are recorded in the decision log with cited verification output (per §19). 
diff --git a/docs/00_overview/implemented_features/2026_05_23_infra_ir_measures_migration/implementation_plan.md b/docs/00_overview/implemented_features/2026_05_23_infra_ir_measures_migration/implementation_plan.md index d173c895..1ee598df 100644 --- a/docs/00_overview/implemented_features/2026_05_23_infra_ir_measures_migration/implementation_plan.md +++ b/docs/00_overview/implemented_features/2026_05_23_infra_ir_measures_migration/implementation_plan.md @@ -649,7 +649,7 @@ None. | [`CLAUDE.md`](../../../../CLAUDE.md) | Lines 15 + 29 — both `pytrec_eval` mentions → `ir_measures`. | | [`architecture.md`](../../../../architecture.md) | Line 131 — `eval/ pytrec_eval scoring` → `eval/ ir_measures scoring`. | | [`release-notes-v0.1.0-draft.md`](../../../../release-notes-v0.1.0-draft.md) | Line 12 — stack table entry. | -| [`docs/00_overview/product/relevance-copilot-spec.md`](../../../00_overview/product/relevance-copilot-spec.md) | All 11 mentions (lines 12, 155, 688, 690, 692–693, 711, 2192, 2302, 2513, 2658, 2722). The "Engine: pytrec_eval everywhere" subsection (lines 688–693) is reframed as "Engine: provider-abstracted via `ir_measures`" with the reasons restated as: standard IR metric semantics across engines, per-query inspectability, cross-engine comparability (the old "de facto standard wrapper for trec_eval" framing becomes "provider abstraction means swapping backends is config, not rewrite"). | +| [`docs/00_overview/relyloop-spec.md`](../../../00_overview/relyloop-spec.md) | All 11 mentions (lines 12, 155, 688, 690, 692–693, 711, 2192, 2302, 2513, 2658, 2722). The "Engine: pytrec_eval everywhere" subsection (lines 688–693) is reframed as "Engine: provider-abstracted via `ir_measures`" with the reasons restated as: standard IR metric semantics across engines, per-query inspectability, cross-engine comparability (the old "de facto standard wrapper for trec_eval" framing becomes "provider abstraction means swapping backends is config, not rewrite"). 
| | [`docs/01_architecture/optimization.md`](../../../01_architecture/optimization.md) | All 10 mentions. Title `# Optimization (Optuna + pytrec_eval)` → `# Optimization (Optuna + ir_measures)`. Code-example block at lines 87–90 (`pytrec_eval.RelevanceEvaluator(qrels, {"ndcg_cut_10", "map", "P_10"}).evaluate(run)`) rewritten to: `import ir_measures` + `metrics = list(ir_measures.iter_calc([nDCG@10, AP, P@10], qrels, run))` plus a note that RelyLoop's `score()` re-keys back to user-facing tokens (`ndcg@10`, `map`, `precision@10`). | | [`docs/01_architecture/tech-stack.md`](../../../01_architecture/tech-stack.md) | Line 41 IR-evaluation row updated. | | [`docs/01_architecture/system-overview.md`](../../../01_architecture/system-overview.md) | Line 76 component table row updated. | @@ -678,7 +678,7 @@ None. 1. **Doc rewordings (bulk).** Open each file in the modified files table; apply the rewording. Use `sed -i` where the replacement is a literal token swap (`pytrec_eval` → `ir_measures`) but verify each file before bulk substitution — the umbrella spec and optimization.md have prose context that may need a sentence-level rewrite, not a token swap. -2. **Umbrella spec rewrite (subsection).** The "Engine: pytrec_eval everywhere" subsection at `docs/00_overview/product/relevance-copilot-spec.md:688–693` needs more than a token swap. Rewrite to **never name `pytrec_eval` in the live umbrella spec** (per cycle-1 F5 — FR-7's allowlist does not include the umbrella spec, and the provider-abstraction framing doesn't require naming the underlying backend): +2. **Umbrella spec rewrite (subsection).** The "Engine: pytrec_eval everywhere" subsection at `docs/00_overview/relyloop-spec.md:688–693` needs more than a token swap. 
Rewrite to **never name `pytrec_eval` in the live umbrella spec** (per cycle-1 F5 — FR-7's allowlist does not include the umbrella spec, and the provider-abstraction framing doesn't require naming the underlying backend): ```markdown ### Engine: provider-abstracted IR evaluation via `ir_measures` @@ -863,7 +863,7 @@ The migration adds 2 new test files and extends 3 existing assertions; existing - [x] `docs/02_product/mvp1-user-stories.md` — line 40. - [x] `docs/02_product/planned_features/feat_study_baseline_trial/idea.md` — sibling coordination, line 56. - [x] `docs/02_product/planned_features/feat_auto_followup_studies/idea.md` — sibling coordination, line 47. -- [x] `docs/00_overview/product/relevance-copilot-spec.md` — umbrella spec, 11 mentions including subsection rewrite. +- [x] `docs/00_overview/relyloop-spec.md` — umbrella spec, 11 mentions including subsection rewrite. ### 4.3 Runbooks diff --git a/docs/00_overview/implemented_features/2026_05_25_feat_study_clone_from_previous/feature_spec.md b/docs/00_overview/implemented_features/2026_05_25_feat_study_clone_from_previous/feature_spec.md index 163448ca..9adbe531 100644 --- a/docs/00_overview/implemented_features/2026_05_25_feat_study_clone_from_previous/feature_spec.md +++ b/docs/00_overview/implemented_features/2026_05_25_feat_study_clone_from_previous/feature_spec.md @@ -19,7 +19,7 @@ ## 1) Purpose -- **Problem:** A relevance engineer's iterative tuning loop after a study completes is "read digest → narrow params → re-run." Step 3 today means re-entering the cluster, target, query set, judgment list, template, search space (JSON paste), objective, and config from scratch in [`CreateStudyModal`](../../../../ui/src/components/studies/create-study-modal.tsx). ~2–5 minutes/iteration + invites JSON copy-paste errors. 
The umbrella spec ([`relevance-copilot-spec.md`](../../../00_overview/product/relevance-copilot-spec.md) §6) frames RelyLoop as an iterative loop; the create-study surface treats every study as green-field. +- **Problem:** A relevance engineer's iterative tuning loop after a study completes is "read digest → narrow params → re-run." Step 3 today means re-entering the cluster, target, query set, judgment list, template, search space (JSON paste), objective, and config from scratch in [`CreateStudyModal`](../../../../ui/src/components/studies/create-study-modal.tsx). ~2–5 minutes/iteration + invites JSON copy-paste errors. The umbrella spec ([`relyloop-spec.md`](../../../00_overview/relyloop-spec.md) §6) frames RelyLoop as an iterative loop; the create-study surface treats every study as green-field. - **Outcome:** A "Clone study" button on the study-detail page opens `CreateStudyModal` pre-filled with the source study's fields (cluster, target, query set, judgment list, template, search space, objective, config). The POST carries a new optional `parent_study_id` field; the server validates it and writes it into the existing `studies.parent_study_id` column. Lineage is preserved alongside `auto_followup`'s existing writes to the same column. Manual iteration time drops to "click → tweak the one field that changed → submit." diff --git a/docs/00_overview/implemented_features/2026_05_25_feat_study_clone_from_previous/idea.md b/docs/00_overview/implemented_features/2026_05_25_feat_study_clone_from_previous/idea.md index 1cb06ced..cc907ef2 100644 --- a/docs/00_overview/implemented_features/2026_05_25_feat_study_clone_from_previous/idea.md +++ b/docs/00_overview/implemented_features/2026_05_25_feat_study_clone_from_previous/idea.md @@ -20,7 +20,7 @@ A relevance engineer's normal manual follow-up workflow after a study completes: 2. Decide which params mattered, narrow their bounds, possibly switch objective metric, possibly extend trial budget. 3. Re-run. 
-Step 3 today means clicking "New study", picking the cluster + target + query set + judgment list + template + objective again, then pasting a hand-edited copy of the previous study's `search_space` JSON into Step 4. The umbrella spec ([`docs/00_overview/product/relevance-copilot-spec.md`](../../../00_overview/product/relevance-copilot-spec.md) §6, persona description) frames RelyLoop as an iterative loop; the create-study modal treats every study as a green-field configuration exercise. The mismatch costs ~2–5 minutes per iteration and invites copy-paste errors in the JSON. +Step 3 today means clicking "New study", picking the cluster + target + query set + judgment list + template + objective again, then pasting a hand-edited copy of the previous study's `search_space` JSON into Step 4. The umbrella spec ([`docs/00_overview/relyloop-spec.md`](../../../00_overview/relyloop-spec.md) §6, persona description) frames RelyLoop as an iterative loop; the create-study modal treats every study as a green-field configuration exercise. The mismatch costs ~2–5 minutes per iteration and invites copy-paste errors in the JSON. **Why this isn't already solved by the executable-followups work that shipped 2026-05-24:** `feat_digest_executable_followups` ships an LLM-prescribed "Run this followup" action on the **proposal-detail page** ([`feature_spec.md`](../../../00_overview/implemented_features/2026_05_24_feat_digest_executable_followups/feature_spec.md)). That path is for LLM-suggested narrow follow-ups with prefilled `search_space` — useful, but distinct from "the engineer wants to iterate on their own terms." The clone surface targets the manual-iteration path: same config + every field editable, anchored on the **study-detail page** (the source-of-truth view for "did the last study work?"). 
diff --git a/docs/00_overview/mvp1_dashboard.html b/docs/00_overview/mvp1_dashboard.html index 9c79b71c..8f130e40 100644 --- a/docs/00_overview/mvp1_dashboard.html +++ b/docs/00_overview/mvp1_dashboard.html @@ -403,7 +403,7 @@

MVP1 Progress

Pending work
-
13
+
15
every not-done feat/infra/chore/bug across all priorities
@@ -420,7 +420,7 @@

MVP1 Progress

P1
-
1
+
3
high-value, ready when P0 clears
@@ -435,7 +435,7 @@

MVP1 Progress

Legacy "Path to MVP1"
-
12
+
14
scoped not-done + bugs + chore-ideas only (excludes feat/infra ideas)
@@ -463,7 +463,7 @@

Pipeline

-

Idea 12

+

Idea 14

@@ -478,6 +478,45 @@

Idea 12

+
+ +
+ Chore + P1 + +
+
The `chore_oss_launch_prep` PR adds the foundational governance / security / contributor files that prospective contributors and enterprise reviewers look for first. Three remaining items are gates on
+ + +
+ + +
+ +
+ Bug + P1 + +
+
There is at least one untrapped exception path in `backend/workers/demo_reseed.py:run_demo_reseed`'s pre-main-body initialization that:
+ + +
+ + +
+ +
+ Chore + P2 + +
+
The async flow's contract:
+ + +
+ +
@@ -543,19 +582,6 @@

Idea 12

-
- -
- Bug - P2 - -
-
- - -
- -
@@ -650,7 +676,7 @@

Implementing 1

-

Done 118

+

Done 119

@@ -861,11 +887,11 @@

Done 118

- +
Feature - PR #282 + PR #285 merged 2026-05-27
A read-only document browser, reachable from two independent entry points (cluster detail + study detail), that lets operators see corpus shape, paginate documents, and inspect any single doc's `_sour
@@ -1939,6 +1965,19 @@

Done 118

+
+ +
+ Bug + + merged 2026-05-27 +
+
Complete
+ + +
+ +
@@ -2198,8 +2237,6 @@

Dependency graph (feat_ + infra_)

classDef plan fill:#fef9c3,stroke:#854d0e,color:#854d0e; classDef spec fill:#dbeafe,stroke:#1e40af,color:#1e40af; classDef idea fill:#f1f5f9,stroke:#334155,color:#334155; - feat_index_document_browser["index document browser"] - class feat_index_document_browser done; infra_agent_sibling_worktree_isolation["agent sibling worktree isolation"] class infra_agent_sibling_worktree_isolation implement; infra_foundation["foundation"] @@ -2376,6 +2413,8 @@

Dependency graph (feat_ + infra_)

class infra_dockerfile_invariant_smoke_in_ci done; infra_test_worktree_missing_integration_envs["test worktree missing integration envs"] class infra_test_worktree_missing_integration_envs done; + feat_index_document_browser["index document browser"] + class feat_index_document_browser done; feat_study_lifecycle --> feat_digest_proposal feat_llm_judgments --> feat_digest_proposal infra_foundation --> feat_llm_judgments @@ -2429,8 +2468,6 @@

Dependency graph (feat_ + infra_)

classDef plan fill:#fef9c3,stroke:#854d0e,color:#854d0e; classDef spec fill:#dbeafe,stroke:#1e40af,color:#1e40af; classDef idea fill:#f1f5f9,stroke:#334155,color:#334155; - feat_index_document_browser["index document browser"] - class feat_index_document_browser done; infra_agent_sibling_worktree_isolation["agent sibling worktree isolation"] class infra_agent_sibling_worktree_isolation implement; infra_foundation["foundation"] @@ -2607,6 +2644,8 @@

Dependency graph (feat_ + infra_)

class infra_dockerfile_invariant_smoke_in_ci done; infra_test_worktree_missing_integration_envs["test worktree missing integration envs"] class infra_test_worktree_missing_integration_envs done; + feat_index_document_browser["index document browser"] + class feat_index_document_browser done; feat_study_lifecycle --> feat_digest_proposal feat_llm_judgments --> feat_digest_proposal infra_foundation --> feat_llm_judgments diff --git a/docs/00_overview/product/relevance-copilot-spec.md b/docs/00_overview/relyloop-spec.md similarity index 98% rename from docs/00_overview/product/relevance-copilot-spec.md rename to docs/00_overview/relyloop-spec.md index 1e3755bc..55020ea5 100644 --- a/docs/00_overview/product/relevance-copilot-spec.md +++ b/docs/00_overview/relyloop-spec.md @@ -1,9 +1,9 @@ -# RelyLoop — Internal Tool Specification +# RelyLoop — Project Specification **Status:** Draft v0.1 **Date:** 2026-05-07 -**Owner:** Relevance team -**Audience:** Engineers and stakeholders building or evaluating the tool +**Owner:** RelyLoop maintainers (see [MAINTAINERS.md](../../MAINTAINERS.md)) +**Audience:** Engineers, operators, and stakeholders building, evaluating, or contributing to the tool --- @@ -39,7 +39,11 @@ Search relevance tuning at our organization is currently manual, ad-hoc, and eng 1. **Systematic exploration.** The space of tunable parameters (field weights, boosts, tie-breakers, fuzziness, slop, function-score parameters, hybrid-search alphas) is too large to explore manually. We routinely ship the first plausible win rather than the best win. 2. **Quantified evaluation.** Without a standing query set and judgment list, we can't tell whether a change generalizes or just happens to fix the three queries the engineer noticed. -Off-the-shelf tools (Quepid, RRE, Chorus) cover the manual workbench problem well but don't drive automated overnight studies, and don't have an LLM in the loop to design the search space. 
The OpenSearch Relevance Agent does the LLM-and-conversation part but is OpenSearch-only and lacks the autonomous-optimization loop. This tool combines both.
+The off-the-shelf landscape covers adjacent problems well. Quepid (OpenSource Connections) is the dominant interactive workbench — per-query exploration with human and AI judging, but not an automated optimization sweep. RRE (Sease) is a Java/Maven evaluation library wired into CI for regression guards. Chorus (querqy/o19s) is a reference integration stack that bundles Solr or Elasticsearch with Quepid + SMUI + Querqy. SMUI + Querqy themselves live at a different layer entirely — query *rewriting* rules, not query-time parameter tuning. The Elasticsearch Ranking Evaluation API is a low-level engine-side primitive for computing metrics against judgments.
+
+The closest tool to RelyLoop's elevator pitch is the **OpenSearch Relevance Agent**, introduced in OpenSearch 3.6 (experimental) as part of OpenSearch Agent Server. It is genuinely overlapping: a conversational LLM-driven tuning agent that runs query-DSL-level experiments against query sets and judgment lists, with metrics offloaded to deterministic tools and proposals reviewed by a human. RelyLoop is deliberately scoped differently — engine-agnostic across Elasticsearch / OpenSearch / Lucidworks Fusion (with pure Solr deferred to v2), Git-PR-mediated change management against a central config repo, multi-cluster / multi-tenant, provider-agnostic LLM. For OpenSearch-only single-cluster shops that don't need a PR workflow, the OpenSearch Relevance Agent may be the simpler choice; RelyLoop exists for shops where one or more of those differentiators matters.
+
+A full honest tool-by-tool breakdown — including pairing patterns and where RelyLoop deliberately does not compete — is in [`adjacent-tools.md`](adjacent-tools.md).
 ## 3. Goals
@@ -737,7 +741,7 @@ Because UBI is just two indices in the cluster RelyLoop is already adapting, the
 The pluggable `SignalsConverter` then maps these features to a 0–3 rating. Initial converters: position-bias-corrected CTR threshold, dwell-time threshold, and **hybrid UBI+LLM** (UBI rates the dense head; LLM-as-judge fills the long tail for queries below an impression threshold). Counterfactual click models (CCM, DBN) are documented as v1.5+ post-GA extensions because they need enough impressions per (query, doc) to be statistically meaningful.
-The judgments table accepts mixed-source lists today (the `source IN ('llm', 'human', 'click')` CHECK has shipped since MVP1) — no schema migration is required to turn this on. The MVP1.5 deliverable is the `UbiReader` + `SignalsConverter` + a new `POST /api/v1/judgment-lists/generate-from-ubi` endpoint + a new `generate_judgments_from_ubi` agent tool. See [`feat_ubi_judgments/idea.md`](../../02_product/planned_features/feat_ubi_judgments/idea.md) for the planned-feature scope.
+The judgments table accepts mixed-source lists today (the `source IN ('llm', 'human', 'click')` CHECK has shipped since MVP1) — no schema migration is required to turn this on. The MVP1.5 deliverable is the `UbiReader` + `SignalsConverter` + a new `POST /api/v1/judgment-lists/generate-from-ubi` endpoint + a new `generate_judgments_from_ubi` agent tool. See [`feat_ubi_judgments/idea.md`](../02_product/planned_features/feat_ubi_judgments/idea.md) for the planned-feature scope.
 Predicated on the operator having installed the OpenSearch UBI plugin and logged enough events to be statistically useful. Deployments without UBI continue to run LLM-as-judge unchanged.
diff --git a/docs/01_architecture/adapters.md b/docs/01_architecture/adapters.md
index ca167ae5..33b37ae9 100644
--- a/docs/01_architecture/adapters.md
+++ b/docs/01_architecture/adapters.md
@@ -1,7 +1,7 @@
 # Adapters
 **Status:** Adopted for MVP1. ElasticAdapter (handling ES + OpenSearch) is the only implementation in MVP1; Lucidworks Fusion ships at MVP3; Apache Solr at v2+. Per-release timing per [`tech-stack.md` §"Canonical release matrix"](tech-stack.md).
-**Source of truth for product context:** [docs/00_overview/product/relevance-copilot-spec.md §8](../00_overview/product/relevance-copilot-spec.md) ("Engine adapter specification") and §11 ("Search space & parameters").
+**Source of truth for product context:** [docs/00_overview/relyloop-spec.md §8](../00_overview/relyloop-spec.md) ("Engine adapter specification") and §11 ("Search space & parameters").
 ---
diff --git a/docs/01_architecture/agent-tools.md b/docs/01_architecture/agent-tools.md
index 6f84d170..25bc5fb6 100644
--- a/docs/01_architecture/agent-tools.md
+++ b/docs/01_architecture/agent-tools.md
@@ -1,7 +1,7 @@
 # Agent Tools
 **Status:** Adopted for MVP1 with OpenAI function-calling. The tool registry pattern persists into LangGraph (GA v1) without breaking changes.
-**Source of truth for product context:** [docs/00_overview/product/relevance-copilot-spec.md §19](../00_overview/product/relevance-copilot-spec.md) ("Agent tools") + §21 ("Agent integration").
+**Source of truth for product context:** [docs/00_overview/relyloop-spec.md §19](../00_overview/relyloop-spec.md) ("Agent tools") + §21 ("Agent integration").
 ---
diff --git a/docs/01_architecture/api-conventions.md b/docs/01_architecture/api-conventions.md
index d25928c9..11c8d864 100644
--- a/docs/01_architecture/api-conventions.md
+++ b/docs/01_architecture/api-conventions.md
@@ -1,7 +1,7 @@
 # API Conventions
 **Status:** Adopted for MVP1. New conventions activate at the release noted on each row.
-**Source of truth for product context:** [docs/00_overview/product/relevance-copilot-spec.md §28](../00_overview/product/relevance-copilot-spec.md) ("API conventions" subsection).
+**Source of truth for product context:** [docs/00_overview/relyloop-spec.md §28](../00_overview/relyloop-spec.md) ("API conventions" subsection). --- diff --git a/docs/01_architecture/apply-path.md b/docs/01_architecture/apply-path.md index 4440cb9f..975fa2c1 100644 --- a/docs/01_architecture/apply-path.md +++ b/docs/01_architecture/apply-path.md @@ -1,7 +1,7 @@ # Apply Path: Git PR Workflow **Status:** Adopted for MVP1 with GitHub-only. Multi-Git-provider abstraction (GitLab + Bitbucket) ships at MVP3 per [`tech-stack.md` §"Canonical release matrix"](tech-stack.md). -**Source of truth for product context:** [docs/00_overview/product/relevance-copilot-spec.md §16](../00_overview/product/relevance-copilot-spec.md) ("Apply path: Git PR workflow"). +**Source of truth for product context:** [docs/00_overview/relyloop-spec.md §16](../00_overview/relyloop-spec.md) ("Apply path: Git PR workflow"). --- diff --git a/docs/01_architecture/data-model.md b/docs/01_architecture/data-model.md index 0630d8ac..8594d7fa 100644 --- a/docs/01_architecture/data-model.md +++ b/docs/01_architecture/data-model.md @@ -1,7 +1,7 @@ # Data Model **Status:** Adopted for MVP1. Tables shown with their MVP1 shape; deferred columns and tables are flagged. -**Source of truth for product context:** [docs/00_overview/product/relevance-copilot-spec.md §9](../00_overview/product/relevance-copilot-spec.md) ("Data model"). +**Source of truth for product context:** [docs/00_overview/relyloop-spec.md §9](../00_overview/relyloop-spec.md) ("Data model"). --- diff --git a/docs/01_architecture/deployment.md b/docs/01_architecture/deployment.md index 9a7b29ae..e839545d 100644 --- a/docs/01_architecture/deployment.md +++ b/docs/01_architecture/deployment.md @@ -1,7 +1,7 @@ # Deployment **Status:** Adopted for MVP1. Local Docker Compose only; production-grade deployment activates as later releases add the missing pieces (TLS, SSO, observability). 
-**Source of truth for product context:** [docs/00_overview/product/relevance-copilot-spec.md §25](../00_overview/product/relevance-copilot-spec.md) ("Deployment"). +**Source of truth for product context:** [docs/00_overview/relyloop-spec.md §25](../00_overview/relyloop-spec.md) ("Deployment"). --- diff --git a/docs/01_architecture/llm-orchestration.md b/docs/01_architecture/llm-orchestration.md index 7d5207b9..ded67502 100644 --- a/docs/01_architecture/llm-orchestration.md +++ b/docs/01_architecture/llm-orchestration.md @@ -1,7 +1,7 @@ # LLM Orchestration **Status:** Adopted for MVP1 with the plain `openai` SDK + function calling. **The SDK is pointed at any OpenAI-compatible endpoint via `OPENAI_BASE_URL`** (defaults to `https://api.openai.com/v1`; works against Ollama, LM Studio, vLLM, HuggingFace TGI for air-gapped evaluation). LangGraph orchestrator + native non-OpenAI-compatible provider SDKs (Anthropic, Bedrock, Vertex) + Langfuse + RedisCache arrive at later releases per the canonical [`tech-stack.md` §"Canonical release matrix"](tech-stack.md). -**Source of truth for product context:** [docs/00_overview/product/relevance-copilot-spec.md §15](../00_overview/product/relevance-copilot-spec.md) ("LLM orchestration & observability"). +**Source of truth for product context:** [docs/00_overview/relyloop-spec.md §15](../00_overview/relyloop-spec.md) ("LLM orchestration & observability"). --- diff --git a/docs/01_architecture/mvp1-overview.md b/docs/01_architecture/mvp1-overview.md index a39d1044..32178c0b 100644 --- a/docs/01_architecture/mvp1-overview.md +++ b/docs/01_architecture/mvp1-overview.md @@ -2,7 +2,7 @@ **Status:** This is the architecture as it exists in MVP1 ("The Loop"). Each topical doc covers all releases; this page is a fast entry point that filters them down to MVP1's active scope. -**For product context:** [docs/00_overview/product/relevance-copilot-spec.md §27](../00_overview/product/relevance-copilot-spec.md) ("MVP1 / v0.1 — The Loop"). 
+**For product context:** [docs/00_overview/relyloop-spec.md §27](../00_overview/relyloop-spec.md) ("MVP1 / v0.1 — The Loop"). --- @@ -126,4 +126,4 @@ The "TBA" docs are authored alongside their corresponding feature spec. - All arch docs in this section: [`docs/01_architecture/`](./) - MVP1 feature folders: [`docs/02_product/planned_features/`](../02_product/planned_features/) - MVP1 user stories: [`docs/02_product/mvp1-user-stories.md`](../02_product/mvp1-user-stories.md) -- Umbrella spec MVP1 section: [`docs/00_overview/product/relevance-copilot-spec.md` §27](../00_overview/product/relevance-copilot-spec.md) +- Umbrella spec MVP1 section: [`docs/00_overview/relyloop-spec.md` §27](../00_overview/relyloop-spec.md) diff --git a/docs/01_architecture/optimization.md b/docs/01_architecture/optimization.md index 7b1bb7df..dfde07fc 100644 --- a/docs/01_architecture/optimization.md +++ b/docs/01_architecture/optimization.md @@ -1,7 +1,7 @@ # Optimization (Optuna + ir_measures) **Status:** Adopted for MVP1. Single-objective TPE + median pruner; provider-abstracted IR evaluation via `ir_measures` (wraps multiple cut-aware-metric backends behind a typed metric-object DSL). Multi-objective optimization (CMA-ES + multi-metric) reserved for v2 per umbrella spec. -**Source of truth for product context:** [docs/00_overview/product/relevance-copilot-spec.md §13–§14](../00_overview/product/relevance-copilot-spec.md). Per-release timing per [`tech-stack.md` §"Canonical release matrix"](tech-stack.md). +**Source of truth for product context:** [docs/00_overview/relyloop-spec.md §13–§14](../00_overview/relyloop-spec.md). Per-release timing per [`tech-stack.md` §"Canonical release matrix"](tech-stack.md). 
--- diff --git a/docs/01_architecture/system-overview.md b/docs/01_architecture/system-overview.md index 1efcc865..6f4bcbaa 100644 --- a/docs/01_architecture/system-overview.md +++ b/docs/01_architecture/system-overview.md @@ -1,7 +1,7 @@ # System Overview **Status:** Adopted for MVP1. Each release adds services; this doc shows the full topology with MVP1-active services highlighted. -**Source of truth for product context:** [docs/00_overview/product/relevance-copilot-spec.md §7](../00_overview/product/relevance-copilot-spec.md) ("System architecture"). +**Source of truth for product context:** [docs/00_overview/relyloop-spec.md §7](../00_overview/relyloop-spec.md) ("System architecture"). --- diff --git a/docs/01_architecture/tech-stack.md b/docs/01_architecture/tech-stack.md index 7d29b262..1df501ec 100644 --- a/docs/01_architecture/tech-stack.md +++ b/docs/01_architecture/tech-stack.md @@ -1,7 +1,7 @@ # Tech Stack **Status:** Adopted for MVP1. Revisited per release as new layers come online. -**Source of truth for product context:** [docs/00_overview/product/relevance-copilot-spec.md §28](../00_overview/product/relevance-copilot-spec.md) ("Tech stack & implementation decisions"). This document is the engineering-facing distillation of those decisions, scoped to what's relevant for MVP1 with explicit notes on what activates in later releases. +**Source of truth for product context:** [docs/00_overview/relyloop-spec.md §28](../00_overview/relyloop-spec.md) ("Tech stack & implementation decisions"). This document is the engineering-facing distillation of those decisions, scoped to what's relevant for MVP1 with explicit notes on what activates in later releases. --- diff --git a/docs/01_architecture/ui-architecture.md b/docs/01_architecture/ui-architecture.md index 048dccfb..4d07323b 100644 --- a/docs/01_architecture/ui-architecture.md +++ b/docs/01_architecture/ui-architecture.md @@ -1,7 +1,7 @@ # UI Architecture **Status:** Adopted for MVP1. 
Next.js 16 App Router (React 19, Turbopack) + shadcn/ui + Tailwind 4 (CSS-first) + TanStack Query + Vitest 4. Per-screen feature specs (`feat_studies_ui`, `feat_proposals_ui`, `feat_chat_agent`) implement the patterns documented here. Stack bumped from Next 14 / React 18 / Tailwind 3 / Vitest 2 on 2026-05-12 via `infra_frontend_stack_refresh` (the placeholder UI was the optimal upgrade window before `feat_studies_ui` adds component volume). -**Source of truth for product context:** [docs/00_overview/product/relevance-copilot-spec.md §22](../00_overview/product/relevance-copilot-spec.md) ("UI screens") and §28 ("Frontend stack"). +**Source of truth for product context:** [docs/00_overview/relyloop-spec.md §22](../00_overview/relyloop-spec.md) ("UI screens") and §28 ("Frontend stack"). --- diff --git a/docs/02_product/README.md b/docs/02_product/README.md index 87cc0d80..cb81339d 100644 --- a/docs/02_product/README.md +++ b/docs/02_product/README.md @@ -7,6 +7,6 @@ Use this section for roadmap, release planning, scope definitions, execution pla - `planned_features/` — per-feature planning artifacts (idea, spec, plan, pipeline_status) for work that is queued, in flight, or recently shipped. The pipeline of `.claude/skills/` (idea-preflight → spec-gen → impl-plan-gen → impl-execute → guide-gen) produces and consumes these folders. - `planned_features/feature_templates/` — copy-and-fill templates plus reference examples. Templates are ported from a sibling project; each carries an HTML porting banner at the top with the conditional-section guidance. -## Forthcoming +## Related -The umbrella product spec and MVP1 execution plan currently live under [`../00_overview/product/`](../00_overview/product/) while the IA is being broken down. Section-specific product docs (per-feature scope, release notes, etc.) belong here as they are extracted from the umbrella spec. 
+The umbrella product spec lives at [`../00_overview/relyloop-spec.md`](../00_overview/relyloop-spec.md) — that document is the authoritative scope statement. Section-specific product docs (per-feature scope, release notes, etc.) belong in this directory as they are extracted from the umbrella spec. diff --git a/docs/02_product/mvp1-user-stories.md b/docs/02_product/mvp1-user-stories.md index cffbbe0b..f97a3ffc 100644 --- a/docs/02_product/mvp1-user-stories.md +++ b/docs/02_product/mvp1-user-stories.md @@ -3,8 +3,8 @@ **Status:** Source-of-truth user-story enumeration for MVP1 ("The Loop"). Each story is referenced by ID (`US-N`) from the matching feature_spec.md in `planned_features//`. **Source material:** -- Umbrella spec [§6 Personas & user stories](../00_overview/product/relevance-copilot-spec.md) (lines 85–100) — system-level stories -- Umbrella spec [§27 MVP1 scope](../00_overview/product/relevance-copilot-spec.md) (lines 2286–2322) — in-scope capabilities +- Umbrella spec [§6 Personas & user stories](../00_overview/relyloop-spec.md) (lines 85–100) — system-level stories +- Umbrella spec [§27 MVP1 scope](../00_overview/relyloop-spec.md) (lines 2286–2322) — in-scope capabilities - Umbrella spec §8, §12, §14, §15, §16, §19, §22 — capability detail **Scope boundary:** MVP1 only. Stories that depend on later-release capabilities (Langfuse → MVP2; Lucidworks Fusion + GitLab/Bitbucket → MVP3; multi-tenant + multi-LLM provider abstraction + SSO + API keys → MVP4; LangGraph state graph + subagents + PostgresSaver → GA v1) are explicitly out of scope and live in their respective release plans. See [`docs/01_architecture/tech-stack.md` §"Canonical release matrix"](../01_architecture/tech-stack.md) for the source of truth. 
diff --git a/docs/02_product/planned_features/chore_oss_public_launch_punchlist/idea.md b/docs/02_product/planned_features/chore_oss_public_launch_punchlist/idea.md new file mode 100644 index 00000000..fc92af88 --- /dev/null +++ b/docs/02_product/planned_features/chore_oss_public_launch_punchlist/idea.md @@ -0,0 +1,131 @@ +# OSS Public-Launch Punchlist + +**Date:** 2026-05-27 +**Status:** Idea — captured during `chore_oss_launch_prep` (the PR that added SECURITY.md / GOVERNANCE.md / MAINTAINERS.md / CODEOWNERS / issue + PR templates and replaced the Code of Conduct) +**Priority:** P1 — gates flipping the repository from private to public. +**Origin:** Items the user named in the `chore_oss_launch_prep` request that are operator-decisions or bulk-mechanical sweeps too large to land in the same documentation-focused PR. +**Depends on:** None code-wise. Sequencing: do these before changing repo visibility. + +## Problem + +The `chore_oss_launch_prep` PR adds the foundational governance / security / +contributor files that prospective contributors and enterprise reviewers +look for first. Three remaining items are gates on flipping the repository +from private to public, but each is either an **operator action** (not +something a PR can land) or a **bulk-mechanical sweep** large enough to +deserve its own focused review pass. Bundling them into the docs PR would +hide the review surface; deferring them without a tracking file risks +forgetting them before the public flip. + +## Proposed capabilities + +### Capability 1 — SPDX-License-Identifier headers across source files + +Adopt the [FSFE REUSE](https://reuse.software/) convention. Every source +file gets a two-line header: + +``` +# SPDX-FileCopyrightText: 2026 soundminds.ai +# SPDX-License-Identifier: Apache-2.0 +``` + +(Comment marker swaps per language: `#` for Python / YAML / Dockerfile, +`//` for TS / JS / Go, `