docs(architecture): explain RAG design decisions by fanza-ks · Pull Request #1 · Kernel-Science/physlibsearch

fanza-ks · 2026-04-16T14:03:48Z

Summary

Adds a Design decisions section to docs/architecture.md explaining why the RAG pipeline is shaped the way it is, prompted by an external question asking how the architecture was arrived at (and why it differs from generic notes-RAGs like gbrain).
Covers: per-declaration chunking, Herald-style informalization before embedding, dependency-aware topological ordering, HyDE on the query side, dual Postgres + Chroma store, and model/task-type tiering.
Ends with a small comparison table contrasting this pipeline against a generic notes-RAG.
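
For reviewers unfamiliar with the idea, dependency-aware topological ordering can be sketched roughly as below. This is an illustration, not the code in this repository: the function name `informalization_order` and the sample declaration names are hypothetical. The sketch assumes the goal is to process each declaration only after the declarations it depends on, so that their informal descriptions are available as context. It uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter


def informalization_order(deps: dict[str, set[str]]) -> list[str]:
    """Return declaration names so that each one comes after its dependencies.

    `deps` maps a declaration name to the set of names it depends on
    (hypothetical shape, chosen for this sketch).
    """
    # TopologicalSorter treats the mapped values as predecessors, so
    # static_order() yields dependencies before their dependents.
    return list(TopologicalSorter(deps).static_order())


# Hypothetical declarations, for illustration only.
deps = {
    "Energy.conservation": {"Energy.kinetic", "Energy.potential"},
    "Energy.kinetic": {"Mass", "Velocity"},
    "Energy.potential": {"Mass"},
    "Mass": set(),
    "Velocity": set(),
}
order = informalization_order(deps)
```

A cyclic dependency graph makes `static_order()` raise `graphlib.CycleError`, so a real pipeline would need to decide how to handle cycles.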

Test plan

Render docs/architecture.md (VS Code preview or on GitHub) and verify heading hierarchy, code-block formatting, and the comparison table.
Confirm relative links (../database/vector_db.py, etc.) resolve on the GitHub view.
Cross-check the cited claims against the code:
- database/vector_db.py:43 — hybrid embedding-input string
- database/informalize.py — ±2 neighbours + dependency context
- database/embedding.py — RETRIEVAL_DOCUMENT vs RETRIEVAL_QUERY
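
As a reading aid for the last two checks, here is a minimal sketch of what a ±2 neighbour window and a document/query task-type split could look like. The helper names `neighbour_window` and `task_type_for` are hypothetical and do not come from the repository. Only the radius of 2 and the two task-type strings are taken from the list above.

```python
def neighbour_window(items: list, index: int, radius: int = 2) -> list:
    """Return up to `radius` items on each side of items[index], excluding it."""
    start = max(0, index - radius)  # clamp at the start of the list
    before = items[start:index]
    after = items[index + 1:index + 1 + radius]  # slicing clamps at the end
    return before + after


def task_type_for(is_query: bool) -> str:
    """Pick the embedding task type: one for user queries, one for indexed text."""
    return "RETRIEVAL_QUERY" if is_query else "RETRIEVAL_DOCUMENT"


decls = ["a", "b", "c", "d", "e", "f", "g"]
middle = neighbour_window(decls, 3)  # neighbours of "d"
edge = neighbour_window(decls, 0)    # neighbours of "a", window is clamped
```

When cross-checking, the point to confirm is that indexing and querying paths use different task types, and that the context window is truncated, not wrapped, at file boundaries.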

🤖 Generated with Claude Code

fanza-ks · 2026-04-16T14:16:53Z

Following up on a question from Shlok Vaibhav on Zulip, I've added a more detailed explanation of what the RAG system in physlibsearch does and why we made certain design choices.

docs(architecture): explain RAG design decisions and contrast with ge…

8da1c12

…neric RAGs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fanza-ks merged commit 05ea001 into main Apr 16, 2026
1 check failed

fanza-ks mentioned this pull request Apr 16, 2026

chore: fix ruff lint + format violations #2

Merged

5 tasks

fanza-ks merged 1 commit into main from docs/rag-design-decisions
