ci: add daily audit suites with 5 rotating recipes and scheduled workflow #543
andreatgretel wants to merge 4 commits into main
Add the daily maintenance infrastructure (Phase 2+3 of the agentic CI plan). A new workflow runs one audit suite per weekday via day-of-week rotation, with runner memory persisted via actions/cache. Recipes: docs-and-references (Mon), dependencies (Tue), structure (Wed), code-quality (Thu), test-health (Fri). Each targets gaps that CI and ruff don't cover: cross-reference validation, transitive dep analysis, lazy import compliance, complexity trends, and test-to-source mapping. Reports go to the Actions step summary. Code changes use /create-pr.
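The day-of-week rotation described above can be sketched as follows. This is a minimal illustration, not the workflow's actual implementation (the flowchart indicates the workflow uses `date -u +%u` in shell). The suite order comes from the commit message; the function name `suites_for` and the `override` parameter values are assumptions modeled on the workflow's override input.

```python
from datetime import datetime, timezone

# Mon..Fri mapping taken from the commit message. ISO weekday numbers
# (1 = Monday, 7 = Sunday) match what `date -u +%u` prints.
ROTATION = {
    1: "docs-and-references",
    2: "dependencies",
    3: "structure",
    4: "code-quality",
    5: "test-health",
}
ALL_SUITES = list(ROTATION.values())


def suites_for(now: datetime, override: str = "none") -> list[str]:
    """Return the suites to run (hypothetical helper, for illustration only)."""
    if override == "all":
        return list(ALL_SUITES)
    if override != "none":
        if override not in ALL_SUITES:
            raise ValueError(f"unknown suite: {override}")
        return [override]
    suite = ROTATION.get(now.isoweekday())
    # Weekends have no entry, so the result is empty and the audit is skipped.
    return [suite] if suite else []


if __name__ == "__main__":
    print(suites_for(datetime.now(timezone.utc)))
```

Returning a list in every case keeps the scheduled path and the manual override path uniform, which suits a job matrix that expands over whatever suites were selected.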
Add executable smoke checks to test-health and code-quality recipes that exercise real code paths (config build, validate, import timing, registry completeness, error hierarchy, input rejection) without needing an LLM provider. Checks are split into fixed canaries (same every run) and creative checks (agent varies inputs each run). Harden runner memory: define JSON schema in _runner.md with TTL and size rules, validate state file after agent runs, only update last_run on success, drop unused audit-log.md. Add make install-dev workflow step so recipes can run Python against the installed packages.
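A rough sketch of what post-run state validation might look like is given below. The real schema lives in `_runner.md` and is not reproduced in this PR page, so everything beyond the `last_run` and `known_issues` keys is an assumption: the `last_seen` field, the 30-day TTL, the 64 KiB size cap, and both function names are hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone

MAX_STATE_BYTES = 64 * 1024      # hypothetical size rule
ISSUE_TTL = timedelta(days=30)   # hypothetical TTL for known_issues entries


def validate_state(raw: str, now: datetime) -> dict:
    """Parse the state file, enforce size and shape, and drop expired issues."""
    if len(raw.encode("utf-8")) > MAX_STATE_BYTES:
        raise ValueError("state file exceeds size limit")
    state = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(state, dict):
        raise ValueError("state must be a JSON object")
    issues = state.get("known_issues", [])
    if not isinstance(issues, list):
        raise ValueError("known_issues must be a list")
    fresh = []
    for issue in issues:
        if not isinstance(issue, dict) or "last_seen" not in issue:
            raise ValueError("each known issue needs a last_seen timestamp")
        seen = datetime.fromisoformat(issue["last_seen"])
        if seen.tzinfo is None:
            # Treat naive timestamps as UTC so the subtraction below is valid.
            seen = seen.replace(tzinfo=timezone.utc)
        if now - seen <= ISSUE_TTL:
            fresh.append(issue)
    state["known_issues"] = fresh
    return state


def stamp_last_run(state: dict, now: datetime, audit_succeeded: bool) -> dict:
    """Only record last_run when the audit step succeeded."""
    if audit_succeeded:
        state["last_run"] = now.isoformat()
    return state
```

Keeping validation separate from the `last_run` stamp lets validation run unconditionally while the stamp remains gated on the audit outcome.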
Fix issues found by Codex review:
- Fix test paths: tests/ does not exist at repo root; use packages/*/tests/ and packages/data-designer/tests/test_import_perf.py
- Remove DataDesigner(model_providers=[]) from smoke checks, since it raises NoModelProvidersError; keep config-layer checks only
- Fix audit step gating: remove continue-on-error and use the step outcome to gate the runner memory update (|| true + continue-on-error made the step always "succeed", defeating the success() condition)
Fix heredoc with indented EOF terminator that never terminates - replace with printf. Run state validation on all outcomes (not just success) so corrupted state from a failed audit is caught before caching. Only stamp last_run when audit succeeds. Align test-health lazy import section with its own Constraints (report count only, don't duplicate structure audit). Also fixes datetime.utcnow() deprecation and shell variable injection in Python string by using os.environ instead.
PR Review: #543 - ci: add daily audit suites with 5 rotating recipes and scheduled workflow
Reviewer: Agentic CI
Summary
This PR introduces a daily agentic CI system that runs rotating code health audits on weekdays. It adds:
The design is well-structured: each recipe targets gaps that existing CI (ruff, pytest, Dependabot) doesn't cover, with clear delineation of responsibilities.
Findings
Workflow (
Greptile Summary
Adds a scheduled weekday CI workflow that runs one of five rotating agentic audit suites (docs, dependencies, structure, code-quality, test-health) per day via the Claude CLI on a self-hosted runner, with per-suite runner-state persistence through

| Filename | Overview |
|---|---|
| .github/workflows/agentic-ci-daily.yml | New scheduled workflow with day-of-week suite rotation, matrix parallelism, runner memory via actions/cache, and API pre-flight check; top-level write permissions apply to all jobs including the lightweight determine-suite job. |
| .agents/recipes/_runner.md | Adds environment docs, runner-state.json schema with TTL/size rules, and updates PR creation instructions to use /create-pr; schema is clear and well-constrained. |
| .agents/recipes/code-quality/recipe.md | Thursday audit covering C901 complexity, exception hygiene, type annotation coverage, and TODO aging; fixed/creative executable checks are well-specified with clear success/failure criteria. |
| .agents/recipes/dependencies/recipe.md | Tuesday audit covering transitive dependency gaps, cross-package version consistency, unused deps, and pinning review; constraints correctly scope the audit to what Dependabot cannot do. |
| .agents/recipes/docs-and-references/recipe.md | Monday audit for docstring/signature drift, broken internal links, stale architecture references, and docs site accuracy; well-scoped to avoid duplicating ruff checks. |
| .agents/recipes/structure/recipe.md | Wednesday audit for import boundary violations, lazy import compliance, future annotations, and dead exports; correctly excludes TYPE_CHECKING blocks and documents the expected clean baseline. |
| .agents/recipes/test-health/recipe.md | Friday audit with solid fixed canaries (import verification, timing budget, registry completeness) and well-documented creative smoke checks with a correct note about NoModelProvidersError limiting what can be tested. |
| .github/CODEOWNERS | Adds CODEOWNERS entry for .agents/recipes/ to require core-team review on recipe changes; appropriate given recipes control what the agent executes with write permissions. |
| plans/472/agentic-ci-plan.md | Phase 2, 3, and CODEOWNERS deliverables marked complete; accurate reflection of what's been implemented in this PR. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A([schedule: weekdays 08:00 UTC\nor workflow_dispatch]) --> B[determine-suite job\nubuntu-latest]
B --> C{OVERRIDE input?}
C -->|none| D[date -u +%u\nMon-Fri rotation]
C -->|specific suite| E[suites = override]
C -->|all| F[suites = all 5]
D --> G[suites = single suite]
E --> H{suites != empty?}
G --> H
F --> H
H -->|no - weekend| I([skip])
H -->|yes| J[audit job matrix\nself-hosted agentic-ci runner]
J --> K[Restore runner memory\nactions/cache]
K --> L[make install-dev\n.venv/bin on PATH]
L --> M[Pre-flight: claude CLI\n+ API connectivity check]
M --> N[Run audit recipe\n_runner.md + recipe body\nsed frontmatter strip\ntemplate substitution]
N --> O[claude --model ...\n-p PROMPT\n--max-turns 30]
O --> P[Update runner memory\nvalidate JSON\nstamp last_run on success]
P --> Q[Write job summary\n/tmp/audit-SUITE.md\n+ agent log]
P -->|always| Q
M -->|fail| P
Prompt To Fix All With AI
This is a comment left during a code review.
Path: .github/workflows/agentic-ci-daily.yml
Line: 16-18
Comment:
**Overly broad permissions on `determine-suite`**
The top-level `permissions` block grants `contents: write` and `pull-requests: write` to every job in the workflow, including `determine-suite`, which only runs `date` and string manipulation. Following least-privilege, write access should be scoped to the `audit` job where PRs could actually be created.
```yaml
# Remove top-level permissions block, then add per-job scopes:
jobs:
  determine-suite:
    permissions:
      contents: read
    ...
  audit:
    permissions:
      contents: write
      pull-requests: write
    ...
```
This limits the blast radius if the `ubuntu-latest` `determine-suite` job is ever compromised via a supply-chain attack on a future action added to it.
How can I resolve this? If you propose a fix, please make it concise.
Reviews (1): Last reviewed commit: "ci: fix review findings - heredoc, state..."
📋 Summary
Add a daily agentic CI system that runs rotating code health audits on weekdays, catching quality drift that existing CI doesn't cover (no C901/ANN/BLE ruff rules, no cross-reference validation, no transitive dep analysis, no docs-vs-code accuracy checks). Each audit runs as a Claude Code agent on the self-hosted runner, guided by a recipe, and reports findings to the GitHub Actions step summary.
Closes #472
🔗 Related Issue
Closes #472
🔄 Changes
✨ Added
- `.github/workflows/agentic-ci-daily.yml` - Scheduled workflow with day-of-week suite rotation (Mon-Fri), per-suite concurrency, runner memory via `actions/cache`, `make install-dev` environment setup, and `workflow_dispatch` override (including "all" to run everything in parallel)
- `.agents/recipes/docs-and-references/recipe.md` - Monday: docstring vs signature drift, broken internal links, architecture doc references, docs site content accuracy
- `.agents/recipes/dependencies/recipe.md` - Tuesday: transitive dependency gaps, cross-package version consistency, unused deps, version pinning review
- `.agents/recipes/structure/recipe.md` - Wednesday: import boundary violations, lazy import compliance, future annotations, dead exports
- `.agents/recipes/code-quality/recipe.md` - Thursday: complexity hotspots (C901), exception hygiene, type annotation coverage, TODO aging, executable quality checks (error hierarchy, input validation)
- `.agents/recipes/test-health/recipe.md` - Friday: test-to-source mapping, hollow test detection, import performance, executable smoke checks (fixed canaries + creative agent-varied checks), test isolation verification
🔧 Changed
- `.agents/recipes/_runner.md` - Added environment docs (`.venv/bin` on PATH), runner memory JSON schema with TTL and size rules, updated PR creation instructions to use `/create-pr` skill
- `.github/CODEOWNERS` - Added `.agents/recipes/` ownership entry
- `plans/472/agentic-ci-plan.md` - Marked Phase 2, 3, and 4 deliverables as complete
🔍 Attention Areas
- `agentic-ci-daily.yml` - New workflow with `contents: write` and `pull-requests: write` permissions. Write access is intentional to support future recipe-driven PRs, but all current recipes are read-only audits.
- `test-health/recipe.md` and `code-quality/recipe.md` - These run real Python against the installed packages. Fixed canaries are deterministic; creative checks are agent-designed each run.
- `_runner.md` - Defines the JSON contract for cross-run state persistence including TTL rules for known_issues.
🧪 Testing
- `make check-all` passes (ruff lint + format)
- `workflow_dispatch` after merge.
✅ Checklist