feat(AGX1-275): per-RPC task permission rewire and 404/403 wrap #249

Draft
asherfink wants to merge 1 commit into main from asher.fink/agx1-275-task-route-migration

@asherfink asherfink commented May 26, 2026

Related work

This stack lands per-task FGAC for AGX1-264. Merge order: 2b → 2 → 3 (scale-agentex assumes agentex-auth understands cancel before sending it).

| Stream | Repo | PR | Purpose |
|---|---|---|---|
| 2b | scaleapi/agentex | #353 | agentex-auth per-account routing + cancel op |
| 2 | scaleapi/scale-agentex | #246 | task creator audit columns + migration safety |
| 3 (this PR) | scaleapi/scale-agentex | #249 | per-RPC operation rewire + 404/403 wrap |

Parent epic: AGX1-264. Follow-ups bundled in AGX1-291.

Summary

  • Routes each RPC method to the correct AuthorizedOperationType: MESSAGE_SEND/EVENT_SEND → update, TASK_CANCEL → cancel, TASK_CREATE stays create. Previously every method used update.
  • Collapses every task-resource denial into 404 (via ItemDoesNotExist) across all surfaces — path id, query id, body id, and name routes — so callers can no longer distinguish "task present in another tenant" from "task absent" by comparing 403 vs 404.
  • Extracts the collapse helper to a new module src/utils/task_authorization.py, reused from both the FastAPI dep factories and the RPC authorize hook.
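
The per-method routing in the first bullet can be pictured as a lookup table. This is a minimal sketch, not the code in this PR: RpcMethod, OPERATION_BY_METHOD, and operation_for are hypothetical names, and the string values on AuthorizedOperationType are assumed (only "cancel" is confirmed by the wire-contract test mentioned under Tests).

```python
from enum import Enum, auto


class RpcMethod(Enum):
    """Hypothetical stand-in for the RPC method enum; member names follow the PR text."""

    TASK_CREATE = auto()
    TASK_CANCEL = auto()
    MESSAGE_SEND = auto()
    EVENT_SEND = auto()


class AuthorizedOperationType(str, Enum):
    """Only the operations named in this PR; the string values are assumed."""

    create = "create"
    update = "update"
    cancel = "cancel"


# One explicit entry per RPC method, instead of a blanket `update`.
OPERATION_BY_METHOD = {
    RpcMethod.TASK_CREATE: AuthorizedOperationType.create,
    RpcMethod.TASK_CANCEL: AuthorizedOperationType.cancel,
    RpcMethod.MESSAGE_SEND: AuthorizedOperationType.update,
    RpcMethod.EVENT_SEND: AuthorizedOperationType.update,
}


def operation_for(method: RpcMethod) -> AuthorizedOperationType:
    # An unmapped method raises KeyError, so a newly added RPC method
    # cannot silently inherit some default permission.
    return OPERATION_BY_METHOD[method]
```

Whether the real hook uses a table or a branch per method is not stated here; the table form is only meant to make the mapping easy to audit.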

What changed

  • src/utils/task_authorization.py (new): _check_task_or_collapse_to_404(authorization, task_id, operation) — the shared wrap. Renamed from _check_task_or_distinguish_404 (the previous name implied a 403/404 split the helper does not actually perform).
  • src/utils/authorization_shortcuts.py: DAuthorizedId/DAuthorizedQuery/DAuthorizedBodyId route task checks through the wrap; their inner deps no longer take a task_repository (the parameter was unused). DAuthorizedName now applies the wrap when resource_type == AgentexResourceType.task — previously the name surface leaked 403 vs 404 because tasks.name is globally unique, so a probe checked the entire system rather than a tenant.
  • src/api/routes/agents.py _authorize_rpc_request: each task-resource branch now routes through the wrap. MESSAGE_SEND with task_name was restructured to a try/except/else shape so a denied-update on an existing task surfaces as 404 (it must NOT fall through to the create-fallback except, which would silently promote a denied-update into a create check). TASK_CREATE and the wildcard task("*") checks are intentionally untouched.

Why the structural change in MESSAGE_SEND

The original block wrapped both get_task(name=...) and the auth check in one try. Applying the new wrap inside it as-is would let a denied-update raise ItemDoesNotExist, get caught by the outer except ItemDoesNotExist, and silently fall through to the create check — a privilege escalation. Splitting into try: get_task / except: create-check / else: wrapped update-check keeps the create-on-absent semantics while ensuring denied-update propagates outward (→ 404).
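
The control flow described above can be sketched as follows. It relies on a Python guarantee: an exception raised in the else clause of a try statement is not handled by that statement's except clauses. The function name and the three injected callables are hypothetical, and the sketch is synchronous for brevity.

```python
class ItemDoesNotExist(Exception):
    """Stand-in for the repo's not-found error, which surfaces as HTTP 404."""


def authorize_message_send(task_name, get_task, check_create, check_update_or_404):
    """Authorize MESSAGE_SEND addressed by task name.

    get_task(task_name) returns a task id or raises ItemDoesNotExist.
    check_create() raises if the caller may not create tasks.
    check_update_or_404(task_id) raises ItemDoesNotExist on a denied update.
    """
    try:
        # Only the lookup sits in the try body.
        task_id = get_task(task_name)
    except ItemDoesNotExist:
        # Task is absent: preserve the create-on-absent semantics.
        check_create()
        return "create"
    else:
        # Task exists: a denial raised here escapes the try statement
        # entirely, so it can never be mistaken for "task absent" and
        # rerouted into the create check.
        check_update_or_404(task_id)
        return "update"
```

Had the update check stayed inside the try body, its ItemDoesNotExist would have been caught by the same except clause and the caller would have been evaluated for create instead, which is the escalation this PR avoids.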

Tests

  • tests/unit/api/test_tasks_authz.py — 17/17 pass.
    • 8 TestPerRpcOperationRouting tests (incl. MESSAGE_SEND create-fallback preserved through the restructure).
    • 2 TestCheckTaskOrCollapseTo404 tests (allow + denied-collapses-to-404).
    • 3 TestDAuthorizedBodyIdTaskWrap tests.
    • 3 new TestDAuthorizedNameTaskWrap tests (denied-task → 404, allow returns name, agent path unaffected).
    • Cross-repo wire-contract test ("cancel") mirrors agentex-auth's.

Out of scope / follow-ups (tracked in AGX1-291)

  • /agents/name/{agent_name} has the same leak shape — agent FGAC is outside AGX1-264.
  • Restoring the 403/404 split for same-tenant calls once tasks carry tenant scope at the data layer (AGX1-290).

Test plan

  • uv run pytest tests/unit/api/test_tasks_authz.py — 17 passed.
  • Ruff + ruff-format clean across all touched files.
  • CI to run the broader unit + integration suite.
  • Manual: deny an existing task by id/name/query/body across each surface in a dev cluster; confirm 404 on every surface.

dm36 added a commit that referenced this pull request May 27, 2026
…se and two-factor mutations

Mirrors AGX1-275 (PR #249) for agent_api_keys. Wires Spark AuthZ checks
into every api_key route, collapses denials to 404 (so name/id probes
can't distinguish "present in another tenant" from "absent"), and relies
on SpiceDB's transitive expansion of api_key.{update,delete} (= editor &
parent_agent->update & tenant_gate) for two-factor mutations rather than
issuing two explicit checks at the route layer.

- src/utils/agent_api_key_authorization.py (new):
  _check_api_key_or_collapse_to_404 — catches AuthorizationError, raises
  ItemDoesNotExist. Same shape as Asher's task helper.
- src/utils/authorization_shortcuts.py: DAuthorizedId routes
  AgentexResourceType.api_key through the wrap. (DAuthorizedName isn't
  used for api_keys; the name lookup is (agent_id, name, api_key_type),
  not a single globally-unique path param — the route handlers call the
  collapse helper inline instead.)
- src/api/routes/agent_api_keys.py:
  * POST: explicit agent.update on parent (no api_key resource yet).
  * GET list: DAuthorizedResourceIds + filter; None passes through.
  * GET /name/{name}: inline collapse helper.
  * GET /{id}: DAuthorizedId(api_key, read).
  * DELETE /{id}: DAuthorizedId(api_key, delete). Two-factor via SpiceDB
    schema (api_key.delete expands to parent_agent.update); no second
    route-layer check.
  * DELETE /name/{api_key_name}: inline collapse helper.
- tests/unit/api/test_agent_api_keys_authz.py (new): 12 tests, all pass.

Stacked on dhruv/agx1-272-agent-api-keys-dual-write (PR A). Does NOT
touch dual-write logic. Does NOT modify agentex-auth.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>