fix(sandbox): resolve symlinked binary paths in network policy matching #774
johntmyers wants to merge 5 commits into main
Confirming this affects real deployments. We run 51+ CLI tools in an OpenShell sandbox, all symlinked from
Current workaround: list both the symlink path AND the resolved binary path in the policy
Would be great to see this merged; it would simplify our policy config significantly.
Thanks @mjamiv, any chance you were able to build this branch and verify?
Policy binary paths specified as symlinks (e.g., /usr/bin/python3) were silently denied because the kernel reports the canonical path via /proc/<pid>/exe (e.g., /usr/bin/python3.11). The strict string equality in Rego never matched.

Expand policy binary paths by resolving symlinks through the container filesystem (/proc/<pid>/root/) after the entrypoint starts. The OPA data now contains both the original and resolved paths, so Rego's existing strict equality check naturally matches either.

- Add resolve_binary_in_container() helper for Linux symlink resolution
- Add from_proto_with_pid() and reload_from_proto_with_pid() to OpaEngine
- Trigger one-shot OPA rebuild after entrypoint_pid is stored
- Thread entrypoint_pid through run_policy_poll_loop for hot-reloads
- Improve deny reason with symlink debugging hint
- Add 18 new tests including hot-reload and Linux symlink e2e tests

Closes #770
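The expansion described in this commit can be sketched roughly as follows. This is a minimal illustration rather than the code in opa.rs: expand_binary_paths and the resolver closure are hypothetical names, and the real implementation builds OPA data from the policy proto instead of a plain list of paths.

```rust
use std::collections::BTreeSet;
use std::path::{Path, PathBuf};

/// Hypothetical helper: return every policy binary path plus, where the
/// resolver yields one, its resolved target. Order is preserved and
/// duplicates are dropped.
fn expand_binary_paths<F>(paths: &[PathBuf], resolve: F) -> Vec<PathBuf>
where
    F: Fn(&Path) -> Option<PathBuf>,
{
    let mut seen = BTreeSet::new();
    let mut out = Vec::new();
    for p in paths {
        // Always keep the path exactly as the user wrote it.
        if seen.insert(p.clone()) {
            out.push(p.clone());
        }
        // Add the resolved path only when resolution succeeded.
        if let Some(resolved) = resolve(p) {
            if seen.insert(resolved.clone()) {
                out.push(resolved);
            }
        }
    }
    out
}
```

Because the original entry is always kept, a resolver that fails (returns None) leaves the list identical to the input, which matches the backward-compatible behavior the commit describes.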
…naccessible
The Linux-specific symlink resolution tests depend on /proc/<pid>/root/ being readable, which requires CAP_SYS_PTRACE or permissive ptrace scope. This is unavailable in CI containers, rootless containers, and hardened hosts. Add a procfs_root_accessible() guard that skips these tests gracefully instead of failing.
…improve deny messages
When /proc/<pid>/root/ is inaccessible (restricted ptrace, rootless
containers, hardened hosts), resolve_binary_in_container now logs a
per-binary warning with the specific error, the path it tried, and
actionable guidance (use canonical path or grant CAP_SYS_PTRACE).
Previously this was completely silent.
The Rego deny reason for binary mismatches now leads with 'SYMLINK HINT'
and includes a concrete fix command ('readlink -f' inside the sandbox)
plus what to look for in logs if automatic resolution isn't working.
…ution
std::fs::canonicalize resolves /proc/<pid>/root itself (a kernel pseudo-symlink to /), which strips the prefix needed for path extraction. This caused resolution to silently fail in all environments, not just CI.

Replace with an iterative read_link loop that walks the symlink chain within the container namespace without resolving the /proc mount point. Add normalize_path helper for relative symlink targets containing .. components.

Update procfs_root_accessible test guard to actually probe the full resolution path instead of just checking path existence.
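The read_link loop and path normalization described above might look something like the sketch below. It is an assumption-laden illustration, not the code from this PR: resolve_under_root, its max_hops parameter, and the body of normalize_path are hypothetical, and the sketch only follows a symlink at the final path component. The key point it shows is that the root prefix (for example /proc/<pid>/root) is joined on for each filesystem call but is never itself resolved.

```rust
use std::fs;
use std::io;
use std::path::{Component, Path, PathBuf};

/// Lexically collapse `.` and `..` without touching the filesystem.
/// Intended for absolute paths; a leading `..` on a relative path is dropped.
fn normalize_path(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for comp in path.components() {
        match comp {
            Component::CurDir => {}
            Component::ParentDir => {
                // pop() leaves the root in place, so "/.." stays "/".
                out.pop();
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Follow the symlink chain for `container_path` (absolute, as seen inside
/// the container) beneath `root`, returning the final in-container path.
fn resolve_under_root(root: &Path, container_path: &Path, max_hops: usize) -> io::Result<PathBuf> {
    let mut current = normalize_path(container_path);
    for _ in 0..max_hops {
        // Strip the leading "/" first: Path::join discards `root` when the
        // right-hand side is absolute.
        let rel = current.strip_prefix("/").unwrap_or(&current);
        let host_view = root.join(rel);
        if !fs::symlink_metadata(&host_view)?.file_type().is_symlink() {
            return Ok(current);
        }
        // read_link reads only this link's target and follows nothing.
        let target = fs::read_link(&host_view)?;
        current = if target.is_absolute() {
            normalize_path(&target)
        } else {
            let parent = current.parent().unwrap_or(Path::new("/"));
            normalize_path(&parent.join(target))
        };
    }
    Err(io::Error::new(io::ErrorKind::Other, "too many symlink hops"))
}
```

Bounding the loop with max_hops is a design choice of this sketch to guard against symlink cycles; the source does not say how the actual implementation handles them.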
8253c6c to 907b9fe
@johntmyers Haven't been able to build from source: no Rust toolchain on this host, and it's a production VPS I'd rather not clutter. Happy to test a pre-built binary or release candidate if one's available. Here's what I'd verify:
Test case: 51 CLI tools symlinked as
Current workaround: Listing both symlink AND resolved paths in the
What I'd verify:
If there's a dev release tag or binary I can drop in, I'll test same-day.
@johntmyers Built, deployed, and tested. Here's what I found: the fix compiles cleanly and runs, but does not actually resolve the bug in our environment. The warning path fires unconditionally and the 403 reproduces. Details:
Environment
Build
Deployment
Test case
With the patched supervisor
Same 403. The supervisor logs 23 warnings at startup (once per policy binary ref), all identical:
No
The puzzle
Running the exact same access manually from within the supervisor's PID + mount namespace works fine:
So the path exists, the namespaces are right, CAP_SYS_PTRACE is held, and the supervisor is root. Yet
Best guesses
What would help
Happy to rebuild with an instrumented version if you want me to patch in a few extra logs and re-test. I also kept the build cache so re-runs are ~1 min. Rollback verified clean, sandbox is back on stock v0.0.25.
…ready
The one-shot resolve ran immediately after ProcessHandle::spawn, before the child's mount namespace and /proc/<pid>/root/ were populated. This caused symlink_metadata to fail with ENOENT on every binary, and the poll loop never retried because it only reloads when the policy hash changes on the server.

Replace the synchronous resolve with an async task that probes /proc/<pid>/root/ with retries (10 attempts, 500ms apart, 5s total). The child's mount namespace is typically ready within a few hundred ms.

Also inline error values into warning message strings so they appear in default log output (not just as structured tracing fields that may be elided), and add debug-level logs before each symlink_metadata call to aid diagnosis.
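The retry behavior in this commit can be illustrated with a small bounded-retry helper. This is a simplified, blocking sketch using only the standard library; the commit describes an async task, and retry_probe is a hypothetical name that does not come from the PR.

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: call `probe` up to `attempts` times, sleeping `delay`
/// after each failure except the last. Returns the first success, or the
/// last error if every attempt fails.
fn retry_probe<T, F>(attempts: u32, delay: Duration, mut probe: F) -> io::Result<T>
where
    F: FnMut() -> io::Result<T>,
{
    let mut last_err = io::Error::new(io::ErrorKind::Other, "no attempts made");
    for attempt in 1..=attempts {
        match probe() {
            Ok(value) => return Ok(value),
            Err(err) => {
                last_err = err;
                if attempt < attempts {
                    thread::sleep(delay);
                }
            }
        }
    }
    Err(last_err)
}
```

Under these assumptions, a caller would pass 10 attempts, a 500 ms delay, and a closure that runs std::fs::symlink_metadata on the entrypoint's /proc/<pid>/root/ path, then trigger the OPA rebuild only once the probe succeeds.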
@johntmyers Retested with commit
Environment
Observation: warning pattern changed dramatically vs. synchronous version
Previous test (pre-retry version, commit
Current test (
The 7 existing binaries each resolved through
Direct verification from inside the pod
After the restart,
That's exactly what the fix needs:
What I couldn't test
I wanted to also do a functional A/B, i.e., reproduce a 403 on a symlinked binary under stock and show it gone under patched. That part was inconclusive in this sandbox because our claw-test pod isn't currently running an active egress-enforcement iptables/netfilter interception path (separate issue; I'll dig in or file a new bug once I figure out why). Under both stock and patched, unrestricted HTTPS requests to
So: no direct 200/403 proof-of-fix, but the log-pattern evidence is crisp: the previous test failed because
Cleanup
Rolled the cluster container back to stock
One small nit
LGTM from a real-world behavioral standpoint. Thanks for the fast turnaround on the iteration.
Summary
Policy binary paths specified as symlinks (e.g., /usr/bin/python3) were silently denied because the kernel reports the canonical path via /proc/<pid>/exe (e.g., /usr/bin/python3.11). This fix resolves symlinks through the container filesystem after the entrypoint starts, expanding the OPA policy data so both the original and resolved paths match.
Related Issue
Closes #770
Changes
- opa.rs: Added resolve_binary_in_container() helper that resolves symlinks via /proc/<pid>/root/ on Linux using iterative read_link (not canonicalize, which resolves the procfs mount itself). Added from_proto_with_pid() and reload_from_proto_with_pid() methods that expand binary paths during OPA data construction. Existing from_proto()/reload_from_proto() delegate with pid=0 (backward-compatible, no expansion). Added normalize_path() for relative symlink targets with .. components.
- lib.rs: load_policy() now retains the proto for post-start OPA rebuild. After entrypoint_pid.store(), triggers a one-shot OPA rebuild with the real PID. run_policy_poll_loop() passes the PID on each hot-reload so symlinks are re-resolved.
- sandbox-policy.rego: Deny reason for binary mismatches now leads with SYMLINK HINT and includes actionable fix guidance (readlink -f command, what to check in logs).
Design decisions
- b.path == exec.path strict equality naturally matches the expanded entry.
- read_link over canonicalize: std::fs::canonicalize resolves /proc/<pid>/root itself (a kernel pseudo-symlink to /), stripping the prefix needed for path extraction. We use iterative read_link, which reads only the specified symlink target, staying within the container namespace.
Best-effort approach and known risks
Symlink resolution is opportunistic; it improves the common case but cannot be guaranteed in all environments. When resolution fails, we are loud about it: per-binary WARN-level logs explain exactly what failed and what the operator should do. Deny reasons include prominent SYMLINK HINT text with actionable fix commands. Both flow through the gRPC LogPushLayer and are visible via openshell logs.
Environments where resolution will not work:
kernel.yama.ptrace_scope >= 2)
/proc/<pid>/root/ returns EACCES even for own PID
--policy-rules/--policy-data, no --sandbox-id)
readlink -f inside sandbox
In all failure cases: the original user-specified path is preserved, the deny behavior is identical to pre-fix, and the operator gets a clear warning log explaining why resolution didn't work and what to do about it.
Testing
- mise run pre-commit passes
- normalize_path helper for ../. resolution
- resolve_binary_in_container edge cases (glob skip, pid=0, nonexistent paths)
- _with_pid variants
- SYMLINK HINT and readlink -f command
Checklist