chore(docker): wire http_proxy / https_proxy / no_proxy ARGs through builds #519
…builds

Add three new build args to both Dockerfile and ui/Dockerfile so corporate installs behind an HTTP proxy can route apt / PyPI / npm fetches at build time AND outbound HTTP from the runtime container (OpenAI, GitHub, cluster ES/OpenSearch/Solr HTTP) through that proxy. Empty defaults preserve current behavior; both case variants (http_proxy + HTTP_PROXY etc.) are written by the ENV blocks because Linux tooling is split on the convention.

Compose forwards the three env vars into every service's build.args block (migrate / api / worker / ui), so 'http_proxy=... make up' or setting them in '.env' works end-to-end.

The Compose-service-names gotcha is documented loudly. Without 'postgres,redis,elasticsearch,opensearch,solr,api,worker,migrate' in 'no_proxy', the worker's HTTP call to 'http://elasticsearch:9200' (and similar in-network HTTP) gets routed through the corporate proxy, which has no path to those Compose-internal hostnames. The recommended .env.example default bakes them in alongside 169.254.169.254 (cloud-metadata) and 10.0.0.0/8 (internal VPC).

Bundles in lockstep:
- New 'Corporate HTTP proxy (apt / PyPI / npm + runtime egress)' subsection in docs/01_architecture/deployment.md covering the env vars, the no_proxy Compose-service-names gotcha, and the deeper Artifactory-mirror case (not currently supported by the Dockerfiles; pointers only).
- .env.example documents all three with a copy-pasteable default no_proxy.

Validated:
- 'docker buildx build --check' clean on both Dockerfiles
- 'docker compose config' resolves all 12 build-arg slots (4 services x 3 vars) for both default-empty and override paths
- backend/tests/unit/test_dockerfile_runtime_stage.py: 3/3 pass

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
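The no_proxy gotcha above can be checked mechanically before running a build. The following is a minimal sketch, not code from this PR: the function name `missing_no_proxy_entries` is hypothetical, and the hard-coded service list simply mirrors the names quoted in the commit message. It assumes a comma-separated no_proxy value without spaces around the commas.

```bash
#!/usr/bin/env bash
# Sketch: report which Compose service names are absent from a no_proxy value.
# missing_no_proxy_entries is a hypothetical helper; the service list is an
# assumption copied from the commit message, not read from docker-compose.yml.

missing_no_proxy_entries() {
  local no_proxy_value="$1"
  local required="postgres redis elasticsearch opensearch solr api worker migrate"
  local missing=""
  local name
  for name in $required; do
    # Wrap both sides in commas so "api" does not match "api.example.com".
    case ",${no_proxy_value}," in
      *",${name},"*) ;;
      *) missing="${missing:+${missing} }${name}" ;;
    esac
  done
  # Prints a space-separated list; an empty line means nothing is missing.
  printf '%s\n' "$missing"
}

# Example: a value that covers only part of the Compose network.
missing_no_proxy_entries "localhost,127.0.0.1,10.0.0.0/8,postgres,redis"
```

An empty result suggests in-network calls such as the worker's request to http://elasticsearch:9200 would bypass the proxy; any name printed is a hostname that would still be sent to it.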
UV_REGISTRY (added in PR #517) read like an ARG for uv-managed dependencies when it actually controls only the registry prefix for the ghcr.io/astral-sh/uv tooling image. Rename to GHCR_REGISTRY: it describes *what* upstream it overrides (the GHCR namespace), matches BASE_REGISTRY's naming style (the Docker Hub equivalent), and stays accurate if any future GHCR-hosted image is added.

Touches 5 files: Dockerfile, docker-compose.yml (3 build.args blocks: migrate / api / worker; ui doesn't reference GHCR), .env.example, and the two corp-proxy doc surfaces (deployment.md table row + override example; local-dev.md troubleshooting bullet).

The deployment.md description column was also generalized: it previously coupled the name back to "the astral-sh/uv COPY-from stage", and now reads as "every GHCR-hosted image … currently used by the uv-source alias stage; any future GHCR image lands under the same prefix."

Breaking-change footprint is intentionally tiny: PR #517 merged hours ago and no operator has had time to bake UV_REGISTRY into their .env yet. state.md's PR #517 historical entry intentionally retains the original 'UV_REGISTRY' wording (accurate for what shipped at that merge); the upcoming state finalization for this PR will note the rename.

Validated:
- 'docker buildx build --check' clean on both Dockerfiles
- 'docker compose config' resolves the new name in all three backend services' build.args (migrate / api / worker)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
Code Review
This pull request introduces corporate HTTP proxy support for both build-time and runtime egress across the backend and frontend services, updating the Dockerfiles, Docker Compose configuration, and deployment documentation. The feedback highlights that baking proxy settings into the Docker images via ENV is an anti-pattern; instead, these should be dynamically passed at runtime via the environment section in docker-compose.yml. Additionally, it is recommended to include host.docker.internal in the default no_proxy lists to ensure local LLM development works seamlessly behind a proxy.
ENV http_proxy=${http_proxy} \
    https_proxy=${https_proxy} \
    no_proxy=${no_proxy} \
    HTTP_PROXY=${http_proxy} \
    HTTPS_PROXY=${https_proxy} \
    NO_PROXY=${no_proxy}
Baking build-time proxy configuration into the Docker image via ENV is a known anti-pattern. Docker automatically treats proxy variables (http_proxy, https_proxy, no_proxy, etc.) as special build-time arguments that are available during RUN instructions but are purposely not persisted in the final image to keep it environment-agnostic.
By explicitly defining ENV http_proxy=${http_proxy}, you are overriding this behavior and hardcoding the builder's proxy settings into the image. Furthermore, because these variables are not defined in the environment section of docker-compose.yml, any runtime container started from a pre-built image will not have these proxy settings at runtime.
Recommendation:
- Remove the ENV block from the Dockerfile.
- Pass these variables dynamically at runtime by adding them to the environment section of each service in docker-compose.yml:
environment:
- http_proxy
- https_proxy
- no_proxy
- HTTP_PROXY
- HTTPS_PROXY
- NO_PROXY

ENV http_proxy=${http_proxy} \
    https_proxy=${https_proxy} \
    no_proxy=${no_proxy} \
    HTTP_PROXY=${http_proxy} \
    HTTPS_PROXY=${https_proxy} \
    NO_PROXY=${no_proxy}

ARG http_proxy
ARG https_proxy
ARG no_proxy
ENV http_proxy=${http_proxy} \
    https_proxy=${https_proxy} \
    no_proxy=${no_proxy} \
    HTTP_PROXY=${http_proxy} \
    HTTPS_PROXY=${https_proxy} \
    NO_PROXY=${no_proxy}
The runner stage does not run any build-time commands (like npm install or apt-get) that require a proxy. Declaring these ARGs and ENVs in the runner stage is purely for runtime proxying, which should be handled dynamically via docker-compose.yml's environment section rather than being baked into the image.
http_proxy: ${http_proxy:-}
https_proxy: ${https_proxy:-}
no_proxy: ${no_proxy:-}
Passing these as build.args is correct for build-time proxying (e.g., apt-get and uv sync). However, to support runtime proxying (especially when using pre-built images or when running without rebuilding), these variables must also be declared in the environment section of the services.
Add them to the environment section of migrate, api, worker, and ui services:
environment:
- http_proxy
- https_proxy
- no_proxy
- HTTP_PROXY
- HTTPS_PROXY
- NO_PROXY

# which convention it reads.
# http_proxy=http://http.proxy.your-corp.com:8000
# https_proxy=http://http.proxy.your-corp.com:8000
# no_proxy=your-corp.com,.your-corp-cloud.com,localhost,127.0.0.1,10.0.0.0/8,169.254.169.254,postgres,redis,elasticsearch,opensearch,solr,api,worker,migrate
When developers run local LLMs (such as Ollama, LM Studio, or vLLM) behind a corporate proxy, they configure OPENAI_BASE_URL to point to http://host.docker.internal. Without adding host.docker.internal to the no_proxy list, requests to the local LLM will be incorrectly routed to the corporate proxy and fail.
Include host.docker.internal in the default no_proxy list to ensure local LLM development works seamlessly out of the box.
# no_proxy=your-corp.com,.your-corp-cloud.com,localhost,127.0.0.1,10.0.0.0/8,169.254.169.254,host.docker.internal,postgres,redis,elasticsearch,opensearch,solr,api,worker,migrate
# no_proxy=your-corp.com,.your-corp-cloud.com,localhost,127.0.0.1,
# 10.0.0.0/8,169.254.169.254,
# postgres,redis,elasticsearch,opensearch,solr,api,worker,migrate
Add host.docker.internal to the documented no_proxy example in the Dockerfile comments to keep it in sync with .env.example and support local LLM development behind a proxy.
# no_proxy=your-corp.com,.your-corp-cloud.com,localhost,127.0.0.1,
# 10.0.0.0/8,169.254.169.254,host.docker.internal,
# postgres,redis,elasticsearch,opensearch,solr,api,worker,migrate
# In .env
http_proxy=http://http.proxy.your-corp.com:8000
https_proxy=http://http.proxy.your-corp.com:8000
no_proxy=your-corp.com,.your-corp-cloud.com,localhost,127.0.0.1,10.0.0.0/8,169.254.169.254,postgres,redis,elasticsearch,opensearch,solr,api,worker,migrate
Update the documentation example to include host.docker.internal in the no_proxy list to ensure consistency with .env.example.
- no_proxy=your-corp.com,.your-corp-cloud.com,localhost,127.0.0.1,10.0.0.0/8,169.254.169.254,postgres,redis,elasticsearch,opensearch,solr,api,worker,migrate
+ no_proxy=your-corp.com,.your-corp-cloud.com,localhost,127.0.0.1,10.0.0.0/8,169.254.169.254,host.docker.internal,postgres,redis,elasticsearch,opensearch,solr,api,worker,migrate
… + compose environment

Accept all 7 Gemini findings on PR #519:

HIGH (1-4), anti-pattern: Dockerfile ENV bakes proxy URL into image, and explicit ARG declarations were unnecessary.

The fix: rely on BuildKit's predefined proxy ARGs. Docker treats `http_proxy` / `https_proxy` / `no_proxy` (plus uppercase + FTP/ALL variants) as predefined ARGs: BuildKit forwards them from --build-arg into every RUN step's environment automatically, with no `ARG` declaration needed, and intentionally excludes them from `docker history` so the proxy URL never gets baked into the image. The previous explicit `ARG http_proxy=` + `ENV http_proxy=...` blocks in both Dockerfiles were both redundant AND counterproductive (the ENV baked the URL into the runtime image, making it environment-coupled).

Removed:
- Backend Dockerfile: global ARGs + base-stage ARG+ENV block. Kept the explanatory comment.
- ui/Dockerfile: global ARGs + deps-stage ARG+ENV block + runner-stage ARG+ENV block. Kept the explanatory comment.

Added to docker-compose.yml:
- `environment:` block proxy entries on migrate / api / worker / ui services (six entries each, both case variants). Reads from .env via `${http_proxy:-}` etc. Empty default = no proxy = unchanged behavior.
- Updated the build.args comment to explain the new architecture (build-time = BuildKit predefined; runtime = environment: block).

MEDIUM (5-7): `host.docker.internal` missing from default `no_proxy`. Without it, local-LLM development (Ollama / LM Studio / vLLM via `OPENAI_BASE_URL=http://host.docker.internal:…`) breaks behind a corp proxy because the proxy intercepts the local-machine call.

Added to:
- .env.example default `no_proxy` line + the gotcha block
- Dockerfile explanatory comment
- deployment.md override example + gotcha paragraph

Also added an "Architecture: build-time vs runtime" paragraph to deployment.md linking to Docker's predefined-ARGs reference and explaining the dual-path design.

Validated:
- 'docker buildx build --check' clean on both Dockerfiles
- 'docker compose config' resolves proxy vars in all 4 services' build.args (build-time) AND environment: (runtime), for both default-empty and override paths
- backend/tests/unit/test_dockerfile_runtime_stage.py: 3/3 pass

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
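Since the runtime path now depends on six `environment:` entries per service, one way to spot a half-configured setup is to compare each lowercase variable with its uppercase twin. This is a small sketch under assumptions: `proxy_case_mismatches` is a hypothetical helper that is not part of this PR, and it inspects whatever shell environment it is run in (bash is required for the indirect expansion).

```bash
#!/usr/bin/env bash
# Sketch: list proxy variables whose lowercase and uppercase forms disagree.
# proxy_case_mismatches is a hypothetical name, not a script shipped by the PR.

proxy_case_mismatches() {
  local lower upper mismatches=""
  for lower in http_proxy https_proxy no_proxy; do
    upper="$(printf '%s' "$lower" | tr '[:lower:]' '[:upper:]')"
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    # Unset variables expand to the empty string, so "both unset" counts as a match.
    if [ "${!lower}" != "${!upper}" ]; then
      mismatches="${mismatches:+${mismatches} }${lower}"
    fi
  done
  # Prints a space-separated list; an empty line means every pair agrees.
  printf '%s\n' "$mismatches"
}

proxy_case_mismatches
```

A non-empty result points at a pair where a tool reading one convention would see a different proxy setting from a tool reading the other.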
Adjudication of Gemini Code Assist review
Per CLAUDE.md ("Before considering a PR ready to merge"). 7 findings: all accepted, all addressed in
The crucial technical correction (Findings 1-4)
Docker treats My initial design used explicit The corrected architecture:
Both paths read the same
Findings 5-7
Adjudication of 2 follow-up Gemini findings on
| # | File:Line | Severity | Verdict | Counter-evidence |
|---|---|---|---|---|
| A | docker-compose.yml:79 ("add environment: proxy block") | HIGH | Reject (stale) | All 4 services already have http_proxy/https_proxy/no_proxy + uppercase in their environment: blocks. See docker-compose.yml:144-149 (api), :93-98 (migrate), :226-231 (worker), :271-276 (ui). grep -c "http_proxy: \${http_proxy:-}" docker-compose.yml = 8 (4 × build.args + 4 × environment), confirming both layers are wired. |
| B | Dockerfile:60 ("add host.docker.internal to comment") | MED | Reject (stale) | host.docker.internal already present in the cited comment at Dockerfile:59 (in the no_proxy=… example), plus lines 64, 67, 68 as part of the gotcha explanation. grep -n "host.docker.internal" Dockerfile returns 4 matches. |
Both look like the bot re-scanned the build.args block in isolation and flagged based on it alone, without cross-checking the environment: block 60 lines further down (Finding A) and without scanning the full no_proxy=… comment line (Finding B). Pattern matches the project's known "Gemini re-flags resolved findings on the new SHA" failure mode.
No file changes; merging.
Update state.md current-branch / execution context to reflect the 4ff6e33 merge, prepend a one-line entry to "Last 5 merges", drop the now-6th row (bug_seed_meaningful_demos_silent_bulk_errors PR #482) into the state_history.md older-entries reference, and add the full reasoning entry to state_history.md. The narrative covers the Gemini-driven design correction (Dockerfile ARG+ENV blocks removed in favor of BuildKit predefined-ARGs for build-time + compose environment: for runtime), the UV_REGISTRY → GHCR_REGISTRY rename rationale, the `host.docker.internal` local-LLM gotcha, the no_proxy Compose-service-names gotcha, and the two stale-rejected follow-up Gemini findings. Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…) (#522) Update state.md current-branch + execution context, prepend the new one-line entry to "Last 5 merges", drop the now-6th row (chore_overnight_result_card_screenshot PR #492) into the older-entries reference, and add the full reasoning entry to state_history.md. The narrative covers the live corp-firewall reproducer (`make up` failing at line 1 with 403 on docker.io/docker/dockerfile:1.7), the architectural reason ARGs cannot fix it (syntax directive parses before ARGs are in scope), the safety analysis (no BuildKit-1.7+ features used), and frames this as the third PR completing today's corp-proxy install story alongside #517 + #519. Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…gnostics + runbook (#523)

* chore(docker): optional corp CA cert via build secret (TLS interception)

Add optional corporate CA certificate support for installs behind a corp HTTPS proxy that performs TLS interception (re-signs traffic with an internal CA the container does not trust). Empty file = no-op (OSS users unaffected); operators behind a TLS-intercepting proxy drop their PEM cert at ./secrets/corp_ca.crt and re-run 'make up'.

The error signatures this fixes:
- npm/pnpm: SELF_SIGNED_CERT_IN_CHAIN
- curl/openssl: unable to get local issuer certificate
- Python (requests/httpx/openai): CERTIFICATE_VERIFY_FAILED
- Go: x509: certificate signed by unknown authority

Mechanism. The cert is mounted as a BuildKit secret via '--mount=type=secret,id=corp_ca,target=/tmp/corp_ca.crt,required=false' in both Dockerfiles. When non-empty, copied to /usr/local/share/ca-certificates/corp_ca.crt and `update-ca-certificates` rebuilds /etc/ssl/certs/ca-certificates.crt. Every HTTPS tool in the container then trusts the corp CA at BOTH build time AND runtime (the cert content is baked into the system trust bundle).

Touched:
- Dockerfile: install block in base stage after the apt 'ca-certificates' install (inherited by deps + runtime via 'FROM base'; one block covers all backend build steps + runtime egress).
- ui/Dockerfile: install block in deps stage (build-time, before 'npm install -g pnpm@9', the canonical TLS-interception failure point); SECOND install in runner stage (runtime egress; runner is a fresh FROM, doesn't inherit from deps).
- docker-compose.yml: new top-level 'secrets.corp_ca' entry pointing at './secrets/corp_ca.crt'; 'build.secrets: - corp_ca' on all 4 services (migrate / api / worker / ui).
- scripts/install.sh: auto-create empty './secrets/corp_ca.crt' placeholder so Compose's secrets validation doesn't fail at startup.
- .env.example: explanatory block describing the symptom + the file path (no env var; the cert is a file, not a string).
- docs/01_architecture/deployment.md: new 'Corporate TLS interception' subsection with error signatures table, mechanism, one-time setup, verification, and cross-reference to the upcoming runbook.
- docs/03_runbooks/local-dev.md: troubleshooting bullet pointing to the runbook for the TLS-interception case.

Validated:
- 'docker buildx build --check' clean on both Dockerfiles
- 'docker compose config --quiet' clean (new build.secrets block parses)
- backend/tests/unit/test_dockerfile_runtime_stage.py: 3/3 pass

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>

* chore(install): add diagnose_build_failure wrapper to install.sh

When 'docker compose build' fails inside scripts/install.sh, wrap the output and scan for known corp-network failure signatures, then print an actionable diagnostic block pointing at the specific runbook section.

Underlying tool errors (SELF_SIGNED_CERT_IN_CHAIN from npm, "403 Forbidden" from registry-1.docker.io, "Could not resolve host" from curl) are technically correct but operationally useless: a developer seeing them does not know to drop a corp CA cert at ./secrets/corp_ca.crt or set BASE_REGISTRY in .env. This wrapper closes the gap.

Three detection patterns (each prints a tailored fix + runbook pointer):
1. TLS interception (corp HTTPS proxy with internal CA)
   Matches: SELF_SIGNED_CERT_IN_CHAIN, self-signed cert in chain, unable to get local issuer certificate, CERTIFICATE_VERIFY_FAILED, certificate verify failed, x509: certificate signed by unknown
   Hint: drop corp CA at ./secrets/corp_ca.crt
2. Container registry blocked (BASE_REGISTRY / GHCR_REGISTRY not set or set to wrong path)
   Matches: failed to resolve source metadata for docker.io, registry-1.docker.io 401/403, no such host (docker.io|ghcr.io)
   Hint: set BASE_REGISTRY + GHCR_REGISTRY in .env
3. Outbound HTTP blocked (apt/PyPI/npm can't reach upstream)
   Matches: Could not resolve host, Temporary failure resolving, dial tcp.*no such host, Connection refused/timed out, ETIMEDOUT, ECONNREFUSED
   Hint: set http_proxy + https_proxy + no_proxy in .env

Fail-open fallback: if no pattern matches, print a generic pointer at docs/03_runbooks/corporate-network-install.md (the FAQ) and docs/03_runbooks/local-dev.md "Stack will not start" (general troubleshooting). The runbook file is added in a follow-up commit on this same branch so the pointer resolves the moment this PR merges.

Implementation. Wraps the existing 'docker compose build' call with 'docker compose build 2>&1 | tee "$build_log"' captured to a temp file; the 'if !' guard catches the pipeline exit (set -e + pipefail is on); on failure, runs the three pattern detectors then 'exit 1'. The wrapper preserves existing CI behavior: 'RELYLOOP_SKIP_BUILD=1' still skips the build entirely.

Validated:
- 'bash -n scripts/install.sh' clean
- Smoke-tested all 3 detection paths + the fallback against synthetic log fixtures: each pattern fires the right diagnostic block; the fallback fires when none match

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>

* docs(runbook): corporate-network-install symptom-first FAQ

New runbook for operators running 'make up' from inside a corporate network. Symptom-first layout (paste error block → find section → follow fix). Covers every corp-network failure mode we've seen during today's work (PRs #517 / #519 / #521 / this PR's earlier commits):

§1 Registry pull failures
- "403 Forbidden" / "401 Unauthorized" from registry-1.docker.io
- "failed to resolve source metadata for docker.io/..."
- "no such host: registry-1.docker.io" / "no such host: ghcr.io"
- Fix: BASE_REGISTRY + GHCR_REGISTRY in .env
- Artifactory layout disambiguation (unified vs split per upstream)

§2 TLS verification errors
- SELF_SIGNED_CERT_IN_CHAIN (npm/pnpm)
- "unable to get local issuer certificate" (curl/openssl)
- "x509: certificate signed by unknown authority" (Go)
- CERTIFICATE_VERIFY_FAILED (Python)
- Fix: drop corp CA at ./secrets/corp_ca.crt
- Four ways to find the corp CA (IT, Chrome/Edge, Firefox, openssl probe)
- Verification commands

§3 Egress / DNS failures
- "Could not resolve host"
- "Connection refused" / "Connection timed out" / ETIMEDOUT / ECONNREFUSED
- Fix: http_proxy + https_proxy + no_proxy in .env
- The no_proxy three-category checklist (Compose service names, host.docker.internal, cloud-metadata + VPC)

§4 Worker stays "unhealthy" after make up succeeds
- Cause: no_proxy missing Compose service names
- Fix: add postgres,redis,elasticsearch,opensearch,solr,api,worker,migrate

§5 Runtime calls to OpenAI / GitHub fail
- Cause A: host env vars set instead of .env
- Cause B: TLS interception on the runtime path
- Verification commands for both

Plus quick decision tree at the top, a verifying-your-full-config one-shot, and cross-refs to deployment.md (architecture) and local-dev.md (general troubleshooting).

The runbook is referenced by:
- scripts/install.sh's diagnose_build_failure (commit B): all three detection branches plus the fallback point at specific sections.
- docs/01_architecture/deployment.md "Corporate TLS interception" § (added in commit A), for symptom lookup.
- docs/03_runbooks/local-dev.md "Stack will not start" troubleshooting bullets (existing pattern, extended in commit A and again in commit D).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>

* docs(claude-md): index corporate-network-install runbook

Adds a row to the "Key Runbooks" table pointing at the new docs/03_runbooks/corporate-network-install.md (added in commit C of this branch). Keeps the table the canonical "where do I look when ..." index: without this row, the new runbook is only discoverable by the in-doc cross-references (deployment.md + local-dev.md) and by the install.sh diagnostic output. Adding it to CLAUDE.md makes it discoverable to anyone reading the project's top-level conventions.

The other doc surfaces (deployment.md "Corporate TLS interception" §, local-dev.md "Stack will not start" troubleshooting bullets) were updated in commit A of this branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>

* fix(install): wrap docker compose build in function to satisfy CI guard

CI guard scripts/ci/verify_install_builds_all_services.sh enforces that scripts/install.sh contains a top-level `docker compose build` line via regex `^[[:space:]]*docker compose build( .*)?$`. My earlier wrapper (commit fe8d5b4 on this branch) used `if ! docker compose build 2>&1 | tee "$build_log"; then`, which fails the anchor because of the `if !` prefix.

Fix: move the bare `docker compose build` invocation into a wrapper function `do_compose_build()`; the calling site then runs `do_compose_build 2>&1 | tee "$build_log" || build_status=$?` and checks PIPESTATUS[0] to detect failure. The bare line inside the function body satisfies the guard regex; the calling-site pipe is unchanged in behavior (same output capture, same diagnostic-on-failure path).
Validated:
- `bash scripts/ci/verify_install_builds_all_services.sh`: OK (no-args = builds all)
- `bash -n scripts/install.sh`: clean
- Diagnostic still fires on synthetic TLS-interception log fixture

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>

* chore(docker): adjudicate Gemini: accept 3 ACCEPT + 1 PARTIAL ACCEPT

PR #523 Gemini review (4 MEDIUM findings on f34a278):

Findings 2-4 (Dockerfile:119, ui/Dockerfile:66, ui/Dockerfile:92): ACCEPT. The 'update-ca-certificates 2>&1 | tail -1' chain was a tidiness optimization (the command prints info about each CA processed); but the RUN shell is /bin/sh with no pipefail, so a failed update-ca-certificates exit would be silently masked by tail's exit code 0. Real correctness issue: an invalid corp CA cert would have shipped an image with the cert NOT in the trust store but the build "succeeded." Dropped the pipe in all 3 places; full update-ca-certificates output now prints (verbose > silent failures).

Finding 1 (scripts/install.sh:253): PARTIAL ACCEPT. ACCEPT the substance: 'trap EXIT' is the right cleanup mechanism (robust under Ctrl-C / signals / unexpected exits, eliminates the duplicate 'rm -f' calls). Added 'trap "rm -f \"$build_log\"" EXIT' right after the mktemp; removed both stale 'rm -f' calls. REJECT the suggested code as-written: Gemini's suggestion regresses to 'if ! docker compose build 2>&1 | tee "$build_log"; then' on a bare line, which is the EXACT pattern the verify_install_builds_all_services CI guard rejects (commit f34a278 on this branch fixed this by moving the 'docker compose build' invocation into a do_compose_build() function so the bare line satisfies the guard regex). The function wrapper stays; only the trap-based cleanup was adopted from the suggestion.
Validated:
- bash scripts/ci/verify_install_builds_all_services.sh: OK (no-args)
- bash -n scripts/install.sh: clean
- docker buildx build --check both Dockerfiles: clean

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>

---------

Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
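The three detection patterns and the fail-open fallback described in the diagnose_build_failure commit can be sketched as a log classifier. This is an illustration under assumptions, not the code in scripts/install.sh: the function name `classify_build_failure` and the category labels are hypothetical, and the regexes are a loose transcription of the signatures listed in the commit message.

```bash
#!/usr/bin/env bash
# Sketch: classify a captured build log into one of the three corp-network
# failure categories, or "unknown" for the fail-open fallback.
# classify_build_failure and its labels are hypothetical names.

classify_build_failure() {
  local build_log="$1"
  if grep -Eqi 'SELF_SIGNED_CERT_IN_CHAIN|self-signed cert|unable to get local issuer certificate|CERTIFICATE_VERIFY_FAILED|certificate verify failed|x509: certificate signed by unknown' "$build_log"; then
    echo "tls-interception"    # hint: drop corp CA at ./secrets/corp_ca.crt
  elif grep -Eq 'failed to resolve source metadata for docker\.io|registry-1\.docker\.io.*(401|403)|no such host.*(docker\.io|ghcr\.io)' "$build_log"; then
    echo "registry-blocked"    # hint: set BASE_REGISTRY + GHCR_REGISTRY in .env
  elif grep -Eq 'Could not resolve host|Temporary failure resolving|dial tcp.*no such host|Connection (refused|timed out)|ETIMEDOUT|ECONNREFUSED' "$build_log"; then
    echo "egress-blocked"      # hint: set http_proxy + https_proxy + no_proxy in .env
  else
    echo "unknown"             # fallback: point at the general runbook
  fi
}
```

The registry check runs before the egress check on purpose in this sketch: a line such as "no such host: ghcr.io" would otherwise also satisfy the broader egress patterns.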
) Update state.md current-branch + execution context to reflect the 6c5fac5 merge, prepend a one-line entry to "Last 5 merges", drop the now-6th row (bug_cluster_url_ssrf_hostname_bypass Phase 1 PR #510) into the state_history.md older-entries reference, and add the full reasoning entry to state_history.md. The narrative covers the three bundled improvements (CA cert build secret, diagnose_build_failure wrapper, symptom-first runbook), the CI guard regex chronology (why commit B failed, what f34a278 fixed), the Gemini adjudication (4 MEDIUM accepted in 3d6e8bc: 3 pipe-masks real, trap EXIT substance accepted but if-! suggestion rejected), the stale Gemini follow-up (4 stale re-flags on 3d6e8bc, all rejected with cited counter-evidence), and frames this as the closer of today's four-PR corp-network install story (#517 + #519 + #521 + #523). Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
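The pipe-masking issue and the PIPESTATUS / trap EXIT fixes mentioned in this narrative can be reproduced in a few lines. The sketch below is illustrative only: `failing_step` stands in for any command whose failure must not be lost (such as update-ca-certificates or the compose build), and `run_logged` is a hypothetical helper, not the project's do_compose_build().

```bash
#!/usr/bin/env bash
# Sketch: a pipeline without pipefail reports the LAST command's status,
# so a failing first command is masked; PIPESTATUS[0] recovers it.

failing_step() {
  echo "step output"
  return 3
}

run_logged() {
  # Usage: run_logged LOGFILE COMMAND [ARGS...]
  local log_file="$1"; shift
  "$@" 2>&1 | tee "$log_file"
  # PIPESTATUS[0] is the exit status of the wrapped command, not of tee.
  return "${PIPESTATUS[0]}"
}

build_log="$(mktemp)"
# Remove the temp file on every exit path (normal exit, error, Ctrl-C).
trap 'rm -f "$build_log"' EXIT

# With pipefail off, the pipeline status is tail's status (0): failure masked.
set +o pipefail
failing_step | tail -1
masked_status=$?

# The "|| var=$?" form keeps the script alive even when set -e is active.
real_status=0
run_logged "$build_log" failing_step || real_status=$?

echo "masked=${masked_status} real=${real_status}"   # prints: masked=0 real=3
```

The first pipeline is the shape that was removed from the Dockerfiles; the second mirrors the calling-site pattern described for install.sh, where the captured log stays available for diagnostics while the real exit status is preserved.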
Update state.md current-branch + execution context, prepend the new one-line entry to "Last 5 merges", drop the now-6th row (chore_dockerfile_http_proxy_args PR #519) into the state_history.md older-entries reference, and add the full reasoning entry to state_history.md. The narrative covers the user-reported staleness (relyloop.com footer pinned at f733fcc / PR #509 / "7 days ago"), the root cause (paths filter correctly fires only on website/** changes; today's chore PRs didn't touch website/), the three options considered, the timing rationale (03:17 UTC for the project's "minute 17" convention, offset from existing crons, pre-dawn GHA low-load window), the cost (~30s/day of GHA time), and frames this PR as closing the "public visibility" gap left by today's otherwise-thorough install story. Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
This PR bundles two related Dockerfile-flexibility changes:
1. Wire http_proxy / https_proxy / no_proxy ARGs through builds (commit 03cb1e3c)
- Adds three build args to both Dockerfile and ui/Dockerfile so corp installs behind an HTTP proxy can route apt / PyPI / npm fetches at build time and outbound HTTP (OpenAI, GitHub, registered ES/OpenSearch/Solr) at runtime through that proxy.
- Both case variants (http_proxy + HTTP_PROXY, etc.) are ENV'd in every stage because Linux tooling is split: apt + curl prefer lowercase; uv + pip + requests accept either; npm + pnpm prefer uppercase.
- Compose forwards the three env vars into every service's build.args block (migrate / api / worker / ui).

The no_proxy gotcha: Compose service names
Without postgres,redis,elasticsearch,opensearch,solr,api,worker,migrate in no_proxy, the worker's HTTP call to http://elasticsearch:9200 (and similar in-network calls) gets routed through the corporate proxy, which has no path to those Compose-internal hostnames. The recommended .env.example default bakes them in alongside:
- 169.254.169.254: EC2 / cloud metadata service
- 10.0.0.0/8: internal VPC traffic
- localhost, 127.0.0.1: local

2. Rename UV_REGISTRY → GHCR_REGISTRY (commit 9a9f1a0c)
UV_REGISTRY (added in PR #517) read like an ARG for uv-managed dependencies (PyPI mirror?) when it actually controls only the registry prefix for the ghcr.io/astral-sh/uv tooling image. Renamed to GHCR_REGISTRY: it describes what upstream it overrides (the GHCR namespace), matches BASE_REGISTRY's naming style, and stays accurate if any future GHCR-hosted image is added.

The breaking-change surface is intentionally tiny: PR #517 merged hours ago and no operator has had time to bake UV_REGISTRY into their .env yet. The state.md entry for PR #517 intentionally retains the original UV_REGISTRY wording (accurate for what shipped at that merge); the upcoming state finalization will note the rename.

Three usage patterns
What's documented
- The "## Corporate registry proxy support" § in docs/01_architecture/deployment.md gains a new "### Corporate HTTP proxy (apt / PyPI / npm + runtime egress)" subsection covering the three env vars, the no_proxy Compose-service-names gotcha, and a pointer to the deeper Artifactory-mirror case (not currently supported by the Dockerfiles).
- deployment.md's GHCR_REGISTRY row description is generalized away from "the uv COPY-from stage" to "every GHCR-hosted image … currently used by the uv-source alias stage; any future GHCR image lands under the same prefix."
- .env.example documents all three proxy vars with a copy-pasteable default no_proxy.
- docs/03_runbooks/local-dev.md troubleshooting bullet updated to use the new GHCR_REGISTRY name.

Test plan
- docker buildx build --check -f Dockerfile . (clean).
- docker buildx build --check -f ui/Dockerfile ui (clean).
- docker compose config, default-empty case: all 12 proxy build-arg slots (4 services × 3 vars) resolve to ""; BASE_REGISTRY + GHCR_REGISTRY resolve to "" + ghcr.io/.
- docker compose --env-file <override> config, override case: all slots propagate the override value.
- pytest backend/tests/unit/test_dockerfile_runtime_stage.py: 3/3 pass.
- docker job: buildx of API image with default ARGs.
- docker-ui job: buildx of UI image with default ARGs.

🤖 Generated with Claude Code