chore: drop pre-emptive 'compose down' so failed builds don't take prod offline #7

Merged
brendan merged 1 commit from chore/nanodrop-deploy-skip-preemptive-down into main 2026-05-11 07:50:11 +00:00

Summary

Two-line CI fix: drop the pre-emptive docker compose -f compose.yaml down from the deploy heredoc and add --remove-orphans to the surviving docker compose -f compose.yaml up -d --build.

docker compose up -d --build builds the image first and only recreates (stops + replaces) the running container after a successful build. So a build failure now leaves the previous container serving traffic instead of guaranteeing a 502 from the reverse proxy.
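As a rough sketch of the resulting deploy commands: the function below is illustrative only. The name `deploy` is hypothetical, and the real commands live inside the ssh heredoc of the workflow's deploy step, which is not reproduced here.

```shell
# Sketch of the deploy commands after this change.
# `deploy` is a hypothetical wrapper; the real step runs these inside an ssh heredoc.
deploy() {
  # Removed by this PR: the line below used to run first and stopped the
  # running container before the new image had been built.
  #   docker compose -f compose.yaml down

  # Kept, with --remove-orphans appended: compose builds the image first and
  # only recreates the container after the build succeeds. If the build
  # fails, the command returns non-zero and the old container is untouched.
  docker compose -f compose.yaml up -d --build --remove-orphans
}
```

Because the function's last command is the `up` call, its exit status is whatever compose returns, so a failed build still fails the calling step.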

This is the 5th and final project in the cross-project compose down → up anti-pattern audit filed on 2026-05-10. After this lands, the chore is complete:

  • authd PR #12 — merge_commit ea19e09eedf632fb04c88bd16668f3831d25a435
  • buchinese PR #8 — merge_commit 902e30f2635f34a5fe9bed23c10109a1ca423125
  • inventory PR #19 — merge_commit ecc933cbed56bbaf9c9de62675a6f870f6e473c7
  • movement PR #16 — merge_commit e0863a6c9e4fe3c98fa77b06930da3319ddabc12
  • nanodrop — this PR

Concrete motivation

The 2026-05-10 buchinese outage from PR #6 (an npm ci lockfile mismatch that took prod to HTTP 502 for the entire human-intervention turnaround) is the cost being prevented: the deploy script destroyed the running container BEFORE finding out whether the new image builds. After this change a future failed build keeps the last-known-good container in place.

--remove-orphans

Drop-in safe: a no-op when there are no orphans; removes leftover containers for services that have been renamed in or removed from compose.yaml. The same flag was added to the 4 precedent workflows.
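To make "orphan" concrete, here is a minimal sketch of the comparison involved: a service that still has a container but is no longer defined in compose.yaml. The helper name `list_orphans` is hypothetical, and the commented usage assumes the Compose project is named `nanodrop`, which this PR does not confirm.

```shell
# list_orphans DEFINED RUNNING
# Both arguments are newline-separated service names. Prints every name in
# RUNNING that does not appear in DEFINED (i.e. the would-be orphans).
list_orphans() {
  defined="$1"
  running="$2"
  printf '%s\n' "$running" | while IFS= read -r svc; do
    [ -n "$svc" ] || continue
    # -x: match the whole line, -F: fixed string, -q: quiet
    printf '%s\n' "$defined" | grep -qxF -- "$svc" || printf '%s\n' "$svc"
  done
}

# Possible usage on the server (project name "nanodrop" is an assumption):
#   defined=$(docker compose -f compose.yaml config --services)
#   running=$(docker ps -a \
#     --filter "label=com.docker.compose.project=nanodrop" \
#     --format '{{.Label "com.docker.compose.service"}}')
#   list_orphans "$defined" "$running"
```

With no orphans the helper prints nothing, which mirrors the no-op case described above.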

Security

None. The diff is a 1-line removal plus a 1-line append inside the ssh heredoc of the Deploy on server with Docker step. No secret reads, var-interpolation patterns, SSH key handling, or auth gates are touched. Net security-surface delta: zero.

Test plan

  • [x] YAML syntax valid (workflow opens; deploy job structure unchanged)
  • [x] Pre-emptive docker compose ... down line removed from the deploy heredoc
  • [x] Surviving docker compose ... up -d --build line gained --remove-orphans
  • [x] No other file modified
  • [x] npm run build clean
  • [x] npm test → 126/126 passed across 18 files
  • [x] refactor: pass = noop (CI-config edit, nothing to clean up within scope)
  • [ ] Needs manual deploy verification by user post-merge: next push to main should show no Stopping nanodrop ... line before the build step, and a deliberately-failing build should leave the running container in place
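The second half of the manual verification could be scripted along these lines. This is a sketch under assumptions: the helper name is hypothetical, the build is expected to have been broken on purpose beforehand, and `docker compose ps -q` is used to capture the project's container IDs.

```shell
# check_survives_failed_build
# Returns 0 if the running container IDs are unchanged after a deploy
# attempt whose build fails; returns 2 if the build unexpectedly succeeds.
check_survives_failed_build() {
  before=$(docker compose -f compose.yaml ps -q)

  if docker compose -f compose.yaml up -d --build --remove-orphans; then
    echo "build succeeded; break the build on purpose before running this" >&2
    return 2
  fi

  after=$(docker compose -f compose.yaml ps -q)

  # Pass only if something was running before and the same IDs remain.
  [ -n "$before" ] && [ "$before" = "$after" ]
}
```

An unchanged ID list means compose never stopped or replaced the container, which is the behaviour this PR relies on.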
brendan added 1 commit 2026-05-11 07:49:58 +00:00
Matches the cross-project pattern already applied to authd PR #12, buchinese
PR #8, inventory PR #19, and movement PR #16. The pre-emptive `docker compose
down` destroyed the running container BEFORE `docker compose up -d --build`
had a chance to verify the new image builds — a single npm ci lockfile
mismatch (buchinese PR #6, 2026-05-10) was enough to put prod to HTTP 502 for
the entire human-intervention turnaround. `docker compose up -d --build`
builds first; only on successful build does compose recreate the container.
On build failure compose exits non-zero with the previous container still
serving traffic. `--remove-orphans` is a drop-in cleanup for renamed/removed
services.

No test added (no CI-yaml tests exist in this project; matches 4 precedent
PRs). Refactor pass expected to be a noop.
brendan merged commit 398c008c32 into main 2026-05-11 07:50:11 +00:00