diff --git a/docs/v1-redeploy-changelog.md b/docs/v1-redeploy-changelog.md
index 1b03912..a86b2ab 100644
--- a/docs/v1-redeploy-changelog.md
+++ b/docs/v1-redeploy-changelog.md
@@ -1087,73 +1087,6 @@
 its own `machines:` 8-11, so 8/9/10 are internally consistent. The live 0-3 numbering was a deploy-time
 artifact; the bundle correctly uses 8-11.
 
-### 2026-06-30 -- Phase-04 executed (network carve + internal-cert SAN gate); DOCFIX-059/060
-
-PHASE-04 (network carve) -- PASS:
-- Step 4.1 provider-ext + provider-ext-fip created idempotently (phase-04-network-create.sh); EXIT gate
-  PASS via phase-04-network-verify.sh (external/flat/physnet1/not-shared; subnet 10.12.4.0/22, gw
-  10.12.4.1, no-dhcp, FIP pool 10.12.5.0-10.12.7.254). As-built this deploy: provider-ext = a4e1a7fa-...,
-  provider-ext-fip = f66e5bc5-... (runbook as-built refreshed).
-
-DOCFIX-059 -- internal-cert SAN gate + the VANTAGE correction (the substantive finding):
-- The prompt's phase-04 item "confirm internal certs carry 10.12.12.5x SANs" was NOT implemented by any
-  committed artifact. Added scripts/phase-04-internal-cert-san-verify.sh (+ tests/phase-04-internal-cert-san/)
-  and runbook Step 4.2.
-- VANTAGE (load-bearing): metal-internal (10.12.12.0/22, VID 103) is an ISOLATED service plane (D-052); the
-  jumphost is NOT on it. An s_client from the jumphost to 10.12.12.x TIMES OUT / conn-errors, and an
-  un-hardened check mislabels that as "no IP-SAN" -- a FALSE negative (observed live this session). The gate
-  must probe FROM a unit ON the plane (keystone/leader) via juju exec. Confirmed live: all 11 internal https
-  endpoints carry their own 10.12.12.5x IP-SAN (keystone/glance/nova/neutron/cinderv3/placement/barbican/
-  octavia/magnum/swift/s3). Internal TLS is correct; the earlier failures were purely vantage.
-- HARDENING in the gate: (a) every probe is timeout-bounded (an unbounded s_client hangs ~127s on a filtered
-  VIP -- proven at 6.02s vs 127s), classified TIMEOUT/CONN-ERR distinctly from a real NO-SAN; (b) non-https
-  endpoints (the plain-HTTP glance-simplestreams image-stream) are SKIPPED (no cert). Test covers PASS /
-  SKIP-http / NO-SAN / NO-CERT (fake openstack/juju + real jq; run on the jumphost).
-
-DOCFIX-060 -- phase-04-network-carve.md drift (the script was right; the md lagged):
-- Inline Step 4.1 used the hardcoded `maas admin subnet read 1` -- a post-D-052 landmine (subnet ids drift
-  across cutovers). Corrected to gateway-by-CIDR, matching phase-04-network-create.sh (DOCFIX-047).
-- IPAM reference carried the pre-D-052 single "Metal 10.12.8.0/22 = internal/admin VIPs" model. Corrected to
-  the D-052/053 split: metal-admin 10.12.8 (admin VIPs .8.5x + PXE), metal-internal 10.12.12 (VID 103,
-  internal VIPs .12.5x + all service east-west).
-- Added a CANONICAL EXECUTION note (D-056) pointing to the three phase-04 scripts; refreshed as-built IDs.
-
-PROCESS lessons (recorded; no code change):
-- PASTE SAFETY: a delivered block whose BEGIN/END label lines (which contain parentheses) were left inside
-  the fenced code region broke on paste -- bash rejected the parenthesis as an unexpected token. Label lines
-  carrying parens are NOT comments. RULE: put only valid bash inside a fenced block; labels go as # comments
-  or prose -- and run bash -n on EVERY delivered block first (that was the miss). The runbooks' own
-  bold-label convention (labels OUTSIDE the fence) is correct and unaffected.
-- NETWORK-PROBE TIMEOUT: any s_client/curl/nc in a runbook step must be timeout-bounded with an explicit
-  timeout branch -- an unbounded probe is not acceptable in a deterministic gate.
-
-PHASE-05 (octavia) -- IN PROGRESS at handoff: config gate clear (retrofit use-internal-endpoints=true,
-image-format=raw, amp-image-tag=octavia-amphora on both sides); octavia blocked, charm-octavia resources
-0/0/0; PRE gate PROCEED. Step 5.1 configure-resources running (--wait=20m; do NOT re-fire on wait-timeout).
-
-### 2026-06-30 -- Phase-05 executed (Octavia enablement, D-021) -- PASS; DOCFIX-061
-
-PHASE-05 (octavia) -- PASS (scripts ran clean; no script defects):
-- 5.1 configure-resources (op 35/task 36, --wait=20m) cleared octavia's blocked -> active; lb-mgmt-net /
-  lb-mgmt-subnetv6 / lb-mgmt-sec-grp created; o-hm0 UP with an fc00:: ULA (state=UNKNOWN is normal for an
-  OVS internal port). Benign in-progress noise confirmed harmless: `ovs-vsctl: no row o-hm0` (queried
-  before the action creates the port) and the systemd-networkd stop/socket warning.
-- 5.2 amphora pipeline (phase-05-amphora-pipeline.sh): config gate clear; base seeded via STAGE-AND-VERIFY
-  (sha256 070de108...); retrofit op 39/task 40 built amphora-haproxy-x86_64-ubuntu-22.04-20260701
-  (807e3f5b-...) ACTIVE, tag octavia-amphora, image-format raw. phase-05-octavia-verify.sh -> PASS.
-
-DOCFIX-061 -- phase-05 as-built reconciliation (runbook drift; no script change):
-- Retrofit's internal glance target corrected .8.53 -> 10.12.12.53: under D-052 the INTERNAL glance VIP is
-  on metal-internal (.12.53, confirmed live this session); the doc's ".8.53" predates the metal-admin/
-  metal-internal split and is now the ADMIN VIP.
-- Seed-method note corrected: this rebuild used STAGE-AND-VERIFY (the canonical Step 5.2 script), not the
-  06-16 web-download expedient. Object IDs / op numbers refreshed to 2026-06-30. noble is seeded in
-  phase-06 6.0-BOOT this rebuild (not pre-staged in phase-05). o-hm0 ULA not captured this run (regenerates).
-
-DISCIPLINE (operator-directed 2026-06-30): reconcile scripts + commands + this changelog at the SUCCESSFUL
-completion of EACH phase, before starting the next. Deliver the per-phase reconciliation as a repo-relative ZIP.
-
 ### Next-free numbers
-Design decision: D-063. Doc fix: DOCFIX-062. (DOCFIX-061 phase-05 as-built reconciliation recorded above;
-DOCFIX-059 internal-cert SAN gate, DOCFIX-060 phase-04 md drift; D-061 teardown, D-062 mysql; DOCFIX-057
-old-teardown deprecation, DOCFIX-058 phase-03 3.3 HTTP-upstream recorded earlier.)
+Design decision: D-063. Doc fix: DOCFIX-059. (D-061 teardown, D-062 mysql; DOCFIX-057 old-teardown
+deprecation, DOCFIX-058 phase-03 3.3 HTTP-upstream both recorded above.)
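
Reviewer note on the HARDENING and NETWORK-PROBE TIMEOUT entries in the hunk above: the classification they describe (TIMEOUT and CONN-ERR kept distinct from a genuine NO-SAN) can be sketched as a small, network-free helper. This is only an illustration, not the contents of scripts/phase-04-internal-cert-san-verify.sh. The function name `classify_probe`, its argument order, and the `DNS:glance.internal` sample are hypothetical. It assumes the caller wraps the probe in GNU `timeout` (which exits 124 when the limit is hit) and passes in subjectAltName text in the usual OpenSSL form (`IP Address:<addr>` entries separated by commas). NO-CERT and SKIP-http are left out of the sketch.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the repo): classify one probe result.
#   $1 = exit status of the timeout-wrapped probe (124 = GNU timeout expired)
#   $2 = subjectAltName text of the served cert (may be empty)
#   $3 = the VIP expected to appear as an IP-SAN
classify_probe() {
  local rc="$1" san_text="$2" want_ip="$3"

  # A timed-out or failed connection says nothing about the cert,
  # so it must never be reported as NO-SAN.
  if [ "$rc" -eq 124 ]; then echo "TIMEOUT"; return; fi
  if [ "$rc" -ne 0 ]; then echo "CONN-ERR"; return; fi

  # Split the SAN list on commas, trim whitespace, and require an exact
  # whole-entry match so 10.12.12.5 does not match 10.12.12.53.
  if printf '%s\n' "$san_text" \
      | tr ',' '\n' \
      | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
      | grep -qxF "IP Address:${want_ip}"; then
    echo "PASS"
  else
    echo "NO-SAN"
  fi
}

# Usage with canned inputs (no network needed):
classify_probe 124 "" 10.12.12.53                                               # prints TIMEOUT
classify_probe 0 "DNS:glance.internal, IP Address:10.12.12.53" 10.12.12.53      # prints PASS
```

Keeping the exit-status branch ahead of any certificate parsing is the design point: a probe launched from a host that is not on the plane ends in TIMEOUT or CONN-ERR and cannot be mislabeled as a missing IP-SAN.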