
v1 Redeploy -- Running Change Log

Purpose: Living log of design decisions, doc fixes, and runbook edits discovered DURING the v1 redeploy rehearsal that must be folded into docs/design-decisions.md and the phase runbooks UPON COMPLETION. This is the staging list for the completion consolidation -- nothing here is applied to the runbooks or design-decisions yet.

Status: OPEN -- accumulating. Append-only. ASCII + LF.

Session opened: 2026-06-26 (redeploy from clean teardown; D-052/D-053 plane set).

Next free numbers at session open: design decision D-054; doc fix DOCFIX-039. (Verified by grep of design-decisions.md: max D-053, max DOCFIX-038.)


Verified-state checkpoint (measured this session -- authoritative as-built)

scripts/pre-flight-checks.sh @ commit 40e3f9e -- ALL PASS, exit 0, 2026-06-26:

Six MAAS planes resolved BY CIDR (subnet IDs are post-D-052-cutover, NOT the old map):

provider-public  10.12.4.0/22   id=1   vid=0    gw=10.12.4.1   dns=[10.12.4.1]
metal-admin      10.12.8.0/22   id=2   vid=0    gw=10.12.8.1   dns=[10.12.8.1]
metal-internal   10.12.12.0/22  id=10  vid=103  gw=none        dns=[10.12.8.1]  (bridged br-internal)
data-tenant      10.12.16.0/22  id=6   vid=0    gw=none        dns=[10.12.8.1]
storage          10.12.32.0/22  id=7   vid=0    gw=none        dns=[10.12.8.1]
replication      10.12.36.0/22  id=8   vid=0    gw=none        dns=[10.12.8.1]

Per-host data/storage NIC links by CIDR, octets .40-.43, all four hosts: br-internal -> .12, enp8s0 -> .16, enp9s0 -> .32, enp10s0 -> .36.
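A minimal sketch of the "resolve BY CIDR" lookup behind the checkpoint above, not the lib-net.sh implementation. It assumes the subnet list has already been flattened to a three-column TSV (cidr, subnet id, VID); the function name subnet_field_by_cidr and that TSV shape are hypothetical, and how the TSV is produced from MAAS output is left out on purpose.

```bash
#!/usr/bin/env bash
# Sketch only: look up a plane's subnet id or VLAN tag by CIDR, never by a
# pinned subnet id (ids drift across cutovers).
# ASSUMPTION: stdin is a hypothetical TSV of <cidr> <subnet-id> <vid> rows.
subnet_field_by_cidr() {
  local cidr=$1 field=$2 col
  case "$field" in
    id)  col=2 ;;
    vid) col=3 ;;
    *)   printf 'unknown field: %s\n' "$field" >&2; return 2 ;;
  esac
  # Exactly one row must match; zero or duplicate matches count as failure.
  awk -F'\t' -v c="$cidr" -v k="$col" '
    $1 == c { val = $k; n++ }
    END {
      if (n == 1) { print val; exit 0 }
      exit 1
    }
  '
}
```

Fed the as-built rows above, `subnet_field_by_cidr 10.12.12.0/22 id` would print 10 and the same call with `vid` would print 103; a CIDR that is absent returns non-zero, so a gate can fail instead of falling back to a stale id.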

Nodes openstack0-3 (4na83t / qdbqd6 / h8frng / tmsafc): all Ready, power off. OSD secondary disks (osd-blank-check.sh): all four 512 GiB / 200 KiB blank, RC=0. Bundle VIPs: 11 triple-column VIPs, aligned, .50-.60 band, OK=11 bad=0. octavia-pki overlay: present, 5 lb-mgmt-* keys, ASCII clean.


Pending design-decisions.md appends

D-054 -- Reusable tested scripts in scripts/; runbooks reference them (ADOPTED in practice; formal append pending)

What: Repeated discovery/verify logic lives in scripts/, authored and tested in a sandbox against synthetic fixtures, committed to the repo, and referenced by the runbooks. Runbooks document expected output and remain the gate authority; the scripts are the executable truth. All pinned network values live once in scripts/lib-net.sh (single source of truth), resolved BY CIDR (subnet IDs drift across cutovers).

Delivery workflow: author + test in sandbox -> publish file + sha256 -> commit from Windows -> jumphost git pull -> sha256sum match -> run via bash scripts/X.sh.
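The jumphost side of that workflow (the sha256sum match before running) could look roughly like the sketch below. verify_sha256 is a hypothetical helper name, not one of the committed scripts; it relies only on coreutils sha256sum.

```bash
#!/usr/bin/env bash
# Sketch only: refuse to run a pulled script unless its digest matches the
# published one. verify_sha256 is a hypothetical helper, not a repo script.
# Usage: verify_sha256 <file> <expected-hex-digest>
verify_sha256() {
  local file=$1 expected=$2 actual
  if [[ ! -r "$file" ]]; then
    printf 'FAIL  %s: not readable\n' "$file" >&2
    return 1
  fi
  # sha256sum prints "<digest>  <name>"; keep the digest column only.
  actual=$(sha256sum -- "$file" | awk '{ print $1 }')
  # Compare case-insensitively; published digests may be upper-case.
  if [[ "${actual,,}" == "${expected,,}" ]]; then
    printf 'PASS  %s sha256 matches\n' "$file"
  else
    printf 'FAIL  %s sha256 mismatch (got %s)\n' "$file" "$actual" >&2
    return 1
  fi
}
```

A possible use is `verify_sha256 scripts/X.sh "$PUBLISHED" && bash scripts/X.sh`, where PUBLISHED holds the digest published with the file.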

Convention: ASCII + LF (.gitattributes *.sh eol=lf); set -euo pipefail + shopt -s inherit_errexit + IFS=$'\n\t'; fail/warn/pass/note helpers with exit 0 (pass) / 1 (fatal) / 2 (warning) for gate scripts; read-only discovery kept separate from gated mutation; lib-net.sh is sourced, never executed (direct-run guard).
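A gate-script skeleton following that convention might look like the sketch below. The strict-mode lines and the helper names fail/warn/pass/note come from the convention itself; the FAILS/WARNS counters and the gate_rc function are assumptions made for illustration, since the real helpers' internals are not recorded here. inherit_errexit needs bash 4.4 or newer.

```bash
#!/usr/bin/env bash
# Sketch only: strict mode plus fail/warn/pass/note helpers for a gate script.
# ASSUMPTION: the counters and gate_rc are hypothetical; the committed
# helpers may track results differently.
set -euo pipefail
shopt -s inherit_errexit
IFS=$'\n\t'

FAILS=0
WARNS=0

pass() { printf 'PASS  %s\n' "$1"; }
note() { printf 'NOTE  %s\n' "$1"; }
warn() { printf 'WARN  %s\n' "$1"; WARNS=$((WARNS + 1)); }
fail() { printf 'FAIL  %s\n' "$1"; FAILS=$((FAILS + 1)); }

# Map accumulated results onto the gate exit codes:
# 0 = pass, 1 = fatal, 2 = warning.
gate_rc() {
  if (( FAILS > 0 )); then
    echo 1
  elif (( WARNS > 0 )); then
    echo 2
  else
    echo 0
  fi
}

# A sourced-only library would additionally carry a direct-run guard such as:
#   [[ "${BASH_SOURCE[0]}" != "$0" ]] || { echo "source me" >&2; exit 1; }
```

A gate script built this way would end with `exit "$(gate_rc)"`, so any fatal finding wins over warnings.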

Why: Eliminates the paste-corruption failure class (see Findings below) and turns repeated discovery -- polled every redeploy cycle -- into a one-liner with a byte-identity guarantee (sha256) instead of a fragile copy-paste block.

Scripts added this session: lib-net.sh (new), pre-flight-checks.sh (implemented the placeholder), juju-spaces-check.sh (new), osd-blank-check.sh (new). All tested end-to-end against mock maas/juju + fixtures (positive + 7 negative fault injections for pre-flight; 4 scenarios for spaces). Committed at 40e3f9e.


Pending DOCFIX entries

DOCFIX-039 -- phase-01-bundle-deploy.md gate reconciliation (PROPOSED)

The phase-01 pre-deploy GATES encode the OLD plane layout (pre-D-052 CIDR->role map); the deploy COMMANDS are fine. Superseded by scripts/pre-flight-checks.sh. Five stale items:

  1. Constants: hardcoded subnet ids 1 2 6 7 8 9 + old CIDR->role map -> resolve BY CIDR (now in lib-net.sh; metal-internal is id=10 post-cutover, not id=6).
  2. CHECK 1 / Step 1.3 deploy guard: provider-column-only VIP check -> triple-column validator (provider/admin/internal, aligned, .50-.60).
  3. CHECK 2: enp8s0 + 10.12.12.0/22 (old "data") -> links BY CIDR; enp8s0 now carries 10.12.16.0/22 (data-tenant), metal-internal is on br-internal.
  4. CHECK 3: hardcoded ids/DNS -> subnets BY CIDR.
  5. EXIT GATE binding plane map (old: ceph->.16 / octavia->.12.1 / nova->.12.4x / vault->.8) -> corrected per D-052: ceph public/osd/mon->storage(.32); octavia overlay->data-tenant (.16); nova-compute neutron-plugin->data-tenant(.16); vault default->metal-admin(.8) + cluster->metal-internal(.12).
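To make item 2 concrete, a triple-column VIP check could be sketched as follows. This is not the validator in pre-flight-checks.sh. It assumes "aligned" means the same final octet in all three columns, and that each VIP sits in the first /24 of its plane (10.12.4.x provider, 10.12.8.x admin, 10.12.12.x internal); both are assumptions, as is the function name vip_triple_ok.

```bash
#!/usr/bin/env bash
# Sketch only: validate one "provider admin internal" VIP triple.
# ASSUMPTIONS: aligned == identical last octet; prefixes are the first /24
# of each plane; vip_triple_ok is a hypothetical name.
vip_triple_ok() {
  local -a vips want=(10.12.4 10.12.8 10.12.12)
  local i octet first=""
  IFS=' ' read -r -a vips <<< "$1"
  (( ${#vips[@]} == 3 )) || return 1
  for i in 0 1 2; do
    # Column i must carry its plane's prefix.
    [[ "${vips[i]%.*}" == "${want[i]}" ]] || return 1
    octet=${vips[i]##*.}
    [[ "$octet" =~ ^[0-9]+$ ]] || return 1
    # Force base 10 so a leading zero is not read as octal.
    (( 10#$octet >= 50 && 10#$octet <= 60 )) || return 1
    if [[ -z "$first" ]]; then
      first=$octet
    fi
    [[ "$octet" == "$first" ]] || return 1
  done
  return 0
}
```

A provider-column-only check would accept a triple such as "10.12.4.50 10.12.8.51 10.12.12.50"; this sketch rejects it because the admin column is out of alignment.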

Action at completion: replace the inline CHECK blocks in phase-01 with bash scripts/pre-flight-checks.sh (document expected PASS output) and add a post-add-model bash scripts/juju-spaces-check.sh openstack as the per-model space gate (the old inline CHECK 5 ran juju spaces pre-model and failed "model not found"; spaces are per-model).


Pending runbook / file edits (apply at completion)

  1. runbooks/phase-01-bundle-deploy.md -- DOCFIX-039 (above): swap inline pre-flight blocks for bash scripts/pre-flight-checks.sh; add post-add-model bash scripts/juju-spaces-check.sh openstack; fix the 5 stale gate items; document expected output.
  2. scripts/validate.sh -- convert UTF-8 to ASCII when implementing the D-011 runner (phase-08). file reports "Unicode text, UTF-8 text" (em-dashes from the placeholder); violates the ASCII-only convention. Currently a placeholder, not yet run.
  3. Teardown runbook -- reference scripts/osd-blank-check.sh for the OSD-blank verification step (replaces the inline qemu-img loop).
  4. runbooks/ README / pre-flight references -- point at the new scripts where the old inline discovery blocks were described.

Findings / process learnings (this session)

  • Paste-corruption failure class. A hand-built base64 pre-flight block shipped two transcription defects: [:space:] (single bracket, must be [[:space:]]) on the grep count line, and ENV{ instead of END{ on the awk tally (so the summary silently never printed). Root cause: the base64 was hand-edited AFTER testing a clean version -- the bytes sent were never round-tripped through the sandbox. Mitigation is now standard practice (D-054): tested scripts committed to the repo, verified by sha256 on the jumphost.

  • Juju spaces are per-model. juju spaces / juju reload-spaces cannot run until after juju add-model; the old phase-01 CHECK 5 ran pre-model and failed with "model not found". Split into juju-spaces-check.sh, gated to run post-add-model.

  • Default-space globally poisons network-get (deploy root cause). The full D-052 binding deploy failed universally (network-get ... ERROR space "metal" not found, install hook dies on nearly every charm). Every static layer was correct -- bundle, model bindings, MAAS spaces/VLANs/per-NIC space tags all read metal-internal. The single stale value was controller model-defaults default-space = metal (a dead pre-D-052 name). An INVALID default-space poisons network-get for ALL endpoints regardless of their explicit binding. Fix: set juju model-defaults default-space=metal-admin (a live space) before add-model. A default-space-resolves-to-a-live-space gate is to be added to pre-flight-checks.sh.

  • Teardown --destroy-storage on virsh DELETES machine objects (does NOT release). The phase-00 teardown (juju destroy-model openstack --force --destroy-storage then per-host maas machine release) assumes release-to-Ready. On a virsh/KVM MAAS, --destroy-storage DECOMPOSES (deletes) the VM-backed machine objects. All four openstack hosts were removed from MAAS. Recoverable only because the libvirt domains and disks (incl the blank OSD vdb) survived. See D-055.
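The awk half of the paste-corruption finding above is easy to reproduce. In the sketch below, tally_ok and tally_bad are hypothetical stand-ins for the pre-flight tally, not the original block: with END the summary prints once after the last input line, while ENV is just an unset variable used as a pattern, so it is always false and the summary never prints, with no error. The grep helper shows the correct double-bracket class.

```bash
#!/usr/bin/env bash
# Sketch only: why ENV{ instead of END{ fails silently.
# tally_ok / tally_bad / count_ws_lines are hypothetical illustrations.

# Correct: END runs once after all input has been read.
tally_ok() {
  awk '/PASS/ { n++ } END { print "OK=" (n + 0) }'
}

# Defective: ENV is an ordinary (unset) variable used as a pattern.
# It evaluates false for every line, so the action never runs.
tally_bad() {
  awk '/PASS/ { n++ } ENV { print "OK=" (n + 0) }'
}

# Correct POSIX class: the name goes inside a bracket expression,
# hence [[:space:]] and not [:space:].
count_ws_lines() {
  grep -c '[[:space:]]'
}
```

Running both tallies over the same two PASS lines yields "OK=2" from tally_ok and empty output from tally_bad, which is the "summary silently never printed" symptom.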

Pending design-decisions.md appends (continued)

D-055 -- virsh teardown defect + host re-enrollment procedure (ADOPTED)

Defect: juju destroy-model --destroy-storage against virsh-power MAAS machines deletes (decomposes) the machine objects rather than releasing them to Ready. The phase-00 teardown must NOT pass --destroy-storage for virsh hosts; release to Ready without it.

Recovery (now a reusable procedure): the libvirt domains survive, so re-enroll via maas admin machines create per host with virsh power + the boot NIC MAC (NOT add-chassis -- it would re-grab juju/lxd/tailscale). machines create auto-commissions (New->Commissioning->Ready) by PXE off the 2_metal boot NIC. Then re-tag openstack, then reconstruct the host interface tree (Strategy-B carve, from the captured as-built), then verify (pre-flight), then redeploy with the default-space fix.

Artifacts: scripts/lib-hosts.sh, scripts/reenroll-hosts.sh, docs/maas-as-built-reference.md. Proven live on openstack0 (2026-06-26): created virsh, commissioned, Ready, all six NICs discovered, boot NIC on 2_metal.

DOCFIX-040 -- host identity must be hostname-keyed, not system_id-keyed

lib-net.sh lines 45-47 key the host maps (SYSIDS, SYSID_HOST, SYSID_OCTET) on the system_ids 4na83t/qdbqd6/h8frng/tmsafc -- which DIED on re-enrollment (new random ids). Any script keyed on them silently breaks. New scripts/lib-hosts.sh keys all host identity on hostname (stable) and resolves system_id at runtime (host_sysid). At completion: retire the SYSID-keyed maps from lib-net.sh (or repoint them to lib-hosts).
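A hostname-keyed layout along the lines described could be sketched like this. HOSTS and HOST_OCTET are the names used by lib-hosts.sh, but the contents shown (openstackN mapped to octet 4N) are inferred from the .40-.43 band, and host_ip and host_sysid_from_list are hypothetical helpers; the real host_sysid queries MAAS at runtime, which this sketch replaces with a "<hostname> <system_id>" list on stdin. Associative arrays need bash 4 or newer.

```bash
#!/usr/bin/env bash
# Sketch only: host identity keyed on hostname (stable), never on system_id
# (regenerated on re-enrollment).
# ASSUMPTION: octet values are inferred from the .40-.43 band.
HOSTS=(openstack0 openstack1 openstack2 openstack3)
declare -A HOST_OCTET=(
  [openstack0]=40 [openstack1]=41 [openstack2]=42 [openstack3]=43
)

# host_ip <hostname> <plane-prefix>   e.g. host_ip openstack2 10.12.8
host_ip() {
  local host=$1 prefix=$2
  if [[ -z "${HOST_OCTET[$host]:-}" ]]; then
    printf 'unknown host: %s\n' "$host" >&2
    return 1
  fi
  printf '%s.%s\n' "$prefix" "${HOST_OCTET[$host]}"
}

# Hypothetical stand-in for host_sysid: reads "<hostname> <system_id>" lines
# from stdin and prints the current id for one hostname, or fails.
host_sysid_from_list() {
  awk -v h="$1" '$1 == h { print $2; found = 1 } END { exit !found }'
}
```

Because the id is looked up each run, a re-enrolled host with a new random system_id resolves correctly, and an unknown hostname fails loudly instead of silently skipping.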


Security note (action required)

The libvirt SSH password (logxen@10.12.64.1) was printed in plaintext on 2026-06-26 by maas admin machine power-parameters during virsh power-template discovery. Treat as exposed: rotate the libvirt SSH credential after the rebuild and scrub terminal scrollback. Runbook rule added: never use machine power-parameters for templating; read power_type and reconstruct the address pattern instead. reenroll-hosts.sh reads the password interactively (never a CLI arg, never logged, never in the repo).


Scripts / docs added (this batch)

  • scripts/lib-hosts.sh -- hostname-keyed host identity + virsh power constants (no secret).
  • scripts/reenroll-hosts.sh -- gated/idempotent re-enrollment (auto-commission, poll Ready, boot-NIC-on-2_metal verify; --check read-only mode). Tested: bash -n, shellcheck clean, mock-maas behavior test of --check (discover-by-hostname, NOT-ENROLLED detection, exit 0).
  • docs/maas-as-built-reference.md -- captured MAAS substrate + per-host NIC inventory + interface-carve target + virsh template, for DC-DC replay.
  • Pending next artifact: the Strategy-B interface-carve script (built once all four are Ready; bridge_type pulled verbatim from captured release JSON) -> then consolidate into runbooks/phase-00b-host-reenrollment.md.

DOCFIX-041 -- as-built reference: br-ex is charm-built, not a MAAS bridge

Correction to docs/maas-as-built-reference.md (first committed this session). The bundle's ovn-chassis bridge-interface-mappings maps br-ex:<provider-MAC> for all four hosts -> br-ex is built by the ovn-chassis charm at deploy (OVS), enslaving the provider NIC by MAC; it is NOT a MAAS interface. The MAAS carve therefore:

  • provider plane = raw enp1s0 + static 10.12.4.N (MAAS leaves it raw; the charm enslaves it into br-ex at deploy). MAAS does NOT create br-ex.
  • storage/replication = raw enp9s0/enp10s0 + statics; Juju auto-bridges them (br-enp9s0/br-enp10s0, Linux) at deploy.
  • the ONLY MAAS-built bridges are the metal-internal stack: enp7s0 -> br-metal -> br-metal.103 (VID 103) -> br-internal.

bridge_type: br-internal = standard (confirmed, D-052 command). br-metal = standard (RECOMMENDED, reasoned-not-measured -- original bring-up predates the repo and the capture did not preserve bridge_type; pending confirm before carve). The deployed-host ip-level read that showed br-metal/br-internal "OVS" was taken during the FAILED deploy and is reclassified UNRELIABLE.

Carve script added + MAAS interface CLI confirmations

  • scripts/carve-host-interfaces.sh <hostname> [--apply] -- Strategy-B per-host interface carve. Default DRY-RUN (resolves every id live, prints each mutation it WOULD run, changes nothing); --apply executes. Idempotent (skips existing bridge/vlan/link), resolves system_id by hostname / interface id by name / subnet id + VLAN object id by CIDR, asserts metal-internal is VID 103, requires Ready. Builds: enp1s0 raw+static (provider); enp7s0 -> br-metal(std) -> br-metal.103(VID 103) -> br-internal(std); enp8/9/10 raw+static (data/storage/repl); enp11s0 idle. Does NOT create br-ex (charm-built). Tested: bash -n, shellcheck clean, mock-MAAS dry-run (full id resolution + command preview), input guards.

  • MAAS 3.7 interface CLI confirmed (canonical.com/maas/docs/3.7 reference): create-bridge takes bridge_type=standard|ovs parent=<ifid> vlan=<vlan-obj-id>; create-vlan takes vlan=<VLAN-OBJECT-ID> parent=<ifid> (NOT the VID tag -- resolve the object id via the metal-internal subnet); link-subnet mode=STATIC subnet=<id> ip_address=<ip>; a NIC is moved to a plane's fabric via interface update <sid> <ifid> vlan=<vlan-obj-id> before link-subnet (re-enrolled raw NICs sit on transient auto-fabrics).

  • FINDING (teardown runbook bug): runbooks/phase-00-teardown-maas-reset.md "Phase 3" link-subnet block uses PRE-D-052 CIDRs (enp8s0=10.12.12.0/22 enp9s0=10.12.16.0/22 enp10s0=10.12.20.0/22) and dead system_ids -- it would link NICs to the WRONG subnets (10.12.12 is now metal-internal, 10.12.16 is now data-tenant, 10.12.20 no longer exists). Must be rewritten to current planes + hostname-keyed before that runbook is trusted. Note: the normal release-to-Ready path PRESERVES host interfaces, so that block only ran on a normal teardown; the full carve (this script) is needed only after a decompose, which is why the bridges were never scripted before.
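The default-DRY-RUN behaviour of the carve script can be illustrated with a small wrapper. This is a sketch under assumptions, not the code in carve-host-interfaces.sh: emit is the name the script uses, but the APPLY variable and the exact message format are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: a dry-run-by-default mutation wrapper.
# ASSUMPTION: APPLY and the message format are hypothetical; a --apply
# option parser would set APPLY=1.
APPLY=0

emit() {
  local out
  if (( APPLY )); then
    # Capture stdout and stderr so a failure shows the tool's own message
    # instead of being discarded.
    if ! out=$("$@" 2>&1); then
      printf 'FAILED: %s\n%s\n' "$*" "$out" >&2
      return 1
    fi
  else
    # Dry run: show the mutation that WOULD run and change nothing.
    printf 'DRY-RUN: %s\n' "$*"
  fi
}
```

With APPLY=0, `emit maas admin interface link-subnet ...` only prints the command line; with APPLY=1 it runs the command and, if it fails, prints the captured error text.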

Carve hardening: self-discovered metal IP blocks br-metal static (KI)

Root cause (cost several diagnostic rounds): after re-enrollment each host PXE-leases its own metal IP (10.12.8.4N) at commission. MAAS records this as a StaticIPAddress of alloc_type 6 (DISCOVERED) tied to the node via its boot NIC. This is a SEPARATE object from the network-discovery table (discoveries clear-by-mac-and-ip does NOT clear it) and from user allocations (ipaddresses read user-scope does NOT show it). It causes link-subnet ... ip_address=10.12.8.4N to fail with the misleading "IP address is already in use".

Authoritative read (the lesson): maas admin subnet ip-addresses <subnet_id> reports every in-use IP WITH its alloc_type and owning node -- this is the single correct "who holds this IP and why" query. Lead with it; do not probe ipaddresses/discovery/leases piecemeal.

Release: maas admin ipaddresses release ip=<ip> force=true discovered=true (BOTH flags required; force alone returns "does not exist" for a discovered address).

Script fix (carve-host-interfaces.sh): release_self_discovered() runs before every STATIC link -- releases an alloc_type-6 record for the target IP ONLY when its owning node == this host (node_summary.system_id), and REFUSES (fatal) if a different node discovered it (a real conflict). Plus emit now captures and prints the MAAS error on a failed mutation instead of discarding it to /dev/null (the discard hid the real message and prolonged diagnosis). Only the metal plane (dhcp_on=true) is affected; the no-DHCP planes never produced a self-lease. Verified: mock self-release path + foreign-node refuse gate.

NOTE (design consistency, not a blocker): host statics .40-.43 sit inside the metal-admin/provider/internal VIP+mgmt reserve band (.2-.100). A reserved range blocks AUTO assignment, not explicit STATIC, so it did not break the carve -- but host octets arguably belong outside the VIP band. Log for the reserve-layout review.

DC-DC script audit (post-carve hardening batch)

Reviewed all MAAS scripts against what this session actually hit, so the DC-DC build replays cleanly instead of re-deriving the metal-IP archaeology.

  • carve gate rewrite (the big one). release_self_discovered keyed on node_summary.system_id, which is EMPTY on a fresh discovered record -> it silently no-op'd and the metal static (.8.41/.42/.43) had to be released by hand on three hosts. Replaced with release_self_indexed: the target is this host's architecturally-indexed metal IP (10.12.8.4N, from HOST_OCTET), so a DISCOVERED observation on it is this host's own commissioning ghost. SAFETY: refuses if the record's system_id (when present) OR the discoveries-table MAC (when present) identifies a DIFFERENT host; releases otherwise. Removed the (unneeded) release call from carve_raw -- the no-DHCP planes never produce discovered records. Tested: 5 branches (foreign-sysid refuse, foreign-MAC refuse, indexed-basis release, MAC-basis release, no-record no-op).

  • missing step added: openstack tag. reenroll-hosts.sh now ensures the openstack tag exists and applies it to all four hosts after the Ready/boot-NIC gate (idempotent; --check-aware). Without it the bundle cannot place units (constraint tags=openstack). Was a manual step every rebuild.

  • DOCFIX-040 COMPLETE. pre-flight-checks.sh and osd-blank-check.sh both looped over the dead system_ids (4na83t...) via lib-net's SYSID maps -- broken for any rebuilt/DC-DC cluster. Migrated both to hostname-keyed (lib-hosts HOSTS / HOST_OCTET / host_sysid). Retired the SYSID/SYSID_HOST/SYSID_OCTET maps from lib-net.sh and added its sourced-library shellcheck directive. osd-blank verified via mock (iterates the four hostnames, RC=0).

  • validate.sh: em-dashes -> ASCII (the silent-UnicodeDecodeError class; ASCII-only rule for all scripts). Still a placeholder body otherwise.
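The safety rule from the carve gate rewrite above reduces to a small decision function. The sketch below covers only that decision, under assumptions: release_decision and its argument order are hypothetical, and the MAAS lookups that would supply the record's system_id and MAC are left out.

```bash
#!/usr/bin/env bash
# Sketch only: decide what to do with a DISCOVERED record on this host's
# indexed metal IP. release_decision and its arguments are hypothetical.
# Usage: release_decision <has_record 0|1> <record_sysid> <this_sysid> \
#                         <record_mac> <this_mac>
# Prints one of: noop | refuse | release
release_decision() {
  local has_record=$1 rec_sysid=$2 this_sysid=$3 rec_mac=$4 this_mac=$5
  # No discovered record on the target IP: nothing to release.
  if [[ "$has_record" != 1 ]]; then
    echo noop
    return 0
  fi
  # A system_id that is present and names another host is a real conflict.
  if [[ -n "$rec_sysid" && "$rec_sysid" != "$this_sysid" ]]; then
    echo refuse
    return 0
  fi
  # Same rule for a MAC that is present (compared case-insensitively).
  if [[ -n "$rec_mac" && "${rec_mac,,}" != "${this_mac,,}" ]]; then
    echo refuse
    return 0
  fi
  # Empty or matching identifiers: treat as this host's own ghost.
  echo release
}
```

Only a "release" verdict would proceed to maas admin ipaddresses release ip=<ip> force=true discovered=true; "refuse" maps to the fatal path.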

REMAINING DC-DC scope (done MANUALLY this session; scripting them would make the bring-up fully hands-off -- NOT yet built):

  1. A multi-host carve-verify wrapper (assert all four hosts show the six expected static links on the right fabrics) -- currently an ad-hoc jq loop.
  2. A redeploy-prep wrapper: set model-defaults default-space=metal-admin, add-model, verify the MODEL's effective default-space (the value that poisoned the last deploy), reload-spaces, run juju-spaces-check. Currently manual steps R1-R3.
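The "verify the MODEL's effective default-space" step in item 2 could be gated with a check as small as the sketch below. default_space_is_live is a hypothetical name, and the list of live space names is passed on stdin so the sketch stays independent of any particular juju output format.

```bash
#!/usr/bin/env bash
# Sketch only: fail when default-space names a space that does not exist.
# default_space_is_live is hypothetical; stdin is one live space name per line.
# Usage: default_space_is_live <default-space-value>
default_space_is_live() {
  local want=$1
  # An empty value cannot be checked; treat it as not live.
  if [[ -z "$want" ]]; then
    return 1
  fi
  # -x whole-line match, -F fixed string: "metal" must not match
  # "metal-admin" or "metal-internal".
  grep -qxF -- "$want"
}
```

With live spaces metal-admin and metal-internal, the dead pre-D-052 value metal is rejected while metal-admin passes, which is the condition that poisoned the last deploy.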