diff --git a/docs/maas-as-built-reference.md b/docs/maas-as-built-reference.md
index 663c484..e645fb5 100644
--- a/docs/maas-as-built-reference.md
+++ b/docs/maas-as-built-reference.md
@@ -65,14 +65,24 @@
 the CIDRs were re-IP'd onto the same L2 segments, so e.g. `3_data` now carries
 data-tenant 10.12.16.0/22):
 
-| libvirt net | host NIC | plane | bridge/type |
-|---|---|---|---|
-| 1_provider | enp1s0 | provider-public | br-ex (OVS) |
-| 2_metal (boot) | enp7s0 | metal-admin + metal-internal(VID103) | br-metal (OVS), br-metal.103 (vlan), br-internal (std bridge) |
-| 3_data | enp8s0 | data-tenant | raw NIC |
-| 4_storage | enp9s0 | storage | raw NIC |
-| 5_replication | enp10s0 | replication | raw NIC |
-| 8_lbaas | enp11s0 | idle (undefined; ex-lbaas) | raw NIC, no link |
+| libvirt net | host NIC | plane | MAAS provides | built at deploy by |
+|---|---|---|---|---|
+| 1_provider | enp1s0 | provider-public | **raw NIC + static** | ovn-chassis builds br-ex (OVS), enslaves enp1s0 by MAC |
+| 2_metal (boot) | enp7s0 | metal-admin + metal-internal(VID103) | br-metal -> br-metal.103 -> br-internal (the whole stack) | (MAAS-built; not charm) |
+| 3_data | enp8s0 | data-tenant | raw NIC + static | (host-level geneve; stays raw) |
+| 4_storage | enp9s0 | storage | raw NIC + static | Juju auto-bridges br-enp9s0 (Linux) |
+| 5_replication | enp10s0 | replication | raw NIC + static | Juju auto-bridges br-enp10s0 (Linux) |
+| 8_lbaas | enp11s0 | idle (undefined; ex-lbaas) | raw NIC, no link | - |
+
+WHO BUILDS WHICH BRIDGE (critical, from bundle ovn-chassis bridge-interface-mappings
++ design-decisions D line "MAC-based bridge-interface-mappings"):
+- **br-ex** -- built by the **ovn-chassis charm** at deploy (OVS), enslaving the
+  provider NIC by MAC (bundle maps br-ex: for all four hosts). MAAS
+  must leave enp1s0 RAW (static only) or the MAC-enslave conflicts.
+- **br-enp9s0 / br-enp10s0** -- Juju auto-bridges the storage/replication raw NICs
+  at deploy (Linux bridges) for ceph container attach. MAAS gives raw + static.
+- **br-metal / br-metal.103 / br-internal** -- the ONLY bridges MAAS builds, because
+  the VLAN-103 metal-internal stack must exist pre-deploy.
 
 MAC inventory (host: provider / metal-boot / data / storage / replication / lbaas):
 - openstack0: 3d:fd:54 / 4f:1c:0b / 07:41:0a / d0:ed:e0 / 8f:ba:61 / d9:af:46
@@ -102,6 +112,8 @@
 
 Procedure: `scripts/reenroll-hosts.sh` (gated, idempotent, discover-assert-pin).
 Constants: `scripts/lib-hosts.sh`.
+Interface carve: `scripts/carve-host-interfaces.sh [--apply]`
+(dry-run default; idempotent; ids resolved live).
 
 **Post-commission, before deploy:** re-apply the MAAS tag `openstack` to all four
 (the bundle places units via constraint `tags=openstack`; the tag applies to no
@@ -112,33 +124,43 @@
 ## 4. Per-host interface carve target (Strategy-B reconstruction)
 
 After commission, MAAS has only the raw NICs. Rebuild this tree (octet N =
-.40/.41/.42/.43 by host index). Build order is forced by parentage.
+.40/.41/.42/.43 by host index). Build order is forced by parentage. NOTE: MAAS
+does NOT create br-ex (charm-built at deploy) -- provider is a raw NIC + static.
 
 | MAAS interface | type | parent | VLAN | space | static |
 |---|---|---|---|---|---|
-| enp1s0 | physical | - | untagged | provider-public | (carries br-ex) |
-| br-ex | bridge **OVS** | enp1s0 | untagged | provider-public | 10.12.4.N |
+| enp1s0 | physical (raw) | - | untagged | provider-public | 10.12.4.N |
 | enp7s0 | physical | - | untagged | metal-admin | (carries br-metal) |
-| br-metal | bridge **OVS** | enp7s0 | untagged | metal-admin | 10.12.8.N |
+| br-metal | bridge **standard** | enp7s0 | untagged | metal-admin | 10.12.8.N |
 | br-metal.103 | vlan | br-metal | 103 | metal-internal | (carries br-internal) |
 | br-internal | bridge **standard** | br-metal.103 | 103 | metal-internal | 10.12.12.N |
-| enp8s0 | physical | - | untagged | data-tenant | 10.12.16.N |
-| enp9s0 | physical | - | untagged | storage | 10.12.32.N |
-| enp10s0 | physical | - | untagged | replication | 10.12.36.N |
-| enp11s0 | physical | - | untagged | undefined (idle) | (none) |
+| enp8s0 | physical (raw) | - | untagged | data-tenant | 10.12.16.N |
+| enp9s0 | physical (raw) | - | untagged | storage | 10.12.32.N |
+| enp10s0 | physical (raw) | - | untagged | replication | 10.12.36.N |
+| enp11s0 | physical (raw) | - | untagged | undefined (idle) | (none) |
 
-Build order: physicals (auto-discovered) -> br-ex / br-metal (OVS) ->
-br-metal.103 (vlan) -> br-internal (std bridge) -> statics on
-br-ex/br-metal/br-internal/enp8/enp9/enp10. Link by CIDR (re-homes the NIC onto
-the correct fabric/space).
+Build order: physicals (auto-discovered) -> br-metal (bridge) -> br-metal.103
+(vlan) -> br-internal (std bridge) -> statics on enp1s0 / br-metal / br-internal /
+enp8 / enp9 / enp10. Link by CIDR (re-homes the NIC onto the correct fabric/space).
 
-CAVEAT: confirm each `bridge_type` (OVS vs standard) verbatim from a captured
-machine-release JSON before carving -- it changes how MAAS renders netplan and
-is the one value not to take from memory.
+br-internal bridge_type = **standard** (confirmed: yesterday's D-052 create-bridge
+command used bridge_type=standard).
 
-Deployed-host bridge facts (for reference; created by Juju/LXD at deploy, NOT by
-MAAS): storage/replication appear as Linux bridges br-enp9s0/br-enp10s0;
-br-ex/br-metal/br-internal are OVS; data-tenant runs on raw enp8s0.
+br-metal bridge_type = **standard** (RECOMMENDED, reasoned-not-measured: the
+original bring-up predates the repo and the captured release JSON did not preserve
+bridge_type. Reasoning: it is host/container networking (not OVN dataplane); it
+carries a netplan-style VLAN child (br-metal.103), which is the conventional
+standard-bridge construct; br-internal on top of it is standard; and standard is
+the MAAS create-bridge default. Reversible. CONFIRM before carve.)
+
+Deployed-host bridge facts (for reference): at deploy, ovn-chassis builds br-ex
+(OVS) on the provider NIC, and Juju builds br-enp9s0/br-enp10s0 (Linux) on the
+storage/replication NICs; data-tenant runs on raw enp8s0. The metal stack
+(br-metal/br-metal.103/br-internal) is MAAS-built (standard). NOTE: an `ip`-level
+read of the hosts taken during the 2026-06-26 FAILED deploy showed br-metal/
+br-internal absent from the kernel bridge list -- treated as UNRELIABLE (partial/
+failed bring-up), not as evidence of OVS; the authoritative type is standard per
+the D-052 create-bridge command.
 
 ---
 
diff --git a/docs/v1-redeploy-changelog.md b/docs/v1-redeploy-changelog.md
index bbc1021..f741f75 100644
--- a/docs/v1-redeploy-changelog.md
+++ b/docs/v1-redeploy-changelog.md
@@ -191,3 +191,52 @@
 - Pending next artifact: the Strategy-B interface-carve script (built once all four
   are Ready; bridge_type pulled verbatim from captured release JSON) -> then
   consolidate into `runbooks/phase-00b-host-reenrollment.md`.
+
+### DOCFIX-041 -- as-built reference: br-ex is charm-built, not a MAAS bridge
+
+Correction to `docs/maas-as-built-reference.md` (first committed this session). The
+bundle's ovn-chassis `bridge-interface-mappings` maps `br-ex:` for all
+four hosts -> **br-ex is built by the ovn-chassis charm at deploy (OVS), enslaving the
+provider NIC by MAC; it is NOT a MAAS interface.** The MAAS carve therefore:
+- provider plane = **raw enp1s0 + static 10.12.4.N** (MAAS leaves it raw; the charm
+  enslaves it into br-ex at deploy). MAAS does NOT create br-ex.
+- storage/replication = raw enp9s0/enp10s0 + statics; Juju auto-bridges them
+  (br-enp9s0/br-enp10s0, Linux) at deploy.
+- the ONLY MAAS-built bridges are the metal-internal stack:
+  enp7s0 -> br-metal -> br-metal.103 (VID 103) -> br-internal.
+
+bridge_type: br-internal = standard (confirmed, D-052 command). br-metal = standard
+(RECOMMENDED, reasoned-not-measured -- original bring-up predates the repo and the
+capture did not preserve bridge_type; pending confirm before carve). The
+deployed-host `ip`-level read that showed br-metal/br-internal "OVS" was taken during
+the FAILED deploy and is reclassified UNRELIABLE.
+
+### Carve script added + MAAS interface CLI confirmations
+
+- `scripts/carve-host-interfaces.sh [--apply]` -- Strategy-B per-host
+  interface carve. Default DRY-RUN (resolves every id live, prints each mutation it
+  WOULD run, changes nothing); --apply executes. Idempotent (skips existing
+  bridge/vlan/link), resolves system_id by hostname / interface id by name / subnet
+  id + VLAN object id by CIDR, asserts metal-internal is VID 103, requires Ready.
+  Builds: enp1s0 raw+static (provider); enp7s0 -> br-metal(std) -> br-metal.103(VID
+  103) -> br-internal(std); enp8/9/10 raw+static (data/storage/repl); enp11s0 idle.
+  Does NOT create br-ex (charm-built). Tested: bash -n, shellcheck clean, mock-MAAS
+  dry-run (full id resolution + command preview), input guards.
+
+- MAAS 3.7 interface CLI confirmed (canonical.com/maas/docs/3.7 reference):
+  create-bridge takes `bridge_type=standard|ovs parent= vlan=`;
+  create-vlan takes `vlan= parent=` (NOT the VID tag -- resolve
+  the object id via the metal-internal subnet); link-subnet `mode=STATIC
+  subnet= ip_address=`; a NIC is moved to a plane's fabric via `interface
+  update vlan=` before link-subnet (re-enrolled raw NICs
+  sit on transient auto-fabrics).
+
+- FINDING (teardown runbook bug): `runbooks/phase-00-teardown-maas-reset.md`
+  "Phase 3" link-subnet block uses PRE-D-052 CIDRs
+  (`enp8s0=10.12.12.0/22 enp9s0=10.12.16.0/22 enp10s0=10.12.20.0/22`) and dead
+  system_ids -- it would link NICs to the WRONG subnets (10.12.12 is now
+  metal-internal, 10.12.16 is now data-tenant, 10.12.20 no longer exists). Must be
+  rewritten to current planes + hostname-keyed before that runbook is trusted. Note:
+  the normal release-to-Ready path PRESERVES host interfaces, so that block only ran
+  on a normal teardown; the full carve (this script) is needed only after a
+  decompose, which is why the bridges were never scripted before.
diff --git a/scripts/carve-host-interfaces.sh b/scripts/carve-host-interfaces.sh
new file mode 100644
index 0000000..f001cd9
--- /dev/null
+++ b/scripts/carve-host-interfaces.sh
@@ -0,0 +1,180 @@
+#!/usr/bin/env bash
+# scripts/carve-host-interfaces.sh <hostname> [--apply]
+#
+# Strategy-B interface carve for ONE freshly-commissioned host. Reconstructs the
+# host network tree that was lost when the machine was decomposed. Default is
+# DRY-RUN (resolves every id live and prints each mutation it WOULD run, changes
+# nothing). Pass --apply to execute.
+#
+# Target tree (octet N = .40-.43 by host index; see lib-hosts.sh HOST_OCTET):
+#   enp1s0   raw + STATIC 10.12.4.N    (provider-public; ovn-chassis builds br-ex
+#                                       OVS at deploy and enslaves enp1s0 by MAC --
+#                                       MAAS must leave enp1s0 RAW)
+#   enp7s0 --> br-metal (standard bridge) + STATIC 10.12.8.N (metal-admin)
+#                br-metal.103 (vlan, VID 103)
+#                  --> br-internal (standard bridge) + STATIC 10.12.12.N (metal-internal)
+#   enp8s0   raw + STATIC 10.12.16.N   (data-tenant)
+#   enp9s0   raw + STATIC 10.12.32.N   (storage; Juju auto-bridges at deploy)
+#   enp10s0  raw + STATIC 10.12.36.N   (replication; Juju auto-bridges at deploy)
+#   enp11s0  idle                      (ex-lbaas; no link)
+#
+# All ids resolved live: system_id by hostname, interface id by name, subnet id and
+# VLAN object id by CIDR. Idempotent: skips a bridge/vlan/link that already exists.
+# Requires the host to be Ready (link-subnet/update are rejected on Deployed).
+#
+# Exit: 0 ok | 1 fatal (bad usage, failed precondition, or failed mutation)
+
+set -euo pipefail
+shopt -s inherit_errexit 2>/dev/null || true
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+# shellcheck source=scripts/lib-net.sh
+. "$SCRIPT_DIR/lib-net.sh"
+# shellcheck source=scripts/lib-hosts.sh
+. "$SCRIPT_DIR/lib-hosts.sh"
+
+MAAS_PROFILE="${MAAS_PROFILE:-admin}"
+FATAL=0
+fail() { echo "FAIL: $*" >&2; FATAL=$((FATAL+1)); }
+note() { echo "NOTE: $*"; }
+hdr() { echo; echo "=== $* ==="; }
+
+usage() { echo "usage: $0 <hostname> [--apply]" >&2; exit 1; }
+
+HN="${1:-}"; [ -n "$HN" ] || usage
+MODE="dryrun"; [ "${2:-}" = "--apply" ] && MODE="apply"
+
+# validate hostname is one of ours
+ok=0; for h in "${HOSTS[@]}"; do [ "$h" = "$HN" ] && ok=1; done
+[ "$ok" = 1 ] || { echo "ERROR: '$HN' is not one of: ${HOSTS[*]}" >&2; exit 1; }
+
+need_jq || exit 1
+OCTET="${HOST_OCTET[$HN]}"
+
+# ---- live resolvers (read-only; safe in both modes) -----------------------
+maas_q() { maas "$MAAS_PROFILE" "$@"; }
+
+SID="$(host_sysid "$HN" || true)"
+[ -n "$SID" ] || { fail "$HN is not enrolled in MAAS"; exit 1; }
+
+STATUS="$(maas_q machine read "$SID" 2>/dev/null | jq -r '.status_name // "?"')"
+[ "$STATUS" = "Ready" ] || { fail "$HN ($SID) is '$STATUS', not Ready -- interface edits are rejected unless Ready/Broken"; exit 1; }
+
+# subnet id + vlan object id, resolved BY CIDR (drift-proof)
+SUBNETS_JSON="$(maas_q subnets read)"
+subid_of() { printf '%s' "$SUBNETS_JSON" | jq -r --arg c "$1" '.[]|select(.cidr==$c)|.id' | head -1; }
+vlanid_of() { printf '%s' "$SUBNETS_JSON" | jq -r --arg c "$1" '.[]|select(.cidr==$c)|(.vlan.id // .vlan)' | head -1; }
+vlanvid_of(){ printf '%s' "$SUBNETS_JSON" | jq -r --arg c "$1" '.[]|select(.cidr==$c)|(.vlan.vid // empty)' | head -1; }
+
+# plane CIDRs (verified set; sourced order from lib-net PLANE_CIDRS)
+C_PROV="10.12.4.0/22"; C_METAL="10.12.8.0/22"; C_INT="$METAL_INTERNAL_CIDR"   # 10.12.12.0/22
+C_DATA="10.12.16.0/22"; C_STOR="10.12.32.0/22"; C_REPL="10.12.36.0/22"
+
+# assert all six planes resolve, and the internal plane is really VID 103
+for c in "$C_PROV" "$C_METAL" "$C_INT" "$C_DATA" "$C_STOR" "$C_REPL"; do
+  [ -n "$(subid_of "$c")" ] || { fail "no MAAS subnet for $c"; }
+  [ -n "$(vlanid_of "$c")" ] || { fail "no VLAN for $c"; }
+done
+[ "$FATAL" = 0 ] || exit 1
+gotvid="$(vlanvid_of "$C_INT")"
+[ "$gotvid" = "$METAL_INTERNAL_VID" ] || { fail "metal-internal $C_INT is VID '$gotvid', expected $METAL_INTERNAL_VID"; exit 1; }
+
+# interface id by name (live)
+ifid_of() { maas_q interfaces read "$SID" | jq -r --arg n "$1" '.[]|select(.name==$n)|.id' | head -1; }
+# is interface (by name) already linked to a given cidr?
+linked_to() {
+  maas_q interfaces read "$SID" \
+    | jq -e --arg n "$1" --arg c "$2" '.[]|select(.name==$n)|.links[]?|select(.subnet.cidr==$c)' >/dev/null 2>&1
+}
+
+# ---- mutation emitter ------------------------------------------------------
+# emit "<description>" <maas args...> (runs in apply; prints WOULD in dryrun)
+emit() {
+  local desc="$1"; shift
+  if [ "$MODE" = "apply" ]; then
+    echo " DO: $desc"
+    if ! maas "$MAAS_PROFILE" "$@" >/dev/null; then fail "$desc"; return 1; fi
+  else
+    echo " WOULD: $desc"
+    echo " maas $MAAS_PROFILE $*"
+  fi
+}
+
+hdr "$HN ($SID) octet=.$OCTET mode=$MODE"
+echo "resolved subnet/vlan ids (by CIDR):"
+printf " provider %s sub=%s vlan=%s\n" "$C_PROV" "$(subid_of "$C_PROV")" "$(vlanid_of "$C_PROV")"
+printf " metal %s sub=%s vlan=%s\n" "$C_METAL" "$(subid_of "$C_METAL")" "$(vlanid_of "$C_METAL")"
+printf " internal %s sub=%s vlan=%s (vid %s)\n" "$C_INT" "$(subid_of "$C_INT")" "$(vlanid_of "$C_INT")" "$gotvid"
+printf " data %s sub=%s vlan=%s\n" "$C_DATA" "$(subid_of "$C_DATA")" "$(vlanid_of "$C_DATA")"
+printf " storage %s sub=%s vlan=%s\n" "$C_STOR" "$(subid_of "$C_STOR")" "$(vlanid_of "$C_STOR")"
+printf " replicat %s sub=%s vlan=%s\n" "$C_REPL" "$(subid_of "$C_REPL")" "$(vlanid_of "$C_REPL")"
+
+# helper: link a RAW physical NIC -> move to plane VLAN, then STATIC link
+carve_raw() {
+  local nic="$1" cidr="$2" ip="$3"
+  local id vlan sub
+  id="$(ifid_of "$nic")"
+  [ -n "$id" ] || { fail "$nic not found on $HN"; return 1; }
+  if linked_to "$nic" "$cidr"; then note "$nic already STATIC on $cidr -- SKIP"; return 0; fi
+  vlan="$(vlanid_of "$cidr")"; sub="$(subid_of "$cidr")"
+  emit "$nic(id=$id) -> VLAN $vlan ($cidr)" interface update "$SID" "$id" vlan="$vlan"
+  emit "$nic(id=$id) -> STATIC $ip on subnet $sub" interface link-subnet "$SID" "$id" mode=STATIC subnet="$sub" ip_address="$ip"
+}
+
+hdr "provider plane (enp1s0 raw + static)"
+carve_raw enp1s0 "$C_PROV" "10.12.4.$OCTET"
+
+hdr "metal stack (enp7s0 -> br-metal -> br-metal.103 -> br-internal)"
+EID="$(ifid_of enp7s0)"; [ -n "$EID" ] || { fail "enp7s0 not found on $HN"; exit 1; }
+# 1) clear enp7s0's commissioning link(s) so the IP lands on the bridge, not the member
+if maas_q interfaces read "$SID" | jq -e '.[]|select(.name=="enp7s0")|.links[]?|select(.subnet!=null)' >/dev/null 2>&1; then
+  for lid in $(maas_q interfaces read "$SID" | jq -r '.[]|select(.name=="enp7s0")|.links[]?|select(.subnet!=null)|.id'); do
+    emit "unlink enp7s0(id=$EID) commissioning link id=$lid" interface unlink-subnet "$SID" "$EID" id="$lid"
+  done
+fi
+# 2) br-metal (standard) on enp7s0 -- inherits enp7s0's 2_metal untagged VLAN
+if [ -z "$(ifid_of br-metal)" ]; then
+  emit "create br-metal (standard) parent=enp7s0(id=$EID)" interfaces create-bridge "$SID" name=br-metal bridge_type=standard parent="$EID"
+else note "br-metal exists -- SKIP create"; fi
+BMID="$(ifid_of br-metal)"   # read-only lookup in both modes; empty in dry-run until br-metal exists
+if ! linked_to br-metal "$C_METAL"; then
+  emit "br-metal(id=$BMID) -> STATIC 10.12.8.$OCTET on subnet $(subid_of "$C_METAL")" \
+    interface link-subnet "$SID" "$BMID" mode=STATIC subnet="$(subid_of "$C_METAL")" ip_address="10.12.8.$OCTET"
+else note "br-metal already on $C_METAL -- SKIP"; fi
+# 3) br-metal.103 (VLAN, VID 103) on br-metal
+if [ -z "$(ifid_of br-metal.103)" ]; then
+  emit "create br-metal.103 (VID 103, vlan obj $(vlanid_of "$C_INT")) parent=br-metal(id=$BMID)" \
+    interfaces create-vlan "$SID" vlan="$(vlanid_of "$C_INT")" parent="$BMID"
+else note "br-metal.103 exists -- SKIP create"; fi
+V103="$(ifid_of br-metal.103)"   # read-only lookup in both modes; empty in dry-run until the vlan exists
+# 4) br-internal (standard) on br-metal.103
+if [ -z "$(ifid_of br-internal)" ]; then
+  emit "create br-internal (standard) parent=br-metal.103(id=$V103)" \
+    interfaces create-bridge "$SID" name=br-internal bridge_type=standard parent="$V103"
+else note "br-internal exists -- SKIP create"; fi
+BIID="$(ifid_of br-internal)"   # read-only lookup in both modes; empty in dry-run until br-internal exists
+if ! linked_to br-internal "$C_INT"; then
+  emit "br-internal(id=$BIID) -> STATIC 10.12.12.$OCTET on subnet $(subid_of "$C_INT")" \
+    interface link-subnet "$SID" "$BIID" mode=STATIC subnet="$(subid_of "$C_INT")" ip_address="10.12.12.$OCTET"
+else note "br-internal already on $C_INT -- SKIP"; fi
+
+hdr "data / storage / replication (raw + static)"
+carve_raw enp8s0 "$C_DATA" "10.12.16.$OCTET"
+carve_raw enp9s0 "$C_STOR" "10.12.32.$OCTET"
+carve_raw enp10s0 "$C_REPL" "10.12.36.$OCTET"
+
+hdr "enp11s0 (ex-lbaas) -- left idle by design (no link)"
+note "no action on enp11s0"
+
+# ---- verify (read-only, both modes) ---------------------------------------
+hdr "resulting interface tree (live)"
+maas_q interfaces read "$SID" | jq -r '
+  .[] | " \(.name)\ttype=\(.type)\tvlan=\(.vlan.fabric):\(.vlan.vid)\tlinks=\([.links[]?|{(.subnet.cidr // "none"):(.ip_address // .mode)}])"' | sort
+
+echo
+if [ "$MODE" = dryrun ]; then
+  note "DRY-RUN only -- nothing changed. Re-run with --apply to execute."
+fi
+echo "Summary: ${FATAL} fatal"
+[ "$FATAL" -gt 0 ] && exit 1
+exit 0