
MAAS As-Built Reference (VR0 / Baldurkeep)

Purpose: Authoritative, captured snapshot of the MAAS substrate and the four OpenStack KVM hosts, so a rebuild (or the next DC-DC region) can REPLAY the host enrollment + interface carve instead of re-deriving it from live state.

Status: Captured 2026-06-26 from live MAAS during the host re-enrollment recovery. ASCII + LF. Append-only; correct in-place only for measured drift.

Trust order: LIVE MAAS > this doc > committed bundle. Every per-host value here was measured live this session. IDs (subnet/space/fabric/interface) DRIFT across cutovers and re-enrollments -- resolve by CIDR / hostname, never by a stored ID. (PATTERN-1, lib-net.sh.)


1. Six network planes (post D-052 / D-053)

Resolve by CIDR. Subnet/space IDs intentionally omitted -- they drift (the D-052 cutover moved metal-internal off its old id; re-enrollment re-mints fabrics).

| space | CIDR | VLAN / fabric | gateway | role |
|---|---|---|---|---|
| provider-public | 10.12.4.0/22 | untagged / 1_provider | 10.12.4.1 | public API VIPs + FIP/ext_net |
| metal-admin | 10.12.8.0/22 | untagged / 2_metal (PXE/DHCP) | 10.12.8.1 | operator/MAAS/PXE/admin API |
| metal-internal | 10.12.12.0/22 | tagged VID 103 / 2_metal | none | ALL OpenStack east-west (db/amqp/rpc/internal API/OVN-DB) |
| data-tenant | 10.12.16.0/22 | untagged / 4_data | none | OVN geneve overlay |
| storage | 10.12.32.0/22 | untagged / 8_storage | none | Ceph public |
| replication | 10.12.36.0/22 | untagged / 9_replication | none | Ceph cluster |

Notes:

  • metal-internal rides the same L2 (2_metal) as metal-admin, tagged VID 103, carried on host bridge br-internal (over br-metal.103). It is the ONLY tagged plane and the ONLY container-services plane that needs a bridge.
  • Fabric cosmetic names (4_data / 8_storage / 9_replication) are reshuffled vs the libvirt network names and are irrelevant: Juju binds by SPACE NAME + CIDR + VLAN, all of which are correct.
  • Spaceless / not bound: f_oob 10.12.60.0/22; LXD fabric-4 10.37.195.0/24 + ULA fd42:8019:9206:57de::/64.
  • Gateways: only provider-public and metal-admin route. A gateway on any other plane is the D-052 "spurious-gw" defect class -- clear to none.
  • DNS resolver for the internal planes: 10.12.8.1.
  • VIP reserves: provider-public/metal-admin/metal-internal each 10.12.x.2-.100. FIP pool 10.12.5.0-.7.254. PXE DHCP 10.12.9.0-.11.254. mgmt reserves 10.12.4.101-.110 + 10.12.8.101-.110.
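The gateway rule in the notes above (only provider-public and metal-admin route; anything else is "spurious-gw") can be sketched as a small check keyed by CIDR. This is a minimal illustration, not part of the repo: the function names expected_gateway and gateway_verdict are hypothetical, and the caller is assumed to supply the live gateway value (empty or "null" meaning none) from whatever MAAS query it already uses.

```shell
#!/bin/sh
# Hypothetical helpers (not from the repo): flag the D-052 "spurious-gw"
# defect class from the six-plane table, keyed by CIDR rather than by ID.

# expected_gateway CIDR -> prints the gateway the table prescribes, or "none".
expected_gateway() {
  case "$1" in
    10.12.4.0/22) echo 10.12.4.1 ;;   # provider-public (routes)
    10.12.8.0/22) echo 10.12.8.1 ;;   # metal-admin (routes)
    10.12.12.0/22|10.12.16.0/22|10.12.32.0/22|10.12.36.0/22)
      echo none ;;                    # metal-internal, data-tenant, storage, replication
    *) return 1 ;;                    # not one of the six planes
  esac
}

# gateway_verdict CIDR [LIVE_GATEWAY]
# Prints one of: ok, spurious-gw, mismatch, unknown-cidr.
gateway_verdict() {
  want=$(expected_gateway "$1") || { echo unknown-cidr; return 0; }
  have=${2:-none}
  if [ "$have" = null ]; then have=none; fi   # tolerate a JSON null rendered as text
  if [ "$have" = "$want" ]; then
    echo ok
  elif [ "$want" = none ]; then
    echo spurious-gw                  # a gateway on a non-routed plane: clear to none
  else
    echo mismatch
  fi
}
```

For example, `gateway_verdict 10.12.32.0/22 10.12.32.1` reports spurious-gw, while the same CIDR with no gateway reports ok.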

2. The four KVM hosts -- identity (hostname-keyed; system_id DRIFTS)

| hostname | libvirt domain / power_id | host octet | boot NIC (2_metal) MAC |
|---|---|---|---|
| openstack0 | openstack0 | .40 | 52:54:00:4f:1c:0b |
| openstack1 | openstack1 | .41 | 52:54:00:83:25:1f |
| openstack2 | openstack2 | .42 | 52:54:00:23:bd:72 |
| openstack3 | openstack3 | .43 | 52:54:00:b2:7b:30 |

system_ids are minted fresh on every (re-)enrollment -- do NOT hardcode them. Resolve at runtime by hostname (scripts/lib-hosts.sh host_sysid). The dead pre-2026-06-26 ids were 4na83t/qdbqd6/h8frng/tmsafc.
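As an illustration of hostname-keyed resolution, here is a minimal sketch of a lookup that takes "hostname<TAB>system_id" pairs on stdin and fails loudly when the hostname is absent. It is not the repo's host_sysid (whose implementation is not shown here); the name sysid_for_host, the PROFILE variable, and the jq pipeline in the comment are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch; scripts/lib-hosts.sh host_sysid may work differently.
#
# sysid_for_host HOSTNAME
# Reads "hostname<TAB>system_id" pairs on stdin and prints the system_id of
# the exact hostname match. Exits 1 when there is no match, so a caller can
# never silently fall back to a stale, hardcoded id.
#
# The pairs could come from a live read such as (assumed pipeline):
#   maas "$PROFILE" machines read | jq -r '.[] | [.hostname, .system_id] | @tsv'
sysid_for_host() {
  awk -F '\t' -v h="$1" '
    $1 == h { print $2; found = 1; exit }
    END     { exit !found }
  '
}
```

Resolving at call time keeps the lookup correct across re-enrollments, because nothing but the hostname is stored.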

Full per-host NIC inventory (libvirt domain XML; fixed MACs)

libvirt source-network -> MAAS plane (the libvirt net NAMES are pre-cutover; the CIDRs were re-IP'd onto the same L2 segments, so e.g. 3_data now carries data-tenant 10.12.16.0/22):

| libvirt net | host NIC | plane | MAAS provides | built at deploy by |
|---|---|---|---|---|
| 1_provider | enp1s0 | provider-public | raw NIC + static | ovn-chassis builds br-ex (OVS), enslaves enp1s0 by MAC |
| 2_metal (boot) | enp7s0 | metal-admin + metal-internal (VID 103) | br-metal -> br-metal.103 -> br-internal (the whole stack) | (MAAS-built; not charm) |
| 3_data | enp8s0 | data-tenant | raw NIC + static | (host-level geneve; stays raw) |
| 4_storage | enp9s0 | storage | raw NIC + static | Juju auto-bridges br-enp9s0 (Linux) |
| 5_replication | enp10s0 | replication | raw NIC + static | Juju auto-bridges br-enp10s0 (Linux) |
| 8_lbaas | enp11s0 | idle (undefined; ex-lbaas) | raw NIC, no link | - |

WHO BUILDS WHICH BRIDGE (critical, from bundle ovn-chassis bridge-interface-mappings + design-decisions D line "MAC-based bridge-interface-mappings"):

  • br-ex -- built by the ovn-chassis charm at deploy (OVS), enslaving the provider NIC by MAC (bundle maps br-ex: for all four hosts). MAAS must leave enp1s0 RAW (static only) or the MAC-enslave conflicts.
  • br-enp9s0 / br-enp10s0 -- Juju auto-bridges the storage/replication raw NICs at deploy (Linux bridges) for ceph container attach. MAAS gives raw + static.
  • br-metal / br-metal.103 / br-internal -- the ONLY bridges MAAS builds, because the VLAN-103 metal-internal stack must exist pre-deploy.

MAC inventory (host: provider / metal-boot / data / storage / replication / lbaas; all MACs are prefixed 52:54:00:):

  • openstack0: 3d:fd:54 / 4f:1c:0b / 07:41:0a / d0:ed:e0 / 8f:ba:61 / d9:af:46
  • openstack1: 9d:63:77 / 83:25:1f / 4e:71:6c / 42:50:8b / 86:78:ab / c6:56:12
  • openstack2: 89:7f:ce / 23:bd:72 / 24:70:08 / b8:5d:a3 / 28:bc:8c / b5:1e:61
  • openstack3: 99:fc:c2 / b2:7b:30 / c7:94:e9 / 41:cd:6b / bc:98:b0 / 6f:f5:ca
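The inventory above can be expressed as a lookup that expands the three-octet suffixes to full MACs. This is a sketch for illustration only: the function names host_mac_suffixes and nic_mac are hypothetical and the values are copied from the list above, so live MAAS still wins on any disagreement.

```shell
#!/bin/sh
# Hypothetical helpers (not from the repo). Values mirror the MAC inventory.
MAC_PREFIX=52:54:00

# host_mac_suffixes HOST -> six suffixes in the order
# provider, metal-boot, data, storage, replication, lbaas.
host_mac_suffixes() {
  case "$1" in
    openstack0) echo "3d:fd:54 4f:1c:0b 07:41:0a d0:ed:e0 8f:ba:61 d9:af:46" ;;
    openstack1) echo "9d:63:77 83:25:1f 4e:71:6c 42:50:8b 86:78:ab c6:56:12" ;;
    openstack2) echo "89:7f:ce 23:bd:72 24:70:08 b8:5d:a3 28:bc:8c b5:1e:61" ;;
    openstack3) echo "99:fc:c2 b2:7b:30 c7:94:e9 41:cd:6b bc:98:b0 6f:f5:ca" ;;
    *) return 1 ;;   # unknown host
  esac
}

# nic_mac HOST ROLE -> full MAC address, or exit 1 for an unknown host/role.
nic_mac() {
  suffixes=$(host_mac_suffixes "$1") || return 1
  case "$2" in
    provider)    field=1 ;;
    metal-boot)  field=2 ;;
    data)        field=3 ;;
    storage)     field=4 ;;
    replication) field=5 ;;
    lbaas)       field=6 ;;
    *) return 1 ;;
  esac
  suffix=$(printf '%s\n' "$suffixes" | cut -d ' ' -f "$field")
  printf '%s:%s\n' "$MAC_PREFIX" "$suffix"
}
```

A caller could compare `nic_mac openstack0 metal-boot` against the boot-NIC MAC that MAAS reports to assert that the right domain was enrolled.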

3. virsh power (non-secret) + enrollment

  • power_type: virsh
  • power_address: qemu+ssh://logxen@10.12.64.1/system (MAAS -> libvirt over SSH on the OOB host address; mirrors the surviving juju/lxd/tailscale machines)
  • power_id: the libvirt domain name == the hostname
  • power_pass: read interactively; never stored. The libvirt SSH password was exposed in plaintext on 2026-06-26 (maas machine power-parameters echoes it) -- ROTATE that credential after the rebuild. Never use power-parameters for templating; read power_type and reconstruct the address pattern instead.
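A minimal sketch of "read power_type and reconstruct the address pattern", assuming the caller already holds the power_type string and knows the SSH user and libvirt host address for the region. The function name virsh_power_address is hypothetical; the password is deliberately not handled here and stays an interactive read.

```shell
#!/bin/sh
# Hypothetical helper (not from the repo): rebuild the virsh power_address
# from non-secret parts instead of templating from power-parameters output.

# virsh_power_address POWER_TYPE SSH_USER LIBVIRT_HOST
virsh_power_address() {
  if [ "$1" != virsh ]; then
    echo "unsupported power_type: $1" >&2
    return 1
  fi
  # Pattern from this section: qemu+ssh://<user>@<host>/system
  printf 'qemu+ssh://%s@%s/system\n' "$2" "$3"
}
```

With the values recorded above, `virsh_power_address virsh logxen 10.12.64.1` reproduces the documented power_address without ever touching the stored secret.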

Enrollment behaviour: the MAAS machines create call AUTO-COMMISSIONS each host (New -> Commissioning -> Ready) by PXE off the 2_metal boot NIC. The libvirt domains must already exist (enrollment re-creates MAAS objects, not VMs). On commission, MAAS auto-discovers all six raw NICs; the five non-boot NICs land on transient auto-fabrics (fabric-NN, "Unconfigured"). This is normal; the interface carve re-homes them by CIDR.

Procedure: scripts/reenroll-hosts.sh (gated, idempotent, discover-assert-pin). Constants: scripts/lib-hosts.sh. Interface carve: scripts/carve-host-interfaces.sh <hostname> [--apply] (dry-run default; idempotent; ids resolved live).

Post-commission, before deploy: re-apply the MAAS tag openstack to all four (the bundle places units via constraint tags=openstack; the tag applies to no machines after re-enrollment).
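The tag re-apply can be sketched as a dry-run generator that prints one MAAS CLI call per freshly resolved system_id, leaving execution to the operator. The function name tag_reapply_cmds and the profile name are assumptions; the printed form follows the MAAS CLI `tag update-nodes <tag> add=<system_id>` call, which should be confirmed against the installed MAAS version before use.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not from the repo): prints commands, runs nothing.

# tag_reapply_cmds PROFILE TAG SYSTEM_ID...
tag_reapply_cmds() {
  profile=$1
  tag=$2
  shift 2
  for sysid in "$@"; do
    # system_ids must be resolved live by hostname, never hardcoded.
    printf 'maas %s tag update-nodes %s add=%s\n' "$profile" "$tag" "$sysid"
  done
}
```

Printing rather than executing matches the dry-run-by-default convention of the carve script; the output can be reviewed and then piped to sh.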


4. Per-host interface carve target (Strategy-B reconstruction)

After commission, MAAS has only the raw NICs. Rebuild this tree (octet N = .40/.41/.42/.43 by host index). Build order is forced by parentage. NOTE: MAAS does NOT create br-ex (charm-built at deploy) -- provider is a raw NIC + static.

| MAAS interface | type | parent | VLAN | space | static |
|---|---|---|---|---|---|
| enp1s0 | physical (raw) | - | untagged | provider-public | 10.12.4.N |
| enp7s0 | physical | - | untagged | metal-admin | (carries br-metal) |
| br-metal | bridge standard | enp7s0 | untagged | metal-admin | 10.12.8.N |
| br-metal.103 | vlan | br-metal | 103 | metal-internal | (carries br-internal) |
| br-internal | bridge standard | br-metal.103 | 103 | metal-internal | 10.12.12.N |
| enp8s0 | physical (raw) | - | untagged | data-tenant | 10.12.16.N |
| enp9s0 | physical (raw) | - | untagged | storage | 10.12.32.N |
| enp10s0 | physical (raw) | - | untagged | replication | 10.12.36.N |
| enp11s0 | physical (raw) | - | untagged | undefined (idle) | (none) |

Build order: physicals (auto-discovered) -> br-metal (bridge) -> br-metal.103 (vlan) -> br-internal (std bridge) -> statics on enp1s0 / br-metal / br-internal / enp8s0 / enp9s0 / enp10s0. Link each static by CIDR (this re-homes the NIC onto the correct fabric/space).
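The static-address half of the carve target can be sketched as a pure computation: host index to octet N, then one address per addressed interface, in build order. This is illustrative only and is not scripts/carve-host-interfaces.sh; the function names host_octet and carve_statics are hypothetical, and the prefix arithmetic assumes every plane stays a /22 whose network address ends in .0.

```shell
#!/bin/sh
# Hypothetical sketch (not from the repo): derive the per-host statics.

# host_octet HOST -> last octet N for that host.
host_octet() {
  case "$1" in
    openstack0) echo 40 ;;
    openstack1) echo 41 ;;
    openstack2) echo 42 ;;
    openstack3) echo 43 ;;
    *) return 1 ;;
  esac
}

# carve_statics HOST
# Prints "interface<TAB>CIDR<TAB>static" for each interface that gets an
# address, in the order the statics are applied.
carve_statics() {
  n=$(host_octet "$1") || return 1
  while read -r iface cidr; do
    # 10.12.4.0/22 -> 10.12.4.N : drop the trailing "0/22", append the octet.
    printf '%s\t%s\t%s\n' "$iface" "$cidr" "${cidr%0/22}$n"
  done <<EOF
enp1s0 10.12.4.0/22
br-metal 10.12.8.0/22
br-internal 10.12.12.0/22
enp8s0 10.12.16.0/22
enp9s0 10.12.32.0/22
enp10s0 10.12.36.0/22
EOF
}
```

The output carries the CIDR alongside each address so that a later link step can resolve the subnet live by CIDR instead of by a stored ID. enp7s0, br-metal.103 and enp11s0 are omitted because they take no static.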

br-internal bridge_type = standard (confirmed: yesterday's D-052 create-bridge command used bridge_type=standard).

br-metal bridge_type = standard (RECOMMENDED, reasoned-not-measured: the original bring-up predates the repo and the captured release JSON did not preserve bridge_type. Reasoning: it is host/container networking (not OVN dataplane); it carries a netplan-style VLAN child (br-metal.103), which is the conventional standard-bridge construct; br-internal on top of it is standard; and standard is the MAAS create-bridge default. Reversible. CONFIRM before carve.)

Deployed-host bridge facts (for reference): at deploy, ovn-chassis builds br-ex (OVS) on the provider NIC, and Juju builds br-enp9s0/br-enp10s0 (Linux) on the storage/replication NICs; data-tenant runs on raw enp8s0. The metal stack (br-metal/br-metal.103/br-internal) is MAAS-built (standard). NOTE: an ip-level read of the hosts taken during the 2026-06-26 FAILED deploy showed br-metal/br-internal absent from the kernel bridge list -- treated as UNRELIABLE (partial/failed bring-up), not as evidence of OVS; the authoritative type is standard per the D-052 create-bridge command.


5. MAAS substrate facts (stable)

  • MAAS 3.7.2; controller maas.maas, Region+Rack, Non-HA (3 VLANs).
  • Pool default; zone default. Images synced (amd64): 24.04, 22.04, 20.04. Hosts deploy on 22.04 (Jammy).
  • Surviving (untouched) machines: capi-mgmt (lxd pod, Ready), juju/lxd/tailscale (virsh, owner logxen). The LXD pod composes only capi-mgmt; the four OpenStack hosts are virsh-individual (NOT pod-composed).
  • Tags present: openstack (apply to the 4 hosts), capi-mgmt, virtual, pod-console-logging, juju, lxd, tailscale.

6. DC-DC reuse notes

Region-specific (re-derive per region): the 10.12.x CIDRs, the boot/NIC MACs, the virsh power_address, the host octets, the libvirt host OOB address.

Structural (replays unchanged across regions): the six-plane model and its roles; metal-internal as a tagged VID-103 bridged stack over the metal plane; the per-host interface tree shape (charm-built OVS br-ex on the provider NIC + MAAS-built standard-bridge metal/internal stack + raw data/storage/replication NICs from MAAS); hostname-keyed identity with runtime system_id resolution; auto-commission-on-create; tag openstack re-apply before deploy; default-space must resolve to a LIVE space (a stale default-space globally poisons network-get).