
Phase 00 -- MAAS reconfigure to D-058

Sequences the gated steps that take the live D-052/053 cloud to the D-058 plane scheme, then hands off to deploy. Scripts do the deterministic/idempotent work; the destructive juju + libvirt steps stay human-gated (runbooks), by design.

Precondition check (read-only): scripts/phase-00-maas-standup.sh should report the three DRIFT planes (.8 metal-admin -> provider-vip, .12 metal-internal -> metal-admin, .16 data-tenant -> metal-internal). If it reports no drift, the cloud is already on D-058 and only the deploy remains.
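The go/no-go decision described above can be sketched as a small filter over the audit output. This is a minimal sketch, not part of the repo: `classify_drift` is a hypothetical helper, and it assumes each drifting plane is reported on its own line containing the token `DRIFT` (consistent with "expect 3 DRIFT lines" in Step 2, but the exact output format is an assumption).

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify standup audit output read from stdin.
# Assumption: each drifting plane is reported on one line containing "DRIFT".
classify_drift() {
    local count
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    count=$(grep -c 'DRIFT' || true)
    case "$count" in
        0) echo "on-d058: skip to deploy" ;;
        3) echo "drift: run steps 1-5" ;;
        *) echo "unexpected: $count DRIFT lines, stop and investigate"
           return 1 ;;
    esac
}
```

A possible invocation is `scripts/phase-00-maas-standup.sh | classify_drift`. Any count other than 0 or 3 is treated as a reason to stop, since a partially migrated cloud is not one of the two states this runbook expects.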

Step 1 -- Teardown (gated runbook: runbooks/phase-00-teardown-maas-reset.md)

Destroy the openstack Juju model and release openstack0-3 to MAAS Ready. The hosts MUST be released so the migrating subnets carry no live interface links -- the re-CIDR deletes those subnets, and MAAS refuses to delete a subnet with live allocations. juju destroy-model is typed by the operator (not auto-scripted). GATE: juju models shows no openstack; openstack0-3 are Ready.
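The GATE can be expressed as a read-only check. The sketch below is illustrative only: `teardown_gate` is a hypothetical helper, it reads "openstack0-3" as the four hosts openstack0 through openstack3, and it takes its input as two plain-text files (one Juju model name per line, and one "hostname status" pair per line). How those lists are produced from `juju models` and the MAAS machine listing is left to the operator.

```shell
#!/usr/bin/env bash
# Hypothetical gate check for Step 1; all names here are illustrative.
#   $1: file with one Juju model name per line
#   $2: file with "hostname status" per line
teardown_gate() {
    local models_file=$1
    local machines_file=$2
    local host
    # Gate part 1: no model named exactly "openstack" may remain.
    if grep -qx 'openstack' "$models_file"; then
        echo "GATE FAIL: openstack model still present"
        return 1
    fi
    # Gate part 2: each of openstack0..openstack3 must be listed as Ready.
    for host in openstack0 openstack1 openstack2 openstack3; do
        if ! grep -qx "$host Ready" "$machines_file"; then
            echo "GATE FAIL: $host is not Ready"
            return 1
        fi
    done
    echo "GATE PASS"
}
```

The check stops at the first failure so the operator sees one actionable message at a time; it never mutates Juju or MAAS state.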

Step 2 -- Audit (read-only)

scripts/phase-00-maas-standup.sh      # expect 3 DRIFT lines (.8/.12/.16)
scripts/phase-00-maas-recidr.sh       # audit: migration plan + metal/data fabric ids

Eyeball the fabric ids and confirm no live IP allocations are flagged on the migrating subnets. Change nothing here.

Step 3 -- Re-CIDR (gated, destructive)

scripts/phase-00-maas-recidr.sh --apply

Deletes the old .8/.12/.16 subnets (reserved ranges first), then recreates .12/.16/.20 on the SAME fabrics/VLANs (reuse-in-place; spaces inherited via the persisted VLANs). Collision-safe: all deletes precede all creates. If a delete is refused (live links remain), clear them (release/delete the machine interfaces) and re-run -- the script is idempotent (already-migrated planes SKIP).
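The delete-before-create rule matters because .12 and .16 appear in both the old and the new set: creating the new .12 while the old .12 still exists would collide. The sketch below only prints an ordering to make that visible. It is not the logic of phase-00-maas-recidr.sh; `recidr_plan` and the operation labels are hypothetical, and nothing here calls MAAS.

```shell
#!/usr/bin/env bash
# Illustrative only: prints an operation plan, touches nothing in MAAS.
# Old and new third octets are taken from the runbook text; the overlap
# (.12 and .16) is why every delete is emitted before any create.
OLD_OCTETS="8 12 16"
NEW_OCTETS="12 16 20"

recidr_plan() {
    local o n
    # Phase 1: for each old subnet, reserved ranges first, then the subnet.
    for o in $OLD_OCTETS; do
        echo "delete-reserved-ranges .$o"
        echo "delete-subnet .$o"
    done
    # Phase 2: creates begin only after every delete has been listed.
    for n in $NEW_OCTETS; do
        echo "create-subnet .$n"
    done
}
```

Running `recidr_plan` lists six delete lines followed by three create lines, so no create is ever ordered ahead of the delete that frees its CIDR.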

Step 4 -- Standup (gated)

scripts/phase-00-maas-standup.sh --apply   # provider-vip .8 (VID 104) + gateways + dns + ALL reserves
scripts/phase-00-maas-standup.sh           # verify: all-SKIP, no drift

The standup is the single MAAS-address authority (topology + VIP bands + FIP pool + mgmt reserves). phase-00-maas-carve.sh is retired.

Step 5 -- Jumphost bridges (gated host runbook: runbooks/jumphost-provider-vip-gateway.md)

ORDERING TRAP (D-058): provider-vip's gateway 10.12.8.1 IS metal-admin's OLD address. On the jumphost, in order:

(a) virbr2 (metal-admin) 10.12.8.1 -> 10.12.12.1
(b) virbr7 (oob) confirm already 10.12.60.1 (live)
(c) THEN virbr1.104 (provider-vip) = 10.12.8.1

Bringing up virbr1.104=.8.1 before (a) frees .8.1 is a same-subnet collision. libvirt/netplan persistence is host-specific -- typed by the operator, not scripted.
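A read-only pre-flight before step (c) can catch the trap. The sketch below is an assumption-laden illustration, not part of the jumphost runbook: `vip_gateway_preflight` is a hypothetical helper that reads `ip -o -4 addr show` style lines on stdin and reports whether any interface other than virbr1.104 still holds 10.12.8.1.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight for step (c). Reads `ip -o -4 addr show` style
# lines on stdin (fields: index, ifname, "inet", addr/prefix, ...).
# Prints "safe" only if no interface other than virbr1.104 holds 10.12.8.1.
vip_gateway_preflight() {
    awk '
        $3 == "inet" {
            split($4, a, "/")          # drop the /prefix part
            ifname = $2
            sub(/@.*/, "", ifname)     # tolerate name@parent forms
            if (a[1] == "10.12.8.1" && ifname != "virbr1.104") holder = ifname
        }
        END {
            if (holder != "") {
                print "blocked: 10.12.8.1 still on " holder
                exit 1
            }
            print "safe"
        }
    '
}
```

A possible invocation on the jumphost is `ip -o -4 addr show | vip_gateway_preflight`. It only inspects addresses; moving virbr2 and bringing up virbr1.104 remain operator-typed.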

Step 6 -- Deploy handoff

Proceed to phase-01 bundle deploy. Per-host interface carve (scripts/carve-host-interfaces.sh) runs after commissioning. The bundle already carries the D-058 VIP triples; d057-bundle-check.py PASSes against it.

Why teardown + jumphost are runbooks, not scripts

juju destroy-model and libvirt bridge edits are the most consequential and least reversible / least portable actions in the phase. Per the operating discipline, consequential mutations are human-gated; these stay operator-typed. The deterministic, idempotent MAAS work (re-CIDR, standup) is scripted + behavior-tested.