Stand up the CAPI/Magnum management cluster as a single-homed in-cloud tenant VM (capi-mgmt-v2), bootstrap k8s-snap on it, prove pod egress through the hard gate, and install the pinned CAPI provider stack. This is the persistent v1 management cluster -- there is NO clusterctl move/pivot.
Decisions: D-035 (in-cloud single-homed tenant VM; retires D-033/D-017), D-034 (CAPI versions sourced from the capi-helm-charts tag's dependencies.json, never hardcoded), D-031 (Magnum + magnum-capi-helm + capi-helm-charts engine).
Troubleshooting: appendix-A entries DOCFIX-021, DOCFIX-024, DOCFIX-025a, D-035.
admin-openrc sourced on the jumphost; openstack, jq, kubectl available.
capi-mgmt Keystone project, the flavors, and the ubuntu-24.04-noble image exist -- on a FRESH deploy NONE of these survive teardown; Step 6.0-BOOT below verifies-or-creates all of them (run it first).
The Magnum trustee domain is auto-configured by the magnum charm via its keystone (identity-credentials) relation -- verify that [trust] (trustee_domain_id / trustee_domain_admin_id / trustee_domain_admin_password) is populated in magnum.conf; no manual step.
No capi-mgmt-net tenant network yet (this phase creates it).
Literals below are tagged ENV(...) so the later generalization pass is mechanical. Discover everything else dynamically at run time.
ENV(project) capi-mgmt (resolve by name; this rebuild id d5bc125c7c1841d389b76cd0a7b0a915, domain capi)
ENV(ext-net) provider-ext (resolve by name; this rebuild id 0d00ddc1-d2bf-4849-a087-14c07d77f167)
ENV(image) ubuntu-24.04-noble (resolve by name; this rebuild id 899b4b5c-d8f6-4df4-860b-a9210d0eefe8)
ENV(flavor) gp.large (16384 MB / 4 vCPU / 80 GB)
ENV(mgmt-cidr) 10.20.0.0/24 (capi-mgmt-subnet; overlay, non-IPAM)
ENV(keystone-vip) 10.12.4.50:5000 (the gate target -- the deployed VIP)
ENV(mgmt-fip) assigned in 6.2 (apiserver SAN; resolve dynamically. This rebuild capi-mgmt-v2 = 10.12.5.103, tenant 10.20.0.107; the old 10.12.7.40 / 10.20.0.45 was the pre-teardown mgmt VM -- DOCFIX-038)
ENV(pod-cidr) 10.1.0.0/16
ENV(svc-cidr) 10.152.183.0/24 (snap defaults; non-colliding)
ENV(capi-tag) 0.25.1 (capi-helm-charts release; dependencies.json source)
# RUN: jumphost -- on vopenstack-jesse as jessea123, admin-openrc sourced.
# RUN: mgmt VM -- shipped to the VM over SSH via the FIP (heredoc below).
SSH form for mgmt VM blocks (</dev/null on every sudo): ssh -i ~/.ssh/id_ed25519 -o BatchMode=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@10.12.5.103 bash -s <<'REOF' ... REOF (10.12.5.103 = this rebuild's capi-mgmt-v2 FIP; resolve dynamically -- the old 10.12.7.40 is dead.)
# RUN: jumphost
REQUIRED on a fresh deploy: post-teardown the cloud has no tenant projects, NO flavors, and NO images -- this is the substance of the retired do-doc-06 tenant setup, restored after the phase-NN consolidation dropped it (found in the 2026-06-10 pre-redeploy review). Everything is verify-or-create, so it is safe (all [SKIP]) on an existing cloud.
Flavor specs are as-built ground truth (2026-06-08 verified-state checkpoint): gp.large 16384/4/80 (mgmt VM, 6.2), gp.mid 8192/2/40 (workload masters, 8.0 template), capi.node 4096/2/40 (workload workers, 8.0 template); gp.small and m1.lbtest are as-built parity. The 40/80 GB root disks schedule because the bundle sets nova-compute libvirt-image-backend: rbd (B3) -- DISK_GB comes from the Ceph pool, not the ~9 GB local ephemeral ceiling.
The noble image is seeded by STAGE-AND-VERIFY (canonical per FINDING-3; supersedes the 2026-06-16 web-download ruling and the standalone-glance glance-direct line): download + sha256-vs-published-SHA256SUMS + openstack image create --file --import (client-safe -- the openstack snap's --import is the glance-direct equivalent; the standalone glance client is NOT assumed present). With the hardened bundle's glance image-conversion: true, --import lands the stored disk_format raw (D-021 Ceph fast-clone alignment). Web-download is retained as a tested alternative (appendix-A); for ubuntu cloud-images it works on the hardened bundle (the 2026-06-08 403 was transient/pre-hardening), but it cannot checksum-verify the fetched file -- stage-and-verify is preferred for provenance and unifies with the phase-08 kube seed.
AS-BUILT FACTS (verified live 2026-06-10 pre-teardown; supersede the rebuild handoff, which wrongly placed capi-mgmt in admin_domain): project capi-mgmt lives in domain capi ("CAPI/Magnum workload identity"); the noble image is public with os_distro/os_version properties; admin@admin_domain holds member + load-balancer_member + reader (NOT admin) on the project -- DOCFIX-036 / D-039: magnum mints the per-cluster app-cred from the TRUSTOR's roles, so the trustor must hold load-balancer_member or CAPO's cred 403s on Octavia and the workload cluster wedges at API-LB provisioning. NOTE -- the old static CAPO identity (user capo, its app-cred, capo-clouds.yaml) is a FOSSIL of the retired D-033 out-of-cloud path and is deliberately NOT recreated: the current architecture needs no static cloud credential (clusterctl init takes none; per-cluster creds are magnum-minted at create time per D-039).
( {
set -u
source ~/admin-openrc
echo "=== domain capi (verify-or-create; as-built: 'CAPI/Magnum workload identity', NOT Juju-created) ==="
PROJ_DOMAIN="capi" # as-built, verified live 2026-06-10
openstack domain show "$PROJ_DOMAIN" >/dev/null 2>&1 \
&& echo "[SKIP] domain $PROJ_DOMAIN exists" \
|| { openstack domain create --description "CAPI/Magnum workload identity" "$PROJ_DOMAIN" >/dev/null \
&& echo "[OK] domain $PROJ_DOMAIN"; }
echo "=== project capi-mgmt in domain $PROJ_DOMAIN (verify-or-create) ==="
openstack project show capi-mgmt --domain "$PROJ_DOMAIN" >/dev/null 2>&1 \
&& echo "[SKIP] project capi-mgmt exists" \
|| { openstack project create --domain "$PROJ_DOMAIN" \
--description "CAPI management project" capi-mgmt >/dev/null \
&& echo "[OK] project capi-mgmt (domain $PROJ_DOMAIN)"; }
echo "=== roles: $OS_USERNAME gets member + load-balancer_member + reader on capi-mgmt (DOCFIX-036 / D-039) ==="
# D-039 ROOT CAUSE: magnum mints the per-cluster app-cred carrying the TRUSTOR's roles,
# FROZEN at mint, and delegates ALL trustor roles unfiltered. If admin@admin_domain holds
# only `member` here, CAPO's app-cred 403s on Octavia (needs load-balancer_member) and the
# workload cluster wedges at API-LB provisioning. Grant all three so future mints carry LB
# authority. (load-balancer_member + reader are keystone/Octavia default roles.)
for ROLE in member load-balancer_member reader; do
if openstack role assignment list --user "$OS_USERNAME" --user-domain "$OS_USER_DOMAIN_NAME" \
--project capi-mgmt --project-domain "$PROJ_DOMAIN" --role "$ROLE" -f value 2>/dev/null | grep -q .; then
echo "[SKIP] $ROLE already on capi-mgmt"
else
openstack role add --user "$OS_USERNAME" --user-domain "$OS_USER_DOMAIN_NAME" \
--project capi-mgmt --project-domain "$PROJ_DOMAIN" "$ROLE" \
&& echo "[OK] $ROLE on capi-mgmt"
fi
done
echo "=== flavors (as-built specs; public -- verified live 2026-06-10 pre-teardown) ==="
for spec in "gp.large 4 16384 80" "gp.mid 2 8192 40" "capi.node 2 4096 40" \
"gp.small 1 4096 20" "m1.lbtest 1 1024 4"; do
set -- $spec
openstack flavor show "$1" >/dev/null 2>&1 \
&& echo "[SKIP] flavor $1 exists" \
|| { openstack flavor create --vcpus "$2" --ram "$3" --disk "$4" --public "$1" >/dev/null \
&& echo "[OK] $1 ($2 vcpu / $3 MB / $4 GB)"; }
done
echo "=== mgmt VM image ubuntu-24.04-noble (verify-or-seed; STAGE-AND-VERIFY canonical; HOME-staged, L7) ==="
if openstack image show ubuntu-24.04-noble >/dev/null 2>&1; then
echo "[SKIP] image ubuntu-24.04-noble exists"
else
# Stage-and-verify (FINDING-3): download to $HOME (snap-readable; NOT /tmp -- L7) if missing/
# checksum-stale, verify sha256 vs the published SHA256SUMS, then client-safe import via the
# openstack snap (--import == glance-direct; image-conversion lands it raw). NOT the standalone
# `glance` client (unconfirmed on this jumphost).
IMG_URL="https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img"
SUM_URL="https://cloud-images.ubuntu.com/noble/current/SHA256SUMS"
IMG_FILE="noble-server-cloudimg-amd64.img"; SRC="$HOME/$IMG_FILE"
EXP=$(curl -fsSL "$SUM_URL" | awk -v f="$IMG_FILE" '$2=="*"f || $2==f {print $1}')
[ -n "$EXP" ] || { echo "GATE FAIL: no published checksum for $IMG_FILE"; exit 1; }
if [ -f "$SRC" ] && [ "$(sha256sum "$SRC" | awk '{print $1}')" = "$EXP" ]; then
echo "[OK] staged noble present + checksum-valid; skipping download"
else
echo "[..] downloading noble to $SRC (snap-readable; NOT /tmp)"
wget -q -O "$SRC" "$IMG_URL"
GOT=$(sha256sum "$SRC" | awk '{print $1}')
[ "$EXP" = "$GOT" ] || { echo "GATE FAIL: checksum mismatch exp='$EXP' got='$GOT'"; exit 1; }
echo "[OK] checksum verified ($GOT)"
fi
openstack image create ubuntu-24.04-noble \
--file "$SRC" --import \
--container-format bare --disk-format qcow2 --public \
--property os_distro=ubuntu --property os_version=24.04
fi
# as-built (verified live 2026-06-10): visibility=public, os_distro=ubuntu, os_version=24.04,
# stored raw in Ceph via the bundle's glance image-conversion=true.
echo "=== poll to active (import + conversion) ==="
for i in $(seq 1 40); do
ST=$(openstack image show ubuntu-24.04-noble -f value -c status 2>/dev/null || echo '?')
echo "[$i] status=$ST"
[ "$ST" = active ] && break
sleep 15
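done
# fail the block if the image never reached active within the poll window
[ "$ST" = active ] || { echo "GATE FAIL: ubuntu-24.04-noble not active (status=$ST)"; exit 1; }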
} )
GATE: project + role + all five flavors present; ubuntu-24.04-noble active (disk_format raw expected with image-conversion on). Do not proceed to 6.0 until this passes.
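The stage-and-verify step above can also be re-run offline against a saved copy of SHA256SUMS, for example to re-check the staged file before a later import. A minimal sketch, assuming the checksum file has already been saved locally; published_sha and verify_staged are hypothetical helper names, not part of the runbook.

```shell
# Sketch only: offline re-check of a staged image against a saved SHA256SUMS file.
# published_sha / verify_staged are illustrative names (assumptions).

# Print the published sha256 for NAME from a SHA256SUMS-format file.
# Accepts both "<hash> *NAME" (binary-mode marker) and "<hash>  NAME" entries.
published_sha() {
  local sums_file="$1" name="$2"
  awk -v f="$name" '$2 == "*" f || $2 == f { print $1; exit }' "$sums_file"
}

# Return 0 only if FILE exists and its sha256 equals the published value.
verify_staged() {
  local file="$1" sums_file="$2" name exp got
  name=$(basename "$file")
  exp=$(published_sha "$sums_file" "$name")
  [ -n "$exp" ] || { echo "GATE FAIL: no published checksum for $name" >&2; return 1; }
  [ -f "$file" ] || { echo "GATE FAIL: $file is not staged" >&2; return 1; }
  got=$(sha256sum "$file" | awk '{ print $1 }')
  [ "$exp" = "$got" ] || { echo "GATE FAIL: checksum mismatch exp='$exp' got='$got'" >&2; return 1; }
  echo "[OK] checksum verified ($got)"
}

# Usage (paths are assumptions):
#   curl -fsSL https://cloud-images.ubuntu.com/noble/current/SHA256SUMS -o "$HOME/SHA256SUMS"
#   verify_staged "$HOME/noble-server-cloudimg-amd64.img" "$HOME/SHA256SUMS"
```

Because the helper only reads local files, it can run on a jumphost that has temporarily lost internet egress.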
# RUN: jumphost Safe/idempotent setup -- consolidated. (LIVE-REVIEW: exact SG rule syntax is standard openstack-client; confirm on the redeploy test.)
( {
set -u
PROJ=capi-mgmt # ENV(project)
echo "=== keypair (import the jumphost pubkey) ==="
openstack keypair show capi-mgmt-key >/dev/null 2>&1 \
|| openstack keypair create --public-key ~/.ssh/id_ed25519.pub capi-mgmt-key
echo "=== security group capi-mgmt-sg (ingress 22 + 6443; egress default-allow) ==="
openstack security group show capi-mgmt-sg >/dev/null 2>&1 \
|| openstack security group create --project "$PROJ" capi-mgmt-sg
SG=$(openstack security group show capi-mgmt-sg -f value -c id)
# add rules only if absent (re-run safe)
openstack security group rule list "$SG" -f value -c "Port Range" | grep -q '^22:22' \
|| openstack security group rule create --proto tcp --dst-port 22 "$SG"
openstack security group rule list "$SG" -f value -c "Port Range" | grep -q '^6443:6443' \
|| openstack security group rule create --proto tcp --dst-port 6443 "$SG"
echo "=== verify ==="
openstack security group rule list "$SG" -f value -c Protocol -c "Port Range"
} )
Expect: capi-mgmt-key present; capi-mgmt-sg with tcp/22 and tcp/6443 ingress.
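The Expect line can be turned into a pass/fail check instead of an eyeball comparison. The sketch below assumes the rule listing has the two-column "Protocol Port-Range" shape printed by the verify command above (for example `tcp 22:22`); sg_has_port is a hypothetical helper name. It does not inspect rule direction, so it is a presence check only.

```shell
# Sketch only: check a "Protocol Port-Range" listing (stdin) for a single-port rule.
# sg_has_port is an illustrative name (assumption); direction is NOT checked.
sg_has_port() {
  local proto="$1" port="$2"
  awk -v p="$proto" -v r="$port:$port" \
    '$1 == p && $2 == r { found = 1 } END { exit !found }'
}

# Usage on the jumphost (SG resolved as in the block above):
#   rules=$(openstack security group rule list "$SG" -f value -c Protocol -c "Port Range")
#   for port in 22 6443; do
#     printf '%s\n' "$rules" | sg_has_port tcp "$port" \
#       || echo "GATE FAIL: tcp/$port missing on capi-mgmt-sg"
#   done
```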
# RUN: jumphost Idempotent network plumbing -- consolidated. DNS nameservers 1.1.1.1/1.0.0.1 (D-019: public resolvers; image pulls need internet egress).
( {
set -u
PROJ=capi-mgmt # ENV(project)
EXT=provider-ext # ENV(ext-net)
echo "=== network capi-mgmt-net ==="
openstack network show capi-mgmt-net >/dev/null 2>&1 \
|| openstack network create --project "$PROJ" capi-mgmt-net
echo "=== subnet capi-mgmt-subnet 10.20.0.0/24 ===" # ENV(mgmt-cidr)
openstack subnet show capi-mgmt-subnet >/dev/null 2>&1 \
|| openstack subnet create --project "$PROJ" --network capi-mgmt-net \
--subnet-range 10.20.0.0/24 \
--dns-nameserver 1.1.1.1 --dns-nameserver 1.0.0.1 capi-mgmt-subnet
echo "=== router capi-mgmt-router + ext-gw + subnet ==="
openstack router show capi-mgmt-router >/dev/null 2>&1 \
|| openstack router create --project "$PROJ" capi-mgmt-router
openstack router set --external-gateway "$EXT" capi-mgmt-router
openstack router add subnet capi-mgmt-router capi-mgmt-subnet 2>/dev/null || true
echo "=== verify ==="
openstack router show capi-mgmt-router -f value -c external_gateway_info -c status
} )
Expect: subnet 10.20.0.0/24; router ACTIVE with an external gateway on provider-ext.
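The ENV list describes the pod and service CIDRs as non-colliding with the management subnet. A minimal sketch of how that claim could be re-checked if any ENV(...) literal changes during generalization; the helper names are hypothetical and the arithmetic assumes well-formed IPv4 CIDRs.

```shell
# Sketch only: pairwise overlap check for IPv4 CIDRs (helper names are assumptions).

# Dotted quad -> integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# CIDR -> "first last" (inclusive integer range).
cidr_range() {
  local ip="${1%/*}" len="${1#*/}" base size first
  base=$(ip_to_int "$ip")
  size=$(( 1 << (32 - len) ))
  first=$(( base / size * size ))
  echo "$first $(( first + size - 1 ))"
}

# Exit 0 if the two CIDRs share at least one address.
cidrs_overlap() {
  local a b
  a=$(cidr_range "$1"); b=$(cidr_range "$2")
  [ "${a% *}" -le "${b#* }" ] && [ "${b% *}" -le "${a#* }" ]
}

# Exit non-zero if any pair in the argument list overlaps.
check_non_colliding() {
  local rc=0 x y
  while [ "$#" -gt 1 ]; do
    x="$1"; shift
    for y in "$@"; do
      if cidrs_overlap "$x" "$y"; then echo "GATE FAIL: $x overlaps $y"; rc=1; fi
    done
  done
  return "$rc"
}

# ENV(mgmt-cidr), ENV(pod-cidr), ENV(svc-cidr) from this runbook:
check_non_colliding 10.20.0.0/24 10.1.0.0/16 10.152.183.0/24 \
  && echo "[OK] CIDRs do not collide"
```

The provider-ext range is not stated in this phase, so it is left out of the example; add it to the argument list once resolved.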
# RUN: jumphost Creates the VM and pins the management FIP. The FIP is the stable apiserver endpoint for the jumphost AND the Magnum conductor.
( {
set -u
PROJ=capi-mgmt # ENV(project)
EXT=provider-ext # ENV(ext-net)
echo "=== create capi-mgmt-v2 (gp.large / ubuntu-24.04-noble) ==="
openstack server show capi-mgmt-v2 >/dev/null 2>&1 \
|| openstack server create --image ubuntu-24.04-noble --flavor gp.large \
--network capi-mgmt-net --security-group capi-mgmt-sg \
--key-name capi-mgmt-key capi-mgmt-v2
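echo "=== wait ACTIVE (poll; do not re-run the whole block, each run allocates a new FIP) ==="
for i in $(seq 1 40); do
ST=$(openstack server show capi-mgmt-v2 -f value -c status 2>/dev/null || echo '?')
echo "[$i] status=$ST"
[ "$ST" = ACTIVE ] && break
[ "$ST" = ERROR ] && { echo "ABORT: capi-mgmt-v2 is in ERROR"; exit 1; }
sleep 10
done
[ "$ST" = ACTIVE ] || { echo "ABORT: capi-mgmt-v2 not ACTIVE (status=$ST)"; exit 1; }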
echo "=== floating ip on provider-ext, associate to the VM ==="
FIP=$(openstack floating ip create "$EXT" -f value -c floating_ip_address)
openstack server add floating ip capi-mgmt-v2 "$FIP"
# tenant (fixed) IP = the server address that is NOT the FIP (single-NIC VM has exactly the two)
TENANT_IP=$(openstack server show capi-mgmt-v2 -f json \
| FIP="$FIP" python3 -c "import os,json,sys; a=json.load(sys.stdin).get('addresses',{}) or {}; ips=[ip for net in a.values() for ip in net]; print(next((ip for ip in ips if ip!=os.environ['FIP']), ''))")
[ -n "$TENANT_IP" ] || { echo "ABORT: could not resolve tenant IP"; exit 1; }
# PERSIST both (single source for 6.3-6.6 -- PATTERN-1; the FIP is pool-allocated + the tenant
# IP DHCP-assigned, so NEITHER is deterministic per rebuild -- never hardcode them)
printf 'MGMT_FIP=%s\nMGMT_TENANT_IP=%s\n' "$FIP" "$TENANT_IP" | tee ~/capi-mgmt-net.env
openstack server show capi-mgmt-v2 -f value -c addresses
} )
Note (DOCFIX-038): the FIP is pool-allocated and the tenant IP is DHCP-assigned -- NEITHER is deterministic (this rebuild: FIP 10.12.5.103, tenant 10.20.0.107; the pre-teardown VM was 10.12.7.40 / 10.20.0.45). Step 6.2 persists both to ~/capi-mgmt-net.env; 6.3-6.6a source it, and phase-07 (conductor kubeconfig) uses the same FIP. Do not hardcode either value.
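Since every later step trusts ~/capi-mgmt-net.env, it may be worth validating the file before using it rather than sourcing it blindly. A minimal sketch, assuming the two-line KEY=value format written by 6.2; is_ipv4 and load_mgmt_net_env are hypothetical helper names.

```shell
# Sketch only: validate ~/capi-mgmt-net.env before use (helper names are assumptions).

# Exit 0 if $1 is a dotted-quad IPv4 address with octets 0-255.
is_ipv4() {
  local a b c d rest o
  case "$1" in ""|*[!0-9.]*|*.) return 1 ;; esac
  IFS=. read -r a b c d rest <<EOF
$1
EOF
  [ -z "$rest" ] || return 1
  for o in "$a" "$b" "$c" "$d"; do
    case "$o" in ""|????*) return 1 ;; esac
    [ "$o" -le 255 ] || return 1
  done
}

# Parse (not source) the env file, validate both values, then export them.
load_mgmt_net_env() {
  local file="${1:-$HOME/capi-mgmt-net.env}" fip tenant
  [ -r "$file" ] || { echo "ABORT: $file missing (run 6.2 first)" >&2; return 1; }
  fip=$(sed -n 's/^MGMT_FIP=//p' "$file" | tail -n 1)
  tenant=$(sed -n 's/^MGMT_TENANT_IP=//p' "$file" | tail -n 1)
  is_ipv4 "$fip" || { echo "ABORT: MGMT_FIP='$fip' is not an IPv4 address" >&2; return 1; }
  is_ipv4 "$tenant" || { echo "ABORT: MGMT_TENANT_IP='$tenant' is not an IPv4 address" >&2; return 1; }
  [ "$fip" != "$tenant" ] || { echo "ABORT: FIP and tenant IP are identical" >&2; return 1; }
  MGMT_FIP="$fip"; MGMT_TENANT_IP="$tenant"
  export MGMT_FIP MGMT_TENANT_IP
}

# Usage, in place of a bare `source ~/capi-mgmt-net.env`:
#   load_mgmt_net_env || exit 1
```

Parsing instead of sourcing means a truncated or hand-edited file aborts the step rather than silently leaving an empty SAN or SSH target.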
# RUN: mgmt VM This is the premise of D-035. PROCEED ONLY IF VIP-OK.
source ~/capi-mgmt-net.env # MGMT_FIP, MGMT_TENANT_IP (written by 6.2)
ssh -i ~/.ssh/id_ed25519 -o BatchMode=yes -o StrictHostKeyChecking=no \
-o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@"$MGMT_FIP" bash -s <<'REOF'
set -u
echo "=== VM -> Keystone VIP 10.12.4.50:5000 ===" # ENV(keystone-vip)
timeout 6 bash -c 'exec 3<>/dev/tcp/10.12.4.50/5000' && echo VIP-OK || echo VIP-FAIL
echo "=== VM -> internet 1.1.1.1:443 (image pulls) ==="
timeout 6 bash -c 'exec 3<>/dev/tcp/1.1.1.1/443' && echo NET-OK || echo NET-FAIL
REOF
GATE: require VIP-OK. NET-FAIL means internet egress via provider-ext (or a registry mirror) must be sorted out before 6.6. Do NOT build k8s on a VM that fails VIP-OK. (appendix-A: D-035 -- single-NIC removes the dual-homed reverse-path bug.)
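The probe block prints VIP-OK / VIP-FAIL but always exits 0, so the gate depends on someone reading the output. A minimal sketch of a filter that makes the gate mechanical; egress_gate is a hypothetical helper name and it assumes the marker words appear alone on their own lines, as the probe above prints them.

```shell
# Sketch only: turn the probe's printed markers into an exit status.
# egress_gate is an illustrative name (assumption).
egress_gate() {
  local out
  out=$(cat)                      # capture the probe output from stdin
  printf '%s\n' "$out"            # echo it through for the operator
  if ! printf '%s\n' "$out" | grep -qx 'VIP-OK'; then
    echo "GATE FAIL: mgmt VM cannot reach the Keystone VIP; do not bootstrap k8s" >&2
    return 1
  fi
  if ! printf '%s\n' "$out" | grep -qx 'NET-OK'; then
    echo "WARN: no internet egress; fix provider-ext egress or add a registry mirror before 6.6" >&2
  fi
  return 0
}

# Usage: pipe the existing ssh probe into the gate, e.g.
#   ssh ... ubuntu@"$MGMT_FIP" bash -s <<'REOF' | egress_gate || exit 1
#   ...probe body as above...
#   REOF
```

NET-FAIL is deliberately a warning, matching the GATE text: only VIP-OK is a hard requirement at this step.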
# RUN: mgmt VM Channel is 1.32-classic/stable (NOT 1.32/stable -- that is the charm-era track and does not exist for the snap). The bootstrap config MUST carry an explicit cluster-config block (appendix-A: DOCFIX-024 -- a config without it disables network+dns and the node never goes Ready). Every sudo gets </dev/null (appendix-A: DOCFIX-021 -- remote bash -s reads the script from stdin).
source ~/capi-mgmt-net.env # MGMT_FIP, MGMT_TENANT_IP (written by 6.2)
ssh -i ~/.ssh/id_ed25519 -o BatchMode=yes -o StrictHostKeyChecking=no \
-o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@"$MGMT_FIP" \
bash -s "$MGMT_FIP" "$MGMT_TENANT_IP" <<'REOF'
set -euo pipefail
MGMT_FIP="$1"; MGMT_TENANT_IP="$2" # passed from the jumphost (extra-sans must be the real FIP + tenant IP)
echo "=== install k8s snap 1.32-classic/stable ==="
sudo snap install k8s --classic --channel=1.32-classic/stable </dev/null
echo "=== write bootstrap config (DOCFIX-024: cluster-config block REQUIRED) ==="
sudo tee /root/bootstrap-config.yaml >/dev/null <<CFG
cluster-config:
  network:
    enabled: true
  dns:
    enabled: true
pod-cidr: 10.1.0.0/16
service-cidr: 10.152.183.0/24
extra-sans:
- $MGMT_FIP
- $MGMT_TENANT_IP
CFG
sudo cat /root/bootstrap-config.yaml
echo "=== bootstrap (timeout 10m) ==="
sudo k8s bootstrap --name capi-mgmt-v2 --file /root/bootstrap-config.yaml --timeout 10m </dev/null
echo "=== status ==="
sudo k8s status --wait-ready --timeout 5m </dev/null
REOF
Expect: k8s status reports cluster ready, network+dns enabled, one node. Retry path: sudo snap remove k8s --purge </dev/null then re-run this block.
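DOCFIX-024 is easy to reintroduce when the bootstrap file is edited by hand, because a config without a properly nested cluster-config block is silently accepted. A rough pre-flight sketch, assuming two-space YAML indentation as in the config above; bootstrap_config_ok is a hypothetical helper name and this is a line-pattern check, not a YAML parser.

```shell
# Sketch only: confirm cluster-config has network.enabled and dns.enabled set to true.
# Assumes two-space indentation; bootstrap_config_ok is an illustrative name.
bootstrap_config_ok() {
  local file="$1"
  awk '
    /^[^ #]/ { top = $1 }                               # remember the current top-level key
    top == "cluster-config:" && /^  [a-z]/ {            # a feature section under cluster-config
      sec = $1; sub(/:.*$/, "", sec)
    }
    top == "cluster-config:" && /^    enabled:[ ]*true[ ]*$/ { enabled_in[sec] = 1 }
    END { exit !(enabled_in["network"] && enabled_in["dns"]) }
  ' "$file" || {
    echo "GATE FAIL: $file lacks cluster-config network/dns enabled (DOCFIX-024)" >&2
    return 1
  }
}

# Usage on the mgmt VM, before `k8s bootstrap` (the file is root-owned, hence the copy):
#   sudo cat /root/bootstrap-config.yaml </dev/null > /tmp/bootstrap-check.yaml
#   bootstrap_config_ok /tmp/bootstrap-check.yaml
```

A file whose indentation was flattened (every key at column 0) fails this check, which is the same shape of mistake that leaves network and dns disabled.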
The agnhost pod-egress probe is the exact test that the dual-homed D-033 node and the old k3s node FAILED. On this single-NIC VM it must reach Completed.
# RUN: jumphost (ssh to the mgmt VM; the kubeconfig lands on the jumphost). server = the FIP, not tenant IP
source ~/capi-mgmt-net.env # MGMT_FIP
ssh -i ~/.ssh/id_ed25519 -o BatchMode=yes -o StrictHostKeyChecking=no \
-o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@"$MGMT_FIP" \
"sudo k8s config server=https://$MGMT_FIP:6443 </dev/null" > ~/capi-mgmt.kubeconfig
# [SENSITIVE] ~/capi-mgmt.kubeconfig contains a cluster-admin credential.
wc -l ~/capi-mgmt.kubeconfig ; head -1 ~/capi-mgmt.kubeconfig # expect >0 lines, "apiVersion: v1"
# RUN: jumphost -- node check + the hard gate
( {
set -u
export KUBECONFIG="$HOME/capi-mgmt.kubeconfig"
echo "=== node ==="
kubectl get nodes -o wide # expect capi-mgmt-v2 Ready, v1.32.13
echo "=== agnhost pod-egress probe -> Keystone VIP 10.12.4.50:5000 ==="
kubectl run egress-test --image=registry.k8s.io/e2e-test-images/agnhost:2.40 \
--restart=Never --command -- /agnhost connect 10.12.4.50:5000 --timeout=5s
echo "(poll the next line until phase=Succeeded with terminated exitCode 0; kubectl get pod shows this as STATUS=Completed)"
kubectl get pod egress-test -o jsonpath='{.status.phase} {.status.containerStatuses[0].state}{"\n"}'
} )
GATE: require the probe pod Completed / exitCode 0 (empty logs = clean TCP connect). That proves pod -> Cilium -> ens3 -> OVN -> router SNAT egress works. Then clean up the throwaway pod:
# RUN: jumphost
KUBECONFIG="$HOME/capi-mgmt.kubeconfig" kubectl delete pod egress-test --now
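The probe step asks the operator to re-run a line until the pod finishes. A minimal sketch of a generic poll loop that could wrap that line; poll_until is a hypothetical helper name, and the try count and delay in the usage comment are arbitrary assumptions.

```shell
# Sketch only: re-run a command until its stdout equals the wanted value.
# poll_until WANT TRIES DELAY CMD [ARGS...]   (illustrative name, an assumption)
poll_until() {
  local want="$1" tries="$2" delay="$3" i=1 got
  shift 3
  while [ "$i" -le "$tries" ]; do
    got=$("$@" 2>/dev/null) || got=""
    echo "[$i] got='$got'" >&2
    [ "$got" = "$want" ] && return 0
    sleep "$delay"
    i=$((i + 1))
  done
  echo "GATE FAIL: never reached '$want' after $tries tries" >&2
  return 1
}

# Usage on the jumphost (KUBECONFIG exported as in the probe block):
#   poll_until Succeeded 40 3 kubectl get pod egress-test -o jsonpath='{.status.phase}' \
#     && kubectl get pod egress-test \
#          -o jsonpath='{.status.containerStatuses[0].state.terminated.exitCode}{"\n"}'
```

A probe that cannot connect ends in phase Failed, so the loop times out and returns non-zero instead of hanging.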
# RUN: mgmt VM Run VM-side as the ubuntu user with ~/.kube/config (written in 6.6a from sudo k8s config; local apiserver = the VM's tenant IP:6443) so the matched 1.32.13 kubectl is used -- this avoids the jumphost kubectl's +3-minor skew. Versions are READ from the tag's dependencies.json, never hardcoded (D-034). The as-built pins are in the reference block below as a known-good cross-check only.
HARDENED ORDER (appendix-A: D-034 install-ordering): cert-manager -> ORC -> clusterctl init -> CAAPH -> janitor. ORC precedes clusterctl init because CAPO v0.14.4's openstackserver controller hard-depends on ORC's Image.openstack.k-orc.cloud CRD; installing CAPO first crash-loops until ORC lands. (The 2026-06-08 run used ORC last and self-healed after 6 restarts -- the runbook corrects the order.)
# RUN: jumphost Installs the CAPI tooling on the mgmt VM at the dependencies.json pins and writes ~/capi-pins.env (sourced by 6.6b-6.6f). kubectl is pinned to the cluster's 1.32.13 (no apiserver skew). The SSH_OPTS/MGMT_VM vars set here are reused by 6.6b-6.6f (same jumphost shell).
# define the mgmt-VM connection once (reused by 6.6b-6.6f)
source ~/capi-mgmt-net.env # MGMT_FIP, MGMT_TENANT_IP (written by 6.2)
MGMT_VM="$MGMT_FIP"
SSH_OPTS="-i $HOME/.ssh/id_ed25519 -o BatchMode=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10"
ssh $SSH_OPTS ubuntu@"$MGMT_VM" bash -s <<'REOF'
set -euo pipefail
sudo apt-get update -qq </dev/null && sudo apt-get install -y jq curl </dev/null
# kubeconfig for the local apiserver (the VM's own tenant IP:6443), readable by ubuntu -> helm/clusterctl/kubectl need no sudo
mkdir -p "$HOME/.kube"; sudo k8s config </dev/null > "$HOME/.kube/config"; chmod 600 "$HOME/.kube/config"
# egress pre-check (the VM pulls charts/binaries/manifests from these)
for h in https://raw.githubusercontent.com https://get.helm.sh https://github.com https://dl.k8s.io; do
printf '%s -> ' "$h"; curl -s -o /dev/null -w '%{http_code}\n' "$h" || echo FAIL
done
# version constellation from the chart tag's dependencies.json (D-034; never hardcoded)
curl -fsSL https://raw.githubusercontent.com/azimuth-cloud/capi-helm-charts/0.25.1/dependencies.json -o "$HOME/deps.json"
CAPI=$(jq -r '."cluster-api"' "$HOME/deps.json")
CAPO=$(jq -r '."cluster-api-provider-openstack"' "$HOME/deps.json")
CERT=$(jq -r '."cert-manager"' "$HOME/deps.json")
ORC=$(jq -r '."openstack-resource-controller"' "$HOME/deps.json")
CAAPH=$(jq -r '."addon-provider"' "$HOME/deps.json")
JANITOR=$(jq -r '."cluster-api-janitor-openstack"' "$HOME/deps.json")
HELM=$(jq -r '.helm' "$HOME/deps.json")
{ echo "CAPI=$CAPI"; echo "CAPO=$CAPO"; echo "CERT=$CERT"; echo "ORC=$ORC"; \
echo "CAAPH=$CAAPH"; echo "JANITOR=$JANITOR"; echo "HELM=$HELM"; } > "$HOME/capi-pins.env"
echo "== pins (cross-check: CAPI v1.13.2 CAPO v0.14.4 CERT v1.20.2 ORC v2.5.0 CAAPH 0.12.0 JANITOR 0.11.0 HELM v3.17.3) =="
cat "$HOME/capi-pins.env"
# install helm (pinned), clusterctl (= CAPI pin), kubectl (= cluster 1.32.13)
curl -fsSL "https://get.helm.sh/helm-${HELM}-linux-amd64.tar.gz" -o /tmp/helm.tgz
sudo tar -xzf /tmp/helm.tgz -C /usr/local/bin --strip-components=1 linux-amd64/helm </dev/null
curl -fsSL "https://github.com/kubernetes-sigs/cluster-api/releases/download/${CAPI}/clusterctl-linux-amd64" -o /tmp/clusterctl
sudo install -m 0755 /tmp/clusterctl /usr/local/bin/clusterctl </dev/null
curl -fsSL "https://dl.k8s.io/release/v1.32.13/bin/linux/amd64/kubectl" -o /tmp/kubectl
sudo install -m 0755 /tmp/kubectl /usr/local/bin/kubectl </dev/null
echo "== tooling =="; helm version --short; clusterctl version; kubectl version --client 2>/dev/null | head -1
REOF
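One gap in the pin extraction: jq -r prints the literal string null for a key that is absent from dependencies.json, and that does not trip set -e, so a renamed key would flow into the install URLs as "null". A minimal sketch of a guard over the generated file; pins_ok is a hypothetical helper name and the key list mirrors the seven variables written above.

```shell
# Sketch only: reject empty or literal-"null" pins in capi-pins.env.
# pins_ok is an illustrative name (assumption).
pins_ok() {
  local file="$1" key val rc=0
  for key in CAPI CAPO CERT ORC CAAPH JANITOR HELM; do
    val=$(sed -n "s/^${key}=//p" "$file" | tail -n 1)
    case "$val" in
      ""|null) echo "GATE FAIL: $key unresolved ('$val') in $file" >&2; rc=1 ;;
      *)       echo "[OK] $key=$val" ;;
    esac
  done
  return "$rc"
}

# Usage on the mgmt VM, right after capi-pins.env is written:
#   pins_ok "$HOME/capi-pins.env" || exit 1
```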
# RUN: jumphost
ssh $SSH_OPTS ubuntu@"$MGMT_VM" bash -s <<'REOF'
set -euo pipefail
source "$HOME/capi-pins.env"
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm upgrade --install cert-manager jetstack/cert-manager \
--namespace cert-manager --create-namespace \
--version "$CERT" --set crds.enabled=true --wait --timeout 5m
kubectl -n cert-manager wait --for=condition=Available deploy --all --timeout=180s
kubectl -n cert-manager get pods
REOF
# RUN: jumphost server-side apply (large CRDs). Manifest is the k-orc release install.yaml (D-034).
ssh $SSH_OPTS ubuntu@"$MGMT_VM" bash -s <<'REOF'
set -euo pipefail
source "$HOME/capi-pins.env"
kubectl apply --server-side -f \
"https://github.com/k-orc/openstack-resource-controller/releases/download/${ORC}/install.yaml"
kubectl -n orc-system wait --for=condition=Available deploy --all --timeout=180s
kubectl get crd images.openstack.k-orc.cloud
REOF
# RUN: jumphost cert-manager already present -> clusterctl detects and skips it.
ssh $SSH_OPTS ubuntu@"$MGMT_VM" bash -s <<'REOF'
set -euo pipefail
source "$HOME/capi-pins.env"
clusterctl init \
--core "cluster-api:${CAPI}" \
--bootstrap "kubeadm:${CAPI}" \
--control-plane "kubeadm:${CAPI}" \
--infrastructure "openstack:${CAPO}"
for ns in capi-system capi-kubeadm-bootstrap-system capi-kubeadm-control-plane-system capo-system; do
echo "== $ns =="; kubectl -n "$ns" wait --for=condition=Available deploy --all --timeout=240s
done
REOF
# RUN: jumphost
ssh $SSH_OPTS ubuntu@"$MGMT_VM" bash -s <<'REOF'
set -euo pipefail
source "$HOME/capi-pins.env"
helm repo add capi-addon https://azimuth-cloud.github.io/cluster-api-addon-provider
helm repo add capi-janitor https://azimuth-cloud.github.io/cluster-api-janitor-openstack
helm repo update
helm upgrade --install cluster-api-addon-provider capi-addon/cluster-api-addon-provider \
--namespace capi-addon-system --create-namespace --version "$CAAPH" --wait --timeout 5m
helm upgrade --install cluster-api-janitor-openstack capi-janitor/cluster-api-janitor-openstack \
--namespace capi-janitor-system --create-namespace --version "$JANITOR" --wait --timeout 5m
kubectl -n capi-addon-system get pods
kubectl -n capi-janitor-system get pods
REOF
# RUN: jumphost
ssh $SSH_OPTS ubuntu@"$MGMT_VM" bash -s <<'REOF'
set -euo pipefail
clusterctl version
echo "== all controllers Running =="
# grep -E replaces the obsolescent egrep (which warns on noble)
kubectl get pods -A | grep -E 'capi-|capo-|cert-manager|orc-system|janitor|addon' || true
echo "== key CRDs present =="
kubectl get crd clusters.cluster.x-k8s.io \
openstackclusters.infrastructure.cluster.x-k8s.io \
kubeadmcontrolplanes.controlplane.cluster.x-k8s.io \
images.openstack.k-orc.cloud
REOF
VIP-OK gate and agnhost probe Completed both passed.
capi-mgmt-v2 Ready (v1.32.13); ~/capi-mgmt.kubeconfig (server = FIP) works from the jumphost.
Image CRD present; no crash-looping CAPO.
capi-mgmt-v2: gp.large, ubuntu-24.04-noble; tenant IP + FIP are per-rebuild (this rebuild 10.20.0.107 ens3 / FIP 10.12.5.103; 2026-06-08/09: 10.20.0.45 / 10.12.7.40). 6.2 persists both to ~/capi-mgmt-net.env.
capi-mgmt-net / subnet capi-mgmt-subnet 10.20.0.0/24; router capi-mgmt-router.
phase-07 -- conductor graft: place ~/capi-mgmt.kubeconfig at /etc/magnum/kubeconfig on magnum/0 and stage the [capi_helm] conf.d drop-in (D-037), pointing the conductor at the FIP.