diff --git a/runbooks/phase-06-incloud-mgmt-cluster.md b/runbooks/phase-06-incloud-mgmt-cluster.md index 38dd469..115ca2f 100644 --- a/runbooks/phase-06-incloud-mgmt-cluster.md +++ b/runbooks/phase-06-incloud-mgmt-cluster.md @@ -19,7 +19,9 @@ allocated from it. Octavia is NOT required for the mgmt cluster itself (its apiserver is reached via the FIP directly); Octavia is a phase-08 prereq for workload clusters. - `admin-openrc` sourced on the jumphost; `openstack`, `jq`, `kubectl` available. -- The `capi-mgmt` Keystone project exists. The Magnum trustee domain is auto-configured +- The `capi-mgmt` Keystone project, the flavors, and the `ubuntu-24.04-noble` image + exist -- on a FRESH deploy NONE of these survive teardown; Step 6.0-BOOT below + verifies-or-creates all of them (run it first). The Magnum trustee domain is auto-configured by the magnum charm via its keystone (identity-credentials) relation -- verify [trust] (trustee_domain_id / trustee_domain_admin_id / trustee_domain_admin_password) is populated in magnum.conf; no manual step. 
@@ -46,6 +48,88 @@ --- +## Step 6.0-BOOT -- Fresh-deploy tenant bootstrap (project, role, flavors, mgmt image) +`# RUN: jumphost` REQUIRED on a fresh deploy: post-teardown the cloud has no +tenant projects, NO flavors, and NO images -- this is the substance of the retired +do-doc-06 tenant setup, restored after the phase-NN consolidation dropped it +(found in the 2026-06-10 pre-redeploy review). Everything is verify-or-create, so +it is safe (all [SKIP]) on an existing cloud. + +Flavor specs are as-built ground truth (2026-06-08 verified-state checkpoint): +gp.large 16384/4/80 (mgmt VM, 6.2), gp.mid 8192/2/40 (workload masters, 8.0 +template), capi.node 4096/2/40 (workload workers, 8.0 template); gp.small and +m1.lbtest are as-built parity. The 40/80 GB root disks schedule because the bundle +sets nova-compute `libvirt-image-backend: rbd` (B3) -- DISK_GB comes from the Ceph +pool, not the ~9 GB local ephemeral ceiling. + +The noble image imports via the interoperable import path (glance-direct), the +VERBATIM-proven path from the 2026-06-08 kube-image upload (plain web-download +403s on this cloud). With the hardened bundle's glance `image-conversion: true`, +the stored disk_format lands `raw` on the redeploy (expected; D-021 Ceph +fast-clone alignment). + +LIVE-REVIEW (two as-built facts not in the record -- capture from the OLD cloud +BEFORE teardown if still possible): `openstack project show capi-mgmt -f yaml` +(the project's domain) and `openstack image show ubuntu-24.04-noble -f yaml` +(visibility). The block below defaults the domain from the admin token and lets +glance default visibility (the 06-08 import landed `shared` with no flag); if 6.2 +later fails image-not-found under capi-mgmt scope, `openstack image set --public +ubuntu-24.04-noble` is the one-line repair. 
+ +```bash +( { + set -u + source ~/admin-openrc + echo "=== project capi-mgmt (verify-or-create) ===" + PROJ_DOMAIN="${OS_PROJECT_DOMAIN_NAME:-default}" # LIVE-REVIEW: as-built domain + openstack project show capi-mgmt >/dev/null 2>&1 \ + && echo "[SKIP] project capi-mgmt exists" \ + || { openstack project create --domain "$PROJ_DOMAIN" capi-mgmt >/dev/null \ + && echo "[OK] project capi-mgmt (domain $PROJ_DOMAIN)"; } + + echo "=== role: let $OS_USERNAME scope to capi-mgmt (OS_PROJECT_ID blocks in 6.x/7.8/8.x) ===" + openstack role assignment list --user "$OS_USERNAME" --user-domain "$OS_USER_DOMAIN_NAME" \ + --project capi-mgmt --project-domain "$PROJ_DOMAIN" -f value 2>/dev/null | grep -q . \ + && echo "[SKIP] role assignment present" \ + || { openstack role add --user "$OS_USERNAME" --user-domain "$OS_USER_DOMAIN_NAME" \ + --project capi-mgmt --project-domain "$PROJ_DOMAIN" admin \ + && echo "[OK] admin role on capi-mgmt"; } + + echo "=== flavors (as-built specs; public) ===" + for spec in "gp.large 4 16384 80" "gp.mid 2 8192 40" "capi.node 2 4096 40" \ + "gp.small 1 2048 20" "m1.lbtest 1 1024 4"; do + set -- $spec + openstack flavor show "$1" >/dev/null 2>&1 \ + && echo "[SKIP] flavor $1 exists" \ + || { openstack flavor create --vcpus "$2" --ram "$3" --disk "$4" --public "$1" >/dev/null \ + && echo "[OK] $1 ($2 vcpu / $3 MB / $4 GB)"; } + done + + echo "=== mgmt VM image ubuntu-24.04-noble (verify-or-import; glance-direct; HOME-staged, L7) ===" + if openstack image show ubuntu-24.04-noble >/dev/null 2>&1; then + echo "[SKIP] image ubuntu-24.04-noble exists" + else + SRC="$HOME/noble-server-cloudimg-amd64.img" + [ -f "$SRC" ] || { echo "ABORT: $SRC missing (re-fetch: cloud-images.ubuntu.com/noble/current/)"; exit 1; } + glance image-create-via-import \ + --import-method glance-direct \ + --file "$SRC" \ + --container-format bare --disk-format qcow2 \ + --name ubuntu-24.04-noble + fi + echo "=== poll to active (import + conversion) ===" + for i in $(seq 1 40); 
do + ST=$(openstack image show ubuntu-24.04-noble -f value -c status 2>/dev/null || echo '?') + echo "[$i] status=$ST" + [ "$ST" = active ] && break + sleep 15 + done +} ) +``` +GATE: project + role + all five flavors present; `ubuntu-24.04-noble` `active` +(disk_format `raw` expected with image-conversion on). Do not proceed to 6.0 +until this passes. + ## Step 6.0 -- Keypair + security group (capi-mgmt project) `# RUN: jumphost` Safe/idempotent setup -- consolidated. (LIVE-REVIEW: exact SG rule syntax is standard openstack-client; confirm on the redeploy test.) diff --git a/runbooks/phase-08-workload-cluster-acceptance.md b/runbooks/phase-08-workload-cluster-acceptance.md index 92d21d8..5ac8a0b 100644 --- a/runbooks/phase-08-workload-cluster-acceptance.md +++ b/runbooks/phase-08-workload-cluster-acceptance.md @@ -26,6 +26,7 @@ already reports HEALTHY (if the phase-07 1.4.0 upgrade was skipped, expect the COSMETIC UNHEALTHY of D-042 -- functional, but not an acceptance pass). - Image `ubuntu-jammy-kube-v1.32.13` present AND carrying Glance properties + (8.0 below verifies, and on a fresh deploy imports it from the jumphost-staged qcow2) `kube_version` (e.g. v1.32.13) and `os_distro=ubuntu`. The driver reads the k8s version from the IMAGE, not a template label (P6-CONTRACT / L-P6-3); a missing property fails create. 
@@ -84,7 +85,39 @@ && echo "template OK" || echo "template ABSENT -- create it below" } ) ``` -Create the template only if absent (spec from the as-built capture; the two labels +If the image is ABSENT (fresh deploy -- nothing survives teardown), import it from +the jumphost-staged qcow2. The command is the VERBATIM 2026-06-08 as-executed path +(glance-direct; plain web-download 403s on this cloud). With the hardened bundle's +glance `image-conversion: true` the stored disk_format lands `raw` on the redeploy +(expected -- the as-built run stored qcow2 because conversion was off then): +```bash +( { + set -u + source ~/admin-openrc + if openstack image show ubuntu-jammy-kube-v1.32.13 >/dev/null 2>&1; then + echo "[SKIP] image ubuntu-jammy-kube-v1.32.13 present" + else + SRC="$HOME/ubuntu-jammy-kube-v1.32.13-260401-2014.qcow2" + [ -f "$SRC" ] || { echo "ABORT: $SRC missing on the jumphost (azimuth-images source; see appendix-B)"; exit 1; } + glance image-create-via-import \ + --import-method glance-direct \ + --file "$SRC" \ + --container-format bare --disk-format qcow2 \ + --property os_distro=ubuntu --property kube_version=v1.32.13 \ + --name ubuntu-jammy-kube-v1.32.13 + fi + echo "=== poll to active (3.7G stage + conversion; allow ~10 min) ===" + for i in $(seq 1 40); do + ST=$(openstack image show ubuntu-jammy-kube-v1.32.13 -f value -c status 2>/dev/null || echo '?') + echo "[$i] status=$ST" + [ "$ST" = active ] && break + sleep 15 + done +} ) +``` +GATE: image `active` and the 8.0 property check above passes (kube_version +v1.32.13 / os_distro ubuntu). Then create the template only if absent (spec from +the as-built capture; the two labels are intentionally the whole config -- chart 0.25.1 + the conf.d drop-in govern the rest). `--network-driver` is OMITTED deliberately: under the 1.4.0 driver the option IS honored (it maps to the chart `network_driver`), so to keep the as-built chart
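Review bot: in the phase-06 prerequisites hunk (`@@ -19,7 +19,9 @@`), the tool list on the unchanged context line still reads "`openstack`, `jq`, `kubectl` available", but the new Step 6.0-BOOT block this hunk points at runs `glance image-create-via-import`, which is the legacy `glance` CLI, not `openstack`; add `glance` to that prerequisite line so a fresh jumphost does not fail at the image import step. The added wording "the flavors, and the `ubuntu-24.04-noble` image exist -- on a FRESH deploy NONE of these survive teardown" also reads as contradicting itself; something like "are present (Step 6.0-BOOT below creates any that are missing; run it first)" says the same thing without the whiplash.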
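Review bot: in the Step 6.0-BOOT hunk (`@@ -46,6 +48,88 @@`), the "poll to active" loop exits silently after 40 x 15 s whether or not the image reached `active`, and the surrounding `( { ... } )` then returns success, so the GATE text ("Do not proceed ... until this passes") is not enforced by the script; add a check after the loop, e.g. `[ "$ST" = active ] || { echo "ABORT: ubuntu-24.04-noble status=$ST after 10 min"; exit 1; }`, and consider skipping the poll on the `[SKIP]` path where the image already exists. Separately, the block runs under `set -u` but expands `$OS_USERNAME` and `$OS_USER_DOMAIN_NAME` without a default, while `OS_PROJECT_DOMAIN_NAME` is defaulted; an `admin-openrc` that exports `OS_USER_DOMAIN_ID` instead of the name will abort the role step with an unbound-variable error, so either document that `admin-openrc` must set those two names or guard them the same way (`"${OS_USER_DOMAIN_NAME:-default}"`).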
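Review bot: in the phase-08 prerequisites hunk (`@@ -26,6 +27,7 @@`), the inserted parenthetical "(8.0 below verifies, and on a fresh deploy imports it from the jumphost-staged qcow2)" lands between "carrying Glance properties" and the list of those properties (`kube_version` ... `os_distro=ubuntu`), which splits the sentence; move it after "`os_distro=ubuntu`." so the property list stays attached to "properties" and the pointer to 8.0 reads as its own clause.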
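Review bot: in the phase-08 import hunk (`@@ -84,7 +85,39 @@`), the GATE says "the 8.0 property check above passes", but on a fresh deploy that check block ran before the image existed and reported it absent, so the runbook should say explicitly to re-run the property check after the import completes (or fold the `kube_version` / `os_distro` check into this block after the poll). The poll loop has the same gap as the 6.0-BOOT one: 40 x 15 s is exactly the "allow ~10 min" budget and the loop falls through without a non-zero exit when `active` is never reached, so a slow conversion of the 3.7G stage passes straight to the template step; add `[ "$ST" = active ] || { echo "ABORT: ..."; exit 1; }` after the loop, or raise the iteration count above the stated budget.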