# Runbook 04a — CAPI bootstrap cluster install on capi-mgmt.maas

**Reference:** D-017 (full rebuild every cycle). Runs after `04-magnum-domain.md` and before `05-magnum-capi-driver.md`.

**Goal:** From a MAAS-Ready `capi-mgmt` VM, produce a single-node k3s running cluster-api, CAPO, canonical-kubernetes providers, cert-manager, and ORC, with a workload-cluster kubeconfig delivered to the jumphost for use by the Magnum CAPI driver in runbook 05.

**Pre-conditions:**

- OpenStack cloud is up and stable (`02-deploy.md` complete, all units active/idle)
- Magnum trustee domain is created (`04-magnum-domain.md` complete)
- `capi-mgmt` MAAS machine is in **Ready** state (released after teardown, not yet deployed)
- Jumphost has `~/admin-openrc` sourced and an authenticated `openstack` CLI working against the new Caracal cloud
- Vault CA bundle is available on the jumphost at a known path (issued by the Caracal Vault during `03-vault-init.md`)

**Network preconditions:**

- `capi-mgmt` machine should be configured in MAAS with two interfaces:
  - `eth0` on the metal fabric (DHCP from MAAS) — used for k3s API bind
  - `eth1` on the provider fabric (static IP, no DHCP) — used for workload-cluster FIP reach. This IP must NOT fall inside the Neutron FIP allocation pool on the ext_net subnet.
- Verify the eth1 IP is outside the FIP pool before deploy:

```bash
openstack subnet show -c allocation_pools -c gateway_ip
```

## Step 1 — Deploy Ubuntu 24.04 to capi-mgmt via MAAS

Use MAAS UI: Machines → capi-mgmt → Take action → Deploy → Ubuntu 24.04 LTS (Noble) → Deploy machine. Wait for Deployed status (~10 min).

Verify SSH reachability once Deployed (note: SSH user is `ubuntu`, not `jessea123`; MAAS cloud-init pattern):

```bash
ssh ubuntu@ 'hostname; uname -a; ip -br a'
```

Verify both interfaces show their expected IPs.
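The FIP-pool precondition above can be checked mechanically once the pool start and end addresses have been read from the `allocation_pools` output. The following is a minimal sketch and not part of the original procedure: `ip_to_int` and `ip_in_pool` are hypothetical helper names, and the sketch assumes plain dotted-quad IPv4 addresses without leading zeros.

```shell
# Hypothetical helpers (not from the runbook): test whether an IPv4 address
# lies inside an allocation pool given by its start and end addresses.

# Convert a dotted-quad IPv4 address to its integer value.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  # Split the address on dots into the positional parameters.
  set -- $1
  IFS=$old_ifs
  echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
}

# ip_in_pool IP POOL_START POOL_END
# Returns 0 if IP is within [POOL_START, POOL_END] inclusive, 1 otherwise.
ip_in_pool() {
  ip_n=$(ip_to_int "$1")
  start_n=$(ip_to_int "$2")
  end_n=$(ip_to_int "$3")
  [ "$ip_n" -ge "$start_n" ] && [ "$ip_n" -le "$end_n" ]
}
```

With example addresses from the documentation range (not values from this cloud), `ip_in_pool 192.0.2.10 192.0.2.100 192.0.2.200` returns non-zero, meaning the address is safely outside the pool. A zero return should block the deploy until the eth1 address is changed.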
## Step 2 — Install Vault CA on the bootstrap host

The bootstrap host must trust the Caracal Vault root CA so that `openstack` CLI calls and CAPO authentication to Keystone succeed over HTTPS.

```bash
# From jumphost — replace with the deployed capi-mgmt IP
scp /vault-ca.crt ubuntu@:/tmp/vault-ca.crt
ssh ubuntu@ << 'REMOTE'
sudo install -m 0644 /tmp/vault-ca.crt /usr/local/share/ca-certificates/vault-ca.crt
sudo update-ca-certificates
# Verify Keystone reachable with TLS
curl --cacert /etc/ssl/certs/ca-certificates.crt https://:5000/v3 -s -o /dev/null -w "%{http_code}\n"
# Expect: 200
REMOTE
```

## Step 3 — Install k3s

k3s defaults to binding 0.0.0.0:6443. Bind to the metal-network IP only, to keep the management API off the provider network.

The TLS-SAN flags must include both the IP and the FQDN. k3s does NOT auto-add 127.0.0.1 to the SAN list; if 127.0.0.1 needs to be in the kubeconfig, add it explicitly as a `--tls-san`. We do not; we rewrite the kubeconfig server URL instead.

```bash
ssh ubuntu@ 'bash -s' << 'REMOTE'
set -euo pipefail

BIND_ADDR=$(ip -4 -br a show eth0 | awk '{print $3}' | cut -d/ -f1)
echo "bind addr: $BIND_ADDR"

if systemctl is-active --quiet k3s; then
  echo "[skip] k3s already running"
else
  curl -sfL https://get.k3s.io | \
    INSTALL_K3S_EXEC="server \
      --bind-address=${BIND_ADDR} \
      --advertise-address=${BIND_ADDR} \
      --node-ip=${BIND_ADDR} \
      --tls-san=${BIND_ADDR} \
      --tls-san=capi-mgmt.maas \
      --write-kubeconfig-mode=0644 \
      --disable=traefik" \
    sh -
fi

# Wait for Ready
for i in $(seq 1 30); do
  if sudo k3s kubectl get nodes 2>/dev/null | awk 'NR>1 && $2=="Ready"{n++} END{exit n<1}'; then
    echo "[ok] node Ready after ${i} polls"
    break
  fi
  sleep 2
done

# Copy and rewrite kubeconfig
sudo install -o ubuntu -g ubuntu -m 0600 /etc/rancher/k3s/k3s.yaml /home/ubuntu/.kube-bootstrap.yaml
sed -i "s|server: https://127\\.0\\.0\\.1:6443|server: https://${BIND_ADDR}:6443|" /home/ubuntu/.kube-bootstrap.yaml
# The server key is indented in the kubeconfig, so match any leading spaces
grep -E '^ +server:' /home/ubuntu/.kube-bootstrap.yaml
KUBECONFIG=/home/ubuntu/.kube-bootstrap.yaml kubectl get nodes
REMOTE
```

## Step 4 — Install helm and clusterctl

kubectl is provided by k3s as a symlink; do not re-install.

```bash
ssh ubuntu@ 'bash -s' << 'REMOTE'
set -euo pipefail

# helm
if ! command -v helm >/dev/null 2>&1; then
  curl -fL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
fi
helm version --short

# clusterctl: fetch the latest tag from the GitHub API.
# No fallback is scripted here; if the API call fails, set CLUSTERCTL_VER
# to a pinned release by hand and re-run the download.
if ! command -v clusterctl >/dev/null 2>&1; then
  CLUSTERCTL_VER=$(curl -fsSL --max-time 15 \
    https://api.github.com/repos/kubernetes-sigs/cluster-api/releases/latest \
    | python3 -c 'import json,sys; print(json.load(sys.stdin)["tag_name"])')
  curl -fLo /tmp/clusterctl --max-time 60 \
    "https://github.com/kubernetes-sigs/cluster-api/releases/download/${CLUSTERCTL_VER}/clusterctl-linux-amd64"
  sudo install -o root -g root -m 0755 /tmp/clusterctl /usr/local/bin/clusterctl
  rm /tmp/clusterctl
fi
clusterctl version
REMOTE
```

## Step 5 — clusterctl init with canonical-kubernetes providers

```bash
ssh ubuntu@ 'bash -s' << 'REMOTE'
set -euo pipefail

mkdir -p ~/.cluster-api
cat > ~/.cluster-api/clusterctl.yaml << 'CONFIG'
providers:
  - name: "canonical-kubernetes"
    url: "https://github.com/canonical/cluster-api-k8s/releases/latest/download/bootstrap-components.yaml"
    type: "BootstrapProvider"
  - name: "canonical-kubernetes"
    url: "https://github.com/canonical/cluster-api-k8s/releases/latest/download/control-plane-components.yaml"
    type: "ControlPlaneProvider"
CONFIG

export KUBECONFIG=/home/ubuntu/.kube-bootstrap.yaml

if kubectl get namespace capi-system >/dev/null 2>&1; then
  echo "[skip] CAPI already initialized"
else
  clusterctl init \
    --infrastructure openstack \
    --bootstrap canonical-kubernetes \
    --control-plane canonical-kubernetes
fi

# Wait for all controller deployments
for ns in cert-manager capi-system cabpck-system cacpck-system capo-system; do
  echo "[wait] ${ns}"
  kubectl wait --for=condition=Available deployment --all --namespace "${ns}" --timeout=5m
done

clusterctl version
kubectl get pods -A
REMOTE
```

Expected namespaces (note the abbreviated canonical-kubernetes names):

- `cert-manager`
- `capi-system` — cluster-api core
- `capo-system` — CAPI provider for OpenStack
- `cabpck-system` — CAPI Bootstrap Provider Canonical Kubernetes
- `cacpck-system` — CAPI Control-Plane Provider Canonical Kubernetes

## Step 6 — Install ORC (OpenStack Resource Controller)

Required by CAPO for managing OpenStack resources as Kubernetes objects. Verify the latest release URL before applying.

```bash
ssh ubuntu@ 'bash -s' << 'REMOTE'
set -euo pipefail
export KUBECONFIG=/home/ubuntu/.kube-bootstrap.yaml

ORC_URL="https://github.com/k-orc/openstack-resource-controller/releases/latest/download/install.yaml"
kubectl apply -f "$ORC_URL"

# Wait for ORC controller
sleep 5
for ns in $(kubectl get ns -o name | grep -E '^namespace/(orc|openstack-resource-controller)' | sed 's|namespace/||'); do
  echo "[wait] ${ns}"
  kubectl wait --for=condition=Available deployment --all --namespace "${ns}" --timeout=5m
done
REMOTE
```

## Step 7 — Cloud-side preparation (run from jumphost)

Inventory existing images and flavors before creating anything. Lesson from prior cycles: do not blindly create `ubuntu-24.04-capi` when `noble-amd64` is already present and suitable.
```bash
source ~/admin-openrc
openstack image list | grep -i noble
openstack flavor list
```

Create the supporting cloud-side resources for CAPO:

```bash
# Working directory for credentials and keys kept on the jumphost
mkdir -p ~/capi-mgmt

# Project
openstack project create --domain admin_domain capi-mgmt \
  --description "CAPI management cluster workloads"

# User
openstack user create --domain admin_domain --project capi-mgmt \
  --project-domain admin_domain --password-prompt capo

# Roles
openstack role add --project capi-mgmt --project-domain admin_domain \
  --user capo --user-domain admin_domain member
openstack role add --project capi-mgmt --project-domain admin_domain \
  --user capo --user-domain admin_domain load-balancer_member

# Switch to capo
unset $(env | awk -F= '/^OS_/{print $1}')
export OS_AUTH_URL=
export OS_IDENTITY_API_VERSION=3
export OS_USERNAME=capo
export OS_USER_DOMAIN_NAME=admin_domain
export OS_PROJECT_NAME=capi-mgmt
export OS_PROJECT_DOMAIN_NAME=admin_domain
export OS_PASSWORD=
export OS_CACERT=

# App credential (record id and secret immediately — secret only shown at creation)
openstack application credential create capo-app-cred \
  --description "CAPO authentication" \
  -f yaml > ~/capi-mgmt/capo-app-cred.yaml
chmod 0600 ~/capi-mgmt/capo-app-cred.yaml

# Nova keypair — generate on capi-mgmt and upload public key.
# The public key is staged under $HOME because a snap-confined openstack CLI cannot read /tmp.
ssh ubuntu@ 'ssh-keygen -t ed25519 -N "" -f ~/.ssh/capi-mgmt-key'
ssh ubuntu@ 'cat ~/.ssh/capi-mgmt-key.pub' > ~/capi-mgmt/capi-mgmt-key.pub
openstack keypair create --public-key ~/capi-mgmt/capi-mgmt-key.pub capi-mgmt-key

# Also pull the private key back to jumphost for post-rebuild access
scp -p ubuntu@:~/.ssh/capi-mgmt-key ~/capi-mgmt/capi-mgmt-key
chmod 0600 ~/capi-mgmt/capi-mgmt-key
```

## Step 8 — Compose clouds.yaml and cloud.conf

Use `v3applicationcredential` auth. It is cleaner than user/password because no account password is stored on the bootstrap host.
```bash
# Read app credential
APP_CRED_ID=$(yq -r '.id' ~/capi-mgmt/capo-app-cred.yaml)
APP_CRED_SECRET=$(yq -r '.secret' ~/capi-mgmt/capo-app-cred.yaml)

# Compose clouds.yaml for capi-mgmt
cat > /tmp/clouds.yaml << EOC
clouds:
  openstack:
    auth_type: v3applicationcredential
    auth:
      auth_url: 
      application_credential_id: ${APP_CRED_ID}
      application_credential_secret: ${APP_CRED_SECRET}
    region_name: RegionOne
    cacert: /usr/local/share/ca-certificates/vault-ca.crt
    interface: public
    identity_api_version: 3
EOC

scp /tmp/clouds.yaml ubuntu@:/home/ubuntu/clouds.yaml
ssh ubuntu@ 'chmod 0600 ~/clouds.yaml'

# cloud.conf for OCCM — use tls-insecure=true for v1 testcloud
# (v2: ship Vault CA via CK8sConfig files field instead)
cat > /tmp/cloud.conf << EOC
[Global]
auth-url=
application-credential-id=${APP_CRED_ID}
application-credential-secret=${APP_CRED_SECRET}
region=RegionOne
tls-insecure=true

[LoadBalancer]
floating-network-id=
EOC
```

## Step 9 — Render and apply the cluster manifest

The canonical-kubernetes cluster template takes 18 substitution variables. Capture them in a `cluster-env` file, then use `envsubst` to render. The template is fetched from `canonical/cluster-api-k8s`.
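Because `envsubst` replaces any unset variable with an empty string and only sees exported variables, a pre-flight check before rendering can catch a misspelled or missing name. The following is a minimal sketch, not part of the original procedure: `list_template_vars` and `check_vars_set` are hypothetical helper names, and the pattern only recognises the plain `${NAME}` form, which is an assumption about how the template is written.

```shell
# Hypothetical helpers (not from the runbook): verify that every ${NAME}
# referenced in a template is exported and non-empty before running envsubst.

# Print the distinct variable names referenced as ${NAME} in the file $1.
list_template_vars() {
  grep -o '\${[A-Za-z_][A-Za-z0-9_]*}' "$1" | tr -d '${}' | sort -u
}

# Print "missing: NAME" for each referenced variable that is absent from the
# environment or empty; return 1 if any are missing, 0 otherwise.
check_vars_set() {
  missing=0
  for name in $(list_template_vars "$1"); do
    # printenv reads the exported environment, which is what envsubst sees.
    if [ -z "$(printenv "$name")" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

On the bootstrap host this could run between fetching the template and rendering it, for example `check_vars_set /tmp/cluster-template.yaml || exit 1`, so that an incomplete `cluster-env` stops the run instead of producing a manifest with blank fields.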
Variables (verify exact names against the template at apply time):

```
CLUSTER_NAME=capi-mgmt-cluster
NAMESPACE=default
KUBERNETES_VERSION=v1.32.2
CONTROL_PLANE_MACHINE_COUNT=1
WORKER_MACHINE_COUNT=0
OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR=capi-mgmt-node
OPENSTACK_NODE_MACHINE_FLAVOR=capi-mgmt-node
OPENSTACK_DNS_NAMESERVERS=
OPENSTACK_EXTERNAL_NETWORK_ID=
OPENSTACK_FAILURE_DOMAIN=nova
OPENSTACK_IMAGE_NAME=noble-amd64
OPENSTACK_SSH_KEY_NAME=capi-mgmt-key
OPENSTACK_CLOUD_YAML_B64=$(base64 -w0 /tmp/clouds.yaml)
OPENSTACK_CLOUD_CONFIG_B64=$(base64 -w0 /tmp/cloud.conf)
OPENSTACK_CLOUD_CACERT_B64=$(base64 -w0 )
OPENSTACK_CLOUD=openstack
OPENSTACK_NODE_CIDR=10.6.0.0/24
KUBE_CONTROL_PLANE_ENDPOINT_PORT=6443
```

Render and apply:

```bash
ssh ubuntu@ 'bash -s' << 'REMOTE'
set -euo pipefail
export KUBECONFIG=/home/ubuntu/.kube-bootstrap.yaml

curl -fLo /tmp/cluster-template.yaml \
  https://github.com/canonical/cluster-api-k8s/releases/latest/download/cluster-template.yaml

# Source env vars (operator fills in /tmp/cluster-env).
# set -a exports every assignment so that envsubst, a child process, can see them.
# shellcheck disable=SC1091
set -a
source /tmp/cluster-env
set +a

envsubst < /tmp/cluster-template.yaml > /tmp/cluster-rendered.yaml
kubectl apply -f /tmp/cluster-rendered.yaml
REMOTE
```

## Step 10 — Poll for cluster Available

```bash
ssh ubuntu@ 'bash -s' << 'REMOTE'
set -euo pipefail
export KUBECONFIG=/home/ubuntu/.kube-bootstrap.yaml

START=$(date +%s)
DEADLINE=$((START + 15*60))
while [[ $(date +%s) -lt $DEADLINE ]]; do
  PHASE=$(kubectl get cluster capi-mgmt-cluster -o jsonpath='{.status.phase}' 2>/dev/null || echo "?")
  AVAILABLE=$(kubectl get cluster capi-mgmt-cluster -o jsonpath='{.status.conditions[?(@.type=="Available")].status}' 2>/dev/null || echo "?")
  ELAPSED=$(($(date +%s) - START))
  printf '[%4ds] Phase=%s Available=%s\n' "$ELAPSED" "$PHASE" "$AVAILABLE"
  [[ "$AVAILABLE" == "True" ]] && break
  sleep 15
done

clusterctl describe cluster capi-mgmt-cluster --show-conditions all
REMOTE
```

## Step 11 — Export workload kubeconfig to jumphost

```bash
ssh ubuntu@ 'bash -s' << 'REMOTE'
set -euo pipefail
export KUBECONFIG=/home/ubuntu/.kube-bootstrap.yaml

mkdir -p ~/magnum-capi
clusterctl get kubeconfig capi-mgmt-cluster > ~/magnum-capi/capi-mgmt-cluster.kubeconfig
chmod 0600 ~/magnum-capi/capi-mgmt-cluster.kubeconfig
KUBECONFIG=~/magnum-capi/capi-mgmt-cluster.kubeconfig kubectl get nodes
REMOTE

# Copy to jumphost for runbook 05 (create the target directory first)
mkdir -p ~/magnum-capi
scp -p ubuntu@:~/magnum-capi/capi-mgmt-cluster.kubeconfig ~/magnum-capi/capi-mgmt-cluster.kubeconfig
chmod 0600 ~/magnum-capi/capi-mgmt-cluster.kubeconfig
```

## Exit criteria

- `capi-mgmt.maas` is Deployed in MAAS with k3s + CAPI controllers + ORC running
- `capi-mgmt-cluster` workload cluster is Available
- Workload kubeconfig exists at `~/magnum-capi/capi-mgmt-cluster.kubeconfig` on the jumphost
- Proceed to `05-magnum-capi-driver.md`

## Recurring pitfalls (apply to execution)

- `juju ssh` HANGS when stdout is redirected — use `juju exec --unit X -- 'cmd'`
- MAAS-deployed Ubuntu uses `ubuntu` user, not `jessea123`
- k3s `--bind-address=X` doesn't bind 127.0.0.1 — kubeconfig server URL must be sed-rewritten
- Snap-confined openstack CLI cannot read `/tmp` — paths under `$HOME` only
- `openstack -f value -c X -c Y` outputs in alphabetical column order — use single-column queries
- GitHub API rate limit is 60 unauthenticated requests/hour — cache results, don't refetch on every run
- `.maas` DNS may not resolve from jumphost — use IPs directly
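
The rate-limit pitfall above suggests caching lookups such as the latest clusterctl release tag. The following is a minimal sketch of one way to do that, not part of the original procedure: `cached_fetch` is a hypothetical helper name, the cache path and lifetime are the caller's choice, and it relies on `find -mmin`, which GNU findutils on Ubuntu provides.

```shell
# Hypothetical helper (not from the runbook): reuse a recent result instead of
# calling a rate-limited API on every run.
#
# cached_fetch CACHE_FILE MAX_AGE_MINUTES COMMAND [ARGS...]
# Prints the cached content if CACHE_FILE is non-empty and was modified within
# MAX_AGE_MINUTES; otherwise runs COMMAND, stores its output, and prints it.
cached_fetch() {
  cache_file=$1
  max_age_min=$2
  shift 2

  # find prints the file only when it is older than the limit, so empty
  # output means the cache is still fresh.
  if [ -s "$cache_file" ] && \
     [ -z "$(find "$cache_file" -mmin +"$max_age_min" 2>/dev/null)" ]; then
    cat "$cache_file"
    return 0
  fi

  # Write to a temporary file first so a failed fetch never replaces a good cache.
  tmp_file="${cache_file}.tmp"
  if "$@" > "$tmp_file"; then
    mv "$tmp_file" "$cache_file"
    cat "$cache_file"
  else
    rm -f "$tmp_file"
    return 1
  fi
}
```

In Step 4 the GitHub API pipeline could be wrapped in a small function and called as `cached_fetch "$HOME/.clusterctl-latest-tag" 60 that_function`, where the file name and the 60-minute lifetime are illustrative values only.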