# Runbook 04a — CAPI bootstrap cluster install on capi-mgmt.maas

**Reference:** D-017 (full rebuild every cycle). Runs after `04-magnum-domain.md`
and before `05-magnum-capi-driver.md`.

**Goal:** Starting from a `capi-mgmt` VM in the MAAS **Ready** state, produce
a single-node k3s cluster running cluster-api, CAPO, the canonical-kubernetes
providers, cert-manager, and ORC, and deliver a workload-cluster kubeconfig
to the jumphost for use by the Magnum CAPI driver in runbook 05.

**Pre-conditions:**

- OpenStack cloud is up and stable (`02-deploy.md` complete, all units
  active/idle)
- Magnum trustee domain is created (`04-magnum-domain.md` complete)
- `capi-mgmt` MAAS machine is in **Ready** state (released after teardown,
  not yet deployed)
- Jumphost has `~/admin-openrc` sourced and an authenticated `openstack`
  CLI working against the new Caracal cloud
- Vault CA bundle is available on the jumphost at a known path
  (issued by the Caracal Vault during `03-vault-init.md`)

**Network preconditions:**

- `capi-mgmt` machine should be configured in MAAS with two interfaces:
  - `eth0` on the metal fabric (DHCP from MAAS) — used for k3s API bind
  - `eth1` on the provider fabric (static IP, no DHCP) — used for
    workload-cluster FIP reach. This IP must NOT fall inside the Neutron
    FIP allocation pool on the ext_net subnet.
- Verify the eth1 IP is outside the FIP pool before deploy:
  ```bash
  openstack subnet show <ext_net_subnet> -c allocation_pools -c gateway_ip
  ```
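
The comparison against the pool can be done by eye, but a small helper makes it repeatable across rebuild cycles. The following is a minimal sketch, not part of the original procedure: `ip_to_int` and `ip_in_pool` are hypothetical helper names, the addresses are assumed to be plain dotted-quad IPv4 without leading zeros, and the pool start and end are assumed to be copied from the `allocation_pools` output above.

```bash
# Hypothetical helpers (not part of the runbook tooling).
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  local ip=$1 a b c d
  a=${ip%%.*}; ip=${ip#*.}
  b=${ip%%.*}; ip=${ip#*.}
  c=${ip%%.*}; d=${ip#*.}
  echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
}

# Return 0 if address $1 lies inside the inclusive range $2..$3, else 1.
ip_in_pool() {
  local addr start end
  addr=$(ip_to_int "$1")
  start=$(ip_to_int "$2")
  end=$(ip_to_int "$3")
  [ "$addr" -ge "$start" ] && [ "$addr" -le "$end" ]
}
```

A return status of 0 means the eth1 address collides with the FIP pool and a different static IP should be chosen before deploying.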

## Step 1 — Deploy Ubuntu 24.04 to capi-mgmt via MAAS

Use MAAS UI: Machines → capi-mgmt → Take action → Deploy → Ubuntu 24.04
LTS (Noble) → Deploy machine. Wait for Deployed status (~10 min).

Verify SSH reachability once Deployed. MAAS cloud-init creates the `ubuntu`
user, so log in as `ubuntu`, not `jessea123`:

```bash
ssh ubuntu@<eth0-ip> 'hostname; uname -a; ip -br a'
```

Verify both interfaces show their expected IPs.

## Step 2 — Install Vault CA on the bootstrap host

The bootstrap host must trust the Caracal Vault root CA so that `openstack`
CLI calls and CAPO authentication to Keystone succeed over HTTPS.

```bash
# From jumphost — replace <eth0-ip> with the deployed capi-mgmt IP
scp <vault-ca-path>/vault-ca.crt ubuntu@<eth0-ip>:/tmp/vault-ca.crt

ssh ubuntu@<eth0-ip> 'bash -s' << 'REMOTE'
set -euo pipefail
sudo install -m 0644 /tmp/vault-ca.crt /usr/local/share/ca-certificates/vault-ca.crt
sudo update-ca-certificates
# Verify Keystone is reachable over TLS using the updated system bundle
curl --cacert /etc/ssl/certs/ca-certificates.crt https://<keystone-internal>:5000/v3 -s -o /dev/null -w "%{http_code}\n"
# Expect: 200
REMOTE
```
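
`update-ca-certificates` expects PEM-encoded files with a `.crt` extension, so it can help to confirm the bundle actually contains a certificate before copying it. The helper below is a small sketch under that assumption; `pem_cert_count` is a hypothetical name, and it only counts PEM headers without validating the certificate itself.

```bash
# Hypothetical pre-flight helper: count PEM certificate blocks in a file.
# Prints 0 for an unreadable, empty, or non-PEM file.
pem_cert_count() {
  if [ ! -r "$1" ]; then
    echo 0
    return 0
  fi
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1" || true
}
```

A count of 0 for `<vault-ca-path>/vault-ca.crt` suggests the file is missing or DER-encoded and should be re-exported before Step 2 continues.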

## Step 3 — Install k3s

k3s defaults to binding 0.0.0.0:6443. Bind to the metal-network IP only to
keep the management API off the provider network. The TLS-SAN flags must
include both the IP and the FQDN. k3s does NOT auto-add 127.0.0.1 to the
SAN list; if 127.0.0.1 needs to be in the kubeconfig, add it explicitly as
a `--tls-san`. We do not — we rewrite the kubeconfig server URL instead.

```bash
ssh ubuntu@<eth0-ip> 'bash -s' << 'REMOTE'
set -euo pipefail
BIND_ADDR=$(ip -4 -br a show eth0 | awk '{print $3}' | cut -d/ -f1)
echo "bind addr: $BIND_ADDR"

if systemctl is-active --quiet k3s; then
  echo "[skip] k3s already running"
else
  curl -sfL https://get.k3s.io | \
    INSTALL_K3S_EXEC="server \
      --bind-address=${BIND_ADDR} \
      --advertise-address=${BIND_ADDR} \
      --node-ip=${BIND_ADDR} \
      --tls-san=${BIND_ADDR} \
      --tls-san=capi-mgmt.maas \
      --write-kubeconfig-mode=0644 \
      --disable=traefik" \
    sh -
fi

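# Wait for Ready (30 polls x 2 s); fail loudly if the node never becomes Ready
READY=0
for i in $(seq 1 30); do
  if sudo k3s kubectl get nodes 2>/dev/null | awk 'NR>1 && $2=="Ready"{n++} END{exit n<1}'; then
    echo "[ok] node Ready after ${i} polls"
    READY=1
    break
  fi
  sleep 2
done
[[ "$READY" -eq 1 ]] || { echo "[fail] node not Ready after 60s" >&2; exit 1; }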

# Copy and rewrite kubeconfig
sudo install -o ubuntu -g ubuntu -m 0600 /etc/rancher/k3s/k3s.yaml /home/ubuntu/.kube-bootstrap.yaml
sed -i "s|server: https://127\\.0\\.0\\.1:6443|server: https://${BIND_ADDR}:6443|" /home/ubuntu/.kube-bootstrap.yaml
grep '^    server:' /home/ubuntu/.kube-bootstrap.yaml

KUBECONFIG=/home/ubuntu/.kube-bootstrap.yaml kubectl get nodes
REMOTE
```
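
Because the rewrite relies on a `sed` pattern that silently does nothing when it fails to match, a follow-up check on the resulting file can be useful. This is a minimal sketch; `kubeconfig_points_at` is a hypothetical helper, and it assumes the API port is 6443 as in the install command above.

```bash
# Hypothetical helper: succeed only if every `server:` entry in the
# kubeconfig file $1 is exactly https://<address $2>:6443.
kubeconfig_points_at() {
  local expected="https://$2:6443" servers
  # Collect the distinct server URLs recorded in the file.
  servers=$(awk '$1 == "server:" {print $2}' "$1" | sort -u)
  [ "$servers" = "$expected" ]
}
```

On the bootstrap host this could be called as `kubeconfig_points_at /home/ubuntu/.kube-bootstrap.yaml "$BIND_ADDR"`, with a non-zero status indicating that the server URL still needs attention.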

## Step 4 — Install helm and clusterctl

kubectl is provided by k3s as a symlink; do not re-install.

```bash
ssh ubuntu@<eth0-ip> 'bash -s' << 'REMOTE'
set -euo pipefail

# helm
if ! command -v helm >/dev/null 2>&1; then
  curl -fL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
fi
helm version --short

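# clusterctl: fetch the latest tag from the GitHub API. If that call fails
# (for example, rate limiting), fall back to a pinned version; replace
# <pinned-clusterctl-version> with a release tag you have verified.
if ! command -v clusterctl >/dev/null 2>&1; then
  CLUSTERCTL_VER=$(curl -fsSL --max-time 15 \
    https://api.github.com/repos/kubernetes-sigs/cluster-api/releases/latest \
    | python3 -c 'import json,sys; print(json.load(sys.stdin)["tag_name"])') \
    || CLUSTERCTL_VER="<pinned-clusterctl-version>"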
  curl -fLo /tmp/clusterctl --max-time 60 \
    "https://github.com/kubernetes-sigs/cluster-api/releases/download/${CLUSTERCTL_VER}/clusterctl-linux-amd64"
  sudo install -o root -g root -m 0755 /tmp/clusterctl /usr/local/bin/clusterctl
  rm /tmp/clusterctl
fi
clusterctl version
REMOTE
```
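
The GitHub API lookup above counts against the unauthenticated rate limit on every run. One way to avoid refetching is a small time-based cache. The sketch below is an illustration only: `cached_output` is a hypothetical helper, the cache location is up to the operator, and it relies on GNU `stat` and `mktemp` as shipped on Ubuntu.

```bash
# Hypothetical helper: run a command at most once per TTL and cache its stdout.
# Usage: cached_output <cache-file> <ttl-seconds> <command> [args...]
cached_output() {
  local cache=$1 ttl=$2 now mtime tmp
  shift 2
  now=$(date +%s)
  if [ -s "$cache" ]; then
    mtime=$(stat -c %Y "$cache")   # GNU stat: modification time in epoch seconds
    if [ $((now - mtime)) -lt "$ttl" ]; then
      cat "$cache"
      return 0
    fi
  fi
  # Write to a temp file first so a failed fetch never overwrites a good cache.
  tmp=$(mktemp "${cache}.XXXXXX") || return 1
  if "$@" > "$tmp"; then
    mv "$tmp" "$cache"
    cat "$cache"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

To use it, wrap the `curl ... | python3 ...` pipeline from this step in a shell function and pass that function name as the command, with a TTL such as 3600 seconds.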

## Step 5 — clusterctl init with canonical-kubernetes providers

```bash
ssh ubuntu@<eth0-ip> 'bash -s' << 'REMOTE'
set -euo pipefail

mkdir -p ~/.cluster-api
cat > ~/.cluster-api/clusterctl.yaml << 'CONFIG'
providers:
  - name: "canonical-kubernetes"
    url: "https://github.com/canonical/cluster-api-k8s/releases/latest/download/bootstrap-components.yaml"
    type: "BootstrapProvider"
  - name: "canonical-kubernetes"
    url: "https://github.com/canonical/cluster-api-k8s/releases/latest/download/control-plane-components.yaml"
    type: "ControlPlaneProvider"
CONFIG

export KUBECONFIG=/home/ubuntu/.kube-bootstrap.yaml

if kubectl get namespace capi-system >/dev/null 2>&1; then
  echo "[skip] CAPI already initialized"
else
  clusterctl init \
    --infrastructure openstack \
    --bootstrap canonical-kubernetes \
    --control-plane canonical-kubernetes
fi

# Wait for all controller deployments
for ns in cert-manager capi-system cabpck-system cacpck-system capo-system; do
  echo "[wait] ${ns}"
  kubectl wait --for=condition=Available deployment --all --namespace "${ns}" --timeout=5m
done

clusterctl version
kubectl get pods -A
REMOTE
```

Expected namespaces (note the abbreviated canonical-kubernetes names):

- `cert-manager`
- `capi-system` — cluster-api core
- `capo-system` — CAPI provider for OpenStack
- `cabpck-system` — CAPI Bootstrap Provider Canonical Kubernetes
- `cacpck-system` — CAPI Control-Plane Provider Canonical Kubernetes

## Step 6 — Install ORC (OpenStack Resource Controller)

Required by CAPO for managing OpenStack resources as Kubernetes objects.
Verify the latest release URL before applying.

```bash
ssh ubuntu@<eth0-ip> 'bash -s' << 'REMOTE'
set -euo pipefail
export KUBECONFIG=/home/ubuntu/.kube-bootstrap.yaml

ORC_URL="https://github.com/k-orc/openstack-resource-controller/releases/latest/download/install.yaml"
kubectl apply -f "$ORC_URL"

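# Wait for ORC controller; fail if no ORC namespace appeared
sleep 5
ORC_NAMESPACES=$(kubectl get ns -o name | grep -E '^namespace/(orc|openstack-resource-controller)' | sed 's|namespace/||' || true)
[[ -n "$ORC_NAMESPACES" ]] || { echo "[fail] no ORC namespace found" >&2; exit 1; }
for ns in $ORC_NAMESPACES; do
  echo "[wait] ${ns}"
  kubectl wait --for=condition=Available deployment --all --namespace "${ns}" --timeout=5m
done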
REMOTE
```

## Step 7 — Cloud-side preparation (run from jumphost)

Inventory existing images and flavors before creating. Lesson from prior
cycles: do not blindly create `ubuntu-24.04-capi` when `noble-amd64` is
already present and suitable.

```bash
source ~/admin-openrc
openstack image list | grep -i noble
openstack flavor list
```
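
To make the "reuse before create" decision mechanical, the inventory can be checked against a preference list. This is a minimal sketch; `pick_existing_image` is a hypothetical helper, and the candidate names are simply the two mentioned above.

```bash
# Hypothetical helper: read image names on stdin (one per line) and print the
# first candidate, in argument order, that already exists. Returns 1 if none do.
pick_existing_image() {
  local available name
  available=$(cat)
  for name in "$@"; do
    if printf '%s\n' "$available" | grep -qxF -- "$name"; then
      echo "$name"
      return 0
    fi
  done
  return 1
}
```

A typical call would be `openstack image list -f value -c Name | pick_existing_image noble-amd64 ubuntu-24.04-capi`; only a non-zero status indicates that a new image needs to be created.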

Create the supporting cloud-side resources for CAPO:

```bash
# Project
openstack project create --domain admin_domain capi-mgmt \
  --description "CAPI management cluster workloads"

# User
openstack user create --domain admin_domain --project capi-mgmt \
  --project-domain admin_domain --password-prompt capo

# Roles
openstack role add --project capi-mgmt --project-domain admin_domain \
  --user capo --user-domain admin_domain member
openstack role add --project capi-mgmt --project-domain admin_domain \
  --user capo --user-domain admin_domain load-balancer_member

# Switch to capo
unset $(env | awk -F= '/^OS_/{print $1}')
export OS_AUTH_URL=https://<keystone-internal>:5000/v3
export OS_IDENTITY_API_VERSION=3
export OS_USERNAME=capo
export OS_USER_DOMAIN_NAME=admin_domain
export OS_PROJECT_NAME=capi-mgmt
export OS_PROJECT_DOMAIN_NAME=admin_domain
export OS_PASSWORD=<the-password-you-set>
export OS_CACERT=<vault-ca-path>/vault-ca.crt

# App credential (record id and secret immediately; the secret is shown only at creation)
mkdir -p ~/capi-mgmt && chmod 0700 ~/capi-mgmt
openstack application credential create capo-app-cred \
  --description "CAPO authentication" \
  -f yaml > ~/capi-mgmt/capo-app-cred.yaml
chmod 0600 ~/capi-mgmt/capo-app-cred.yaml

# Nova keypair: generate on capi-mgmt and upload the public key.
# The public key is staged under $HOME because a snap-confined openstack CLI
# cannot read /tmp.
ssh ubuntu@<eth0-ip> 'ssh-keygen -t ed25519 -N "" -f ~/.ssh/capi-mgmt-key'
ssh ubuntu@<eth0-ip> 'cat ~/.ssh/capi-mgmt-key.pub' > ~/capi-mgmt/capi-mgmt-key.pub
openstack keypair create --public-key ~/capi-mgmt/capi-mgmt-key.pub capi-mgmt-key
# Also pull the private key back to jumphost for post-rebuild access
scp -p ubuntu@<eth0-ip>:~/.ssh/capi-mgmt-key ~/capi-mgmt/capi-mgmt-key
chmod 0600 ~/capi-mgmt/capi-mgmt-key
```

## Step 8 — Compose clouds.yaml and cloud.conf

Use `v3applicationcredential` auth — cleaner than user/password.

```bash
# Read app credential
APP_CRED_ID=$(yq -r '.id' ~/capi-mgmt/capo-app-cred.yaml)
APP_CRED_SECRET=$(yq -r '.secret' ~/capi-mgmt/capo-app-cred.yaml)

# Compose clouds.yaml for capi-mgmt
cat > /tmp/clouds.yaml << EOC
clouds:
  openstack:
    auth_type: v3applicationcredential
    auth:
      auth_url: https://<keystone-internal>:5000/v3
      application_credential_id: ${APP_CRED_ID}
      application_credential_secret: ${APP_CRED_SECRET}
    region_name: RegionOne
    cacert: /usr/local/share/ca-certificates/vault-ca.crt
    interface: public
    identity_api_version: 3
EOC

scp /tmp/clouds.yaml ubuntu@<eth0-ip>:/home/ubuntu/clouds.yaml
ssh ubuntu@<eth0-ip> 'chmod 0600 ~/clouds.yaml'

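# cloud.conf for OCCM: use tls-insecure=true for v1 testcloud
# (v2: ship Vault CA via CK8sConfig files field instead)
cat > /tmp/cloud.conf << EOC
[Global]
auth-url=https://<keystone-internal>:5000/v3
application-credential-id=${APP_CRED_ID}
application-credential-secret=${APP_CRED_SECRET}
region=RegionOne
tls-insecure=true

[LoadBalancer]
floating-network-id=<ext-net-uuid>
EOC

# cloud.conf is consumed on capi-mgmt when the cluster manifest is rendered,
# so copy it alongside clouds.yaml
scp /tmp/cloud.conf ubuntu@<eth0-ip>:/home/ubuntu/cloud.conf
ssh ubuntu@<eth0-ip> 'chmod 0600 ~/cloud.conf'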
```
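
If `yq` fails to find a field, the heredoc above still renders, just with an empty or `null` credential. A small post-render check can catch that before the file is shipped. This is a sketch only; `clouds_yaml_has_creds` is a hypothetical helper that assumes the flat `key: value` layout written above rather than parsing YAML in general.

```bash
# Hypothetical check: fail if either application-credential field in a
# rendered clouds.yaml is empty or the literal string "null".
clouds_yaml_has_creds() {
  local file=$1 key value
  for key in application_credential_id application_credential_secret; do
    # Pick the value that follows "<key>:" on its line.
    value=$(awk -v k="${key}:" '$1 == k {print $2}' "$file")
    if [ -z "$value" ] || [ "$value" = "null" ]; then
      return 1
    fi
  done
}
```

Running `clouds_yaml_has_creds /tmp/clouds.yaml` on the jumphost before the `scp` gives an early failure instead of an authentication error from CAPO later.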

## Step 9 — Render and apply the cluster manifest

The canonical-kubernetes cluster template takes 18 substitution variables.
Capture them in a `cluster-env` file, then use `envsubst` to render. The
template is fetched from `canonical/cluster-api-k8s`.

Variables (verify exact names against the template at apply time). The
`$(...)` expressions are evaluated on `capi-mgmt` when `/tmp/cluster-env` is
sourced, so every file they reference must exist on that host: `clouds.yaml`
and `cloud.conf` from Step 8 under `/home/ubuntu` (copy `cloud.conf` there if
it is not already present), and the Vault CA installed in Step 2.

```
CLUSTER_NAME=capi-mgmt-cluster
NAMESPACE=default
KUBERNETES_VERSION=v1.32.2
CONTROL_PLANE_MACHINE_COUNT=1
WORKER_MACHINE_COUNT=0
OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR=capi-mgmt-node
OPENSTACK_NODE_MACHINE_FLAVOR=capi-mgmt-node
OPENSTACK_DNS_NAMESERVERS=<dns-server-ips>
OPENSTACK_EXTERNAL_NETWORK_ID=<ext-net-uuid>
OPENSTACK_FAILURE_DOMAIN=nova
OPENSTACK_IMAGE_NAME=noble-amd64
OPENSTACK_SSH_KEY_NAME=capi-mgmt-key
OPENSTACK_CLOUD_YAML_B64=$(base64 -w0 /home/ubuntu/clouds.yaml)
OPENSTACK_CLOUD_CONFIG_B64=$(base64 -w0 /home/ubuntu/cloud.conf)
OPENSTACK_CLOUD_CACERT_B64=$(base64 -w0 /usr/local/share/ca-certificates/vault-ca.crt)
OPENSTACK_CLOUD=openstack
OPENSTACK_NODE_CIDR=10.6.0.0/24
KUBE_CONTROL_PLANE_ENDPOINT_PORT=6443
```

Render and apply:

```bash
ssh ubuntu@<eth0-ip> 'bash -s' << 'REMOTE'
set -euo pipefail
export KUBECONFIG=/home/ubuntu/.kube-bootstrap.yaml

curl -fLo /tmp/cluster-template.yaml \
  https://github.com/canonical/cluster-api-k8s/releases/latest/download/cluster-template.yaml

# Source env vars (operator fills in /tmp/cluster-env).
# set -a exports every assignment; envsubst only sees exported variables.
# shellcheck disable=SC1091
set -a
source /tmp/cluster-env
set +a

envsubst < /tmp/cluster-template.yaml > /tmp/cluster-rendered.yaml
kubectl apply -f /tmp/cluster-rendered.yaml
REMOTE
```
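
`envsubst` replaces any variable it cannot find with an empty string, so a typo in `cluster-env` produces a manifest that applies but is wrong. A pre-render check can list the gaps. This is a minimal sketch; `missing_template_vars` is a hypothetical helper, and it only recognises the `${NAME}` form of reference.

```bash
# Hypothetical pre-render check: print every ${VAR} referenced by template $1
# that is unset or empty in the exported environment envsubst will see.
missing_template_vars() {
  local name
  # Extract "${NAME" prefixes, strip the "${", and de-duplicate.
  { grep -o '\${[A-Za-z_][A-Za-z0-9_]*' "$1" || true; } |
    sed 's/^\${//' | sort -u |
    while read -r name; do
      # printenv only reports exported variables, matching envsubst's view.
      [ -n "$(printenv "$name")" ] || echo "$name"
    done
}
```

Run it after sourcing the env file and before `envsubst`; any name it prints should be filled in (or confirmed as intentionally empty) before the manifest is applied.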

## Step 10 — Poll for cluster Available

```bash
ssh ubuntu@<eth0-ip> 'bash -s' << 'REMOTE'
set -euo pipefail
export KUBECONFIG=/home/ubuntu/.kube-bootstrap.yaml

START=$(date +%s)
DEADLINE=$((START + 15*60))

while [[ $(date +%s) -lt $DEADLINE ]]; do
  PHASE=$(kubectl get cluster capi-mgmt-cluster -o jsonpath='{.status.phase}' 2>/dev/null || echo "?")
  AVAILABLE=$(kubectl get cluster capi-mgmt-cluster -o jsonpath='{.status.conditions[?(@.type=="Available")].status}' 2>/dev/null || echo "?")
  ELAPSED=$(($(date +%s) - START))
  printf '[%4ds] Phase=%s Available=%s\n' "$ELAPSED" "$PHASE" "$AVAILABLE"
  [[ "$AVAILABLE" == "True" ]] && break
  sleep 15
done

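clusterctl describe cluster capi-mgmt-cluster --show-conditions all

# Exit non-zero if the 15-minute deadline passed without the cluster becoming Available
[[ "$AVAILABLE" == "True" ]] || { echo "[fail] cluster not Available after 15 min" >&2; exit 1; }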
REMOTE
```
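
The same poll-with-deadline pattern appears in Step 3 and here, so it may be worth factoring out. The helper below is a sketch of one way to do that, not the runbook's established tooling; `poll_until` is a hypothetical name.

```bash
# Hypothetical helper: retry a command until it succeeds or a deadline passes.
# Usage: poll_until <timeout-seconds> <interval-seconds> <command> [args...]
# Returns 0 on the first success, 1 if the timeout elapses first.
poll_until() {
  local timeout=$1 interval=$2 deadline
  shift 2
  deadline=$(( $(date +%s) + timeout ))
  while :; do
    "$@" && return 0
    [ "$(date +%s)" -ge "$deadline" ] && return 1
    sleep "$interval"
  done
}
```

With the `Available` check wrapped in a function, the loop above would reduce to something like `poll_until 900 15 cluster_available`, where `cluster_available` is an operator-defined function around the `kubectl get cluster` query.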

## Step 11 — Export workload kubeconfig to jumphost

```bash
ssh ubuntu@<eth0-ip> 'bash -s' << 'REMOTE'
set -euo pipefail
export KUBECONFIG=/home/ubuntu/.kube-bootstrap.yaml
mkdir -p ~/magnum-capi
clusterctl get kubeconfig capi-mgmt-cluster > ~/magnum-capi/capi-mgmt-cluster.kubeconfig
chmod 0600 ~/magnum-capi/capi-mgmt-cluster.kubeconfig
KUBECONFIG=~/magnum-capi/capi-mgmt-cluster.kubeconfig kubectl get nodes
REMOTE

# Copy to jumphost for runbook 05
mkdir -p ~/magnum-capi
scp -p ubuntu@<eth0-ip>:~/magnum-capi/capi-mgmt-cluster.kubeconfig ~/magnum-capi/capi-mgmt-cluster.kubeconfig
chmod 0600 ~/magnum-capi/capi-mgmt-cluster.kubeconfig
```

## Exit criteria

- `capi-mgmt.maas` is Deployed in MAAS with k3s + CAPI controllers + ORC running
- `capi-mgmt-cluster` workload cluster is Available
- Workload kubeconfig exists at `~/magnum-capi/capi-mgmt-cluster.kubeconfig`
  on the jumphost
- Proceed to `05-magnum-capi-driver.md`

## Recurring pitfalls (apply to execution)

- `juju ssh` HANGS when stdout is redirected — use `juju exec --unit X -- 'cmd'`
- MAAS-deployed Ubuntu uses `ubuntu` user, not `jessea123`
- k3s `--bind-address=X` doesn't bind 127.0.0.1 — kubeconfig server URL must be sed-rewritten
- Snap-confined openstack CLI cannot read `/tmp` — paths under `$HOME` only
- `openstack -f value -c X -c Y` outputs in alphabetical column order — use single-column queries
- GitHub API rate limit is 60 unauthenticated requests/hour — cache results, don't refetch on every run
- `.maas` DNS may not resolve from jumphost — use IPs directly
