# Runbook 00 — Pre-Deploy

## Purpose

Prepare for a clean Caracal rebuild of the VR0 DC0 Omega Cloud. Capture all
state needed for rollback, gracefully tear down dependent workloads, and verify
the destination environment is ready before destroying the existing OpenStack
model.

## Prerequisites

- SSH access to jumphost `vopenstack-jesse` as `jessea123`
- `admin-openrc` and `user1-openrc` available in `$HOME`
- Access to the Juju controller hosting the `openstack` model
- Access to the capi-mgmt.maas k3s cluster (kubeconfig present)
- NetBox IPv4 imports completed (per `netbox/ipv4-prefixes-import.py`)
- NetBox VLAN imports completed (per `netbox/vlans-import.py`)
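
The prerequisites above can be checked mechanically before starting. The following is a minimal sketch, not part of the original tooling: `check_prereqs` and its `file:` / `cmd:` prefixes are hypothetical names introduced here for illustration.

```bash
# Hypothetical helper: report every missing prerequisite in one pass.
# Arguments are "file:<path>" (must be readable) or "cmd:<name>" (must be on PATH).
check_prereqs() {
  local missing=0 item
  for item in "$@"; do
    case "$item" in
      cmd:*)
        command -v "${item#cmd:}" >/dev/null 2>&1 \
          || { echo "MISSING command: ${item#cmd:}"; missing=1; }
        ;;
      file:*)
        [ -r "${item#file:}" ] \
          || { echo "MISSING file: ${item#file:}"; missing=1; }
        ;;
    esac
  done
  # Non-zero exit if anything was missing
  return "$missing"
}
```

For example, on the jumphost: `check_prereqs "file:$HOME/admin-openrc" "file:$HOME/user1-openrc" cmd:juju cmd:kubectl cmd:openstack cmd:virsh`.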

## Phase 1 — Verify NetBox readiness (gating)

Run both NetBox import scripts in `--verify-only` mode and confirm that every
entry is correctly scoped to VR0 DC0. Do not continue until this phase passes.

```bash
cd ~/vr0-dc0-caracal
NETBOX_URL=https://netbox.baldurkeep.com NETBOX_TOKEN=<token> \
  python3 netbox/ipv4-prefixes-import.py --verify-only
NETBOX_URL=https://netbox.baldurkeep.com NETBOX_TOKEN=<token> \
  python3 netbox/vlans-import.py --verify-only
```

Expected: all prefixes and VLANs report scope-OK, no MISSING entries.
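
To turn the expected result into a hard gate, the script output can be captured to a log and scanned. This is a sketch under one loud assumption: that the import scripts print the literal token `MISSING` for each absent entry, as the expected-output line suggests. `netbox_gate` is a hypothetical helper name.

```bash
# Hypothetical gate. ASSUMPTION: each absent entry is reported with the
# literal token MISSING in the script output captured in the log file.
netbox_gate() {
  local log="$1"
  # An empty or absent log means verification never ran; treat as failure
  if [ ! -s "$log" ]; then
    echo "NetBox verification FAILED: no output captured in $log"
    return 1
  fi
  if grep -q 'MISSING' "$log"; then
    echo "NetBox verification FAILED: $(grep -c 'MISSING' "$log") MISSING line(s)"
    return 1
  fi
  echo "NetBox verification clean"
}
```

One possible use is to pipe each `--verify-only` run through `tee /tmp/netbox-verify.log` (an illustrative path) and then call `netbox_gate /tmp/netbox-verify.log`.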

## Phase 2 — Capture current state

Backups needed for potential rollback:

```bash
# One dated backup directory, created up front and reused below
BACKUP_DIR=~/backups/$(date +%F)
mkdir -p "$BACKUP_DIR"

# Vault root CA cert
juju ssh --model openstack vault/0 -- sudo cat /var/snap/vault/common/vault.crt > "$BACKUP_DIR/vault-root-ca.crt"
# (Unseal keys MUST be on file from initial Vault setup; verify presence)
ls -la ~/.vault-keys

# Export current bundle
juju export-bundle --model openstack > "$BACKUP_DIR/bundle-pre-rebuild.yaml"

# Snapshot of current 'juju status'
juju status --model openstack --format=yaml > "$BACKUP_DIR/juju-status-pre-rebuild.yaml"

# Inventory of FIPs and tenant resources we might want to recreate
source ~/admin-openrc
openstack floating ip list -c "Floating IP Address" -c "Fixed IP Address" \
  -c "Project" -f csv > "$BACKUP_DIR/floating-ips.csv"
openstack server list --all-projects -c ID -c Name -c Project -c Status -f csv \
  > "$BACKUP_DIR/servers.csv"
openstack network list --all-projects -c ID -c Name -c Project -f csv \
  > "$BACKUP_DIR/networks.csv"
openstack loadbalancer list -c id -c name -c project_id -c vip_address -f csv \
  > "$BACKUP_DIR/loadbalancers.csv"
```
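
Shell redirection creates the output file even when the command on the left fails, so a file existing is not proof that the capture worked. A minimal sketch of a non-empty check over the files written in this phase follows; `verify_backups` is a hypothetical helper name.

```bash
# Hypothetical helper: confirm each Phase 2 capture exists and is non-empty.
verify_backups() {
  local dir="$1" f rc=0
  for f in vault-root-ca.crt bundle-pre-rebuild.yaml juju-status-pre-rebuild.yaml \
           floating-ips.csv servers.csv networks.csv loadbalancers.csv; do
    if [ -s "$dir/$f" ]; then
      echo "OK    $f"
    else
      echo "EMPTY $f"   # missing or zero bytes
      rc=1
    fi
  done
  return "$rc"
}
```

For example: `verify_backups ~/backups/$(date +%F)`. A CSV containing only a header row still passes this check, so it is worth opening the inventories once by hand.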

## Phase 3 — KVM snapshots of openstack0-3

From the jumphost (which is the hypervisor):

```bash
for vm in openstack0 openstack1 openstack2 openstack3; do
  sudo virsh snapshot-create-as --domain "$vm" \
    --name "pre-caracal-rebuild-$(date +%F)" \
    --description "Pre-Caracal rebuild baseline" \
    --atomic
done
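# Confirm the snapshot exists on every VM, not just openstack0
for vm in openstack0 openstack1 openstack2 openstack3; do
  sudo virsh snapshot-list "$vm"
done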
```

These snapshots are the disaster-recovery point.
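
Because the rebuild depends on all four snapshots, it may help to check for the named snapshot on each VM and fail loudly if one is absent. The sketch below assumes the snapshot name used above; `check_snapshots` and the `VIRSH` override variable are hypothetical and exist only to keep the helper testable.

```bash
# Hypothetical helper: check that a named snapshot exists on each listed VM.
# VIRSH is an illustrative override; it defaults to "sudo virsh".
check_snapshots() {
  local snap="$1" vm rc=0
  shift
  for vm in "$@"; do
    # snapshot-list --name prints one snapshot name per line
    if ${VIRSH:-sudo virsh} snapshot-list "$vm" --name | grep -qxF -- "$snap"; then
      echo "OK      $vm"
    else
      echo "MISSING $vm"
      rc=1
    fi
  done
  return "$rc"
}
```

For example: `check_snapshots "pre-caracal-rebuild-$(date +%F)" openstack0 openstack1 openstack2 openstack3`.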

## Phase 4 — Graceful CAPI workload teardown (D-013)

Delete the CAPI workload cluster cleanly so its OpenStack resources (LBs, FIPs,
volumes, Octavia members) are released by CAPI controllers before model destroy.

```bash
export KUBECONFIG=~/magnum-capi/phase3/capi-mgmt-cluster.kubeconfig
# (Adjust path if kubeconfig has moved)

# Delete the workload cluster; CAPI handles tenant OpenStack cleanup.
# --wait=false returns immediately so the bounded wait below enforces the timeout.
kubectl delete cluster capi-mgmt-cluster -n default --wait=false
# Wait for finalizers; this may take ~10 minutes
kubectl wait --for=delete cluster/capi-mgmt-cluster -n default --timeout=15m
```

Verify on the OpenStack side that resources were released:

```bash
source ~/admin-openrc
openstack server list --all-projects | grep -i capi || echo "No CAPI servers remaining"
openstack loadbalancer list | grep -i capi || echo "No CAPI LBs remaining"
openstack floating ip list -c "Floating IP Address" -c "Fixed IP Address" -f csv
```
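
The checks above print a message but always exit zero, and they do not cover volumes. A sketch of a stricter variant follows. It relies on the same assumption as the `grep` calls above, namely that CAPI-created resources carry "capi" in their names; `assert_no_capi` is a hypothetical helper name.

```bash
# Hypothetical helper: read a resource listing on stdin and fail if any
# line mentions "capi" (case-insensitive), printing the offending lines.
assert_no_capi() {
  local label="$1" hits
  hits=$(grep -i capi || true)
  if [ -n "$hits" ]; then
    echo "Leftover CAPI $label:"
    echo "$hits"
    return 1
  fi
  echo "No CAPI $label remaining"
}
```

For example: `openstack volume list --all-projects -f value -c ID -c Name | assert_no_capi volumes`. The same pipe works for the server and load balancer listings.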

## Phase 5 — Preserve capi-mgmt.maas itself

The bootstrap k3s + CAPI controllers on `capi-mgmt.maas` are NOT destroyed —
they will be re-used post-rebuild as the Magnum CAPI mgmt plane. Verify the
controllers are still healthy:

```bash
ssh capi-mgmt.maas -- sudo kubectl --kubeconfig /etc/rancher/k3s/k3s.yaml \
  get pods -A
```

Confirm:
- `capi-system` namespace pods Running
- `capo-system` (CAPI OpenStack provider) pods Running
- `cert-manager` pods Running
- `orc-system` (OpenStack Resource Controller) pods Running
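
Scanning `get pods -A` output by eye is error-prone, so the four confirmations can be sketched as a filter. This assumes each name in the list above is a namespace (including `cert-manager`) and that the input is `kubectl get pods -A --no-headers` output, where the fourth column is the pod status. `check_capi_pods` is a hypothetical helper name.

```bash
# Hypothetical helper: read `kubectl get pods -A --no-headers` on stdin.
# Fails if any pod in the four namespaces is not Running, or if a
# namespace has no pods at all.
check_capi_pods() {
  awk '
    BEGIN {
      bad = 0
      n = split("capi-system capo-system cert-manager orc-system", ns, " ")
      for (i = 1; i <= n; i++) seen[ns[i]] = 0
    }
    ($1 in seen) {
      seen[$1]++
      if ($4 != "Running") {
        print "NOT RUNNING: " $1 "/" $2 " (" $4 ")"
        bad = 1
      }
    }
    END {
      for (k in seen) if (seen[k] == 0) { print "NO PODS: " k; bad = 1 }
      exit bad
    }
  '
}
```

For example: `ssh capi-mgmt.maas -- sudo kubectl --kubeconfig /etc/rancher/k3s/k3s.yaml get pods -A --no-headers | check_capi_pods`.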

## Phase 6 — Final go/no-go checklist

Do not proceed to `runbooks/01-destroy-model.md` until all of the following pass:

- [ ] NetBox verification clean
- [ ] Vault unseal keys backed up and verified readable
- [ ] `bundle-pre-rebuild.yaml` exists and is non-empty
- [ ] `juju-status-pre-rebuild.yaml` exists and shows the expected pre-destroy state
- [ ] All four KVM snapshots created (`virsh snapshot-list` confirms)
- [ ] CAPI workload cluster deletion completed (`kubectl get cluster -n default`
      reports "No resources found")
- [ ] OpenStack-side resources from CAPI workload are released (no orphaned LBs,
      FIPs, volumes)
- [ ] capi-mgmt.maas k3s cluster controllers all Running
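
Items in this checklist that reduce to a command's exit status can be tallied with a small runner. The following is only a sketch: `gate` and `FAILED` are hypothetical names, and the judgment-based items still need a human check.

```bash
# Hypothetical runner: execute a check, print it in checklist form,
# and count failures in FAILED.
FAILED=0
gate() {
  local label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "[x] $label"
  else
    echo "[ ] $label"
    FAILED=$((FAILED + 1))
  fi
}
```

For example: `gate "bundle-pre-rebuild.yaml exists and is non-empty" test -s ~/backups/$(date +%F)/bundle-pre-rebuild.yaml`, followed at the end by `[ "$FAILED" -eq 0 ] && echo GO || echo "NO-GO ($FAILED failed)"`.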

## Notes

- Snapshot disk space consumption can grow significantly during the rebuild
  window. Verify free space on `/var/lib/libvirt/images` prior to running
  the rebuild deploy.
- If Vault unseal keys cannot be located, STOP. Re-initializing Vault without
  the original keys discards every certificate it has issued and makes any
  data sealed under the existing root key unrecoverable. Confirm the keys are
  present and readable before model destroy.
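
For the disk-space note above, a minimal sketch of a free-space probe follows. `free_gib` is a hypothetical helper name, and any threshold you compare against is your own choice; the runbook does not specify one.

```bash
# Hypothetical helper: print available space on the filesystem holding
# the given path, in whole GiB (rounded down).
free_gib() {
  # df -Pk: POSIX single-line output in 1024-byte blocks; column 4 is Available
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1048576) }'
}
```

For example, with an illustrative 100 GiB floor: `[ "$(free_gib /var/lib/libvirt/images)" -ge 100 ] || echo "Low disk space for snapshots"`.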
