Status: Second execution document of Batch B. First cloud-mutating step in the v1 deploy sequence. Triggers MAAS provisioning of 4 hosts, LXD container creation, charm installation, and the initial relation cascade.
Position in sequence: Runs after v1-do-doc-03-destroy.md (state confirmed Clean: 5 VMs Ready, no openstack model). Runs before v1-do-doc-05-vault-init.md (manual Vault init).
Cross-references:
- bundle.yaml (canonical deploy artifact)
- overlays/octavia-pki.yaml (gitignored; from v1-do-doc-02)

Execute the Caracal-bundle deploy and watch the model settle to a known-incomplete state: every charm reaches active/idle EXCEPT those waiting on vault:certificates, which sit in blocked: 'certs not present yet' until Vault is initialized in v1-do-doc-05.
What this document does:
- juju add-model openstack
- juju deploy ./bundle.yaml --overlay overlays/octavia-pki.yaml

What this document does NOT do:
Out of scope:
| Decision | Choice | Notes |
|---|---|---|
| Model name | openstack | Matches runbooks/01-destroy-model.md Phase B target |
| Deploy command | juju deploy ./bundle.yaml --overlay overlays/octavia-pki.yaml | One overlay; vr0-dc0-testcloud.yaml was a placeholder and is empty per pre-deploy review |
| --trust flag | Not used | Standard OpenStack charms on MAAS do not require bundle-level trust. If a specific charm needs it post-deploy (none expected for Caracal v1), apply targeted juju trust <app> then. |
| Settle wait | Manual watch via juju status --watch 30s; ~60-90 min typical | Charms cycle blocked → maintenance → active/idle |
| Expected pre-Vault end state | Vault blocked; cert-relation consumers blocked; everything else active/idle | See §7.3 for the explicit blocked-charm list |
| PKI on-disk verification | Files-on-disk + fingerprint compare after Octavia config-changed hook completes | §8 — the explicit operator-asked confirmation |
| Prereq | Verification |
|---|---|
| v1-do-doc-01-prep.md ✓ (state-check passed) | Manual confirmation |
| v1-do-doc-02-pki.md ✓ (overlay generated) | test -f "$REPO/overlays/octavia-pki.yaml" |
| v1-do-doc-03-destroy.md ✓ (no openstack model, 5 VMs Ready) | Run §2 state-detection block from doc-03 again; it should still return CLEAN |
| Pre-deploy fixes all committed and pulled locally | Verified in v1-do-doc-01 §4.3 |
Shell context — paste once:
export REPO="$HOME/openstack-caracal-ipv4"
cd "$REPO"
echo "REPO=$REPO"
test -f overlays/octavia-pki.yaml && echo "[OK] PKI overlay present" || echo "[FAIL] missing overlay"
git status --short   # Expect: clean working tree
These are READ-ONLY safety checks. Stop if any FAIL.
cd "$REPO"
echo "=== 4.1 bundle.yaml YAML parses ==="
python3 -c "import yaml; yaml.safe_load(open('bundle.yaml'))" \
&& echo "[OK] bundle.yaml parses" || echo "[FAIL] bundle.yaml YAML error"
echo ""
echo "=== 4.2 octavia-pki overlay YAML parses ==="
python3 -c "
import yaml
d = yaml.safe_load(open('overlays/octavia-pki.yaml'))
o = d['applications']['octavia']['options']
keys = sorted(o.keys())
expected = ['lb-mgmt-controller-cacert','lb-mgmt-controller-cert','lb-mgmt-issuing-ca-key-passphrase','lb-mgmt-issuing-ca-private-key','lb-mgmt-issuing-cacert']
print('Keys in overlay:', keys)
print('All 5 keys present:', keys == expected)
print('All values non-empty:', all(v for v in o.values()))
"
echo ""
echo "=== 4.3 ceph-osd has no storage block (pre-deploy fix #1) ==="
grep -A 12 -E "^[[:space:]]+ceph-osd:" bundle.yaml | grep -E "^[[:space:]]+storage:" \
&& echo "[FAIL] storage block present — pre-deploy fix not applied" \
|| echo "[OK] no storage block under ceph-osd"
echo ""
echo "=== 4.4 expected-osd-count is 4 (matches one OSD per host) ==="
grep -A 8 -E "^[[:space:]]+ceph-mon:" bundle.yaml | grep "expected-osd-count: 4" \
&& echo "[OK] expected-osd-count: 4" \
|| echo "[FAIL] expected-osd-count is not 4"
echo ""
echo "=== 4.5 11 VIPs declared (Designate deferred to v2 per D-019) ==="
VIP_COUNT=$(grep -cE "^[[:space:]]+vip: 10\.12\.4\." bundle.yaml)
echo "VIP count: $VIP_COUNT (expect 11)"
[ "$VIP_COUNT" -eq 11 ] && echo "[OK] 11 VIPs declared" || echo "[FAIL] VIP count is not 11"
echo ""
echo "=== 4.6 No model named 'openstack' exists ==="
juju models | grep "^openstack" \
&& echo "[FAIL] model 'openstack' already exists — re-run doc-03 destroy" \
|| echo "[OK] no openstack model"
echo ""
echo "=== 4.7 All 5 cloud-target VMs MAAS-Ready ==="
export MAAS_PROFILE=$(maas list 2>/dev/null | awk 'NR==1 {print $1}')
maas "$MAAS_PROFILE" machines read 2>/dev/null \
| python3 -c "
import json, sys
machines = json.load(sys.stdin)
targets = ['openstack0', 'openstack1', 'openstack2', 'openstack3', 'capi-mgmt']
ready = sum(1 for m in machines if m.get('hostname') in targets and m.get('status_name') == 'Ready' and not m.get('owner'))
print(f'Ready + unowned: {ready} / 5')
print('[OK]' if ready == 5 else '[FAIL]')
"
echo ""
echo "=== 4.8 Juju controller available ==="
juju controllers
juju show-controller 2>/dev/null | head -10
echo ""
echo "=== 4.9 Disk space on /var/lib/libvirt/images ==="
df -h /var/lib/libvirt/images 2>/dev/null
echo " Need ≥ 4 × 8 GiB for openstack0-3 root + LXD container space; ≥ 4 × 512 GiB OSD qcow2 already allocated"
If any check above does not show [OK] (or the expected value), stop and investigate before continuing.
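Spotting a single [FAIL] in a long block of output by eye is error-prone. The following is a minimal sketch of a gate that does the scan mechanically. It assumes the checks above have been saved to a script; `preflight.sh` and `preflight_gate` are hypothetical names, not part of the repo.

```shell
# Sketch: fail fast if any pre-flight line reports [FAIL].
# Reads check output on stdin, echoes it through unchanged, and returns
# non-zero when at least one line contains the literal marker "[FAIL]".
preflight_gate() {
  local fails=0 line
  while IFS= read -r line; do
    printf '%s\n' "$line"
    case "$line" in
      *"[FAIL]"*) fails=$((fails + 1)) ;;
    esac
  done
  if [ "$fails" -gt 0 ]; then
    echo "pre-flight: $fails check(s) failed; do not deploy" >&2
    return 1
  fi
  echo "pre-flight: no [FAIL] markers seen" >&2
  return 0
}

# Usage (preflight.sh is a hypothetical file holding the checks above):
#   bash preflight.sh 2>&1 | preflight_gate && echo "safe to continue"
```

The gate only recognises the [FAIL] marker. Checks that print a value rather than a marker (for example the True/False lines from the overlay key check in 4.2) still need a human read.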
juju add-model openstack
juju model-config -m openstack | head -20
Expected: model openstack created on the current controller. juju models should now show it.
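To confirm the model by exact name rather than by prefix, something like the sketch below may help. It assumes the default tabular output of `juju models`, where the model name is the first column and the current model carries a trailing `*`; `model_exists` is a hypothetical helper name.

```shell
# Sketch: exact-name model check against `juju models` tabular output.
# The current model is shown with a trailing "*", so both forms are accepted.
# A name such as "openstack-old" does NOT count as a match for "openstack".
model_exists() {
  awk -v m="$1" '
    $1 == m || $1 == (m "*") { found = 1 }
    END { exit found ? 0 : 1 }
  '
}

# Usage:
#   juju models | model_exists openstack && echo "[OK] model openstack present"
```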
Optional model-config tweaks (only if you have a reason):
- default-base: ubuntu@22.04/stable. Already in the bundle's top-level config; model-level override not needed.
- transmit-vendor-metrics=false. Privacy posture; testcloud doesn't need to phone home. Optional.

For this v1 cycle, leave model-config at defaults. Tweaks are easier to debug when only one is changed at a time.
cd "$REPO"
# Confirm working dir and overlay
pwd
ls -la bundle.yaml overlays/octavia-pki.yaml
echo ""
# Deploy: this returns in a few seconds; actual provisioning runs in background
juju deploy ./bundle.yaml --overlay overlays/octavia-pki.yaml -m openstack
Expected output: a long list of deploy actions ("Deploying ...", "Resolving ...", "Located bundle ..."). Then control returns to the prompt.
If the deploy command itself errors (YAML syntax, charm-not-found, etc.), stop here. The bundle has not started provisioning yet — fix and rerun.
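Errors of this kind (YAML syntax, charm-not-found) can often be surfaced before the real run. The sketch below wraps the same deploy command with `--dry-run`, which asks Juju to print the planned changes without applying them. It assumes the installed Juju client supports `--dry-run` for bundle deploys (check `juju deploy --help` on the jumphost); `deploy_dry_run` and the `/tmp/deploy-plan.txt` path are hypothetical names.

```shell
# Sketch: preview the bundle's planned changes without applying them.
# Assumption: the Juju client accepts --dry-run for bundle deploys.
# Must be run from $REPO after the openstack model has been added.
deploy_dry_run() {
  juju deploy ./bundle.yaml \
    --overlay overlays/octavia-pki.yaml \
    -m openstack \
    --dry-run
}

# Usage:
#   cd "$REPO" && deploy_dry_run | tee /tmp/deploy-plan.txt
```

A dry run does not touch MAAS, so the 5 VMs stay Ready and the real deploy can follow immediately.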
Provisioning typically takes 60-90 minutes for a testcloud of this size on this jumphost. The bundle requests MAAS deployment of 4 hosts, creation of ~25 LXD containers, 30+ charm installs, and relation establishment.
In a dedicated terminal (don't share with the destroy / deploy terminal — interaction can interrupt screen redraws):
juju status --color --watch 30s -m openstack
Refreshes every 30 seconds. Ctrl+C to exit.
Rough timeline (your mileage may vary):
| Elapsed | What to expect |
|---|---|
| 0-3 min | MAAS allocates openstack0-3 and starts deploying them (PXE boot, disk partitioning); commissioning already ran before they reached Ready |
| 3-10 min | Ubuntu install on openstack0-3 via MAAS preseed |
| 10-15 min | Hosts in Juju show pending → started. LXD service comes up on each |
| 15-30 min | LXD containers being created; subordinate charms (mysql-router, hacluster) attaching |
| 30-50 min | Charm config-changed hooks running; relations forming; databases bootstrapping |
| 50-90 min | Most charms reach blocked (waiting on Vault) or active/idle (no Vault dep) |
| 90+ min | Settle stabilizes at the pre-Vault end state |
If progress visibly stalls for >15 minutes in the middle of the timeline, see §7.4.
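Whether progress has stalled is easier to judge from counts than from a scrolling status screen. The sketch below tallies unit workload states from the default tabular `juju status` output; `summarize_units` is a hypothetical helper, and it assumes unit rows have the unit name in column 1 and the workload state in column 2.

```shell
# Sketch: tally unit workload states from `juju status` tabular output.
# A unit row is recognised by its first column looking like "app/N" or
# "app/N*" (leader). Machine rows such as "0/lxd/3" start with a digit and
# application rows have no "/", so neither is counted.
summarize_units() {
  awk '
    $1 ~ /^[a-z][a-z0-9-]*\/[0-9]+\*?$/ { count[$2]++ }
    END { for (s in count) printf "%s=%d\n", s, count[s] }
  ' | sort
}

# Usage:
#   juju status -m openstack | summarize_units
```

If two summaries taken 15 minutes apart are identical while units are still in maintenance or waiting, treat that as a stall.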
When the model has settled (stops progressing for >5 minutes), this is what juju status should show:
blocked (waiting on Vault):
The following charms have :certificates relations to vault:certificates and CANNOT reach active/idle until Vault is initialized in v1-do-doc-05:
- vault itself, status: Vault needs to be initialized (this is the trigger to run doc-05)
- mysql-innodb-cluster (needs vault cert for inter-instance TLS)
- keystone (Keystone API TLS)
- glance (Glance API TLS)
- nova-cloud-controller, placement, neutron-api, neutron-api-plugin-ovn, ovn-central, ovn-chassis, ovn-chassis-octavia
- cinder, octavia, octavia-dashboard
- barbican, barbican-vault
- magnum, magnum-dashboard
- glance-simplestreams-sync, openstack-dashboard, ceph-radosgw
- octavia-diskimage-retrofit (subordinate of glance-simplestreams-sync)
- *-hacluster subordinates indirectly (because their principal is blocked; Designate's hacluster removed per D-019)

active/idle (no Vault dependency):

- rabbitmq-server
- etcd (uses easyrsa for its OWN TLS, not Vault)
- easyrsa
- nova-compute (no :certificates relation directly; it gets ceph keys via ceph-mon)
- ceph-mon, ceph-osd (Ceph cluster bootstraps independently)

Note on the chicken-and-egg:
Per D-006 (Vault HA backend), etcd's TLS is bootstrapped by easyrsa via the easyrsa:client ↔ etcd:certificates relation. This is what lets etcd come up active/idle BEFORE Vault is initialized. Then Vault uses etcd as its HA backend. Watch that easyrsa/0 and etcd/{0,1,2} reach active/idle within the first 30 minutes; if etcd stays blocked beyond that, easyrsa-related certs likely didn't flow.
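The expected-blocked list above can be turned into a quick filter that prints only the surprises. This is a sketch under the assumption that the default tabular `juju status` layout is in use; `EXPECTED_BLOCKED` and `unexpected_blocked` are hypothetical names, and the application list is copied from this section rather than derived from the bundle.

```shell
# Sketch: list blocked units whose application is NOT in the expected
# pre-Vault blocked set. Input: `juju status` tabular output on stdin.
# hacluster subordinates are accepted by name suffix.
EXPECTED_BLOCKED="vault mysql-innodb-cluster keystone glance
nova-cloud-controller placement neutron-api neutron-api-plugin-ovn
ovn-central ovn-chassis ovn-chassis-octavia cinder octavia
octavia-dashboard barbican barbican-vault magnum magnum-dashboard
glance-simplestreams-sync openstack-dashboard ceph-radosgw
octavia-diskimage-retrofit"

unexpected_blocked() {
  awk -v expected="$EXPECTED_BLOCKED" '
    BEGIN {
      n = split(expected, e, /[ \n]+/)
      for (i = 1; i <= n; i++) ok[e[i]] = 1
    }
    $1 ~ /^[a-z][a-z0-9-]*\/[0-9]+\*?$/ && $2 == "blocked" {
      unit = $1; sub(/\*$/, "", unit)      # drop the leader marker
      app = unit; sub(/\/.*/, "", app)     # "keystone/0" -> "keystone"
      if (!(app in ok) && app !~ /-hacluster$/) print unit
    }
  '
}

# Usage:
#   juju status -m openstack | unexpected_blocked
```

Empty output means every blocked unit is on the expected list. Any unit it prints (for example etcd or easyrsa) deserves a look before moving on to Vault init.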
If progress stalls for >15 min in the middle of the timeline:
# Find which units are blocked or in error state
juju status -m openstack | grep -E "(blocked|error|maintenance)"
# For any unit in error state, get its log
juju show-status-log <unit-name> -m openstack
# E.g.: juju show-status-log keystone/0
# For deeper inspection (in the log filename the unit's "/" becomes "-", e.g. unit-keystone-0.log;
# -m goes before the target because juju ssh passes anything after the target on to ssh)
juju ssh -m openstack <unit-name> -- sudo tail -200 /var/log/juju/unit-<app>-<N>.log
Common stalls:
- lxc list on the host; container quotas, image availability
- juju resolved <unit> may help

Per D-018, do not pursue graceful recovery from major errors at this stage; full teardown via v1-do-doc-03 and redeploy is the canonical "reset" path.
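When tailing unit logs during triage, note that the log filename is not the unit name verbatim: Juju writes `unit-<app>-<N>.log`, with the `/` of the unit name replaced by `-`. A small helper, sketched here under the hypothetical name `unit_log_path`, avoids typing that by hand.

```shell
# Sketch: derive the unit agent log path from a unit name.
# "keystone/0" -> /var/log/juju/unit-keystone-0.log
unit_log_path() {
  printf '/var/log/juju/unit-%s.log\n' "$(printf '%s' "$1" | tr '/' '-')"
}

# Usage:
#   juju ssh -m openstack keystone/0 -- sudo tail -200 "$(unit_log_path keystone/0)"
```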
This section addresses the operator-asked confirmation that the PKI overlay made it onto the Octavia unit's filesystem after the charm's config-changed hook completes.
Run this section after octavia/0 has progressed past pending/maintenance and reached at least a blocked state (i.e., the charm has run its install + config-changed hooks but is waiting on Vault for the API TLS cert). The lb-mgmt-* options are consumed by the config-changed hook regardless of Vault status — so on-disk material should be present even with octavia/0 in blocked.
echo "=== Octavia unit status ==="
juju status octavia -m openstack
# Expect: octavia/0 in 'blocked' (cert relation pending) or 'maintenance' (still configuring).
# If status is still 'pending' or 'allocating', wait and re-run this section.

echo "=== /etc/octavia/certs/ contents ==="
# -m goes before the target: juju ssh passes anything after the target on to ssh
juju ssh -m openstack octavia/0 -- sudo ls -la /etc/octavia/certs/
Expected: 4-5 PEM files. The exact filenames depend on the charm revision; commonly:
- server_ca.cert.pem: Issuing CA cert (consumed from lb-mgmt-issuing-cacert)
- server_ca.key.pem: Issuing CA encrypted private key (consumed from lb-mgmt-issuing-ca-private-key)
- client_ca.cert.pem: Controller CA cert (consumed from lb-mgmt-controller-cacert)
- client.cert-and-key.pem: Controller cert + key bundle (consumed from lb-mgmt-controller-cert)

If the directory is empty or missing, the config-changed hook hasn't run yet or failed. Re-check unit status; see §7.4 remediation.
[unverified, flagging]: the exact filenames above are typical for recent charm-octavia revisions but may vary. The verification below uses fingerprint comparison (content-based), which is filename-agnostic; adapt the filenames in the loop if ls shows different ones.
mkdir -p "$HOME/pki-verify"
chmod 700 "$HOME/pki-verify"
cd "$HOME/pki-verify"
# Pull whatever PEM files are present in /etc/octavia/certs/
juju ssh -m openstack octavia/0 -- sudo ls /etc/octavia/certs/ 2>/dev/null | tr -d '\r' | \
while read -r f; do
  case "$f" in
    *.pem|*.crt)
      echo "Pulling $f ..."
      # </dev/null stops the inner ssh from swallowing the rest of the file list on stdin
      juju ssh -m openstack octavia/0 -- sudo cat "/etc/octavia/certs/$f" </dev/null \
        > "$HOME/pki-verify/unit-$f"
      ;;
  esac
done
ls -la "$HOME/pki-verify/"
echo "=== Issuing CA fingerprint comparison ==="
# Find the issuing CA on the unit (charm naming: typically server_ca.cert.pem)
UNIT_FILE=$(ls "$HOME/pki-verify/" | grep -E "^unit-server.*ca.*\.cert\.pem$" | head -1)
if [ -z "$UNIT_FILE" ]; then
echo "[WARN] no unit-server*ca*.cert.pem file found — list and adapt:"
ls "$HOME/pki-verify/"
else
UNIT_FP=$(openssl x509 -in "$HOME/pki-verify/$UNIT_FILE" -noout -fingerprint -sha256 2>/dev/null | cut -d= -f2)
SRC_FP=$(openssl x509 -in "$HOME/octavia-pki/issuing-ca/issuing-ca.cert.pem" -noout -fingerprint -sha256 | cut -d= -f2)
echo "Unit ($UNIT_FILE): $UNIT_FP"
echo "Jumphost (issuing-ca): $SRC_FP"
if [ "$UNIT_FP" = "$SRC_FP" ]; then
echo "[OK] Issuing CA cert on unit matches jumphost source"
else
echo "[FAIL] fingerprints DIFFER — investigate before continuing"
fi
fi
echo ""
echo "=== Controller CA fingerprint comparison ==="
UNIT_FILE=$(ls "$HOME/pki-verify/" | grep -E "^unit-client.*ca.*\.cert\.pem$" | head -1)
if [ -z "$UNIT_FILE" ]; then
echo "[WARN] no unit-client*ca*.cert.pem file found — list and adapt:"
ls "$HOME/pki-verify/"
else
UNIT_FP=$(openssl x509 -in "$HOME/pki-verify/$UNIT_FILE" -noout -fingerprint -sha256 2>/dev/null | cut -d= -f2)
SRC_FP=$(openssl x509 -in "$HOME/octavia-pki/controller-ca/controller-ca.cert.pem" -noout -fingerprint -sha256 | cut -d= -f2)
echo "Unit ($UNIT_FILE): $UNIT_FP"
echo "Jumphost (controller-ca): $SRC_FP"
if [ "$UNIT_FP" = "$SRC_FP" ]; then
echo "[OK] Controller CA cert on unit matches jumphost source"
else
echo "[FAIL] fingerprints DIFFER — investigate before continuing"
fi
fi
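The two comparisons above locate the unit-side file by a filename pattern. If the charm revision uses different names, a content-only search may be more convenient. The sketch below fingerprints every pulled file and prints those matching a given source certificate; `find_cert_by_fp` is a hypothetical helper, and it assumes `openssl` is on the jumphost PATH.

```shell
# Sketch: filename-agnostic match. Prints the path of every file in DIR
# whose SHA-256 certificate fingerprint equals that of SRC.
# Returns 0 if at least one match was found, 1 if none, 2 if SRC is unreadable.
find_cert_by_fp() {
  local src="$1" dir="$2" src_fp fp f found=1
  src_fp=$(openssl x509 -in "$src" -noout -fingerprint -sha256 2>/dev/null | cut -d= -f2)
  if [ -z "$src_fp" ]; then
    echo "cannot read certificate: $src" >&2
    return 2
  fi
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    # Files without a certificate (e.g. a bare key) yield an empty fingerprint
    fp=$(openssl x509 -in "$f" -noout -fingerprint -sha256 2>/dev/null | cut -d= -f2)
    if [ "$fp" = "$src_fp" ]; then
      echo "$f"
      found=0
    fi
  done
  return "$found"
}

# Usage:
#   find_cert_by_fp "$HOME/octavia-pki/issuing-ca/issuing-ca.cert.pem" "$HOME/pki-verify" \
#     && echo "[OK] Issuing CA found on unit" || echo "[FAIL] no matching cert on unit"
```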
The lb-mgmt-controller-cert value contains BOTH the controller cert AND its key, concatenated. On the unit it lands as a single PEM bundle. Confirm:
echo "=== Controller cert+key bundle verification ==="
UNIT_FILE=$(ls "$HOME/pki-verify/" | grep -E "^unit-client.*\.pem$" | grep -v "ca" | head -1)
# Common name: client.cert-and-key.pem
if [ -z "$UNIT_FILE" ]; then
  echo "[WARN] no controller cert bundle found on unit; list and adapt:"
  ls "$HOME/pki-verify/"
else
  echo "Found bundle: $UNIT_FILE"
  # Compare cert fingerprint
  UNIT_FP=$(openssl x509 -in "$HOME/pki-verify/$UNIT_FILE" -noout -fingerprint -sha256 2>/dev/null | cut -d= -f2)
  SRC_FP=$(openssl x509 -in "$HOME/octavia-pki/controller/controller.cert.pem" -noout -fingerprint -sha256 | cut -d= -f2)
  echo "Unit cert FP: $UNIT_FP"
  echo "Source cert FP: $SRC_FP"
  [ "$UNIT_FP" = "$SRC_FP" ] && echo "[OK] cert match" || echo "[FAIL] cert mismatch"
  # Confirm cert+key in bundle match each other (proof of possession)
  CERT_PUB=$(openssl x509 -in "$HOME/pki-verify/$UNIT_FILE" -noout -pubkey 2>/dev/null | openssl md5)
  KEY_PUB=$(openssl pkey -in "$HOME/pki-verify/$UNIT_FILE" -pubout 2>/dev/null | openssl md5)
  echo "Cert pubkey md5: $CERT_PUB"
  echo "Key pubkey md5: $KEY_PUB"
  [ "$CERT_PUB" = "$KEY_PUB" ] && echo "[OK] cert and key in bundle are paired" || echo "[FAIL] cert and key DO NOT match"
fi
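Before comparing public keys, it can be worth confirming that the file really is a cert+key bundle, since `openssl pkey` fails silently here when the key block is absent. The sketch below inspects only the PEM block headers, so it needs no openssl; `pem_block_types` and `is_cert_and_key_bundle` are hypothetical helper names.

```shell
# Sketch: list the PEM block types found in a file, one per line, in order.
# "-----BEGIN CERTIFICATE-----" -> "CERTIFICATE"
pem_block_types() {
  sed -n 's/^-----BEGIN \(.*\)-----$/\1/p' "$1"
}

# Sketch: succeed only if the file holds exactly one certificate block and
# exactly one private key block (PRIVATE KEY, RSA PRIVATE KEY,
# EC PRIVATE KEY and ENCRYPTED PRIVATE KEY all count).
is_cert_and_key_bundle() {
  local types certs keys
  types=$(pem_block_types "$1")
  certs=$(printf '%s\n' "$types" | grep -c '^CERTIFICATE$' || true)
  keys=$(printf '%s\n' "$types" | grep -c 'PRIVATE KEY$' || true)
  [ "$certs" -eq 1 ] && [ "$keys" -eq 1 ]
}

# Usage:
#   is_cert_and_key_bundle "$HOME/pki-verify/$UNIT_FILE" \
#     && echo "[OK] one cert + one key" || echo "[FAIL] unexpected bundle layout"
```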
The Issuing CA's encrypted key sits on the unit. Confirm the passphrase from octavia.conf can decrypt it. This is the test that proves the runtime amphora-signing path will work once Octavia comes up post-Vault.
echo "=== Passphrase round-trip test ==="
# Pull the passphrase from octavia.conf
UNIT_PASS=$(juju ssh -m openstack octavia/0 -- \
sudo grep "^ca_private_key_passphrase" /etc/octavia/octavia.conf 2>/dev/null | head -1 | tr -d '\r' | cut -d= -f2- | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
# Pull the encrypted key
UNIT_KEY=$(ls "$HOME/pki-verify/" | grep -E "^unit-server.*key.*\.pem$" | head -1)
if [ -z "$UNIT_PASS" ]; then
echo "[WARN] passphrase line not found in octavia.conf — may not be present until vault-init"
elif [ -z "$UNIT_KEY" ]; then
echo "[WARN] no unit-server*key*.pem found"
else
# Try to decrypt the key using the passphrase from the unit
if openssl pkey -in "$HOME/pki-verify/$UNIT_KEY" -passin "pass:$UNIT_PASS" -noout 2>/dev/null; then
echo "[OK] passphrase in octavia.conf decrypts the on-disk Issuing CA key"
else
echo "[FAIL] passphrase did NOT decrypt the key — overlay value mismatch"
fi
# Also confirm passphrase matches what we generated on jumphost
SRC_PASS=$(cat "$HOME/octavia-pki/issuing-ca/passphrase.txt")
if [ "$UNIT_PASS" = "$SRC_PASS" ]; then
echo "[OK] passphrase on unit matches jumphost source"
else
echo "[FAIL] passphrase on unit does NOT match jumphost source"
fi
fi
# Clear the passphrase from shell
unset UNIT_PASS SRC_PASS
Note: the ca_private_key_passphrase line in octavia.conf may not appear until the cert relation completes. If §8.6 reports [WARN] passphrase line not found, that is expected at pre-Vault state: the charm may defer writing the full [certificates] section until the cert relation has flowed. Re-run §8.6 after v1-do-doc-05 completes.
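The grep used for the passphrase matches the key name anywhere in the file. If the same key name ever appears in another section, a section-aware lookup is safer. The following is a minimal sketch of one, assuming octavia.conf follows the usual INI layout with the passphrase under `[certificates]`; `ini_get` is a hypothetical helper name.

```shell
# Sketch: print the value of KEY inside [SECTION] of an INI-style file.
# Unlike a bare grep, a same-named key in any other section is ignored.
# Args: SECTION KEY FILE   (FILE may be "-" to read stdin)
ini_get() {
  awk -v section="$1" -v key="$2" '
    /^[ \t]*\[/ {                        # section header line
      name = $0
      gsub(/^[ \t]*\[|\][ \t]*$/, "", name)
      in_section = (name == section)
      next
    }
    in_section && $0 ~ ("^[ \t]*" key "[ \t]*=") {
      value = $0
      sub(/^[^=]*=[ \t]*/, "", value)    # drop "key =" and leading blanks
      sub(/[ \t]+$/, "", value)          # drop trailing blanks
      print value
      exit
    }
  ' "$3"
}

# Usage:
#   juju ssh -m openstack octavia/0 -- sudo cat /etc/octavia/octavia.conf \
#     | tr -d '\r' | ini_get certificates ca_private_key_passphrase -
```

Note that this variant streams the whole config file to the jumphost, so it trades a little exposure for precision; the remote grep keeps everything but the one line on the unit.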
# Optional: shred the temp copies of the cert material
# (the pull loop names every copy unit-*, including any .crt files)
shred -uvz "$HOME/pki-verify/"unit-* 2>/dev/null
rmdir "$HOME/pki-verify" 2>/dev/null
The unit retains the originals; the jumphost-side originals are at $HOME/octavia-pki/ (left in place per v1-do-doc-02 §13).
Before proceeding to Vault init:
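- Pre-flight checks all [OK]
- Model openstack created
- active/idle: rabbitmq-server, etcd/{0,1,2}, easyrsa, ceph-mon/{0,1,2}, ceph-osd/{0,1,2,3}, nova-compute/{0,1,2,3}
- vault blocked: Vault needs to be initialized (the trigger for doc-05)
- PEM files present in /etc/octavia/certs/ on octavia/0
- Issuing CA fingerprint comparison: [OK]
- Controller CA fingerprint comparison: [OK]
- Controller cert+key bundle verification: [OK]
- Passphrase round-trip test: [OK] (or [WARN] not yet in conf; acceptable if pre-Vault)

If all checked, proceed to v1-do-doc-05-vault-init.md.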
| Aspect | Testcloud (v1) | Roosevelt |
|---|---|---|
| Bundle | bundle.yaml (single file) | Multi-environment overlay structure |
| Hosts | 4 KVM VMs | Bare-metal MAAS-managed servers |
| LXD container layout | Dense (10+ on machine 8) | More spread; possibly real units instead of LXD for some apps |
| Overlay set | overlays/octavia-pki.yaml only | Site overlay (machine assignments, NIC MACs) + Vault overlay + PKI overlay |
| Settle time | 60-90 minutes | Likely 2-4 hours (more hosts, real provisioning) |
| Octavia PKI source | Operator-generated, overlay-distributed | Vault PKI engine |
| Octavia PKI verification | This §8 procedure | Vault-side audit trail; no manual comparison needed |
| Date | Change | Reference |
|---|---|---|
| 2026-05-27 | Document created. Replaces runbooks/deprecated/02-deploy.md (placeholder). Adds explicit §8 on-disk PKI verification per operator request. | Batch B drafting |