diff --git a/bundle.yaml b/bundle.yaml
index 5504d05..1a2f21e 100644
--- a/bundle.yaml
+++ b/bundle.yaml
@@ -383,18 +383,13 @@
     options:
       debug: false
       openstack-origin: *openstack-origin
-      # ----- PKI material (4 cert blobs + passphrase) ---------------------
-      # TODO(octavia-cert): inline values BEFORE deploy. Two sources:
-      #   (a) Copy from Bobcat backup at:
-      #       ~/backups/pre-caracal-destroy-2026-05-22/bundle-pre-destroy.yaml
-      #       (lines ~230-234; CA valid until 2027-05-15 — adequate for testcloud)
-      #   (b) Generate fresh via the (yet-to-be-written) octavia-cert-runbook
-      #       — required for Roosevelt deploy
-      # lb-mgmt-controller-cacert:
-      # lb-mgmt-controller-cert:
-      # lb-mgmt-issuing-ca-key-passphrase:
-      # lb-mgmt-issuing-ca-private-key:
-      # lb-mgmt-issuing-cacert:
+      # ----- PKI material -------------------------------------------------
+      # 5 lb-mgmt-* options are supplied via overlays/octavia-pki.yaml
+      # (gitignored). Generated per runbooks/01a-octavia-pki-generation.md.
+      # Deploy with:
+      #   juju deploy ./bundle.yaml \
+      #     --overlay overlays/vr0-dc0-testcloud.yaml \
+      #     --overlay overlays/octavia-pki.yaml
       vip: 10.12.4.233
       os-public-hostname: octavia.omega.dc0.vr0.cloud.neumatrix.local
     bindings: *api-bindings
diff --git a/runbooks/01a-octavia-pki-generation.md b/runbooks/01a-octavia-pki-generation.md
new file mode 100644
index 0000000..65bd707
--- /dev/null
+++ b/runbooks/01a-octavia-pki-generation.md
@@ -0,0 +1,650 @@
+# Runbook 01a — Octavia LBaaS PKI generation
+
+**Status:** Pre-deploy execution. Runs between `01-destroy-model.md` and `02-deploy.md`.
+**Numbering rationale:** Octavia PKI artifacts must exist on the deploy host before
+`juju deploy` is invoked (the values are referenced by the overlay file). Placing
+this between destroy and deploy aligns generation with the "fresh rebuild" framing.
+
+**Cross-references:**
+- D-007 (Octavia in bundle from day one)
+- Bundle `octavia.options` PKI material section
+- `overlays/octavia-pki.yaml` (gitignored — output of this runbook)
+- Workstream 3a decision (2026-05-22): generate fresh, EC P-384 CAs, overlay-file approach
+
+---
+
+## 1. Purpose & scope
+
+This runbook generates a complete two-tier PKI for Charmed Octavia's
+amphora load-balancer trust domain:
+
+- **Issuing CA** — Octavia uses this to sign each amphora's server certificate
+  at LB-creation time. Octavia receives the **private key** and **passphrase**.
+- **Controller CA** — amphorae's trust anchor for connections **from** the
+  Octavia controller. Octavia only receives the **cert** (no key needed at
+  runtime); the controller's identity is proved by:
+- **Controller certificate** — signed by Controller CA, presented by the
+  Octavia controller to each amphora. Bundled as cert + key into a single
+  PEM blob.
+
+Five charm options consume the artifacts (`octavia` application):
+
+| Charm option | Content | Format |
+|---|---|---|
+| `lb-mgmt-issuing-cacert` | Issuing CA certificate | base64-encoded PEM |
+| `lb-mgmt-issuing-ca-private-key` | Issuing CA encrypted private key | base64-encoded PEM (already encrypted with passphrase) |
+| `lb-mgmt-issuing-ca-key-passphrase` | Issuing CA key passphrase | plain string (NOT base64) |
+| `lb-mgmt-controller-cacert` | Controller CA certificate | base64-encoded PEM |
+| `lb-mgmt-controller-cert` | Controller cert + key, concatenated | base64-encoded PEM bundle |
+
+**Scope:** v1 testcloud (VR0 DC0 Omega Cloud). Roosevelt deltas documented in
+section 16.
+
+**Out of scope:** Octavia API TLS (issued by Vault via `octavia:certificates`
+relation); rotation procedure (deferred to Roosevelt runbook).
+
+---
+
+## 2. Decisions captured
+
+Per workstream 3a sign-off (2026-05-22):
+
+| Decision | Choice | Roosevelt parallel |
+|---|---|---|
+| Cert provenance | Generate fresh (no Bobcat-backup copy) | Vault PKI engine |
+| CA key algorithm | EC P-384 | EC P-384 (Vault root) |
+| Controller cert algorithm | EC P-256 | EC P-256 |
+| CA validity | 10 years | 5-year intermediate, Vault-rotated |
+| Controller cert validity | 2 years | 90 days, auto-rotated |
+| Distribution method | Juju overlay file (gitignored) | Vault-injected at deploy |
+| Storage path on jumphost | `$HOME/octavia-pki/` | Vault PKI mounts |
+| Passphrase strength | 32 random bytes, base64-encoded (44 chars) | Vault-generated |
+
+**Naming convention:**
+
+- Issuing CA CN: `VR0 DC0 Omega Cloud Octavia Issuing CA`
+- Controller CA CN: `VR0 DC0 Omega Cloud Octavia Controller CA`
+- Controller cert CN: `octavia-controller.omega.dc0.vr0.cloud.neumatrix.local`
+- Controller cert SANs: above CN, plus `octavia.omega.dc0.vr0.cloud.neumatrix.local`, plus `10.12.4.233` (the Octavia API VIP per workstream 2)
+- Organization (O): `Neumatrix`
+
+---
+
+## 3. Prerequisites
+
+- Executor is on jumphost `vopenstack-jesse` as `jessea123`.
+- `openssl` version 3.x or later installed (`openssl version` to confirm).
+- `$HOME` is writable (snap-confined `openstackclients` cannot read `/tmp`;
+  all paths must resolve under `$HOME`).
+- Git repository `openstack-caracal-ipv4` cloned on jumphost at a known path
+  (referred to as `$REPO` throughout). Set this in the executor's shell:
+  ```bash
+  export REPO=$HOME/repos/openstack-caracal-ipv4 # adjust to actual clone path
+  ```
+- Repository is on `main` branch and clean (`cd $REPO && git status` shows clean tree).
+- Previous workstream 2 commit has been pushed (bundle has the VIP assignments and
+  active hacluster stack — verify with `grep -c "^ *vip: 10.12.4." "$REPO/bundle.yaml"`,
+  expect 12).
+
+---
+
+## 4. Pre-flight: gitignore patch (DO THIS FIRST)
+
+**Critical:** the `.gitignore` patch goes in BEFORE any private key material
+exists on disk. This minimizes the race window for an accidental commit.
+
+```bash
+cd "$REPO"
+
+# Append to .gitignore (idempotent — check if already present first)
+grep -q "octavia-pki.yaml" .gitignore || cat >> .gitignore <<'EOF'
+
+# Octavia PKI artifacts — NEVER commit
+overlays/octavia-pki.yaml
+octavia-pki/
+*.key
+*.key.enc
+passphrase.txt
+EOF
+
+# Review the diff
+git diff .gitignore
+
+# Commit and push BEFORE generating any keys
+git add .gitignore
+git commit -m "gitignore: octavia PKI artifacts and overlay (runbook 01a)"
+git push origin main
+```
+
+**Verify the gitignore is effective:**
+
+```bash
+# This should NOT show overlays/octavia-pki.yaml even as untracked
+touch overlays/octavia-pki.yaml
+git status --short overlays/ # expect: empty output for octavia-pki.yaml
+rm overlays/octavia-pki.yaml
+```
+
+If the test file does show as untracked, **STOP** and fix the gitignore syntax before
+generating any secrets.
+
+---
+
+## 5. Workspace setup
+
+```bash
+WORKDIR=$HOME/octavia-pki
+mkdir -p "$WORKDIR"/{issuing-ca,controller-ca,controller,overlay-build}
+chmod 700 "$WORKDIR"
+cd "$WORKDIR"
+echo "Working in: $WORKDIR"
+```
+
+Resulting layout:
+
+```
+$HOME/octavia-pki/
+├── issuing-ca/      # passphrase.txt, .key.enc, .cert.pem
+├── controller-ca/   # passphrase.txt, .key.enc, .cert.pem
+├── controller/      # .key, .csr, .cert.pem, .bundle.pem, .cnf
+└── overlay-build/   # base64 intermediates → consumed by step 10
+```
+
+---
+
+## 6. Generate Issuing CA
+
+EC P-384 key encrypted with random 32-byte passphrase. Self-signed cert, 10y validity.
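The 44-character figure used for the passphrase follows from base64 arithmetic: every 3 input bytes become 4 output characters, with the final group padded, so 32 random bytes should always encode to 44 characters. A small sketch of that calculation, where `b64_len` is a hypothetical helper name and not part of the runbook:

```shell
# Hypothetical helper (not part of the runbook): padded base64 length
# for n input bytes is 4 * ceil(n / 3), computed with integer arithmetic.
b64_len() {
  echo $(( 4 * (($1 + 2) / 3) ))
}

b64_len 32   # prints 44
```

Any other length in `passphrase.txt` would suggest a stray newline or a truncated write.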
+
+```bash
+cd "$WORKDIR/issuing-ca"
+
+# Generate passphrase (no trailing newline — required for clean YAML embedding)
+openssl rand -base64 32 | tr -d '\n' > passphrase.txt
+chmod 600 passphrase.txt
+
+# Sanity-check
+test $(wc -c < passphrase.txt) -eq 44 || { echo "ERROR: passphrase length wrong"; exit 1; }
+
+# Generate EC P-384 private key, encrypted with passphrase
+openssl genpkey -algorithm EC \
+  -pkeyopt ec_paramgen_curve:P-384 \
+  -aes-256-cbc \
+  -pass file:passphrase.txt \
+  -out issuing-ca.key.enc
+chmod 600 issuing-ca.key.enc
+
+# Self-sign cert (10 years, SHA-384)
+openssl req -new -x509 -sha384 \
+  -key issuing-ca.key.enc \
+  -passin file:passphrase.txt \
+  -days 3650 \
+  -subj "/CN=VR0 DC0 Omega Cloud Octavia Issuing CA/O=Neumatrix" \
+  -out issuing-ca.cert.pem
+
+# Verify
+openssl x509 -in issuing-ca.cert.pem -noout -dates -subject
+openssl verify -CAfile issuing-ca.cert.pem issuing-ca.cert.pem
+# Expect: issuing-ca.cert.pem: OK
+
+ls -la
+```
+
+---
+
+## 7. Generate Controller CA
+
+Identical pattern; different CN.
+
+```bash
+cd "$WORKDIR/controller-ca"
+
+openssl rand -base64 32 | tr -d '\n' > passphrase.txt
+chmod 600 passphrase.txt
+test $(wc -c < passphrase.txt) -eq 44 || { echo "ERROR: passphrase length wrong"; exit 1; }
+
+openssl genpkey -algorithm EC \
+  -pkeyopt ec_paramgen_curve:P-384 \
+  -aes-256-cbc \
+  -pass file:passphrase.txt \
+  -out controller-ca.key.enc
+chmod 600 controller-ca.key.enc
+
+openssl req -new -x509 -sha384 \
+  -key controller-ca.key.enc \
+  -passin file:passphrase.txt \
+  -days 3650 \
+  -subj "/CN=VR0 DC0 Omega Cloud Octavia Controller CA/O=Neumatrix" \
+  -out controller-ca.cert.pem
+
+openssl x509 -in controller-ca.cert.pem -noout -dates -subject
+openssl verify -CAfile controller-ca.cert.pem controller-ca.cert.pem
+# Expect: controller-ca.cert.pem: OK
+```
+
+**Why Controller CA's key is encrypted even though Octavia never uses it:**
+The Controller CA key is needed for future rotations of the controller cert.
+Encrypting it (with its own passphrase, separate from Issuing CA's) is defense
+in depth — if the jumphost is compromised, the key still requires the
+passphrase to be useful for forging controller certs.
+
+---
+
+## 8. Generate Controller certificate
+
+EC P-256 key (no encryption — Octavia must read it at startup), CSR with SAN
+extensions, signed by Controller CA, 2y validity.
+
+```bash
+cd "$WORKDIR/controller"
+
+# Generate unencrypted EC P-256 key
+openssl genpkey -algorithm EC \
+  -pkeyopt ec_paramgen_curve:P-256 \
+  -out controller.key
+chmod 600 controller.key
+
+# CSR config with SAN extensions
+cat > controller.cnf <<'EOF'
+[req]
+distinguished_name = req_distinguished_name
+req_extensions = v3_req
+prompt = no
+
+[req_distinguished_name]
+CN = octavia-controller.omega.dc0.vr0.cloud.neumatrix.local
+O = Neumatrix
+
+[v3_req]
+keyUsage = critical, digitalSignature, keyEncipherment
+extendedKeyUsage = clientAuth, serverAuth
+subjectAltName = @alt_names
+
+[alt_names]
+DNS.1 = octavia-controller.omega.dc0.vr0.cloud.neumatrix.local
+DNS.2 = octavia.omega.dc0.vr0.cloud.neumatrix.local
+IP.1 = 10.12.4.233
+EOF
+
+# Generate CSR
+openssl req -new -sha256 \
+  -key controller.key \
+  -config controller.cnf \
+  -out controller.csr
+
+# Sign with Controller CA (2 years)
+openssl x509 -req -sha256 \
+  -in controller.csr \
+  -CA "$WORKDIR/controller-ca/controller-ca.cert.pem" \
+  -CAkey "$WORKDIR/controller-ca/controller-ca.key.enc" \
+  -passin file:"$WORKDIR/controller-ca/passphrase.txt" \
+  -CAcreateserial \
+  -days 730 \
+  -extfile controller.cnf \
+  -extensions v3_req \
+  -out controller.cert.pem
+
+# Bundle cert + key (the lb-mgmt-controller-cert option expects both in one PEM)
+cat controller.cert.pem controller.key > controller.bundle.pem
+chmod 600 controller.bundle.pem
+```
+
+**Verify the chain and SAN:**
+
+```bash
+# Chain verifies
+openssl verify -CAfile "$WORKDIR/controller-ca/controller-ca.cert.pem" controller.cert.pem
+# Expect: controller.cert.pem: OK
+
+# SAN extensions present
+openssl x509 -in controller.cert.pem -noout -ext subjectAltName
+# Expect:
+#   DNS:octavia-controller.omega.dc0.vr0.cloud.neumatrix.local,
+#   DNS:octavia.omega.dc0.vr0.cloud.neumatrix.local,
+#   IP Address:10.12.4.233
+
+# Validity
+openssl x509 -in controller.cert.pem -noout -dates
+# Expect: notAfter ~2 years from today
+
+# Bundle integrity (cert + key match)
+openssl x509 -in controller.bundle.pem -noout -pubkey > /tmp/cert.pub
+openssl pkey -in controller.bundle.pem -pubout > /tmp/key.pub
+diff /tmp/cert.pub /tmp/key.pub && echo "Bundle cert/key match"
+rm /tmp/cert.pub /tmp/key.pub
+```
+
+---
+
+## 9. Final chain verification
+
+A standalone block to confirm the full chain is sound before consuming for Octavia:
+
+```bash
+cd "$WORKDIR"
+
+echo "=== Issuing CA ==="
+openssl x509 -in issuing-ca/issuing-ca.cert.pem -noout -subject -dates
+openssl verify -CAfile issuing-ca/issuing-ca.cert.pem issuing-ca/issuing-ca.cert.pem
+
+echo ""
+echo "=== Controller CA ==="
+openssl x509 -in controller-ca/controller-ca.cert.pem -noout -subject -dates
+openssl verify -CAfile controller-ca/controller-ca.cert.pem controller-ca/controller-ca.cert.pem
+
+echo ""
+echo "=== Controller cert ==="
+openssl x509 -in controller/controller.cert.pem -noout -subject -dates
+openssl verify -CAfile controller-ca/controller-ca.cert.pem controller/controller.cert.pem
+```
+
+All three "verify" lines must show `: OK`. If any do not, **STOP** and investigate
+before proceeding.
+
+---
+
+## 10. Base64-encode artifacts
+
+Each base64 file is a single line (no wrapping); each becomes one YAML value.
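Once the `.b64` files from this section exist, one way to confirm the single-line property, and that nothing was lost in encoding, is to decode each file and compare it byte for byte with its PEM source. A minimal sketch, assuming GNU coreutils (`base64 -d`, `cmp`); the function name `check_b64` is hypothetical and not part of the runbook:

```shell
# Hypothetical helper: succeed only if the base64 file ($2) is newline-free
# and decodes back to exactly the bytes of the source file ($1).
check_b64() {
  src=$1
  b64=$2
  # base64 -w0 writes no newline at all, so wc -l must report 0
  [ "$(wc -l < "$b64")" -eq 0 ] || { echo "FAIL (newline): $b64"; return 1; }
  # Decode and compare against the original bytes
  base64 -d "$b64" | cmp -s - "$src" || { echo "FAIL (content): $b64"; return 1; }
  echo "OK: $b64"
}
```

For example, `check_b64 "$WORKDIR/issuing-ca/issuing-ca.cert.pem" "$WORKDIR/overlay-build/issuing-cacert.b64"` would be expected to print an `OK:` line; a failure points at wrapping or at a mismatched source file.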
+
+```bash
+cd "$WORKDIR/overlay-build"
+
+# Issuing CA cert (base64)
+base64 -w0 "$WORKDIR/issuing-ca/issuing-ca.cert.pem" > issuing-cacert.b64
+
+# Issuing CA private key (already encrypted PEM → base64)
+base64 -w0 "$WORKDIR/issuing-ca/issuing-ca.key.enc" > issuing-ca-private-key.b64
+
+# Controller CA cert
+base64 -w0 "$WORKDIR/controller-ca/controller-ca.cert.pem" > controller-cacert.b64
+
+# Controller cert + key bundle
+base64 -w0 "$WORKDIR/controller/controller.bundle.pem" > controller-cert.b64
+
+# Sanity-check sizes (expect 500-2000 chars each)
+wc -c *.b64
+```
+
+---
+
+## 11. Assemble the overlay file
+
+```bash
+# Read each artifact into shell variables
+ISSUING_CACERT=$(cat "$WORKDIR/overlay-build/issuing-cacert.b64")
+ISSUING_CA_KEY=$(cat "$WORKDIR/overlay-build/issuing-ca-private-key.b64")
+ISSUING_CA_PASS=$(cat "$WORKDIR/issuing-ca/passphrase.txt")
+CONTROLLER_CACERT=$(cat "$WORKDIR/overlay-build/controller-cacert.b64")
+CONTROLLER_CERT=$(cat "$WORKDIR/overlay-build/controller-cert.b64")
+
+# Assemble overlay (note: passphrase is YAML-quoted; cert blobs are not — they're
+# guaranteed-safe base64 without special chars)
+mkdir -p "$REPO/overlays"
+cat > "$REPO/overlays/octavia-pki.yaml" <
+      # lb-mgmt-controller-cert:
+      # lb-mgmt-issuing-ca-key-passphrase:
+      # lb-mgmt-issuing-ca-private-key:
+      # lb-mgmt-issuing-cacert:
+```
+
+**With this block:**
+
+```yaml
+      # ----- PKI material -------------------------------------------------
+      # 5 lb-mgmt-* options are supplied via overlays/octavia-pki.yaml
+      # (gitignored). Generated per runbooks/01a-octavia-pki-generation.md.
+      # Deploy with:
+      #   juju deploy ./bundle.yaml \
+      #     --overlay overlays/vr0-dc0-testcloud.yaml \
+      #     --overlay overlays/octavia-pki.yaml
+```
+
+Commit this bundle change separately from the overlay generation work:
+
+```bash
+cd "$REPO"
+git diff bundle.yaml
+git add bundle.yaml
+git commit -m "bundle: octavia PKI moves to overlay (runbook 01a)
+
+Remove inline placeholders + TODO(octavia-cert) block. PKI values now
+supplied via overlays/octavia-pki.yaml (gitignored), generated per
+runbooks/01a-octavia-pki-generation.md. Decision per workstream 3a
+(2026-05-22): industry-best-practice secret handling on testcloud
+to rehearse Roosevelt's Vault-PKI-backed posture."
+git push origin main
+```
+
+---
+
+## 13. Sensitive-file backup
+
+The Issuing CA private key + its passphrase are the crown jewels of the LB trust
+domain. Loss → cannot sign new amphora certs (LBs gradually break). Exposure →
+attacker can forge amphora identities and intercept tenant LB traffic.
+
+**Minimum backup for testcloud:**
+
+```bash
+cd $HOME
+BACKUP_NAME="octavia-pki-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
+
+tar -czf "$BACKUP_NAME" -C $HOME octavia-pki/
+
+# Encrypt with strong symmetric cipher
+gpg --symmetric --cipher-algo AES256 --output "${BACKUP_NAME}.gpg" "$BACKUP_NAME"
+
+# Shred the unencrypted tar
+shred -uvz "$BACKUP_NAME"
+
+ls -la "${BACKUP_NAME}.gpg"
+```
+
+**Move `${BACKUP_NAME}.gpg` off-host** (your decision — admin workstation
+encrypted drive, password-manager attachment, dedicated secrets vault, etc.).
+Do NOT leave it sitting in $HOME on the jumphost long-term — that's a single
+point of compromise.
+
+**Roosevelt note:** Vault PKI engine stores all of this — no manual backup
+required; Vault's own backup mechanism covers it. The procedure above is
+testcloud-only.
+
+---
+
+## 14. Cleanup of intermediates
+
+After successful deploy + verification (section 15), shred files that are not
+needed for future rotation:
+
+```bash
+# Optional: shred the base64 intermediates (regeneratable from PEM sources)
+shred -uvz "$WORKDIR/overlay-build/"*.b64
+rmdir "$WORKDIR/overlay-build"
+
+# Optional: shred the CSR (regeneratable if needed)
+shred -uvz "$WORKDIR/controller/controller.csr"
+
+# DO NOT shred any of the following — they are needed for future operations:
+#   - issuing-ca/{issuing-ca.cert.pem, issuing-ca.key.enc, passphrase.txt}
+#   - controller-ca/{controller-ca.cert.pem, controller-ca.key.enc, passphrase.txt}
+#   - controller/{controller.key, controller.cert.pem, controller.bundle.pem, controller.cnf}
+#
+# Specifically:
+#   - Issuing CA artifacts: required for signing new amphorae (Octavia uses them at runtime)
+#   - Controller CA artifacts: required for signing new controller certs (rotation)
+#   - Controller cert/key: required to repopulate the overlay if jumphost is rebuilt
+```
+
+---
+
+## 15. Post-deploy verification
+
+After `runbooks/02-deploy.md` completes (`juju deploy` with the overlay),
+verify Octavia is healthy and the PKI plumbing works.
+
+```bash
+# Octavia charm active/idle
+juju status octavia
+# Expect: octavia/0 active idle
+
+# Octavia services running
+juju ssh octavia/0 -- sudo systemctl is-active octavia-api octavia-worker octavia-housekeeping
+# Expect: 3x "active"
+
+# Confirm PKI files landed on the unit
+juju ssh octavia/0 -- sudo ls -la /etc/octavia/certs/
+# Expect: server_ca.cert.pem, server_ca.key.pem, client_ca.cert.pem, client.cert-and-key.pem
+# (filenames are charm-controlled; presence is what matters)
+
+# Confirm Octavia can use them — verbose health-check from the API
+juju ssh octavia/0 -- sudo journalctl -u octavia-api --since "5 minutes ago" \
+  | grep -iE "(cert|ssl|tls|amphora)" | head -20
+# Expect: no errors related to cert loading
+```
+
+**Smoketest — create a test LB once amphora image is available:**
+
+```bash
+# After `octavia-diskimage-retrofit` has populated Glance with the amphora image,
+# and the LBaaS Mgmt network is wired (these are downstream runbook steps),
+# a test LB creation exercises the full PKI chain:
+
+source ~/admin-openrc
+openstack loadbalancer create --name pki-smoketest --vip-subnet-id
+
+# Watch for amphora spawn (3-5 minutes typical)
+watch -n5 'openstack loadbalancer show pki-smoketest'
+# Wait for: provisioning_status=ACTIVE, operating_status=ONLINE
+
+# Octavia-worker log should show successful amphora handshake (signed by Issuing CA,
+# trusted via Controller CA):
+juju ssh octavia/0 -- sudo journalctl -u octavia-worker --since "10 minutes ago" \
+  | grep -iE "(amphora|cert)" | tail -20
+# Expect: "amphora connection established" or similar
+# Expect: no TLS handshake errors, no cert validation errors
+
+# Cleanup the smoketest LB
+openstack loadbalancer delete pki-smoketest --cascade
+```
+
+If amphora handshake fails with cert errors, the most likely causes are:
+
+1. SAN mismatch — the controller's connection to amphora uses the cert's CN/SAN;
+   verify the controller cert SAN covers all addresses Octavia uses to reach amphorae.
+2. Bundle/key mismatch — `lb-mgmt-controller-cert` bundle should contain BOTH the
+   cert and the matching private key; if they're for different keys, handshake fails.
+3. Encrypted Issuing CA key + wrong passphrase — verify the passphrase string in
+   the overlay matches what was used at generation.
+
+---
+
+## 16. Roosevelt deltas (forward-look)
+
+When this runbook is adapted for Roosevelt bare-metal deploy:
+
+| Aspect | Testcloud (v1) | Roosevelt |
+|---|---|---|
+| Issuing CA root | Self-signed | Intermediate signed by Vault root CA |
+| CA storage | Filesystem on jumphost | Vault PKI engine, encrypted at rest |
+| Controller cert validity | 2 years | 90 days |
+| Rotation | Manual (this runbook re-run) | Automated via Vault + cron + bundle redeploy |
+| Backup | gpg tarball, off-host | Vault's own backup mechanism |
+| Amphora image signing | Out of scope for v1 | Image signed by Vault PKI as well |
+| Procedure file | `runbooks/01a-octavia-pki-generation.md` | New runbook in Roosevelt repo |
+
+The procedure structure (generate Issuing CA → Controller CA → Controller cert →
+encode → overlay → backup → deploy) remains identical. Roosevelt just sources
+the CA root from Vault instead of self-signing.
+
+---
+
+## 17. Rotation/renewal pointer
+
+For testcloud, the 2-year controller cert and 10-year CAs are intentionally
+"set and forget" — they will outlive the cloud at this scale.
+
+If rotation IS needed before testcloud teardown (e.g., a key leak event), the
+re-run procedure is:
+
+1. Generate new Controller cert signed by **existing** Controller CA (re-run
+   sections 8-9 only).
+2. Re-encode the new bundle and regenerate the overlay (sections 10-11) with the
+   new Controller cert; leave all other values unchanged.
+3. `juju config octavia lb-mgmt-controller-cert=` (single-option
+   update; does not require full bundle redeploy).
+4. Octavia services may need a restart: `juju ssh octavia/0 -- sudo systemctl restart octavia-api octavia-worker octavia-housekeeping`.
+5. Existing amphorae will need to reconnect using the new cert; in-flight LBs
+   may briefly drop. This is acceptable for a security-event rotation.
+
+For Roosevelt, this whole procedure is replaced by Vault automated rotation —
+see Roosevelt runbook (TBD).
+
+---
+
+## 18. Change log
+
+| Date | Change | Reference |
+|---|---|---|
+| 2026-05-22 | Document created. Fresh-generate, EC P-384 CAs, EC P-256 controller cert, overlay-file distribution. | Workstream 3a |