@JANeumatrix JANeumatrix 20 hours ago 22 KB Updates

Runbook 01a — Octavia LBaaS PKI generation

Status: Pre-deploy execution. Runs between 01-destroy-model.md and 02-deploy.md.

Numbering rationale: Octavia PKI artifacts must exist on the deploy host before juju deploy is invoked, because the overlay file passed to the deploy carries their values. Placing this runbook between destroy and deploy aligns generation with the "fresh rebuild" framing.

Cross-references:

  • D-007 (Octavia in bundle from day one)
  • Bundle octavia.options PKI material section
  • overlays/octavia-pki.yaml (gitignored — output of this runbook)
  • Workstream 3a decision (2026-05-22): generate fresh, EC P-384 CAs, overlay-file approach

1. Purpose & scope

This runbook generates a complete two-tier PKI for Charmed Octavia's amphora load-balancer trust domain:

  • Issuing CA: Octavia uses this to sign each amphora's server certificate at LB-creation time. Octavia receives the private key and passphrase.
  • Controller CA: the amphorae's trust anchor for connections from the Octavia controller. Octavia receives only the cert (no key needed at runtime); the controller proves its identity with the Controller certificate described next.
  • Controller certificate: signed by the Controller CA and presented by the Octavia controller to each amphora. Cert and key are bundled into a single PEM blob.

Five charm options consume the artifacts (octavia application):

| Charm option | Content | Format |
|---|---|---|
| lb-mgmt-issuing-cacert | Issuing CA certificate | base64-encoded PEM |
| lb-mgmt-issuing-ca-private-key | Issuing CA encrypted private key | base64-encoded PEM (already encrypted with passphrase) |
| lb-mgmt-issuing-ca-key-passphrase | Issuing CA key passphrase | plain string (NOT base64) |
| lb-mgmt-controller-cacert | Controller CA certificate | base64-encoded PEM |
| lb-mgmt-controller-cert | Controller cert + key, concatenated | base64-encoded PEM bundle |

Scope: v1 testcloud (VR0 DC0 Omega Cloud). Roosevelt deltas documented in section 16.

Out of scope: Octavia API TLS (issued by Vault via octavia:certificates relation); rotation procedure (deferred to Roosevelt runbook).


2. Decisions captured

Per workstream 3a sign-off (2026-05-22):

| Decision | Choice | Roosevelt parallel |
|---|---|---|
| Cert provenance | Generate fresh (no Bobcat-backup copy) | Vault PKI engine |
| CA key algorithm | EC P-384 | EC P-384 (Vault root) |
| Controller cert algorithm | EC P-256 | EC P-256 |
| CA validity | 10 years | 5-year intermediate, Vault-rotated |
| Controller cert validity | 2 years | 90 days, auto-rotated |
| Distribution method | Juju overlay file (gitignored) | Vault-injected at deploy |
| Storage path on jumphost | $HOME/octavia-pki/ | Vault PKI mounts |
| Passphrase strength | 32 random bytes, base64-encoded (44 chars) | Vault-generated |

Naming convention:

  • Issuing CA CN: VR0 DC0 Omega Cloud Octavia Issuing CA
  • Controller CA CN: VR0 DC0 Omega Cloud Octavia Controller CA
  • Controller cert CN: octavia-controller.omega.dc0.vr0.cloud.neumatrix.local
  • Controller cert SANs: above CN, plus octavia.omega.dc0.vr0.cloud.neumatrix.local, plus 10.12.4.233 (the Octavia API VIP per workstream 2)
  • Organization (O): Neumatrix

3. Prerequisites

  • Executor is on jumphost vopenstack-jesse as jessea123.
  • openssl version 3.x or later installed (openssl version to confirm).
  • $HOME is writable (snap-confined openstackclients cannot read /tmp; all paths must resolve under $HOME).
  • Git repository openstack-caracal-ipv4 cloned on jumphost at a known path (referred to as $REPO throughout). Set this in the executor's shell:
    export REPO=$HOME/repos/openstack-caracal-ipv4   # adjust to actual clone path
  • Repository is on main branch and clean (cd $REPO && git status shows clean tree).
  • Previous workstream 2 commit has been pushed (bundle has the VIP assignments and active hacluster stack — verify with grep -c "^ vip: 10.12.4." "$REPO/bundle.yaml", expect 12).

4. Pre-flight: gitignore patch (DO THIS FIRST)

Critical: the .gitignore patch goes in BEFORE any private key material exists on disk. This minimizes the race window for an accidental commit.

cd "$REPO"

# Append to .gitignore (idempotent — check if already present first)
grep -q "octavia-pki.yaml" .gitignore || cat >> .gitignore <<'EOF'

# Octavia PKI artifacts — NEVER commit
overlays/octavia-pki.yaml
octavia-pki/
*.key
*.key.enc
passphrase.txt
EOF

# Review the diff
git diff .gitignore

# Commit and push BEFORE generating any keys
git add .gitignore
git commit -m "gitignore: octavia PKI artifacts and overlay (runbook 01a)"
git push origin main

Verify the gitignore is effective:

# This should NOT show overlays/octavia-pki.yaml even as untracked
touch overlays/octavia-pki.yaml
git status --short overlays/  # expect: empty output for octavia-pki.yaml
rm overlays/octavia-pki.yaml

If the test file does show as untracked, STOP and fix the gitignore syntax before generating any secrets.
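As an alternative that avoids creating and deleting a file at the real overlay path, `git check-ignore` can be asked directly whether a path matches an ignore rule. The following is a minimal sketch; the helper name `assert_ignored` is hypothetical and not part of the repository.

```shell
# Hypothetical helper: succeeds only if git would ignore PATH inside REPO_DIR.
# git check-ignore matches on the pathname, so the file does not need to exist.
assert_ignored() {
  local repo_dir=$1 path=$2
  if git -C "$repo_dir" check-ignore -q -- "$path"; then
    echo "ignored: $path"
  else
    echo "NOT ignored: $path" >&2
    return 1
  fi
}

# Example (assumes $REPO is set as in the prerequisites):
#   assert_ignored "$REPO" overlays/octavia-pki.yaml
#   assert_ignored "$REPO" octavia-pki/issuing-ca/issuing-ca.key.enc
```

A non-zero return here carries the same meaning as the test file showing up as untracked: stop before generating secrets.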


5. Workspace setup

WORKDIR=$HOME/octavia-pki
mkdir -p "$WORKDIR"/{issuing-ca,controller-ca,controller,overlay-build}
chmod 700 "$WORKDIR"
cd "$WORKDIR"
echo "Working in: $WORKDIR"

Resulting layout:

$HOME/octavia-pki/
├── issuing-ca/           # passphrase.txt, .key.enc, .cert.pem
├── controller-ca/        # passphrase.txt, .key.enc, .cert.pem
├── controller/           # .key, .csr, .cert.pem, .bundle.pem, .cnf
└── overlay-build/        # base64 intermediates: written in section 10, consumed in section 11

6. Generate Issuing CA

EC P-384 key encrypted with random 32-byte passphrase. Self-signed cert, 10y validity.

cd "$WORKDIR/issuing-ca"

# Generate passphrase (no trailing newline — required for clean YAML embedding)
openssl rand -base64 32 | tr -d '\n' > passphrase.txt
chmod 600 passphrase.txt

# Sanity-check
test $(wc -c < passphrase.txt) -eq 44 || { echo "ERROR: passphrase length wrong"; exit 1; }

# Generate EC P-384 private key, encrypted with passphrase
openssl genpkey -algorithm EC \
  -pkeyopt ec_paramgen_curve:P-384 \
  -aes-256-cbc \
  -pass file:passphrase.txt \
  -out issuing-ca.key.enc
chmod 600 issuing-ca.key.enc

# Self-sign cert (10 years, SHA-384)
openssl req -new -x509 -sha384 \
  -key issuing-ca.key.enc \
  -passin file:passphrase.txt \
  -days 3650 \
  -subj "/CN=VR0 DC0 Omega Cloud Octavia Issuing CA/O=Neumatrix" \
  -out issuing-ca.cert.pem

# Verify
openssl x509 -in issuing-ca.cert.pem -noout -dates -subject
openssl verify -CAfile issuing-ca.cert.pem issuing-ca.cert.pem
# Expect: issuing-ca.cert.pem: OK

ls -la
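The 44-character sanity check follows from how base64 groups its output: every 3 input bytes become 4 output characters, with the last group padded to a full 4. A small sketch of that arithmetic (the function name `b64_len` is illustrative only):

```shell
# Length of the base64 encoding of N bytes: 4 * ceil(N / 3).
b64_len() {
  echo $(( ($1 + 2) / 3 * 4 ))
}

b64_len 32   # prints 44, matching the passphrase length check
```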

7. Generate Controller CA

Identical pattern; different CN.

cd "$WORKDIR/controller-ca"

openssl rand -base64 32 | tr -d '\n' > passphrase.txt
chmod 600 passphrase.txt
test $(wc -c < passphrase.txt) -eq 44 || { echo "ERROR: passphrase length wrong"; exit 1; }

openssl genpkey -algorithm EC \
  -pkeyopt ec_paramgen_curve:P-384 \
  -aes-256-cbc \
  -pass file:passphrase.txt \
  -out controller-ca.key.enc
chmod 600 controller-ca.key.enc

openssl req -new -x509 -sha384 \
  -key controller-ca.key.enc \
  -passin file:passphrase.txt \
  -days 3650 \
  -subj "/CN=VR0 DC0 Omega Cloud Octavia Controller CA/O=Neumatrix" \
  -out controller-ca.cert.pem

openssl x509 -in controller-ca.cert.pem -noout -dates -subject
openssl verify -CAfile controller-ca.cert.pem controller-ca.cert.pem
# Expect: controller-ca.cert.pem: OK

Why Controller CA's key is encrypted even though Octavia never uses it: The Controller CA key is needed for future rotations of the controller cert. Encrypting it (with its own passphrase, separate from Issuing CA's) is defense in depth — if the jumphost is compromised, the key still requires the passphrase to be useful for forging controller certs.
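Since the decisions table calls for EC P-384 on both CAs, it may be worth confirming the curve on each generated certificate. This is a sketch under the assumption that `openssl x509 -text` reports the curve on an "ASN1 OID:" line, as current OpenSSL releases do for EC keys; `cert_curve` is a hypothetical helper name.

```shell
# Hypothetical helper: print the named curve of an EC certificate's public key,
# taken from the "ASN1 OID:" line of `openssl x509 -text`.
cert_curve() {
  openssl x509 -in "$1" -noout -text | sed -n 's/^ *ASN1 OID: *//p'
}

# Example (paths follow the workspace layout in section 5):
#   cert_curve "$WORKDIR/issuing-ca/issuing-ca.cert.pem"
#   cert_curve "$WORKDIR/controller-ca/controller-ca.cert.pem"
# OpenSSL names the P-384 curve secp384r1.
```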


8. Generate Controller certificate

EC P-256 key (no encryption — Octavia must read it at startup), CSR with SAN extensions, signed by Controller CA, 2y validity.

cd "$WORKDIR/controller"

# Generate unencrypted EC P-256 key
openssl genpkey -algorithm EC \
  -pkeyopt ec_paramgen_curve:P-256 \
  -out controller.key
chmod 600 controller.key

# CSR config with SAN extensions
cat > controller.cnf <<'EOF'
[req]
distinguished_name = req_distinguished_name
req_extensions = v3_req
prompt = no

[req_distinguished_name]
CN = octavia-controller.omega.dc0.vr0.cloud.neumatrix.local
O = Neumatrix

[v3_req]
# keyEncipherment applies to RSA key transport only; an EC key needs digitalSignature
keyUsage = critical, digitalSignature
extendedKeyUsage = clientAuth, serverAuth
subjectAltName = @alt_names

[alt_names]
DNS.1 = octavia-controller.omega.dc0.vr0.cloud.neumatrix.local
DNS.2 = octavia.omega.dc0.vr0.cloud.neumatrix.local
IP.1 = 10.12.4.233
EOF

# Generate CSR
openssl req -new -sha256 \
  -key controller.key \
  -config controller.cnf \
  -out controller.csr

# Sign with Controller CA (2 years)
openssl x509 -req -sha256 \
  -in controller.csr \
  -CA "$WORKDIR/controller-ca/controller-ca.cert.pem" \
  -CAkey "$WORKDIR/controller-ca/controller-ca.key.enc" \
  -passin file:"$WORKDIR/controller-ca/passphrase.txt" \
  -CAcreateserial \
  -days 730 \
  -extfile controller.cnf \
  -extensions v3_req \
  -out controller.cert.pem

# Bundle cert + key (the lb-mgmt-controller-cert option expects both in one PEM)
cat controller.cert.pem controller.key > controller.bundle.pem
chmod 600 controller.bundle.pem

Verify the chain and SAN:

# Chain verifies
openssl verify -CAfile "$WORKDIR/controller-ca/controller-ca.cert.pem" controller.cert.pem
# Expect: controller.cert.pem: OK

# SAN extensions present
openssl x509 -in controller.cert.pem -noout -ext subjectAltName
# Expect:
#     DNS:octavia-controller.omega.dc0.vr0.cloud.neumatrix.local,
#     DNS:octavia.omega.dc0.vr0.cloud.neumatrix.local,
#     IP Address:10.12.4.233

# Validity
openssl x509 -in controller.cert.pem -noout -dates
# Expect: notAfter ~2 years from today

# Bundle integrity (cert + key match). Scratch files stay in the working
# directory, consistent with the prerequisite that all paths resolve under $HOME.
openssl x509 -in controller.bundle.pem -noout -pubkey > cert.pub
openssl pkey -in controller.bundle.pem -pubout > key.pub
diff cert.pub key.pub && echo "Bundle cert/key match"
rm cert.pub key.pub
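The SAN check above relies on reading the output by eye. A scripted variant might look like the sketch below, which assumes the `-ext` option of `openssl x509` (available in OpenSSL 1.1.1 and later) and matches entries exactly as OpenSSL prints them; `cert_has_san` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if CERT lists ENTRY exactly in its subjectAltName.
# ENTRY is written the way openssl prints it, e.g. "DNS:host" or "IP Address:1.2.3.4".
cert_has_san() {
  openssl x509 -in "$1" -noout -ext subjectAltName \
    | tr ',' '\n' \
    | sed 's/^[[:space:]]*//' \
    | grep -Fxq -- "$2"
}

# Example (run from $WORKDIR/controller):
#   cert_has_san controller.cert.pem "DNS:octavia.omega.dc0.vr0.cloud.neumatrix.local"
#   cert_has_san controller.cert.pem "IP Address:10.12.4.233"
```

Matching whole entries rather than substrings avoids a false pass when one hostname is a prefix of another.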

9. Final chain verification

A standalone block to confirm the full chain is sound before consuming for Octavia:

cd "$WORKDIR"

echo "=== Issuing CA ==="
openssl x509 -in issuing-ca/issuing-ca.cert.pem -noout -subject -dates
openssl verify -CAfile issuing-ca/issuing-ca.cert.pem issuing-ca/issuing-ca.cert.pem

echo ""
echo "=== Controller CA ==="
openssl x509 -in controller-ca/controller-ca.cert.pem -noout -subject -dates
openssl verify -CAfile controller-ca/controller-ca.cert.pem controller-ca/controller-ca.cert.pem

echo ""
echo "=== Controller cert ==="
openssl x509 -in controller/controller.cert.pem -noout -subject -dates
openssl verify -CAfile controller-ca/controller-ca.cert.pem controller/controller.cert.pem

All three "verify" lines must show : OK. If any do not, STOP and investigate before proceeding.


10. Base64-encode artifacts

Each base64 file is a single line (no wrapping); each becomes one YAML value.

cd "$WORKDIR/overlay-build"

# Issuing CA cert (base64)
base64 -w0 "$WORKDIR/issuing-ca/issuing-ca.cert.pem" > issuing-cacert.b64

# Issuing CA private key (already encrypted PEM → base64)
base64 -w0 "$WORKDIR/issuing-ca/issuing-ca.key.enc" > issuing-ca-private-key.b64

# Controller CA cert
base64 -w0 "$WORKDIR/controller-ca/controller-ca.cert.pem" > controller-cacert.b64

# Controller cert + key bundle
base64 -w0 "$WORKDIR/controller/controller.bundle.pem" > controller-cert.b64

# Sanity-check sizes (expect 500-2000 chars each)
wc -c *.b64
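Beyond the size check, each `.b64` file can be confirmed to be a single unwrapped line that decodes back to its source. A minimal sketch, assuming GNU coreutils `base64` and `cmp`; the helper name `check_b64_roundtrip` is hypothetical.

```shell
# Hypothetical helper: succeeds if B64_FILE is one unwrapped line that decodes
# back to exactly the bytes of SRC_FILE.
check_b64_roundtrip() {
  local src=$1 b64=$2
  # base64 -w0 writes no newline at all, so the newline count must be 0.
  [ "$(wc -l < "$b64")" -eq 0 ] || { echo "wrapped or newline-terminated: $b64" >&2; return 1; }
  base64 -d "$b64" | cmp -s - "$src" || { echo "decode mismatch: $b64" >&2; return 1; }
}

# Example (run from $WORKDIR/overlay-build):
#   check_b64_roundtrip "$WORKDIR/issuing-ca/issuing-ca.cert.pem" issuing-cacert.b64
#   check_b64_roundtrip "$WORKDIR/controller/controller.bundle.pem" controller-cert.b64
```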

11. Assemble the overlay file

# Read each artifact into shell variables
ISSUING_CACERT=$(cat "$WORKDIR/overlay-build/issuing-cacert.b64")
ISSUING_CA_KEY=$(cat "$WORKDIR/overlay-build/issuing-ca-private-key.b64")
ISSUING_CA_PASS=$(cat "$WORKDIR/issuing-ca/passphrase.txt")
CONTROLLER_CACERT=$(cat "$WORKDIR/overlay-build/controller-cacert.b64")
CONTROLLER_CERT=$(cat "$WORKDIR/overlay-build/controller-cert.b64")

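# Assemble overlay (note: the passphrase is YAML-quoted; the cert blobs are not,
# because base64 of PEM text uses only A-Z, a-z, 0-9, "+", "/" and "=", which are
# safe in a plain YAML scalar that starts with a letter)
mkdir -p "$REPO/overlays"
# Pre-create the file with mode 600 so it is never world-readable, even briefly
install -m 600 /dev/null "$REPO/overlays/octavia-pki.yaml"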
cat > "$REPO/overlays/octavia-pki.yaml" <<EOF
# Octavia LBaaS PKI overlay — SENSITIVE — NEVER COMMIT
# Generated: $(date -u +%Y-%m-%dT%H:%M:%SZ)
# Source: runbooks/01a-octavia-pki-generation.md
# Issuing CA, Controller CA, Controller cert all generated fresh per workstream 3a.
#
# This file is gitignored. If you see it staged or committed, .gitignore is broken.

applications:
  octavia:
    options:
      lb-mgmt-issuing-cacert: ${ISSUING_CACERT}
      lb-mgmt-issuing-ca-private-key: ${ISSUING_CA_KEY}
      lb-mgmt-issuing-ca-key-passphrase: "${ISSUING_CA_PASS}"
      lb-mgmt-controller-cacert: ${CONTROLLER_CACERT}
      lb-mgmt-controller-cert: ${CONTROLLER_CERT}
EOF

chmod 600 "$REPO/overlays/octavia-pki.yaml"

# Unset the shell variables (they held key material)
unset ISSUING_CACERT ISSUING_CA_KEY ISSUING_CA_PASS CONTROLLER_CACERT CONTROLLER_CERT

Validate the overlay parses as YAML:

python3 -c "import yaml; d = yaml.safe_load(open('$REPO/overlays/octavia-pki.yaml')); \
  o = d['applications']['octavia']['options']; \
  print('Keys present:', sorted(o.keys())); \
  print('All values non-empty:', all(v for v in o.values()))"
# Expect: 5 keys listed; "All values non-empty: True"
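The parse check confirms the five keys exist but not that the four blob options actually hold base64-encoded PEM. A rough sketch of that extra check follows. It assumes each option sits on a single `key: value` line, as the heredoc in this section produces, and the helper names `overlay_value` and `overlay_option_is_pem` are hypothetical.

```shell
# Hypothetical helper: print the value of one "key: value" option line in the overlay.
overlay_value() {
  awk -v k="$2:" '$1 == k { print $2 }' "$1"
}

# Hypothetical helper: succeeds if the option decodes from base64 to text
# whose first line is a PEM "-----BEGIN ..." header.
overlay_option_is_pem() {
  overlay_value "$1" "$2" | base64 -d 2>/dev/null | head -n 1 | grep -q '^-----BEGIN '
}

# Example:
#   for opt in lb-mgmt-issuing-cacert lb-mgmt-issuing-ca-private-key \
#              lb-mgmt-controller-cacert lb-mgmt-controller-cert; do
#     overlay_option_is_pem "$REPO/overlays/octavia-pki.yaml" "$opt" || echo "BAD: $opt"
#   done
```

The passphrase option is deliberately left out of the loop, since it is a plain string rather than encoded PEM.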

Confirm gitignore is doing its job:

cd "$REPO"
git status --short
# overlays/octavia-pki.yaml MUST NOT appear here
# If it does — STOP, shred the file, fix .gitignore, regenerate

12. Bundle.yaml housekeeping

The octavia application in bundle.yaml still has commented placeholder lines for the 5 PKI options plus the TODO(octavia-cert): block. These should be removed and replaced with a pointer to the overlay.

Replace this block in bundle.yaml (inside octavia.options:):

      # ----- PKI material (4 cert blobs + passphrase) ---------------------
      # TODO(octavia-cert): inline values BEFORE deploy. Two sources:
      #   (a) Copy from Bobcat backup at:
      #       ~/backups/pre-caracal-destroy-2026-05-22/bundle-pre-destroy.yaml
      #       (lines ~230-234; CA valid until 2027-05-15 — adequate for testcloud)
      #   (b) Generate fresh via the (yet-to-be-written) octavia-cert-runbook
      #       — required for Roosevelt deploy
      # lb-mgmt-controller-cacert: <base64 PEM>
      # lb-mgmt-controller-cert: <base64 PEM cert + key>
      # lb-mgmt-issuing-ca-key-passphrase: <passphrase string>
      # lb-mgmt-issuing-ca-private-key: <base64 encrypted PEM>
      # lb-mgmt-issuing-cacert: <base64 PEM>

With this block:

      # ----- PKI material -------------------------------------------------
      # 5 lb-mgmt-* options are supplied via overlays/octavia-pki.yaml
      # (gitignored). Generated per runbooks/01a-octavia-pki-generation.md.
      # Deploy with:
      #   juju deploy ./bundle.yaml \
      #     --overlay overlays/vr0-dc0-testcloud.yaml \
      #     --overlay overlays/octavia-pki.yaml

Commit this bundle change separately from the overlay generation work:

cd "$REPO"
git diff bundle.yaml
git add bundle.yaml
git commit -m "bundle: octavia PKI moves to overlay (runbook 01a)

Remove inline placeholders + TODO(octavia-cert) block. PKI values now
supplied via overlays/octavia-pki.yaml (gitignored), generated per
runbooks/01a-octavia-pki-generation.md. Decision per workstream 3a
(2026-05-22): industry-best-practice secret handling on testcloud
to rehearse Roosevelt's Vault-PKI-backed posture."
git push origin main

13. Sensitive-file backup

The Issuing CA private key + its passphrase are the crown jewels of the LB trust domain. Loss → cannot sign new amphora certs (LBs gradually break). Exposure → attacker can forge amphora identities and intercept tenant LB traffic.

Minimum backup for testcloud:

cd "$HOME"
BACKUP_NAME="octavia-pki-backup-$(date +%Y%m%d-%H%M%S).tar.gz"

tar -czf "$BACKUP_NAME" -C "$HOME" octavia-pki/

# Encrypt with strong symmetric cipher
gpg --symmetric --cipher-algo AES256 --output "${BACKUP_NAME}.gpg" "$BACKUP_NAME"

# Shred the unencrypted tar
shred -uvz "$BACKUP_NAME"

ls -la "${BACKUP_NAME}.gpg"

Move ${BACKUP_NAME}.gpg off-host (your decision — admin workstation encrypted drive, password-manager attachment, dedicated secrets vault, etc.). Do NOT leave it sitting in $HOME on the jumphost long-term — that's a single point of compromise.
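Before the archive leaves the jumphost, recording a checksum gives a way to confirm the off-host copy arrived intact. The sketch below assumes GNU coreutils `sha256sum`; the helper names are hypothetical. A checksum of the encrypted archive detects transfer corruption only and is not a substitute for the gpg passphrase.

```shell
# Hypothetical helper: write FILE.sha256 next to FILE.
record_checksum() {
  ( cd "$(dirname "$1")" && sha256sum "$(basename "$1")" > "$(basename "$1").sha256" )
}

# Hypothetical helper: re-check FILE against FILE.sha256; non-zero on mismatch.
verify_checksum() {
  ( cd "$(dirname "$1")" && sha256sum -c --quiet "$(basename "$1").sha256" )
}

# Example:
#   record_checksum "$HOME/${BACKUP_NAME}.gpg"     # on the jumphost, before the move
#   verify_checksum "/path/to/${BACKUP_NAME}.gpg"  # at the destination, with the .sha256 copied alongside
```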

Roosevelt note: Vault PKI engine stores all of this — no manual backup required; Vault's own backup mechanism covers it. The procedure above is testcloud-only.


14. Cleanup of intermediates

After successful deploy + verification (section 15), shred files that are not needed for future rotation:

# Optional: shred the base64 intermediates (regeneratable from PEM sources)
shred -uvz "$WORKDIR/overlay-build/"*.b64
rmdir "$WORKDIR/overlay-build"

# Optional: shred the CSR (regeneratable if needed)
shred -uvz "$WORKDIR/controller/controller.csr"

# DO NOT shred any of the following — they are needed for future operations:
#   - issuing-ca/{issuing-ca.cert.pem, issuing-ca.key.enc, passphrase.txt}
#   - controller-ca/{controller-ca.cert.pem, controller-ca.key.enc, passphrase.txt}
#   - controller/{controller.key, controller.cert.pem, controller.bundle.pem, controller.cnf}
#
# Specifically:
#   - Issuing CA artifacts: required for signing new amphora certificates (Octavia uses them at runtime)
#   - Controller CA artifacts: required for signing new controller certs (rotation)
#   - Controller cert/key: required to repopulate the overlay if jumphost is rebuilt

15. Post-deploy verification

After runbooks/02-deploy.md completes (juju deploy with the overlay), verify Octavia is healthy and the PKI plumbing works.

# Octavia charm active/idle
juju status octavia
# Expect: octavia/0 active idle

# Octavia services running
juju ssh octavia/0 -- sudo systemctl is-active octavia-api octavia-worker octavia-housekeeping
# Expect: 3x "active"

# Confirm PKI files landed on the unit
juju ssh octavia/0 -- sudo ls -la /etc/octavia/certs/
# Expect: server_ca.cert.pem, server_ca.key.pem, client_ca.cert.pem, client.cert-and-key.pem
# (filenames are charm-controlled; presence is what matters)

# Confirm Octavia can use them — verbose health-check from the API
juju ssh octavia/0 -- sudo journalctl -u octavia-api --since "5 minutes ago" \
  | grep -iE "(cert|ssl|tls|amphora)" | head -20
# Expect: no errors related to cert loading

Smoketest — create a test LB once amphora image is available:

# After `octavia-diskimage-retrofit` has populated Glance with the amphora image,
# and the LBaaS Mgmt network is wired (these are downstream runbook steps),
# a test LB creation exercises the full PKI chain:

source ~/admin-openrc
openstack loadbalancer create --name pki-smoketest --vip-subnet-id <provider-subnet>

# Watch for amphora spawn (3-5 minutes typical)
watch -n5 'openstack loadbalancer show pki-smoketest'
# Wait for: provisioning_status=ACTIVE, operating_status=ONLINE

# The octavia-worker log should show a successful amphora handshake (the amphora's
# cert is signed by the Issuing CA; the amphora trusts the controller via the Controller CA):
juju ssh octavia/0 -- sudo journalctl -u octavia-worker --since "10 minutes ago" \
  | grep -iE "(amphora|cert)" | tail -20
# Expect: "amphora <UUID> connection established" or similar
# Expect: no TLS handshake errors, no cert validation errors

# Cleanup the smoketest LB
openstack loadbalancer delete pki-smoketest --cascade

If amphora handshake fails with cert errors, the most likely causes are:

  1. SAN mismatch — the controller's connection to amphora uses the cert's CN/SAN; verify the controller cert SAN covers all addresses Octavia uses to reach amphorae.
  2. Bundle/key mismatch — lb-mgmt-controller-cert bundle should contain BOTH the cert and the matching private key; if they're for different keys, handshake fails.
  3. Encrypted Issuing CA key + wrong passphrase — verify the passphrase string in the overlay matches what was used at generation.
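Causes 2 and 3 can be checked locally on the jumphost before digging through Octavia logs. A minimal sketch using `openssl pkey` and `openssl x509`; the helper names are hypothetical, and the file paths in the examples follow the layout from section 5.

```shell
# Hypothetical helper: succeeds if PASSFILE decrypts the encrypted private key KEYFILE.
passphrase_unlocks_key() {
  openssl pkey -in "$1" -passin "file:$2" -noout 2>/dev/null
}

# Hypothetical helper: succeeds if the certificate and the private key inside
# one PEM bundle share the same public key.
bundle_key_matches_cert() {
  local cert_pub key_pub
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey 2>/dev/null) || return 1
  key_pub=$(openssl pkey -in "$1" -pubout 2>/dev/null) || return 1
  [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}

# Example:
#   passphrase_unlocks_key "$WORKDIR/issuing-ca/issuing-ca.key.enc" \
#                          "$WORKDIR/issuing-ca/passphrase.txt" || echo "passphrase mismatch"
#   bundle_key_matches_cert "$WORKDIR/controller/controller.bundle.pem" || echo "bundle mismatch"
```

These checks cover the local files only; if they pass, the next step is to compare them against the values actually present in the overlay.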

16. Roosevelt deltas (forward-look)

When this runbook is adapted for Roosevelt bare-metal deploy:

| Aspect | Testcloud (v1) | Roosevelt |
|---|---|---|
| Issuing CA root | Self-signed | Intermediate signed by Vault root CA |
| CA storage | Filesystem on jumphost | Vault PKI engine, encrypted at rest |
| Controller cert validity | 2 years | 90 days |
| Rotation | Manual (this runbook re-run) | Automated via Vault + cron + bundle redeploy |
| Backup | gpg tarball, off-host | Vault's own backup mechanism |
| Amphora image signing | Out of scope for v1 | Image signed by Vault PKI as well |
| Procedure file | runbooks/01a-octavia-pki-generation.md | New runbook in Roosevelt repo |

The procedure structure (generate Issuing CA → Controller CA → Controller cert → encode → overlay → backup → deploy) remains identical. Roosevelt just sources the CA root from Vault instead of self-signing.


17. Rotation/renewal pointer

For testcloud, the 2-year controller cert and 10-year CAs are intentionally "set and forget" — they will outlive the cloud at this scale.
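If the testcloud does end up living longer than planned, remaining validity can be checked with the `-checkend` option of `openssl x509`, which exits 0 when the certificate will not expire within the given number of seconds. A small sketch; `cert_valid_for_days` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if CERT is still valid DAYS days from now.
cert_valid_for_days() {
  openssl x509 -in "$1" -noout -checkend "$(( $2 * 86400 ))" >/dev/null
}

# Example: warn if the controller cert has fewer than 90 days left.
#   cert_valid_for_days "$HOME/octavia-pki/controller/controller.cert.pem" 90 \
#     || echo "controller cert expires within 90 days"
```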

If rotation IS needed before testcloud teardown (e.g., a key leak event), the re-run procedure is:

  1. Generate new Controller cert signed by existing Controller CA (re-run sections 8-9 only).
  2. Re-encode the new bundle and regenerate the overlay (sections 10-11) with the new Controller cert; leave all other values unchanged.
  3. juju config octavia lb-mgmt-controller-cert=<new-base64> (single-option update; does not require full bundle redeploy).
  4. Octavia services may need a restart: juju ssh octavia/0 -- sudo systemctl restart octavia-api octavia-worker octavia-housekeeping.
  5. Existing amphorae will need to reconnect using the new cert; in-flight LBs may briefly drop. This is acceptable for a security-event rotation.

For Roosevelt, this whole procedure is replaced by Vault automated rotation — see Roosevelt runbook (TBD).


18. Change log

| Date | Change | Reference |
|---|---|---|
| 2026-05-22 | Document created. Fresh-generate, EC P-384 CAs, EC P-256 controller cert, overlay-file distribution. | Workstream 3a |