After vault's cert cascade (phase-02), confirm the cloud settled to active/idle (except the expected post-deploy blocks), build the IP-only admin-openrc, verify API reachability, and repoint the external Horizon reverse proxy.
Decisions: B5 (IP-only endpoints; no FQDN), D-021 (octavia stays BLOCKED awaiting configure-resources -- expected, cleared in phase-05), D-044 (Horizon Secure-cookie override on the plain-HTTP proxy leg; Step 3.3, PER-REBUILD), D-045 / DOCFIX-031 (haproxy backends confirmed LOADED via a functional sweep, NOT juju status; Step 3.1). Troubleshooting: appendix-A -- DOCFIX-021 (action human-output corrupts captured artifacts), DOCFIX-018 (IP-only OS_AUTH_URL), DOCFIX-022 (admin project discovered, not hardcoded), D-045/DOCFIX-031 (haproxy plaintext-check-vs-SSL backend DOWN), nginx reverse-proxy lessons.
ENV(keystone-vip) 10.12.4.50 (keystone PUBLIC endpoint = provider VIP; verify vs bundle)ENV(admin-domain) admin_domain (charmed-keystone admin user + project domain)ENV(dashboard-vip) 10.12.4.58 (Horizon provider VIP; was .234 pre-R14)# RUN: jumphost -- vopenstack-jesse as jessea123; juju + openstack + openssl.# RUN: jumphost The cascade here is NARROW (mysql bootstrapped before vault init, so only the Vault consumers clear: ovn-central x3, ovn-chassis x3, ovn-chassis-octavia, neutron-api-plugin-ovn, barbican-vault). Watch, then walk units AND subordinates.
juju status --color --watch 30s -m openstack # Ctrl-C once settled
Acceptance walk (counts non-active/idle across units + subordinates):
juju status -m openstack --format=yaml | python3 -c "
import yaml,sys
d=yaml.safe_load(sys.stdin); apps=d.get('applications',{}); bad=[]
def chk(n,u):
ws=(u.get('workload-status') or {}).get('current',''); js=(u.get('juju-status') or {}).get('current','')
msg=(u.get('workload-status') or {}).get('message','')
if ws!='active' or js!='idle': bad.append('%s: workload=%s juju=%s msg=%s'%(n,ws,js,msg))
for app,info in apps.items():
for un,ud in (info.get('units') or {}).items():
chk(un,ud)
for sn,sd in (ud.get('subordinates') or {}).items(): chk(sn,sd)
print('Non-active/idle units: %d'%len(bad))
for b in bad: print(' '+b)
"
GATE: expected non-active/idle = 1 (octavia/0 BLOCKED "Awaiting configure-resources", the D-021 next step) or briefly 2 (+ glance-simplestreams-sync, normal pre-run). Any TLS consumer (the five above) persisting waiting/error past ~15 min is the concern -- STOP and read its log + relations (do NOT assume TLS; a prior stall was a MySQL 1045 desync):
juju status --relations -m openstack ovn-central ovn-chassis ovn-chassis-octavia neutron-api-plugin-ovn barbican-vault # juju ssh -m openstack <unit> -- 'sudo tail -120 /var/log/juju/unit-<unit-dashed>.log' </dev/null
# RUN: jumphost The acceptance walk above gates on juju active/idle. That is NOT sufficient: a unit can be active/idle while a charm-rendered haproxy backend is silently DOWN -- observed 2026-06-12, nova-cc nova-api down ~3.2 days with juju green (root cause D-045: haproxy not reloaded after the cert cascade -> plaintext checks vs the SSL backend). Probe haproxy's own verdict on every unit:
( {
echo "=== POST-TLS GATE: haproxy backend health sweep across all units ==="
for unit in $(juju status -m openstack --format=json | python3 -c 'import json,sys; d=json.load(sys.stdin); [print(u) for a in d.get("applications",{}).values() for u in (a.get("units") or {})]'); do
juju ssh -m openstack "$unit" -- "test -S /var/run/haproxy/admin.sock || exit 0; sudo python3 -c 'import socket;s=socket.socket(socket.AF_UNIX);s.connect(\"/var/run/haproxy/admin.sock\");s.sendall(b\"show stat\n\");print(s.makefile().read())' | grep -vE 'FRONTEND|BACKEND' | grep ',DOWN,'" </dev/null 2>/dev/null | sed "s|^|[$unit] DOWN: |"
done
echo "=== sweep complete -- no DOWN lines above means every haproxy backend is UP ==="
} )
GATE: zero [unit] DOWN: lines. On a DOWN line (check token L7STS/400 == plaintext-vs-SSL), remediate the flagged unit (set U, then validate-and-reload):
U=nova-cloud-controller/0 juju ssh -m openstack "$U" -- 'sudo haproxy -c -f /etc/haproxy/haproxy.cfg' </dev/null # gate: must say valid juju ssh -m openstack "$U" -- 'sudo systemctl reload haproxy' </dev/null # graceful master-worker
Re-run the sweep until clean. (Signature confirm if needed: plaintext curl http://SVC-IP:876x/ returns 400, TLS curl -k https://SVC-IP:876x/ returns 200 -- the transport is the difference.)
# RUN: jumphost Keystone PUBLIC = the provider VIP IP over HTTPS with the vault CA (no FQDN, no /etc/hosts -- B5). This canonical block folds in three fixes: the CA is pulled via --format json + jq because the action's human output wraps the PEM in an INDENTED YAML block that is not valid PEM (appendix-A: DOCFIX-021); the OS_AUTH_URL is the VIP IP (DOCFIX-018); and the admin project is DISCOVERED by a scope-test loop rather than hardcoded, because the scoping project name varies by charm rev (DOCFIX-022 -- the cause of a prior HTTP 401). ( set -e ) keeps OS_PASSWORD inside the subshell and aborts cleanly on any failure.
KEYSTONE_VIP=10.12.4.50 # keystone PUBLIC endpoint = provider VIP (verify vs bundle on rebuild)
ADMIN_DOMAIN=admin_domain # charmed-keystone admin user + project domain
PROJECT_CANDIDATES="admin admin_domain" # tried in order; first that SCOPES wins (DOCFIX-022 variance)
CA="$HOME/vault-init/vault-ca-root.pem"
RC="$HOME/admin-openrc"
( set -e
mkdir -p "$HOME/vault-init"
# 1. Vault root CA -> file (JSON extract; DOCFIX-021 -- human output indents the PEM)
juju run vault/leader get-root-ca -m openstack --format json \
| jq -r '[.. | strings | select(test("-----BEGIN CERTIFICATE-----"))][0]' > "$CA"
openssl x509 -in "$CA" -noout -subject -dates
# 2. Admin password -> var (JSON extract, not human output)
ADMIN_PASS=$(juju run keystone/leader get-admin-password -m openstack --format json | python3 -c "
import json,sys
d=json.load(sys.stdin)
def f(o):
if isinstance(o,dict):
for k in ('admin-password','password','Stdout'):
if k in o and o[k]: return str(o[k]).strip()
for v in o.values():
r=f(v)
if r: return r
elif isinstance(o,list):
for v in o:
r=f(v)
if r: return r
return ''
print(f(d))
")
[ -n "$ADMIN_PASS" ] || { echo "FATAL: password extract failed"; exit 1; }
# 3. PROJECT LOOKUP: first candidate that issues a SCOPED token wins (DOCFIX-022)
export OS_AUTH_URL="https://${KEYSTONE_VIP}:5000/v3" OS_USERNAME=admin OS_PASSWORD="$ADMIN_PASS"
export OS_USER_DOMAIN_NAME="$ADMIN_DOMAIN" OS_PROJECT_DOMAIN_NAME="$ADMIN_DOMAIN"
export OS_IDENTITY_API_VERSION=3 OS_REGION_NAME=RegionOne OS_CACERT="$CA"
ADMIN_PROJECT=""
for P in $PROJECT_CANDIDATES; do
if OS_PROJECT_NAME="$P" openstack token issue >/dev/null 2>&1; then ADMIN_PROJECT="$P"; break; fi
done
[ -n "$ADMIN_PROJECT" ] || { echo "FATAL: no candidate project scoped (tried: $PROJECT_CANDIDATES)"; exit 1; }
echo "[OK] admin project = $ADMIN_PROJECT ; password len ${#ADMIN_PASS}"
# 4. Write ~/admin-openrc (backs up any existing one first)
[ -f "$RC" ] && mv "$RC" "$RC.pre-$(date -u +%Y%m%dT%H%M%SZ)"
cat > "$RC" <<EOF
export OS_AUTH_URL=https://${KEYSTONE_VIP}:5000/v3
export OS_USERNAME=admin
export OS_PASSWORD='$ADMIN_PASS'
export OS_PROJECT_NAME=$ADMIN_PROJECT
export OS_USER_DOMAIN_NAME=$ADMIN_DOMAIN
export OS_PROJECT_DOMAIN_NAME=$ADMIN_DOMAIN
export OS_IDENTITY_API_VERSION=3
export OS_REGION_NAME=RegionOne
export OS_CACERT=$CA
EOF
chmod 600 "$RC"
)
# 5. Verify from the written file (password stayed inside the subshell above)
( source "$RC"; echo "auth -> $OS_AUTH_URL project=$OS_PROJECT_NAME"; openstack token issue 2>&1 | head -6 )
( source "$RC"; openstack endpoint list -f value -c "Service Name" -c Interface -c URL 2>&1 | sort )
GATE: token issue returns a SCOPED token; endpoint list is IP-only across all services (public on the provider VIP .5x, internal+admin on the metal VIP .8.5x, keystone admin on :35357). Two non-blocking notes for later: s3/swift is registered on the radosgw VIP .60:443 (re-check vs the radosgw :80 listener during any Swift/S3 smoke); the gss image-stream is HTTP on metal 10.12.8.172.
# RUN: operator (outside the Juju model) + jumphost Horizon is fronted by an operator-managed nginx reverse proxy. On each rebuild / VIP relocation: (1) repoint the upstream to the CURRENT dashboard provider VIP (now https://10.12.4.58, was .234 pre-R14), and (2) reapply the Horizon Secure-cookie override (DOCFIX-030 / D-044, PER-REBUILD -- below). Two interplays:
proxy_set_header Host $http_host); rewriting it to the VIP would emit redirects corporate clients may not route.juju-ffe3b8-2-lxd-2) alongside IP SANs. nginx upstream verification is DNS-only (X509_check_host), so the proxy_pass IP NEVER matches -- proxy_ssl_verify on requires proxy_ssl_name set to the cert's DNS SAN. (B5 is IP-only for ENDPOINTS, not certs.)Proxy topology (as-executed 2026-06-12 -- confirm/refresh per site):
nginx 10.12.4.7 (Ubuntu 24.04, nginx 1.24.0 native/systemd); also fronts MAAS (listen 80 -> 10.12.4.10:5240). Horizon vhost /etc/nginx/sites-available/openstack (symlinked into sites-enabled), listen 81; corporate clients reach it via 10.17.11.246:81.As-executed change set (gate every edit -- sed -i exits 0 on zero matches, so grep-assert the expected line after any mutation):
# RUN: jumphost -- ship the vault root CA to the proxy scp ~/vault-init/vault-ca-root.pem jessea123@10.12.4.7:/tmp/
# RUN: operator ON 10.12.4.7 -- install CA, back up + edit the Horizon vhost, validate, restart. sudo install -o root -g root -m 644 /tmp/vault-ca-root.pem /etc/nginx/vault-ca-root.pem && rm -f /tmp/vault-ca-root.pem sudo cp -a /etc/nginx/sites-available/openstack "/etc/nginx/sites-available/openstack.bak-$(date -u +%Y%m%dT%H%M%SZ)" # Set in the Horizon server block (then `grep` to confirm each landed): # proxy_pass https://10.12.4.58:443; # proxy_ssl_trusted_certificate /etc/nginx/vault-ca-root.pem; # proxy_ssl_verify on; # proxy_ssl_name juju-ffe3b8-2-lxd-2; # the dashboard cert's DNS SAN -- per site (discover: openssl s_client -connect 10.12.4.58:443 </dev/null 2>/dev/null | openssl x509 -noout -ext subjectAltName) # proxy_redirect https://$http_host/ http://$http_host/; # unwind the scheme-mismatch redirect loop (Horizon emits absolute https:// on the client Host -> browser then speaks TLS to the :81 plaintext listener) sudo nginx -t # GATE: configuration ok sudo systemctl restart nginx # prefer restart over reload for a definitive cutover (a curl ~2s after `reload` can be served by a draining old worker; ~2s blip incl. the co-hosted MAAS proxy)
GATE (on the proxy): curl -sI http://127.0.0.1:81/horizon/ -> 302 to .../auth/login; no TLS errors in error.log.
The charm renders CSRF_COOKIE_SECURE/SESSION_COOKIE_SECURE = True (vault:certificates). On the plain-HTTP client leg the browser drops the Secure csrftoken and login fails with "CSRF cookie not set" -- so a clean follow of 3.3 otherwise stalls at the browser login. Drop an ASCII-only post-load override on the dashboard unit, then graceful-reload apache2:
# RUN: jumphost -- D-044 cookie override on the dashboard unit (ASCII-only; PER-REBUILD) juju ssh -m openstack openstack-dashboard/leader -- "printf 'CSRF_COOKIE_SECURE = False\nSESSION_COOKIE_SECURE = False\n' | sudo tee /usr/share/openstack-dashboard/openstack_dashboard/local/local_settings.d/_99_internal_http_cookies.py >/dev/null && sudo systemctl reload apache2" </dev/null
Verify the csrftoken Set-Cookie carries NO Secure attribute (over the VIP, vault CA):
# RUN: jumphost CK=$(curl -s -o /dev/null -D - --cacert ~/vault-init/vault-ca-root.pem https://10.12.4.58/horizon/auth/login/ | grep -i 'set-cookie:.*csrftoken') [ -n "$CK" ] || echo "WARN: no csrftoken Set-Cookie on this GET -- confirm via the browser login" printf '%s\n' "$CK" | grep -iq 'secure' && echo "FAIL: csrftoken still Secure" || echo "OK: csrftoken not Secure"
Then confirm an external browser login over the proxy succeeds. PER-REBUILD: teardown wipes the unit; reapply each rebuild until edge TLS (the Roosevelt access/DNS workstream) makes the override unnecessary. The upstream stays PLAIN HTTP (as-built); the abandoned upstream-TLS and self-signed-client-TLS approaches are NOT part of v1 (D-044 rationale). Diagnostic lessons (reload race, proxy_ssl_name DNS-SAN, sed no-op, scheme-mismatch redirect loop) are in appendix-A.
~/admin-openrc (0600) authenticates and returns a SCOPED token; endpoint list IP-only.~/vault-init/vault-ca-root.pem validates TLS to the keystone VIP./horizon (the root path 404s) -- probe /horizon/auth/login/. Two VIPs per bundle B1: 10.12.4.58 (provider) + 10.12.8.58 (metal).phase-04 -- network carve (external provider network).