
Appendix D -- Magnum cluster-create trust model (multi-tenant) [REVISED 2026-07-02]

Fills onboarding Stage 7. Grounded in the magnum + keystone source (read live 2026-07-02) and the live multi-tenant validation (tenant acme). This revision CORRECTS the 2026-07-01 draft, whose central hypothesis (D.3, "a clean-role tenant identity delegates the trust") was REFUTED live -- the real blockers were a keystone policy template (D-065) and an app-cred trust restriction (D-066), neither of which is about role delegation.

VALIDATION STATUS: the identity/trust path is PROVEN end to end -- a tenant password identity clears create_user (D-064) and create_trust (D-065), then magnum proceeds to certificate generation. Cluster COMPLETION is currently blocked one step later at the Barbican/Vault cert substrate (D-067), an operator-side defect independent of the tenant model.


D.1 What magnum does at cluster-create (the mechanism, in order)


  1. create_trustee -> identity:create_user (magnum_domain_admin creates the per-cluster trustee user in the magnum domain). Unblocked by D-064. PROVEN live.
  2. create_trust -> identity:create_trust (the cluster CREATOR is the trustor; trustee is the step-1 user; impersonation=True; roles = the creator's token roles). Unblocked by D-065 + password auth. PROVEN live.
  3. generate_certificates_to_cluster -> stores the cluster CA cert in BARBICAN, which stores it in Vault (castellan vault_key_manager). CURRENT BLOCKER -- see D-067.
  4. (then) the capi-helm driver mints the per-cluster CAPO child app credential (D-039) and provisions via helm/CAPI. NOT YET REACHED on the multi-tenant path.

D.2 Two hard constraints on WHO creates the cluster


Constraint 1 -- the create_trust policy template (D-065). This cloud's charm-rendered base policy shipped identity:create_trust = "user_id:%(trust.trustor_user_id)s". That template does not resolve on Caracal, where keystone populates target.trust.trustor_user_id rather than trust.trustor_user_id. The rule therefore evaluated false for EVERY caller (admin included), regardless of roles -- proven by a direct openstack trust create with trustor==self still returning 403. Fixed by D-065 (override with the target-prefixed form keystone itself ships). This is why the 2026-07-01 role-delegation hypothesis was wrong: the failure was templating, not roles.
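The trustor==self probe described above can be sketched as a small helper. This is a minimal sketch, not one of this repo's scripts: probe_create_trust is a hypothetical name, and its three arguments (a project ID, a role the caller holds on that project, and an existing user to act as trustee) are placeholders. It treats any non-zero exit from the CLI as a refusal, so inspect stderr if the HTTP code matters. Per D-066 it is only meaningful when run under password auth.

```shell
# Hypothetical helper: probe identity:create_trust with trustor == caller.
# The body runs in a subshell "( ... )" so its variables stay local; "exit" sets
# the function's return status.
probe_create_trust() (
  project=$1   # placeholder: project ID the caller is scoped to
  role=$2      # placeholder: a role the caller holds on that project
  trustee=$3   # placeholder: any existing user ID to receive the trust

  # The caller's own user ID becomes the trustor.
  self=$(openstack token issue -f value -c user_id) || { echo "AUTH_FAILED"; exit 2; }

  if trust_id=$(openstack trust create --project "$project" --role "$role" \
        --impersonate "$self" "$trustee" -f value -c id 2>/dev/null); then
    # The probe only needs to know the policy check passed, so remove the trust.
    openstack trust delete "$trust_id" >/dev/null 2>&1
    echo "CREATE_TRUST_OK"
  else
    # Any failure is reported as a refusal (assumption: 403 is the likely cause).
    echo "CREATE_TRUST_REFUSED"
    exit 1
  fi
)

# Usage (placeholders): probe_create_trust <project-id> member <trustee-user-id>
```

With the pre-D-065 policy the probe would be expected to report CREATE_TRUST_REFUSED for every caller. After the override, a password-authenticated caller holding the role should get CREATE_TRUST_OK.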

Constraint 2 -- app credentials cannot create trusts (D-066). After D-065, an app-cred-authenticated create_trust STILL failed -- keystone's _check_application_credential (trusts.py) blocks trust creation from any application-credential token, and on this build the docstring states this applies "regardless of the 'unrestricted' flag". Confirmed live: an unrestricted app cred was refused; the same identity via PASSWORD passed. Therefore the cluster-creator MUST authenticate with a PASSWORD.
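Because an app-cred token is refused at create_trust, it can help to fail fast before calling magnum at all. The guard below is a sketch under stated assumptions: require_password_auth is a hypothetical name, and it only inspects the standard OS_* environment variables. Credentials supplied through clouds.yaml (OS_CLOUD) are not detected.

```shell
# Hypothetical preflight: refuse to continue unless the environment is set up
# for password auth. Only environment variables are checked (assumption: the
# session is not configured via clouds.yaml).
require_password_auth() {
  case "${OS_AUTH_TYPE:-password}" in
    v3applicationcredential)
      echo "refusing: OS_AUTH_TYPE=$OS_AUTH_TYPE (app creds cannot create trusts, D-066)" >&2
      return 1 ;;
  esac
  if [ -n "${OS_APPLICATION_CREDENTIAL_ID:-}" ] || [ -n "${OS_APPLICATION_CREDENTIAL_NAME:-}" ]; then
    echo "refusing: application credential variables are set" >&2
    return 1
  fi
  if [ -z "${OS_USERNAME:-}" ] || [ -z "${OS_PASSWORD:-}" ]; then
    echo "refusing: OS_USERNAME / OS_PASSWORD are not set" >&2
    return 1
  fi
}

# Usage: require_password_auth && openstack coe cluster create ...
```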

Consequence: the cluster-creator is <client>-cluster (password, member + load-balancer_member on the tenant project) per the D-066 Option-3 account model. The app cred (<client>-svc) is for non-trust automation only. See appendix-C for the full account set.

Keystone's create_trust ALSO enforces _require_trustor_has_role_in_project (the trustor must hold each delegated role on the project). Magnum delegates the creator's token roles, which are by construction a subset of what the creator holds on the scoped project -> passes.


D.3 The multi-tenant rule (CORRECTED)


A Magnum cluster is created by the tenant's <client>-cluster identity, authenticating by PASSWORD, project-scoped to <client>-prod, holding exactly member + load-balancer_member. Not admin, not an app cred. This satisfies: the create_trust policy (D-065), the app-cred block (avoided by using password auth, D-066), the trustor==caller check (by construction), and the trustor-has-role check (D-039-style grants).
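The "exactly member + load-balancer_member" part of the rule can be checked before a create. The sketch below assumes the roles are direct assignments on the project: it deliberately omits --effective, so implied or group-inherited roles are not counted. Both function names are hypothetical, and the user and project IDs are placeholders.

```shell
# Pure comparison: succeeds only if the held roles are exactly the expected set.
# Both arguments are whitespace- or newline-separated role names.
roles_exactly() (
  expected=$(printf '%s\n' $1 | sort -u)
  held=$(printf '%s\n' $2 | sort -u)
  [ "$expected" = "$held" ]
)

# Hypothetical wrapper: list the direct role assignments of a user on a project
# and compare them with the set this appendix requires.
check_cluster_identity_roles() (
  user_id=$1      # placeholder: ID of the cluster-creating user
  project_id=$2   # placeholder: ID of the tenant project
  held=$(openstack role assignment list --user "$user_id" --project "$project_id" \
           --names -f value -c Role) || exit 2
  roles_exactly "member load-balancer_member" "$held"
)

# Usage (placeholders): check_cluster_identity_roles <user-id> <project-id> || echo "role set differs"
```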

The 2026-06-09 single-consumer path (admin creates in the admin-owned capi-mgmt project) sidesteps the trust-delegation constraint and does NOT validate the tenant model. It is retired.


D.4 The current blocker (D-067) -- operator-side, not tenant-side


Step D.1(3) fails. magnum's POST /v1/secrets to Barbican returns 500 because castellan's vault_key_manager cannot log in to Vault: the AppRole login is rejected with: source address "10.12.8.176" unauthorized through CIDR restrictions. Barbican reaches Vault on the metal-admin plane, but Vault's AppRole binds the secret_id to metal-internal (D-052/D-053). The bundle is correct (all secrets endpoints metal-internal); the LIVE binding drifted. The fix is a live rebind to metal-internal (gated, next session), NOT widening the CIDR. Full detail in D-067.
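The Vault rejection is a membership test: the source address of the login must fall inside a CIDR bound to the AppRole. When comparing the rejected address against candidate plane CIDRs by hand, a small IPv4 helper avoids arithmetic slips. This is an illustrative sketch only. The function names are hypothetical, and this appendix does not state the actual metal-admin or metal-internal CIDRs, so they appear as placeholders.

```shell
# Convert a dotted-quad IPv4 address to an integer (subshell body keeps IFS local).
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeeds if the address in $1 lies inside the CIDR in $2 (for example a.b.c.d/24).
ip_in_cidr() (
  ip=$1
  net=${2%/*}    # network part, before the slash
  bits=${2#*/}   # prefix length, after the slash
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
)

# Usage with PLACEHOLDER CIDRs (substitute the real plane CIDRs):
#   ip_in_cidr 10.12.8.176 <metal-internal-cidr> || echo "source is outside the bound CIDR"
```

A result showing the address outside the bound CIDR matches the symptom, but it only explains the rejection. The remedy stays the rebind described in D-067.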

This is independent of the tenant identity model: it blocks cert-gen for ANY creator. Once D-067 is fixed, cluster-create should proceed past cert-gen into the capi-helm driver's provisioning, where the CAPO child-cred mint (D-039) happens under <client>-cluster.


D.5 The create (tenant -cluster, PASSWORD)


# authenticate as <client>-cluster via PASSWORD (NOT app cred), project-scoped to <client>-prod
#   OS_USERNAME=<client>-cluster OS_USER_DOMAIN_ID=<domain> OS_PROJECT_ID=<client-prod> OS_PASSWORD=...
#   OS_CACERT=<vault root CA>    OS_AUTH_URL=https://<keystone-vip>:5000/v3
openstack coe cluster create <cluster> --cluster-template <client>-k8s \
  --keypair <client>-key --master-count 1 --node-count 1
openstack coe cluster show <cluster> -f value -c status -c status_reason
#   expect (post D-067): CREATE_IN_PROGRESS -> ... -> CREATE_COMPLETE
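For unattended runs, the status check above can be wrapped in a polling loop. This is a sketch, not part of the repo's scripts: wait_for_cluster is a hypothetical name, and the default timeout and interval are arbitrary assumptions rather than tuned values.

```shell
# Hypothetical helper: poll a cluster until it reaches a terminal create status.
# Exit status: 0 = CREATE_COMPLETE, 1 = CREATE_FAILED, 2 = CLI error, 3 = timeout.
wait_for_cluster() (
  cluster=$1
  timeout=${2:-3600}   # seconds; assumed default
  interval=${3:-30}    # seconds between polls; assumed default
  waited=0
  while [ "$waited" -le "$timeout" ]; do
    status=$(openstack coe cluster show "$cluster" -f value -c status) || exit 2
    case "$status" in
      CREATE_COMPLETE)
        echo "$status"; exit 0 ;;
      CREATE_FAILED)
        # Surface the reason on stderr so stdout carries only the status.
        openstack coe cluster show "$cluster" -f value -c status_reason >&2
        echo "$status"; exit 1 ;;
    esac
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "TIMEOUT"; exit 3
)

# Usage: wait_for_cluster <cluster> 3600 30
```

Until D-067 is fixed, the expected outcome is CREATE_FAILED at cert-gen, with the reason printed on stderr.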

D.6 Open validation items (next session)


  1. Fix D-067 (barbican<->Vault metal-internal rebind), then re-run the create -> cert-gen clears.
  2. Watch to CREATE_COMPLETE; capture where, and under whose identity, the CAPO child cred is minted (confirms D-066's <client>-cluster-owns-CAPO-cred design empirically).
  3. kubeconfig + nodes/CNI/CCM (phase-08 8.3 pattern).
  4. Clean-room beta pass: onboard a fresh tenant from ONLY handed-over credentials (zero admin fallback) via scripts/tenant-onboard.sh, and complete the tenant-facing tests.