# Appendix D -- Magnum cluster-create trust model (multi-tenant)

Fills the gap the onboarding runbook Stage 7 marks [PENDING]: exactly which identity
creates a Magnum cluster, and why the Keystone trust delegation constrains that choice.
Grounded in the magnum source (magnum/common/keystone.py, read live 2026-07-01) and the
D-039 / D-051 / D-064 identity model. Supersedes the single-consumer shortcut used on
2026-06-09 (admin creates in the admin-owned capi-mgmt project), which sidesteps -- rather
than exercises -- the trust constraint and therefore does NOT validate the tenant path.

--------------------------------------------------------------------------------
## D.1 What magnum does at cluster-create (the mechanism)
--------------------------------------------------------------------------------

Two Keystone writes happen before any infrastructure is touched
(magnum/conductor/handlers/common/trust_manager.py -> create_trustee_and_trust):

1. create_trustee -> `identity:create_user`
   Magnum's trustee_domain_admin (magnum_domain_admin, Admin on the magnum domain)
   creates a per-cluster service user in the magnum domain. This is the step D-064
   unblocked (the create_user policy templating fix). VALIDATED live 2026-07-01:
   trustee user is created successfully.

2. create_trust -> `identity:create_trust`
   Magnum creates a Keystone trust delegating the CALLER's roles to that trustee.
   From magnum/common/keystone.py:

       def create_trust(self, trustee_user):
           trustor_user_id   = self.session.get_user_id()      # the CALLER's user
           trustor_project_id = self.session.get_project_id()  # the CALLER's project
           if CONF.trust.roles:
               roles = CONF.trust.roles      # (unset on this deploy)
           else:
               roles = self.context.roles    # -> the roles in the CALLER's token
           self.client.trusts.create(
               trustor_user=trustor_user_id, project=trustor_project_id,
               trustee_user=trustee_user, impersonation=True, role_names=roles)

Two facts follow directly from that code, and they are the whole model:

  A. The TRUSTOR is the identity that issued `openstack coe cluster create`
     (`self.session` is the request-context client). The Keystone policy
     `identity:create_trust = "user_id:%(trust.trustor_user_id)s"` is therefore
     satisfied by construction -- caller == trustor. (So the create_trust 403 is
     NOT a trustor-identity policy failure.)

  B. The DELEGATED ROLES are `self.context.roles` -- the roles present in the
     CALLER's token on `trustor_project_id`. Keystone's create_trust REFUSES to
     delegate any role the trustor does not actually hold on that project
     (a trust cannot grant more than the trustor has). `CONF.trust.roles` is unset
     here, so magnum delegates the caller's token roles verbatim -- whatever they are.
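
Facts A and B can be condensed into a small stand-alone model. This is a sketch, not magnum or Keystone code: the function names `select_delegated_roles` and `keystone_accepts_trust` are hypothetical, and the Keystone side is reduced to the two checks described above (caller equals trustor, and the delegated roles are a subset of what the trustor holds on the project).

```python
# Illustrative model only; not magnum or Keystone source.
# Function names are hypothetical and roles are plain strings.

def select_delegated_roles(conf_trust_roles, context_roles):
    """Mirror the branch in create_trust: CONF.trust.roles wins when set."""
    return list(conf_trust_roles) if conf_trust_roles else list(context_roles)


def keystone_accepts_trust(caller_user_id, trustor_user_id,
                           delegated_roles, roles_held_on_project):
    """Simplified stand-in for the checks behind facts A and B."""
    # Fact A: policy "user_id:%(trust.trustor_user_id)s" requires caller == trustor.
    if caller_user_id != trustor_user_id:
        return False
    # Fact B: a trust cannot delegate a role the trustor lacks on the project.
    return set(delegated_roles) <= set(roles_held_on_project)


if __name__ == "__main__":
    tenant_roles = ["member", "load-balancer_member"]
    delegated = select_delegated_roles([], tenant_roles)  # CONF.trust.roles unset
    print(delegated)
    print(keystone_accepts_trust("u1", "u1", delegated, tenant_roles))
    # A token role that is not held on the project cannot be delegated:
    print(keystone_accepts_trust("u1", "u1", ["admin"], tenant_roles))
```

In this model the only way create_trust can fail for a caller who is also the trustor is the subset check, which is why the rest of this appendix concentrates on the caller's role set.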

--------------------------------------------------------------------------------
## D.2 Why the 2026-06-09 single-consumer path "worked" (and why we retired it)
--------------------------------------------------------------------------------

On 2026-06-09 the cluster was created by ADMIN, scoped to the admin-owned capi-mgmt
project. Admin trivially holds (or cloud-admin-bypasses) every role in its own
token, so create_trust never exercised the delegation constraint. That is a
SINGLE-CONSUMER shortcut: one privileged operator standing in for the tenant. It
proves the driver/CAPI plumbing but NOT the multi-tenant identity path, because in
the real product the cluster creator is a TENANT, not the cloud operator.

The admin-in-capi-mgmt attempt on 2026-07-01 then 403'd at create_trust because that
mixed scope (admin user, capi-mgmt project) is not a clean delegatable-role identity
on capi-mgmt -- and, under D-064, admin scoped to capi-mgmt is a RESTRICTED identity
there (it is not cloud_admin outside the admin domain; `list_role_assignments` 403s
in that scope, confirmed live). It is the wrong identity for the tenant model on two
counts: it is the operator, and its token roles are not the tenant delegatable set.

--------------------------------------------------------------------------------
## D.3 The multi-tenant rule (what identity must create the cluster)
--------------------------------------------------------------------------------

RULE: a Magnum cluster is created by the TENANT's own project-scoped identity, whose
token carries EXACTLY the delegatable tenant roles -- `member` and
`load-balancer_member` (and `reader` where used) -- and NOT `admin`.

Rationale, straight from D.1.B:
  - The trust delegates `context.roles`. If the creator's token carries `admin`,
    magnum tries to delegate `admin` into the trust, and Keystone refuses a trust
    that grants a role the trustor does not hold as a delegatable grant on the
    project. Even if Keystone accepted it, delegating `admin` into a long-lived
    cluster credential is a privilege-escalation footgun (the trustee impersonates
    the trustor with impersonation=True). The tenant set (member +
    load-balancer_member) is the correct, least-privilege delegation.
  - `load-balancer_member` MUST be in the creator's token: the magnum-capi-helm
    driver provisions an Octavia LB for the apiserver, and the trust must carry
    Octavia authority or CAPO 403s at LB reconcile (D-039). This is exactly why
    D-039 grants the trustor `load-balancer_member` on the cluster project.
  - `member` provides the compute/network/volume authority the cluster's CCM/CSI
    need via the trust.

WHO THIS IS, per the onboarding model (tenant-onboarding-runbook Stage 2/4):
  - The tenant's SERVICE identity: `<client>-ci` / `<client>-svc`, holding
    `member` + `load-balancer_member` on `<client>-prod`, authenticating with its
    UNRESTRICTED application credential (the app cred is required so the driver can
    mint the per-cluster CAPO child cred -- D-039 / onboarding Stage 4).
  - Equivalently a tenant human user with `member` + `load-balancer_member` on the
    project, but the service/app-cred identity is the production path (Jenkins/CI).

The operator (admin / cloud_admin) does NOT create tenant clusters. The capi-mgmt
project is the MANAGEMENT-plane project (where the CAPI mgmt cluster VM and the
operator's own D-039 roles live for the mgmt cluster itself); tenant clusters are
created in the TENANT's project by the TENANT's identity.

--------------------------------------------------------------------------------
## D.4 Trustor role-set validation (run before the create)
--------------------------------------------------------------------------------

Confirm the creating identity's TOKEN carries the delegatable set and nothing that
cannot be delegated. Run AS the tenant creator identity (app cred or password):

    # as the tenant service identity, project-scoped to <client>-prod
    openstack token issue -f value -c user_id -c project_id   # confirm scope
    # roles in THIS token == what magnum will delegate (context.roles):
    openstack role assignment list --user <this-user-id> \
      --project <tenant-project-id> --effective --names -f value -c Role | sort

GATE: the role set is a subset of { member, load-balancer_member, reader }, and
INCLUDES load-balancer_member. If `admin` appears, this is the wrong identity --
do not create with it.
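
The GATE can be applied mechanically to the sorted role names the command above prints. The helper below is a minimal sketch under that assumption: `check_trustor_gate` and the two constants are hypothetical names, and the input is assumed to be one role name per line.

```python
# Hypothetical helper: applies the GATE to one-role-per-line output such as
# `openstack role assignment list ... --names -f value -c Role | sort`.

ALLOWED_ROLES = {"member", "load-balancer_member", "reader"}
REQUIRED_ROLES = {"load-balancer_member"}


def check_trustor_gate(cli_output):
    """Return (ok, problems) for the role names found in cli_output."""
    roles = {line.strip() for line in cli_output.splitlines() if line.strip()}
    problems = []
    extra = roles - ALLOWED_ROLES
    if extra:
        problems.append("roles outside the tenant set: " + ", ".join(sorted(extra)))
    missing = REQUIRED_ROLES - roles
    if missing:
        problems.append("required roles missing: " + ", ".join(sorted(missing)))
    return (not problems, problems)


if __name__ == "__main__":
    sample = "load-balancer_member\nmember\n"
    print(check_trustor_gate(sample))
```

A role set containing `admin` fails the first check, and a set without `load-balancer_member` fails the second. Either result means the create should not proceed with that identity.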

Note: a tenant/app-cred identity cannot run `role assignment list` for other users
(policy 403, by design). Query only its own assignment, or read it as admin
beforehand during onboarding.

--------------------------------------------------------------------------------
## D.5 The create (tenant identity), and the trust it produces
--------------------------------------------------------------------------------

    # authenticate as the tenant service identity via its app cred (onboarding Stage 4)
    #   OS_AUTH_TYPE=v3applicationcredential + the app cred id/secret from the 0600 file
    # then, project-scoped to the tenant project:
    openstack coe cluster create <cluster-name> \
      --cluster-template <tenant-template> \
      --keypair <tenant-key> \
      --master-count 1 --node-count 2

    # verify the trustee and trust were created (status moves past the trust step):
    openstack coe cluster show <cluster-name> -f value -c status -c trustee_user_id
    #   status -> CREATE_IN_PROGRESS (past trustee+trust), NOT CREATE_FAILED at ~3s.

Expected: create_user (D-064) AND create_trust both pass, because the creator is the
trustor and its token roles (member + load-balancer_member) are cleanly delegatable
on the tenant project. The driver then proceeds to helm/CAPI provisioning.
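
For the CI path, the app-cred authentication described in the comments above amounts to a handful of environment variables. The sketch below assumes the standard OpenStack client variable names for application-credential auth. The helper names `app_cred_env` and `is_mode_0600` are hypothetical, as is the way the id and secret are passed in; the layout of the 0600 file is not specified here, so reading it is left out.

```python
import os
import stat


def app_cred_env(auth_url, cred_id, cred_secret):
    """Build the environment for application-credential auth (hypothetical helper)."""
    # No OS_PROJECT_* variables are set: an application credential is bound
    # to the project it was created in, so the token scope follows from it.
    return {
        "OS_AUTH_TYPE": "v3applicationcredential",
        "OS_AUTH_URL": auth_url,
        "OS_APPLICATION_CREDENTIAL_ID": cred_id,
        "OS_APPLICATION_CREDENTIAL_SECRET": cred_secret,
    }


def is_mode_0600(path):
    """True if the credential file is readable and writable by its owner only."""
    return stat.S_IMODE(os.stat(path).st_mode) == 0o600


if __name__ == "__main__":
    # Placeholder values; the URL and identifiers are illustrative only.
    env = app_cred_env("https://keystone.example/v3", "<app-cred-id>", "<secret>")
    print(sorted(env))
```

A CI job would merge this dictionary into the environment of the `openstack coe cluster create` call, after refusing to proceed if `is_mode_0600` returns False for the credential file.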

--------------------------------------------------------------------------------
## D.6 Roosevelt
--------------------------------------------------------------------------------

  - Cluster-create is a TENANT self-service operation, performed by the tenant's
    app-cred identity carrying member + load-balancer_member on the tenant project.
    Wire it into the tenant CI (Jenkins) path (onboarding Stage 7), never the
    operator admin.
  - Optionally pin `CONF.trust.roles = member,load-balancer_member` in magnum.conf
    (via the D-037 conf.d mechanism) to make the delegated set EXPLICIT and
    independent of whatever roles happen to be in the caller's token -- a hardening
    that removes the "wrong token roles" failure mode entirely. Decide as a tracked
    item; unset (inherit context.roles) is the upstream default and works when the
    creator identity is correct.
  - The management-plane capi-mgmt project + the operator's D-039 roles there remain
    for the MGMT cluster; they are not the tenant cluster-create path.
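
If the pinning option is adopted, `CONF.trust.roles` corresponds to the `roles` option in the `[trust]` section of magnum.conf. The sketch below only renders that fragment as text: `render_trust_roles_fragment` is a hypothetical helper, and the target file name under the D-037 conf.d mechanism is deliberately not assumed.

```python
import configparser
import io


def render_trust_roles_fragment(roles):
    """Render a [trust] fragment pinning the delegated roles (hypothetical helper)."""
    cfg = configparser.ConfigParser()
    # The option is a list; list values are written comma-separated.
    cfg["trust"] = {"roles": ",".join(roles)}
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(render_trust_roles_fragment(["member", "load-balancer_member"]))
```

Pinning does not lift the constraint from D.1.B: the caller must still hold every pinned role on the project, or Keystone refuses the trust. What changes is that extra roles in the caller's token are no longer delegated.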

--------------------------------------------------------------------------------
## D.7 Open validation item
--------------------------------------------------------------------------------

This appendix establishes the model from the magnum source and the identity design.
The live behavioral confirmation on THIS cloud -- create a cluster as a tenant
app-cred identity (member + load-balancer_member) and observe create_trust succeed --
is the acceptance step, and folds into onboarding Stage 7 (currently [PENDING]) and
the D-011 gate. Until run, D.3 is design-derived-from-source, not yet live-verified
on the multi-tenant path.

UPDATE 2026-07-01: onboarding Stages 1-4 VALIDATED live as tenant acme (manager
self-service, app-cred cluster-creator with member+load-balancer_member, tenant L3).
Stage 5 template = corrected-pending (image-by-UUID). Stage 6 create_trust = the
outstanding item; the create_user half (D-064) is confirmed live.
