# Phase 07 -- Magnum Conductor Graft (D-031 / D-037 / D-042)

Graft the magnum-capi-helm CAPI driver onto the charm-managed conductor
(`magnum/0`), point it at the in-cloud management cluster (phase-06) via the
FIP, and land on a CONTRACT-COHERENT driver so `coe cluster` health reports
`HEALTHY`. The driver upgrade (D-042) is part of the v1 baseline here, not a
follow-up -- the as-first-built 1.3.0 read the version-less v1beta2
`infrastructureRef` and reported a cosmetic UNHEALTHY; it is superseded by the
RELEASED `magnum-capi-helm==1.4.0`, which is the v1 end state.

Decisions: D-031 (driver/engine/surface), D-037 (conf.d drop-in + config-dir via
/etc/default, NOT a systemd ExecStart drop-in), D-042 (driver must be
contract-coherent with the Layer-A core; amends D-034), D-036 (driver/engine/
chart coherence), D-046 (magnum trustee domain-setup; REQUIRED manual step -- Step 7.0),
D-047 (keystone v3 drop-in for magnum-api -- Step 7.7b). Troubleshooting: appendix-A
DOCFIX-021, D-037, D-042, and lessons L-P6-1..4.

DOCFIX-063 (2026-07-01 as-run reconciliation, fresh Pattern-A rebuild):
- Step 7.1 rewritten verify-first: the phase-06 capi-mgmt-sg already opens 6443, so the
  hardcoded per-conductor rule-add is demoted to a measured fallback.
- Step 7.2 helm auth-proof moved AFTER 7.4 (helm is installed there and is absent on a fresh
  conductor).
- Step 7.3 probe switched to `kubectl api-versions` (api-resources shows only the PREFERRED
  version, giving a false "v1beta1 not served" when the core groups prefer v1beta2).
- Step 7.6 now creates /etc/magnum/magnum.conf.d before the tee (absent on a fresh deploy).
- The conf.d ASCII checks now `sudo` the grep (a non-sudo read of the root-owned path gave a
  false "ASCII clean").
- The Step 7.4 helm egress pre-check points at a real asset.
- As-built refreshed: conductor magnum/0 10.12.4.76 -> 10.12.12.107 (metal-internal, D-052).

D-063 (open): the phase-06 capi-mgmt-sg opens 6443+22 to 0.0.0.0/0 -- fine for a single-DC
rehearsal, tighten for Roosevelt.

---

## Prerequisites (must be true entering phase-07)
- phase-06 EXIT GATE passed: `capi-mgmt-v2` Ready, CAPI stack up (ORC `Image` CRD
  present, no crash-looping CAPO), `~/capi-mgmt.kubeconfig` (server = FIP) works
  from the jumphost.
- Magnum charm live (`magnum/0`) and related to keystone. The charm RENDERS magnum.conf
  `[trust]` (trustee_domain_name=magnum, trustee_domain_admin_name=magnum_domain_admin,
  password) from the identity-credentials relation, but it does NOT create the keystone
  domain/user those names reference -- that is the MANUAL `domain-setup` action (Step 7.0,
  D-046). `[trust]` being populated is NOT sufficient; magnum reports "Unit is ready"
  whether or not the domain exists, and the omission 403s every `coe` op. Step 7.0 creates
  AND asserts the domain/user.
- `admin-openrc` on the jumphost; `juju` (model openstack); `jq`.

## Constants and env-literals (TAG: confirm per site on rebuild)
- `ENV(conductor-unit)` magnum/0        (LXD 1/lxd/2 on openstack1; addr 10.12.12.107 on metal-internal per D-052 -- confirm per site; was 10.12.4.76 pre-D-052)
- `ENV(conductor-src)`  n/a             (DOCFIX-063: verify-first 7.1 no longer adds a per-conductor SG rule -- the phase-06 capi-mgmt-sg already opens 6443; a source rule is a FALLBACK only, measured not hardcoded)
- `ENV(mgmt-fip)`       per-rebuild     (mgmt apiserver / kubeconfig server; source ~/capi-mgmt-net.env from phase-06 -- this rebuild 10.12.7.222; per-rebuild, DOCFIX-038)
- `ENV(mgmt-sg)`        capi-mgmt-sg     (in the capi-mgmt project)
- `ENV(project)`        capi-mgmt        (resolve by name; this rebuild id d5bc125c7c1841d389b76cd0a7b0a915, domain capi)
- `ENV(magnum-ns)`      magnum-<project-id>  (driver namespace per project; this rebuild magnum-d5bc125c7c1841d389b76cd0a7b0a915)
- `ENV(chart-ver)`      0.25.1           (capi-helm-charts; load-bearing -- driver default is 0.10.1)
- `ENV(helm-ver)`       v3.17.3
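
The `ENV(magnum-ns)` value is just `magnum-` plus the resolved project id, so it can be derived rather than pasted. A minimal sketch, assuming the id is the usual 32-character lowercase hex string; `magnum_ns` is a hypothetical helper name, not something the driver or charm ships:

```bash
# Hypothetical helper: build the driver namespace from a resolved project id.
magnum_ns() {
  local pid=$1
  # Assumption: keystone project ids are 32 lowercase hex characters.
  printf '%s' "$pid" | grep -Eq '^[0-9a-f]{32}$' \
    || { echo "bad project id: '$pid'" >&2; return 1; }
  printf 'magnum-%s\n' "$pid"
}

# Usage (jumphost, after sourcing admin-openrc):
#   magnum_ns "$(openstack project show capi-mgmt --domain capi -f value -c id)"
```

Resolving by name and deriving the namespace keeps a stale id from a previous rebuild out of later commands.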

## Run-location legend
- `# RUN: jumphost`            -- vopenstack-jesse as jessea123 (admin-openrc).
- `# RUN: jumphost -> magnum/0` -- shipped to the conductor via `juju ssh -m openstack magnum/0 '...' </dev/null`
  (DOCFIX-021: `</dev/null` on every juju ssh / sudo so the remote command does not eat the heredoc/pipe).
- Conductor facts: DEB install (magnum 18.0.1, python3.10, container base ubuntu 22.04);
  conductor runs as user `magnum`; daemon launched by an LSB init script wrapped by
  systemd `systemd-start` (NOT a direct ExecStart) -- see Step 7.7.
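
The `</dev/null` rule cuts both ways: it protects commands that must not read stdin, and it destroys a payload that is supposed to arrive on stdin (the Step 7.2 kubeconfig transfer). A small local illustration, with `cat` standing in for the remote command (an assumption for demonstration; no juju is involved):

```bash
# `cat` stands in for the remote command; nothing here touches juju.
# The piped payload survives when stdin is left alone:
kept=$(printf 'payload' | cat)
# A trailing </dev/null replaces the pipe, so the command reads nothing
# and still exits 0:
lost=$(printf 'payload' | cat </dev/null)
echo "kept='${kept}' lost='${lost}'"
```

The second form is the silent-empty-file failure that the Step 7.2 sha256 gate is designed to catch.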

---

## Command-label convention
Every command block below is bracketed by bold labels, so a command line is never mistaken
for surrounding prose (these render in GitBucket and read clearly in a raw editor):
- **RUN -- LOC** -- the block CHANGES state; run it at LOC (e.g. `jumphost`, `vault/0`, `jumphost -> magnum/0`).
- **CHECK (read-only) -- LOC** -- a read-only verification; safe to re-run.
- **GATE:** -- a hard stop; do NOT proceed past the block unless the stated condition holds.
- **Expect:** -- what a passing result looks like.
- `> CAUTION:` -- marks a destructive, secret-handling, or irreversible step.


## Step 7.0 -- Magnum trustee domain-setup (D-046; REQUIRED on every (re)deploy)
The magnum charm action `domain-setup` is MANUAL and idempotent; magnum
reports active/"Unit is ready" REGARDLESS of whether the trustee domain exists. If the keystone
domain `magnum` + user `magnum_domain_admin` (referenced by magnum.conf `[trust]`) are absent,
`magnum/common/policy.py` 401s on EVERY policy-enforced request -> every `coe` op 403s (the
2026-06-17 incident; the 2026-06-11 redeploy omitted this and it stayed latent until the first
coe call). Run here, AFTER magnum + identity-service are related, and BEFORE any coe call
(Step 7.9 / phase-08). No magnum restart needed (domain_admin_auth resolves by NAME;
trustee_domain_id is recomputed per request).

Step A -- create the trustee domain (charm-native; idempotent; takes no parameters):

**RUN -- jumphost**
```bash
juju run magnum/leader domain-setup </dev/null
```

Step B -- ASSERT the domain + admin user exist (read-only GATE; do NOT proceed on failure):

**CHECK (read-only) -- jumphost**
```bash
( { source ~/admin-openrc
  openstack domain show magnum -f value -c id
  openstack user show magnum_domain_admin --domain magnum -f value -c id
} )
```

Step C -- GATE coe (must return the conductor row, state up, NO 403):

**CHECK (read-only) -- jumphost**
```bash
( { source ~/admin-openrc; openstack coe service list; } )
```
**GATE:** Step B returns a domain id + a user id; Step C returns exactly one row
(`magnum-conductor`, state `up`). A 403 at Step C means domain-setup did not take (re-run
Step A) or the magnum.conf `[trust]` names differ from the created domain/user. (Benign
"No domain/user exists" idempotency lines may appear in the action output.)

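The Step C gate of Step 7.0 can be checked mechanically instead of by eye. A minimal sketch, assuming the `binary` and `state` column names are accepted by `-c` on this client version; `coe_service_gate` is a hypothetical helper name:

```bash
# Hypothetical helper: read "binary state" lines on stdin and pass only on
# exactly one row that is magnum-conductor / up.
coe_service_gate() {
  local rows
  rows=$(tr -d '\r' | awk 'NF')
  [ "$(printf '%s\n' "$rows" | grep -c .)" -eq 1 ] \
    || { echo "GATE FAIL: expected exactly one row" >&2; return 1; }
  [ "$rows" = "magnum-conductor up" ] \
    || { echo "GATE FAIL: got '$rows'" >&2; return 1; }
  echo "GATE OK"
}

# Usage (jumphost, after sourcing admin-openrc):
#   openstack coe service list -f value -c binary -c state | coe_service_gate
```

A 403 produces no rows on stdout, so it fails the "exactly one row" check rather than passing silently.
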
## Step 7.1 -- Authorize the conductor source on the mgmt-cluster SG (VERIFY-FIRST; DOCFIX-063)
This step is scoped to the capi-mgmt project. DOCFIX-063: do NOT hardcode a per-conductor source rule.
The phase-06 capi-mgmt-sg already opens `tcp/6443` to `0.0.0.0/0` (the FIP is the access
point), so the conductor reaches the apiserver with NO new rule. Inspect the SG + prove
reachability FIRST; add a rule ONLY if 6443 is not already permitted, and then with the source
the mgmt VM actually SEES (measured, never the pre-D-052 provider literal 10.12.4.76).

**CHECK (read-only) -- jumphost**
```bash
( {
  set -u
  source ~/admin-openrc
  CAPI_PID=$(openstack project show capi-mgmt --domain capi -f value -c id)   # ENV(project); resolve, never hardcode
  unset OS_PROJECT_NAME OS_PROJECT_ID OS_TENANT_NAME OS_TENANT_ID
  export OS_PROJECT_ID="$CAPI_PID"
  SG=$(openstack security group show capi-mgmt-sg -f value -c id)   # ENV(mgmt-sg)
  echo "SG=$SG"
  echo "=== current ingress rules (JSON -- avoids the -c column-swap trap) ==="
  openstack security group rule list "$SG" -f json
} )
```
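
The "is 6443 already permitted" decision can be read off that JSON with `jq` (listed in the prerequisites). A hedged sketch: the key names `IP Protocol`, `IP Range`, `Port Range` and `Direction` are assumptions about this client version's JSON output, and `sg_allows_6443_from_anywhere` is a hypothetical helper name. Confirm the keys against the real output before relying on it.

```bash
# Hypothetical helper: stdin is `openstack security group rule list "$SG" -f json`.
# Succeeds if some ingress tcp rule from 0.0.0.0/0 has a port range covering 6443.
sg_allows_6443_from_anywhere() {
  jq -e '
    any(.[];
      (.Direction // "ingress") == "ingress"      # assumption: a missing key means ingress
      and ."IP Protocol" == "tcp"
      and ."IP Range" == "0.0.0.0/0"
      and ((."Port Range" // "") | split(":") | map(tonumber)
           | length == 2 and .[0] <= 6443 and 6443 <= .[1]))
  ' >/dev/null
}

# Usage:
#   openstack security group rule list "$SG" -f json | sg_allows_6443_from_anywhere \
#     && echo "6443 already open -- no rule needed" || echo "6443 NOT open to 0.0.0.0/0"
```

This only answers whether the permissive phase-06 rule is present; a narrower rule that happens to cover the conductor still needs the TCP reachability check.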
Then prove conductor -> mgmt apiserver reachability (FIP from phase-06's env, never hardcoded):

**CHECK (read-only) -- jumphost -> magnum/0**
```bash
source ~/capi-mgmt-net.env   # MGMT_FIP (DOCFIX-038: per-rebuild)
juju ssh -m openstack magnum/0 \
  "timeout 6 bash -c 'exec 3<>/dev/tcp/$MGMT_FIP/6443' && echo TCP-OK || echo TCP-FAIL" </dev/null
```
**GATE:** require `TCP-OK`. If 6443 is already open to `0.0.0.0/0` (the phase-06 default), TCP-OK
holds with no mutation -- proceed. FALLBACK (only if TCP-FAIL AND 6443 is NOT already open):
MEASURE the source the mgmt VM sees from magnum/0 (e.g. via conntrack / a listener on the VM),
then add exactly that source -- never the pre-D-052 provider literal 10.12.4.76:

**RUN (FALLBACK only) -- jumphost**
```bash
# FALLBACK ONLY -- set MEASURED_SRC to the source IP the mgmt VM sees; do NOT guess it.
# The :? expansion fails the command if the variable is unset (a literal <measured-src>
# placeholder would be parsed by bash as a redirection).
( { source ~/admin-openrc
  CAPI_PID=$(openstack project show capi-mgmt --domain capi -f value -c id)
  unset OS_PROJECT_NAME OS_PROJECT_ID OS_TENANT_NAME OS_TENANT_ID; export OS_PROJECT_ID="$CAPI_PID"
  SG=$(openstack security group show capi-mgmt-sg -f value -c id)
  openstack security group rule create --proto tcp --dst-port 6443 \
    --remote-ip "${MEASURED_SRC:?set MEASURED_SRC to the measured source IP}/32" "$SG"
} )
```

## Step 7.2 -- Place the mgmt kubeconfig on the conductor [SENSITIVE; not batched]
The source `~/capi-mgmt.kubeconfig` already has its
server rewritten to the FIP (phase-06 6.5). Transfer base64-piped straight into a
root-written 0600 file owned by the conductor user -- never stage the admin
kubeconfig in /tmp (appendix-A: L-P6-4).

**RUN -- jumphost -> magnum/0**
```bash
# discover the conductor service user (expect: magnum)
juju ssh -m openstack magnum/0 'systemctl show magnum-conductor -p User --value' </dev/null

# transfer (umask 077; chown to the discovered user; 0600)
# NOTE: NO trailing </dev/null here -- stdin IS the payload. A </dev/null would
# override the pipe (SC2259) and silently write an EMPTY kubeconfig while the
# && chain still exits 0. DOCFIX-021 applies only to commands whose stdin is
# NOT in use; the discovery line above keeps it, this pipe must not.
base64 ~/capi-mgmt.kubeconfig | juju ssh -m openstack magnum/0 \
  "sudo bash -c 'umask 077; base64 -d > /etc/magnum/kubeconfig && \
   getent passwd magnum >/dev/null && chown magnum: /etc/magnum/kubeconfig && \
   chmod 0600 /etc/magnum/kubeconfig'"

# verify byte-exact (hashes must match before proceeding)
sha256sum ~/capi-mgmt.kubeconfig
juju ssh -m openstack magnum/0 'sudo sha256sum /etc/magnum/kubeconfig' </dev/null
```
**GATE:** the two sha256 hashes are identical (an empty or truncated transfer fails here,
not three steps later as a confusing conductor auth error).
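
The hash comparison can be scripted so that an empty transfer is named explicitly. A minimal sketch; `kubeconfig_hash_gate` is a hypothetical helper, and the constant is the well-known SHA-256 of zero bytes:

```bash
# SHA-256 of an empty input; matching hashes of this value mean BOTH files are empty.
EMPTY_SHA256=e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

# Hypothetical helper: compare two `sha256sum` output lines (hash is field 1).
kubeconfig_hash_gate() {
  local a b
  a=$(printf '%s' "$1" | tr -d '\r' | awk '{print $1; exit}')
  b=$(printf '%s' "$2" | tr -d '\r' | awk '{print $1; exit}')
  [ -n "$a" ] && [ "$a" = "$b" ] || { echo "GATE FAIL: hashes differ"; return 1; }
  [ "$a" != "$EMPTY_SHA256" ] || { echo "GATE FAIL: both files are empty"; return 1; }
  echo "GATE OK: $a"
}

# Usage:
#   kubeconfig_hash_gate "$(sha256sum ~/capi-mgmt.kubeconfig)" \
#     "$(juju ssh -m openstack magnum/0 'sudo sha256sum /etc/magnum/kubeconfig' </dev/null)"
```
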
End-to-end proof (the conductor user authenticates to the mgmt cluster via the FIP) --
DOCFIX-063: helm is installed in Step 7.4, so on a fresh conductor it is ABSENT here. RUN THIS
CHECK AFTER STEP 7.4 (integrity above + the 7.1 TCP reachability already gate 7.2 without it):

**CHECK (read-only; run AFTER Step 7.4) -- jumphost -> magnum/0**
```bash
juju ssh -m openstack magnum/0 \
  'command -v helm >/dev/null 2>&1 || { echo "helm MISSING -- run this after Step 7.4"; exit 0; }; \
   sudo -u magnum env HOME=/tmp helm --kubeconfig /etc/magnum/kubeconfig list -A' </dev/null
```
**Expect:** the mgmt-cluster helm releases listed (cert-manager, ck-dns, ck-network
cilium, cluster-api-addon-provider, cluster-api-janitor-openstack, metrics-server).
**GATE:** a populated list = reach + auth OK. (Hardening, Roosevelt: replace this
cluster-admin kubeconfig with a scoped ServiceAccount kubeconfig.)

## Step 7.3 -- Confirm the driver target + served CAPI versions (D-042)
The fix is the RELEASED tag
`magnum-capi-helm==1.4.0` (the "generalize-api-resources" feature). 1.3.0 read the
version-less v1beta2 `infrastructureRef` and failed the health GET; 1.4.0 resolves each
resource query as `api_resources.get(<Kind>,{}).get("api_version", <code-default>)`,
where the driver's CODE defaults are v1beta1 for every CAPI core kind (Cluster /
MachineDeployment / Machine -> cluster.x-k8s.io/v1beta1; OpenstackCluster ->
infrastructure.cluster.x-k8s.io/v1beta1; K8sControlPlane ->
controlplane.cluster.x-k8s.io/v1beta1). IMPORTANT: the `api_resources` OPTION itself
defaults to an EMPTY map `{}` -- the v1beta1 values are code-level fallbacks, NOT option
defaults. This cluster serves v1beta1 (CAPI v1.13 still serves it; unserved only in
v1.16), so an empty `api_resources` yields v1beta1 lookups that match -- no per-kind
override needed.

Sanity-confirm v1beta1 is SERVED per group before installing (DOCFIX-063):

**CHECK (read-only) -- jumphost**
```bash
# DOCFIX-063: `kubectl api-resources` prints only the PREFERRED apiVersion per kind, so a core
# group that PREFERS v1beta2 (cluster/controlplane/bootstrap under CAPI v1.13) shows header-only
# under a v1beta1 filter -- a FALSE "not served". `kubectl api-versions` lists ALL served
# group/versions -- the definitive answer for whether the driver's v1beta1 GETs will resolve.
( {
  export KUBECONFIG="$HOME/capi-mgmt.kubeconfig"
  kubectl api-versions | grep -E 'cluster\.x-k8s\.io/' | sort
} )
#   Expect v1beta1 SERVED for the core groups (alongside v1beta2 as preferred):
#     cluster.x-k8s.io/v1beta1, controlplane.cluster.x-k8s.io/v1beta1,
#     bootstrap.cluster.x-k8s.io/v1beta1, infrastructure.cluster.x-k8s.io/v1beta1,
#     addons.cluster.x-k8s.io/v1beta1. If a CORE group serves ONLY v1beta2 (v1beta1 ABSENT),
#     override just that kind via api_resources in Step 7.6; otherwise the empty default works.
```
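
The per-group expectation can be turned into a pass/fail check. A minimal sketch covering the three groups the driver queries according to the text above (cluster, controlplane, infrastructure); `v1beta1_served_gate` is a hypothetical helper name:

```bash
# Hypothetical helper: stdin is raw `kubectl api-versions` output.
v1beta1_served_gate() {
  local served missing="" g
  served=$(tr -d '\r')
  for g in cluster.x-k8s.io controlplane.cluster.x-k8s.io infrastructure.cluster.x-k8s.io; do
    # -x: whole-line match, -F: fixed string (dots are literal)
    printf '%s\n' "$served" | grep -qxF "$g/v1beta1" || missing="$missing $g"
  done
  [ -z "$missing" ] || { echo "v1beta1 NOT served for:$missing"; return 1; }
  echo "v1beta1 served for all driver-queried groups"
}

# Usage:
#   KUBECONFIG="$HOME/capi-mgmt.kubeconfig" kubectl api-versions | v1beta1_served_gate
```

Any group named in the failure message is a candidate for an `api_resources` override in Step 7.6.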

## Step 7.4 -- Install the driver (1.4.0) + helm in the conductor container
`--no-deps` keeps pip from touching the deb-managed oslo stack (no
PEP 668 issue on the 22.04 container, so the system-wide pip install is not refused).

**RUN -- jumphost -> magnum/0**
```bash
# egress pre-check (DOCFIX-063: hit a REAL asset -- bare https://get.helm.sh/ 404s (no root
# index) and looks like a failure; the versioned sha256sum URL is a true 200 reachability probe)
juju ssh -m openstack magnum/0 \
  'curl -s -o /dev/null -w "pypi:%{http_code}\n" https://pypi.org/simple/ ; \
   curl -s -o /dev/null -w "helm:%{http_code}\n" https://get.helm.sh/helm-v3.17.3-linux-amd64.tar.gz.sha256sum' </dev/null

# helm v3.17.3 -- INSTALL it (not check+echo) and put it on the CONDUCTOR's PATH (DOCFIX-035).
# The magnum-conductor LSB-init PATH excludes /usr/local/bin, so the driver's `helm` shell-out
# fails even when an interactive `juju ssh` shell finds it. Install the binary to /usr/local/bin
# AND symlink /usr/bin/helm -> it (/usr/bin IS on the restricted init PATH). Checksum-verified.
juju ssh -m openstack magnum/0 'set -e
  WANT=v3.17.3
  if [ -x /usr/bin/helm ] && /usr/bin/helm version --short 2>/dev/null | grep -q "$WANT"; then
    echo "[SKIP] /usr/bin/helm already $WANT"
  else
    T=helm-$WANT-linux-amd64.tar.gz
    D=$(mktemp -d); cd "$D"
    curl -fsSLO "https://get.helm.sh/$T"
    EXP=$(curl -fsSL "https://get.helm.sh/$T.sha256sum" | cut -d" " -f1)
    GOT=$(sha256sum "$T" | cut -d" " -f1)
    [ -n "$EXP" ] && [ "$EXP" = "$GOT" ] || { echo "GATE FAIL: helm checksum exp=$EXP got=$GOT"; exit 1; }
    tar xzf "$T"
    sudo install -o root -g root -m 0755 linux-amd64/helm /usr/local/bin/helm
    sudo ln -sfn /usr/local/bin/helm /usr/bin/helm
    cd /; rm -rf "$D"
    echo "[OK] installed $(/usr/bin/helm version --short)"
  fi' </dev/null

# DOCFIX-035 GATE: helm must resolve from the conductor's RESTRICTED init PATH (no /usr/local/bin),
# not just an interactive shell. Reproduce that PATH and confirm `helm` is found (via /usr/bin):
juju ssh -m openstack magnum/0 'env -i PATH=/usr/sbin:/usr/bin:/sbin:/bin sh -c "command -v helm && helm version --short"' </dev/null

# install the RELEASED contract-coherent driver (supersedes 1.3.0)
juju ssh -m openstack magnum/0 'sudo python3 -m pip install --no-deps --upgrade "magnum-capi-helm==1.4.0"' </dev/null

# verify the install + entry point
juju ssh -m openstack magnum/0 \
  'python3 -m pip show magnum-capi-helm | grep -E "Version|Location"; \
   python3 -c "import importlib.metadata as m; print([e.name for e in m.entry_points(group=\"magnum.drivers\")])"' </dev/null
```
**Expect:** helm reachable on the restricted PATH -- the gate prints `/usr/bin/helm` + `v3.17.3`
(DOCFIX-035; `command -v helm` in a login shell is NOT sufficient proof). Driver Version 1.4.0;
`k8s_capi_helm_v1` present in the entry points.
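
Why a login-shell `command -v` proves nothing can be shown with a small helper that resolves a command under one explicit PATH and an otherwise empty environment. A sketch only; `resolves_on_path` is a hypothetical name, and `/bin/sh` is assumed to exist on the host:

```bash
# Hypothetical helper: does CMD resolve when PATH is exactly the given value?
# env -i clears the environment; /bin/sh is called by absolute path so the
# lookup under test is only the one done by `command -v`.
resolves_on_path() {
  local cmd=$1 path=$2
  env -i PATH="$path" /bin/sh -c 'command -v "$1"' sh "$cmd"
}

# Usage (on the conductor):
#   resolves_on_path helm /usr/sbin:/usr/bin:/sbin:/bin      # the restricted init PATH
#   resolves_on_path helm /usr/local/bin                     # where the binary lives
```

With the `/usr/bin/helm` symlink in place the first call prints `/usr/bin/helm`; without it, only the second call succeeds, which is the DOCFIX-035 failure mode.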

## Step 7.5 -- api_resources (D-042; set EXPLICITLY to an empty map on this cluster)
1.4.0 exposes ONE [capi_helm] option for this -- `api_resources`, a JSON string mapping
CAPI kinds (Cluster, OpenstackCluster, MachineDeployment, K8sControlPlane, Machine,
Manifests, HelmRelease) to `{api_version, plural_name}`. The driver's CODE falls back to
v1beta1 for every CAPI core kind when that kind is absent from the map (Step 7.3), and
this cluster serves v1beta1 -- so the map's CONTENTS are empty here. But set it
EXPLICITLY to `{}` in the drop-in (Step 7.6) rather than omit it: the option's registered
default is a Python dict `{}` and the driver runs `json.loads()` on the value, so an
explicit string `{}` avoids depending on how oslo coerces a non-string default (not
empirically testable in the build environment -- explicit-set is the safe choice).
Override a specific kind ONLY if Step 7.3 showed it serves ONLY v1beta2, e.g.
`api_resources = {"Cluster": {"api_version": "cluster.x-k8s.io/v1beta2"}}`.
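
The lookup rule (an entry in `api_resources` wins, otherwise the code default applies) can be modelled in a few lines. This is a bash model of the behaviour described in Step 7.3, NOT the driver's Python source; the array and function names are hypothetical, and only the kinds whose defaults are stated above are included (requires bash 4+ for associative arrays):

```bash
# Code-level fallbacks, as listed in Step 7.3.
declare -A CODE_DEFAULT=(
  [Cluster]=cluster.x-k8s.io/v1beta1
  [MachineDeployment]=cluster.x-k8s.io/v1beta1
  [Machine]=cluster.x-k8s.io/v1beta1
  [OpenstackCluster]=infrastructure.cluster.x-k8s.io/v1beta1
  [K8sControlPlane]=controlplane.cluster.x-k8s.io/v1beta1
)
# The option's value: an empty map on this cluster.
declare -A API_RESOURCES=()

resolve_api_version() {
  local kind=$1
  # An override in API_RESOURCES wins; otherwise fall back to the code default.
  printf '%s\n' "${API_RESOURCES[$kind]:-${CODE_DEFAULT[$kind]}}"
}

resolve_api_version Cluster                        # cluster.x-k8s.io/v1beta1
API_RESOURCES[Cluster]=cluster.x-k8s.io/v1beta2    # the override example above
resolve_api_version Cluster                        # cluster.x-k8s.io/v1beta2
resolve_api_version Machine                        # cluster.x-k8s.io/v1beta1 (untouched)
```

The last call shows why an override can name just one kind: every kind absent from the map keeps its v1beta1 fallback.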

## Step 7.6 -- Stage the [capi_helm] conf.d drop-in (D-037)
The drop-in is root-owned 0644 and carries NO secrets (it only points at the 0600
kubeconfig). The `default_helm_chart_version = 0.25.1` line is LOAD-BEARING (the driver's
built-in default is `0.10.1`, the retired v1alpha6-era chart). `api_resources` is set to
an explicit empty map `{}` (Step 7.5 -- the driver's code falls back to v1beta1 for every
CAPI core kind, which this cluster serves; explicit `{}` avoids the dict-default `json.loads`
question). Keep the file ASCII only.

**RUN -- jumphost -> magnum/0**
```bash
# DOCFIX-063: /etc/magnum/magnum.conf.d/ does NOT exist on a fresh rebuild (the deb ships
# magnum.conf, not the .conf.d dir; tee cannot create a missing parent). Create it first
# (root:root 0755, magnum-traversable for --config-dir; also the Step 7.7 config-dir target).
juju ssh -m openstack magnum/0 'sudo install -d -o root -g root -m 0755 /etc/magnum/magnum.conf.d' </dev/null

juju ssh -m openstack magnum/0 "sudo tee /etc/magnum/magnum.conf.d/00-capi-helm.conf >/dev/null <<'CONF'
[capi_helm]
kubeconfig_file = /etc/magnum/kubeconfig
helm_chart_repo = https://azimuth-cloud.github.io/capi-helm-charts
helm_chart_name = openstack-cluster
default_helm_chart_version = 0.25.1
api_resources = {}
CONF" </dev/null
```
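If (and only if) Step 7.3 showed a core kind is v1beta2-only, replace the
`api_resources = {}` line with the override -- ONE line, starting in column 1 (an indented
line risks being read as a continuation of the previous value), a JSON value naming just
the kinds that need it:
```
api_resources = {"Cluster": {"api_version": "cluster.x-k8s.io/v1beta2"}, ...}
```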
Re-check ASCII cleanliness:

**CHECK (read-only) -- jumphost -> magnum/0**
```bash
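# DOCFIX-063: sudo the grep -- /etc/magnum is root-owned (0750 root:magnum); a non-sudo read
# gets "Permission denied" and a bare `|| echo` prints a FALSE "ASCII clean". Branch on the
# exit code so any grep error (rc >= 2, e.g. a missing file) is reported, not read as clean.
juju ssh -m openstack magnum/0 \
  'sudo env LC_ALL=C grep -nP "[^\x00-\x7F]" /etc/magnum/magnum.conf.d/00-capi-helm.conf; rc=$?; \
   case $rc in 0) echo NON-ASCII;; 1) echo "ASCII clean";; *) echo "GREP ERROR rc=$rc";; esac' </dev/null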
```

## Step 7.7 -- Wire config-dir injection via /etc/default (D-037 REVISED; NOT a systemd drop-in)
These OpenStack debs run the daemon through an LSB
init script wrapped by systemd `systemd-start`; a systemd `ExecStart` drop-in is
INERT (appendix-A: D-037, L-P6-1/L-P6-2). The sanctioned extension point is
`/etc/default/magnum-conductor`, sourced inside the init script AFTER the base
`--config-file` is assembled. The charm does not manage that file.

**RUN -- jumphost -> magnum/0**
```bash
# confirm the daemon currently has NO --config-dir (the problem we are fixing)
juju ssh -m openstack magnum/0 'ps -ww -C magnum-conductor -o args=' </dev/null

# create the per-service extension (literal $DAEMON_ARGS -- it expands at source time)
juju ssh -m openstack magnum/0 \
  "echo 'DAEMON_ARGS=\"\$DAEMON_ARGS --config-dir /etc/magnum/magnum.conf.d\"' \
   | sudo tee /etc/default/magnum-conductor >/dev/null && \
   sudo chmod 0644 /etc/default/magnum-conductor" </dev/null

# DRY-RUN verify WITHOUT restarting: the init script's own show-args echoes the assembled cmdline
juju ssh -m openstack magnum/0 '/etc/init.d/magnum-conductor show-args' </dev/null
```
**GATE:** `show-args` must show BOTH `--config-file=/etc/magnum/magnum.conf` AND
`--config-dir /etc/magnum/magnum.conf.d`. Do not restart until this passes.
RESIDUAL (logged): if a future charm hook ever writes /etc/default/magnum-conductor,
the append is lost and [capi_helm] silently stops being read -- detect via show-args/ps.
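
The gate condition can be asserted on whatever text `show-args` (or later `ps`) prints. A minimal sketch; `config_dir_gate` is a hypothetical helper that only looks for the two literal fragments named in the GATE and makes no assumption about the rest of the output format:

```bash
# Hypothetical helper: stdin is an assembled command line (show-args or ps output).
config_dir_gate() {
  local args
  args=$(cat)
  printf '%s\n' "$args" | grep -qF -- '--config-file=/etc/magnum/magnum.conf' \
    && printf '%s\n' "$args" | grep -qF -- '--config-dir /etc/magnum/magnum.conf.d' \
    || { echo "GATE FAIL: $args"; return 1; }
  echo "GATE OK"
}

# Usage:
#   juju ssh -m openstack magnum/0 '/etc/init.d/magnum-conductor show-args' </dev/null \
#     | config_dir_gate
```

The same check works against the live `ps` output after a restart, which is also how the RESIDUAL charm-overwrite case would be detected.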

## Step 7.7b -- Force keystone v3 for magnum-api via a magnum.conf.d drop-in (D-047)
The charm renders `auth_version = v2.0` in magnum.conf
`[keystone_authtoken]`/`[keystone_auth]` (a template type-compare bug; Caracal keystone does
not serve v2.0). On THIS deploy it is COSMETIC -- magnum's domain_admin_auth rewrites v2.0->v3
and token validation worked throughout -- but v2.0 is the provably wrong value, so override it
with a drop-in (D-047). Same config-dir mechanism as Step 7.7, but for the magnum-API service:
Step 7.7 wired `--config-dir` only for the conductor, and oslo.config reads `--config-dir` AFTER
`--config-file`, so the drop-in wins. v3 URLs are DERIVED from the live `[keystone_authtoken]`
(no hardcoded VIPs). No restart here -- Step 7.8 restarts both services.

**RUN -- jumphost -> magnum/0**
```bash
juju ssh -m openstack magnum/0 sudo bash -s <<'REOF'
set -e
# (1) wire --config-dir into magnum-api (mirror Step 7.7's conductor wiring; idempotent)
grep -q -- '--config-dir /etc/magnum/magnum.conf.d' /etc/default/magnum-api 2>/dev/null \
  || echo 'DAEMON_ARGS="$DAEMON_ARGS --config-dir /etc/magnum/magnum.conf.d"' >> /etc/default/magnum-api
chmod 0644 /etc/default/magnum-api
# (2) derive v3 URLs from the live [keystone_authtoken] block; write the override drop-in
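WWW=$(awk -F'= ' '/^\[keystone_authtoken\]/{s=1} s&&/^www_authenticate_uri/{print $2; exit}' /etc/magnum/magnum.conf)
AURL=$(awk -F'= ' '/^\[keystone_authtoken\]/{s=1} s&&/^auth_url/{print $2; exit}' /etc/magnum/magnum.conf)
# refuse to write a drop-in from empty values (an unparsed key would otherwise yield the bare URL "/v3")
[ -n "$WWW" ] && [ -n "$AURL" ] || { echo "GATE FAIL: www_authenticate_uri/auth_url not found in [keystone_authtoken]"; exit 1; }
# strip a trailing slash first so ".../v2.0/" cannot become ".../v3/v3"
WWW3=${WWW%/};   WWW3=${WWW3/\/v2.0//v3};   case "$WWW3"  in */v3) ;; *) WWW3="$WWW3/v3";;  esac
AURL3=${AURL%/}; AURL3=${AURL3/\/v2.0//v3}; case "$AURL3" in */v3) ;; *) AURL3="$AURL3/v3";; esac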
printf '[keystone_authtoken]\nauth_version = v3\nwww_authenticate_uri = %s\nauth_url = %s\n[keystone_auth]\nauth_version = v3\nwww_authenticate_uri = %s\nauth_url = %s\n' \
  "$WWW3" "$AURL3" "$WWW3" "$AURL3" > /etc/magnum/magnum.conf.d/50-keystone-v3-override.conf
chmod 0644 /etc/magnum/magnum.conf.d/50-keystone-v3-override.conf
echo "[OK] 50-keystone-v3-override.conf:"; cat /etc/magnum/magnum.conf.d/50-keystone-v3-override.conf
REOF
```
**GATE:** the drop-in lists `auth_version = v3` + `/v3` URLs in BOTH sections, and
`grep -- --config-dir /etc/default/magnum-api` returns the line. The effective value is
proven in Step 7.8 by the magnum-api launched cmdline carrying `--config-dir` (L-P6-1/2:
gate on the assembled cmdline, not the file text). Restart happens in Step 7.8.
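
The v3 URL derivation is easier to trust once it has been run against a few inputs. A sketch of the same parameter-expansion approach as a standalone function; `to_v3` is a hypothetical helper, the addresses are made-up examples, and the function strips a trailing slash before substituting so that an input ending in `/v2.0/` cannot come out as `/v3/v3`:

```bash
# Hypothetical helper: normalise a keystone URL to its /v3 form.
to_v3() {
  local url=$1
  [ -n "$url" ] || { echo "empty URL (key not found in magnum.conf?)" >&2; return 1; }
  url=${url%/}                 # drop one trailing slash, if any
  url=${url/\/v2.0//v3}        # swap a /v2.0 segment for /v3
  case "$url" in */v3) ;; *) url="$url/v3" ;; esac   # otherwise append /v3 once
  printf '%s\n' "$url"
}

to_v3 http://10.0.0.5:5000/v2.0    # http://10.0.0.5:5000/v3
to_v3 http://10.0.0.5:5000/        # http://10.0.0.5:5000/v3
to_v3 http://10.0.0.5:5000/v3      # http://10.0.0.5:5000/v3 (unchanged)
```

Whatever the helper prints should match the URLs echoed by the drop-in `cat` at the end of the RUN block.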

## Step 7.8 -- Restart conductor + verify driver + HEALTHY (P6e + D-042 Stage 6)
Restart on magnum/0, then a jumphost-side health poll.

**RUN -- jumphost -> magnum/0**
```bash
juju ssh -m openstack magnum/0 \
  'sudo systemctl restart magnum-conductor magnum-api && sleep 3 && \
   systemctl is-active magnum-conductor magnum-api && \
   echo "--- conductor args ---"; ps -ww -C magnum-conductor -o args=; \
   echo "--- magnum-api args (D-047: must carry --config-dir) ---"; ps -ww -C magnum-api -o args=' </dev/null
# expect: both active; BOTH cmdlines carry --config-dir /etc/magnum/magnum.conf.d
# (conductor -> [capi_helm] driver config; magnum-api -> the v3 keystone override).

juju ssh -m openstack magnum/0 'sudo magnum-driver-manage list-drivers 2>/dev/null | grep capi || \
   { echo "driver list (full):"; sudo magnum-driver-manage list-drivers; }' </dev/null
# expect: k8s_capi_helm_v1 listed.
```
Health poll (the D-042 fix target -- this is what 1.3.0 reported UNHEALTHY):

FRESH DEPLOY ROUTING: on a clean redeploy NO cluster exists yet, so there is nothing
to poll -- SKIP this poll; the gate is discharged in phase-08 step 8.2
(`capi-test-1` reaching `health_status = HEALTHY`). The poll below applies when
grafting onto a cloud that already has a CAPI-driver cluster: substitute that
cluster's name and the current `ENV(project)` id (both are run-specific).

**CHECK (read-only) -- jumphost**
```bash
( {
  source ~/admin-openrc
  CAPI_PID=$(openstack project show capi-mgmt --domain capi -f value -c id)   # ENV(project); resolve, never hardcode
  unset OS_PROJECT_NAME OS_PROJECT_ID OS_TENANT_NAME OS_TENANT_ID
  export OS_PROJECT_ID="$CAPI_PID"
  for i in $(seq 1 10); do
    echo "[$i] health=$(openstack coe cluster show capi-test-1 -f value -c health_status 2>/dev/null)"
    echo "    reason=$(openstack coe cluster show capi-test-1 -f value -c health_status_reason 2>/dev/null)"
    sleep 20
  done
} )
```
**GATE:** (existing-cluster graft only): `health_status -> HEALTHY`, with the
`infrastructure` sub-check now `Ready` (it was the only failing axis under 1.3.0).
On a FRESH DEPLOY this gate is deferred to phase-08 step 8.2 -- do not block here.
If it does not clear on an existing-cluster graft, go to Rollback.
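
The poll above always runs all ten iterations, even after the value has cleared. A bounded poll that returns as soon as the target appears gives a usable exit code for the gate. A minimal sketch; `wait_for_value` is a hypothetical helper, not an openstack or magnum command:

```bash
# Hypothetical helper: run CMD up to TRIES times, DELAY seconds apart;
# succeed as soon as its stdout equals WANT, fail if it never does.
wait_for_value() {
  local want=$1 tries=$2 delay=$3 i got
  shift 3
  for ((i = 1; i <= tries; i++)); do
    got=$("$@" 2>/dev/null)
    echo "[$i/$tries] got='${got}'"
    [ "$got" = "$want" ] && return 0
    [ "$i" -lt "$tries" ] && sleep "$delay"
  done
  return 1
}

# Usage (inside the project-scoped subshell from the block above):
#   wait_for_value HEALTHY 10 20 \
#     openstack coe cluster show capi-test-1 -f value -c health_status \
#     || echo "GATE FAIL: not HEALTHY -- go to Rollback"
```
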

## Step 7.9 -- Regression check (confirm create/manage path intact)
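Run in capi-mgmt project scope (source admin-openrc and export the resolved `OS_PROJECT_ID`,
as in the Step 7.8 poll). Prove the upgraded driver still creates and deletes a cluster.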

FRESH DEPLOY ROUTING: SKIP this step -- the `capi-k8s-v1-34` template does not exist
yet (phase-08 step 8.0 creates it), and phase-08 itself (create `capi-test-1` to
CREATE_COMPLETE, full acceptance, then 8.5 delete) is a superset of this check. Run
7.9 as written only when grafting onto an existing cloud where the template is present.

**RUN -- jumphost**
```bash
openstack coe cluster create capi-fix-check --cluster-template capi-k8s-v1-34 \
  --keypair capi-mgmt-key --master-count 1 --node-count 1
# watch to CREATE_COMPLETE, then:
openstack coe cluster delete capi-fix-check    # watch to gone
```
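
"Watch to CREATE_COMPLETE" should also stop on a failure status instead of waiting forever. A small sketch of the decision, assuming the usual Magnum `<ACTION>_IN_PROGRESS` / `_COMPLETE` / `_FAILED` status naming; `classify_status` is a hypothetical helper:

```bash
# Hypothetical helper: map a cluster status string to the watcher's next action.
classify_status() {
  case $1 in
    CREATE_COMPLETE) echo done ;;
    *_FAILED)        echo failed ;;
    *_IN_PROGRESS)   echo wait ;;
    *)               echo unknown ;;
  esac
}

# Usage sketch (project-scoped shell):
#   while :; do
#     s=$(openstack coe cluster show capi-fix-check -f value -c status)
#     case $(classify_status "$s") in
#       done)   echo "CREATE_COMPLETE"; break ;;
#       failed) echo "GATE FAIL: $s"; break ;;
#       *)      sleep 30 ;;
#     esac
#   done
```
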

## Rollback (TEMPORARY holding state only -- if 7.8 health does not clear or 7.9 regresses)
Rollback reverts to the as-first-built functional (cosmetic-UNHEALTHY) state on 1.3.0. This is
a TEMPORARY holding state that keeps the conductor serving while the 1.4.0 issue is
diagnosed, NOT a v1 end state. v1 is NOT complete until `magnum-capi-helm==1.4.0` is installed
and `health_status = HEALTHY` (D-011). Re-attempt 7.3-7.9 after diagnosis.

**RUN -- jumphost -> magnum/0**
```bash
juju ssh -m openstack magnum/0 'sudo python3 -m pip install --no-deps --force-reinstall "magnum-capi-helm==1.3.0"' </dev/null
# restore the config backup if you snapshotted one, then:
juju ssh -m openstack magnum/0 'sudo systemctl restart magnum-conductor' </dev/null
```
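
The rollback mentions a config backup but no step in this phase takes one. One possible shape, offered as a sketch rather than as part of the as-run procedure: tar the conf.d directory before changing it and untar it on rollback. `snapshot_dir` and `restore_dir` are hypothetical helpers, and the archive path in the usage lines is an example only.

```bash
# Hypothetical helpers: archive a directory's contents and put them back.
snapshot_dir() {   # snapshot_dir <src-dir> <archive>
  tar -czf "$2" -C "$1" .
}
restore_dir() {    # restore_dir <archive> <dest-dir>
  tar -xzf "$1" -C "$2"
}

# Usage (as root on the conductor; example archive path):
#   snapshot_dir /etc/magnum/magnum.conf.d /root/magnum-conf.d.pre-1.4.0.tgz
#   ...
#   restore_dir /root/magnum-conf.d.pre-1.4.0.tgz /etc/magnum/magnum.conf.d
```

Restoring overwrites files that were in the snapshot but does not delete files created afterwards, so a drop-in added after the snapshot has to be removed by hand.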

---

## EXIT GATE (phase-07 complete)
- Conductor reaches the mgmt apiserver via the FIP (TCP-OK); kubeconfig 0600/magnum; helm list OK.
- magnum-capi-helm 1.4.0 installed (contract-coherent, RELEASED); `k8s_capi_helm_v1` enumerated.
- [capi_helm] drop-in read by the conductor (`--config-dir` present in the live cmdline).
- `health_status = HEALTHY` (infrastructure Ready) on a CAPI-driver cluster -- D-042
  issue eliminated. FRESH DEPLOY: no cluster exists yet; this item is DEFERRED to
  phase-08 step 8.2 (existing-cluster graft: verify here on that cluster).
- Regression create/delete passed (FRESH DEPLOY: deferred -- phase-08 8.1-8.5 is the
  superset proof).
- Proceed to phase-08 (workload-cluster acceptance + D-011).

## As-built reference (2026-07-01 Pattern-A rebuild graft -- audit trail; supersedes the 2026-06-08/09 pre-D-052 run)
- magnum/0: LXD 1/lxd/2 on openstack1, addr 10.12.12.107 (metal-internal per D-052; was 10.12.4.76 pre-D-052),
  charm magnum 2024.1/stable rev 70, DEB magnum 18.0.1, python3.10, container ubuntu 22.04; conductor user `magnum`.
- Driver: RELEASED magnum-capi-helm==1.4.0 (pip --no-deps; api_resources={} explicit -> code-default v1beta1,
  served by CAPI v1.13.2 / CAPO v0.14.4). This is the v1 baseline; the pre-D-052 run's interim 1.3.0
  (version-less v1beta2 ref -> cosmetic UNHEALTHY, D-042) is superseded.
- kubeconfig: /etc/magnum/kubeconfig, -rw------- magnum, 5641 bytes this rebuild (sha256 26ed1091...6c11),
  server = the mgmt FIP:6443 (per-rebuild; this rebuild 10.12.7.222 -- DOCFIX-038).
- conf.d drop-in /etc/magnum/magnum.conf.d/00-capi-helm.conf: kubeconfig_file, helm_chart_repo
  (azimuth), helm_chart_name openstack-cluster, default_helm_chart_version 0.25.1, api_resources
  set explicitly to `{}` (code-default v1beta1, served by CAPI v1.13.2 / CAPO v0.14.4).
- config-dir injection: /etc/default/magnum-conductor `DAEMON_ARGS="$DAEMON_ARGS --config-dir
  /etc/magnum/magnum.conf.d"`; verified live via `ps` and the init script `show-args`.
- helm v3.17.3 at /usr/local/bin/helm + /usr/bin/helm symlink (DOCFIX-035: on the conductor's restricted init PATH).
- Driver internals (reference, from installed source): routes on (server_type vm, os ubuntu,
  coe kubernetes); k8s version comes from the IMAGE `kube_version` property (NOT a template label),
  os_distro=ubuntu; flavor floor 2048 MB / 2 vCPU; auto-mints an app credential (workload nodes use
  the PUBLIC keystone interface); apiServer ALWAYS provisions an Octavia LB (+FIP default).

## Next
phase-08 -- workload-cluster acceptance: create a tenant cluster from template
`capi-k8s-v1-34`, confirm CREATE_COMPLETE + Ready nodes + Calico + LB, and run the
D-011 (amended per D-019) acceptance criteria.
