diff --git a/README.md b/README.md
index 465cb4e..9578877 100644
--- a/README.md
+++ b/README.md
@@ -78,7 +78,7 @@
 | phase-04 | Network carve (provider external network + IPAM reference)       |
 | phase-05 | Octavia enablement                                               |
 | phase-06 | In-cloud CAPI management cluster (D-035)                         |
-| phase-07 | Magnum conductor graft (magnum-capi-helm driver; D-031/D-037/D-042) |
+| phase-07 | Magnum conductor graft (magnum-capi-helm driver; trustee domain-setup; D-031/D-037/D-042/D-046/D-047) |
 | phase-08 | Workload-cluster acceptance (D-011)                             |
 
 NetBox imports run separately, gated on external NetBox-engineer review
diff --git a/bundle.yaml b/bundle.yaml
index bdeb77b..a253cf9 100644
--- a/bundle.yaml
+++ b/bundle.yaml
@@ -21,7 +21,7 @@
 #                   (10.12.16.0/22), ceph-osd cluster->replication (10.12.20.0/22). Bindings, NOT
 #                   ceph-*-network config, so the LXD-contained mon actually gets a storage NIC.
 #                   Clients bind ceph->storage; container principals carry it too (subset rule). (C2)
-# Magnum:           Layer A only -- CAPI driver graft is Layer B (runbooks/04a + 05)
+# Magnum:           Layer A only -- CAPI driver graft is Layer B (runbooks/phase-06..08)
 # Octavia:          lb-mgmt PKI options supplied via overlays/octavia-pki.yaml (gitignored).
 #                   Amphora-pipeline options baked (use-internal-endpoints etc.). (B4)
 # OVN tunnels:      geneve overlay on the DATA space (10.12.12.0/22) -- ovn-chassis + ovn-chassis-octavia
@@ -466,12 +466,14 @@
   # Kubernetes-as-a-Service: Magnum (Layer A -- CAPI graft is Layer B)
   # =====================================================================
   # NOTE: After bundle deploys, magnum/0 will show active/idle but CANNOT create K8s clusters.
-  # Layer B (post-deploy) brings it to life:
-  #   1. capi-mgmt VM with k3s + CAPI operators              (runbook 04a)
-  #   2. pip install magnum-capi-helm==1.1.0 into magnum venv (runbook 05)
-  #   3. /etc/magnum/magnum.conf.d/99-capi.conf with enabled_drivers
-  #   4. Install kubeconfig at /etc/magnum/kubeconfig
-  #   5. Create Keystone capi-mgmt project + capo user + app credential
+  # Layer B (post-deploy) brings it to life -- see runbooks/phase-06..08:
+  #   1. In-cloud single-homed mgmt VM (capi-mgmt-v2) with k8s-snap + CAPI/CAPO  (phase-06; D-035)
+  #   2. magnum-capi-helm==1.4.0 grafted onto the conductor                      (phase-07; D-037/D-042)
+  #   3. /etc/magnum/magnum.conf.d/00-capi-helm.conf (driver) + 50-keystone-v3-override.conf,
+  #      both read via --config-dir wired into /etc/default/magnum-{conductor,api}  (D-037/D-047)
+  #   4. kubeconfig at /etc/magnum/kubeconfig (server = the mgmt FIP)             (phase-07)
+  #   5. magnum trustee domain-setup (REQUIRED; D-046); per-cluster app-creds are
+  #      minted by magnum at cluster-create -- NO static capo user/app-cred       (D-039)
 
   magnum:
     charm: magnum
@@ -647,7 +649,7 @@
   - [barbican-vault:secrets-storage, vault:secrets]
   - [barbican:ha, barbican-hacluster:ha]
 
-  # ---- Magnum (Layer A only; CAPI graft is Layer B/runbook 05)
+  # ---- Magnum (Layer A only; CAPI graft is Layer B/runbooks phase-06..08)
   - [magnum-mysql-router:db-router, mysql-innodb-cluster:db-router]
   - [magnum:shared-db, magnum-mysql-router:shared-db]
   - [magnum:identity-service, keystone:identity-service]
diff --git a/docs/design-decisions.md b/docs/design-decisions.md
index 5cef1c9..ad98255 100644
--- a/docs/design-decisions.md
+++ b/docs/design-decisions.md
@@ -121,7 +121,7 @@
 ---
 
 
-## D-007: Magnum inclusion
+## D-007: Magnum inclusion (Layer A current; Layer B mechanism/topology superseded -- see D-035 / D-037 / D-042)
 
 **Decision:** Magnum in bundle from day one. Two-layer install.
 
@@ -146,7 +146,7 @@
 
 **CAPI mgmt plane:** Post-pivot, the workload cluster IS the CAPI management plane (per **runbook 04a §17**, `clusterctl move` pivots cluster state from the `capi-mgmt.maas` bootstrap k3s into the workload cluster, which becomes self-managing). Per **D-017**, both the bootstrap k3s and the workload cluster are rebuilt from scratch every deployment cycle — there is no preserved-across-rebuild artifact. The bootstrap install + pivot procedure lives in `runbooks/04a-capi-bootstrap-cluster.md` and runs **before** this runbook. This pattern transfers to Roosevelt unchanged.
 
-**Superseded portions:** The "preserved across rebuild" stance in earlier drafts of this decision is **superseded by D-017**. See D-017 for rationale. The earlier `stackhpc/magnum-capi-helm` v0.13.0 driver pin is superseded by the `openstack/magnum-capi-helm` 1.1.0 pin above (driver source repo moved + archived).
+**Superseded portions:** The "preserved across rebuild" stance in earlier drafts of this decision is **superseded by D-017**. See D-017 for rationale. The earlier `stackhpc/magnum-capi-helm` v0.13.0 driver pin is superseded by the `openstack/magnum-capi-helm` 1.1.0 pin above (driver source repo moved + archived). The Layer B *mechanism and topology* are now further superseded: the CAPI management plane is an in-cloud single-homed VM with NO `clusterctl move` -- the kubeconfig points at that mgmt cluster, not a post-pivot workload cluster (**D-035**); the conductor graft is `/etc/default/magnum-conductor` `DAEMON_ARGS --config-dir` + `00-capi-helm.conf`, NOT a systemd-ExecStart override + `99-capi.conf` (**D-037**); the driver pin is **1.4.0** for CAPI-core contract coherence (**D-042**); and the per-cluster app-cred replaces any static `capo` credential (**D-039**). Layer A (the bundle) is current. Live deploy steps are runbooks/phase-06 (mgmt cluster) and phase-07 (conductor graft + `domain-setup` D-046 + keystone-v3 drop-in D-047); `runbooks/05-magnum-capi-driver.md` is historical.
 
 ---
 
@@ -312,7 +312,7 @@
 ---
 
 
-## D-017: CAPI bootstrap cluster lifecycle
+## D-017: CAPI bootstrap cluster lifecycle (bootstrap mechanism superseded by D-035; full-rebuild principle retained)
 
 **Decision:** L3 full teardown and rebuild every deployment cycle. The `capi-mgmt.maas` MAAS VM is released back to Ready state on teardown; on rebuild, it is re-deployed from scratch with Ubuntu 24.04, k3s, CAPI controllers, and ORC. **Nothing is preserved across cycles.**
 
@@ -327,6 +327,8 @@
 
 **Supersedes:** the "preserved across rebuild" stance in earlier drafts of D-007 and D-013.
 
+**Superseded by:** **D-035** retired the bootstrap MECHANISM -- there is no `capi-mgmt.maas` MAAS VM, no k3s, and no `clusterctl move`/pivot; the management plane is an in-cloud single-homed tenant VM (`capi-mgmt-v2`, k8s-snap) built in runbooks/phase-06. The full-teardown-and-rebuild-every-cycle PRINCIPLE stated here is RETAINED (now realized by phase-00 teardown + the phase-06 rebuild). `runbooks/04a-capi-bootstrap-cluster.md` and `05-magnum-capi-driver.md` are historical (folded into phase-06/07).
+
 **Alternatives considered:**
 
 - L1: Wipe just the cluster CRs, keep k3s + controllers. Rejected: skips the install rehearsal that's the whole point.
@@ -473,11 +475,11 @@
 
 ## D-035: Management-cluster placement -- in-cloud single-homed tenant VM
 
-**Decision:** run the CAPI management cluster as a single-homed in-cloud tenant VM (`capi-mgmt-v2`): one NIC on the management tenant subnet (10.20.0.0/24), reached via a floating IP (10.12.7.40); k8s-snap (channel `1.32-classic/stable`), Cilium CNI; not CAPI-self-managed (no `clusterctl move`).
+**Decision:** run the CAPI management cluster as a single-homed in-cloud tenant VM (`capi-mgmt-v2`): one NIC on the management tenant subnet (10.20.0.0/24), reached via a per-rebuild floating IP (10.12.5.103 on this rebuild; the original 10.12.7.40 is dead -- DOCFIX-038); k8s-snap (channel `1.32-classic/stable`), Cilium CNI; not CAPI-self-managed (no `clusterctl move`).
 
 **Rationale:** D-033's out-of-cloud node was necessarily dual-homed and its pod egress to the OpenStack API VIPs failed -- the Cilium reverse-NAT reply was emitted back out the second NIC instead of being redirected into the pod via `cilium_host` (a multi-NIC reverse-path fault; the `k8s` charm exposes too few Cilium annotations to repair it). A single-homed VM removes the second NIC and the fault entirely. The single-NIC pod-egress premise was then proven by the Phase 4 hard gate (an agnhost pod TCP probe to the Keystone VIP 10.12.4.50:5000 returning exitCode 0).
 
-**Status:** Adopted 2026-06-08; pod-egress premise validated. **Supersedes:** D-033 (revisits D-030 in simpler form). **Unaffected:** D-031, D-034.
+**Status:** Adopted 2026-06-08; pod-egress premise validated. **Supersedes:** D-033 (revisits D-030 in simpler form); also retires the Layer B topology of **D-007** and the bootstrap mechanism of **D-017** (k3s-on-MAAS + `clusterctl move` -> in-cloud single-homed VM, no pivot). **Unaffected:** D-031, D-034.
 
 **Trade-off:** a single-node management cluster is a SPOF with no self-heal -- see D-041 (manual-start policy) and D-040 (the OOM that surfaced it).
 
@@ -505,6 +507,20 @@
 
 ---
 
+## D-039: Magnum mints per-cluster app-creds carrying the trustor's roles (grant load-balancer_member)
+
+**Status:** ACCEPTED 2026-06-09 (applied in phase-06; asserted in phase-08 prereqs). Cited by phase-06 (DOCFIX-036 grant), phase-08, and appendix-A; recorded here to close the dangling reference.
+
+**Context:** the magnum-capi-helm service path uses NO static, pre-provisioned application credential. At cluster-create magnum mints a per-cluster Keystone application credential from a trust, and that app-cred carries the TRUSTOR's roles FROZEN at mint time, delegated unfiltered. The trustor is the identity that creates the cluster in the capi-mgmt project (admin@admin_domain in v1).
+
+**Decision:** the trustor must hold `load-balancer_member` (plus `member` and `reader`) on the capi-mgmt project BEFORE any cluster is created, so every minted app-cred carries Octavia authority. A trustor holding only `member` mints a frozen app-cred that 403s when CAPO queries Octavia to confirm LB state -- the workload cluster then wedges at API-LB provisioning, and a stuck-delete 403s the same way (appendix-A). The grant is idempotent (member + load-balancer_member + reader) and applied in phase-06.
+
+**Roosevelt implication:** whichever identity creates Magnum/CAPI clusters must carry `load-balancer_member` + `reader` (in addition to `member`) on the cluster project; `member`-only is a latent 403. There is no static `capo` user/app-cred to provision -- that pattern was retired with the per-cluster mint.
+
+**Related:** D-031 / D-036 (driver/engine surface), D-046 (the trustee domain the trust resolves against), appendix-A (D-039 + stuck-delete recovery).
+
+---
+
 ## D-040: Raise nova-compute reserved-host-memory on the hyperconverged hosts
 
 **Decision:** set `nova-compute reserved-host-memory` to 8192 MB (from the default 512) so Nova placement accounts for the non-Nova memory co-located on each hyperconverged host. Charm config -> survives redeploy.
@@ -573,6 +589,8 @@
 | 2026-06-08 | D-034 (CAPI constellation pinned to dependencies.json; supersedes D-022), D-035 (in-cloud single-homed mgmt VM; supersedes D-033), D-036 (driver/chart/CAPO coherence resolved), D-037 ([capi_helm] via /etc/default DAEMON_ARGS) added. | In-cloud mgmt pivot |
 | 2026-06-09 | D-040 (reserved-host-memory 8192), D-041 (non-HA manual-start policy), D-042 (driver<->core contract coherence; 1.4.0 pin) added. | OOM incident + driver fix |
 | 2026-06-09 | D-019..D-042 consolidated into this document (15 decisions). Existing D-001..D-018 left intact (em-dash style preserved); the new entries are ASCII. | Repo sanitation / doc refresh |
+| 2026-06-17 | D-044 (Horizon secure-cookie override) + D-045 (haproxy confirmed-LOADED) folded from the changes-doc; D-046 (magnum trustee-domain setup) + D-047 (keystone v2.0 render bug / v3 drop-in) merged and renumbered from the 06-17 addendum (its "D-044/D-045"); D-048 (stage-and-verify canonical image seed, supersedes web-download) + D-049 (D1: kube v1.34.8 / capi-k8s-v1-34) added; D-042 amended (FINDING-5: UNHEALTHY is a <=1.3.0 false-negative, HEALTHY on 1.4.0); D-050 (PROPOSED: keystone policyd-override) recorded. | End-of-deploy runbook sweep |
+| 2026-06-18 | D-039 (Magnum per-cluster app-cred roles; grant load-balancer_member) recorded to close a dangling reference (previously cited only in phase-06/08, bundle, appendix-A). D-007 + D-017 annotated as superseded by D-035 / D-037 / D-042 (in-cloud mgmt VM; /etc/default config-dir graft; 1.4.0 driver) -- historical bodies retained. | Pre-commit audit (runbook sweep) |
 
 <!-- patchset-20260610-decisions-addendum -->
 
@@ -626,3 +644,95 @@
 Note: the restart procedure's failure-mode table already references the config
 key for SHUTOFF guests; whichever option is chosen, align that table, this
 decision, and the bundle/runbook with each other.
+
+<!-- patchset-20260617-sweep-decisions : end-of-deploy runbook sweep -->
+
+---
+
+## D-044: Horizon Secure-cookie override for internal-HTTP dashboard access (DOCFIX-030)
+
+**Status:** Adopted 2026-06-17 (PER-REBUILD; phase-03 Step 3.3). Resolves the mislabeled "D-043" tag used for this item in earlier phase-03/changes-doc drafts -- D-043 is the tenant-VM auto-resume decision.
+
+**Decision:** the jumphost reaches Horizon over a plain-HTTP reverse-proxy leg, but the dashboard sets `SESSION_COOKIE_SECURE`/`CSRF_COOKIE_SECURE=True`, so the browser drops the session/CSRF cookies over HTTP and login fails. Apply a Django settings override on the openstack-dashboard leader (`_99_internal_http_cookies.py`, setting `SESSION_COOKIE_SECURE=False` + `CSRF_COOKIE_SECURE=False`) to allow cookie flow over the internal HTTP leg. A TESTCLOUD accommodation of the no-DNS / no-FQDN-cert posture.
+
+**Trade-off / Roosevelt:** disabling Secure cookies is acceptable only because the proxy leg is internal and the cloud has no public DNS / FQDN-valid cert. The Roosevelt root-fix is cloud DNS + FQDN-valid certs end-to-end (which also fixes gss and the nginx proxy_ssl_name handling); then this override is removed. Self-signed-client-TLS approaches are NOT part of v1.
+
+**Related:** DOCFIX-030 (phase-03 Step 3.3), D-043 (distinct -- auto-resume).
+
+## D-045: Charm-rendered haproxy config must be confirmed LOADED, not just rendered (DOCFIX-031)
+
+**Status:** Adopted + APPLIED 2026-06-11; re-verified 2026-06-16 post phase-05.
+
+**Decision:** after the vault/TLS cert cascade settles, confirm every unit's haproxy is actually checking its backends over the freshly-rendered SSL config -- by a functional probe of haproxy's OWN backend state (admin-socket `show stat`, grep `,DOWN,`), NOT by `juju status`. Reload (not restart) any unit whose running haproxy predates the `check-ssl` re-render.
+
+**Root cause (confirmed, not refined to a check-config defect):** nova-cloud-controller haproxy was not reloaded after the cert cascade, so its health checks ran plaintext against the now-SSL backend port and marked the nova-api backends DOWN -- while juju stayed active/idle (juju is BLIND to per-backend haproxy state). An 8s wire capture showed the checks switch to TLS after reload; both backends returned UP/L7OK/200. The reload is a real fix, not a band-aid.
+
+**Status note:** the surfaced symptom was nova-api EOF / 503 behind a green juju. phase-03 Step 3.1 now gates a zero-DOWN sweep cloud-wide.
+
+**Related:** DOCFIX-031 (phase-03 Step 3.1; appendix-A), D-046 / D-047 (the separate magnum-keystone incident).
+
+## D-046: Magnum trustee-domain setup is a REQUIRED, asserted post-deploy step
+
+**Status:** ACCEPTED 2026-06-17. Recorded in design-decisions-addendum-20260617.md as "D-044"; renumbered to D-046 here per the 06-17 reconciliation (D-044/D-045 were taken by the Horizon/haproxy decisions). Matches the rootcause doc + phase-08 handoff prompt.
+
+**Context:** all `openstack coe ...` ops returned 403 ("Keystone client authentication failed"); magnum-api.log showed keystoneauth1 401 on every request since 2026-06-16. Root cause: the keystone domain `magnum` and user `magnum_domain_admin` that magnum.conf `[trust]` references did not exist. `magnum/common/policy.py:130` resolves `trustee_domain_id` on EVERY policy-enforced request (driver-agnostic), authenticating as the trustee domain admin; with the domain/user absent that is a hard 401 -> every coe op 403.
+
+**Cause:** the magnum charm action `domain-setup` is MANUAL, not automatic; magnum reports active / "Unit is ready" regardless of whether it has run. The 2026-06-11 teardown/redeploy rebuilt keystone with fresh domains but the runbook did not re-run `domain-setup`.
+
+**Decision:** `domain-setup` is a REQUIRED, ASSERTED post-deploy step on every (re)deploy, after the magnum + identity-service relation is up and BEFORE magnum is declared functional / before phase-08: (1) `juju run magnum/leader domain-setup`; (2) assert `openstack domain show magnum` and `openstack user show magnum_domain_admin --domain magnum` both succeed; (3) gate `openstack coe service list` (must return the conductor row, no 403). magnum's active/ready status MUST NOT be treated as evidence the trustee domain exists.
+
+**Roosevelt:** carry as an explicit runbook step + assertion; consider upstreaming a charm change so domain-setup runs automatically, or so the charm surfaces "trustee domain not set up" instead of reporting ready.
+
+**Related:** D-047 (the v2.0 render bug found in the same incident, but NOT the cause), D-031 / D-037 (magnum surface).
+
+## D-047: keystone auth_version v2.0 charm-render bug -- keep the v3 drop-in
+
+**Status:** ACCEPTED (keep) 2026-06-17. addendum "D-045" -> D-047 per the 06-17 reconciliation.
+
+**Context:** the magnum charm template renders `auth_version = v2.0` due to a type bug (the keystone interface delivers `api_version` JSON-decoded as int 3; the template does a strict string compare `3 == "3"` -> False -> v2.0). Full analysis in incident-magnum-keystone-v2-rootcause-20260617.md.
+
+**Finding:** the v2.0 render was NOT the cause of the coe 403 (that was D-046). On this deployment v2.0 is cosmetic -- magnum's `domain_admin_auth` rewrites v2.0 -> v3, v3 is discovered from the unversioned `auth_url`, and incoming token validation worked throughout.
+
+**Decision (Jesse, 2026-06-17):** KEEP the magnum.conf.d v3 drop-in. v2.0 is the provably wrong value for Caracal (which does not serve v2.0); the drop-in forces v3 via the same config-dir mechanism as the D-037 conductor graft (no charm-file drift, survives re-render). Architectural correctness takes precedence over minimizing the delta, even though the drop-in did not unblock coe.
+
+**As-built:** `/etc/magnum/magnum.conf.d/50-keystone-v3-override.conf` (auth_version=v3 + www_authenticate_uri/auth_url v3 in `[keystone_authtoken]` and `[keystone_auth]`); `/etc/default/magnum-api` DAEMON_ARGS adds `--config-dir /etc/magnum/magnum.conf.d` (mirrors D-037 for the standalone magnum-api).
+
+**Roosevelt:** carry the v3 fix as a drop-in/overlay; upstream the template type bug (and the separate identity-service departed-hook IndexError crash documented in the rootcause doc).
+
+**Related:** D-046, D-037.
+
+## D-048: Stage-and-verify is the canonical image-seed method (supersedes web-download)
+
+**Status:** Adopted 2026-06-17 (operator-approved). Supersedes the 2026-06-16 "web-download canonical" ruling.
+
+**Decision:** seed ALL glance images (octavia amphora base, the noble mgmt image, the workload kube image) by STAGE-AND-VERIFY: download to `$HOME` (snap-readable; NOT /tmp), verify the file against the published checksum (azimuth-images manifest sha512 for kube images; ubuntu cloud-images SHA256SUMS for noble), then `openstack image create --file <path> [--import]`. Web-download is retained as a TESTED ALTERNATIVE only (appendix-A).
+
+**Rationale:** (1) FINDING-3 -- glance's web-download plugin fetches with urllib (UA `Python-urllib/3.x`) and the azimuth CDN 403s that UA, so web-download is UNUSABLE for kube images (202-accept, then stuck `queued`); (2) web-download cannot checksum-verify the fetched file (the CDN redirect strips the digest) -- weaker provenance; (3) stage-and-verify is one provenance-verified path cloud-wide -- less Roosevelt delta. CORRECTION-1: a plain `--file` PUT stores qcow2 (boots fine); `--import` runs glance image-conversion -> raw (Ceph fast-clone alignment).
+
+**Roosevelt:** unify on stage-and-verify; the longer-term target remains gss-from-a-controlled-mirror once cloud DNS + FQDN certs land.
+
+**Related:** D-021 (amphora pipeline), FINDING-3 (appendix-A image-seeding), phase-05 / 06 / 08.
+
+## D-049: Workload kube image bumped v1.32.13 (EOL) -> v1.34.8 (D1)
+
+**Status:** Adopted 2026-06-17. Procedure target; re-validation on v1.34.8 follows the stage-and-verify seed.
+
+**Decision:** the workload-cluster kube image moves from the EOL ubuntu-jammy-kube-v1.32.13 to ubuntu-jammy-kube-v1.34.8 (azimuth-images 0.28.0, build 260518-1604; sha512 7efde485...760bdb3), and the cluster template is renamed `capi-k8s-v1-32` -> `capi-k8s-v1-34`. v1.34.8 is mature with good runway and within CAPI v1.13.2 support. The management cluster's OWN k8s stays at v1.32.13 (k8s-snap 1.32-classic) -- this bump is the workload image only.
+
+**Note:** the 2026-06-09 D-011 acceptance ran on v1.32.13; the v1.34.8 image is seeded via D-048 stage-and-verify, and D-011 re-validation on v1.34.8 is the pending acceptance item. The template now pins `--network-driver calico` (DOCFIX-032).
+
+**Related:** D-031 / D-034 (CAPI surface), D-048 (seed), DOCFIX-032 (CNI pin), phase-08.
+
+## D-042 -- AMENDMENT (2026-06-17): FINDING-5 -- rescope to "<= 1.3.0" (HEALTHY on 1.4.0)
+
+The D-042 cosmetic `health_status = UNHEALTHY` false-negative is a property of driver builds <= 1.3.0 (the v1beta2 contract-ref mismatch: those builds read `apiVersion` off the infrastructureRef, which CAPI v1.13's v1beta2 contract no longer carries). The RELEASED 1.4.0 driver carries the `api_resources` override and reports `health_status = HEALTHY` against the CAPI v1.13.2 / CAPO v0.14.4 stack (confirmed this rebuild). FINDING-5: D-042 is therefore CLOSED for v1 -- the UNHEALTHY caveat applies only to the historical <=1.3.0 holding state, NOT to the as-built 1.4.0 pin. Auto-heal is still NOT wired to health_status (CAPI MachineHealthCheck heals independently).
+
+## D-050: PROPOSED / OPEN -- keystone `use-policyd-override=true` with no policy zip (FINDING-1)
+
+**Status:** PROPOSED / OPEN (recorded 2026-06-17; no action taken).
+
+**Question:** keystone is configured with `use-policyd-override=true` but no policy override zip is supplied. This is currently a no-op (no custom policy applied), but the flag advertises an override capability that does not exist -- a latent footgun (a future operator may assume policy is being enforced, or a stray zip could silently change authz).
+
+**Options (unresolved):** (a) set `use-policyd-override=false` for v1 (the override is unused) and revisit when a real policy is needed; (b) keep true and supply an explicit, reviewed policy zip; (c) leave as-is and document the no-op. No decision made -- recorded as an open point to rule on (cf. D-043, also pending).
+
+**Related:** D-029 (Keystone SSO deferral), FINDING-1.
diff --git a/docs/netbox-vip-queue.md b/docs/netbox-vip-queue.md
index 04e512f..9ee1c4e 100644
--- a/docs/netbox-vip-queue.md
+++ b/docs/netbox-vip-queue.md
Binary files differ
diff --git a/runbooks/README.md b/runbooks/README.md
index 89f4cd9..7e595e6 100644
--- a/runbooks/README.md
+++ b/runbooks/README.md
@@ -30,7 +30,7 @@
 | 04 | phase-04-network-carve.md               | Provider external network + IPAM reference           |                       |
 | 05 | phase-05-octavia-enablement.md          | Enable Octavia (amphora)                             | D-021                 |
 | 06 | phase-06-incloud-mgmt-cluster.md        | In-cloud single-homed CAPI management cluster        | D-035                 |
-| 07 | phase-07-conductor-graft.md             | Graft the magnum-capi-helm driver onto the conductor | D-031 / D-037 / D-042 |
+| 07 | phase-07-conductor-graft.md             | Trustee domain-setup + graft the magnum-capi-helm driver | D-031 / D-037 / D-042 / D-046 / D-047 |
 | 08 | phase-08-workload-cluster-acceptance.md | End-to-end tenant cluster + acceptance bar           | D-011 (amended D-019) |
 
 ## Appendices
diff --git a/runbooks/appendix-A-troubleshooting.md b/runbooks/appendix-A-troubleshooting.md
index 6638d6c..2d7ef3a 100644
--- a/runbooks/appendix-A-troubleshooting.md
+++ b/runbooks/appendix-A-troubleshooting.md
@@ -114,12 +114,25 @@
   "string present in the unit file" as "the daemon received the flag." Gate on the
   assembled/launched cmdline (`show-args`, then `ps` on the live process).
 
+### DOCFIX-035 -- helm not on the conductor's PATH  (phase-07)
+- Symptom: the magnum-capi-helm driver fails shelling out to `helm` (cluster create errors on a
+  helm invocation), yet `command -v helm` in an interactive `juju ssh magnum/0` shell finds it.
+- Cause: the conductor runs via an LSB init script (systemd `systemd-start`) with the restricted
+  init PATH (e.g. `/usr/sbin:/usr/bin:/sbin:/bin`), which EXCLUDES `/usr/local/bin` -- where a
+  get.helm.sh tarball install lands. An interactive login shell has `/usr/local/bin` on PATH, so
+  it masks the problem (the classic green-in-the-shell, broken-in-the-daemon trap).
+- Fix: install the binary to `/usr/local/bin/helm` AND symlink `/usr/bin/helm` to it (`/usr/bin`
+  IS on the restricted PATH). Checksum-verify the tarball (sha256 vs get.helm.sh `.sha256sum`)
+  before install. VERIFY against the restricted PATH, not a login shell:
+  `env -i PATH=/usr/sbin:/usr/bin:/sbin:/bin sh -c 'command -v helm && helm version --short'`
+  must print `/usr/bin/helm` (phase-07 7.4).
+
 ### L-P6-3 -- k8s version comes from the IMAGE, not a template label  (phase-08)
 - Symptom: cluster create fails in the driver before provisioning.
 - Cause: the magnum-capi-helm driver reads `kube_version` from the Glance image
   properties and routes on `os_distro`; it does NOT take k8s version from a template
   label.
-- Fix: the workload image (e.g. `ubuntu-jammy-kube-v1.32.13`) MUST carry
-  `kube_version` (e.g. v1.32.13) and `os_distro=ubuntu`. Verify before create (phase-08 8.0).
+- Fix: the workload image (e.g. `ubuntu-jammy-kube-v1.34.8`) MUST carry
+  `kube_version` (e.g. v1.34.8) and `os_distro=ubuntu`. Verify before create (phase-08 8.0).
 
 ================================================================================
@@ -164,6 +177,11 @@
   so the Cluster auto-finalizes and deletes. Then manually clean orphaned neutron
   resources in dependency order: router remove subnet -> router unset external-gateway
   -> router delete -> subnet delete -> network delete -> security group delete.
+- Name-guard (FINDING-4): NEVER patch/delete a CR by an inferred name. The OpenStackCluster is
+  named `<cluster>-<CAPI-suffix>` where the suffix is random per create (NOT the Magnum cluster
+  name). LIST first -- `kubectl -n <magnum-ns> get openstackcluster` -- and operate on the EXACT
+  name returned. The magnum-ns is `magnum-<project-id>` (resolve the project id; never hardcode).
+  A wrong-name patch silently no-ops and the delete stays wedged.
 
 ### LB-failover -- LB stuck provisioning_status=ERROR after a host event  (phase-08)
 - Symptom: the kube-api Octavia LB shows `operating_status ONLINE` but
@@ -181,13 +199,15 @@
   cluster. If the mgmt cluster is down (see D-041), the taint persists.
 - Fix: restore the mgmt cluster API; CAPI then removes the taint and addons schedule.
 
-### CNI-label -- network_driver vs the chart-default Calico (1.4.0)  (phase-08)
-- Note: under the as-FIRST-built driver 1.3.0 the legacy Magnum `network_driver` label
-  was IGNORED and the capi-helm `openstack-cluster` chart's default CNI (Calico) always
-  ran. Under the RELEASED 1.4.0 driver the `network_driver` template option IS honored
-  (it maps through to the chart). To keep the as-built CNI (Calico), the `capi-k8s-v1-32`
-  template OMITS `--network-driver` (phase-08); set `flannel` there only to intentionally
-  switch the CNI. (Mgmt cluster CNI is separately Cilium, via k8s-snap.)
+### CNI-label / DOCFIX-032 -- network_driver under driver 1.4.0; pin calico explicitly  (phase-08)
+- Note: under the as-FIRST-built driver 1.3.0 the legacy Magnum `network_driver` label was
+  IGNORED and the capi-helm `openstack-cluster` chart's default CNI (Calico) always ran. Under
+  the RELEASED 1.4.0 driver the `network_driver` template option IS honored (it maps through to
+  the chart `network_driver`).
+- DOCFIX-032: pin `--network-driver calico` EXPLICITLY on the `capi-k8s-v1-34` template
+  (phase-08) rather than relying on the default staying Calico. Chart 0.25.1 ships ONLY Calico
+  (flannel is not packaged), so `flannel` there would fail to converge -- do not set it. (Mgmt
+  cluster CNI is separately Cilium, via k8s-snap.)
 
 ================================================================================
 ## Hyperconverged host / mgmt-VM resilience
@@ -249,6 +269,16 @@
   allocation_pool (phase-00 Phase 4). A reserved range stops future auto-assign onto
   a configured VIP. Negative test post-deploy: no service vip == any unit primary.
 
+### DEVIATION-2 -- raise a KVM host's RAM, then MAAS-recommission to Ready  (phase-00)
+- Context (2026-06-11): the openstack0-3 KVM guests were bumped 16384 -> 32768 MiB on the 196 GB
+  hypervisor to relieve memory pressure. Pattern: with the guest SHUT OFF (and after the OSD
+  wipe), `virsh setmaxmem <dom> 32G --config` then `virsh setmem <dom> 32G --config`; boot; then
+  MAAS RECOMMISSION the node so MAAS re-reads hardware and lands it back at Ready at the new size
+  (4x Ready at 32768 in ~3 min). Do the maxmem change while shut off -- a live setmaxmem is rejected.
+- D-040 `reserved-host-memory 8192` is RETAINED (correctness floor, independent of host size).
+  Re-measure the per-host container/service footprint against the 32 GiB envelope before the
+  Roosevelt node-role split -- 16 GiB-era pressure numbers do not map 1:1.
+
 ================================================================================
 ## Deploy-time (phase-01)
 ================================================================================
@@ -330,6 +360,39 @@
   revs instead of re-introducing the 401-by-hardcode.
 
 ================================================================================
+## Core services: HAProxy + reverse-proxy (phase-03)
+================================================================================
+
+### D-045 / DOCFIX-031 -- juju "active/idle" but an haproxy backend is DOWN  (phase-03)
+- Symptom: `juju status` is all active/idle, yet a service VIP intermittently 503s or a unit's
+  API is unreachable. juju health is BLIND to per-backend haproxy state.
+- Cause: a charm-rendered haproxy backend can be silently DOWN without the charm going non-idle
+  -- e.g. (D-045) haproxy was NOT reloaded after the TLS/cert cascade, so its health checks ran
+  plaintext against an SSL backend and marked it DOWN. juju-green is necessary, not sufficient.
+- Fix: sweep haproxy's OWN verdict on every unit via its admin socket, then remediate+reload.
+  Per unit, read `/var/run/haproxy/admin.sock` (`show stat`) and `grep ',DOWN,'` (excluding the
+  FRONTEND/BACKEND summary rows). For any flagged unit: `sudo haproxy -c -f
+  /etc/haproxy/haproxy.cfg` (must say valid) then `sudo systemctl reload haproxy` (graceful
+  master-worker; reload, not restart). Phase-03 3.1 gates on a zero-DOWN sweep cloud-wide --
+  it closes the juju-green-but-backend-DOWN hole.
+
+### nginx-reverse-proxy -- jumphost -> internal-VIP proxy gotchas  (phase-03)
+- Context: the jumphost reaches internal-only dashboards/APIs via an nginx reverse proxy
+  (phase-03 3.3). Four traps, each with the as-built fix:
+- reload race: a `systemctl reload nginx` right after editing the vhost can be served by a
+  still-draining old worker (a curl ~2s later hits stale behavior; the co-hosted MAAS proxy
+  blips too). `nginx -t` FIRST; prefer `restart` for a definitive cutover when the listen/upstream
+  set changed, reload only for content-equivalent edits.
+- proxy_ssl_name / SNI: the upstream presents a DNS-SAN cert (a juju-internal name, e.g.
+  `juju-ffe3b8-2-lxd-2`); set `proxy_ssl_name` to that SAN, `proxy_ssl_verify on`, and the vault
+  CA in `proxy_ssl_trusted_certificate`, or verification fails on the IP-only connect.
+- sed no-op: a `sed -i` that does not match silently changes nothing and the proxy keeps the old
+  behavior -- assert the post-edit content, do not trust sed's exit code.
+- scheme-mismatch redirect loop: the backend issues `https://` Location headers while the proxy
+  listens `http`; without `proxy_redirect https:// http://` (or a matching listen scheme) the
+  browser loops. Match the scheme end-to-end or rewrite the redirect.
+
+================================================================================
 ## Octavia enablement (phase-05)
 ================================================================================
 
@@ -357,6 +420,36 @@
   The amphora pipeline gate asserts the two are equal before building (phase-05 5.2).
 
 ================================================================================
+## Image seeding (phase-05/06/08)
+================================================================================
+
+### FINDING-3 -- azimuth CDN 403s glance web-download; stage-and-verify is canonical  (phase-06, phase-08)
+- Symptom: a glance web-download import (`--import-method web-download`) 202-accepts, then the
+  image hangs in `queued` forever and never reaches `active`.
+- Cause: glance's web-download plugin fetches with urllib (User-Agent `Python-urllib/3.x`); the
+  azimuth-images CDN (`azimuth-images.stackhpc.cloud`) returns HTTP 403 to that UA. A curl/HEAD
+  probe with a different UA passes -- which is why an earlier probe false-passed while the real
+  import failed.
+- Fix (canonical): STAGE-AND-VERIFY. curl the qcow2 to `$HOME` (snap-readable, NOT /tmp -- L7;
+  curl's UA is not blocked), verify the checksum against the published manifest (azimuth-images
+  manifest.json -- sha512 for kube images; the ubuntu cloud-images SHA256SUMS for noble), then
+  `openstack image create --file --import` (the openstack snap's `--import` == glance-direct;
+  image-conversion lands it `raw`). CORRECTION-1: a plain `--file` PUT (no `--import`) stores
+  qcow2 -- fine for boot, but `--import` gives the raw Ceph fast-clone alignment.
+- Clear a stuck record before retry: gated `openstack image delete <id>` on the `queued` remnant
+  (verify the EXACT id first -- FINDING-4 name-guard discipline).
+- Roosevelt: unify ALL image seeding (amphora base, noble mgmt, kube) on stage-and-verify for one
+  provenance-verified path cloud-wide.
+
+### web-download -- tested ALTERNATIVE to stage-and-verify  (phase-05/06/08)
+- Web-download (`openstack image create --import --import-method web-download --uri <url>`) is
+  retained as a tested ALTERNATIVE, not the canonical path (superseded 2026-06-17; see
+  design-decisions). Caveats: (1) it cannot checksum-verify the fetched file against a published
+  digest (the CDN redirect strips it) -- weaker provenance; (2) it 403s on the azimuth CDN
+  (FINDING-3), so it is unusable for kube images; (3) for ubuntu cloud-images it works on the
+  hardened bundle (the 2026-06-08 403 was transient/pre-hardening). Use only as an expedient.
+
+================================================================================
 ## Notes
 ================================================================================
 - This index covers phases 00-08. It grows the same way for any future phase: keyed by
diff --git a/runbooks/appendix-B-asbuilt-version-lock.md b/runbooks/appendix-B-asbuilt-version-lock.md
index add0350..a434fad 100644
--- a/runbooks/appendix-B-asbuilt-version-lock.md
+++ b/runbooks/appendix-B-asbuilt-version-lock.md
@@ -1,7 +1,8 @@
 # Appendix B -- As-Built Version / Channel / Revision Lock
 
 Source: `juju export-bundle` (model `openstack`) + the in-cloud mgmt-cluster
-captures, 2026-06-09. ASCII-only.
+captures, 2026-06-09; B.2/B.3 workload-image, template, driver, and helm facts refreshed
+in the 2026-06-17 sweep (D1 + DOCFIX-032 + DOCFIX-035). ASCII-only.
 
 POLICY (D-002 + consolidation prompt): the bundle PINS CHANNELS, not revisions.
 This appendix records the as-built REVISIONS as the known-good baseline. A fresh
@@ -80,7 +81,7 @@
 
 ## B.2 In-cloud management cluster + CAPI constellation (D-034 / D-035 / D-037)
 
-Node `capi-mgmt-v2` (FIP 10.12.7.40, internal 10.20.0.45), single-node, non-CAPI-managed:
+Node `capi-mgmt-v2` (FIP + internal IP are per-rebuild -- this rebuild FIP 10.12.5.103 / internal 10.20.0.107; 2026-06-09: 10.12.7.40 / 10.20.0.45), single-node, non-CAPI-managed:
 - k8s-snap: channel `1.32-classic/stable`, rev 5326, k8s v1.32.13 (classic confinement)
 - CAPI core + kubeadm-bootstrap + kubeadm-control-plane: v1.13.2
 - CAPO (infra provider): v0.14.4
@@ -89,7 +90,10 @@
 - CAAPH (cluster-api-addon-provider): chart 0.12.0 (`helm --version`, from dependencies.json; deploys image 62f7c00)
 - cluster-api-janitor-openstack: chart 0.11.0 (`helm --version`, from dependencies.json; deploys image d527847)
 - cluster-autoscaler (per-workload): v1.30.4
-- Mgmt CNI: Cilium 1.17.12-ck0. Workload-cluster CNI: Calico (chart default).
+- Mgmt CNI: Cilium 1.17.12-ck0. Workload-cluster CNI: Calico (DOCFIX-032: pinned explicitly, not relied-on default).
+- helm: v3.17.3 -- mgmt-VM tooling (phase-06 6.6a) AND the magnum conductor (phase-07 7.4),
+  installed to /usr/local/bin + a /usr/bin/helm symlink so the conductor's restricted init PATH
+  resolves it (DOCFIX-035).
 
 VERSION-SOURCE RULE (D-034): every provider ref above is read live from the chosen
 `capi-helm-charts` release tag's `dependencies.json` via `jq`. DO NOT hardcode
@@ -97,8 +101,8 @@
 
 ## B.3 Magnum driver + chart (Layer B -- outside Juju channels, manually pinned)
 
-- magnum-capi-helm driver: 1.3.0 was the AS-FIRST-BUILT pin; the v1 TARGET is the
-  RELEASED `magnum-capi-helm==1.4.0` (D-042). 1.3.0 is contract-INCOHERENT with the
+- magnum-capi-helm driver: 1.3.0 was the AS-FIRST-BUILT pin; the v1 AS-BUILT pin is the
+  RELEASED `magnum-capi-helm==1.4.0` (D-042; installed, health HEALTHY). 1.3.0 is contract-INCOHERENT with the
   Layer-A core -- it reads `apiVersion` off the infrastructureRef, which CAPI v1.13
   (v1beta2 contract) no longer carries, so the driver's `infrastructure` health GET
   returns "not found" (cosmetic only -- the create path is unaffected; the chart
@@ -120,10 +124,15 @@
 - chart repo: https://azimuth-cloud.github.io/capi-helm-charts
 - chart name: openstack-cluster ; default_helm_chart_version: 0.25.1
 - conf.d drop-in: /etc/magnum/magnum.conf.d/00-capi-helm.conf (D-037)
-- note (CNI): the `capi-k8s-v1-32` template OMITS the Magnum `network_driver` field, so
-  the workload cluster gets the chart-default Calico (the as-built CNI). Whether 1.4.0
-  honors `network_driver` is unverified and not relied on -- omitting the field is what
-  guarantees Calico (appendix-A: CNI-label; phase-08).
+- workload kube image (D1): ubuntu-jammy-kube-v1.34.8 (azimuth-images 0.28.0, build 260518-1604);
+  kube_version v1.34.8, os_distro ubuntu; sha512 7efde4857c9f9da045a98d71def30e229b3d7fffd8a5680e8aee0c5a8b13ba73fca3cf758a927230a1fbe3c451d8d21cfaeded96091e2a4f313c6a404760bdb3
+  (manifest.json). Seeded by STAGE-AND-VERIFY from the azimuth CDN (FINDING-3 -- the CDN
+  403s glance web-download's urllib UA). Bumped from EOL v1.32.13; v1.34.8 is within CAPI v1.13.2 support.
+- workload template (D1): capi-k8s-v1-34 (was capi-k8s-v1-32), --network-driver calico pinned (DOCFIX-032).
+- note (CNI, DOCFIX-032): the `capi-k8s-v1-34` template PINS `--network-driver calico`
+  explicitly. Under driver 1.4.0 `network_driver` IS honored (maps to the chart); chart 0.25.1
+  ships only Calico (flannel not packaged), so the explicit pin documents intent and does not
+  rely on the default staying Calico (appendix-A: CNI-label / DOCFIX-032; phase-08).
 - v1 END STATE: 1.4.0 installed and `health_status = HEALTHY` (D-011). 1.3.0 is only a
   TEMPORARY rollback/holding state (phase-07 Rollback), never a v1 completion. Either
   way, do NOT wire magnum auto-heal to health_status (CAPI MachineHealthCheck handles
diff --git a/runbooks/ops-capi-recovery.md b/runbooks/ops-capi-recovery.md
index 01e6886..06eb862 100644
--- a/runbooks/ops-capi-recovery.md
+++ b/runbooks/ops-capi-recovery.md
@@ -11,8 +11,12 @@
 Magnum health). Everything upstream stays red until the layer below is green.
 
 Scope-hygiene preambles are the canonical ones from the 2026-06-09 as-executed
-log. ENV literals: project capi-mgmt 674171fd28d446d3a37073b6a761e910; mgmt FIP
-10.12.7.40; kube-api LB 0f968008-...; regenerate per site on rebuild.
+log. ENV values are PER-REBUILD and resolved at run time in the blocks below:
+project capi-mgmt id via `openstack project show capi-mgmt --domain capi`; mgmt FIP
+via `~/capi-mgmt-net.env` as `$MGMT_FIP` (the phase-06 single source); the
+magnum-<id> driver namespace via `kubectl get ns`; the kube-api LB by id at failover.
+Never reuse a prior rebuild's literals (2026-06-09 example, do NOT paste: project
+674171fd..., FIP 10.12.7.40, LB 0f968008-...).
 
 ---
 
@@ -38,7 +42,8 @@
 # capi-mgmt scope
 source ~/admin-openrc
 unset OS_PROJECT_ID OS_TENANT_ID OS_TENANT_NAME OS_PROJECT_DOMAIN_ID
-export OS_PROJECT_ID=674171fd28d446d3a37073b6a761e910
+CAPI_PID=$(openstack project show capi-mgmt --domain capi -f value -c id)  # per-rebuild; resolve, never hardcode
+export OS_PROJECT_ID="$CAPI_PID"
 unset OS_PROJECT_NAME OS_PROJECT_DOMAIN_NAME OS_TENANT_NAME OS_TENANT_ID
 openstack server stop capi-mgmt-v2
 # NOTE: Nova ACPI stop does NOT produce a clean guest shutdown on this VM
@@ -61,7 +66,8 @@
 ( {
   source ~/admin-openrc
   unset OS_PROJECT_ID OS_TENANT_ID OS_TENANT_NAME OS_PROJECT_DOMAIN_ID
-  export OS_PROJECT_ID=674171fd28d446d3a37073b6a761e910
+  CAPI_PID=$(openstack project show capi-mgmt --domain capi -f value -c id)  # per-rebuild; resolve, never hardcode
+  export OS_PROJECT_ID="$CAPI_PID"
   unset OS_PROJECT_NAME OS_PROJECT_DOMAIN_NAME OS_TENANT_NAME OS_TENANT_ID
   openstack server start capi-mgmt-v2
   for i in $(seq 1 20); do
@@ -70,9 +76,10 @@
     [ "$ST" = ACTIVE ] && break
     sleep 10
   done
+  source ~/capi-mgmt-net.env   # MGMT_FIP (per-rebuild single source from phase-06; never hardcode)
   echo "=== TCP probe loop: FIP :22 (sshd lags ACTIVE by ~3 min) ==="
   for i in $(seq 1 18); do
-    timeout 5 bash -c 'exec 3<>/dev/tcp/10.12.7.40/22' 2>/dev/null \
+    timeout 5 bash -c "exec 3<>/dev/tcp/$MGMT_FIP/22" 2>/dev/null \
       && { echo "[$i] SSH-PORT-OK"; break; } || echo "[$i] not yet"
     sleep 10
   done
@@ -90,10 +97,11 @@
 BEGIN runbook block: mgmt k8s readiness poll (cold-start aware)
 ------------------------------------------------------------------------
 ( {
+  source ~/capi-mgmt-net.env   # MGMT_FIP (per-rebuild single source from phase-06; never hardcode)
   for i in $(seq 1 15); do
     echo "--- [$i] $(date -u +%H:%M:%S) ---"
     ssh -i ~/.ssh/id_ed25519 -o BatchMode=yes -o StrictHostKeyChecking=no \
-        -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@10.12.7.40 \
+        -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@"$MGMT_FIP" \
         'uptime; sudo k8s status 2>&1 </dev/null | head -4'
     sleep 120
   done
@@ -117,17 +125,18 @@
   export KUBECONFIG="$HOME/capi-mgmt.kubeconfig"
   kubectl get nodes -o wide
   kubectl get pods -A | egrep 'capi-|capo-|cert-manager|orc-system|janitor|addon'
-  NS=magnum-674171fd28d446d3a37073b6a761e910
+  NS=$(kubectl get ns -o name | cut -d/ -f2 | grep "^magnum-" | head -1)  # capi-mgmt driver ns; resolve, never hardcode
   kubectl -n "$NS" get cluster,openstackcluster,machines
 } )
-# kubeconfig missing? Re-emit (phase-06 Step 6.5, verbatim):
+# kubeconfig missing? Re-emit (phase-06 Step 6.5; source ~/capi-mgmt-net.env for $MGMT_FIP first):
 #   ssh -i ~/.ssh/id_ed25519 -o BatchMode=yes -o StrictHostKeyChecking=no \
-#       -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@10.12.7.40 \
-#       "sudo k8s config server=https://10.12.7.40:6443 </dev/null" > ~/capi-mgmt.kubeconfig
+#       -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@"$MGMT_FIP" \
+#       "sudo k8s config server=https://$MGMT_FIP:6443 </dev/null" > ~/capi-mgmt.kubeconfig
 ( {
   source ~/admin-openrc
   unset OS_PROJECT_ID OS_TENANT_ID OS_TENANT_NAME OS_PROJECT_DOMAIN_ID
-  export OS_PROJECT_ID=674171fd28d446d3a37073b6a761e910
+  CAPI_PID=$(openstack project show capi-mgmt --domain capi -f value -c id)  # per-rebuild; resolve, never hardcode
+  export OS_PROJECT_ID="$CAPI_PID"
   unset OS_PROJECT_NAME OS_PROJECT_DOMAIN_NAME OS_TENANT_NAME OS_TENANT_ID
   openstack loadbalancer list -f yaml
 } )
@@ -222,11 +231,12 @@
   unset OS_PROJECT_ID OS_TENANT_ID OS_TENANT_NAME
   openstack loadbalancer amphora list -f yaml          # all ALLOCATED
   export KUBECONFIG="$HOME/capi-mgmt.kubeconfig"
-  NS=magnum-674171fd28d446d3a37073b6a761e910
+  NS=$(kubectl get ns -o name | cut -d/ -f2 | grep "^magnum-" | head -1)  # capi-mgmt driver ns; resolve, never hardcode
   kubectl -n "$NS" get cluster,openstackcluster        # Available=True (allow ~10 min post-failover for CAPO resync)
   source ~/admin-openrc
   unset OS_PROJECT_ID OS_TENANT_ID OS_TENANT_NAME OS_PROJECT_DOMAIN_ID
-  export OS_PROJECT_ID=674171fd28d446d3a37073b6a761e910
+  CAPI_PID=$(openstack project show capi-mgmt --domain capi -f value -c id)  # per-rebuild; resolve, never hardcode
+  export OS_PROJECT_ID="$CAPI_PID"
   unset OS_PROJECT_NAME OS_PROJECT_DOMAIN_NAME OS_TENANT_NAME OS_TENANT_ID
   openstack coe cluster show capi-test-1 -f value -c health_status
   openstack coe cluster show capi-test-1 -f value -c health_status_reason
diff --git a/runbooks/phase-00-teardown-maas-reset.md b/runbooks/phase-00-teardown-maas-reset.md
index 94bf9ba..433376e 100644
--- a/runbooks/phase-00-teardown-maas-reset.md
+++ b/runbooks/phase-00-teardown-maas-reset.md
@@ -13,8 +13,9 @@
 libvirt/qemu-img), KI-P3-001.
 
 !!! DESTRUCTIVE. Phase 1 (destroy-model + release) and Phase 2 (OSD wipe) are
-    irreversible short of the KVM snapshots (the D-017 safety net). Each destructive
-    step is DISCRETE and individually gated -- do not batch.
+    irreversible. There is NO model-state rollback (DEVIATION-1): a KVM snapshot revert
+    cannot restore the destroyed Juju model -- the repo runbooks ARE the tested restore
+    path (D-017). Each destructive step is DISCRETE and individually gated -- do not batch.
 
 CAPI-MGMT NOTE: this teardown releases the FOUR openstack hosts only. The MAAS
 `capi-mgmt` VM is the RETIRED D-033 out-of-cloud node; the in-cloud `capi-mgmt-v2`
@@ -25,8 +26,11 @@
 ---
 
 ## Prerequisites
-- KVM snapshots of openstack0-3 exist (safety net). Authenticated juju session
-  (`juju whoami`). MAAS CLI logged in as profile `admin`.
+- (OPTIONAL) KVM snapshots of openstack0-3. NOTE (DEVIATION-1): snapshots do NOT give
+  model-state rollback -- destroy-model erases the Juju controller DB, so a disk revert
+  resurrects machines with no managing model + a stale MAAS view. The repo runbooks are
+  the restore path (D-017); snapshots are not required for this cycle.
+- Authenticated juju session (`juju whoami`). MAAS CLI logged in as profile `admin`.
 - Run from jumphost `vopenstack-jesse` (user `jessea123`, sudo; also the libvirt hypervisor).
 
 ## Constants and env-literals
@@ -47,7 +51,10 @@
 ```bash
 ( {
   echo "=== 0a. five network spaces (hard blocker if absent) ==="
-  juju spaces   # expect metal 10.12.8.0/22 | provider 10.12.4.0/22 | data 10.12.12.0/22 | storage 10.12.16.0/22 | replication 10.12.20.0/22
+  # DOCFIX-026: MAAS is authoritative for spaces (Juju imports them at add-model); use the
+  # model-independent query (same as Phase 5). Expect: metal 10.12.8.0/22 | provider 10.12.4.0/22
+  # | data 10.12.12.0/22 | storage 10.12.16.0/22 | replication 10.12.20.0/22 (lbaas + undefined also appear).
+  maas admin spaces read | jq -r '.[] | "\(.name)\t\([.subnets[]?.cidr] | join(", "))"'
 
   echo "=== 0b. VIP ipranges (note the front-loaded ones to KEEP + the stale .224-.254 to remove) ==="
   maas admin ipranges read \
@@ -69,7 +76,8 @@
   printf '%-46s state=%s owner=%s mode=%s\n' "$f" \
     "$(sudo virsh -c qemu:///system domstate "$host" 2>/dev/null)" \
     "$(sudo stat -c '%U:%G' "$f" 2>/dev/null)" "$(sudo stat -c '%a' "$f" 2>/dev/null)"
-done   # expect (AFTER Phase 1 release): 4 lines, state=shut off, owner=root:root, mode=600
+done   # expect (AFTER Phase 1 release): 4 lines, state=shut off, owner=root:root, mode=600.
+       # (Run PRE-teardown as a baseline: state=running, owner=libvirt-qemu:kvm -- correct live state.)
 ```
 
 ## Phase 1 -- Teardown (D-018)  DISCRETE / DESTRUCTIVE
@@ -206,7 +214,7 @@
 `# RUN: jumphost`
 ```bash
 ( {
-  juju spaces                                              # 5 spaces present
+  maas admin spaces read | jq -r '.[] | "\(.name)\t\([.subnets[]?.cidr] | join(", "))"'   # DOCFIX-026: 5 spaces (juju spaces FAILS here -- model gone post-teardown)
   maas admin machines read | jq -r '.[]|select(.hostname|test("^openstack[0-3]$"))|"\(.hostname)\t\(.status_name)"' | sort   # all Ready
   for SID in 4na83t qdbqd6 h8frng tmsafc; do echo "-- $SID --"
     maas admin interfaces read "$SID" | jq -r '.[]|select(.name|test("^enp(8|9|10)s0$"))|"  \(.name)\t\([.links[]?|{(.subnet.cidr):.ip_address}])"'
@@ -238,6 +246,17 @@
   openstack0 resolved dynamically (the block does not depend on these).
 - MAAS carve: front-loaded .2-.63 reservations created earlier and persistent; stale
   metal .224-.254 was iprange id=2 (deleted after confirmation).
+- DEVIATION-2 (2026-06-11): hypervisor 196 GB; openstack0-3 each 16384 -> 32768 MiB
+  (virsh setmaxmem/setmem --config while shut off, post-OSD-wipe), then MAAS recommission
+  with `skip_networking=1 skip_storage=1 testing_scripts=none` -- refreshes hardware
+  inventory WITHOUT losing interface links/storage layout (all 12 storage links preserved;
+  4x Ready at 32768 in ~3 min). D-040 reserved-host-memory 8192 retained (correctness floor,
+  not a function of total RAM). Per-host footprint for Roosevelt rebalancing is measured at
+  the 32 GiB envelope (16 GiB-era pressure numbers do not map 1:1). [recommission pattern -> appendix-A]
+- DEVIATION-3 (2026-06-11): as a side effect, destroy-model released Juju machine 4 (the
+  retired D-033 out-of-cloud capi-mgmt MAAS node). MAAS now shows capi-mgmt = Ready; the
+  Phase 1C loop did not re-release it because it targets only the four openstack system_ids.
+  The separate "Phase 7 teardown of old MAAS capi-mgmt node" queue item is thereby closed.
 
 ## Next
 phase-01 -- bundle deploy.
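Reviewer sketch (not part of the patch): the 0a and post-teardown space checks in this file print the `maas admin spaces read` output and leave the "expect five spaces" comparison to the reader. A minimal way to make that a hard gate is shown below, assuming the same `name<TAB>cidrs` line shape the jq filter emits. The function name `missing_spaces` is hypothetical, and it checks space NAMES only, not the CIDRs.

```shell
# Hypothetical helper (not in the repo): read "name<TAB>cidrs" lines on stdin
# and fail if any of the five required spaces is absent.
missing_spaces() {
  local names want missing=""
  names=$(cut -f1)                       # first tab-separated field = space name
  for want in metal provider data storage replication; do
    # -x: whole-line match, so "data" does not match e.g. "metadata"
    printf '%s\n' "$names" | grep -qx "$want" || missing="$missing $want"
  done
  if [ -n "$missing" ]; then
    echo "MISSING spaces:$missing" >&2
    return 1
  fi
  echo "SPACES-OK"
}

# Usage (jumphost), piping the runbook's own query into the gate:
#   maas admin spaces read \
#     | jq -r '.[] | "\(.name)\t\([.subnets[]?.cidr] | join(", "))"' \
#     | missing_spaces || echo "HARD BLOCKER: fix MAAS spaces before continuing"
```

Extra spaces such as `lbaas` or `undefined` pass through untouched; only an absent required name fails the gate.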
diff --git a/runbooks/phase-01-bundle-deploy.md b/runbooks/phase-01-bundle-deploy.md
index a6a28b9..8b745ba 100644
--- a/runbooks/phase-01-bundle-deploy.md
+++ b/runbooks/phase-01-bundle-deploy.md
@@ -74,11 +74,13 @@
 } )
 ```
 ```bash
-# CHECK 4b: OSD /dev/vdb blank (run on each host; sudo required -- appendix-A: R7)
+# CHECK 4b: OSD /dev/vdb blank (DOCFIX-027 -- LOCAL libvirt-host loop, NOT ssh: the four
+# hosts are Released/powered-off entering phase-01, and /var/lib/libvirt/images is a
+# hypervisor (jumphost) path that does not exist on the hosts. RUN: jumphost (libvirt host; sudo).
 for h in openstack0 openstack1 openstack2 openstack3; do
   echo "== $h =="
-  ssh jessea123@$h "sudo qemu-img info /var/lib/libvirt/images/${h}-1.qcow2 | grep -E 'virtual size|disk size'" </dev/null
-done   # expect virtual 512 GiB, disk ~KiB (sparse/blank)
+  sudo qemu-img info "/var/lib/libvirt/images/${h}-1.qcow2" | grep -E 'virtual size|disk size'
+done   # expect virtual 512 GiB, disk ~200 KiB (sparse/blank)
 ```
 GATE: VIPs 11/11/0; enp8s0 linked on all 4; subnet DNS as above; 4 nodes Ready; OSD blank.
 
@@ -120,6 +122,14 @@
 } )
 ```
 
+CONVERGENCE WATCH (ENHANCEMENT-1): keep two windows open for the whole deploy arc.
+- Window 1 (detail): `juju status -m openstack --watch 5s` (always explicit -m; on a 50-app
+  model the table exceeds one screen, so this is a slice -- optionally filter, e.g. `... 'mysql*' vault`).
+- Window 2 (signal): `scripts/deploy-watch.sh openstack 15` -- compact machine/unit state counts
+  + named error/blocked units. Health-at-a-glance: the error/blocked section stays EMPTY until
+  the expected late blocks (vault needs-init, octavia awaiting-configure). Neither window
+  descends into subordinates; neither replaces the phase gates.
+
 ## Step 1.4 -- DNS gate during deploy (as machines come up)
 `# RUN: jumphost`  Run when machine 0 reaches `started`, then per LXD unit as they
 appear (flag BEFORE the target; logic inside the remote quotes; no outer 2>/dev/null):
@@ -144,6 +154,12 @@
   * Waiting on vault certs (expected pre-init): ovn-central x3, ovn-chassis x3
     (incl nova-compute subordinates), ovn-chassis-octavia, neutron-api-plugin-ovn, barbican-vault.
   * octavia BLOCKED "Awaiting configure-resources" (D-021); gss unknown (pre-run).
+  * magnum/0 BLOCKED "Ports which should be open, but are not: 9501" -- pre-vault posture:
+    magnum-api is loopback-bound ([api] host not yet templated) and haproxy backends target
+    unit IPs. EXPECTED phase-01 end-state; self-resolves at the phase-02 cert rollout (apache2
+    takes *:9501). Confirmed self-resolving 2026-06-12 (FINDING-2); verify in the phase-02 post-init sweep.
+  * keystone/0 "PO (broken): Unit is ready" -- expected while use-policyd-override=true with
+    no policy zip attached (FINDING-1); keystone runs the DEFAULT policy. No mutation this arc.
 - Section-G NIC payoff confirmed (no subset/binding errors): ceph-mon -> storage 10.12.16.x;
   octavia -> data 10.12.12.1; nova-compute -> data 10.12.12.4x; vault -> metal 10.12.8.x.
 - Proceed to phase-02 (vault init).
@@ -154,6 +170,24 @@
 - Pre-deploy verify: VIPs 11/11/0; enp8s0 -> 10.12.12.40-43 (all 4); subnet DNS as above; nodes Ready; OSD blank.
 - Settled: zero errors; mysql /0 R/W (10.12.8.173), /1 (.179) /2 (.185) R/O; vault blocked needs-init.
 
+## Balance / stability observations (Roosevelt rebalancing inputs -- post-deploy item 6)
+- Quorum triads (mysql-innodb-cluster, ovn-central, ceph-mon) all on machines 0/1/2: correct
+  anti-affinity; machine 3 loss breaks no quorum; any single loss of 0/1/2 leaves a 2-of-3 majority.
+- Machine 0 = no-compute control host, largest container count: prefigures the Roosevelt role split.
+- FLAG: machine 3 concentrates six singletons (vault, glance, nova-cloud-controller, octavia,
+  placement, barbican) + compute + OSD. Acceptable on testcloud; Roosevelt answer is role split + HA,
+  informed by measured footprints at the 32 GiB envelope (DEVIATION-2 caveat).
+- rabbitmq-server single unit: messaging SPOF, as designed for v1.
+
+## PATTERN-1 (standing convention) -- dynamic lookup vs. pinned identifiers
+READ/VERIFY ops discover values at runtime (never hardcode what resolves: hostname->system_id via
+`maas admin machines read | jq`; subnet id by CIDR). DESTRUCTIVE/IRREVERSIBLE ops discover
+dynamically, ASSERT against a pinned EXPECTED set, ABORT on mismatch, then operate on the pinned
+values (a filter bug or an unexpected new machine must not become collateral damage). Retrofit
+candidates (apply as fixture-tested gated blocks, NOT bulk edits): phase-00 release/host loops ->
+discover-assert-pin; subnet ids -> resolve by CIDR; octet maps -> derive from hostname index.
+Canonical statement in runbooks/README.md.
+
 ## Next
 phase-02 -- vault bring-up.
 
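Reviewer sketch (not part of the patch): PATTERN-1's discover / assert / pin sequence for destructive ops could look roughly like the helper below. The name `assert_pinned_set` and its calling convention are hypothetical. The pinned ids in the usage comment are the four system_ids already used by the phase-00 verification loop, and the jq filter reuses that loop's hostname test.

```shell
# Hypothetical helper (not in the repo): compare a discovered id list against a
# pinned EXPECTED list. Prints the pinned ids (one per line) only when the two
# sets are identical; otherwise prints an ABORT message and returns non-zero.
assert_pinned_set() {
  local discovered expected
  # Normalize both whitespace-separated lists to sorted, unique, one-per-line.
  discovered=$(printf '%s\n' "$1" | tr -s '[:space:]' '\n' | sed '/^$/d' | sort -u)
  expected=$(printf '%s\n' "$2" | tr -s '[:space:]' '\n' | sed '/^$/d' | sort -u)
  if [ -z "$expected" ] || [ "$discovered" != "$expected" ]; then
    echo "ABORT: discovered set does not match the pinned set" >&2
    return 1
  fi
  printf '%s\n' "$expected"
}

# Usage (jumphost), before a destructive loop:
#   PINNED="4na83t qdbqd6 h8frng tmsafc"
#   FOUND=$(maas admin machines read \
#     | jq -r '.[]|select(.hostname|test("^openstack[0-3]$"))|.system_id')
#   IDS=$(assert_pinned_set "$FOUND" "$PINNED") || exit 1
#   for SID in $IDS; do echo "would operate on $SID"; done
```

Capturing the result with `IDS=$(...) || exit 1` matters: placing the call directly in the `for` list would turn an ABORT into an empty loop, not a stop.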
diff --git a/runbooks/phase-02-vault-bringup.md b/runbooks/phase-02-vault-bringup.md
index fc03cf4..74582e4 100644
--- a/runbooks/phase-02-vault-bringup.md
+++ b/runbooks/phase-02-vault-bringup.md
@@ -42,7 +42,12 @@
 init with the `2>&1 | tee` capture (NOT `>`). Save `~/vault-init/init.txt` off-host
 the moment the gate passes.
 ```bash
+# RUN: jumphost -- open the interactive session ONLY (paste this line alone; DOCFIX-029)
 juju ssh -m openstack vault/0
+```
+WAIT for the remote prompt (`ubuntu@juju-...`) before pasting the next block -- a combined
+paste buffers the in-session lines and feeds them to the session on connect.
+```bash
 # --- inside the vault/0 session: ---
 export VAULT_ADDR=http://127.0.0.1:8200 ; umask 077 ; mkdir -p ~/vault-init
 vault status 2>&1 | grep -E 'Initialized|Sealed|Storage Type|HA Enabled' || true   # pre-check: Initialized false (fresh)
@@ -86,18 +91,29 @@
 juju actions vault --schema --format yaml -m openstack | sed -n '/authorize-charm:/,/^[a-z]/p'
 ```
 ```bash
-# RUN: on vault/0 -- mint a short-lived child token (root entered hidden, never on argv/history)
+# RUN: jumphost -- open the interactive session ONLY (paste this line alone; DOCFIX-029)
 juju ssh -m openstack vault/0
-# --- inside the session: ---
+```
+WAIT for the remote prompt (`ubuntu@juju-...`). This in-session block contains a hidden
+`read -s` -- a combined paste would let read swallow the next buffered line as the secret.
+NO trailing `exit`: exit MANUALLY after copying the child token (a paste-ahead `exit` could
+self-terminate the session and mask the swallow).
+```bash
+# --- inside the session: mint a short-lived child token (root entered hidden, never on argv/history) ---
 export VAULT_ADDR=http://127.0.0.1:8200
 read -s -p "root token: " VAULT_TOKEN; echo ; export VAULT_TOKEN
 vault token create -ttl=10m -field=token        # prints ONLY the child token -- copy it
 unset VAULT_TOKEN
-exit
+# (exit manually after you have copied the child token)
 ```
 ```bash
 # RUN: jumphost -- authorize + root CA + status (each juju run blocks to completion)
-juju run vault/leader authorize-charm token=<short-lived-child-token> -m openstack
+# ENHANCEMENT-2: enter the child token via hidden read (keeps it out of jumphost shell
+# history). The token still transits the Juju operation log (inherent to the action;
+# mitigated by the 10m TTL) -- this narrows exposure, it does not eliminate it.
+read -rs -p "child token: " TOK; echo
+juju run vault/leader authorize-charm token="$TOK" -m openstack
+unset TOK
 juju run vault/leader generate-root-ca -m openstack
 juju status vault -m openstack
 ```
@@ -114,6 +130,14 @@
 - The narrow cert cascade to the Vault consumers (ovn-central x3, ovn-chassis x3,
   ovn-chassis-octavia, neutron-api-plugin-ovn, barbican-vault) now proceeds -- it is
   watched and accepted in phase-03.
+- POST-INIT SWEEP (FINDING-2 / DOCFIX-028 cross-check) -- after the cert cascade settles:
+  * magnum/0 -> active "Unit is ready"; magnum-api is now served by apache2 on *:9501 (all
+    interfaces; haproxy backends reachable; [api] port moved to the wsgi backend). The
+    phase-01 pre-vault 9501 BLOCK was the expected loopback-bound posture and self-resolves
+    here at the TLS cutover (confirmed 2026-06-12). If it is STILL loopback-bound after certs
+    settle, escalate to charm diagnosis BEFORE phase-03 (then the phase-01 line is a defect).
+  * keystone/0 PO state UNCHANGED ("PO (broken): Unit is ready") -- still default policy
+    (FINDING-1: use-policyd-override=true with no zip). Not a regression; no mutation.
 
 ## As-built reference (2026-06-03 run -- audit trail)
 - init: 5 shares / threshold 3, "Vault initialized with 5 key shares and a key
diff --git a/runbooks/phase-03-core-verify.md b/runbooks/phase-03-core-verify.md
index 38d2ddb..1fdbead 100644
--- a/runbooks/phase-03-core-verify.md
+++ b/runbooks/phase-03-core-verify.md
@@ -5,9 +5,12 @@
 API reachability, and repoint the external Horizon reverse proxy.
 
 Decisions: B5 (IP-only endpoints; no FQDN), D-021 (octavia stays BLOCKED awaiting
-configure-resources -- expected, cleared in phase-05). Troubleshooting: appendix-A --
-DOCFIX-021 (action human-output corrupts captured artifacts), DOCFIX-018 (IP-only
-OS_AUTH_URL), DOCFIX-022 (admin project discovered, not hardcoded).
+configure-resources -- expected, cleared in phase-05), D-044 (Horizon Secure-cookie
+override on the plain-HTTP proxy leg; Step 3.3, PER-REBUILD), D-045 / DOCFIX-031 (haproxy
+backends confirmed LOADED via a functional sweep, NOT juju status; Step 3.1). Troubleshooting:
+appendix-A -- DOCFIX-021 (action human-output corrupts captured artifacts), DOCFIX-018 (IP-only
+OS_AUTH_URL), DOCFIX-022 (admin project discovered, not hardcoded), D-045/DOCFIX-031 (haproxy
+plaintext-check-vs-SSL backend DOWN), nginx reverse-proxy lessons.
 
 ---
 
@@ -62,6 +65,31 @@
 # juju ssh -m openstack <unit> -- 'sudo tail -120 /var/log/juju/unit-<unit-dashed>.log' </dev/null
 ```
 
+### Step 3.1 backend-health gate (DOCFIX-031 / D-045) -- juju status is BLIND to a dead backend
+`# RUN: jumphost`  The acceptance walk above gates on juju active/idle. That is NOT
+sufficient: a unit can be active/idle while a charm-rendered haproxy backend is silently
+DOWN -- observed 2026-06-12, nova-cc nova-api down ~3.2 days with juju green (root cause
+D-045: haproxy not reloaded after the cert cascade -> plaintext checks vs the SSL backend).
+Probe haproxy's own verdict on every unit:
+```bash
+( {
+  echo "=== POST-TLS GATE: haproxy backend health sweep across all units ==="
+  for unit in $(juju status -m openstack --format=json | python3 -c 'import json,sys; d=json.load(sys.stdin); [print(u) for a in d.get("applications",{}).values() for u in (a.get("units") or {})]'); do
+    juju ssh -m openstack "$unit" -- "test -S /var/run/haproxy/admin.sock || exit 0; sudo python3 -c 'import socket;s=socket.socket(socket.AF_UNIX);s.connect(\"/var/run/haproxy/admin.sock\");s.sendall(b\"show stat\n\");print(s.makefile().read())' | grep -vE 'FRONTEND|BACKEND' | grep ',DOWN,'" </dev/null 2>/dev/null | sed "s|^|[$unit] DOWN: |"
+  done
+  echo "=== sweep complete -- no DOWN lines above means every haproxy backend is UP ==="
+} )
+```
+GATE: zero `[unit] DOWN:` lines. On a DOWN line (check token L7STS/400 == plaintext-vs-SSL),
+remediate the flagged unit (set U, then validate-and-reload):
+```bash
+U=nova-cloud-controller/0
+juju ssh -m openstack "$U" -- 'sudo haproxy -c -f /etc/haproxy/haproxy.cfg' </dev/null   # gate: must say valid
+juju ssh -m openstack "$U" -- 'sudo systemctl reload haproxy' </dev/null                 # graceful master-worker
+```
+Re-run the sweep until clean. (Signature confirm if needed: plaintext `curl http://SVC-IP:876x/`
+returns 400, TLS `curl -k https://SVC-IP:876x/` returns 200 -- the transport is the difference.)
+
 ## Step 3.2 -- Build admin-openrc (IP-only; canonical block)
 `# RUN: jumphost`  Keystone PUBLIC = the provider VIP IP over HTTPS with the vault
 CA (no FQDN, no /etc/hosts -- B5). This canonical block folds in three fixes:
@@ -140,27 +168,78 @@
 Swift/S3 smoke); the gss image-stream is HTTP on metal `10.12.8.172`.
 
 ## Step 3.3 -- Horizon access via the external nginx reverse proxy
-`# RUN: operator (outside the Juju model)`  Horizon is fronted by an
-operator-managed nginx reverse proxy. On each rebuild / VIP relocation, repoint its
+`# RUN: operator (outside the Juju model) + jumphost`  Horizon is fronted by an
+operator-managed nginx reverse proxy. On each rebuild / VIP relocation: (1) repoint the
 upstream to the CURRENT dashboard provider VIP (now `https://10.12.4.58`, was `.234`
-pre-R14). Verify two interplays:
-- ALLOWED_HOSTS: Horizon (bundle B5 setting) must permit whatever Host header reaches
-  it, else HTTP 400 DisallowedHost. Either set the proxy `proxy_set_header Host` to the
-  dashboard VIP, or add the proxy hostname to Horizon ALLOWED_HOSTS.
-- Upstream TLS: the dashboard cert is vault-signed for the VIP IP (IP-SAN). The proxy
-  must trust the vault root CA (`~/vault-init/vault-ca-root.pem`) for `proxy_ssl_verify`,
-  or terminate/re-encrypt per policy.
-LIVE-REVIEW: the proxy host + config path + reload command are operator-managed and
-not captured here -- record them verbatim when wired, and confirm an external GET
-reaches the Horizon login. (Roosevelt: this repoint folds into the access/DNS workstream.)
+pre-R14), and (2) reapply the Horizon Secure-cookie override (DOCFIX-030 / D-044,
+PER-REBUILD -- below). Two interplays:
+- ALLOWED_HOSTS: Horizon (bundle B5) must permit the Host header that reaches it, else
+  HTTP 400 DisallowedHost. As-built keeps the client Host (`proxy_set_header Host $http_host`);
+  rewriting it to the VIP would emit redirects corporate clients may not route.
+- Upstream TLS name-match: the dashboard cert is vault-signed and embeds the unit HOSTNAME
+  as a DNS SAN (e.g. `juju-ffe3b8-2-lxd-2`) alongside IP SANs. nginx upstream verification
+  is DNS-only (X509_check_host), so the proxy_pass IP NEVER matches -- `proxy_ssl_verify on`
+  requires `proxy_ssl_name` set to the cert's DNS SAN. (B5 is IP-only for ENDPOINTS, not certs.)
+
+Proxy topology (as-executed 2026-06-12 -- confirm/refresh per site):
+- Proxy host `nginx` 10.12.4.7 (Ubuntu 24.04, nginx 1.24.0 native/systemd); also fronts
+  MAAS (listen 80 -> 10.12.4.10:5240). Horizon vhost `/etc/nginx/sites-available/openstack`
+  (symlinked into sites-enabled), listen 81; corporate clients reach it via 10.17.11.246:81.
+
+As-executed change set (gate every edit -- `sed -i` exits 0 on zero matches, so grep-assert
+the expected line after any mutation):
+```bash
+# RUN: jumphost -- ship the vault root CA to the proxy
+scp ~/vault-init/vault-ca-root.pem jessea123@10.12.4.7:/tmp/
+```
+```bash
+# RUN: operator ON 10.12.4.7 -- install CA, back up + edit the Horizon vhost, validate, restart.
+sudo install -o root -g root -m 644 /tmp/vault-ca-root.pem /etc/nginx/vault-ca-root.pem && rm -f /tmp/vault-ca-root.pem
+sudo cp -a /etc/nginx/sites-available/openstack "/etc/nginx/sites-available/openstack.bak-$(date -u +%Y%m%dT%H%M%SZ)"
+# Set in the Horizon server block (then `grep` to confirm each landed):
+#   proxy_pass https://10.12.4.58:443;
+#   proxy_ssl_trusted_certificate /etc/nginx/vault-ca-root.pem;
+#   proxy_ssl_verify on;
+#   proxy_ssl_name juju-ffe3b8-2-lxd-2;   # the dashboard cert's DNS SAN -- per site (discover: openssl s_client -connect 10.12.4.58:443 </dev/null 2>/dev/null | openssl x509 -noout -ext subjectAltName)
+#   proxy_redirect https://$http_host/ http://$http_host/;   # unwind the scheme-mismatch redirect loop (Horizon emits absolute https:// on the client Host -> browser then speaks TLS to the :81 plaintext listener)
+sudo nginx -t                       # GATE: configuration ok
+sudo systemctl restart nginx        # prefer restart over reload for a definitive cutover (a curl ~2s after `reload` can be served by a draining old worker; ~2s blip incl. the co-hosted MAAS proxy)
+```
+GATE (on the proxy): `curl -sI http://127.0.0.1:81/horizon/` -> 302 to .../auth/login; no TLS errors in error.log.
+
+### DOCFIX-030 -- Horizon Secure-cookie override (D-044; PER-REBUILD)
+The charm renders `CSRF_COOKIE_SECURE`/`SESSION_COOKIE_SECURE = True` (vault:certificates).
+On the plain-HTTP client leg the browser drops the Secure csrftoken and login fails with
+"CSRF cookie not set" -- so a clean run of 3.3 without this override stalls at the browser login.
+Drop an ASCII-only post-load override on the dashboard unit, then graceful-reload apache2:
+```bash
+# RUN: jumphost -- D-044 cookie override on the dashboard unit (ASCII-only; PER-REBUILD)
+juju ssh -m openstack openstack-dashboard/leader -- "printf 'CSRF_COOKIE_SECURE = False\nSESSION_COOKIE_SECURE = False\n' | sudo tee /usr/share/openstack-dashboard/openstack_dashboard/local/local_settings.d/_99_internal_http_cookies.py >/dev/null && sudo systemctl reload apache2" </dev/null
+```
+Verify the csrftoken Set-Cookie carries NO Secure attribute (over the VIP, vault CA):
+```bash
+# RUN: jumphost
+CK=$(curl -s -o /dev/null -D - --cacert ~/vault-init/vault-ca-root.pem https://10.12.4.58/horizon/auth/login/ | grep -i 'set-cookie:.*csrftoken')
+if [ -z "$CK" ]; then echo "WARN: no csrftoken Set-Cookie on this GET -- confirm via the browser login"
+elif printf '%s\n' "$CK" | grep -iq 'secure'; then echo "FAIL: csrftoken still Secure"; else echo "OK: csrftoken not Secure"; fi
+```
+Then confirm an external browser login over the proxy succeeds.
+PER-REBUILD: teardown wipes the unit; reapply each rebuild until edge TLS (the Roosevelt
+access/DNS workstream) makes the override unnecessary. The upstream stays PLAIN HTTP
+(as-built); the abandoned upstream-TLS and self-signed-client-TLS approaches are NOT part
+of v1 (D-044 rationale). Diagnostic lessons (reload race, proxy_ssl_name DNS-SAN, sed no-op,
+scheme-mismatch redirect loop) are in appendix-A.
 
 ---
 
 ## EXIT GATE (phase-03 complete)
 - Cloud settled: acceptance walk shows only the expected block(s) (octavia; maybe gss).
+- Every haproxy backend UP/L7OK by the functional sweep (DOCFIX-031), not merely juju active/idle.
 - `~/admin-openrc` (0600) authenticates and returns a SCOPED token; endpoint list IP-only.
 - Vault root CA at `~/vault-init/vault-ca-root.pem` validates TLS to the keystone VIP.
-- Horizon reachable through the repointed reverse proxy.
+- Horizon reachable through the repointed reverse proxy AND login works (D-044 cookie override
+  applied). Dashboard webroot is the charm default `/horizon` (the root path 404s) -- probe
+  `/horizon/auth/login/`. Two VIPs per bundle B1: 10.12.4.58 (provider) + 10.12.8.58 (metal).
 
 ## As-built reference (2026-06-03 run -- audit trail)
 - Cascade settled ~04:15Z: all five Vault consumers active/idle; only expected
@@ -171,7 +250,14 @@
   domains admin_domain, OS_CACERT=~/vault-init/vault-ca-root.pem.
 - Vault root CA: subject "Vault Root Certificate Authority (charm-pki-local)",
   notBefore 2026-06-03, notAfter 2036-05-31; TLS to 10.12.4.50:5000 OK (B5 IP-SAN holds).
-- Dashboard VIP 10.12.4.58 (nginx upstream repoint pending operator capture).
+- Dashboard VIP 10.12.4.58 (provider) + 10.12.8.58 (metal); nginx upstream repoint + D-044
+  cookie override captured in Step 3.3 (as-executed 2026-06-12).
+- gss image-stream public endpoint is HTTP on the unit IP this rebuild (10.12.8.196; was .172
+  in the 06-03 snapshot -- a rebuild-variable container primary, not a defect; off the critical
+  path, no jumphost route to the container space). Refresh the snapshot per rebuild.
+- Exit-gate re-confirm (2026-06-16, read-only, all green): settle (only octavia D-021 + gss
+  between-runs); admin-openrc scoped + Nova compute service list up (D-045 holds end-to-end);
+  vault-CA -> keystone VIP TLS verify rc 0; haproxy backend sweep zero DOWN cloud-wide.
 
 ## Next
 phase-04 -- network carve (external provider network).
diff --git a/runbooks/phase-04-network-carve.md b/runbooks/phase-04-network-carve.md
index 107da29..9bce883 100644
--- a/runbooks/phase-04-network-carve.md
+++ b/runbooks/phase-04-network-carve.md
@@ -114,11 +114,18 @@
 - FIP allocation + tenant router gateways are now possible (needed by phase-06 mgmt
   VM FIP, phase-08 cluster FIPs + LB validation).
 
-## As-built reference (2026-06-03 run -- audit trail)
-- network provider-ext = 70b34bb2-3afb-4b43-96d3-f520dbcbf9a8 (external, flat, physnet1, shared=false, role=provider)
-- subnet provider-ext-fip = e3afcbae-ec34-4125-9007-2bfa51851422
+## As-built reference (object IDs regenerate per deploy -- old IDs are dead post-teardown, not a discrepancy)
+- network provider-ext = 0d00ddc1-d2bf-4849-a087-14c07d77f167  (06-03 snapshot: 70b34bb2-...)
+  (external, flat, physnet1, shared=false, role=provider)
+- subnet provider-ext-fip = d27f196c-a2d9-4bb9-99f3-bcb8caea3165  (06-03 snapshot: e3afcbae-...)
   (cidr 10.12.4.0/22, gateway 10.12.4.1, enable_dhcp=false, alloc 10.12.5.0-10.12.7.254,
    tags role=provider + netbox-iprange=10.12.5.0-10.12.7.254)
+- Live MAAS reservations the IPAM draft + D-003 do NOT yet list (the DRAFT is incomplete, not
+  the cloud -- draft <- live): 10.12.4.101-10.12.4.110 (subnet 1, provider) +
+  10.12.8.101-10.12.8.110 (subnet 2, metal), both "mgmt-plane reserved" (10 IPs each). Both sit
+  OUTSIDE the FIP pool (10.12.5.0-10.12.7.254) and the VIP /26 blocks -> no conflict with
+  provider-ext-fip. FOLD into docs/netbox-vip-queue.md + D-003 in the docs sub-pass (purpose
+  annotation pending operator confirmation; do NOT mutate NetBox until IPAM design is confirmed -- D-010).
 - Transitional note: MAAS already carried the front-loaded VIP reservations (.2-.63
   provider + .8.2-.63 metal; old D-020 .8.224-.254 gone) ahead of the bundle's interim
   .50-.60 VIPs -- harmless (a reserved range blocks future auto-assign, does not evict
diff --git a/runbooks/phase-05-octavia-enablement.md b/runbooks/phase-05-octavia-enablement.md
index d89fb27..61d0293 100644
--- a/runbooks/phase-05-octavia-enablement.md
+++ b/runbooks/phase-05-octavia-enablement.md
@@ -91,6 +91,16 @@
 fresh -> download+checksum+upload+retrofit). For a FIRST live run in a new
 environment you may stop after the seed to eyeball before the multi-minute build.
 
+SEED METHOD (canonical): stage-and-verify (download + sha256-vs-published-SHA256SUMS +
+`openstack image create --file`) is CANONICAL here -- it carries provenance verification,
+works for any source, and unifies with the phase-08 kube-image seed (FINDING-3). This
+SUPERSEDES the 2026-06-16 "web-download canonical" ruling: web-download cannot checksum-verify
+the fetched file and is infeasible for the azimuth CDN (urllib UA 403). Web-download is retained
+as a TESTED ALTERNATIVE in appendix-A. Note the staged base lands QCOW2 (legacy `--file` does
+NOT run glance's import conversion -- CORRECTION-1); that is fine, the retrofit consumes the
+qcow2 base and emits the raw `octavia-amphora` OUTPUT (the config gate's image-format=raw is on
+the retrofit OUTPUT, not the base).
+
 ```bash
 # Tunables (operator-confirm the first two for your environment):
 BASE_IMG_URL="https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img"
@@ -171,20 +181,26 @@
   admin-scope failover) is D-011 criterion 4 -- run in phase-08 (needs tenant
   scaffolding + the external provider network from phase-04).
 
-## As-built reference (2026-06-03 run -- audit trail)
+## As-built reference (current rebuild 2026-06-16; per-deploy values regenerate -- old IDs are not discrepancies)
 - octavia/0: octavia 14.0.0, charm rev 441 2024.1/stable, on 3/lxd/3, data leg 10.12.12.1;
   multi-homed (reaches provider VIPs over eth1).
-- configure-resources op 15 / task 16 completed (--wait=20m). Created lb-mgmt-net
-  (d1ee4bca-...), lb-mgmt-subnetv6 (1c1f50df-..., IPv6 geneve), lb-mgmt-sec-grp (acbacb21-...).
-  o-hm0 fc00:9c49:5b4e:cf23:f816:3eff:fead:56df/64, br-int port.
-- amphora: retrofit is metal-only (10.12.8.172) -> internal glance VIP 10.12.8.53.
-  base jammy-amphora-base uploaded (f8b48cdb-...); retrofit op 19/task 20 built
-  amphora-haproxy-x86_64-ubuntu-22.04-20260603 (4e4a94ac-...), ACTIVE, tag octavia-amphora
-  (matches octavia amp-image-tag). image-format raw.
+- configure-resources op 9 / task 10 completed (--wait=20m; 06-03 snapshot: op 15/task 16).
+  Created lb-mgmt-net / lb-mgmt-subnetv6 (IPv6 geneve) / lb-mgmt-sec-grp; o-hm0 UP, IPv6-ULA
+  fc00:3f8c:7162:d105:f816:3eff:feea:7e45/64 (06-03: fc00:9c49:...:56df; the ULA regenerates per deploy).
+- amphora: retrofit is metal-only -> internal glance VIP 10.12.8.53. base jammy-amphora-base
+  = da757cb1-... (untagged; 06-03: f8b48cdb-...); retrofit op 13/task 14 (06-03: op 19/task 20)
+  built amphora-haproxy-x86_64-ubuntu-22.04-20260616 = ca5552a5-... ACTIVE, tag octavia-amphora
+  (matches octavia amp-image-tag), image-format raw, ~6.2 GB, owned by the services project
+  (06-03 OUTPUT: 4e4a94ac-...).
+- mgmt VM image pre-staged for phase-06: ubuntu-24.04-noble = 899b4b5c-... (public, os props).
+- SEED METHOD this rebuild vs canonical: the base + noble were seeded via WEB-DOWNLOAD
+  (the 06-16 expedient). Canonical going forward is STAGE-AND-VERIFY (Step 5.2 header);
+  web-download is a tested alternative (appendix-A). The web-downloaded base landed raw (import
+  conversion ran); a staged --file base lands qcow2 (CORRECTION-1) and is equally fine for the retrofit.
 - Charm gap (parked): glance-simplestreams-sync is metal-only and cannot reach glance
   on a no-DNS deploy (use-internal-endpoints steers keystone auth but not the
-  glance/swift client) -> gss does NOT seed the base. The base is seeded manually
-  (above) and the amphora BUILD stays charm-native via the retrofit over internal
+  glance/swift client) -> gss does NOT seed the base. The base is seeded per Step 5.2
+  and the amphora BUILD stays charm-native via the retrofit over internal
   endpoints. Roosevelt root-fix: cloud DNS + FQDN-valid certs (also fixes gss).
 
 ## Next
diff --git a/runbooks/phase-06-incloud-mgmt-cluster.md b/runbooks/phase-06-incloud-mgmt-cluster.md
index da84bd5..6c5b87c 100644
--- a/runbooks/phase-06-incloud-mgmt-cluster.md
+++ b/runbooks/phase-06-incloud-mgmt-cluster.md
@@ -30,13 +30,13 @@
 ## Constants and env-literals (TAG: regenerate/confirm per site on rebuild)
 Literals below are tagged `ENV(...)` so the later generalization pass is
 mechanical. Discover everything else dynamically at run time.
-- `ENV(project)`     capi-mgmt           (id 674171fd28d446d3a37073b6a761e910)
-- `ENV(ext-net)`     provider-ext        (id 70b34bb2-3afb-4b43-96d3-f520dbcbf9a8)
-- `ENV(image)`       ubuntu-24.04-noble  (id c66342ce-f402-4e6e-a324-ae27032396d7)
+- `ENV(project)`     capi-mgmt           (resolve by name; this rebuild id d5bc125c7c1841d389b76cd0a7b0a915, domain capi)
+- `ENV(ext-net)`     provider-ext        (resolve by name; this rebuild id 0d00ddc1-d2bf-4849-a087-14c07d77f167)
+- `ENV(image)`       ubuntu-24.04-noble  (resolve by name; this rebuild id 899b4b5c-d8f6-4df4-860b-a9210d0eefe8)
 - `ENV(flavor)`      gp.large            (16384 MB / 4 vCPU / 80 GB)
 - `ENV(mgmt-cidr)`   10.20.0.0/24        (capi-mgmt-subnet; overlay, non-IPAM)
 - `ENV(keystone-vip)` 10.12.4.50:5000    (the gate target -- the deployed VIP)
-- `ENV(mgmt-fip)`    10.12.7.40          (assigned in 6.2; apiserver SAN)
+- `ENV(mgmt-fip)`    assigned in 6.2     (apiserver SAN; resolve dynamically. This rebuild capi-mgmt-v2 = 10.12.5.103, tenant 10.20.0.107; the old 10.12.7.40 / 10.20.0.45 was the pre-teardown mgmt VM -- DOCFIX-038)
 - `ENV(pod-cidr)`    10.1.0.0/16   `ENV(svc-cidr)` 10.152.183.0/24  (snap defaults; non-colliding)
 - `ENV(capi-tag)`    0.25.1              (capi-helm-charts release; dependencies.json source)
 
@@ -44,7 +44,8 @@
 - `# RUN: jumphost`   -- on vopenstack-jesse as jessea123, admin-openrc sourced.
 - `# RUN: mgmt VM`    -- shipped to the VM over SSH via the FIP (heredoc below).
 - VM SSH form (used verbatim throughout; DOCFIX-021 `</dev/null` on every sudo):
-  `ssh -i ~/.ssh/id_ed25519 -o BatchMode=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@10.12.7.40 bash -s <<'REOF' ... REOF`
+  `ssh -i ~/.ssh/id_ed25519 -o BatchMode=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@10.12.5.103 bash -s <<'REOF' ... REOF`
+  (10.12.5.103 = this rebuild's capi-mgmt-v2 FIP; resolve dynamically -- the old 10.12.7.40 is dead.)
 
 ---
 
@@ -62,17 +63,24 @@
 sets nova-compute `libvirt-image-backend: rbd` (B3) -- DISK_GB comes from the Ceph
 pool, not the ~9 GB local ephemeral ceiling.
 
-The noble image imports via the interoperable import path (glance-direct), the
-VERBATIM-proven path from the 2026-06-08 kube-image upload (plain web-download
-403s on this cloud). With the hardened bundle's glance `image-conversion: true`,
-the stored disk_format lands `raw` on the redeploy (expected; D-021 Ceph
-fast-clone alignment).
+The noble image is seeded by STAGE-AND-VERIFY (canonical per FINDING-3; supersedes the
+2026-06-16 web-download ruling and the standalone-`glance` glance-direct line): download +
+sha256-vs-published-SHA256SUMS + `openstack image create --file --import` (client-safe -- the
+openstack snap's `--import` is the glance-direct equivalent; the standalone `glance` client is
+NOT assumed present). With the hardened bundle's glance `image-conversion: true`, `--import`
+lands the stored disk_format `raw` (D-021 Ceph fast-clone alignment). Web-download is retained
+as a tested alternative (appendix-A); for ubuntu cloud-images it works on the hardened bundle
+(the 2026-06-08 403 was transient/pre-hardening), but it cannot checksum-verify the fetched
+file -- stage-and-verify is preferred for provenance and unifies with the phase-08 kube seed.
 
 AS-BUILT FACTS (verified live 2026-06-10 pre-teardown; supersede the rebuild
 handoff, which wrongly placed capi-mgmt in admin_domain): project `capi-mgmt`
 lives in domain `capi` ("CAPI/Magnum workload identity"); the noble image is
-`public` with os_distro/os_version properties; admin@admin_domain holds `member`
-(not admin) on the project. NOTE -- the old static CAPO identity (user `capo`,
+`public` with os_distro/os_version properties; admin@admin_domain holds `member` +
+`load-balancer_member` + `reader` (NOT admin) on the project -- DOCFIX-036 / D-039: magnum
+mints the per-cluster app-cred from the TRUSTOR's roles, so the trustor must hold
+`load-balancer_member` or CAPO's cred 403s on Octavia and the workload cluster wedges at
+API-LB provisioning. NOTE -- the old static CAPO identity (user `capo`,
 its app-cred, `capo-clouds.yaml`) is a FOSSIL of the retired D-033 out-of-cloud
 path and is deliberately NOT recreated: the current architecture needs no static
 cloud credential (`clusterctl init` takes none; per-cluster creds are
@@ -96,13 +104,22 @@
            --description "CAPI management project" capi-mgmt >/dev/null \
          && echo "[OK] project capi-mgmt (domain $PROJ_DOMAIN)"; }
 
-  echo "=== role: $OS_USERNAME gets MEMBER on capi-mgmt (as-built grant; OS_PROJECT_ID blocks in 6.x/7.8/8.x) ==="
-  openstack role assignment list --user "$OS_USERNAME" --user-domain "$OS_USER_DOMAIN_NAME" \
-      --project capi-mgmt --project-domain "$PROJ_DOMAIN" -f value 2>/dev/null | grep -q . \
-    && echo "[SKIP] role assignment present" \
-    || { openstack role add --user "$OS_USERNAME" --user-domain "$OS_USER_DOMAIN_NAME" \
-           --project capi-mgmt --project-domain "$PROJ_DOMAIN" member \
-         && echo "[OK] member role on capi-mgmt"; }
+  echo "=== roles: $OS_USERNAME gets member + load-balancer_member + reader on capi-mgmt (DOCFIX-036 / D-039) ==="
+  # D-039 ROOT CAUSE: magnum mints the per-cluster app-cred carrying the TRUSTOR's roles,
+  # FROZEN at mint, and delegates ALL trustor roles unfiltered. If admin@admin_domain holds
+  # only `member` here, CAPO's app-cred 403s on Octavia (needs load-balancer_member) and the
+  # workload cluster wedges at API-LB provisioning. Grant all three so future mints carry LB
+  # authority. (load-balancer_member + reader are keystone/Octavia default roles.)
+  for ROLE in member load-balancer_member reader; do
+    if openstack role assignment list --user "$OS_USERNAME" --user-domain "$OS_USER_DOMAIN_NAME" \
+         --project capi-mgmt --project-domain "$PROJ_DOMAIN" --role "$ROLE" -f value 2>/dev/null | grep -q .; then
+      echo "[SKIP] $ROLE already on capi-mgmt"
+    else
+      openstack role add --user "$OS_USERNAME" --user-domain "$OS_USER_DOMAIN_NAME" \
+        --project capi-mgmt --project-domain "$PROJ_DOMAIN" "$ROLE" \
+        && echo "[OK] $ROLE on capi-mgmt"
+    fi
+  done
 
   echo "=== flavors (as-built specs; public -- verified live 2026-06-10 pre-teardown) ==="
   for spec in "gp.large 4 16384 80" "gp.mid 2 8192 40" "capi.node 2 4096 40" \
@@ -114,19 +131,32 @@
            && echo "[OK] $1 ($2 vcpu / $3 MB / $4 GB)"; }
   done
 
-  echo "=== mgmt VM image ubuntu-24.04-noble (verify-or-import; glance-direct; HOME-staged, L7) ==="
+  echo "=== mgmt VM image ubuntu-24.04-noble (verify-or-seed; STAGE-AND-VERIFY canonical; HOME-staged, L7) ==="
   if openstack image show ubuntu-24.04-noble >/dev/null 2>&1; then
     echo "[SKIP] image ubuntu-24.04-noble exists"
   else
-    SRC="$HOME/noble-server-cloudimg-amd64.img"
-    [ -f "$SRC" ] || { echo "ABORT: $SRC missing (re-fetch: cloud-images.ubuntu.com/noble/current/)"; exit 1; }
-    glance image-create-via-import \
-      --import-method glance-direct \
-      --file "$SRC" \
-      --container-format bare --disk-format qcow2 \
-      --visibility public \
-      --property os_distro=ubuntu --property os_version=24.04 \
-      --name ubuntu-24.04-noble
+    # Stage-and-verify (FINDING-3): download to $HOME (snap-readable; NOT /tmp -- L7) if missing/
+    # checksum-stale, verify sha256 vs the published SHA256SUMS, then client-safe import via the
+    # openstack snap (--import == glance-direct; image-conversion lands it raw). NOT the standalone
+    # `glance` client (unconfirmed on this jumphost).
+    IMG_URL="https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img"
+    SUM_URL="https://cloud-images.ubuntu.com/noble/current/SHA256SUMS"
+    IMG_FILE="noble-server-cloudimg-amd64.img"; SRC="$HOME/$IMG_FILE"
+    EXP=$(curl -fsSL "$SUM_URL" | awk -v f="$IMG_FILE" '$2=="*"f || $2==f {print $1}')
+    [ -n "$EXP" ] || { echo "GATE FAIL: no published checksum for $IMG_FILE"; exit 1; }
+    if [ -f "$SRC" ] && [ "$(sha256sum "$SRC" | awk '{print $1}')" = "$EXP" ]; then
+      echo "[OK] staged noble present + checksum-valid; skipping download"
+    else
+      echo "[..] downloading noble to $SRC (snap-readable; NOT /tmp)"
+      wget -q -O "$SRC" "$IMG_URL"
+      GOT=$(sha256sum "$SRC" | awk '{print $1}')
+      [ "$EXP" = "$GOT" ] || { echo "GATE FAIL: checksum mismatch exp='$EXP' got='$GOT'"; exit 1; }
+      echo "[OK] checksum verified ($GOT)"
+    fi
+    openstack image create ubuntu-24.04-noble \
+      --file "$SRC" --import \
+      --container-format bare --disk-format qcow2 --public \
+      --property os_distro=ubuntu --property os_version=24.04
   fi
   # as-built (verified live 2026-06-10): visibility=public, os_distro=ubuntu, os_version=24.04,
   # stored raw in Ceph via the bundle's glance image-conversion=true.
@@ -215,21 +245,29 @@
   openstack server show capi-mgmt-v2 -f value -c status -c addresses
   echo "=== floating ip on provider-ext, associate to the VM ==="
   FIP=$(openstack floating ip create "$EXT" -f value -c floating_ip_address)
-  echo "allocated FIP=$FIP   # expect this to be 10.12.7.40 on a clean run -- ENV(mgmt-fip)"
   openstack server add floating ip capi-mgmt-v2 "$FIP"
+  # tenant (fixed) IP = the server address that is NOT the FIP (single-NIC VM has exactly the two)
+  TENANT_IP=$(openstack server show capi-mgmt-v2 -f json \
+    | FIP="$FIP" python3 -c "import os,json,sys; a=json.load(sys.stdin).get('addresses',{}) or {}; ips=[ip for net in a.values() for ip in net]; print(next((ip for ip in ips if ip!=os.environ['FIP']), ''))")
+  [ -n "$TENANT_IP" ] || { echo "ABORT: could not resolve tenant IP"; exit 1; }
+  # PERSIST both (single source for 6.3-6.6 -- PATTERN-1; the FIP is pool-allocated + the tenant
+  # IP DHCP-assigned, so NEITHER is deterministic per rebuild -- never hardcode them)
+  printf 'MGMT_FIP=%s\nMGMT_TENANT_IP=%s\n' "$FIP" "$TENANT_IP" | tee ~/capi-mgmt-net.env
   openstack server show capi-mgmt-v2 -f value -c addresses
 } )
 ```
-Note: the tenant IP lands on `10.20.0.45` and the FIP on `10.12.7.40` on the
-as-built run. If the FIP differs on rebuild, carry the new value into 6.4
-(`extra-sans`) and 6.5 (kubeconfig server) and phase-07 (conductor kubeconfig).
+Note (DOCFIX-038): the FIP is pool-allocated and the tenant IP is DHCP-assigned -- NEITHER is
+deterministic (this rebuild: FIP 10.12.5.103, tenant 10.20.0.107; the pre-teardown VM was
+10.12.7.40 / 10.20.0.45). Step 6.2 persists both to `~/capi-mgmt-net.env`; 6.3-6.6a source it,
+and phase-07 (conductor kubeconfig) uses the same FIP. Do not hardcode either value.
 
 ## Step 6.3 -- GATE 1: OS-level egress (before any k8s investment)
 `# RUN: mgmt VM`  This is the premise of D-035. PROCEED ONLY IF VIP-OK.
 
 ```bash
+source ~/capi-mgmt-net.env   # MGMT_FIP, MGMT_TENANT_IP (written by 6.2)
 ssh -i ~/.ssh/id_ed25519 -o BatchMode=yes -o StrictHostKeyChecking=no \
-    -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@10.12.7.40 bash -s <<'REOF'
+    -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@"$MGMT_FIP" bash -s <<'REOF'
 set -u
 echo "=== VM -> Keystone VIP 10.12.4.50:5000 ==="            # ENV(keystone-vip)
 timeout 6 bash -c 'exec 3<>/dev/tcp/10.12.4.50/5000' && echo VIP-OK || echo VIP-FAIL
@@ -250,15 +288,18 @@
 from stdin).
 
 ```bash
+source ~/capi-mgmt-net.env   # MGMT_FIP, MGMT_TENANT_IP (written by 6.2)
 ssh -i ~/.ssh/id_ed25519 -o BatchMode=yes -o StrictHostKeyChecking=no \
-    -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@10.12.7.40 bash -s <<'REOF'
+    -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@"$MGMT_FIP" \
+    bash -s "$MGMT_FIP" "$MGMT_TENANT_IP" <<'REOF'
 set -euo pipefail
+MGMT_FIP="$1"; MGMT_TENANT_IP="$2"    # passed from the jumphost (extra-sans must be the real FIP + tenant IP)
 
 echo "=== install k8s snap 1.32-classic/stable ==="
 sudo snap install k8s --classic --channel=1.32-classic/stable </dev/null
 
 echo "=== write bootstrap config (DOCFIX-024: cluster-config block REQUIRED) ==="
-sudo tee /root/bootstrap-config.yaml >/dev/null <<'CFG'
+sudo tee /root/bootstrap-config.yaml >/dev/null <<CFG
 cluster-config:
   network:
     enabled: true
@@ -267,8 +308,8 @@
 pod-cidr: 10.1.0.0/16
 service-cidr: 10.152.183.0/24
 extra-sans:
-- 10.12.7.40
-- 10.20.0.45
+- $MGMT_FIP
+- $MGMT_TENANT_IP
 CFG
 sudo cat /root/bootstrap-config.yaml
 
@@ -287,10 +328,11 @@
 old k3s node FAILED. On this single-NIC VM it must `Completed`.
 
 ```bash
-# RUN: mgmt VM -- emit a jumphost-facing kubeconfig (server = the FIP, not tenant IP)
+# RUN: jumphost (ssh to the mgmt VM; the kubeconfig lands on the jumphost). server = the FIP, not tenant IP
+source ~/capi-mgmt-net.env   # MGMT_FIP
 ssh -i ~/.ssh/id_ed25519 -o BatchMode=yes -o StrictHostKeyChecking=no \
-    -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@10.12.7.40 \
-    "sudo k8s config server=https://10.12.7.40:6443 </dev/null" > ~/capi-mgmt.kubeconfig
+    -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@"$MGMT_FIP" \
+    "sudo k8s config server=https://$MGMT_FIP:6443 </dev/null" > ~/capi-mgmt.kubeconfig
 # [SENSITIVE] ~/capi-mgmt.kubeconfig contains a cluster-admin credential.
 wc -l ~/capi-mgmt.kubeconfig ; head -1 ~/capi-mgmt.kubeconfig   # expect >0 lines, "apiVersion: v1"
 ```
@@ -320,7 +362,7 @@
 
 ## Step 6.6 -- CAPI provider stack (pinned to dependencies.json; D-034)
 `# RUN: mgmt VM`  Run VM-side as root with `KUBECONFIG=/root/kubeconfig` (local
-apiserver 10.20.0.45:6443) so the matched 1.32.13 kubectl is used -- avoids the
+apiserver = the VM's tenant IP:6443) so the matched 1.32.13 kubectl is used -- avoids the
 jumphost kubectl's +3-minor skew. Versions are READ from the tag's
 dependencies.json, never hardcoded (D-034). The as-built pins are in the
 reference block below as a known-good cross-check only.
@@ -339,14 +381,15 @@
 by 6.6b-6.6f (same jumphost shell).
 ```bash
 # define the mgmt-VM connection once (reused by 6.6b-6.6f)
-MGMT_VM=10.12.7.40
+source ~/capi-mgmt-net.env        # MGMT_FIP, MGMT_TENANT_IP (written by 6.2)
+MGMT_VM="$MGMT_FIP"
 SSH_OPTS="-i $HOME/.ssh/id_ed25519 -o BatchMode=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10"
 
 ssh $SSH_OPTS ubuntu@"$MGMT_VM" bash -s <<'REOF'
 set -euo pipefail
 sudo apt-get update -qq </dev/null && sudo apt-get install -y jq curl </dev/null
 
-# kubeconfig for the local apiserver (10.20.0.45:6443), readable by ubuntu -> helm/clusterctl/kubectl need no sudo
+# kubeconfig for the local apiserver (the VM's own tenant IP:6443), readable by ubuntu -> helm/clusterctl/kubectl need no sudo
 mkdir -p "$HOME/.kube"; sudo k8s config </dev/null > "$HOME/.kube/config"; chmod 600 "$HOME/.kube/config"
 
 # egress pre-check (the VM pulls charts/binaries/manifests from these)
@@ -470,7 +513,9 @@
 - Proceed to phase-07 (conductor graft).
 
 ## As-built reference (2026-06-08/09 run -- audit trail; values are run-specific)
-- VM `capi-mgmt-v2`: gp.large, ubuntu-24.04-noble; tenant IP 10.20.0.45 (ens3); FIP 10.12.7.40.
+- VM `capi-mgmt-v2`: gp.large, ubuntu-24.04-noble; tenant IP + FIP are per-rebuild (this rebuild
+  10.20.0.107 ens3 / FIP 10.12.5.103; 2026-06-08/09: 10.20.0.45 / 10.12.7.40). 6.2 persists both
+  to ~/capi-mgmt-net.env.
 - Net `capi-mgmt-net` / subnet `capi-mgmt-subnet` 10.20.0.0/24; router `capi-mgmt-router`.
 - k8s-snap: 1.32-classic/stable, rev 5326, v1.32.13 (classic confinement); CNI Cilium 1.17.12-ck0.
 - pod CIDR 10.1.0.0/16; svc CIDR 10.152.183.0/24; cluster DNS 10.152.183.31.
diff --git a/runbooks/phase-07-conductor-graft.md b/runbooks/phase-07-conductor-graft.md
index d36a162..e9e8355 100644
--- a/runbooks/phase-07-conductor-graft.md
+++ b/runbooks/phase-07-conductor-graft.md
@@ -10,9 +10,10 @@
 
 Decisions: D-031 (driver/engine/surface), D-037 (conf.d drop-in + config-dir via
 /etc/default, NOT a systemd ExecStart drop-in), D-042 (driver must be
-contract-coherent with the Layer-A core; amends D-034). D-036 (driver/engine/
-chart coherence). Troubleshooting: appendix-A DOCFIX-021, D-037, D-042, and
-lessons L-P6-1..4.
+contract-coherent with the Layer-A core; amends D-034), D-036 (driver/engine/
+chart coherence), D-046 (magnum trustee domain-setup; REQUIRED manual step -- Step 7.0),
+D-047 (keystone v3 drop-in for magnum-api -- Step 7.7b). Troubleshooting: appendix-A
+DOCFIX-021, D-037, D-042, and lessons L-P6-1..4.
 
 ---
 
@@ -20,19 +21,22 @@
 - phase-06 EXIT GATE passed: `capi-mgmt-v2` Ready, CAPI stack up (ORC `Image` CRD
   present, no crash-looping CAPO), `~/capi-mgmt.kubeconfig` (server = FIP) works
   from the jumphost.
-- Magnum charm live (`magnum/0`); the Keystone trustee domain is auto-configured by the
-  magnum charm via its keystone (identity-credentials) relation -- verify [trust]
-  (trustee_domain_id / trustee_domain_admin_id / trustee_domain_admin_password) is
-  populated in magnum.conf; no manual step.
+- Magnum charm live (`magnum/0`) and related to keystone. The charm RENDERS magnum.conf
+  `[trust]` (trustee_domain_name=magnum, trustee_domain_admin_name=magnum_domain_admin,
+  password) from the identity-credentials relation, but it does NOT create the keystone
+  domain/user those names reference -- that is the MANUAL `domain-setup` action (Step 7.0,
+  D-046). `[trust]` being populated is NOT sufficient; magnum reports "Unit is ready"
+  whether or not the domain exists, and the omission 403s every `coe` op. Step 7.0 creates
+  AND asserts the domain/user.
 - `admin-openrc` on the jumphost; `juju` (model openstack); `jq`.
 
 ## Constants and env-literals (TAG: confirm per site on rebuild)
-- `ENV(conductor-unit)` magnum/0        (LXD 1/lxd/2 on openstack1; addr 10.12.4.76)
-- `ENV(conductor-src)`  10.12.4.76/32   (the conductor's provider IP; SG source)
-- `ENV(mgmt-fip)`       10.12.7.40       (mgmt apiserver; kubeconfig server)
+- `ENV(conductor-unit)` magnum/0        (LXD 1/lxd/2 on openstack1; addr 10.12.4.76 -- confirm per site)
+- `ENV(conductor-src)`  10.12.4.76/32   (the conductor's provider IP; SG source -- confirm per site)
+- `ENV(mgmt-fip)`       per-rebuild     (mgmt apiserver / kubeconfig server; source ~/capi-mgmt-net.env from phase-06 -- this rebuild 10.12.5.103; the old 10.12.7.40 is dead -- DOCFIX-038)
 - `ENV(mgmt-sg)`        capi-mgmt-sg     (in the capi-mgmt project)
-- `ENV(project)`        capi-mgmt        (id 674171fd28d446d3a37073b6a761e910)
-- `ENV(magnum-ns)`      magnum-674171fd28d446d3a37073b6a761e910  (driver namespace per project)
+- `ENV(project)`        capi-mgmt        (resolve by name; this rebuild id d5bc125c7c1841d389b76cd0a7b0a915, domain capi)
+- `ENV(magnum-ns)`      magnum-<project-id>  (driver namespace per project; this rebuild magnum-d5bc125c7c1841d389b76cd0a7b0a915)
 - `ENV(chart-ver)`      0.25.1           (capi-helm-charts; load-bearing -- driver default is 0.10.1)
 - `ENV(helm-ver)`       v3.17.3
 
@@ -46,6 +50,38 @@
 
 ---
 
+## Step 7.0 -- Magnum trustee domain-setup (D-046; REQUIRED on every (re)deploy)
+`# RUN: jumphost`  The magnum charm action `domain-setup` is MANUAL and idempotent; magnum
+reports active/"Unit is ready" REGARDLESS of whether the trustee domain exists. If the keystone
+domain `magnum` + user `magnum_domain_admin` (referenced by magnum.conf `[trust]`) are absent,
+`magnum/common/policy.py` 401s on EVERY policy-enforced request -> every `coe` op 403s (the
+2026-06-17 incident; the 2026-06-11 redeploy omitted this and it stayed latent until the first
+coe call). Run here, AFTER magnum + identity-service are related, and BEFORE any coe call
+(Step 7.9 / phase-08). No magnum restart needed (domain_admin_auth resolves by NAME;
+trustee_domain_id is recomputed per request).
+
+Step A -- create the trustee domain (charm-native; idempotent; takes no parameters):
+```bash
+juju run magnum/leader domain-setup </dev/null
+```
+
+Step B -- ASSERT the domain + admin user exist (read-only GATE; do NOT proceed on failure):
+```bash
+( { source ~/admin-openrc
+  openstack domain show magnum -f value -c id
+  openstack user show magnum_domain_admin --domain magnum -f value -c id
+} )
+```
+
+Step C -- GATE coe (must return the conductor row, state up, NO 403):
+```bash
+( { source ~/admin-openrc; openstack coe service list; } )
+```
+GATE: Step B returns a domain id + a user id; Step C returns exactly one row
+(`magnum-conductor`, state `up`). A 403 at Step C means domain-setup did not take (re-run
+Step A) or the magnum.conf `[trust]` names differ from the created domain/user. (Benign
+"No domain/user exists" idempotency lines may appear in the action output.)
+
 ## Step 7.1 -- Authorize the conductor source on the mgmt-cluster SG
 `# RUN: jumphost` (scoped to the capi-mgmt project). Idempotent.
 
@@ -54,8 +90,10 @@
   set -u
   # scope openstack CLI to the capi-mgmt project (id form -- robust to name/domain)
   source ~/admin-openrc
+  # resolve the capi-mgmt project id while still admin-scoped, THEN narrow scope to it (id form)
+  CAPI_PID=$(openstack project show capi-mgmt --domain capi -f value -c id)   # ENV(project); resolve, never hardcode
   unset OS_PROJECT_NAME OS_PROJECT_ID OS_TENANT_NAME OS_TENANT_ID
-  export OS_PROJECT_ID=674171fd28d446d3a37073b6a761e910      # ENV(project)
+  export OS_PROJECT_ID="$CAPI_PID"
   SG=$(openstack security group show capi-mgmt-sg -f value -c id)   # ENV(mgmt-sg)
   echo "SG=$SG"
   echo "=== add ingress tcp/6443 from the conductor 10.12.4.76/32 (if absent) ==="
@@ -68,9 +106,10 @@
 ```
 Then prove conductor -> mgmt apiserver reachability:
 ```bash
-# RUN: jumphost -> magnum/0
+# RUN: jumphost -> magnum/0  (FIP from phase-06's ~/capi-mgmt-net.env -- never hardcode; DOCFIX-038)
+source ~/capi-mgmt-net.env   # MGMT_FIP
 juju ssh -m openstack magnum/0 \
-  "timeout 6 bash -c 'exec 3<>/dev/tcp/10.12.7.40/6443' && echo TCP-OK || echo TCP-FAIL" </dev/null
+  "timeout 6 bash -c 'exec 3<>/dev/tcp/$MGMT_FIP/6443' && echo TCP-OK || echo TCP-FAIL" </dev/null
 ```
 GATE: require `TCP-OK`. (Pre-existing jumphost rules tcp/22+6443 from 10.12.4.1/32 remain.)
 
@@ -149,8 +188,31 @@
   'curl -s -o /dev/null -w "pypi:%{http_code}\n" https://pypi.org/simple/ ; \
    curl -s -o /dev/null -w "helm:%{http_code}\n" https://get.helm.sh/' </dev/null
 
-# helm v3.17.3 (if not already present from a prior graft)
-juju ssh -m openstack magnum/0 'command -v helm && helm version --short || echo "helm absent -- install v3.17.3 from get.helm.sh tarball to /usr/local/bin/helm"' </dev/null
+# helm v3.17.3 -- INSTALL it (not check+echo) and put it on the CONDUCTOR's PATH (DOCFIX-035).
+# The magnum-conductor LSB-init PATH excludes /usr/local/bin, so the driver's `helm` shell-out
+# fails even when an interactive `juju ssh` shell finds it. Install the binary to /usr/local/bin
+# AND symlink /usr/bin/helm -> it (/usr/bin IS on the restricted init PATH). Checksum-verified.
+juju ssh -m openstack magnum/0 'set -e
+  WANT=v3.17.3
+  if [ -x /usr/bin/helm ] && /usr/bin/helm version --short 2>/dev/null | grep -q "$WANT"; then
+    echo "[SKIP] /usr/bin/helm already $WANT"
+  else
+    T=helm-$WANT-linux-amd64.tar.gz
+    D=$(mktemp -d); cd "$D"
+    curl -fsSLO "https://get.helm.sh/$T"
+    EXP=$(curl -fsSL "https://get.helm.sh/$T.sha256sum" | cut -d" " -f1)
+    GOT=$(sha256sum "$T" | cut -d" " -f1)
+    [ -n "$EXP" ] && [ "$EXP" = "$GOT" ] || { echo "GATE FAIL: helm checksum exp=$EXP got=$GOT"; exit 1; }
+    tar xzf "$T"
+    sudo install -o root -g root -m 0755 linux-amd64/helm /usr/local/bin/helm
+    sudo ln -sfn /usr/local/bin/helm /usr/bin/helm
+    cd /; rm -rf "$D"
+    echo "[OK] installed $(/usr/bin/helm version --short)"
+  fi' </dev/null
+
+# DOCFIX-035 GATE: helm must resolve from the conductor's RESTRICTED init PATH (no /usr/local/bin),
+# not just an interactive shell. Reproduce that PATH and confirm `helm` is found (via /usr/bin):
+juju ssh -m openstack magnum/0 'env -i PATH=/usr/sbin:/usr/bin:/sbin:/bin sh -c "command -v helm && helm version --short"' </dev/null
 
 # install the RELEASED contract-coherent driver (supersedes 1.3.0)
 juju ssh -m openstack magnum/0 'sudo python3 -m pip install --no-deps --upgrade "magnum-capi-helm==1.4.0"' </dev/null
@@ -160,7 +222,9 @@
   'pip show magnum-capi-helm | egrep "Version|Location"; \
    python3 -c "import importlib.metadata as m; print([e.name for e in m.entry_points(group=\"magnum.drivers\")])"' </dev/null
 ```
-Expect: Version 1.4.0; `k8s_capi_helm_v1` present in the entry points.
+Expect: helm reachable on the restricted PATH -- the gate prints `/usr/bin/helm` + `v3.17.3`
+(DOCFIX-035; `command -v helm` in a login shell is NOT sufficient proof). Driver Version 1.4.0;
+`k8s_capi_helm_v1` present in the entry points.
 
 ## Step 7.5 -- api_resources (D-042; set EXPLICITLY to an empty map on this cluster)
 1.4.0 exposes ONE [capi_helm] option for this -- `api_resources`, a JSON string mapping
@@ -229,14 +293,49 @@
 RESIDUAL (logged): if a future charm hook ever writes /etc/default/magnum-conductor,
 the append is lost and [capi_helm] silently stops being read -- detect via show-args/ps.
 
+## Step 7.7b -- Force keystone v3 for magnum-api via a magnum.conf.d drop-in (D-047)
+`# RUN: jumphost -> magnum/0`  The charm renders `auth_version = v2.0` in magnum.conf
+`[keystone_authtoken]`/`[keystone_auth]` (a template type-compare bug; Caracal keystone does
+not serve v2.0). On THIS deploy it is COSMETIC -- magnum's domain_admin_auth rewrites v2.0->v3
+and token validation worked throughout -- but v2.0 is the provably wrong value, so override it
+with a drop-in (D-047). Same config-dir mechanism as Step 7.7, but for the magnum-API service:
+Step 7.7 wired `--config-dir` only for the conductor, and oslo.config reads `--config-dir` AFTER
+`--config-file`, so the drop-in wins. v3 URLs are DERIVED from the live `[keystone_authtoken]`
+(no hardcoded VIPs). No restart here -- Step 7.8 restarts both services.
+```bash
+juju ssh -m openstack magnum/0 sudo bash -s <<'REOF'
+set -e
+# (1) wire --config-dir into magnum-api (mirror Step 7.7's conductor wiring; idempotent)
+grep -q -- '--config-dir /etc/magnum/magnum.conf.d' /etc/default/magnum-api 2>/dev/null \
+  || echo 'DAEMON_ARGS="$DAEMON_ARGS --config-dir /etc/magnum/magnum.conf.d"' >> /etc/default/magnum-api
+chmod 0644 /etc/default/magnum-api
+# (2) derive v3 URLs from the live [keystone_authtoken] block; write the override drop-in
+WWW=$(awk -F'= ' '/^\[/{s=($0 ~ /^\[keystone_authtoken\]/)} s&&/^www_authenticate_uri/{print $2; exit}' /etc/magnum/magnum.conf)
+AURL=$(awk -F'= ' '/^\[/{s=($0 ~ /^\[keystone_authtoken\]/)} s&&/^auth_url/{print $2; exit}' /etc/magnum/magnum.conf)
+WWW3=${WWW/\/v2.0//v3};   case "$WWW3"  in */v3) ;; *) WWW3="${WWW3%/}/v3";;  esac
+AURL3=${AURL/\/v2.0//v3}; case "$AURL3" in */v3) ;; *) AURL3="${AURL3%/}/v3";; esac
+printf '[keystone_authtoken]\nauth_version = v3\nwww_authenticate_uri = %s\nauth_url = %s\n[keystone_auth]\nauth_version = v3\nwww_authenticate_uri = %s\nauth_url = %s\n' \
+  "$WWW3" "$AURL3" "$WWW3" "$AURL3" > /etc/magnum/magnum.conf.d/50-keystone-v3-override.conf
+chmod 0644 /etc/magnum/magnum.conf.d/50-keystone-v3-override.conf
+echo "[OK] 50-keystone-v3-override.conf:"; cat /etc/magnum/magnum.conf.d/50-keystone-v3-override.conf
+REOF
+```
+GATE: the drop-in lists `auth_version = v3` + `/v3` URLs in BOTH sections, and
+`grep -- --config-dir /etc/default/magnum-api` returns the line. The effective value is
+proven in Step 7.8 by the magnum-api launched cmdline carrying `--config-dir` (L-P6-1/2:
+gate on the assembled cmdline, not the file text). Restart happens in Step 7.8.
+
 ## Step 7.8 -- Restart conductor + verify driver + HEALTHY (P6e + D-042 Stage 6)
 `# RUN: jumphost -> magnum/0`, then jumphost health poll.
 
 ```bash
 juju ssh -m openstack magnum/0 \
-  'sudo systemctl restart magnum-conductor && sleep 3 && systemctl is-active magnum-conductor && \
-   ps -ww -C magnum-conductor -o args=' </dev/null
-# expect: active; live cmdline carries --config-dir.
+  'sudo systemctl restart magnum-conductor magnum-api && sleep 3 && \
+   systemctl is-active magnum-conductor magnum-api && \
+   echo "--- conductor args ---"; ps -ww -C magnum-conductor -o args=; \
+   echo "--- magnum-api args (D-047: must carry --config-dir) ---"; ps -ww -C magnum-api -o args=' </dev/null
+# expect: both active; BOTH cmdlines carry --config-dir /etc/magnum/magnum.conf.d
+# (conductor -> [capi_helm] driver config; magnum-api -> the v3 keystone override).
 
 juju ssh -m openstack magnum/0 'sudo magnum-driver-manage list-drivers 2>/dev/null | grep capi || \
    echo "driver list (full):"; sudo magnum-driver-manage list-drivers' </dev/null
@@ -252,8 +351,9 @@
 ```bash
 ( {
   source ~/admin-openrc
+  CAPI_PID=$(openstack project show capi-mgmt --domain capi -f value -c id)   # ENV(project); resolve, never hardcode
   unset OS_PROJECT_NAME OS_PROJECT_ID OS_TENANT_NAME OS_TENANT_ID
-  export OS_PROJECT_ID=674171fd28d446d3a37073b6a761e910       # ENV(project)
+  export OS_PROJECT_ID="$CAPI_PID"
   for i in $(seq 1 10); do
     echo "[$i] health=$(openstack coe cluster show capi-test-1 -f value -c health_status 2>/dev/null)"
     echo "    reason=$(openstack coe cluster show capi-test-1 -f value -c health_status_reason 2>/dev/null)"
@@ -269,12 +369,12 @@
 ## Step 7.9 -- Regression check (confirm create/manage path intact)
 `# RUN: jumphost` (capi-mgmt scope). Prove the upgraded driver still creates+deletes.
 
-FRESH DEPLOY ROUTING: SKIP this step -- the `capi-k8s-v1-32` template does not exist
+FRESH DEPLOY ROUTING: SKIP this step -- the `capi-k8s-v1-34` template does not exist
 yet (phase-08 step 8.0 creates it), and phase-08 itself (create `capi-test-1` to
 CREATE_COMPLETE, full acceptance, then 8.5 delete) is a superset of this check. Run
 7.9 as written only when grafting onto an existing cloud where the template is present.
 ```bash
-openstack coe cluster create capi-fix-check --cluster-template capi-k8s-v1-32 \
+openstack coe cluster create capi-fix-check --cluster-template capi-k8s-v1-34 \
   --keypair capi-mgmt-key --master-count 1 --node-count 1
 # watch to CREATE_COMPLETE, then:
 openstack coe cluster delete capi-fix-check    # watch to gone
@@ -310,13 +410,13 @@
   DEB magnum 18.0.1, python3.10, container ubuntu 22.04; conductor user `magnum`.
 - As-FIRST-built driver: 1.3.0 (pip --no-deps) -> read the version-less v1beta2 ref -> health UNHEALTHY (D-042).
   PHASE-07 BASELINE supersedes this with the RELEASED magnum-capi-helm==1.4.0 (api_resources; default v1beta1).
-- kubeconfig: /etc/magnum/kubeconfig, -rw------- magnum, ~5657 bytes, server = FIP 10.12.7.40:6443.
+- kubeconfig: /etc/magnum/kubeconfig, -rw------- magnum, ~5657 bytes, server = the mgmt FIP:6443 (per-rebuild; this rebuild 10.12.5.103, old 10.12.7.40 dead).
 - conf.d drop-in /etc/magnum/magnum.conf.d/00-capi-helm.conf: kubeconfig_file, helm_chart_repo
   (azimuth), helm_chart_name openstack-cluster, default_helm_chart_version 0.25.1 (api_resources
   left default -- v1beta1 served by CAPI v1.13.2 / CAPO v0.14.4).
 - config-dir injection: /etc/default/magnum-conductor `DAEMON_ARGS="$DAEMON_ARGS --config-dir
   /etc/magnum/magnum.conf.d"`; verified live via `ps` and the init script `show-args`.
-- helm v3.17.3 at /usr/local/bin/helm.
+- helm v3.17.3 at /usr/local/bin/helm + /usr/bin/helm symlink (DOCFIX-035: on the conductor's restricted init PATH).
 - Driver internals (reference, from installed source): routes on (server_type vm, os ubuntu,
   coe kubernetes); k8s version comes from the IMAGE `kube_version` property (NOT a template label),
   os_distro=ubuntu; flavor floor 2048 MB / 2 vCPU; auto-mints an app credential (workload nodes use
@@ -324,5 +424,5 @@
 
 ## Next
 phase-08 -- workload-cluster acceptance: create a tenant cluster from template
-`capi-k8s-v1-32`, confirm CREATE_COMPLETE + Ready nodes + Calico + LB, and run the
+`capi-k8s-v1-34`, confirm CREATE_COMPLETE + Ready nodes + Calico + LB, and run the
 D-011 (amended per D-019) acceptance criteria.
diff --git a/runbooks/phase-08-workload-cluster-acceptance.md b/runbooks/phase-08-workload-cluster-acceptance.md
index 5ac8a0b..5976ce4 100644
--- a/runbooks/phase-08-workload-cluster-acceptance.md
+++ b/runbooks/phase-08-workload-cluster-acceptance.md
@@ -1,7 +1,7 @@
 # Phase 08 -- Workload-Cluster Acceptance (D-011)
 
 Prove tenant self-service Kubernetes end to end: create a workload cluster from
-the `capi-k8s-v1-32` template, confirm it converges (Ready nodes, CNI, CCM/CSI,
+the `capi-k8s-v1-34` template, confirm it converges (Ready nodes, CNI, CCM/CSI,
 API LB), then run the D-011 acceptance bar. Passing D-011 is the gate that unlocks
 the project-completion tasks.
 
@@ -25,12 +25,12 @@
   (8.2 health gate; 8.1-8.5 create path). On an existing-cluster graft, `health_status`
   already reports HEALTHY (if the phase-07 1.4.0 upgrade was skipped, expect the COSMETIC
   UNHEALTHY of D-042 -- functional, but not an acceptance pass).
-- Image `ubuntu-jammy-kube-v1.32.13` present AND carrying Glance properties
-  (8.0 below verifies, and on a fresh deploy imports it from the jumphost-staged qcow2)
-  `kube_version` (e.g. v1.32.13) and `os_distro=ubuntu`. The driver reads the k8s
+- Image `ubuntu-jammy-kube-v1.34.8` present AND carrying the Glance properties `kube_version`
+  (v1.34.8) and `os_distro=ubuntu` (8.0 below verifies them; on a fresh deploy it stages and
+  verifies the image from the azimuth CDN -- FINDING-3). The driver reads the k8s
   version from the IMAGE, not a template label (P6-CONTRACT / L-P6-3); a missing
-  property fails create.
-- Cluster template `capi-k8s-v1-32` present (8.0 verifies/creates it).
+  property fails create. (D1: bumped from EOL v1.32.13 to v1.34.8, within CAPI v1.13.2 support.)
+- Cluster template `capi-k8s-v1-34` present (8.0 verifies/creates it).
 - D-039: the Magnum service path mints app-creds carrying `load-balancer_member`
   (+ member, reader). A frozen pre-D-039 app-cred 403s on the Octavia LB step and
   wedges create/delete (appendix-A: stuck-delete).
@@ -39,29 +39,31 @@
   hyperconverged hosts and OOM-kills guests.
 
 ## Constants and env-literals (TAG: confirm per site / run on rebuild)
-- `ENV(project)`       capi-mgmt    (id 674171fd28d446d3a37073b6a761e910)
+- `ENV(project)`       capi-mgmt    (resolve by name; this rebuild id d5bc125c7c1841d389b76cd0a7b0a915, domain capi)
 - `ENV(admin-project)` admin        (id 65ce73e6798e4d1e8dd066609b7033ef)
-- `ENV(template)`      capi-k8s-v1-32   (uuid e2549d8b-4b89-4947-8b9a-0f4fdbe87d59)
-- `ENV(image)`         ubuntu-jammy-kube-v1.32.13 (id de69c243-bd1f-4182-8e9e-33933e926857)
-- `ENV(ext-net)`       provider-ext (id 70b34bb2-3afb-4b43-96d3-f520dbcbf9a8)
+- `ENV(template)`      capi-k8s-v1-34   (D1; uuid regenerates per rebuild -- resolve by name)
+- `ENV(image)`         ubuntu-jammy-kube-v1.34.8 (D1; kube_version v1.34.8; id regenerates -- resolve by name)
+- `ENV(ext-net)`       provider-ext (resolve by name; this rebuild id 0d00ddc1-d2bf-4849-a087-14c07d77f167)
 - `ENV(keypair)`       capi-mgmt-key
 - `ENV(cluster)`       capi-test-1
 - `ENV(workload-cidr)` 10.20.16.0/24
 - `ENV(flavors)`       master gp.mid (8192/2) ; worker capi.node (4096/2)
 - run-specific (do NOT hardcode -- capture at run): API LB id, LB VIP (10.20.16.x),
-  workload API FIP (10.12.7.180 on the as-built run).
+  workload API FIP (10.12.7.180 on the 2026-06-09 as-built run; per-rebuild).
 
 ## Scope-hygiene preambles (the project-scope leak guard)
-Capi-mgmt-scoped (cluster CRUD, show, config):
+Capi-mgmt-scoped (cluster CRUD, show, config). DOCFIX-034: resolve the capi-mgmt project id
+dynamically while admin-scoped, THEN narrow to it -- never hardcode (it regenerates per rebuild):
 ```bash
 source ~/admin-openrc
+CAPI_PID=$(openstack project show capi-mgmt --domain capi -f value -c id)   # ENV(project)
 unset OS_PROJECT_NAME OS_PROJECT_ID OS_TENANT_NAME OS_TENANT_ID OS_PROJECT_DOMAIN_ID OS_PROJECT_DOMAIN_NAME
-export OS_PROJECT_ID=674171fd28d446d3a37073b6a761e910      # ENV(project)
+export OS_PROJECT_ID="$CAPI_PID"
 ```
 Admin-scoped (LB amphora/failover -- these 403 under tenant member scope):
 ```bash
 source ~/admin-openrc
-unset OS_PROJECT_ID OS_TENANT_ID OS_TENANT_NAME            # token -> admin 65ce73e6...
+unset OS_PROJECT_ID OS_TENANT_ID OS_TENANT_NAME            # token -> admin (the admin-openrc project)
 ```
 
 ---
@@ -76,60 +78,77 @@
 ( {
   set -u
   echo "=== image present + carries kube_version / os_distro ==="
-  openstack image show ubuntu-jammy-kube-v1.32.13 -f json \
+  openstack image show ubuntu-jammy-kube-v1.34.8 -f json \
     | python3 -c 'import json,sys;d=json.load(sys.stdin);p=d.get("properties",d);print("kube_version=",d.get("kube_version") or p.get("kube_version"));print("os_distro=",d.get("os_distro") or p.get("os_distro"))'
   echo "=== reserved-host-memory (D-040) on a compute unit ==="
   juju ssh nova-compute/0 'sudo grep -i reserved_host_memory /etc/nova/nova.conf' </dev/null   # expect 8192
   echo "=== template present? ==="
-  openstack coe cluster template show capi-k8s-v1-32 -f value -c uuid 2>/dev/null \
+  openstack coe cluster template show capi-k8s-v1-34 -f value -c uuid 2>/dev/null \
     && echo "template OK" || echo "template ABSENT -- create it below"
 } )
 ```
-If the image is ABSENT (fresh deploy -- nothing survives teardown), import it from
-the jumphost-staged qcow2. The command is the VERBATIM 2026-06-08 as-executed path
-(glance-direct; plain web-download 403s on this cloud). With the hardened bundle's
-glance `image-conversion: true` the stored disk_format lands `raw` on the redeploy
-(expected -- the as-built run stored qcow2 because conversion was off then):
+If the image is ABSENT (fresh deploy -- nothing survives teardown), seed it by
+STAGE-AND-VERIFY (FINDING-3 -- REQUIRED, not merely preferred, for azimuth kube images).
+Glance's web-download plugin fetches with urllib (User-Agent `Python-urllib/3.x`), and the
+azimuth CDN returns HTTP 403 to that UA, so a web-download import is 202-accepted and then
+hangs in `queued` forever. curl sends a different UA and is NOT blocked. So: curl the qcow2
+to the jumphost ($HOME -- snap-readable, NOT /tmp, L7), verify its sha512 against the
+azimuth-images 0.28.0 manifest, then run `openstack image create --file <qcow2> --import`.
+That form is client-safe: the openstack snap HAS `image create --import` (= glance-direct;
+image-conversion then lands the image `raw`), it does NOT have standalone `image stage` /
+`image import` subcommands, and the standalone `glance` client is not assumed present.
 ```bash
 ( {
   set -u
   source ~/admin-openrc
-  if openstack image show ubuntu-jammy-kube-v1.32.13 >/dev/null 2>&1; then
-    echo "[SKIP] image ubuntu-jammy-kube-v1.32.13 present"
+  IMG_NAME=ubuntu-jammy-kube-v1.34.8                                  # ENV(image)
+  KUBE_VER=v1.34.8                                                    # driver reads this from the image, not a label
+  if openstack image show "$IMG_NAME" >/dev/null 2>&1; then
+    echo "[SKIP] image $IMG_NAME present"
   else
-    SRC="$HOME/ubuntu-jammy-kube-v1.32.13-260401-2014.qcow2"
-    [ -f "$SRC" ] || { echo "ABORT: $SRC missing on the jumphost (azimuth-images source; see appendix-B)"; exit 1; }
-    glance image-create-via-import \
-      --import-method glance-direct \
-      --file "$SRC" \
+    # azimuth-images 0.28.0 manifest (build 260518-1604) -- re-confirm vs manifest.json on any bump:
+    URL="https://azimuth-images.stackhpc.cloud/ubuntu-jammy-kube-v1.34.8-260518-1604.qcow2"
+    SHA512_EXP="7efde4857c9f9da045a98d71def30e229b3d7fffd8a5680e8aee0c5a8b13ba73fca3cf758a927230a1fbe3c451d8d21cfaeded96091e2a4f313c6a404760bdb3"
+    SRC="$HOME/ubuntu-jammy-kube-v1.34.8-260518-1604.qcow2"
+    if [ -f "$SRC" ] && [ "$(sha512sum "$SRC" | cut -d' ' -f1)" = "$SHA512_EXP" ]; then
+      echo "[OK] staged image present + sha512-valid; skipping download"
+    else
+      echo "[..] curl the qcow2 to $SRC (curl UA passes the CDN; glance urllib UA 403s -- FINDING-3)"
+      curl -fSL -o "$SRC" "$URL"
+      GOT=$(sha512sum "$SRC" | cut -d' ' -f1)
+      [ "$SHA512_EXP" = "$GOT" ] || { echo "GATE FAIL: sha512 mismatch exp=$SHA512_EXP got=$GOT"; exit 1; }
+      echo "[OK] sha512 verified against the azimuth-images 0.28.0 manifest"
+    fi
+    # CORRECTION-1: a plain --file (no --import) PUT stores qcow2 (boots fine); --import runs
+    # glance-direct + image-conversion -> raw (Ceph fast-clone alignment), so use --import here.
+    openstack image create "$IMG_NAME" \
+      --file "$SRC" --import \
       --container-format bare --disk-format qcow2 \
-      --property os_distro=ubuntu --property kube_version=v1.32.13 \
-      --name ubuntu-jammy-kube-v1.32.13
+      --property os_distro=ubuntu --property kube_version="$KUBE_VER"
   fi
-  echo "=== poll to active (3.7G stage + conversion; allow ~10 min) ==="
+  echo "=== poll to active (multi-GB stage + conversion; allow ~10 min) ==="
   for i in $(seq 1 40); do
-    ST=$(openstack image show ubuntu-jammy-kube-v1.32.13 -f value -c status 2>/dev/null || echo '?')
+    ST=$(openstack image show "$IMG_NAME" -f value -c status 2>/dev/null || echo '?')
     echo "[$i] status=$ST"
     [ "$ST" = active ] && break
     sleep 15
   done
 } )
 ```
-GATE: image `active` and the 8.0 property check above passes (kube_version
-v1.32.13 / os_distro ubuntu). Then create the template only if absent (spec from
-the as-built capture; the two labels
-are intentionally the whole config -- chart 0.25.1 + the conf.d drop-in govern the
-rest). `--network-driver` is OMITTED deliberately: under the 1.4.0 driver the option
-IS honored (it maps to the chart `network_driver`), so to keep the as-built chart
-default (Calico) we leave it unset. Setting `flannel` here would now switch the CNI --
-do that only if Calico is being intentionally replaced (appendix-A: CNI-label / 1.4.0).
+GATE: image `active` and the 8.0 property check above passes (kube_version v1.34.8 /
+os_distro ubuntu). Then create the template only if absent. DOCFIX-032: pin
+`--network-driver calico` EXPLICITLY. Under the 1.4.0 driver `--network-driver` maps to the
+chart `network_driver`, and chart 0.25.1 ships ONLY Calico (flannel is not packaged) -- an
+explicit `calico` documents intent and removes reliance on the default staying Calico. Do NOT
+set `flannel`: it is unsupported by chart 0.25.1 and would fail to converge.
 ```bash
-openstack coe cluster template create capi-k8s-v1-32 \
+openstack coe cluster template create capi-k8s-v1-34 \
   --coe kubernetes --server-type vm \
-  --image ubuntu-jammy-kube-v1.32.13 \
+  --image ubuntu-jammy-kube-v1.34.8 \
   --external-network provider-ext \
   --master-flavor gp.mid --flavor capi.node \
   --master-lb-enabled --floating-ip-enabled \
+  --network-driver calico \
   --dns-nameserver 8.8.8.8 \
   --docker-storage-driver overlay2 \
   --labels fixed_subnet_cidr=10.20.16.0/24,octavia_provider=amphora
@@ -142,7 +161,7 @@
 
 ```bash
 openstack coe cluster create capi-test-1 \
-  --cluster-template capi-k8s-v1-32 \
+  --cluster-template capi-k8s-v1-34 \
   --keypair capi-mgmt-key \
   --master-count 1 --node-count 2
 openstack coe cluster show capi-test-1 -f value -c uuid -c status
@@ -170,17 +189,19 @@
 `# RUN: jumphost`. Pull the cluster's kubeconfig via Magnum, then inspect.
 ```bash
 # capi-mgmt scope
+mkdir -p ~/capi-test-1                                   # DOCFIX-037: `coe cluster config --dir` does NOT create the dir
 openstack coe cluster config capi-test-1 --dir ~/capi-test-1 --force
 export KUBECONFIG=~/capi-test-1/config
-# LIVE-REVIEW: confirm `coe cluster config` returns a usable kubeconfig under the
-#   capi-helm driver; alternative is the CAPI kubeconfig secret on the mgmt cluster:
-#   KUBECONFIG=~/capi-mgmt.kubeconfig clusterctl -n <magnum-ns> get kubeconfig <cluster-name-suffix>
+# confirmed: `coe cluster config` returns a usable kubeconfig under the capi-helm driver.
+# Alternative (CAPI kubeconfig secret on the mgmt cluster), magnum-ns resolved dynamically:
+#   NS=magnum-$(openstack project show capi-mgmt --domain capi -f value -c id)
+#   KUBECONFIG=~/capi-mgmt.kubeconfig clusterctl -n "$NS" get kubeconfig <cluster-name-suffix>
 
 ( {
   export KUBECONFIG=~/capi-test-1/config
-  echo "=== nodes (expect 3 Ready, v1.32.13: 1 control-plane + 2 workers) ==="
+  echo "=== nodes (expect 3 Ready, v1.34.8: 1 control-plane + 2 workers) ==="
   kubectl get nodes -o wide
-  echo "=== CNI = Calico (chart default; --network-driver omitted) ==="
+  echo "=== CNI = Calico (DOCFIX-032: --network-driver calico pinned on the template) ==="
   kubectl -n kube-system get pods | grep -Ei 'calico|tigera' || kubectl get pods -A | grep -Ei 'calico|tigera'
   echo "=== CCM (OpenStack cloud-controller-manager) + Cinder CSI + CoreDNS Running ==="
   kubectl get pods -A | grep -Ei 'cloud-controller|openstack-cloud|cinder-csi|coredns'
@@ -201,26 +222,42 @@
   `juju status --format=short | grep -vE 'active|idle' || echo "all active/idle"`
   Pass: nothing but active/idle (phase-03 re-confirmed here).
 
-- **D-011.2 -- API reachability from the jumphost (all public VIPs).** `# RUN: jumphost`
-  IP-only: hit each service VIP, e.g. Keystone:
+- **D-011.2 -- API reachability from the jumphost (CORE service VIPs).** `# RUN: jumphost`
+  IP-only: hit each CORE service VIP, e.g. Keystone:
   `curl -sk https://10.12.4.50:5000/v3 -o /dev/null -w '%{http_code}\n'` (expect 200/300).
-  Repeat per public VIP (.50-.60 block). Pass: all respond.
+  Repeat per core public VIP (.50-.60 block: keystone .50, barbican .51, cinder .52, glance .53,
+  magnum .54, neutron .55, nova .56, octavia .57, horizon .58/.60, placement .59). DOCFIX-039:
+  product-streams / glance-simplestreams (gss) is NOT a core API VIP -- it registers a unit-IP
+  HTTP endpoint (this rebuild 10.12.8.196) with NO jumphost route to the container space, so it is
+  EXPECTED unreachable from the jumphost and is OUT OF SCOPE for D-011.2. Pass: all core VIPs respond.
 
-- **D-011.3 -- API reachability from a tenant VM (Option B).** `# RUN: mgmt VM`
-  The generalized phase-06 GATE 1: a tenant VM reaches the provider VIP.
-  `ssh ... ubuntu@10.12.7.40 "timeout 6 bash -c 'exec 3<>/dev/tcp/10.12.4.50/5000' && echo VIP-OK || echo VIP-FAIL" </dev/null`
+- **D-011.3 -- API reachability from a tenant VM (Option B).** `# RUN: jumphost -> mgmt VM`
+  The generalized phase-06 GATE 1: a tenant VM reaches the provider VIP. DOCFIX-038: the mgmt
+  FIP is per-rebuild -- source it (never hardcode the dead 10.12.7.40):
+  `source ~/capi-mgmt-net.env`
+  `ssh ... ubuntu@"$MGMT_FIP" "timeout 6 bash -c 'exec 3<>/dev/tcp/10.12.4.50/5000' && echo VIP-OK || echo VIP-FAIL" </dev/null`
   Pass: VIP-OK (proves the shared-L2 Option B path).
 
 - **D-011.4 -- Octavia LB pattern re-passes (round-robin, failover, recovery).**
-  Round-robin: 2-member pool behind a VIP, repeated curls hit both members.
-  Recovery (admin scope): `openstack loadbalancer failover <api-lb-id>` -> watch
-  ERROR/PENDING_UPDATE -> ACTIVE (~100s; single STANDALONE amphora -> brief blip;
-  operating_status holds ONLINE). (appendix-A: LB-failover; amphora ops are
-  admin-scope only.) Pass: round-robin distributes; failover returns to ACTIVE.
-  TODO (before sign-off): this runbook does NOT yet contain the build steps for the
-  standalone 2-member round-robin pool (LB + listener + pool + 2 backend members +
-  health monitor). Add them here, or fold the round-robin check into the
-  workload-cluster API LB the driver already builds, before D-011.4 is marked complete.
+  DOCFIX-040 -- do NOT hand-build a standalone LB/listener/pool/members. Exercise round-robin via
+  a THROWAWAY Kubernetes `Service type=LoadBalancer` on the workload cluster: the OpenStack CCM
+  provisions an Octavia LB + pool + members for it automatically (the Roosevelt-real path -- tenant
+  workloads get LBs exactly this way), then tear it down. `# RUN: jumphost, KUBECONFIG=~/capi-test-1/config`
+  ```bash
+  export KUBECONFIG=~/capi-test-1/config
+  kubectl create deploy rr --image=registry.k8s.io/e2e-test-images/agnhost:2.40 --replicas=2 -- /agnhost netexec --http-port=8080
+  kubectl expose deploy rr --port=80 --target-port=8080 --type=LoadBalancer
+  kubectl get svc rr -w        # Ctrl-C once EXTERNAL-IP is assigned (CCM builds the Octavia LB + FIP)
+  EXT=$(kubectl get svc rr -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
+  for i in $(seq 1 10); do curl -s "http://$EXT/hostname"; echo; done   # expect BOTH pod names (round-robin)
+  kubectl delete svc rr; kubectl delete deploy rr                       # tears down the Octavia LB
+  ```
+  Failover/recovery (admin scope -- against the workload-cluster API LB): `openstack loadbalancer
+  failover <api-lb-id>` -> watch ERROR/PENDING_UPDATE -> ACTIVE (~100s; single STANDALONE amphora
+  -> brief blip; operating_status holds ONLINE). STANDALONE failover needs N+1 amphora placement
+  headroom (it builds the replacement BEFORE reaping the old -- a cloud at its scheduler ceiling
+  cannot self-heal its LBs; Roosevelt sizing implication). (appendix-A: LB-failover; amphora ops
+  are admin-scope only.) Pass: round-robin distributes across both members; failover returns to ACTIVE.
 
 - **D-011.5 -- End-to-end Magnum CAPI cluster create, CCM not crash-looping.**
   Satisfied by 8.1-8.3 (CREATE_COMPLETE + CCM Running). Pass = that gate.
@@ -251,8 +288,8 @@
 frozen app-cred): clear the OpenStackCluster finalizer (the Cluster auto-follows),
 then manual neutron cleanup in dependency order -- appendix-A: stuck-delete.
 ```bash
-# NS=magnum-674171fd28d446d3a37073b6a761e910
-# KUBECONFIG=~/capi-mgmt.kubeconfig kubectl -n $NS patch openstackcluster <cluster>-<suffix> \
+# NS=magnum-$(openstack project show capi-mgmt --domain capi -f value -c id)   # resolve; never hardcode
+# KUBECONFIG=~/capi-mgmt.kubeconfig kubectl -n "$NS" patch openstackcluster <cluster>-<suffix> \
 #   --type=merge -p '{"metadata":{"finalizers":[]}}'
 # then: openstack router remove subnet / router unset external-gateway / router delete /
 #       subnet delete / network delete / security group delete  (dependency order)
@@ -262,13 +299,21 @@
 
 ## EXIT GATE (phase-08 / v1 acceptance)
 - 8.1-8.3 passed: capi-test-1 CREATE_COMPLETE, 3 Ready nodes, Calico, CCM/CSI/CoreDNS, API LB ACTIVE/ONLINE.
-- D-011 items 1-7 PASS; item 8 deferred (D-019).
-- health_status HEALTHY (phase-07 driver).
-- => v1 deployment is ACCEPTED. Project-completion tasks unlocked:
-  consolidate the do-doc runbooks into docs/v1-deploy-runbook.md; revert the
-  GitBucket repo OpenStack/openstack-caracal-ipv4 to PRIVATE.
+- D-011 items 1-6 PASS; item 7 (KVM snapshot baseline) OUTSTANDING -- it is the last gate before
+  the accept-gate formally closes (D-012; dedicated pass); item 8 deferred (D-019).
+- health_status HEALTHY (phase-07 1.4.0 driver clears the D-042 cosmetic UNHEALTHY).
+- ACCEPTANCE SUMMARY (this rebuild): .1 charms PASS; .2 core VIPs PASS; .3 tenant->VIP PASS;
+  .4 Octavia round-robin + admin-scope failover PASS; .5 E2E CAPI create PASS; .6 vault manual
+  unseal PASS; .7 snapshot DEFERRED (operator); .8 Designate DEFERRED (D-019). => v1 is
+  FUNCTIONALLY ACCEPTED; the .7 snapshot baseline is the only item left to formally close the gate.
+- => Project-completion tasks unlocked: consolidate the per-phase runbooks into
+  docs/v1-deploy-runbook.md; revert the GitBucket repo OpenStack/openstack-caracal-ipv4 to PRIVATE.
 
-## As-built reference (capi-test-1, suffix kgwwe7c4qj6a, 2026-06-09)
+## As-built reference (capi-test-1, suffix kgwwe7c4qj6a, 2026-06-09 -- PRE-D1 v1.32.13 capture)
+- D1 NOTE: the procedure above now targets capi-k8s-v1-34 / ubuntu-jammy-kube-v1.34.8. This
+  capture is the 2026-06-09 v1.32.13 run (the D-011 acceptance ran on v1.32.13); re-validation on
+  v1.34.8 follows the stage-and-verify seed (8.0). A later D-039-era recreate carried CAPI suffix
+  qmyxu2xcsghz (CREATE_COMPLETE, HEALTHY).
 - create: `--master-count 1 --node-count 2`; uuid 6de15cf4-8805-4ac2-b413-8de2c48d92cf.
 - nodes: control-plane (xsc62) + 2 workers; v1.32.13; Calico CNI.
 - API LB id 0f968008-8429-4ac3-8b82-452e126982cf, VIP 10.20.16.144, FIP 10.12.7.180,
diff --git a/runbooks/v1-ops-capi-recovery-procedure-20260610.md b/runbooks/v1-ops-capi-recovery-procedure-20260610.md
deleted file mode 100644
index 01e6886..0000000
--- a/runbooks/v1-ops-capi-recovery-procedure-20260610.md
+++ /dev/null
@@ -1,241 +0,0 @@
-# v1 ops -- CAPI/Magnum stack recovery procedure (parking, restart, LB repair)
-
-Status: blocks below are AS-EXECUTED-VERIFIED 2026-06-10 (this is their first
-formal consolidation). Destination: runbooks/ as an ops companion to the
-phase-NN deploy runbook, cross-referenced from appendix-A and from
-OpenStack_Test_Deployment-restart-procedure.md.
-
-Applies when: capi-mgmt-v2 has been stopped (parking, host event, OOM) and the
-CAPI/Magnum stack must be returned to service. ORDER MATTERS: repair from the
-bottom up (VM -> k8s -> CAPI controllers -> Octavia LB -> CAPO conditions ->
-Magnum health). Everything upstream stays red until the layer below is green.
-
-Scope-hygiene preambles are the canonical ones from the 2026-06-09 as-executed
-log. ENV literals: project capi-mgmt 674171fd28d446d3a37073b6a761e910; mgmt FIP
-10.12.7.40; kube-api LB 0f968008-...; regenerate per site on rebuild.
-
----
-
-## 0. Expectations table (read FIRST; saves an hour of false alarms)
-
-| Observation | Meaning |
-|---|---|
-| Magnum UNHEALTHY, reason EMPTY | Conductor cannot reach the mgmt API (VM down / booting). Not D-042. |
-| Magnum UNHEALTHY, reason populated, all components 'Ready', infrastructure 'Infrastructure resource not found.' | D-042 cosmetic false-negative. Known good. |
-| Horizon Container Infra 504 right after mgmt VM start | Conductor stalled mid-reconnect; nginx proxy timeout. Retry after Step 3. |
-| k8sd control.socket deadline / apiserver TLS handshake timeout / mount failures during first ~20 min after boot | Cold-start convergence noise on gp.mid (2 vCPU). Judge by load trend + `k8s status`, not by these. |
-| Cluster Available=False with InfrastructureReady LB-timeout message after a cold start | CAPO reconcile raced the storm. Check the LB (Step 4) BEFORE blaming CAPI. |
-| LB provisioning ERROR, operating ONLINE | Control-plane op failed; dataplane fine. Needs admin failover (Step 5). No urgency. |
-| openstack server list empty in Horizon/CLI | Wrong project scope. CAPI VMs live in capi-mgmt. |
-| juju ssh: "cannot get discharge ... EOF" | Stale macaroon + `</dev/null` ate the password prompt. Use `</dev/tty` or re-login. NOT a controller outage if `juju status` works interactively. |
-
-## 1. Parking (deliberate stop) -- forward procedure
-
-```
-------------------------------------------------------------------------
-BEGIN runbook block: capi-mgmt parking (pre-maintenance / pre-teardown)
-------------------------------------------------------------------------
-# capi-mgmt scope
-source ~/admin-openrc
-unset OS_PROJECT_ID OS_TENANT_ID OS_TENANT_NAME OS_PROJECT_DOMAIN_ID
-export OS_PROJECT_ID=674171fd28d446d3a37073b6a761e910
-unset OS_PROJECT_NAME OS_PROJECT_DOMAIN_NAME OS_TENANT_NAME OS_TENANT_ID
-openstack server stop capi-mgmt-v2
-# NOTE: Nova ACPI stop does NOT produce a clean guest shutdown on this VM
-# (no wtmp shutdown entry; verified 2026-06-10). Accepted for this VM class.
-# If filing jumphost secrets, record the destination IN THIS LOG, e.g.:
-#   ~/sweep-YYYYMMDD/secrets/{capi-mgmt.kubeconfig, capi-test-1-kc/config}
-# EXPECT while parked: Magnum UNHEALTHY with EMPTY reason; Container Infra
-# panel may 504; workload cluster keeps running (no runtime dependency).
-------------------------------------------------------------------------
-END runbook block
-------------------------------------------------------------------------
-```
-
-## 2. Start + boot gate
-
-```
-------------------------------------------------------------------------
-BEGIN runbook block: capi-mgmt-v2 start + ssh-port gate (D-041 manual start)
-------------------------------------------------------------------------
-( {
-  source ~/admin-openrc
-  unset OS_PROJECT_ID OS_TENANT_ID OS_TENANT_NAME OS_PROJECT_DOMAIN_ID
-  export OS_PROJECT_ID=674171fd28d446d3a37073b6a761e910
-  unset OS_PROJECT_NAME OS_PROJECT_DOMAIN_NAME OS_TENANT_NAME OS_TENANT_ID
-  openstack server start capi-mgmt-v2
-  for i in $(seq 1 20); do
-    ST=$(openstack server show capi-mgmt-v2 -f value -c status 2>/dev/null)
-    echo "[$i] status=$ST"
-    [ "$ST" = ACTIVE ] && break
-    sleep 10
-  done
-  echo "=== TCP probe loop: FIP :22 (sshd lags ACTIVE by ~3 min) ==="
-  for i in $(seq 1 18); do
-    timeout 5 bash -c 'exec 3<>/dev/tcp/10.12.7.40/22' 2>/dev/null \
-      && { echo "[$i] SSH-PORT-OK"; break; } || echo "[$i] not yet"
-    sleep 10
-  done
-} )
-------------------------------------------------------------------------
-END runbook block
-------------------------------------------------------------------------
-```
-GATE: SSH-PORT-OK. Timing (verified, gp.mid): ACTIVE ~20 s; sshd ~3.5 min.
-
-## 3. k8s-snap readiness (PATIENCE GATE)
-
-```
-------------------------------------------------------------------------
-BEGIN runbook block: mgmt k8s readiness poll (cold-start aware)
-------------------------------------------------------------------------
-( {
-  for i in $(seq 1 15); do
-    echo "--- [$i] $(date -u +%H:%M:%S) ---"
-    ssh -i ~/.ssh/id_ed25519 -o BatchMode=yes -o StrictHostKeyChecking=no \
-        -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@10.12.7.40 \
-        'uptime; sudo k8s status 2>&1 </dev/null | head -4'
-    sleep 120
-  done
-} )
-------------------------------------------------------------------------
-END runbook block
-------------------------------------------------------------------------
-```
-GATE: `cluster status: ready`. Verified convergence on gp.mid: ~20-21 min from
-boot, load peak >100 on 2 vCPUs. Do NOT restart services or re-bootstrap inside
-this window; the Section-0 noise is expected. (On the phase-06-spec gp.large,
-expect substantially faster.)
-
-## 4. CAPI stack + LB verification (read-only; decides Step 5)
-
-```
-------------------------------------------------------------------------
-BEGIN runbook block: post-start CAPI + LB verify
-------------------------------------------------------------------------
-( {
-  export KUBECONFIG="$HOME/capi-mgmt.kubeconfig"
-  kubectl get nodes -o wide
-  kubectl get pods -A | egrep 'capi-|capo-|cert-manager|orc-system|janitor|addon'
-  NS=magnum-674171fd28d446d3a37073b6a761e910
-  kubectl -n "$NS" get cluster,openstackcluster,machines
-} )
-# kubeconfig missing? Re-emit (phase-06 Step 6.5, verbatim):
-#   ssh -i ~/.ssh/id_ed25519 -o BatchMode=yes -o StrictHostKeyChecking=no \
-#       -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 ubuntu@10.12.7.40 \
-#       "sudo k8s config server=https://10.12.7.40:6443 </dev/null" > ~/capi-mgmt.kubeconfig
-( {
-  source ~/admin-openrc
-  unset OS_PROJECT_ID OS_TENANT_ID OS_TENANT_NAME OS_PROJECT_DOMAIN_ID
-  export OS_PROJECT_ID=674171fd28d446d3a37073b6a761e910
-  unset OS_PROJECT_NAME OS_PROJECT_DOMAIN_NAME OS_TENANT_NAME OS_TENANT_ID
-  openstack loadbalancer list -f yaml
-} )
-------------------------------------------------------------------------
-END runbook block
-------------------------------------------------------------------------
-```
-DECISION: controllers Running + Machines Running + every LB provisioning=ACTIVE
--> skip to Step 6. Any LB provisioning=ERROR (operating ONLINE is typical)
--> Step 5. Cluster Available=False with an LB-timeout message -> the LB is the
-cause; fix it first, the condition clears itself afterward.
-
-## 5. LB repair: zombie sweep, headroom, sequential failover
-
-5a. ZOMBIE/ORPHAN SWEEP (admin scope). Confirmed pattern, twice in one day:
-failed failovers leave amphora servers with no Octavia DB row. Two variants:
-ERROR server (failed spawn) and ACTIVE heartbeating zombie (health-manager logs
-"missing from the DB ... An operator must manually delete it" every 10 s).
-
-```
-------------------------------------------------------------------------
-BEGIN runbook block: amphora orphan/zombie sweep (admin scope; verify-then-delete)
-------------------------------------------------------------------------
-( {
-  source ~/admin-openrc
-  unset OS_PROJECT_ID OS_TENANT_ID OS_TENANT_NAME
-  echo "=== octavia's amphora inventory (the DB truth) ==="
-  openstack loadbalancer amphora list -f yaml
-  echo "=== nova's amphora servers (compare; extras are orphans) ==="
-  openstack server list --all-projects --long -f yaml \
-    | grep -B6 -A4 'amphora-haproxy' | grep -E '^(- |  (ID|Name|Status)):'
-} )
-# For each server whose amphora-NAME-uuid is ABSENT from the amphora list:
-#   1) re-grep the amphora list for the uuid (ABORT if present)
-#   2) openstack server delete <SERVER-UUID>   # by UUID; name lookup is project-scoped
-# Each deletion frees one amphora slot (charm-octavia: 1024 MB / 1 vCPU / 8 GB).
-------------------------------------------------------------------------
-END runbook block
-------------------------------------------------------------------------
-```
-
-5b. HEADROOM CHECK. Failover transiently needs +1 amphora placement (replacement
-is built BEFORE the old one is reaped). Scheduler ceiling per host =
-physical_MB * ram_allocation_ratio(1.5) - reserved_host_memory(8192, D-040).
-Verify at least one host clears Used + 1024 <= ceiling:
-`openstack hypervisor list --long -f yaml | grep -E 'Hostname|Memory MB'`.
-If no host clears: free 1024+ MB first (zombie sweep usually suffices; else
-power off a disposable VM, e.g. a backend-* test instance). DO NOT retry
-failover against NoValidHost -- each attempt mints another zombie.
-
-5c. FAILOVER, STRICTLY SEQUENTIAL (one slot of headroom = one failover at a
-time; completion of each reaps its old amphora and re-frees the slot).
-
-```
-------------------------------------------------------------------------
-BEGIN runbook block: LB failover + poll (admin scope; v4 Arc D pattern)
-------------------------------------------------------------------------
-( {
-  source ~/admin-openrc
-  unset OS_PROJECT_ID OS_TENANT_ID OS_TENANT_NAME
-  LB=<LB-ID>
-  openstack loadbalancer failover "$LB"
-  sleep 2
-  for i in $(seq 1 60); do
-    prov=$(openstack loadbalancer show "$LB" -f value -c provisioning_status 2>/dev/null)
-    op=$(  openstack loadbalancer show "$LB" -f value -c operating_status    2>/dev/null)
-    printf '%s  prov=%s  op=%s\n' "$(date +%T)" "${prov:-?}" "${op:-?}"
-    case "$prov" in
-      ACTIVE) echo "failover succeeded"; break ;;
-      ERROR)  echo "failover FAILED -- read octavia-worker.log; do NOT retry blind"; break ;;
-    esac
-    sleep 10
-  done
-} )
-------------------------------------------------------------------------
-END runbook block
-------------------------------------------------------------------------
-```
-Verified timing: ~108 s to ACTIVE; op holds ONLINE; VIP+FIP preserved (VIP port
-is Octavia-owned). A 10-20 s fast-fail to ERROR = early-flow failure (usually
-NoValidHost; see 5b). STANDALONE amphora = brief kube-api endpoint blip
-mid-failover; nodes/pods unaffected.
-
-## 6. Top-of-stack verification
-
-```
-------------------------------------------------------------------------
-BEGIN runbook block: final verify (amphorae, CAPO condition, magnum health)
-------------------------------------------------------------------------
-( {
-  source ~/admin-openrc
-  unset OS_PROJECT_ID OS_TENANT_ID OS_TENANT_NAME
-  openstack loadbalancer amphora list -f yaml          # all ALLOCATED
-  export KUBECONFIG="$HOME/capi-mgmt.kubeconfig"
-  NS=magnum-674171fd28d446d3a37073b6a761e910
-  kubectl -n "$NS" get cluster,openstackcluster        # Available=True (allow ~10 min post-failover for CAPO resync)
-  source ~/admin-openrc
-  unset OS_PROJECT_ID OS_TENANT_ID OS_TENANT_NAME OS_PROJECT_DOMAIN_ID
-  export OS_PROJECT_ID=674171fd28d446d3a37073b6a761e910
-  unset OS_PROJECT_NAME OS_PROJECT_DOMAIN_NAME OS_TENANT_NAME OS_TENANT_ID
-  openstack coe cluster show capi-test-1 -f value -c health_status
-  openstack coe cluster show capi-test-1 -f value -c health_status_reason
-} )
-------------------------------------------------------------------------
-END runbook block
-------------------------------------------------------------------------
-```
-SUCCESS = amphorae ALLOCATED; Cluster Available=True; Magnum reason POPULATED
-with the D-042 cosmetic signature (or HEALTHY post-D-042-fix). Reload Horizon
-Container Infra last. Workload check if desired:
-`KUBECONFIG=~/capi-test-1-kc/config kubectl get nodes -o wide`.