# v1 Redeploy -- Running Change Log

**Purpose:** Living log of design decisions, doc fixes, and runbook edits discovered
DURING the v1 redeploy rehearsal that must be folded into `docs/design-decisions.md`
and the phase runbooks UPON COMPLETION. This is the staging list for the completion
consolidation -- nothing here is applied to the runbooks or design-decisions yet.

**Status:** OPEN -- accumulating. Append-only. ASCII + LF.

**Session opened:** 2026-06-26 (redeploy from clean teardown; D-052/D-053 plane set).

**Next free numbers at session open:** design decision D-054; doc fix DOCFIX-039.
(Verified by grep of design-decisions.md: max D-053, max DOCFIX-038.)

---

## Verified-state checkpoint (measured this session -- authoritative as-built)

`scripts/pre-flight-checks.sh` @ commit 40e3f9e -- ALL PASS, exit 0, 2026-06-26:

Six MAAS planes resolved BY CIDR (subnet IDs are post-D-052-cutover, NOT the old map):

    provider-public  10.12.4.0/22   id=1   vid=0    gw=10.12.4.1   dns=[10.12.4.1]
    metal-admin      10.12.8.0/22   id=2   vid=0    gw=10.12.8.1   dns=[10.12.8.1]
    metal-internal   10.12.12.0/22  id=10  vid=103  gw=none        dns=[10.12.8.1]  (bridged br-internal)
    data-tenant      10.12.16.0/22  id=6   vid=0    gw=none        dns=[10.12.8.1]
    storage          10.12.32.0/22  id=7   vid=0    gw=none        dns=[10.12.8.1]
    replication      10.12.36.0/22  id=8   vid=0    gw=none        dns=[10.12.8.1]

Per-host data/storage NIC links by CIDR, octets .40-.43, all four hosts:
br-internal -> .12, enp8s0 -> .16, enp9s0 -> .32, enp10s0 -> .36.

Nodes openstack0-3 (4na83t / qdbqd6 / h8frng / tmsafc): all Ready, power off.
OSD secondary disks (`osd-blank-check.sh`): all four 512 GiB / 200 KiB blank, RC=0.
Bundle VIPs: 11 triple-column VIPs, aligned, .50-.60 band, OK=11 bad=0.
octavia-pki overlay: present, 5 lb-mgmt-* keys, ASCII clean.

---

## Pending design-decisions.md appends

### D-054 -- Reusable tested scripts in scripts/; runbooks reference them (ADOPTED in practice; formal append pending)

**What:** Repeated discovery/verify logic lives in `scripts/`, authored and tested in a
sandbox against synthetic fixtures, committed to the repo, and referenced by the runbooks.
Runbooks document expected output and remain the gate authority; the scripts are the
executable truth. All pinned network values live once in `scripts/lib-net.sh` (single
source of truth), resolved BY CIDR (subnet IDs drift across cutovers).

**Delivery workflow:** author + test in sandbox -> publish file + sha256 -> commit from
Windows -> jumphost `git pull` -> `sha256sum` match -> run via `bash scripts/X.sh`.
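
The `sha256sum` match step in this workflow can be a function rather than an eyeball comparison. A minimal sketch, assuming GNU coreutils `sha256sum` on the jumphost; `verify_sha256` and its arguments are hypothetical names, not something in `scripts/`:

```shell
#!/usr/bin/env bash
# Sketch only: verify_sha256 is a hypothetical helper, not a committed script.
# Usage: verify_sha256 <file> <expected-hex-digest>
verify_sha256() {
  local file=$1 expected=$2 actual
  [ -f "$file" ] || { printf 'MISSING  %s\n' "$file" >&2; return 1; }
  # sha256sum prints "<digest>  <path>"; keep only the digest field.
  actual=$(sha256sum "$file" | awk '{print $1}')
  if [ "$actual" = "$expected" ]; then
    printf 'MATCH    %s\n' "$file"
  else
    printf 'MISMATCH %s\n  expected %s\n  actual   %s\n' "$file" "$expected" "$actual" >&2
    return 1
  fi
}
```

A MISMATCH means the pulled file differs from the published bytes, so the script should not be run until the difference is explained.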

**Convention:** ASCII + LF (`.gitattributes` `*.sh eol=lf`); `set -euo pipefail` +
`shopt -s inherit_errexit` + `IFS=$'\n\t'`; `fail`/`warn`/`pass`/`note` helpers with
exit 0 (pass) / 1 (fatal) / 2 (warning) for gate scripts; read-only discovery kept
separate from gated mutation; `lib-net.sh` is sourced, never executed (direct-run guard).
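
The convention above could be sketched as the following gate-script header. This is an illustration under stated assumptions, not the committed helpers: the helper bodies, the `FAILS`/`WARNS` counters, and `gate_rc` are hypothetical, and `inherit_errexit` needs bash 4.4 or newer.

```shell
#!/usr/bin/env bash
# Sketch only: helper bodies and gate_rc are assumptions, not the repo's code.
set -euo pipefail
shopt -s inherit_errexit
IFS=$'\n\t'

# A sourced-only library (such as lib-net.sh) would carry a direct-run guard
# here instead, for example:
#   [[ "${BASH_SOURCE[0]}" != "$0" ]] || { echo "source me, do not run me" >&2; exit 1; }

FAILS=0
WARNS=0

pass() { printf 'PASS  %s\n' "$*"; }
note() { printf 'NOTE  %s\n' "$*"; }
warn() { printf 'WARN  %s\n' "$*"; WARNS=$((WARNS + 1)); }
fail() { printf 'FAIL  %s\n' "$*"; FAILS=$((FAILS + 1)); }

# Map the tallies to the documented exit codes: 0 pass, 1 fatal, 2 warning.
gate_rc() {
  if (( FAILS > 0 )); then
    echo 1
  elif (( WARNS > 0 )); then
    echo 2
  else
    echo 0
  fi
}
```

A gate script built this way would end with `exit "$(gate_rc)"`, so one fatal finding outranks any number of warnings.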

**Why:** Eliminates the paste-corruption failure class (see Findings below) and turns
repeated discovery -- re-run every redeploy cycle -- into a one-liner with a byte-identity
guarantee (sha256) instead of a fragile copy-paste block.

**Scripts added this session:** `lib-net.sh` (new), `pre-flight-checks.sh` (implemented the
placeholder), `juju-spaces-check.sh` (new), `osd-blank-check.sh` (new). All tested
end-to-end against mock `maas`/`juju` + fixtures (positive + 7 negative fault injections
for pre-flight; 4 scenarios for spaces). Committed at 40e3f9e.

---

## Pending DOCFIX entries

### DOCFIX-039 -- phase-01-bundle-deploy.md gate reconciliation (PROPOSED)

The phase-01 pre-deploy GATES encode the OLD plane layout (pre-D-052 CIDR->role map); the
deploy COMMANDS are fine. Superseded by `scripts/pre-flight-checks.sh`. Five stale items:

1. Constants: hardcoded subnet ids `1 2 6 7 8 9` + old CIDR->role map -> resolve BY CIDR
   (now in `lib-net.sh`; metal-internal is id=10 post-cutover, not id=6).
2. CHECK 1 / Step 1.3 deploy guard: provider-column-only VIP check -> triple-column
   validator (provider/admin/internal, aligned, .50-.60).
3. CHECK 2: `enp8s0` + `10.12.12.0/22` (old "data") -> links BY CIDR; `enp8s0` now carries
   `10.12.16.0/22` (data-tenant), metal-internal is on `br-internal`.
4. CHECK 3: hardcoded ids/DNS -> subnets BY CIDR.
5. EXIT GATE binding plane map (old: ceph->.16 / octavia->.12.1 / nova->.12.4x / vault->.8)
   -> corrected per D-052: ceph public/osd/mon->storage(.32); octavia overlay->data-tenant
   (.16); nova-compute neutron-plugin->data-tenant(.16); vault default->metal-admin(.8) +
   cluster->metal-internal(.12).

**Action at completion:** replace the inline CHECK blocks in phase-01 with
`bash scripts/pre-flight-checks.sh` (document expected PASS output) and add a post-add-model
`bash scripts/juju-spaces-check.sh openstack` as the per-model space gate (the old inline
CHECK 5 ran `juju spaces` pre-model and failed "model not found"; spaces are per-model).

---

## Pending runbook / file edits (apply at completion)

1. `runbooks/phase-01-bundle-deploy.md` -- DOCFIX-039 (above): swap inline pre-flight blocks
   for `bash scripts/pre-flight-checks.sh`; add post-add-model `bash scripts/juju-spaces-check.sh
   openstack`; fix the 5 stale gate items; document expected output.
2. `scripts/validate.sh` -- convert UTF-8 to ASCII when implementing the D-011 runner
   (phase-08). `file` reports "Unicode text, UTF-8 text" (em-dashes from the placeholder);
   violates the ASCII-only convention. Currently a placeholder, not yet run.
3. Teardown runbook -- reference `scripts/osd-blank-check.sh` for the OSD-blank verification
   step (replaces the inline qemu-img loop).
4. `runbooks/` README / pre-flight references -- point at the new scripts where the old
   inline discovery blocks were described.

---

## Findings / process learnings (this session)

- **Paste-corruption failure class.** A hand-built base64 pre-flight block shipped two
  transcription defects: `[:space:]` (single bracket, must be `[[:space:]]`) on the grep
  count line, and `ENV{` instead of `END{` on the awk tally (so the summary silently never
  printed). Root cause: the base64 was hand-edited AFTER testing a clean version -- the
  bytes sent were never round-tripped through the sandbox. Mitigation is now standard
  practice (D-054): tested scripts committed to the repo, verified by sha256 on the jumphost.

- **Juju spaces are per-model.** `juju spaces` / `juju reload-spaces` cannot run until after
  `juju add-model`; the old phase-01 CHECK 5 ran pre-model and failed with "model not found".
  Split into `juju-spaces-check.sh`, gated to run post-add-model.

- **Default-space globally poisons network-get (deploy root cause).** The full D-052
  binding deploy failed universally (`network-get ... ERROR space "metal" not found`,
  install hook dies on nearly every charm). Every static layer was correct -- bundle,
  model bindings, MAAS spaces/VLANs/per-NIC space tags all read `metal-internal`. The
  single stale value was controller `model-defaults default-space = metal` (a dead
  pre-D-052 name). An INVALID default-space poisons `network-get` for ALL endpoints
  regardless of their explicit binding. Fix: set `juju model-defaults
  default-space=metal-admin` (a live space) before add-model. A `default-space`-resolves-
  to-a-live-space gate is to be added to `pre-flight-checks.sh`.

- **Teardown --destroy-storage on virsh DELETES machine objects (does NOT release).**
  The phase-00 teardown (`juju destroy-model openstack --force --destroy-storage` then
  per-host `maas machine release`) assumes release-to-Ready. On a virsh/KVM MAAS,
  `--destroy-storage` DECOMPOSES (deletes) the VM-backed machine objects. All four
  openstack hosts were removed from MAAS. Recoverable only because the libvirt domains
  + disks (including the blank OSD vdb) survived. See D-055.
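
For the paste-corruption finding above, the two corrected idioms are small enough to show in full. A minimal sketch; the function names `count_ws_lines` and `tally` are hypothetical, and the `OK=/bad=` summary format is borrowed from the VIP check line in the checkpoint:

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical.

# Count input lines that contain whitespace. The POSIX class needs BOTH bracket
# pairs: '[[:space:]]'. A bare '[:space:]' is an ordinary bracket expression
# over the characters ':', 's', 'p', 'a', 'c', 'e' (GNU grep rejects it).
count_ws_lines() {
  grep -c '[[:space:]]' || true   # grep -c exits 1 on a zero count but still prints 0
}

# Tally PASS/FAIL lines. The summary must hang off END; 'ENV{...}' parses as a
# pattern testing an unset variable named ENV, so its action never runs and
# the summary is silently skipped.
tally() {
  awk '$1 == "PASS" { ok++ } $1 == "FAIL" { bad++ } END { printf "OK=%d bad=%d\n", ok, bad }'
}
```

Both defects were syntactically valid shell, which is why only a round trip of the exact bytes through a test would have caught them.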

---

## Pending design-decisions.md appends (continued)

### D-055 -- virsh teardown defect + host re-enrollment procedure (ADOPTED)

**Defect:** `juju destroy-model --destroy-storage` against virsh-power MAAS machines
deletes (decomposes) the machine objects rather than releasing them to Ready. The
phase-00 teardown must NOT pass `--destroy-storage` for virsh hosts; release to Ready
without it.

**Recovery (now a reusable procedure):** the libvirt domains survive, so re-enroll via
`maas admin machines create` per host with virsh power + the boot NIC MAC (NOT add-chassis
-- it would re-grab juju/lxd/tailscale). `machines create` auto-commissions
(New->Commissioning->Ready) by PXE off the 2_metal boot NIC. Then re-tag `openstack`,
then reconstruct the host interface tree (Strategy-B carve, from the captured as-built),
then verify (pre-flight), then redeploy with the default-space fix.

**Artifacts:** `scripts/lib-hosts.sh`, `scripts/reenroll-hosts.sh`,
`docs/maas-as-built-reference.md`. Proven live on openstack0 (2026-06-26): created with
virsh power, commissioned, reached Ready, all six NICs discovered, boot NIC on 2_metal.

### DOCFIX-040 -- host identity must be hostname-keyed, not system_id-keyed

`lib-net.sh` lines 45-47 key the host maps (`SYSIDS`, `SYSID_HOST`, `SYSID_OCTET`) on
the system_ids 4na83t/qdbqd6/h8frng/tmsafc -- which DIED on re-enrollment (new random
ids). Any script keyed on them silently breaks. New `scripts/lib-hosts.sh` keys all host
identity on hostname (stable) and resolves system_id at runtime (`host_sysid`). At
completion: retire the SYSID-keyed maps from lib-net.sh (or repoint them to lib-hosts).
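
The hostname-keyed idea can be illustrated without any MAAS call. A minimal sketch, assuming the octets .40-.43 and the plane prefixes recorded in the verified-state checkpoint; `host_octet`, `plane_prefix`, and `host_ip` are hypothetical names and the real `lib-hosts.sh` may be structured differently. Resolving the system_id itself needs a live MAAS query and is not shown.

```shell
#!/usr/bin/env bash
# Sketch only: these function names are assumptions, not lib-hosts.sh contents.

# Stable identity: hostname -> last octet (never system_id, which is random).
host_octet() {
  case $1 in
    openstack0) echo 40 ;;
    openstack1) echo 41 ;;
    openstack2) echo 42 ;;
    openstack3) echo 43 ;;
    *) printf 'unknown host: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Plane name -> first three octets of its /22, as measured this session.
plane_prefix() {
  case $1 in
    provider-public) echo 10.12.4 ;;
    metal-admin)     echo 10.12.8 ;;
    metal-internal)  echo 10.12.12 ;;
    data-tenant)     echo 10.12.16 ;;
    storage)         echo 10.12.32 ;;
    replication)     echo 10.12.36 ;;
    *) printf 'unknown plane: %s\n' "$1" >&2; return 1 ;;
  esac
}

# host_ip <hostname> <plane> -> the host's static address on that plane.
host_ip() {
  local octet prefix
  octet=$(host_octet "$1") || return 1
  prefix=$(plane_prefix "$2") || return 1
  printf '%s.%s\n' "$prefix" "$octet"
}
```

For example, `host_ip openstack2 storage` prints `10.12.32.42`, and nothing in the lookup changes when a host is re-enrolled under a new system_id.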

---

## Security note (action required)

The libvirt SSH password (`logxen@10.12.64.1`) was printed in plaintext on 2026-06-26 by
`maas admin machine power-parameters` during virsh power-template discovery. Treat as
exposed: **rotate the libvirt SSH credential after the rebuild** and scrub terminal
scrollback. Runbook rule added: never use `machine power-parameters` for templating; read
`power_type` and reconstruct the address pattern instead. `reenroll-hosts.sh` reads the
password interactively (never a CLI arg, never logged, never in the repo).

---

## Scripts / docs added (this batch)

- `scripts/lib-hosts.sh` -- hostname-keyed host identity + virsh power constants (no secret).
- `scripts/reenroll-hosts.sh` -- gated/idempotent re-enrollment (auto-commission, poll Ready,
  boot-NIC-on-2_metal verify; --check read-only mode). Tested: bash -n, shellcheck clean,
  mock-maas behavior test of --check (discover-by-hostname, NOT-ENROLLED detection, exit 0).
- `docs/maas-as-built-reference.md` -- captured MAAS substrate + per-host NIC inventory +
  interface-carve target + virsh template, for DC-DC replay.
- Pending next artifact: the Strategy-B interface-carve script (built once all four are Ready;
  bridge_type pulled verbatim from captured release JSON) -> then consolidate into
  `runbooks/phase-00b-host-reenrollment.md`.

### DOCFIX-041 -- as-built reference: br-ex is charm-built, not a MAAS bridge

Correction to `docs/maas-as-built-reference.md` (first committed this session). The
bundle's ovn-chassis `bridge-interface-mappings` maps `br-ex:<provider-MAC>` for all
four hosts -> **br-ex is built by the ovn-chassis charm at deploy (OVS), enslaving the
provider NIC by MAC; it is NOT a MAAS interface.** The MAAS carve therefore:
- provider plane = **raw enp1s0 + static 10.12.4.N** (MAAS leaves it raw; the charm
  enslaves it into br-ex at deploy). MAAS does NOT create br-ex.
- storage/replication = raw enp9s0/enp10s0 + statics; Juju auto-bridges them
  (br-enp9s0/br-enp10s0, Linux) at deploy.
- the ONLY MAAS-built bridges are the metal-internal stack:
  enp7s0 -> br-metal -> br-metal.103 (VID 103) -> br-internal.

bridge_type: br-internal = standard (confirmed, D-052 command). br-metal = standard
(RECOMMENDED, reasoned-not-measured -- original bring-up predates the repo and the
capture did not preserve bridge_type; pending confirm before carve). The
deployed-host `ip`-level read that showed br-metal/br-internal "OVS" was taken during
the FAILED deploy and is reclassified UNRELIABLE.

### Carve script added + MAAS interface CLI confirmations

- `scripts/carve-host-interfaces.sh <hostname> [--apply]` -- Strategy-B per-host
  interface carve. Default DRY-RUN (resolves every id live, prints each mutation it
  WOULD run, changes nothing); --apply executes. Idempotent (skips existing
  bridge/vlan/link), resolves system_id by hostname / interface id by name / subnet
  id + VLAN object id by CIDR, asserts metal-internal is VID 103, requires Ready.
  Builds: enp1s0 raw+static (provider); enp7s0 -> br-metal(std) -> br-metal.103(VID
  103) -> br-internal(std); enp8/9/10 raw+static (data/storage/repl); enp11s0 idle.
  Does NOT create br-ex (charm-built). Tested: bash -n, shellcheck clean, mock-MAAS
  dry-run (full id resolution + command preview), input guards.

- MAAS 3.7 interface CLI confirmed (canonical.com/maas/docs/3.7 reference):
  create-bridge takes `bridge_type=standard|ovs parent=<ifid> vlan=<vlan-obj-id>`;
  create-vlan takes `vlan=<VLAN-OBJECT-ID> parent=<ifid>` (NOT the VID tag -- resolve
  the object id via the metal-internal subnet); link-subnet `mode=STATIC
  subnet=<id> ip_address=<ip>`; a NIC is moved to a plane's fabric via `interface
  update <sid> <ifid> vlan=<vlan-obj-id>` before link-subnet (re-enrolled raw NICs
  sit on transient auto-fabrics).

- FINDING (teardown runbook bug): `runbooks/phase-00-teardown-maas-reset.md`
  "Phase 3" link-subnet block uses PRE-D-052 CIDRs
  (`enp8s0=10.12.12.0/22 enp9s0=10.12.16.0/22 enp10s0=10.12.20.0/22`) and dead
  system_ids -- it would link NICs to the WRONG subnets (10.12.12 is now
  metal-internal, 10.12.16 is now data-tenant, 10.12.20 no longer exists). Must be
  rewritten to current planes + hostname-keyed before that runbook is trusted. Note:
  the normal release-to-Ready path PRESERVES host interfaces, so that block only ran
  on a normal teardown; the full carve (this script) is needed only after a
  decompose, which is why the bridges were never scripted before.

### Carve hardening: self-discovered metal IP blocks br-metal static (KI)

Root cause (cost several diagnostic rounds): after re-enrollment each host PXE-leases
its own metal IP (10.12.8.4N) at commission. MAAS records this as a StaticIPAddress
of **alloc_type 6 (DISCOVERED)** tied to the node via its boot NIC. This is a
SEPARATE object from the network-discovery table (`discoveries clear-by-mac-and-ip`
does NOT clear it) and from user allocations (`ipaddresses read` user-scope does NOT
show it). It causes `link-subnet ... ip_address=10.12.8.4N` to fail with the
misleading "IP address is already in use".

Authoritative read (the lesson): `maas admin subnet ip-addresses <subnet_id>` reports
every in-use IP WITH its alloc_type and owning node -- this is the single correct
"who holds this IP and why" query. Lead with it; do not probe ipaddresses/discovery/
leases piecemeal.

Release: `maas admin ipaddresses release ip=<ip> force=true discovered=true` (BOTH
flags required; force alone returns "does not exist" for a discovered address).

Script fix (carve-host-interfaces.sh): `release_self_discovered()` runs before every
STATIC link -- releases an alloc_type-6 record for the target IP ONLY when its owning
node == this host (node_summary.system_id), and REFUSES (fatal) if a different node
discovered it (a real conflict). Plus `emit` now captures and prints the MAAS error on
a failed mutation instead of discarding it to /dev/null (the discard hid the real
message and prolonged diagnosis). Only the metal plane (dhcp_on=true) is affected;
the no-DHCP planes never produced a self-lease. Verified: mock self-release path +
foreign-node refuse gate.
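
The dry-run default and the error-capturing `emit` described above might look roughly like this. A sketch under assumptions: the `APPLY` variable and the output prefixes are hypothetical, and the committed function may differ.

```shell
#!/usr/bin/env bash
# Sketch only: not the committed emit from carve-host-interfaces.sh.
APPLY=${APPLY:-0}   # 0 = dry-run (default); 1 = execute (what --apply would set)

emit() {
  local out
  if [ "$APPLY" -ne 1 ]; then
    # Dry-run: show the mutation that WOULD run, change nothing.
    printf 'DRY-RUN: %s\n' "$*"
    return 0
  fi
  # Capture stdout+stderr so a failed mutation surfaces the tool's own message
  # instead of losing it to /dev/null.
  if out=$("$@" 2>&1); then
    printf 'OK: %s\n' "$*"
  else
    printf 'FAILED: %s\n%s\n' "$*" "$out" >&2
    return 1
  fi
}
```

Every mutating call then goes through `emit`, so the same code path serves the preview and the real run.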

NOTE (design consistency, not a blocker): host statics .40-.43 sit inside the
metal-admin/provider/internal VIP+mgmt reserve band (.2-.100). A reserved range blocks
AUTO assignment, not explicit STATIC, so it did not break the carve -- but host octets
arguably belong outside the VIP band. Log for the reserve-layout review.

### DC-DC script audit (post-carve hardening batch)

Reviewed all MAAS scripts against what this session actually hit, so the DC-DC build
replays cleanly instead of re-deriving the metal-IP archaeology.

- **carve gate rewrite (the big one).** `release_self_discovered` keyed on
  `node_summary.system_id`, which is EMPTY on a fresh discovered record -> it silently
  no-op'd and the metal static (.8.41/.42/.43) had to be released by hand on three
  hosts. Replaced with `release_self_indexed`: the target is this host's
  architecturally-indexed metal IP (10.12.8.<octet> from HOST_OCTET), so a DISCOVERED
  observation on it is this host's own commissioning ghost. SAFETY: refuses if the
  record's system_id (when present) OR the discoveries-table MAC (when present)
  identifies a DIFFERENT host; releases otherwise. Removed the (unneeded) release call
  from carve_raw -- the no-DHCP planes never produce discovered records. Tested: 5
  branches (foreign-sysid refuse, foreign-MAC refuse, indexed-basis release, MAC-basis
  release, no-record no-op).

- **missing step added: openstack tag.** `reenroll-hosts.sh` now ensures the
  `openstack` tag exists and applies it to all four hosts after the Ready/boot-NIC
  gate (idempotent; --check-aware). Without it the bundle cannot place units
  (constraint tags=openstack). Was a manual step every rebuild.

- **DOCFIX-040 COMPLETE.** `pre-flight-checks.sh` and `osd-blank-check.sh` both looped
  over the dead system_ids (4na83t...) via lib-net's SYSID maps -- broken for any
  rebuilt/DC-DC cluster. Migrated both to hostname-keyed (lib-hosts HOSTS / HOST_OCTET
  / host_sysid). Retired the SYSID/SYSID_HOST/SYSID_OCTET maps from lib-net.sh and
  added its sourced-library shellcheck directive. osd-blank verified via mock
  (iterates the four hostnames, RC=0).

- **validate.sh**: em-dashes -> ASCII (the silent-UnicodeDecodeError class; ASCII-only
  rule for all scripts). Still a placeholder body otherwise.
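
The five tested branches of the rewritten carve gate reduce to a small decision function. A minimal sketch of that logic only; `release_decision`, its argument order, and the result strings are hypothetical, and the real `release_self_indexed` also performs the MAAS reads and the release itself.

```shell
#!/usr/bin/env bash
# Sketch only: release_decision is a hypothetical distillation of the gate.
# Usage: release_decision <record_found:0|1> <rec_sysid> <own_sysid> <rec_mac> <own_mac>
# An empty rec_sysid or rec_mac means "not present on the record".
release_decision() {
  local found=$1 rec_sysid=$2 own_sysid=$3 rec_mac own_mac
  # Compare MACs case-insensitively.
  rec_mac=$(printf '%s' "$4" | tr 'A-F' 'a-f')
  own_mac=$(printf '%s' "$5" | tr 'A-F' 'a-f')

  if [ "$found" -ne 1 ]; then
    echo "noop"                      # no DISCOVERED record on the indexed IP
    return 0
  fi
  if [ -n "$rec_sysid" ] && [ "$rec_sysid" != "$own_sysid" ]; then
    echo "refuse:foreign-sysid"      # another node owns it: a real conflict
    return 1
  fi
  if [ -n "$rec_mac" ] && [ "$rec_mac" != "$own_mac" ]; then
    echo "refuse:foreign-mac"
    return 1
  fi
  if [ -n "$rec_mac" ]; then
    echo "release:mac-basis"         # MAC positively identifies this host
  else
    echo "release:indexed-basis"     # nothing contradicts the indexed IP
  fi
}
```

The key change from the earlier gate is the last branch: an empty system_id no longer means "skip", it means "no evidence of a foreign owner".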

REMAINING DC-DC scope (done MANUALLY this session; scripting them would make the
bring-up fully hands-off -- NOT yet built):
1. A multi-host carve-verify wrapper (assert all four hosts show the six expected
   static links on the right fabrics) -- currently an ad-hoc jq loop.
2. A redeploy-prep wrapper: set model-defaults default-space=metal-admin, add-model,
   verify the MODEL's effective default-space (the value that poisoned the last
   deploy), reload-spaces, run juju-spaces-check. Currently manual steps R1-R3.
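
The default-space check at the heart of item 2 is a membership test. A minimal sketch, with the Juju reads deliberately left out: `default_space_is_live` is a hypothetical name, and how the effective default-space and the list of live spaces are extracted from Juju output is an assumption left to the caller.

```shell
#!/usr/bin/env bash
# Sketch only: default_space_is_live is hypothetical; feeding it from Juju
# (model-defaults for the value, `juju spaces` for the list) is not shown.
# Usage: default_space_is_live <default-space> <space> [<space> ...]
default_space_is_live() {
  local want=$1 s
  shift
  if [ -z "$want" ]; then
    echo "FAIL  default-space is empty" >&2
    return 1
  fi
  for s in "$@"; do
    if [ "$s" = "$want" ]; then
      echo "PASS  default-space '$want' is a live space"
      return 0
    fi
  done
  echo "FAIL  default-space '$want' is not a live space" >&2
  return 1
}
```

With the value that poisoned the last deploy, `default_space_is_live metal metal-admin metal-internal` fails, while `metal-admin` against the same list passes.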

---

## Phase-02 vault bring-up (as-executed -- COMPLETE 2026-06-27)

**Session:** 2026-06-27. Origin/jumphost HEAD at phase-02 start: 1a103f5 ("Create
phase-02-vault-preflight.sh"; was 68a0bd5 at the redeploy handoff). Model: openstack.
Next free numbers at section open: design decision D-056; doc fix DOCFIX-042 (verified
by grep: changelog max D-055 / DOCFIX-041; design-decisions max D-053 / DOCFIX-038).

### Pre-flight gate (Step 2.1 verify-before-mutate) -- PASS

Manual A-E audit on the jumphost cleared all gates; the new
`scripts/phase-02-vault-preflight.sh` then reproduced it identically with REAL jq:

    A auth      jessea123 / juju-controller / model openstack; no macaroon EOF.
    B machines  4/4 started.
    C mysql     mysql-innodb-cluster 3/3 -- /0 R/W, /1 R/O, /2 R/O, all
                "Cluster is ONLINE and can tolerate up to ONE failure." (vault backend OK)
    D vault     vault/0 [blocked] "Vault needs to be initialized" -- FRESH
                (irreversibility guard satisfied).
    E census    units=63  workload-error=0  agent-error(hook)=0
                blocked=2 (vault + octavia "Awaiting configure-resources")
                waiting=9  active=51  unknown=1 (glance-simplestreams-sync).

Census 63 vs the handoff's 31 is NOT a discrepancy: the handoff counted PRINCIPALS
(active=25/blocked=2/waiting=3/unknown=1); the script recurses into subordinates.
waiting 3->9 reconciles against the handoff prose (ovn-central x3 principals + ovn-chassis
x3 + ovn-chassis-octavia + neutron-api-plugin-ovn + nova-compute certs); active 25->51 is
the hacluster/mysql-router/filebeat subordinate layer. blocked=2 and unknown=1 match exactly.

### FINDING -- live machine IDs are 0-3, NOT the handoff's 8-11   [-> DOCFIX-043]

Committed bundle.yaml declares SYMBOLIC machine IDs "8"/"9"/"10"/"11" (machines: section,
constraints tags=openstack). Juju treats bundle machine keys as PLACEHOLDERS, not real IDs;
deployed into a fresh model they map in order to real IDs 0/1/2/3. Live (confirmed by the
preflight script's machine display lines):

    real m0 = openstack0 = 10.12.12.40   (bundle "8")   control-only, 7 LXD: 0/lxd/0..6
    real m1 = openstack1 = 10.12.12.41   (bundle "9")
    real m2 = openstack3 = 10.12.12.43   (bundle "10")
    real m3 = openstack2 = 10.12.12.42   (bundle "11")  holds vault/0 (juju-f5a310-3-lxd-5)

The openstack2/openstack3 <-> m3/m2 "swap" is MAAS tag-based allocation (hosts pinned by
tag=openstack, NOT by system_id), so the host->machine binding floats per deploy.
nova-compute `to: ["9","10","11"]` (symbolic) therefore landed on real m1/m2/m3 =
openstack1/openstack3/openstack2, leaving m0/openstack0 control-only -- CONSISTENT with the
handoff's intended role split. ceph `to: ["8","9","10","11"]` -> all four real machines.

IMPACT: zero on phase-02 (vault/0 resolves by unit name). The handoff text "= bundle
machines 8/9/10/11" is stale on LIVE ids. RULE to fold into the runbook: resolve everything
by unit name / hostname / CIDR, NEVER by machine ID; document the bundle-symbolic vs
live-real mapping so a future operator does not mistake it for a deploy fault. Phase-03
host-role verify (open-item 2) confirms which units run on which machine definitively.

### DELIVERABLE -- scripts/phase-02-vault-preflight.sh   (committed 1a103f5)   [-> DOCFIX-042]

Read-only verify-before-mutate gate packaging the A-E audit into one re-runnable command.
Mutates NOTHING; the vault init/unseal/authorize MUTATIONS stay gated human steps (item-8
principle: scripts own the deterministic/read-only/repeated; the human gate owns the
consequential mutation + secret custody). Gates: B all machines started; C mysql 3 units /
all active+ONLINE / exactly 1 R/W; D vault fresh ([blocked] "needs to be initialized" --
REFUSES and escalates if not, since a non-fresh vault may already hold keys); E zero
workload-error AND zero agent-error(hook), subordinates included. Exit 0 PROCEED / 1 HOLD /
2 precondition. Sources lib-net.sh (need_jq); whoami-direct-first so a stale-macaroon prompt
reaches the tty before captured calls; single juju-status snapshot; one jq metrics pass
(eval'd key=value); dynamic lookups, nothing host/IP/ID hardcoded. ASCII+LF, bash -n clean.

Testing: shellcheck + jq both ABSENT from Claude's sandbox -> behavior-tested with juju+jq
shims across 5 fixtures (1 healthy + 4 single-fault: vault-already-initialized D, mysql
OFFLINE C, hook-failure E, machine-down B); each produced the correct exit code and gate
attribution. jq metrics algorithm mirrored/validated in Python. REAL-jq/REAL-data
confirmation on the jumphost first run reproduced the manual audit EXACTLY (units=63,
errors 0, PROCEED, EXIT=0); Windows -> GitHub Desktop -> push -> jumphost-pull preserved
LF/ASCII/parse. FOLD INTO phase-02 do-doc: invoke this script as the Step 2.1 pre-flight gate.
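
The B-E gate logic described above could be sketched as a single decision function over already-computed metrics. This is an illustration, not the committed script: the variable names and `decide` are hypothetical, the metrics are assumed to arrive as shell variables (as an eval'd key=value pass would leave them), and the exit-2 precondition path is omitted.

```shell
#!/usr/bin/env bash
# Sketch only: names are assumptions; the real script derives these values
# from one juju-status snapshot.
# Expects: machines_total machines_started mysql_units mysql_online mysql_rw
#          vault_fresh workload_err agent_err
decide() {
  local rc=0
  if [ "$machines_started" -ne "$machines_total" ]; then
    echo "HOLD B: $machines_started/$machines_total machines started"; rc=1
  fi
  if [ "$mysql_units" -ne 3 ] || [ "$mysql_online" -ne 3 ] || [ "$mysql_rw" -ne 1 ]; then
    echo "HOLD C: mysql units=$mysql_units online=$mysql_online rw=$mysql_rw"; rc=1
  fi
  if [ "$vault_fresh" -ne 1 ]; then
    echo "HOLD D: vault is not fresh; escalate"; rc=1
  fi
  if [ "$workload_err" -ne 0 ] || [ "$agent_err" -ne 0 ]; then
    echo "HOLD E: workload-error=$workload_err agent-error=$agent_err"; rc=1
  fi
  if [ "$rc" -eq 0 ]; then
    echo "PROCEED"
  fi
  return "$rc"   # 0 PROCEED / 1 HOLD
}
```

Every failing gate is reported before returning, so a single run attributes all faults rather than stopping at the first.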

### DELIVERABLE -- tests/phase-02/ regression harness   (staged; commit pending)

run-tests.sh + make_fixtures.py + fakebin/{juju,jq} shims. Offline regression for the
preflight script: drives the REAL script's decision/exit logic against the 5 generated
fixtures; touches NO live infra (fake juju emits fixtures, fake jq mirrors the metrics in
Python); runs anywhere with python3 + bash (no real jq needed); re-asserts shim exec bits so
the Windows -> git round trip dropping them will not break it. Sandbox run: ALL PASS / exit 0.
Target paths: tests/phase-02/{run-tests.sh, make_fixtures.py, fakebin/juju, fakebin/jq}.

### Step 2.1 delivery split   [-> DOCFIX-042]

The do-doc presents Step 2.1's in-session block as one paste (env-setup; vault status;
vault operator init | tee; grep -c; grep -q). Split at the verify/mutate boundary into two
gated pastes:

- 2.1a (read-only verify): `export VAULT_ADDR...; umask 077; mkdir -p ~/vault-init` +
  `vault status 2>&1 | grep -E 'Initialized|Sealed|Storage Type|HA Enabled' || true`
- 2.1b (irreversible): `vault operator init -key-shares=5 -key-threshold=3 2>&1 | tee
  ~/vault-init/init.txt` + `grep -c '^Unseal Key' ...` + `grep -q '^Initial Root Token:' ...`

Rationale: the `vault status` line exists to be OBSERVED before the irreversible init; a
single paste runs init before it can be read, defeating verify-before-mutate. Commands are
verbatim/unchanged -- only the paste boundary moves. Amend phase-02 do-doc Step 2.1 to
present 2.1a/2.1b as two gated pastes.

### Step 2.1a verify -- FRESH confirmed (2026-06-27)

Session opened on vault/0: `juju ssh -m openstack vault/0` -> ubuntu@juju-f5a310-3-lxd-5
(= real machine 3 = openstack2, LXD container 5). 2.1a output:

    Initialized        false      uninitialized; safe to init
    Sealed             true
    Storage Type       mysql      vault-on-mysql backend (mysql-innodb-cluster)
    HA Enabled         false      CORRECT for vault-on-mysql (R3); NOT a defect

Vault's own status agrees with the Juju workload-status. Cleared for 2.1b (vault init).

### Step 2.1b vault init -- EXECUTED 2026-06-27 (irreversible one-shot done)

`vault operator init -key-shares=5 -key-threshold=3 2>&1 | tee ~/vault-init/init.txt` ran
once. Token gate: `grep -q '^Initial Root Token:'` -> TOKEN_OK (root token line captured in
init.txt). Unseal-Key count gate (`grep -c '^Unseal Key'` MUST = 5): returned 5, confirmed by
the operator rather than inferred. Operator confirmed all key material (5 shares + root token)
saved OFF cloud/host;
~/vault-init/init.txt on the unit is the only on-unit copy (dies with the unit).
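
The two post-init greps can be wrapped so the count is asserted rather than read by eye. A minimal sketch; `init_capture_ok` and its status strings are hypothetical, and it deliberately prints only counts, never the key or token lines themselves.

```shell
#!/usr/bin/env bash
# Sketch only: init_capture_ok is a hypothetical wrapper around the two greps.
# Usage: init_capture_ok <path-to-init.txt> [expected-shares]
init_capture_ok() {
  local file=$1 want=${2:-5} n
  [ -f "$file" ] || { echo "NO_FILE  $file" >&2; return 1; }
  # grep -c exits 1 on a zero count but still prints 0, hence the || true.
  n=$(grep -c '^Unseal Key' "$file" || true)
  if [ "$n" -ne "$want" ]; then
    echo "KEYS_BAD  found $n unseal-key lines, expected $want" >&2
    return 1
  fi
  if ! grep -q '^Initial Root Token:' "$file"; then
    echo "TOKEN_MISSING" >&2
    return 1
  fi
  echo "KEYS_OK=$n TOKEN_OK"
}
```

Run as `init_capture_ok ~/vault-init/init.txt`, it gives a single pass/fail line that is safe to paste into a log.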

Post-init expected state: vault Initialized true / Sealed true (init does NOT unseal a
vault-on-mysql; unseal is the separate 2.2 step).

### Step 2.2 unseal -- EXECUTED 2026-06-27
3-of-5 via `vault operator unseal` (no arg, vault's own hidden prompt; keys never on argv/
history -- L4). Final `vault status`: Initialized true / Sealed false / Storage Type mysql /
HA Enabled false (HA false correct for single-unit vault-on-mysql -- R3). Vault is now
initialized AND unsealed. v1 policy: MANUAL unseal is the v1 standard -- re-run 3-of-5 at
the hidden prompt after any vault-unit reboot (auto-unseal via transit/KMS not configured
in v1; D-011.6 re-confirms in phase-08).

### Step 2.3 authorize-charm + generate-root-ca -- EXECUTED 2026-06-27
Short-lived child token (10m TTL) minted in vault/0 via hidden `read -s` root token +
`vault token create -ttl=10m -field=token` (NOT the root token -- juju op-log persists
action params; DOCFIX-011 param=`token`). `juju run vault/leader authorize-charm token=...`
then `juju run vault/leader generate-root-ca` (REQUIRED -- DOCFIX-014) both completed; child
token entered via hidden `read -s` on the jumphost too (narrows, does not eliminate, op-log
exposure). Root CA PEM emitted ("Vault Root Certificate Authority (charm-pki-local)") and
copied OFF cloud.

Result (juju status vault): vault 1.8.8 active "Unit is ready (active: true, mlock:
disabled)"; vault/0 active/idle on 3/lxd/5 (= machine 3 = openstack2; container 10.12.12.106);
vault-mysql-router/0 active. The "Missing CA cert" block cleared STRAIGHT to active --
validates DOCFIX-014. `mlock: disabled` is expected/benign for container vault (no IPC_LOCK).

### PHASE-02 EXIT GATE -- MET (2026-06-27)
- Vault Initialized true / Sealed false; 5 shares + root token saved OFF cloud/host.      [DONE]
- vault/0 active/idle; root CA generated (the cloud's PKI anchor); PEM saved off-cloud.    [DONE]
- Narrow cert cascade to consumers (ovn-central x3, ovn-chassis x3, ovn-chassis-octavia,
  neutron-api-plugin-ovn, barbican-vault) ACTIVE/proceeding -- watched + accepted in phase-03. [IN PROGRESS]
- POST-INIT SWEEP -- cascade SETTLED 2026-06-27 (full juju status, two-image capture):
  all apps active EXCEPT octavia (blocked "Awaiting ... configure-resources" -- EXPECTED,
  phase-05) and glance-simplestreams-sync (unknown -- expected sync state). Cert consumers
  now active: ovn-central x3 (leader ovnnb_db/ovnsb_db; northd active), ovn-chassis x3,
  ovn-chassis-octavia, neutron-api-plugin-ovn, barbican + barbican-vault. NO errors / NO
  hook failures.
  * magnum/0 active "Unit is ready" (1/lxd/2, 10.12.12.115; public port 9511/tcp) -- the
    phase-01 pre-vault 9501 loopback BLOCK self-resolved at the TLS cutover, as predicted.
    Definitive *:9501 not-loopback bind check via read-only `ss` on magnum/0 (juju exec):
    9501 -> `*:9501` (all-interfaces; NOT 127.0.0.1); 9511 -> `0.0.0.0:9511` + `*:9511`.
    NOT loopback -> escalation condition NOT met; the phase-01 9501 line was the expected
    pre-vault posture, NOT a defect. Settle also confirmed at principal level via
    deploy-watch.sh: active=29 / blocked=1 (octavia) / unknown=1 (glance-ss-sync) = 31
    principals -- reconciles with the handoff's original 31. **PHASE-02 EXIT GATE CLOSED.**
  * keystone/0 "PO (broken): Unit is ready" -- UNCHANGED (FINDING-1; expected; no regression).
  * Host-role confirm (open-item 2): nova-compute on machines 1/2/3 = openstack1/openstack3/
    openstack2; openstack0 (m0) carries NO nova-compute / NO ovn-chassis (control-only, 7
    LXD: 0/lxd/0..6). CONFIRMS the bundle-symbolic->live-real machine-ID remap (DOCFIX-043)
    and the intended 3-compute/1-control split. Open design Q remains (Jesse's call, not
    phase-02): openstack0's provider MAC is still in ovn-chassis bridge-interface-mappings
    though no chassis runs there -- trim it (3-compute/1-control) vs add openstack0 to
    nova-compute `to:`.

PHASE-02 COMPLETE -- discrete vault mutations done. Cascade-settle + the post-init sweep are
the opening activities of phase-03 (runbooks/phase-03-core-verify.md).

### Next-free numbers after this append
Design decision: D-056. Doc fix: DOCFIX-044.
  DOCFIX-042 = phase-02 Step 2.1 split (2.1a verify / 2.1b init) + invoke preflight script.
  DOCFIX-043 = document bundle-symbolic vs live-real machine-ID remap + MAAS tag-allocation
               host swap; resolve by unit/hostname/CIDR, never by machine ID.

---

## Phase-03 core verify (as-executed -- IN PROGRESS)

**Session:** 2026-06-27 (continues). Next free numbers at section open: D-056; DOCFIX-044.

### CORRECTION (DOCFIX-044) -- phase-02 preflight hook-error key wrong (agent-status -> juju-status)

scripts/phase-02-vault-preflight.sh (committed 1a103f5) computed the agent/hook-error count
as `select(."agent-status".current=="error")`. In `juju status --format json` a UNIT carries
`workload-status` + `juju-status` (the agent state: idle/executing/error); there is NO
`agent-status` key on units. Confirmed against two authoritative consumers: deploy-watch.sh:43
(`.value."juju-status".current=="error"` for units) and the phase-03 do-doc acceptance walk
(`u.get('juju-status')`). So the `ae` (hook-failed) half of the E gate was INERT -- it read a
nonexistent key and always returned 0; a real hook failure would NOT have been caught.
- Decision impact this run: NONE. The cloud had zero errors of either kind, so 0/0 was the
  correct verdict regardless; the workload-error half (workload-status, correct key) worked.
  The defect is a latent false-negative, not a wrong decision.
- Why the harness missed it: the phase-02 mock fixtures + jq shim used the SAME wrong key, so
  the regression validated internal consistency against a fiction, not the real schema.
  LESSON (fold into conventions): mock fixtures MUST mirror the real `juju status` JSON schema;
  a fixture that agrees with the script's bug hides the bug. The phase-03 harness surfaced this
  because its fixtures (built from the do-doc's juju-status walk) disagreed with the bad key.
- FIX: phase-02 preflight `ae` -> `select(."juju-status".current=="error")` + an anti-regression
  header note. Harness corrected (fixtures + shim now use juju-status); the FAIL-E case now sets
  juju-status.current=error, so it passes only when the script reads the correct key. RE-COMMIT
  REQUIRED over 1a103f5. Re-running on the (healthy) cloud leaves the E verdict unchanged (0/0);
  the fix matters for catching FUTURE hook failures.
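
SKETCH (illustrative only, not the committed script or harness): the false negative reproduces
offline in a few lines. Assumes a hand-built fixture in the shape described above (units carry
`workload-status` + `juju-status`); the `count_unit_errors` helper and the fixture contents are
hypothetical. The unit's workload is left active to isolate the agent half of the gate.

```python
def count_unit_errors(status, key):
    """Count units whose status block `key` reports current == "error"."""
    count = 0
    for app in status.get("applications", {}).values():
        for unit in app.get("units", {}).values():
            # A missing key yields {} here, so a wrong key name never raises.
            if unit.get(key, {}).get("current") == "error":
                count += 1
    return count


# Hypothetical fixture: one unit whose juju-status.current is "error".
FIXTURE = {
    "applications": {
        "vault": {
            "units": {
                "vault/0": {
                    "workload-status": {"current": "active"},
                    "juju-status": {"current": "error"},
                }
            }
        }
    }
}

print("juju-status :", count_unit_errors(FIXTURE, "juju-status"))
print("agent-status:", count_unit_errors(FIXTURE, "agent-status"))
```

The wrong key does not fail loudly; it counts 0 against the same data, which is why only a
fixture that mirrors the real schema can expose it.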

### Step 3.1 core verify -- PASS (2026-06-27)
3.1a acceptance walk: 2 non-active/idle, BOTH expected -- glance-simplestreams-sync/0 (unknown,
image-sync state) + octavia/0 (blocked "Awaiting configure-resources", D-021). No TLS consumer
stuck. 3.1b haproxy backend-health sweep (D-045/DOCFIX-031): ZERO DOWN across all principal
units -- the plaintext-vs-SSL backend failure did NOT recur this cycle (cert cascade + haproxy
reload state healthy). No remediation needed.

### DELIVERABLE -- scripts/phase-03-core-verify.sh + scripts/phase03_accept_walk.py + tests/phase-03/
Read-only Step 3.1 gate packaging 3.1a (acceptance walk) + 3.1b (haproxy sweep). HARDENED beyond
the do-doc's bare count gate: phase03_accept_walk.py gates on IDENTITY -- only octavia
(blocked/configure-resources) and glance-simplestreams-sync (unknown/waiting) may be
non-active/idle; any other app blocked in place of one of them still yields count==2 yet
correctly FAILS. The do-doc's inline python-in-bash acceptance walk is moved to its own tested
.py (convention); the haproxy sweep's unit list comes from jq on the captured snapshot (no second
juju call, no inline python).
Mutations stay gated: a DOWN backend's `haproxy -c` + `systemctl reload` is a per-unit human step;
Step 3.2 (admin-openrc) and 3.3 (Horizon) too. tests/phase-03/: unit-tests the .py
(pass/unexpected-blocked) + behavior-tests the .sh with juju+jq shims (settled / unexpected-unit /
injected haproxy-DOWN). ALL PASS, offline, no real jq. Real-jq/real-data: 3.1a+3.1b already ran by
hand this session and PASSED; the script reproduces them.
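
SKETCH (illustrative only; the real phase03_accept_walk.py is not reproduced here): a minimal
identity gate under stated assumptions. The `EXPECTED` table, the `walk` and `unit` helpers, and
the requirement that an allowed exception still has an idle agent are all hypothetical; the
message substring is taken from the 3.1a note above.

```python
# app -> (allowed workload states, required message substring or None). Assumed values.
EXPECTED = {
    "octavia": ({"blocked"}, "configure-resources"),
    "glance-simplestreams-sync": ({"unknown", "waiting"}, None),
}


def unit(workload, agent="idle", message=""):
    """Build one unit entry in the juju status shape used by this sketch."""
    return {
        "workload-status": {"current": workload, "message": message},
        "juju-status": {"current": agent},
    }


def walk(status):
    """Return (non_settled_count, sorted unexpected unit names)."""
    non_settled = 0
    unexpected = []
    for app_name, app in status.get("applications", {}).items():
        for unit_name, entry in app.get("units", {}).items():
            workload = entry.get("workload-status", {})
            agent = entry.get("juju-status", {}).get("current")
            if workload.get("current") == "active" and agent == "idle":
                continue
            non_settled += 1
            states, needle = EXPECTED.get(app_name, (set(), None))
            allowed = (
                agent == "idle"
                and workload.get("current") in states
                and (needle is None or needle in workload.get("message", ""))
            )
            if not allowed:
                unexpected.append(unit_name)
    return non_settled, sorted(unexpected)


SETTLED = {"applications": {
    "keystone": {"units": {"keystone/0": unit("active")}},
    "octavia": {"units": {"octavia/0": unit("blocked", message="Awaiting configure-resources")}},
    "glance-simplestreams-sync": {"units": {"glance-simplestreams-sync/0": unit("unknown")}},
}}

# Same count of 2, but a different app is blocked: a bare count gate would pass this.
SWAPPED = {"applications": {
    "octavia": {"units": {"octavia/0": unit("active")}},
    "nova-cloud-controller": {"units": {"nova-cloud-controller/0": unit("blocked")}},
    "glance-simplestreams-sync": {"units": {"glance-simplestreams-sync/0": unit("unknown")}},
}}

print(walk(SETTLED))
print(walk(SWAPPED))
```

Both fixtures report a count of 2; only the list of unexpected units tells them apart, which is
the reason for gating on identity instead of count.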

### Next-free numbers after this append
Design decision: D-056. Doc fix: DOCFIX-045.
  DOCFIX-044 = phase-02 preflight hook-error key agent-status -> juju-status (+ harness fix).

### Artifact validation -- all four confirmed (2026-06-27, post-commit, on jumphost)
1. tests/phase-02/run-tests.sh: ALL PASS / 0 (corrected FAIL-E now drives juju-status.current=error).
2. tests/phase-03/run-tests.sh: ALL PASS / 0 (accept-walk + haproxy gate incl injected-DOWN).
3. scripts/phase-03-core-verify.sh LIVE: PROCEED / 0 -- accept walk 2 expected (glance-ss unknown,
   octavia blocked); haproxy sweep ZERO DOWN across 31 principal units. Reproduces manual 3.1.
4. scripts/phase-02-vault-preflight.sh LIVE: HOLD / 1 on gate D ONLY (vault/0 now [active],
   units=1 fresh=0) -- irreversibility guard correctly refusing re-init of a live vault. B/C/E pass.
   DOCFIX-044 closed with REAL-DATA confirmation: the corrected `ae` ran live and reported
   agent-error(hook)=0 via juju-status (post-settle census: units=63, workload-error=0,
   agent-error=0, blocked=1 [octavia], waiting=0, active=61, unknown=1).

### Step 3.2 build admin-openrc -- PASS (2026-06-27)
Vault root CA pulled via get-root-ca --format json + jq (DOCFIX-021 path): CN=Vault Root
Certificate Authority (charm-pki-local), valid 2026-06-27 -> 2036-06-24. Admin password from
get-admin-password --format json; admin project DISCOVERED via the scope-test loop (DOCFIX-022;
value recorded in ~/admin-openrc OS_PROJECT_NAME, not captured this turn). ~/admin-openrc written
(chmod 600); `openstack endpoint list` authenticated and returned the full catalog -> confirms a
SCOPED token (the gate). Endpoints IP-only on the three D-052 planes:
  public   -> provider VIP    10.12.4.5x
  internal -> metal-internal  10.12.12.5x
  admin    -> metal-admin     10.12.8.5x    (keystone admin on :35357)
VIP octets match bundle: keystone .50, barbican .51, cinderv3 .52, glance .53, magnum .54,
neutron .55, nova .56, octavia .57, placement .59, radosgw/s3/swift .60:443.
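
SKETCH (illustrative only, not repo code): the interface-to-plane mapping above, checked by CIDR
membership. The `endpoint_on_expected_plane` helper and the sample URLs (scheme and the :5000
port) are assumptions for illustration; only the CIDRs and the :35357 admin port come from this log.

```python
import ipaddress
from urllib.parse import urlsplit

# Interface -> plane CIDR, per the D-052 plane set recorded in this log.
PLANE_BY_INTERFACE = {
    "public": ipaddress.ip_network("10.12.4.0/22"),     # provider-public
    "internal": ipaddress.ip_network("10.12.12.0/22"),  # metal-internal
    "admin": ipaddress.ip_network("10.12.8.0/22"),      # metal-admin
}


def endpoint_on_expected_plane(interface, url):
    """True if the endpoint URL's host is an IP inside the plane for `interface`."""
    host = urlsplit(url).hostname
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # endpoints are IP-only in v1, so a hostname is a mismatch
    return addr in PLANE_BY_INTERFACE[interface]


# Hypothetical keystone endpoint rows.
SAMPLE = [
    ("public", "https://10.12.4.50:5000/v3"),
    ("internal", "https://10.12.12.50:5000/v3"),
    ("admin", "https://10.12.8.50:35357/v3"),
]

for interface, url in SAMPLE:
    print(interface, url, endpoint_on_expected_plane(interface, url))
```

An internal endpoint still sitting on 10.12.8.5x (the pre-D-052 layout) returns False here.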

### DOCFIX-045 -- phase-03 do-doc 3.2 gate text is pre-D-052 (internal plane)
The 3.2 GATE text reads "internal+admin on the metal VIP .8.5x" -- predates D-052's dedicated
metal-internal plane. LIVE (correct) shows INTERNAL on metal-internal 10.12.12.5x and ADMIN on
metal-admin 10.12.8.5x (bundle triple-VIP "10.12.4.5x 10.12.8.5x 10.12.12.5x" + D-052 internal
binding). Amend the 3.2 gate to: public provider .4.5x; internal metal-internal .12.5x; admin
metal-admin .8.5x; keystone admin :35357.
ALSO (value drift, non-blocking): gss image-stream endpoint is HTTP on metal 10.12.8.226 this
deploy (do-doc note said .172) -- the simplestreams image-stream IP is per-deploy; note as
dynamic, do not hardcode. s3/swift on radosgw VIP .60:443 -- re-check vs radosgw :80 listener
during any Swift/S3 smoke (carried-forward do-doc note).

### Step 3.3 Horizon nginx reverse proxy -- PASS (2026-06-27)
v1 Horizon = PLAIN-HTTP reverse-proxy leg per D-044 (authoritative, adopted 2026-06-17). NO nginx
edit was needed: the existing /etc/nginx/sites-available/openstack vhost on the nginx host
(10.12.4.7) already proxies `listen 81` -> `proxy_pass http://10.12.4.58:80` at the CURRENT
dashboard provider VIP (.58 confirmed vs bundle), with `proxy_set_header Host $http_host` (B5
ALLOWED_HOSTS) + X-Forwarded-*. No proxy_ssl_* applied (that is the Roosevelt root-fix, not v1).
The vhost's "Main LXD UI" comment is a stale mislabel (it is the Horizon proxy) -- cosmetic,
flag for consolidation cleanup; left untouched to avoid mutating a working MAAS-fronting host.
Live scheme probes (decisive, verify-before-mutate, from both jumphost and nginx host):
  jumphost->.58  https rc=000 FAIL(35) | http rc=200
  nginx->.58     https rc=000 FAIL(35) | http rc=200
  s_client .58:443 -> CONNECTED but "no peer certificate available" (certless :443 listener)
  => dashboard serves Horizon over HTTP :80; :443 is an unused, certless haproxy frontend.
The certless :443 is EXPECTED under D-044 (v1 does not use dashboard HTTPS). The bundle's
openstack-dashboard:certificates<->vault:certificates relation provisions a cert, but the v1
plain-HTTP leg never serves it. NOT a v1 defect; the Roosevelt DNS + FQDN-cert workstream is the
end-to-end HTTPS root-fix. The earlier dashboard-SAN probe was therefore moot (proxy_ssl_name is
a Roosevelt concern, not v1).
Steps executed:
  A (nginx host, read-only): curl -sI http://127.0.0.1:81/horizon/ -> HTTP/1.1 302 Found (login
    redirect). GATE A met.
  B (jumphost, the one v1 mutation, PER-REBUILD, verbatim do-doc): juju ssh
    openstack-dashboard/leader wrote _99_internal_http_cookies.py (CSRF_COOKIE_SECURE=False +
    SESSION_COOKIE_SECURE=False, ASCII-only) + systemctl reload apache2. Clean.
  C (jumphost, verify adapted https->http per DOCFIX-046): csrftoken Set-Cookie present, no Secure
    attribute -> "OK: csrftoken not Secure". GATE C met.
  D: external browser login over http://10.17.11.246:81/horizon/ SUCCEEDED -- Horizon Overview
    renders as admin_domain/admin, fresh-cloud quotas 0-of-N. "Not secure" address bar = expected
    (plain-HTTP client leg, D-044). GATE D met.
PHASE-03 EXIT GATE MET: 3.1 PASS (accept walk 2-expected + haproxy ZERO DOWN across 31 principals),
3.2 PASS (admin-openrc + scoped catalog), 3.3 PASS (Horizon reachable + login).
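
SKETCH (illustrative only, not the do-doc's verify command): the GATE C condition (csrftoken is
set and carries no Secure attribute) expressed over raw Set-Cookie header values. The helper
names, the FAIL strings, and the sample headers are hypothetical; the OK string is the one
recorded in step C.

```python
def find_cookie(set_cookie_headers, name):
    """Return the first Set-Cookie header value that sets cookie `name`, else None."""
    for header in set_cookie_headers:
        if header.split("=", 1)[0].strip() == name:
            return header
    return None


def has_secure_attribute(set_cookie_header):
    """True if the cookie carries the Secure attribute (attribute names are case-insensitive)."""
    attributes = [part.strip().lower() for part in set_cookie_header.split(";")[1:]]
    return "secure" in attributes


def gate_c(set_cookie_headers):
    header = find_cookie(set_cookie_headers, "csrftoken")
    if header is None:
        return "FAIL: csrftoken not set"
    if has_secure_attribute(header):
        return "FAIL: csrftoken is Secure"
    return "OK: csrftoken not Secure"


# Hypothetical header values as a plain-HTTP login page might return them.
AFTER_OVERRIDE = ["csrftoken=abc123; Path=/; SameSite=Lax"]
BEFORE_OVERRIDE = ["csrftoken=abc123; Path=/; SameSite=Lax; Secure"]

print(gate_c(AFTER_OVERRIDE))
print(gate_c(BEFORE_OVERRIDE))
```

A Secure cookie is not sent back by the browser over the plain-HTTP client leg, which is why the
step B override is needed before login can succeed under D-044.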

### DOCFIX-046 -- phase-03 do-doc 3.3 carries an abandoned HTTPS-upstream edit set (contradicts D-044)
The phase-03 do-doc Step 3.3 body contains BOTH (a) an HTTPS-upstream edit set -- proxy_pass
https://10.12.4.58:443, proxy_ssl_verify on, proxy_ssl_trusted_certificate, proxy_ssl_name + a
dashboard-cert SAN discovery -- AND (b) the real "the upstream stays PLAIN HTTP (as-built)" line.
These contradict. D-044 (authoritative) resolves it: v1 is the plain-HTTP leg; the proxy_ssl_name
/ HTTPS-upstream handling is the ROOSEVELT root-fix, not v1. The (a) block, if applied on the
testcloud, would repoint nginx at the certless :443 and BREAK Horizon (curl 35) -- exactly what
the live probes confirmed would happen.
Also: the do-doc's D-044 VERIFY command uses `curl --cacert ... https://10.12.4.58/...` -- same
HTTPS assumption; it fails (rc=000/35) against the v1 HTTP dashboard. Adapt to
`curl ... http://10.12.4.58/horizon/auth/login/`.
FIX (for completion consolidation): rewrite 3.3 to the v1 plain-HTTP path (verify the existing
vhost points at the current dashboard VIP over http:80; no proxy_ssl_*; apply the cookie override;
verify over http); move the proxy_ssl_*/SAN block verbatim into a clearly-marked "Roosevelt
root-fix (DNS+FQDN certs)" subsection so a future operator does not apply it on the testcloud; fix
the verify command https->http. Cross-ref D-044. Also fix the stale "Main LXD UI" vhost comment.
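
SKETCH (illustrative only, not a repo deliverable): a lint that would flag the abandoned
HTTPS-upstream edit set if it were applied to the v1 vhost. The `lint_v1_vhost` helper, the
message strings, and both sample vhosts are hypothetical; the `proxy_ssl_name` value in the
second sample is a placeholder, not the real dashboard name.

```python
import re

# Directives that belong to the Roosevelt root-fix and must not appear in the v1 vhost.
FORBIDDEN = [
    (re.compile(r"^\s*proxy_pass\s+https://", re.MULTILINE), "proxy_pass uses an https upstream"),
    (re.compile(r"^\s*proxy_ssl_\w+\s", re.MULTILINE), "proxy_ssl_* directive present"),
]


def lint_v1_vhost(text):
    """Return the list of v1 violations found in an nginx vhost body."""
    return [message for pattern, message in FORBIDDEN if pattern.search(text)]


V1_VHOST = """
server {
    listen 81;
    location / {
        proxy_pass http://10.12.4.58:80;
        proxy_set_header Host $http_host;
    }
}
"""

ROOSEVELT_VHOST = """
server {
    listen 81;
    location / {
        proxy_pass https://10.12.4.58:443;
        proxy_ssl_verify on;
        proxy_ssl_name dashboard.example.invalid;
    }
}
"""

print(lint_v1_vhost(V1_VHOST))
print(lint_v1_vhost(ROOSEVELT_VHOST))
```

Commented-out directives are ignored because each pattern anchors on the directive at the start
of a line.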

### Phase-04 prep -- network-carve verify deliverable + DOCFIX-047 (2026-06-27, pre-execution)
New read-only deliverable staged ahead of running phase-04 (network carve):
  scripts/phase-04-network-verify.sh -- verify-before-mutate + EXIT-GATE check for the
    Neutron external provider network. PRE gate: discovers the MAAS provider subnet BY CIDR
    (10.12.4.0/22) -- lib-net PATTERN-1, never a hardcoded subnet id -- asserts its gateway ==
    pinned PLANE_GW (10.12.4.1) and that the FIP pool 10.12.5.0-10.12.7.254 is a RESERVED
    iprange on it (KI-P3-001). POST gate (auto-detected if provider-ext exists): external/flat/
    physnet1/NOT-shared + subnet cidr/gateway/no-dhcp/FIP-pool. Sources lib-net.sh + need_jq;
    requires admin-openrc sourced + the 'admin' MAAS profile; never calls 'maas list' (DOCFIX-016).
    Exit 0 PROCEED|PASS / 1 HOLD|FAIL / 2 precondition. Mutates nothing.
  tests/phase-04/ -- offline regression (real jq + fake maas/openstack data shims; no live
    infra). 7/7 green: PRE PROCEED (net absent); POST PASS for BOTH allocation_pools shapes
    (list-of-objects AND list-of-strings -- tolerance proven, not assumed, so the live client's
    shape cannot silently break the gate); and four failure variants (FIP pool not reserved;
    wrong gateway; provider subnet absent-by-CIDR; provider-ext shared=true). bash -n clean;
    shellcheck 0.9.0 clean (no warnings) on script + harness + shims; ASCII + 0 CR on all five.
  NOTE: fixtures put the provider subnet at id=7 (NOT 1) on purpose, to prove CIDR discovery is
  id-independent (the exact failure mode DOCFIX-047 guards against).
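
SKETCH (illustrative only, not the committed gate): tolerating both allocation_pools shapes by
normalizing before comparison. The `normalize_pools` / `fip_pool_match` helpers are hypothetical,
and the exact string form ("start-end") of the list-of-strings shape is an assumption.

```python
EXPECTED_FIP_POOL = [("10.12.5.0", "10.12.7.254")]


def normalize_pools(pools):
    """Return sorted (start, end) tuples from list-of-objects or list-of-strings input."""
    result = []
    for pool in pools:
        if isinstance(pool, dict):
            result.append((pool["start"], pool["end"]))
        else:
            # Assumed string form: "start-end"; IPv4 addresses contain no "-".
            start, end = pool.split("-", 1)
            result.append((start.strip(), end.strip()))
    return sorted(result)


def fip_pool_match(pools):
    return normalize_pools(pools) == EXPECTED_FIP_POOL


AS_OBJECTS = [{"start": "10.12.5.0", "end": "10.12.7.254"}]
AS_STRINGS = ["10.12.5.0-10.12.7.254"]

print(fip_pool_match(AS_OBJECTS), fip_pool_match(AS_STRINGS))
```

Comparing normalized tuples means the gate's verdict depends on the pool values only, not on
which shape the client happens to emit.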

### DOCFIX-047 -- phase-04 do-doc hardcodes the provider MAAS subnet id (violates PATTERN-1)
  runbooks/phase-04-network-carve.md reads the provider gateway via `maas admin subnet read 1`
  and its CHECK prose says "subnet id 1 (provider)" / "subnet id 2 (metal)" -- the PRE-D-052
  two-plane numbering. lib-net.sh:9 records that the D-052 cutover renumbered subnets (metal-
  internal moved id 6 -> 10), so a hardcoded `read 1` may now read the WRONG subnet. FIX (for
  completion consolidation): replace `subnet read 1` / the "subnet id N" prose with CIDR-based
  discovery (select(.cidr=="10.12.4.0/22")), exactly as scripts/phase-04-network-verify.sh does;
  cross-ref the verify script from the do-doc's CHECK block. Not yet applied to the do-doc.
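
SKETCH (illustrative only; lib-net.sh and the verify script are not reproduced here): CIDR-based
discovery over an already-parsed subnet list. The `find_subnet_by_cidr` helper is hypothetical,
and the field names (`id`, `cidr`, `gateway_ip`) are an assumption about the MAAS subnet JSON.
The fixture deliberately puts the provider subnet at id=7, as the harness does.

```python
def find_subnet_by_cidr(subnets, cidr):
    """Return the single subnet whose cidr matches, or raise if absent or ambiguous."""
    matches = [s for s in subnets if s.get("cidr") == cidr]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one subnet with cidr {cidr}, found {len(matches)}")
    return matches[0]


# Hypothetical fixture: id 1 is NOT the provider subnet here.
SUBNETS = [
    {"id": 1, "cidr": "10.12.32.0/22", "gateway_ip": None},
    {"id": 7, "cidr": "10.12.4.0/22", "gateway_ip": "10.12.4.1"},
]

provider = find_subnet_by_cidr(SUBNETS, "10.12.4.0/22")
print(provider["id"], provider["gateway_ip"])
```

Against this fixture a hardcoded read of id 1 would return the storage CIDR with no gateway; the
CIDR lookup returns the provider subnet regardless of its id.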

### Phase-04 EXECUTED -- network carve COMPLETE (2026-06-27)
Step 4.1 create block ran clean (do-doc idempotent `( set -e )`, with the DOCFIX-047
CIDR-discovery correction for the gateway; the `[ GW = 10.12.4.1 ]` gate retained as belt+braces):
  network provider-ext     = bb386c86-d646-4c71-b6b7-550f5c691bfb  (created + tagged role=provider)
  subnet  provider-ext-fip = 544afa6a-b0cf-486b-89be-2b8e36983072  (created + tagged)
  (object IDs regenerate per deploy; the do-doc's As-built IDs are dead post-teardown, not a discrepancy.)
CONFIRM: provider-ext external=true type=flat physnet=physnet1 shared=false;
  provider-ext-fip cidr=10.12.4.0/22 gateway=10.12.4.1 enable_dhcp=false
  allocation_pools=[{start:10.12.5.0,end:10.12.7.254}] tags=[role=provider, netbox-iprange=10.12.5.0-10.12.7.254].
phase-04-network-verify.sh POST gate: PASS -- EXIT GATE met (all network+subnet assertions green;
  fip-pool-match=true). Live allocation_pools came back as the list-of-OBJECTS shape -- the real
  client emits {start,end} objects; the harness string-shape case is confirmed safety-margin only.
  PRE re-run also PASS (provider subnet by CIDR id=1 this deploy; gateway pinned; FIP reserved).
PHASE-04 EXIT GATE MET. FIP allocation + tenant router gateways now possible (needed by phase-06
mgmt-VM FIP; phase-08 cluster FIPs + LB validation).
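
SKETCH (illustrative only, not the committed POST gate): the CONFIRM assertions as a table-driven
comparison. It assumes the live values have already been reduced to the labels used in the
CONFIRM line above; the real client's JSON key names are not reproduced, and the `mismatches`
helper is hypothetical.

```python
EXPECTED_NETWORK = {"external": True, "type": "flat", "physnet": "physnet1", "shared": False}
EXPECTED_SUBNET = {"cidr": "10.12.4.0/22", "gateway": "10.12.4.1", "enable_dhcp": False}


def mismatches(actual, expected):
    """Return one message per expected key whose actual value differs or is missing."""
    return [
        f"{key}: expected {want!r}, got {actual.get(key)!r}"
        for key, want in expected.items()
        if actual.get(key) != want
    ]


# Values as recorded in the CONFIRM line.
live_network = {"external": True, "type": "flat", "physnet": "physnet1", "shared": False}
live_subnet = {"cidr": "10.12.4.0/22", "gateway": "10.12.4.1", "enable_dhcp": False}

failures = mismatches(live_network, EXPECTED_NETWORK) + mismatches(live_subnet, EXPECTED_SUBNET)
print("PASS" if not failures else failures)
```

Reporting every mismatch, instead of stopping at the first, keeps a failed gate diagnosable from
a single run.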

DOCFIX-047 CONFIRMED LIVE: provider resolved to subnet id=1 THIS deploy, so the do-doc's
`subnet read 1` would have worked by luck -- but CIDR discovery is the correct id-independent
pattern (lib-net.sh:9: cutover moved metal-internal 6->10) and ran clean. Do-doc fix still pending
at consolidation.

### DOCFIX-048 -- phase-04 do-doc IPAM reference VIP-reserve width drift
  The do-doc "IPAM carve reference" lists the provider VIP reserve as 10.12.4.2-10.12.4.63 (front-
  loaded /26). LIVE MAAS shows the WIDER reserve 10.12.4.2-10.12.4.100 (comment "supersedes
  .224-.236") -- the D-052 "VIP reserve ceilings" correction. Both sit entirely in .4.x, OUTSIDE
  the FIP pool (10.12.5.0-10.12.7.254) -> no conflict; provider-ext created cleanly. The live
  mgmt-plane reserve 10.12.4.101-10.12.4.110 is also present (already in the do-doc As-built note).
  FIX (consolidation): update the do-doc IPAM reference VIP-reserve from .2-.63 to .2-.100 to match
  live + D-052. Non-blocking.
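
SKETCH (illustrative only): the no-conflict claim above as integer range arithmetic. The range
labels and the `to_range` / `overlaps` / `conflicts` helpers are hypothetical; the addresses are
the ones recorded in this entry.

```python
import ipaddress


def to_range(start, end):
    """Convert a dotted start/end pair to an inclusive integer range."""
    return int(ipaddress.ip_address(start)), int(ipaddress.ip_address(end))


def overlaps(a, b):
    """True if two inclusive ranges share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


def conflicts(ranges):
    """Return every overlapping pair of named ranges, in sorted-name order."""
    names = sorted(ranges)
    return [
        (x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
        if overlaps(ranges[x], ranges[y])
    ]


LIVE = {
    "vip-reserve": to_range("10.12.4.2", "10.12.4.100"),
    "mgmt-reserve": to_range("10.12.4.101", "10.12.4.110"),
    "fip-pool": to_range("10.12.5.0", "10.12.7.254"),
}

print(conflicts(LIVE))
```

The wider .2-.100 reserve ends one address below the mgmt-plane reserve and well below the FIP
pool, so widening it from .2-.63 introduces no overlap.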

NOTE (repo hygiene, operator decision pending): all scripts on origin are committed mode 100644
(the Windows/GitHub-Desktop path strips +x), so the jumphost must invoke them as `bash scripts/X.sh`
(`./scripts/X.sh` -> Permission denied). Two durable fixes offered: (a) standardize do-docs on
`bash scripts/...`; or (b) a one-time mode fix from Git Bash, then commit (records mode 100755 in
the tree):

    git update-index --chmod=+x scripts/*.sh tests/*/run-tests.sh tests/*/fakebin/*

Not yet actioned.

### Next-free numbers
Design decision: D-056. Doc fix: DOCFIX-049.