# jumphost provider-vip L3 gateway (virbr1.104 = 10.12.8.1) -- D-057

Provisions the L3 gateway that makes provider <-> provider-vip routing real on the
jumphost (vopenstack-jesse, 10.17.11.246). provider-vip (10.12.8.0/22, VID 104)
rides the SAME libvirt bridge as provider (virbr1), tagged; the jumphost already
routes between its directly-connected planes (ip_forward=1), so once virbr1.104 =
10.12.8.1 exists, tenant SNAT (on provider 10.12.5-7) reaches the API VIPs on
10.12.8.50-60 and back.

WHY A RUNBOOK, NOT A SCRIPT: this is a one-time, consequential host change. The real
risk is how persistence interacts with libvirt (virbr1 is libvirt-managed, created at
libvirtd start) -- which a fixture test cannot exercise. It is also NOT portable to
Roosevelt (no virbr1 there; the provider-vip gateway is a physical router/SVI). So it
is gated and human-run, per the project's "human gates own consequential mutations".

NOT required for: the MAAS plane stand-up, or the carve. The MAAS subnet records
gateway_ip=10.12.8.1 as metadata regardless. REQUIRED before: D-011 #3 (tenant ->
API reachability) and any provider<->provider-vip traffic test.

================================================================================
## PHASE 1 -- AUDIT (read-only). Run, paste back; this picks the persistence method.
================================================================================

--- BEGIN runbook block: gw-01-audit (RUN ON jumphost) ---
echo "=== G1: virbr1 must pass tagged frames (VID 104). MUST be 0 ==="
cat /sys/class/net/virbr1/bridge/vlan_filtering 2>/dev/null \
  || echo "WARN: virbr1 has no bridge/vlan_filtering node -- investigate before proceeding"

echo "=== G2: ip_forward must be 1 ==="
cat /proc/sys/net/ipv4/ip_forward

echo "=== G3: virbr1 detail (is it a bridge? up? who owns it?) ==="
ip -d link show virbr1 | sed -n '1,6p'
ip -br addr show virbr1

echo "=== G4: libvirt 1_provider net -- autostart + forward mode (NAT double-NAT note) ==="
sudo virsh net-info 1_provider 2>/dev/null
sudo virsh net-dumpxml 1_provider 2>/dev/null | sed -n '1,30p'

echo "=== G5: is virbr1 touched by netplan? (decides systemd-vs-netplan persistence) ==="
ls -1 /etc/netplan/ 2>/dev/null
sudo grep -RnE 'virbr1|10\.12\.4\.1|10\.12\.8\.' /etc/netplan/ 2>/dev/null || echo "netplan: no virbr1 / .8 references"

echo "=== G6: must NOT already exist ==="
ip -br addr show | grep -E 'virbr1\.104|10\.12\.8\.' || echo "clean: no virbr1.104 / .8 yet"
--- END runbook block: gw-01-audit ---

STOP. Decision from the audit:
- G1 != 0  -> STOP. VID 104 will not traverse virbr1; the tagged-secondary approach
  needs rework. This is the same hard gate as the MAAS stand-up.
- G5 shows virbr1 already managed in netplan -> prefer the NETPLAN persistence
  variant (Phase 3B) to avoid two managers fighting.
- G5 shows virbr1 is purely libvirt (the expected case) -> use the SYSTEMD ONESHOT
  variant (Phase 3A): it orders cleanly after libvirtd and won't race a netplan that
  doesn't manage virbr1.
- G4 autostart != yes -> enable it (`sudo virsh net-autostart 1_provider`) so virbr1
  exists at boot before the gateway unit runs.
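
SKETCH (optional, read-only): the decision rules above can be written as a small
helper that prints a recommendation and changes nothing. It is not a gated runbook
block. The function name `pick_persistence` and its three arguments are hypothetical;
you pass in the values by hand after reading the audit output.

```shell
# pick_persistence VLAN_FILTERING NETPLAN_MANAGES_VIRBR1 AUTOSTART
#   VLAN_FILTERING          value printed by G1 (expected: 0)
#   NETPLAN_MANAGES_VIRBR1  yes|no, judged from the G5 output
#   AUTOSTART               yes|no, from the "Autostart:" line of the G4 net-info output
# Hypothetical helper: prints a recommendation only, touches nothing on the host.
pick_persistence() {
  vlan_filtering=$1 netplan_managed=$2 autostart=$3

  # G1 is a hard gate: anything other than 0 stops the runbook.
  if [ "$vlan_filtering" != "0" ]; then
    echo "STOP: G1 vlan_filtering=$vlan_filtering (must be 0)"
    return 1
  fi

  # G4: virbr1 must exist at boot before the gateway unit runs.
  if [ "$autostart" != "yes" ]; then
    echo "FIX: sudo virsh net-autostart 1_provider"
  fi

  # G5 decides between the two persistence variants.
  if [ "$netplan_managed" = "yes" ]; then
    echo "USE: Phase 3B (netplan)"
  else
    echo "USE: Phase 3A (systemd oneshot)"
  fi
}
```

Example: `pick_persistence 0 no yes` covers the expected case (purely libvirt-managed
virbr1, autostart already on) and recommends Phase 3A.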

================================================================================
## PHASE 2 -- RUNTIME (reversible; proves it works before persisting)
================================================================================

GATE. Brings the gateway up immediately (lost on reboot -- Phase 3 persists it).
Fully reversible via the rollback block.
--- BEGIN runbook block: gw-02-runtime ---
sudo ip link add link virbr1 name virbr1.104 type vlan id 104
sudo ip addr add 10.12.8.1/22 dev virbr1.104
sudo ip link set virbr1.104 up
ip -br addr show virbr1.104
ip route show 10.12.8.0/22
--- END runbook block: gw-02-runtime ---

ROLLBACK (if anything looks wrong):
  sudo ip link del virbr1.104

TEST (after the MAAS plane exists and a host carries a .8 static, e.g. post-carve):
  ping -c2 10.12.8.1                       # the gateway itself
  ping -c2 10.12.8.40                      # a host's br-prov-api static (if carved)
  # from a provider-plane host, confirm .8 is reachable via the jumphost route
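
SKETCH for the last test line, which has no command attached: on a provider-plane
host, ask the kernel which route it would use for an API VIP and look at the next
hop. The helper name `next_hop` is hypothetical. That the next hop should be the
jumphost's virbr1 address (10.12.4.1, inferred from the G5 grep pattern) is an
assumption; substitute whatever the provider plane actually uses as its gateway.

```shell
# next_hop: read `ip route get` output on stdin and print the address that
# follows "via". Prints nothing when the destination is directly connected
# (no "via" in the output).
next_hop() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "via") { print $(i + 1); exit } }'
}

# Usage, on a provider-plane host (10.12.8.50 is in the API VIP range):
#   ip route get 10.12.8.50 | next_hop
#   ping -c2 10.12.8.1
```

If `next_hop` prints an address that is not the jumphost, the host is sending .8
traffic somewhere else and the ping result says nothing about this gateway.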

NOTE (libvirt NAT, cosmetic): 1_provider is forward mode=nat, so connections from
provider (10.12.4.0/22) to provider-vip (10.12.8.0/22) may be masqueraded to the
jumphost's own address. They still work: replies are matched statefully, and the API
does not care about source IP. If you later want symmetric, un-NATed
provider<->provider-vip routing, add an iptables RETURN rule ahead of the libvirt
masquerade for 10.12.4.0/22 <-> 10.12.8.0/22 -- optional, not needed for v1.
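
SKETCH of that optional exemption. It only prints the commands so they can be
reviewed before anyone runs them. Assumptions: libvirt is using its iptables backend
(with an nftables backend these rules do not apply), the masquerade is reached from
the nat table's POSTROUTING chain, and the helper name `nat_exempt_cmds` is
hypothetical. Rules inserted this way are not persistent, and their position relative
to libvirt's own rules may change when libvirt rebuilds its firewall. This covers the
nat table only; libvirt's filter rules for a NAT-mode network are a separate matter.

```shell
# nat_exempt_cmds A B: print iptables commands that exempt traffic between
# subnets A and B from NAT, in both directions. RETURN in the built-in
# POSTROUTING chain hands the packet to the chain policy (normally ACCEPT),
# so later masquerade rules are never evaluated for it.
nat_exempt_cmds() {
  a=$1 b=$2
  echo "iptables -t nat -I POSTROUTING 1 -s $a -d $b -j RETURN"
  echo "iptables -t nat -I POSTROUTING 1 -s $b -d $a -j RETURN"
}

# Review the output first; to apply, run each printed line with sudo.
nat_exempt_cmds 10.12.4.0/22 10.12.8.0/22
```

To undo, run the same lines with `-D POSTROUTING` in place of `-I POSTROUTING 1`.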

================================================================================
## PHASE 3 -- PERSISTENCE (pick ONE per the Phase-1 decision)
================================================================================

### 3A -- systemd oneshot (RECOMMENDED for libvirt-managed virbr1)
Orders after libvirtd; idempotent (deletes any stale virbr1.104 first).
--- BEGIN runbook block: gw-03a-systemd ---
sudo tee /etc/systemd/system/provider-vip-gw.service >/dev/null <<'UNIT'
[Unit]
Description=provider-vip L3 gateway (virbr1.104 = 10.12.8.1) -- D-057
After=libvirtd.service network-online.target
Wants=network-online.target
Requires=libvirtd.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=-/sbin/ip link del virbr1.104
ExecStart=/sbin/ip link add link virbr1 name virbr1.104 type vlan id 104
ExecStart=/sbin/ip addr add 10.12.8.1/22 dev virbr1.104
ExecStart=/sbin/ip link set virbr1.104 up
ExecStop=/sbin/ip link del virbr1.104
[Install]
WantedBy=multi-user.target
UNIT
sudo systemctl daemon-reload
sudo systemctl enable --now provider-vip-gw.service
systemctl --no-pager status provider-vip-gw.service | sed -n '1,6p'
ip -br addr show virbr1.104
--- END runbook block: gw-03a-systemd ---

Persistence test (the real proof): `sudo reboot`, then after it returns
`ip -br addr show virbr1.104` must show 10.12.8.1/22 UP. (libvirt 1_provider must be
autostart -- see G4 -- so virbr1 exists when the unit runs.)
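
SKETCH of the post-reboot check as a pass/fail test rather than an eyeball read. The
helper name `gw_line_ok` is hypothetical; it inspects the single line that
`ip -br addr show virbr1.104` prints (name, state, addresses) and nothing else.

```shell
# gw_line_ok: read `ip -br addr show virbr1.104` output on stdin.
# Succeeds only if the state column is UP and 10.12.8.1/22 is among the addresses.
# Empty input (interface missing) fails.
gw_line_ok() {
  awk '
    $2 == "UP" { for (i = 3; i <= NF; i++) if ($i == "10.12.8.1/22") ok = 1 }
    END { exit (ok ? 0 : 1) }
  '
}

# Usage after the reboot:
#   ip -br addr show virbr1.104 | gw_line_ok && echo "gateway persisted" || echo "gateway MISSING"
```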

ROLLBACK 3A:
  sudo systemctl disable --now provider-vip-gw.service
  sudo rm -f /etc/systemd/system/provider-vip-gw.service && sudo systemctl daemon-reload
  sudo ip link del virbr1.104 2>/dev/null || true

### 3B -- netplan (ONLY if G5 showed virbr1 already managed by netplan)
Add a vlans stanza. Risk: if virbr1 does not exist yet when netplan runs at boot
(libvirt creates it at libvirtd start), the vlan fails -- which is exactly why 3A is
preferred for a libvirt bridge. Use only if your jumphost already manages virbr1 via
netplan.
  # in the relevant /etc/netplan/*.yaml, under the top-level network: key
  #   vlans:
  #     virbr1.104:
  #       id: 104
  #       link: virbr1
  #       addresses: [10.12.8.1/22]
  # then: sudo netplan try   (auto-reverts after 120s unless you confirm at the prompt), then sudo netplan apply
ROLLBACK 3B: remove the stanza; sudo netplan apply.

================================================================================
## PHASE 4 -- VERIFY
================================================================================
--- BEGIN runbook block: gw-04-verify ---
ip -br addr show virbr1.104           # 10.12.8.1/22, UP
ip route show 10.12.8.0/22            # directly-connected via virbr1.104
cat /proc/sys/net/ipv4/ip_forward     # 1
--- END runbook block: gw-04-verify ---

DONE when virbr1.104 = 10.12.8.1/22 is UP, survives a reboot (3A), and a provider-plane
host can reach 10.12.8.x through the jumphost.
