
Design Decisions — VR0 DC0 Omega Cloud

This document is the architectural record for the VR0 DC0 testcloud rebuild. Every decision listed here was made deliberately and after discussion; this is a record, not a wishlist or a brainstorm. If a decision changes, this document is updated and the change is committed with a message that references the decision.

Scope split: This repository implements v1 (IPv4-only). Several decisions below are tagged with [v2-scope] — they remain valid design intent but are deferred to a future v2 deployment when upstream router infrastructure supports IPv6. See D-015 for the v1/v2 fork record.


D-001: Deployment target paradigm

Decision: Path 2A — Charmed OpenStack Caracal (2024.1) via Juju bundle.

Alternatives considered:

  • Path 2B — Canonical Sunbeam (microk8s-based). Rejected: discards most of the test-cloud experience accumulated to date; different operator paradigm.
  • Path 1 — Stay on Bobcat 2023.2. Rejected: defeats the purpose of a Caracal rehearsal ahead of Roosevelt bare-metal.

Consequences:

  • Bundle-based deployment, suitable for both KVM testcloud and bare-metal scale.
  • Caracal-stable channel matrix applies (see D-002).
  • EOL date is April 2027 (Caracal upstream support window).

D-002: Channel pinning matrix

Decision: Pin every charm to a Caracal-stable channel. The OVN package source is not pinned on the testcloud (Roosevelt will pin it via ovn-source).

| Charm group | Channel |
|---|---|
| OpenStack core (keystone, glance, nova-*, neutron-api, cinder, placement, octavia, barbican, designate, magnum) | 2024.1/stable |
| OVN (ovn-central, ovn-chassis, ovn-dedicated-chassis-octavia) | 24.03/stable |
| Ceph (ceph-mon, ceph-osd, ceph-radosgw if used) | squid/stable (see D-005) |
| MySQL (mysql-innodb-cluster, mysql-router subordinates) | 8.0/stable |
| RabbitMQ | 3.9/stable |
| Vault | 1.8/stable |
| etcd, easyrsa | latest/stable |

Verification source: Caracal channel matrix per Canonical Charmed OpenStack docs, current as of design date. Verify against Charmhub before deploy via scripts/pre-flight-checks.sh.
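As a rough illustration of the check that scripts/pre-flight-checks.sh could perform, the sketch below compares the channels declared in a bundle against the matrix above. It assumes the bundle has already been parsed into a dictionary; the `channel_mismatches` helper, the `EXPECTED_CHANNELS` name, and the sample excerpt are hypothetical, and the sketch does not query Charmhub.

```python
# Expected channels, transcribed from the D-002 matrix (subset for brevity).
EXPECTED_CHANNELS = {
    "keystone": "2024.1/stable",
    "ovn-central": "24.03/stable",
    "ceph-mon": "squid/stable",
    "mysql-innodb-cluster": "8.0/stable",
    "rabbitmq-server": "3.9/stable",
    "vault": "1.8/stable",
    "etcd": "latest/stable",
}


def channel_mismatches(applications):
    """Return (app, charm, found, expected) for each app off its expected channel.

    `applications` mirrors the `applications:` mapping of a Juju bundle.
    Charms that are not listed in EXPECTED_CHANNELS are skipped.
    """
    problems = []
    for app, spec in sorted(applications.items()):
        charm = spec.get("charm", app)
        expected = EXPECTED_CHANNELS.get(charm)
        if expected is not None and spec.get("channel") != expected:
            problems.append((app, charm, spec.get("channel"), expected))
    return problems


# Hypothetical bundle excerpt: vault is on the wrong channel, ceph-mon has none.
sample = {
    "keystone": {"charm": "keystone", "channel": "2024.1/stable"},
    "vault": {"charm": "vault", "channel": "2024.1/stable"},
    "ceph-mon": {"charm": "ceph-mon"},
}

for row in channel_mismatches(sample):
    print(row)
```

A missing channel is reported as a mismatch on purpose, since an unpinned charm falls outside the intent of this decision.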


D-003: Network architecture — Option B

Decision: Provider network carries BOTH ext_net (tenant FIPs + SNAT egress) AND OpenStack public API VIPs on the same L2 segment.

Rationale: During Magnum CAPI Phase 3 on the Bobcat testcloud, OCCM crashloop was traced to tenant networks being unable to reach OpenStack API endpoints — the libvirt FORWARD chain rejected cross-bridge packets between provider (virbr1) and metal (virbr2) bridges. With API VIPs on metal, tenant workloads cannot reach them. Putting API VIPs on the same network as the FIPs makes the API path tenant-reachable by construction.

Address space layout for v1 (IPv4-only):

| Range | Purpose |
|---|---|
| 10.12.4.10 – 10.12.4.223 | Neutron FIP pool |
| 10.12.4.224 – 10.12.4.254 | Charm API VIPs (excluded from Neutron allocation_pools) |

The Provider /22 (10.12.4.0/22) carries both ranges within a single Neutron subnet. Neutron allocation_pools MUST exclude the API VIP range.
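The layout constraint can be checked mechanically. The following is a minimal sketch using only the Python standard library; the addresses come from the table above, while the helper names (`in_subnet`, `overlaps`, `range_size`) are illustrative and not part of this repository.

```python
import ipaddress

PROVIDER = ipaddress.ip_network("10.12.4.0/22")
# (first, last) address of each range, inclusive.
FIP_POOL = (ipaddress.ip_address("10.12.4.10"), ipaddress.ip_address("10.12.4.223"))
API_VIPS = (ipaddress.ip_address("10.12.4.224"), ipaddress.ip_address("10.12.4.254"))


def in_subnet(rng, net):
    """True if both ends of the inclusive range fall inside the network."""
    return rng[0] in net and rng[1] in net


def overlaps(a, b):
    """True if two inclusive address ranges share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


def range_size(rng):
    """Number of addresses in an inclusive range."""
    return int(rng[1]) - int(rng[0]) + 1


print("FIP pool inside Provider:", in_subnet(FIP_POOL, PROVIDER))
print("API VIPs inside Provider:", in_subnet(API_VIPS, PROVIDER))
print("Ranges overlap:", overlaps(FIP_POOL, API_VIPS))
print("Sizes:", range_size(FIP_POOL), range_size(API_VIPS))
```

With the v1 values this gives 214 floating IP addresses and 31 API VIP addresses, with no overlap between the two ranges.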

v2-scope extension: The IPv6 Provider subnet adds parallel IPv6 IP Ranges for FIPs and API VIPs within a single /64. See D-004.


D-004 [v2-scope]: Dual-stack vs IPv6-only matrix

Decision (v2-scope, NOT for v1): Network role determines address family. IPv6 preferred; IPv6-only where the network has no external clients.

v1 reality: All networks are IPv4-only on the existing MAAS-provisioned layout. This matrix becomes active in v2.

| Role | IPv4 (v1) | IPv4 (v2) | IPv6 (v2) | Reasoning |
|---|---|---|---|---|
| Metal | ✓ | ✓ | ✓ | Charm-to-charm; MAAS PXE IPv4-first |
| Provider | ✓ | ✓ | ✓ | Tenant FIPs need IPv4; API VIPs reachable from both |
| Data (Geneve underlay) | ✓ | ✗ | ✓ | v2: no external clients; underlay agnostic |
| Storage (Ceph public) | ✓ | ✗ | ✓ | v2: ms-bind-ipv6: true; no external clients |
| Replication (Ceph cluster) | ✓ | ✗ | ✓ | v2: internal OSD↔OSD only |
| LBaaS Management | ✓ | ✓ | ✓ | Amphora image compatibility |
| OOB | n/a | n/a | n/a | Bare-metal-only concern |
| OpenStack Tenant pool | ✓ (v1: D-016) | | ✓ | v1 IPv4 hybrid; v2 IPv6 modeled |

D-004a [v2-scope]: Host management → Metal

Decision (v2-scope, NOT for v1): Under v2, openstack0-3 host management IPs move from storage (10.12.16.40-.43) to Metal (10.12.8.0/22) when Storage becomes IPv6-only. v1 keeps host management on storage.


D-005: Ceph release

Decision: Squid (Ceph 19, released September 2024).

Rationale: Matches Caracal default; one fewer source override in bundle; rehearses what Roosevelt will run. If Squid has rough edges, the testcloud is the place to find them, not production.

Alternatives considered:

  • Reef (Ceph 18) — current on Bobcat testcloud; lower risk; would require source: cloud:jammy-caracal override on ceph-mon/ceph-osd while keeping reef/stable channel. Rejected: defeats the rehearsal purpose.

D-006: Vault HA backend

Decision: etcd + easyrsa, per Canonical Charmed Vault HA docs.

Rationale: This is the documented charm path. The chicken-and-egg TLS dependency (Vault needs certs to start, but Vault issues certs) is resolved by easyrsa bootstrapping the etcd cluster's TLS, after which Vault relations to etcd come up cleanly.

Topology on testcloud (v1): Vault num_units=1 + hacluster relation (decorative; documents the relation pattern). Vault HA quorum is not actually exercised at testcloud scale.

Topology on Roosevelt: Vault num_units=3 + hacluster on metal space; etcd num_units=3; easyrsa num_units=1.


D-007: Magnum inclusion

Decision: Magnum in bundle from day one. Two-layer install.

Layer A — Bundle:

  • magnum charm
  • magnum-mysql-router subordinate
  • magnum-dashboard subordinate
  • Standard relations: keystone, mysql-innodb-cluster (via router), rabbitmq-server, vault (certificates), openstack-dashboard
  • Binding: public: provider with VIP on provider API VIP range
  • Hacluster relation included (decorative on testcloud)

Layer B — Post-deploy runbook (runbooks/05-magnum-capi-driver.md):

  • juju run magnum/leader domain-setup --wait=10m
  • Install stackhpc/magnum-capi-helm v0.13.0 into the magnum charm venv using pip install with --break-system-packages
  • Deploy /etc/magnum/kubeconfig pointing at capi-mgmt.maas bootstrap k3s
  • Systemd override replacing init.d ExecStart to load --config-dir
  • /etc/magnum/magnum.conf.d/99-capi.conf setting enabled_drivers=k8s_capi_helm_v1 and [capi_helm] kubeconfig_file=/etc/magnum/kubeconfig

CAPI mgmt plane: stays on capi-mgmt.maas bootstrap k3s. Not in-cloud. This pattern transfers to Roosevelt unchanged.


D-008: DNS architecture

Decision: Layered — static /etc/hosts for bootstrap + Designate (in bundle from day one) for tenant-level resolution.

Naming convention:

<service>.<cloud>.<dc>.<region>.cloud.neumatrix.local

Examples:

  • keystone.omega.dc0.vr0.cloud.neumatrix.local
  • nova.omega.dc0.vr0.cloud.neumatrix.local
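The naming convention is simple enough to generate rather than type by hand. Below is a small sketch that builds a hostname from its parts and rejects labels that are not valid lowercase DNS labels. The function name `api_fqdn` and its defaults are illustrative assumptions; only the pattern and the two example names come from this document.

```python
import re

DOMAIN_SUFFIX = "cloud.neumatrix.local"
# Lowercase letters, digits and hyphens; no leading or trailing hyphen; max 63 chars.
_LABEL = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")


def api_fqdn(service, cloud="omega", dc="dc0", region="vr0"):
    """Build <service>.<cloud>.<dc>.<region>.cloud.neumatrix.local."""
    labels = [service, cloud, dc, region]
    for label in labels:
        if not _LABEL.fullmatch(label):
            raise ValueError(f"invalid DNS label: {label!r}")
    return ".".join(labels + [DOMAIN_SUFFIX])


print(api_fqdn("keystone"))
print(api_fqdn("nova"))
```

The same helper could feed both the static /etc/hosts entries and the os-public-hostname values so the two never drift apart.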

Bootstrap order:

  1. Static /etc/hosts on jumphost + all openstack0-3 hosts + all LXD containers
  2. Bundle deploys with os-public-hostname: <fqdn> per API charm
  3. Vault issues certs with FQDN in SAN
  4. Post-deploy: Designate zone created, A records populated (v1: A records only; v2 adds AAAA records)
  5. Neutron default_dns_domain and dns_servers configured to point at Designate
  6. Tenant subnets created with --dns-nameserver <designate-vip>

D-009: Hacluster modeling at testcloud scale

Decision: Include hacluster + VIP relations at num_units=1 across all HA-eligible API charms.

Rationale: Decorative at testcloud scale (a single unit can't form a real HA quorum). Documents the relation pattern so Roosevelt scale-up is mechanical: change num_units: 1 → num_units: 3 and rerun.

Charms with hacluster relation: keystone, glance, neutron-api, nova-cloud-controller, placement, openstack-dashboard, cinder, octavia, barbican, magnum, vault, designate.


D-010: NetBox-upstream policy

Decision: NetBox is the single source of truth for IPAM at the role and cloud-level pool layer. Per-project tenant subnets are exempt under the hybrid model (D-016).

Workflow: Update NetBox → update bundle/overlay → commit both with cross-reference.

Standing imports for v1 (gating the bundle):

  • VR0 DC0 site exists in NetBox ✓
  • IPv4 prefixes for v1: Metal /22, Provider /22, LBaaS Mgmt /22 (via netbox/ipv4-prefixes-import.py) — pending
  • Provider IP Ranges for FIPs and API VIPs (same script) — pending
  • IPv4 tenant pool /16 (same script, per D-016) — pending
  • IPv6 entries marked as Reservation status (via netbox/ipv6-mark-reserved.py) — pending

Deferred to v2 (per Q2): VR0 DC0-VLANs group additions beyond VID 240 (already imported during prior session work). MAAS currently uses untagged-per-fabric; modeling additional VLANs in NetBox without corresponding network-side tagging would be misleading documentation.


D-011: Validation bar — Roosevelt-rehearsal level

Decision: Deployment is not considered successful until all of the following pass:

  1. All charms active/idle in juju status
  2. API reachability from jumphost (all public VIPs respond on hostname)
  3. API reachability from a tenant VM (Option B verification)
  4. Octavia LB pattern re-passes (round-robin, failover, recovery — per Bobcat v3 work)
  5. End-to-end Magnum CAPI cluster creation succeeds, including OCCM not crash-looping
  6. Vault unseal + auto-unseal-after-reboot pattern verified
  7. KVM snapshot baseline taken (Phase 5)
  8. Designate zones populated and tenant VMs resolve API hostnames

Validation script: scripts/validate.sh (TBD).
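Since scripts/validate.sh is still to be written, here is one possible shape for the first check (all charms active/idle), sketched in Python. It assumes the output of `juju status --format=json` has been captured and that units expose `workload-status.current` and `juju-status.current`, with subordinate units nested under `subordinates`; those field names should be confirmed against the Juju version in use. The `not_settled` helper and the sample payload are hypothetical.

```python
import json


def not_settled(status):
    """Return sorted (unit, workload, agent) tuples for units not active/idle."""
    bad = []

    def visit(units):
        for name, unit in units.items():
            workload = unit.get("workload-status", {}).get("current")
            agent = unit.get("juju-status", {}).get("current")
            if (workload, agent) != ("active", "idle"):
                bad.append((name, workload, agent))
            # Subordinate units are reported under their principal unit.
            visit(unit.get("subordinates", {}))

    for app in status.get("applications", {}).values():
        visit(app.get("units", {}))
    return sorted(bad)


# Hypothetical, heavily trimmed status payload for illustration only.
sample = json.loads("""
{
  "applications": {
    "keystone": {
      "units": {
        "keystone/0": {
          "workload-status": {"current": "active"},
          "juju-status": {"current": "idle"},
          "subordinates": {
            "keystone-mysql-router/0": {
              "workload-status": {"current": "waiting"},
              "juju-status": {"current": "idle"}
            }
          }
        }
      }
    },
    "vault": {
      "units": {
        "vault/0": {
          "workload-status": {"current": "blocked"},
          "juju-status": {"current": "idle"}
        }
      }
    }
  }
}
""")

for unit, workload, agent in not_settled(sample):
    print(f"{unit}: {workload}/{agent}")
```

An empty result would correspond to validation item 1 passing; anything else lists the units that still need attention.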


D-012: Snapshot strategy

Decision: Two baseline snapshots.

  • Snapshot 1: Post-deploy, post-validation, pre-tenant-resources. Clean cloud state — what a fresh install looks like.
  • Snapshot 2: Post-tenant-setup. Includes domain1, project1, user1, openrc, flavors, base images (noble-amd64), keypair. Restore point for tenant work.

Snapshots are KVM/qcow2-level on the jumphost hypervisor. Per-VM.


D-013: Clean teardown of existing capi-mgmt

Decision: Before destroying the OpenStack model, gracefully delete the CAPI workload cluster on capi-mgmt.maas to allow OpenStack resources (LBs, FIPs, volumes) to be cleaned up properly by CAPI controllers.

Steps: kubectl delete cluster capi-mgmt-cluster → wait for CAPI to clean up tenant-side OpenStack resources → juju destroy-model openstack --destroy-storage --no-prompt.

Preserved across rebuild: capi-mgmt.maas bootstrap k3s + CAPI controllers themselves. Re-used as the Magnum CAPI mgmt plane post-deploy.


D-014: Repository storage location and naming

Decision: Self-hosted GitBucket at git.baldurkeep.com.

Repo path: jesse.austin/openstack-caracal-ipv4 (v1; IPv4-only).

v2 repository: TBD when v2 work begins. Two viable paths: sibling repo openstack-caracal-ipv6 or openstack-caracal-dualstack, OR v2 branch in this repo with an overlays/v2-dualstack.yaml. The single-repo-with-branch approach preserves history of what changed v1→v2 together; the sibling-repo approach keeps v1 frozen as a reference once v2 is in motion.

Branching strategy: main is canonical. Per-phase work in feature branches when a deploy is in progress; merge back to main at successful validation.


D-015: v1 / v2 Fork

Decision: Caracal testcloud ships in two iterations.

v1 (this repository, openstack-caracal-ipv4): IPv4-only Caracal on existing MAAS-provisioned network layout. Proves the bundle, Option B binding fix, Magnum CAPI graft, Designate-from-day-one, hacluster relation pattern, and validation framework. Ships first.

v2 (deferred): Adds IPv6 / dual-stack per D-004. Requires upstream router infrastructure to be IPv6-capable, which is not currently the case in this environment. v2 work begins after v1 validation passes AND router-side IPv6 is in place.

Rationale: Decoupling the OpenStack-side rebuild from the network-side IPv6 readiness lets us prove the more-important architectural fix (Option B) without waiting on infrastructure work outside the OpenStack deployment's control. The IPv6 design intent is preserved as NetBox Reservation-status entries (per D-010 and netbox/ipv6-mark-reserved.py).

v1→v2 migration scope (forward-look):

  • Re-IP roles per D-004 (add IPv6 sibling to Metal/Provider/LBaaS; move Data/Storage/Replication to IPv6-only)
  • Move host management IPs from storage to Metal (D-004a)
  • Re-bind charms to listen on both families where dual-stack
  • Add AAAA records to Designate zones
  • Add tenant IPv6 pool carve-outs

D-016: IPv4 tenant pool — hybrid model (v1)

Decision: NetBox owns one upstream IPv4 tenant pool prefix for VR0 DC0. Per-project tenant subnets are Neutron-managed within that pool and are NOT modeled in NetBox.

Pool allocation: 10.20.0.0/16 (default; configurable in netbox/ipv4-prefixes-import.py). 65,536 addresses; 256 /24s available for per-project tenant subnets. Modeled under VR0 DC0 with role openstack-tenant.

Per-project allocation pattern (operationally):

When a project is created, allocate a /24 from the pool. Operator records the allocation in tenant-setup runbook output but does NOT create a NetBox prefix entry for it. Suggested convention: 10.20.<project-index>.0/24, starting with 10.20.1.0/24 for project1, etc.
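The suggested convention maps a project index directly onto the third octet. A minimal sketch using the standard `ipaddress` module follows; the `project_subnet` helper is illustrative, and the pool value is the default stated above.

```python
import ipaddress

TENANT_POOL = ipaddress.ip_network("10.20.0.0/16")


def project_subnet(project_index, pool=TENANT_POOL):
    """Return the /24 for a project index, e.g. 1 -> 10.20.1.0/24."""
    subnets = list(pool.subnets(new_prefix=24))
    if not 0 <= project_index < len(subnets):
        raise ValueError(f"project index {project_index} is outside the pool")
    return subnets[project_index]


print(project_subnet(1))    # project1
print(project_subnet(255))  # last /24 in the pool
```

Because the convention starts at 10.20.1.0/24 for project1, index 0 (10.20.0.0/24) stays unallocated unless an operator decides otherwise.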

Rationale (Option C from the discussion):

  • Option A (NetBox-modeled per-project) — full IPAM rigor; high friction for tenant lifecycle; round-trips to NetBox for ephemeral tenants.
  • Option B (Neutron-only, no NetBox standing) — minimum friction; loses upstream visibility of total tenant footprint; violates D-010 in spirit.
  • Option C (hybrid, chosen) — NetBox documents what space is reserved for tenants and prevents accidental collision with infra ranges; Neutron owns the lifecycle of individual tenant subnets without NetBox round-trips.

Constraint: Tenant CIDRs MUST be within the pool. The pre-flight checklist (scripts/pre-flight-checks.sh) should assert that proposed tenant subnets fall within the modeled pool.
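One way the pre-flight assertion could look is sketched below, again with the standard `ipaddress` module. The `outside_pool` function and the sample CIDRs are hypothetical; the real script may structure this differently.

```python
import ipaddress


def outside_pool(proposed, pool="10.20.0.0/16"):
    """Return the proposed CIDRs that do not fall within the tenant pool."""
    pool_net = ipaddress.ip_network(pool)
    bad = []
    for cidr in proposed:
        net = ipaddress.ip_network(cidr)
        # subnet_of() raises TypeError across address families, so compare versions first.
        if net.version != pool_net.version or not net.subnet_of(pool_net):
            bad.append(cidr)
    return bad


# Hypothetical proposals: one valid, one outside the pool, one colliding with Provider.
proposed = ["10.20.1.0/24", "10.21.0.0/24", "10.12.4.0/24"]
print(outside_pool(proposed))
```

A non-empty result should fail the pre-flight run, since it indicates a tenant subnet that NetBox has not reserved for tenants.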

v2-scope counterpart: IPv6 tenant pool 2602:f3e2:ff:0::/56 (NetBox-modeled, Reservation status in v1) becomes active in v2 with the same hybrid model — pool has NetBox standing, per-project IPv6 subnets Neutron-managed.


Known bugs to avoid in bundle drafting

From prior bundle review work — these are anti-patterns:

  • magnum-shared-db missing colon — causes a relation endpoint syntax error, deploy-blocking. Bundle must use - - magnum:shared-db (with the colon).
  • Empty osd-devices YAML anchor referenced by multiple ceph-osd applications.
  • ovn-chassis binding overlay-suffix — invalid binding name. Correct value is data.
  • GUI annotation collision between NUMA-split ceph-osd apps (not applicable to testcloud since we don't NUMA-split, but flagged for Roosevelt).
  • Hardcoded NIC name in bridge-interface-mappings. Use MAC where possible.
  • openstack -f value column ordering — column order is not guaranteed; use -c <column> -f value for single-column output.
  • Snap confinement: openstackclients snap has home-only interface; commands cannot read paths under /tmp. File paths must resolve under $HOME.
  • Non-ASCII characters in local_settings.d overrides cause silent daemon failures in Horizon.
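Two of the items above (the snap home-only restriction and non-ASCII characters in overrides) lend themselves to a quick lint before deploy. The sketch below is one possible approach; the helper names and the /home/ubuntu path are illustrative assumptions, and neither function touches the cloud.

```python
from pathlib import Path


def readable_by_snap(path, home):
    """True if `path` resolves to `home` or to a location beneath it."""
    resolved = Path(path).expanduser().resolve()
    home_resolved = Path(home).resolve()
    return resolved == home_resolved or home_resolved in resolved.parents


def non_ascii_lines(text):
    """Return 1-based line numbers that contain any non-ASCII character."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if not line.isascii()
    ]


# Hypothetical paths and override content for illustration.
print(readable_by_snap("/home/ubuntu/openrc", "/home/ubuntu"))
print(readable_by_snap("/tmp/openrc", "/home/ubuntu"))
print(non_ascii_lines("SITE_BRANDING = 'Omega'\nHELP_TEXT = 'caf\u00e9'\n"))
```

Running the second check over every file destined for local_settings.d would catch the silent Horizon failure before it reaches a unit.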

Change log

| Date | Change | Reference |
|---|---|---|
| 2026-05-22 | Initial document captured | Caracal rebuild planning session |
| 2026-05-22 | D-015 v1/v2 fork added; D-004 and D-004a marked v2-scope; D-016 IPv4 tenant pool hybrid model added; D-014 updated with new repo name | v1/v2 fork session (this update) |