Capacity and scale targets

This runbook covers the per-Domain capacity-and-scale collector: the six scale dimensions plexsphere tracks, their Phase-1 target defaults, how the collector samples usage and computes a used / target ratio, the 80% crossing audit contract, the structured refusal a hard ceiling produces, and the load-test harness operators use to validate a target against a real deployment.

The numbers below are design-phase orientation figures, not SLAs. They size a single deployment on the HA minimums, are operator-overridable defaults, and Phase-1 load-testing may move individual targets up or down. Exceeding a target is an operator-visible signal — never a silent failure — surfaced through the Dashboard capacity view, the Platform Audit Log, and Prometheus.

The six dimensions

The collector tracks six axes per Domain. Each has a unit, a Phase-1 default target, and an owning enforcement point that defines what the number means.

Dimension	Unit	Phase-1 default target	Owning enforcement point
`nodes`	count	10 000	Identity & Registration — sustained enrolled-plexd ceiling per Domain (bursts to 15 000 during mass rollouts are out of scope for the sustained target).
`sse_fanout`	events/sec	1 000	Signed Event Bus (mesh SSE) — sustained SSE fan-out per Domain.
`secret_reads`	reads/sec	10 000	Secret Store — cluster-wide sustained secret-read rate per Domain (NSK-rewrap dominated).
`mediated_sessions`	count	500	Access Orchestrator — concurrent mediated sessions per Domain across all kinds (`ssh` + `k8s` + `tcp`).
`observability_ingest`	bytes/sec	5 MiB/s (5 242 880 bytes/s)	Observability Ingest — aggregated per-Domain byte budget across metrics + logs + audit.
`action_executions`	count	1 000	Action Orchestrator — concurrent action executions per Domain.

Ingest unit note. The project README states the ingest target in prose as "5 MB/sec", but the collector tracks it as 5 MiB/s (5 × 1024 × 1024 = 5 242 880 wire bytes/sec, binary) to stay consistent with the gzip-compressed wire-byte counters the Observability Ingest quota debits. A 1000-based reading would drift roughly 4.6% below the enforced cap, so the binary figure is authoritative.
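The arithmetic behind the unit note can be checked directly. This is only a worked calculation; the variable names are illustrative and not taken from the collector.

```python
# Compare the binary (MiB) target with a decimal (MB) reading of "5 MB/sec".
MIB = 1024 * 1024

binary_target = 5 * MIB            # bytes/s the collector tracks
decimal_reading = 5 * 1_000_000    # bytes/s under a 1000-based reading

# Fraction by which the decimal reading falls short of the enforced cap.
shortfall = 1 - decimal_reading / binary_target

print(binary_target)               # 5242880
print(f"{shortfall:.1%}")          # 4.6%
```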

A target of 0 means "no target configured": the collector skips the ratio for that dimension rather than dividing by zero.
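A minimal sketch of that rule, assuming a hypothetical helper named `capacity_ratio`; the real collector's function names and types are not shown in this runbook.

```python
from typing import Optional


def capacity_ratio(used: float, target: float) -> Optional[float]:
    """Return used / target, or None when no target is configured.

    A target of 0 is treated as "no target configured", so the ratio is
    skipped instead of dividing by zero.
    """
    if target == 0:
        return None
    return used / target


print(capacity_ratio(8_000, 10_000))  # 0.8
print(capacity_ratio(8_000, 0))       # None
```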

How the collector works

A platform background collector samples each Domain's usage on a fixed interval (default 15 seconds), computes a used / target ratio per dimension, and publishes the result as a per-Domain snapshot.

There are two source kinds:

Level sources (nodes, mediated_sessions, action_executions): an absolute tally read directly each tick via per-Domain COUNT(*) … GROUP BY domain_id queries over the live rows (enrolled nodes / live access sessions / live executions).
Counter sources (sse_fanout, secret_reads, observability_ingest): a sustained rate computed as the delta of a cumulative counter divided by the elapsed interval. The first sample reports rate 0, because there is no prior delta to subtract.
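The counter-source calculation described above might look roughly like the following. `CounterRate` is a hypothetical name, and the handling of a non-positive elapsed time is an assumption; counter resets are not addressed here because the runbook does not describe them.

```python
from typing import Optional, Tuple


class CounterRate:
    """Turn samples of a cumulative counter into a sustained rate."""

    def __init__(self) -> None:
        # (counter value, timestamp in seconds) of the previous sample.
        self._prev: Optional[Tuple[float, float]] = None

    def sample(self, counter: float, now: float) -> float:
        prev = self._prev
        self._prev = (counter, now)
        if prev is None:
            # First sample: there is no earlier value to diff against.
            return 0.0
        prev_counter, prev_time = prev
        elapsed = now - prev_time
        if elapsed <= 0:
            # Assumption: report 0 rather than divide by a zero interval.
            return 0.0
        return (counter - prev_counter) / elapsed


rate = CounterRate()
print(rate.sample(1_000, now=0.0))    # 0.0 (first sample)
print(rate.sample(16_000, now=15.0))  # 1000.0 events/sec over a 15 s tick
```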

The snapshot is per-Domain. Until the first sample for a Domain completes, that Domain's snapshot is unavailable and the read endpoint returns HTTP 503 with a Retry-After header; the wait shrinks as you tighten the sample interval (see "Tuning the sample interval" below).

The 80% crossing audit contract

Crossing detection is edge-triggered with hysteresis: a crossing fires exactly once when a dimension's ratio reaches 0.80 from below, and re-arms only after the ratio drops back below 0.75. The 0.05 deadband keeps a ratio hovering at the threshold from flapping. A restart re-fires one entry per still-crossed dimension, because crossing state is intentionally not persisted.
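The edge-triggered behaviour can be illustrated with a small state machine for one Domain and one dimension. This is a sketch under the thresholds stated above; `CrossingDetector` and its method names are hypothetical.

```python
class CrossingDetector:
    """Edge-triggered 80% crossing detector with a 0.05 deadband."""

    FIRE_AT = 0.80      # fire when the ratio reaches this from below
    REARM_BELOW = 0.75  # re-arm only once the ratio drops below this

    def __init__(self) -> None:
        # Held in memory only, so a fresh detector (for example after a
        # restart) fires again for a dimension that is still crossed.
        self._crossed = False

    def observe(self, ratio: float) -> bool:
        """Return True exactly when a new crossing should be recorded."""
        if not self._crossed and ratio >= self.FIRE_AT:
            self._crossed = True
            return True
        if self._crossed and ratio < self.REARM_BELOW:
            self._crossed = False
        return False


detector = CrossingDetector()
ratios = [0.70, 0.80, 0.82, 0.78, 0.81, 0.74, 0.80]
print([detector.observe(r) for r in ratios])
# [False, True, False, False, False, False, True]
```

The dip to 0.78 and the return to 0.81 do not fire again, because the ratio never left the deadband; only the drop to 0.74 re-arms the detector.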

Each crossing writes one row to the Platform Audit Log on the addressed Domain's hash chain:

Subject: system:capacity-monitor — a synthetic system principal, because the collector has no human caller.
Object: domain:<domain-uuid>.
Reason: granted.
Relation: one of these six exact strings, one per dimension:
- capacity.nodes.threshold_crossed
- capacity.sse_fanout.threshold_crossed
- capacity.secret_reads.threshold_crossed
- capacity.mediated_sessions.threshold_crossed
- capacity.observability_ingest.threshold_crossed
- capacity.action_executions.threshold_crossed

The ratio value itself is deliberately not carried on the audit row: the audit CaveatContext is a names-only contract. The audit row records that a Domain crossed 80%; the plexsphere_capacity_ratio gauge below records by how much. Read the two together — the audit log for the discrete crossing event, the gauge for the live magnitude.
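As an illustration of the row shape, the sketch below assembles the four documented values for a crossing. The dictionary keys and the function name are assumptions made for the example; only the subject, object, reason, and relation strings come from this runbook. Note that no ratio is included.

```python
DIMENSIONS = (
    "nodes",
    "sse_fanout",
    "secret_reads",
    "mediated_sessions",
    "observability_ingest",
    "action_executions",
)


def crossing_audit_row(domain_uuid: str, dimension: str) -> dict:
    """Build the names-only audit fields for one 80% crossing."""
    if dimension not in DIMENSIONS:
        raise ValueError(f"unknown capacity dimension: {dimension!r}")
    return {
        "subject": "system:capacity-monitor",
        "object": f"domain:{domain_uuid}",
        "reason": "granted",
        "relation": f"capacity.{dimension}.threshold_crossed",
    }


row = crossing_audit_row("<domain-uuid>", "secret_reads")
print(row["relation"])  # capacity.secret_reads.threshold_crossed
```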

The capacity_exceeded refusal

There is no silent throttling. When a per-Domain hard ceiling is hit, the offending request receives an RFC 9457 application/problem+json response with HTTP 429, a code of capacity_exceeded, an optional dimension member naming the scale axis, and a Retry-After header so automation backs off cleanly:

```json
{
  "type": "https://plexsphere.dev/errors/capacity-exceeded",
  "title": "Too Many Requests",
  "status": 429,
  "detail": "the per-Domain observability ingest budget is exhausted; retry after the window resets.",
  "code": "capacity_exceeded",
  "dimension": "observability_ingest"
}
```

Today the dimension member is set by exactly the two dimensions that enforce a hard ceiling at request time:

observability_ingest — the per-Domain ingest byte budget.
action_executions — the per-Domain live-execution cap.

The other four dimensions are observed-and-audited but not refused at this surface: a node, session, secret-read, or SSE-fan-out overshoot shows up in the Dashboard ratio and the 80% audit crossing, not as a capacity_exceeded response.
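On the client side, automation can recognise the refusal and pick up the back-off hint along these lines. This is a sketch, not a plexsphere client: the function name is hypothetical, headers are assumed to arrive as a plain dict keyed by `Retry-After`, and only the delta-seconds form of Retry-After is handled.

```python
import json
from typing import Optional, Tuple


def parse_capacity_refusal(
    status: int, headers: dict, body: str
) -> Optional[Tuple[Optional[str], Optional[int]]]:
    """Return (dimension, retry_after_seconds) for a capacity_exceeded
    refusal, or None if the response is something else."""
    if status != 429:
        return None
    try:
        problem = json.loads(body)
    except ValueError:
        return None
    if not isinstance(problem, dict) or problem.get("code") != "capacity_exceeded":
        return None
    raw = headers.get("Retry-After")
    # Assumption: an HTTP-date Retry-After is ignored in this sketch.
    delay = int(raw) if raw is not None and raw.isdigit() else None
    # The dimension member is optional, so it may be None.
    return problem.get("dimension"), delay


body = '{"status": 429, "code": "capacity_exceeded", "dimension": "observability_ingest"}'
print(parse_capacity_refusal(429, {"Retry-After": "30"}, body))
# ('observability_ingest', 30)
```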

Metrics

The collector exposes these Prometheus series:

plexsphere_capacity_used{domain_id,dimension} — the absolute tally or sustained rate from the latest sample.
plexsphere_capacity_ratio{domain_id,dimension} — used / target; the live magnitude behind the 80% audit crossing.
plexsphere_capacity_target{dimension} — the configured target (Domain-independent label set).
plexsphere_capacity_crossings_total{domain_id,dimension} — count of 80% crossings fired.
plexsphere_capacity_crossing_record_failures_total — count of audit writes that failed when recording a crossing; a sustained increase means crossings are happening but not landing on the hash chain, so alert on it.

Alert on plexsphere_capacity_ratio approaching 1.0 per dimension, and join it with plexsphere_capacity_crossings_total to scope which Domains are pushing a ceiling.

Tuning the sample interval

PLEXSPHERE_CAPACITY_SAMPLE_INTERVAL is a Go-duration env var (e.g. 30s, 1m) that overrides the default 15-second sample cadence. An empty or unset value keeps the default.

Tighten it (shorter than 15 s) to make the snapshot fresher and shorten the Retry-After window before a Domain's first sample lands, at the cost of more sampling load.
Loosen it (longer than 15 s) to reduce sampling load on Postgres and the counters, at the cost of a staler ratio and a longer wait before the snapshot is first available.

Runbook: driving a load test

make load-test drives one capacity dimension via the tests/load harness against a provisioned deployment. Select the axis with DIMENSION= — one of nodes, sse-fanout, secret-reads, sessions, ingest, or actions (default ingest).

Two variables are required and have no safe default:

LOAD_DOMAIN_ID=<domain-uuid> — the Domain to drive.
LOAD_TOKEN=<bearer-token> — a bearer token authorised for that Domain.

The optional variables are LOAD_BASE_URL (default http://localhost:8080), LOAD_RATE (default 100), LOAD_DURATION (default 30s), and LOAD_RAMP (default 5s).

```shell
make load-test DIMENSION=ingest  LOAD_DOMAIN_ID=<uuid> LOAD_TOKEN=<token>
make load-test DIMENSION=actions LOAD_DOMAIN_ID=<uuid> LOAD_TOKEN=<token>
```

Reading the result. The driver reports p50/p95/p99 latencies and buckets every response by its RFC 9457 Problem code. It exits non-zero if the target rate is not sustained, or if a refusal code outside this expected set appears:

capacity_exceeded
per_node_rate_limited
per_domain_rate_limited
session_limit_exceeded

Those four refusals are the system correctly defending its ceilings under load — not harness failures. Any other 4xx or 5xx code is a real regression and fails the run.
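The pass/fail rule can be summarised as a small check over the per-code buckets. This sketches the rule as described here and is not the harness's actual code; the function names and the shape of `code_counts` (Problem code mapped to response count) are assumptions.

```python
EXPECTED_REFUSALS = frozenset({
    "capacity_exceeded",
    "per_node_rate_limited",
    "per_domain_rate_limited",
    "session_limit_exceeded",
})


def unexpected_codes(code_counts: dict) -> dict:
    """Return the buckets whose Problem code is outside the expected set."""
    return {
        code: count
        for code, count in code_counts.items()
        if code not in EXPECTED_REFUSALS
    }


def run_passes(code_counts: dict, rate_sustained: bool) -> bool:
    """A run passes only if the rate held and no unexpected code appeared."""
    return rate_sustained and not unexpected_codes(code_counts)


print(run_passes({"capacity_exceeded": 120}, rate_sustained=True))   # True
print(run_passes({"internal_error": 3}, rate_sustained=True))        # False
```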

Not a CI gate. Full-scale load runs are deliberately not a blocking CI job: there is no in-pipeline provisioned deployment to drive, and a sustained-rate run is open-ended by design. Only the harness's own unit tests under tests/load run in make test; make load-test is operator-invoked against a real target.

Why these targets

The six numbers are design-phase orientation figures for a single deployment on the HA minimums, chosen so evaluators can judge order-of-magnitude fit for their fleet rather than as contractual SLAs. Phase-1 load-testing — driven by the harness above — may move individual targets up or down. The model is deliberately layered: exceeding a target is always an operator-visible signal (the Dashboard capacity view ratio plus the 80% audit crossing), but only the two hard-limited dimensions produce a structured capacity_exceeded refusal. Observation and refusal are separate contracts: everything is watched, only ceilings are enforced at the request edge.
