Capacity and scale targets

This runbook covers the per-Domain capacity-and-scale collector: the six scale dimensions plexsphere tracks, their Phase-1 target defaults, how the collector samples usage and computes a used / target ratio, the 80% crossing audit contract, the structured refusal a hard ceiling produces, and the load-test harness operators use to validate a target against a real deployment.

The numbers below are design-phase orientation figures, not SLAs. They size a single deployment on the HA minimums, are operator-overridable defaults, and Phase-1 load-testing may move individual targets up or down. Exceeding a target is an operator-visible signal — never a silent failure — surfaced through the Dashboard capacity view, the Platform Audit Log, and Prometheus.

The six dimensions

The collector tracks six axes per Domain. Each has a unit, a Phase-1 default target, and an owning enforcement point that defines what the number means.

| Dimension | Unit | Phase-1 default target | Owning enforcement point |
| --- | --- | --- | --- |
| nodes | count | 10 000 | Identity & Registration: sustained enrolled-plexd ceiling per Domain (bursts to 15 000 during mass rollouts are out of scope for the sustained target). |
| sse_fanout | events/sec | 1 000 | Signed Event Bus (mesh SSE): sustained SSE fan-out per Domain. |
| secret_reads | reads/sec | 10 000 | Secret Store: cluster-wide sustained secret-read rate per Domain (NSK-rewrap dominated). |
| mediated_sessions | count | 500 | Access Orchestrator: concurrent mediated sessions per Domain across all kinds (ssh + k8s + tcp). |
| observability_ingest | bytes/sec | 5 MiB/s (5 242 880 bytes/s) | Observability Ingest: aggregated per-Domain byte budget across metrics + logs + audit. |
| action_executions | count | 1 000 | Action Orchestrator: concurrent action executions per Domain. |

Ingest unit note. The project README states the ingest target in prose as "5 MB/sec", but the collector tracks it as 5 MiB/s (5 × 1024 × 1024 = 5 242 880 wire bytes/sec, binary) to stay consistent with the gzip-compressed wire-byte counters the Observability Ingest quota debits. A 1000-based reading would drift roughly 4.6% below the enforced cap, so the binary figure is authoritative.
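The arithmetic behind the unit note can be checked in a few lines. This is only a worked calculation; the variable names are illustrative and do not come from the collector.

```python
MIB = 1024 * 1024

# The figure the collector tracks: 5 MiB/s in binary units.
binary_target = 5 * MIB

# "5 MB/sec" read as SI megabytes (1000-based).
decimal_reading = 5 * 1000 * 1000

# How far the 1000-based reading sits below the enforced cap.
shortfall = 1 - decimal_reading / binary_target

print(binary_target)       # 5242880
print(f"{shortfall:.1%}")  # 4.6%
```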

A target of 0 means "no target configured": the collector skips the ratio for that dimension rather than dividing by zero.
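A minimal sketch of that rule, assuming "skips the ratio" can be modelled as returning None. The function name and the None convention are hypothetical, not the collector's actual API.

```python
from typing import Optional


def capacity_ratio(used: float, target: float) -> Optional[float]:
    """Return used / target, or None when no target is configured."""
    if target == 0:
        # Target 0 means "no target configured": skip the ratio instead
        # of dividing by zero.
        return None
    return used / target
```

For example, capacity_ratio(8000, 10000) gives 0.8, while capacity_ratio(8000, 0) gives None and the dimension simply has no ratio to report.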

How the collector works

A platform background collector samples each Domain's usage on a fixed interval (default 15 seconds), computes a used / target ratio per dimension, and publishes the result as a per-Domain snapshot.

There are two source kinds:

  • Level sources (nodes, mediated_sessions, action_executions): an absolute tally read directly each tick via per-Domain COUNT(*) … GROUP BY domain_id queries over the live rows (enrolled nodes / live access sessions / live executions).
  • Counter sources (sse_fanout, secret_reads, observability_ingest): a sustained rate computed as the delta of a cumulative counter divided by the elapsed interval. The first sample reports rate 0, because there is no prior delta to subtract.
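The counter-source calculation can be sketched as follows. The class name is hypothetical, and returning 0 on a counter reset or a non-positive elapsed time is an assumption of this sketch; the text above only specifies the first-sample behaviour.

```python
from typing import Optional


class CounterRate:
    """Turns successive readings of a cumulative counter into a rate per second."""

    def __init__(self) -> None:
        self._last_value: Optional[float] = None
        self._last_time: Optional[float] = None

    def sample(self, value: float, now: float) -> float:
        prev_value, prev_time = self._last_value, self._last_time
        self._last_value, self._last_time = value, now
        if prev_value is None or prev_time is None:
            # First sample: there is no prior reading to subtract.
            return 0.0
        elapsed = now - prev_time
        if elapsed <= 0 or value < prev_value:
            # Assumption: a clock glitch or counter reset reports rate 0.
            return 0.0
        return (value - prev_value) / elapsed


rate = CounterRate()
first = rate.sample(value=1_000, now=0.0)     # 0.0, no prior delta
second = rate.sample(value=16_000, now=15.0)  # 15 000 events over 15 s
print(first, second)
```

A level source needs none of this: its value each tick is simply the count returned by the query.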

The snapshot is per-Domain. Until the first sample for a Domain completes, that Domain's snapshot is unavailable and the read endpoint returns HTTP 503 with a Retry-After header; the wait shrinks as you tighten the sample interval (see Tuning the sample interval below).

The 80% crossing audit contract

Crossing detection is edge-triggered with hysteresis: a crossing fires exactly once when a dimension's ratio reaches 0.80 from below, and re-arms only after the ratio drops back below 0.75. The 0.05 deadband keeps a ratio hovering at the threshold from flapping. A restart re-fires one entry per still-crossed dimension, because crossing state is intentionally not persisted.
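The edge-trigger-with-hysteresis rule for a single dimension can be sketched as a small state machine. The class and method names are hypothetical; only the 0.80 and 0.75 thresholds and the in-memory state come from the text above.

```python
class CrossingDetector:
    """Edge-triggered 80% crossing detector with a 0.05 deadband."""

    FIRE_AT = 0.80      # fire when the ratio reaches this from below
    REARM_BELOW = 0.75  # re-arm only once the ratio drops below this

    def __init__(self) -> None:
        # Held in memory only, so a fresh process starts armed and
        # re-fires for any dimension that is still above the threshold.
        self._crossed = False

    def observe(self, ratio: float) -> bool:
        """Return True exactly when this sample should emit a crossing."""
        if not self._crossed and ratio >= self.FIRE_AT:
            self._crossed = True
            return True
        if self._crossed and ratio < self.REARM_BELOW:
            self._crossed = False
        return False


detector = CrossingDetector()
samples = [0.70, 0.80, 0.85, 0.78, 0.81, 0.74, 0.80]
fired = [detector.observe(r) for r in samples]
print(fired)  # [False, True, False, False, False, False, True]
```

The dip to 0.78 does not re-arm the detector, so the later 0.81 stays silent; only after 0.74 does the next 0.80 fire again.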

Each crossing writes one row to the Platform Audit Log on the addressed Domain's hash chain:

  • Subject: system:capacity-monitor — a synthetic system principal, because the collector has no human caller.
  • Object: domain:<domain-uuid>.
  • Reason: granted.
  • Relation: one of these six exact strings, one per dimension:
    • capacity.nodes.threshold_crossed
    • capacity.sse_fanout.threshold_crossed
    • capacity.secret_reads.threshold_crossed
    • capacity.mediated_sessions.threshold_crossed
    • capacity.observability_ingest.threshold_crossed
    • capacity.action_executions.threshold_crossed

The ratio value itself is deliberately not carried on the audit row: the audit CaveatContext is a names-only contract. The audit row records that a Domain crossed 80%; the plexsphere_capacity_ratio gauge below records by how much. Read the two together — the audit log for the discrete crossing event, the gauge for the live magnitude.
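As an illustration of the names-only contract, the row for a crossing could be assembled like this. The function name and the dictionary keys are assumptions made for this sketch; the subject, object, reason, and relation values are the ones listed above.

```python
DIMENSIONS = (
    "nodes",
    "sse_fanout",
    "secret_reads",
    "mediated_sessions",
    "observability_ingest",
    "action_executions",
)


def crossing_audit_row(domain_uuid: str, dimension: str) -> dict:
    """Build the names-only audit fields for one 80% crossing."""
    if dimension not in DIMENSIONS:
        raise ValueError(f"unknown capacity dimension: {dimension!r}")
    return {
        "subject": "system:capacity-monitor",
        "object": f"domain:{domain_uuid}",
        "reason": "granted",
        "relation": f"capacity.{dimension}.threshold_crossed",
        # Deliberately no ratio field: the magnitude lives on the gauge.
    }
```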

The capacity_exceeded refusal

There is no silent throttling. When a per-Domain hard ceiling is hit, the offending request receives an RFC 9457 application/problem+json response with HTTP 429, a code of capacity_exceeded, an optional dimension member naming the scale axis, and a Retry-After header so automation backs off cleanly:

```json
{
  "type": "https://plexsphere.dev/errors/capacity-exceeded",
  "title": "Too Many Requests",
  "status": 429,
  "detail": "the per-Domain observability ingest budget is exhausted; retry after the window resets.",
  "code": "capacity_exceeded",
  "dimension": "observability_ingest"
}
```

Today the dimension member is set by exactly the two dimensions that enforce a hard ceiling at request time:

  • observability_ingest — the per-Domain ingest byte budget.
  • action_executions — the per-Domain live-execution cap.

The other four dimensions are observed-and-audited but not refused at this surface: a node, session, secret-read, or SSE-fan-out overshoot shows up in the Dashboard ratio and the 80% audit crossing, not as a capacity_exceeded response.
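On the client side, automation can recognise the refusal and back off. The sketch below is one way to do that under stated assumptions: the function name and return shape are hypothetical, headers arrive as a plain mapping with canonical names, and Retry-After is handled only in its delay-seconds form (an HTTP-date value falls back to a default delay).

```python
import json
from typing import Mapping, Optional


def parse_capacity_refusal(
    status: int,
    headers: Mapping[str, str],
    body: str,
    default_delay: float = 1.0,
) -> Optional[dict]:
    """Return back-off details for a capacity_exceeded refusal, else None."""
    if status != 429:
        return None
    if not headers.get("Content-Type", "").startswith("application/problem+json"):
        return None
    problem = json.loads(body)
    if problem.get("code") != "capacity_exceeded":
        return None
    retry_after = headers.get("Retry-After", "")
    # Assumption: only the delay-seconds form is parsed here.
    delay = float(retry_after) if retry_after.isdigit() else default_delay
    return {
        # "dimension" is optional on the wire, so it may be None.
        "dimension": problem.get("dimension"),
        "retry_after_seconds": delay,
    }
```

A caller would sleep for retry_after_seconds before retrying, and can use dimension to decide which workload to slow down.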

Metrics

The collector exposes these Prometheus series:

  • plexsphere_capacity_used{domain_id,dimension} — the absolute tally or sustained rate from the latest sample.
  • plexsphere_capacity_ratio{domain_id,dimension}: used / target; the live magnitude behind the 80% audit crossing.
  • plexsphere_capacity_target{dimension} — the configured target (Domain-independent label set).
  • plexsphere_capacity_crossings_total{domain_id,dimension} — count of 80% crossings fired.
  • plexsphere_capacity_crossing_record_failures_total — count of audit writes that failed when recording a crossing; a sustained increase means crossings are happening but not landing on the hash chain, so alert on it.

Alert on plexsphere_capacity_ratio approaching 1.0 per dimension, and join it with plexsphere_capacity_crossings_total to scope which Domains are pushing a ceiling.

Tuning the sample interval

PLEXSPHERE_CAPACITY_SAMPLE_INTERVAL is a Go-duration env var (e.g. 30s, 1m) that overrides the default 15-second sample cadence. An empty or unset value keeps the default.

  • Tighten it (shorter than 15 s) to make the snapshot fresher and shorten the Retry-After window before a Domain's first sample lands, at the cost of more sampling load.
  • Loosen it (longer than 15 s) to reduce sampling load on Postgres and the counters, at the cost of a staler ratio and a longer wait before the snapshot is first available.

Runbook: driving a load test

make load-test drives one capacity dimension via the tests/load harness against a provisioned deployment. Select the axis with DIMENSION= — one of nodes, sse-fanout, secret-reads, sessions, ingest, or actions (default ingest).

Two variables are required and have no safe default:

  • LOAD_DOMAIN_ID=<domain-uuid> — the Domain to drive.
  • LOAD_TOKEN=<bearer-token> — a bearer token authorised for that Domain.

The optional variables are LOAD_BASE_URL (default http://localhost:8080), LOAD_RATE (default 100), LOAD_DURATION (default 30s), and LOAD_RAMP (default 5s).

```shell
make load-test DIMENSION=ingest  LOAD_DOMAIN_ID=<uuid> LOAD_TOKEN=<token>
make load-test DIMENSION=actions LOAD_DOMAIN_ID=<uuid> LOAD_TOKEN=<token>
```

Reading the result. The driver reports p50/p95/p99 latencies and buckets every response by its RFC 9457 Problem code. It exits non-zero if the target rate is not sustained, or if a refusal code outside this expected set appears:

  • capacity_exceeded
  • per_node_rate_limited
  • per_domain_rate_limited
  • session_limit_exceeded

Those four refusals are the system correctly defending its ceilings under load — not harness failures. Any other 4xx or 5xx code is a real regression and fails the run.
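The pass/fail rule can be expressed compactly. This is a sketch of the rule as described, not the harness's code: the function name is hypothetical, and it assumes the buckets hold only Problem codes from refused responses, with successful responses left out.

```python
from typing import Mapping

EXPECTED_REFUSALS = frozenset({
    "capacity_exceeded",
    "per_node_rate_limited",
    "per_domain_rate_limited",
    "session_limit_exceeded",
})


def run_passes(code_counts: Mapping[str, int], rate_sustained: bool) -> bool:
    """Apply the exit rule: sustained rate and no unexpected Problem codes."""
    unexpected = set(code_counts) - EXPECTED_REFUSALS
    return rate_sustained and not unexpected


# Expected refusals alone do not fail a run that sustained its rate.
print(run_passes({"capacity_exceeded": 42}, rate_sustained=True))  # True
# Any code outside the expected set fails the run.
print(run_passes({"internal_error": 1}, rate_sustained=True))      # False
```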

Not a CI gate. Full-scale load runs are deliberately not a blocking CI job: there is no in-pipeline provisioned deployment to drive, and a sustained-rate run is open-ended by design. Only the harness's own unit tests under tests/load run in make test; make load-test is operator-invoked against a real target.

Why these targets

The six numbers are design-phase orientation figures for a single deployment on the HA minimums, chosen so evaluators can judge order-of-magnitude fit for their fleet rather than as contractual SLAs. Phase-1 load-testing — driven by the harness above — may move individual targets up or down. The model is deliberately layered: exceeding a target is always an operator-visible signal (the Dashboard capacity view ratio plus the 80% audit crossing), but only the two hard-limited dimensions produce a structured capacity_exceeded refusal. Observation and refusal are separate contracts: everything is watched, only ceilings are enforced at the request edge.

See also

  • Capacity HTTP API reference — the GET /v1/domains/{domainId}/capacity snapshot endpoint, its schema, and the 503-before-first-sample contract.
  • The OpenAPI specification at ../../api/openapi/plexsphere-v1.yaml — the wire contract for the snapshot operation and the capacity_exceeded Problem shape.
  • plexctl metrics query and the Metrics and Logs query context — query the per-Domain metric series behind these capacity ratios over the read-only GET /v1/domains/{domainId}/metrics/query proxy, which bounds every query and injects the addressed Domain as the upstream tenant server-side.
  • Failure modes and degradation — the degradation cousin of this runbook: how a single dependency outage surfaces a structured problem code instead of a generic 5xx.
  • Multi-region runbook — region pinning and per-region ingress; capacity targets are per-Domain and therefore per-region.
  • Disaster recovery — restoring the control plane after a larger-blast-radius event.

The Dashboard capacity view renders the live used / target ratio per dimension per Domain; it is the first place an operator sees a Domain approaching a target.