Audit log storage topology
This page covers the two persistence backends the Platform Audit Log writes to and the deterministic byte projection the chain hash is computed over. For the chain state machine itself see Hash chain and residency model; for the entry point and ubiquitous language see the index.
Two-store split — Postgres primary, object-store mirror
The Platform Audit Log is split deliberately across two persistence backends, each chosen for the failure mode it shoulders. The README's storage topology table pins the split as a deployment commitment; this section reproduces the split with code anchors so a reader can navigate from the diagram into the SQL and back.
```text
┌────────────────────────────────────────────────────────────────────────────┐
│  Emitters (FROZEN ports: Sink.Record(ctx, Entry))                          │
│   bootstraptokens/audit   signing/audit_middleware   authz                 │
│             │                        │                 │                   │
│             └─────────────── audit.Sink.Record ────────┘                   │
│                                      │                                     │
│                   ┌──────────────────▼───────────────────┐                 │
│                   │  audit.Service (hash-chained)        │                 │
│                   │   · DomainResolver · Pepper          │                 │
│                   │   · canonical.Canonical              │                 │
│                   │   · sha256 chain · advisory lock     │                 │
│                   └──────────────────┬───────────────────┘                 │
│                                      │ AppendAuditEntry (sqlc)             │
│                   ┌──────────────────▼───────────────────┐                 │
│                   │  Postgres (primary, hot)             │                 │
│                   │   plexsphere.audit_log_entry         │                 │
│                   │   plexsphere.audit_log_chain_head    │                 │
│                   │   plexsphere.audit_subject_pii       │                 │
│                   │   plexsphere.audit_tamper_quarantine │                 │
│                   └────────┬───────────────┬─────────────┘                 │
│                            │ pull          │ verify                        │
│                     audit.archiver   audit.ReconcileChain (/readyz)        │
│                            │                                               │
│                   ┌────────▼────────┐                                      │
│                   │ blobstore (S3)  │  audit/<domain>/<seq:020d>.json.zst  │
│                   │ 7-yr cold       │                                      │
│                   └─────────────────┘                                      │
└────────────────────────────────────────────────────────────────────────────┘
```

| Store | Backend | Role | Schema / layout |
|---|---|---|---|
| Primary (hot) | PostgreSQL | Authoritative chain. Every append is a single transaction under a per-Domain advisory lock; reads (ListAuditEntries, GetAuditEntry, VerifyAuditChain) hit this store. The chain remains queryable for the full retention window. | 0011_audit_log.sql — four tables: audit_log_entry, audit_log_chain_head, audit_subject_pii, audit_tamper_quarantine. |
| Mirror (cold) | S3-compatible object store (SeaweedFS self-hosted, AWS S3 SaaS) | Long-term retention. The archiver drains archived_at IS NULL rows, zstd-compresses the canonical-JSON projection, and uploads to audit/<domain_id>/<seq:020d>.json.zst. Used by the audit-archive-restore Chainsaw suite to recover a dropped seq window. | internal/audit/archiver/upload.go — archiveRow JSON projection + zstd compressor + per-Domain bucket key. |
The split is not a hot/cold tier with a TTL on the hot side: a chain row never leaves PostgreSQL during the retention window. The mirror is a parallel write that gives the operator a recoverable snapshot if the PostgreSQL cluster suffers catastrophic loss; the Chainsaw audit-archive-restore suite proves the round trip end-to-end against a real SeaweedFS testcontainer.
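The object-key layout in the table above (audit/<domain_id>/<seq:020d>.json.zst) can be illustrated with a minimal sketch. The function name archiveKey and its parameter types are hypothetical and are not taken from internal/audit/archiver/upload.go; only the key format comes from this page.

```go
package main

import "fmt"

// archiveKey builds the mirror object key for one chain row.
// Hypothetical helper: only the key layout is taken from the documentation.
func archiveKey(domainID string, seq uint64) string {
	// %020d zero-pads seq to 20 digits, which is wide enough for any uint64,
	// so lexicographic key order matches numeric seq order within a Domain.
	return fmt.Sprintf("audit/%s/%020d.json.zst", domainID, seq)
}

func main() {
	fmt.Println(archiveKey("dom-1", 42))
	// prints: audit/dom-1/00000000000000000042.json.zst
}
```

With a fixed-width sequence number, a plain prefix listing of audit/<domain_id>/ returns objects in chain order. That ordering helps when a dropped seq window has to be located and restored.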
Loki carries the node-side audit stream (POST /v1/nodes/{id}/audit) entirely separately. The two streams MUST NOT silently merge. The depguard rule no-node-audit-on-platform-chain in .golangci.yml rejects any import of internal/audit/repo from internal/observability/**, and the workspace test tests/workspace/depguard_audit_isolation_test.go enforces the same boundary at go test time, so a future contributor cannot silently route the node-side stream through the operator-action chain.
Canonical-byte encoder pin
The chain hash is computed over canonical.Canonical(entry), a deterministic length-prefixed binary form whose magic prefix is PXA1 (the four ASCII bytes 'P','X','A','1' — see internal/audit/chain/canonical.go). The encoder is pinned: any drift retroactively invalidates every downstream row's entry_hash. Two contracts protect the pin:
- Golden corpus. Every encoder change must round-trip byte-for-byte against internal/audit/chain/testdata/canonical/*.golden. The current corpus covers the genesis_minimal, full_fields, empty_slices, and unicode_relation cases; new cases are added alongside their golden file. The test canonical_diff_test.go diffs the live encoder output against each golden.
- Versioned magic. A future encoder revision MUST bump the magic prefix (e.g. PXA2) so re-encoded rows are not byte-equal to legacy ones. The verifier uses the magic to refuse a cross-version mix in the same chain segment.
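To make the idea of a magic-prefixed, length-prefixed encoding concrete, here is a minimal sketch. Only the PXA1 magic and the use of SHA-256 come from this page. The entry fields, the 4-byte big-endian length prefix, and the sha256(prevHash || canonical) construction are assumptions for illustration, not the layout in internal/audit/chain/canonical.go.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"errors"
	"fmt"
)

// magicV1 is the PXA1 prefix named in the text.
var magicV1 = []byte("PXA1")

// entry is a hypothetical, reduced stand-in for the real audit entry.
type entry struct {
	Actor    string
	Action   string
	Relation string
}

// canonical writes the magic, then each field in a fixed order as a
// 4-byte big-endian length followed by the raw bytes (assumed layout).
func canonical(e entry) []byte {
	var buf bytes.Buffer
	buf.Write(magicV1)
	for _, f := range []string{e.Actor, e.Action, e.Relation} {
		var n [4]byte
		binary.BigEndian.PutUint32(n[:], uint32(len(f)))
		buf.Write(n[:])
		buf.WriteString(f)
	}
	return buf.Bytes()
}

// checkMagic rejects bytes that do not start with the expected version
// prefix, which is how a verifier could refuse a cross-version mix.
func checkMagic(b []byte) error {
	if !bytes.HasPrefix(b, magicV1) {
		return errors.New("canonical: unexpected magic prefix")
	}
	return nil
}

// entryHash chains one entry onto the previous hash.
// Assumed construction: sha256(prevHash || canonical bytes).
func entryHash(prev [32]byte, e entry) [32]byte {
	h := sha256.New()
	h.Write(prev[:])
	h.Write(canonical(e))
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	e := entry{Actor: "alice", Action: "token.create", Relation: "owner"}
	b := canonical(e)
	fmt.Printf("%q\n", b[:4]) // "PXA1"
	fmt.Println(checkMagic(b) == nil)
	genesis := entryHash([32]byte{}, e)
	fmt.Printf("%x\n", genesis[:8])
}
```

The length prefix is what keeps the projection unambiguous: the field pairs ("ab", "c") and ("a", "bc") concatenate to the same text but encode to different bytes. Because every later hash depends on these bytes, any change to the field order or prefix width would change every downstream hash, which is the drift the golden corpus guards against.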
Field-level limits (maxStringBytes = 64 KiB, maxSliceElements = 1024, maxCaveatNameLen = 64) are defence in depth: oversized inputs are rejected at encode time so they never reach storage. These ceilings are exercised by the fuzz harness at internal/audit/chain/chain_fuzz_test.go, which CI runs for ≥ 30 s on every PR.
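The encode-time rejection described above might look like the following sketch. The three constants and their values come from this page; the helper checkLimits, the error value, the choice of a relation string and caveat names as inputs, and the treatment of each ceiling as inclusive are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Ceilings quoted in the documentation.
const (
	maxStringBytes   = 64 * 1024 // 64 KiB
	maxSliceElements = 1024
	maxCaveatNameLen = 64
)

// errTooLarge is a hypothetical sentinel error for this sketch.
var errTooLarge = errors.New("canonical: input exceeds field-level limit")

// checkLimits is a hypothetical pre-encode guard. It assumes a value exactly
// at a ceiling is accepted and anything above it is rejected.
func checkLimits(relation string, caveatNames []string) error {
	if len(relation) > maxStringBytes {
		return fmt.Errorf("relation is %d bytes: %w", len(relation), errTooLarge)
	}
	if len(caveatNames) > maxSliceElements {
		return fmt.Errorf("%d caveat names: %w", len(caveatNames), errTooLarge)
	}
	for _, name := range caveatNames {
		if len(name) > maxCaveatNameLen {
			return fmt.Errorf("caveat name is %d bytes: %w", len(name), errTooLarge)
		}
	}
	return nil
}

func main() {
	ok := checkLimits("owner", []string{"ip_allowlist"})
	fmt.Println(ok == nil) // true

	tooBig := checkLimits(strings.Repeat("x", maxStringBytes+1), nil)
	fmt.Println(errors.Is(tooBig, errTooLarge)) // true
}
```

Running the check before any bytes are produced means a rejected entry never contributes to a hash, so an oversized input cannot leave a partially written row behind.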