CI pipeline — jobs, triggers, and reproducing failures

This document is the canonical contributor guide for plexsphere's CI pipeline. It maps each job in .github/workflows/ci.yaml (and the fast-feedback .github/workflows/pr-smoke.yaml) to the laptop-mirrored make target that reproduces it byte-for-byte, names the tools each job needs on the runner, lists the artefacts the job publishes to the run summary, and records the branch-protection check name reviewers gate merges on. It then lists how to reproduce a red run locally for each of the per-tool failure modes reviewers see most often.

The pipeline is intentionally flat: every Makefile-driven job is a thin wrapper over the target of the same name, so make <job> on a fresh clone is the full reproduction story. The drift gate in tests/docs/ci_doc_drift_test.go asserts this document lists every workflow job and that every make command in the table below resolves to a real Makefile target — documenting a job here without wiring the target, or renaming the target without updating this file, fails CI.

The companion documents are docs/contributing/testing.md (the test-pyramid guide) and docs/contributing/toolchain.md (the pinned-tool and SHA-pinning policy this pipeline enforces).

Overview

Two workflows run on every pull request:

ci.yaml — the authoritative gate matrix. Runs on every pull_request (plus a nightly schedule and workflow_dispatch for the fuzz and dev-smoke lanes); every job is required by branch protection. There is deliberately no push trigger: a commit on main only lands through a merged PR whose run was already green, so re-running the validation matrix post-merge adds no signal. Future artifact-publishing jobs (container-image push, Helm-chart push) introduce their own push-gated lane when they arrive; the drift gate TestCIWorkflow_TriggersOnPullRequestNotPush in tests/workspace/ci_workflow_test.go blocks a bare push trigger on the validation matrix. Each job pins the Go toolchain via go-version-file: .go-version (the single source of truth enforced by tests/workspace/goworkdrift_test.go) and every third-party action pins a 40-hex commit SHA (policy recorded in docs/contributing/toolchain.md).
pr-smoke.yaml — the sub-minute feedback gate. Runs on pull_request only, cancels on force-push, and executes EXACTLY make lint, make tidy-check, and go build ./.... It is advisory-but-required: branch protection points at the check name pr-smoke so a reviewer sees a fast red signal before the full matrix completes.

The Makefile-keyed jobs below mirror the requiredCIJobs map in tests/workspace/ci_workflow_test.go: drift between the two is a test failure, not a style note.

Path-based CI skipping

CI has two layers of path-based filtering so a commit only pays for the jobs it actually affects. The rules below apply to both ci.yaml and pr-smoke.yaml.

Layer 1 — `paths-ignore` (skip the entire workflow)

paths-ignore is evaluated by GitHub before any job starts. A commit whose changed files all match paths-ignore triggers zero runner minutes, zero queue time, and zero notifications.

Path pattern	Skipped in `ci.yaml`	Skipped in `pr-smoke.yaml`	Rationale
`.planwerk/**`	✓	✓	Planning JSON — no runnable artefacts, no code surface. A plan-only commit used to trigger the full matrix (~15 jobs); now it costs nothing.
`docs/**`	—	✓	pr-smoke tests code; docs-only PRs are handled by `docs-check` in `ci.yaml`.
`tools/docs/**`	—	✓	Same reasoning as `docs/**`.
`**/*.md`	—	✓	Catches `README.md`, `CLAUDE.md`, inline docs.

Layer 2 — the `changes:` gate (per-job skipping)

Inside ci.yaml a cheap changes: job runs dorny/paths-filter and emits boolean outputs. Every other job depends on changes and guards itself with an if: expression:

Output	Filter	Downstream jobs that require it
`code`	`**` minus `.planwerk/**`, `docs/**`, `tools/docs/**`, `**/*.md`, `LICENSE`, `CODE_OF_CONDUCT` (AND semantics, see below)	`lint`, `tidy`, `unit`, `vuln`, `race`, `integration`, `integration-cli`, `actionlint`, `authz-lint`, `hadolint`, `e2e`, `openapi-lint`, `generated-drift`, `image-scan`, `fuzz-selector`, `fuzz-signing-rotation`
`docs`	`docs/**`, `tools/docs/**`, `**/*.md`	`docs-check` (also runs on `code == 'true'` so a Go-side rename that breaks a doc link still fires lychee)

code and docs are computed by two separate dorny/paths-filter steps because the two need opposite matching semantics. dorny/paths-filter compiles each pattern into its own matcher and, with its default predicate-quantifier: some, a file matches a filter when it matches any single pattern. That is correct for docs and the per-context filters (positive include lists), but it makes an "everything except docs" filter impossible: the leading ** matches every file, so the !docs/** / **/*.md negations never subtract and code would be true for every PR — a docs-only PR would run the full matrix. The code step therefore sets predicate-quantifier: every, switching to AND semantics: a file matches code only when it satisfies all patterns (matches ** and is not under docs/, tools/docs/, .planwerk/, and is not a Markdown / LICENSE / CODE_OF_CONDUCT file).

The code filter covers the signing bounded context (internal/signing/**) via the unconditional ** glob — a change to any signing source, proto, or migration re-runs unit, integration, and generated-drift. The drift gate TestCIWorkflow_SigningPathsCoveredByCodeFilter rejects any PR that adds an !internal/signing/** negation or moves the code filter back to the default some quantifier.
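
The difference between the two quantifiers can be modelled in a few lines of shell. This is only a sketch and not how dorny/paths-filter is implemented: each pattern is stood in for by a hypothetical predicate function (p_everything, p_not_docs, p_not_md), and shell case globs merely approximate the action's glob matching.

```shell
# Each pattern is modelled as a predicate: exit 0 when the file satisfies it.
# The function names are illustrative and not part of any real tool.
p_everything() { return 0; }   # stands in for **
p_not_docs() {                 # stands in for !docs/** and !tools/docs/**
  case $1 in docs/*|tools/docs/*) return 1 ;; esac
  return 0
}
p_not_md() {                   # stands in for !**/*.md
  case $1 in *.md) return 1 ;; esac
  return 0
}

# some: the file matches when ANY pattern accepts it (OR semantics).
match_some() {
  local file p
  file=$1
  shift
  for p in "$@"; do
    "$p" "$file" && return 0
  done
  return 1
}

# every: the file matches only when ALL patterns accept it (AND semantics).
match_every() {
  local file p
  file=$1
  shift
  for p in "$@"; do
    "$p" "$file" || return 1
  done
  return 0
}
```

With docs/guide.md as input, match_some succeeds because p_everything accepts the file on its own, while match_every fails because p_not_docs rejects it. That is the behaviour the code filter relies on.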

What this means in practice:

Plan-only commit — paths-ignore skips both workflows entirely. No jobs run at all.
Docs-only PR — pr-smoke is skipped by paths-ignore. In ci.yaml, only the changes: gate and docs-check run (all other jobs see code == 'false' and skip).
Code-only commit — changes: emits code=true, docs=false. Every code-gated job runs, and docs-check runs too because its guard is docs OR code.
Mixed commit — both outputs are true; the full matrix runs.
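
The four cases above can be previewed for a local diff with a small classifier. Treat it as a rough sketch under stated assumptions: the helper names (is_plan, is_docs, is_code, classify_diff) are hypothetical, the case globs only approximate the workflow patterns, and the per-context filter outputs are not modelled.

```shell
# Shell approximations of the path patterns described in this section.
is_plan() { case $1 in .planwerk/*) return 0 ;; esac; return 1; }
is_docs() { case $1 in docs/*|tools/docs/*|*.md) return 0 ;; esac; return 1; }
is_code() {
  is_plan "$1" && return 1
  is_docs "$1" && return 1
  case $1 in LICENSE|CODE_OF_CONDUCT) return 1 ;; esac
  return 0
}

# classify_diff: reads changed paths (one per line) on stdin and prints
# which workflows would start and what the changes: gate would emit.
classify_diff() {
  local f n all_plan all_smoke_ignored code docs smoke docs_check
  n=0; all_plan=true; all_smoke_ignored=true; code=false; docs=false
  while IFS= read -r f; do
    [ -n "$f" ] || continue
    n=$((n + 1))
    is_plan "$f" || all_plan=false
    { is_plan "$f" || is_docs "$f"; } || all_smoke_ignored=false
    if is_docs "$f"; then docs=true; fi
    if is_code "$f"; then code=true; fi
  done
  if [ "$n" -eq 0 ]; then echo "no changed files"; return 1; fi
  # Layer 1: every changed file is ignored, so the workflow never starts.
  if $all_plan; then
    echo "ci.yaml=skipped pr-smoke.yaml=skipped"
    return 0
  fi
  smoke=runs; docs_check=skipped
  if $all_smoke_ignored; then smoke=skipped; fi
  # Layer 2: docs-check is guarded by docs OR code.
  if $docs || $code; then docs_check=runs; fi
  echo "ci.yaml=runs pr-smoke.yaml=$smoke code=$code docs=$docs docs-check=$docs_check"
}
```

A typical invocation would be `git diff --name-only origin/main... | classify_diff`, assuming origin/main is the PR base.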

Adding a new job

Add the job under jobs: in ci.yaml with needs: [changes] and an if: expression referencing needs.changes.outputs.<flag>. The drift gate TestCIWorkflow_DownstreamJobsGuardOnChanges fails the PR if either piece is missing.
If the new job needs its own change-category flag, extend both the filters: block and the outputs: map in changes:. The drift gate TestCIWorkflow_ChangesJobShape keeps them in sync.
Update the per-job table below and, when relevant, the table above.
If the job is expensive (testcontainers, kind, image build), stage it behind the fast gates — see Staged execution below — instead of letting it fan out in parallel.

Forcing a full run

If you need the full matrix despite a docs-only diff (e.g. you want to dry-run a failing image-scan), amend the branch with a whitespace-only change to any code-matched file — the changes: gate will flip code to true and every job will run.

Staged execution (fast gates before expensive jobs)

The changes: gate decides whether a job is relevant to the diff, but it does not order the jobs. Historically every relevant job fanned out in parallel the instant changes finished, so a red lint still burned runners on the full testcontainers / kind / image fleet before the cheap signal ever came back.

The expensive jobs are therefore staged behind a tier of cheap, fast-feedback gates. A gate is a job that stays gated on changes alone; an expensive job lists the relevant gates in its needs: and only consumes a runner once they are green. The gate sets are:

Gate set	Gate jobs	Expensive jobs that wait for it
Go fast gates	`lint`, `tidy`, `unit`, `cli`, `generated-drift`	`vuln`, `race`, `integration`, `integration-cli`, `e2e`, `image-scan`, `credentials-broker`, `cloud-credentials`, `management-fleet`, `blueprints`, `provisioning-broker`, `resource-adoption-migration`, `dr`

The remaining cheap jobs (actionlint, authz-lint, hadolint, openapi-lint, docs-check) and the opt-in lanes (fuzz-*, dev-smoke, dev-stack-smoke) keep needs: [changes] — gating them buys little and only adds latency.

Each staged job pairs the gate dependencies with a status-guarded if::

yaml

needs: [changes, lint, tidy, unit, cli, generated-drift]
if: >-
  always() &&
  needs.changes.outputs.code == 'true' &&
  !contains(needs.*.result, 'failure') &&
  !contains(needs.*.result, 'cancelled')

always() keeps the job reachable when an upstream gate was skipped (e.g. a schedule / workflow_dispatch event where code is unset); without it a skipped dependency would cascade-skip the job through the implicit success().
!contains(needs.*.result, 'failure') short-circuits the expensive run the moment any gate fails — that is the resource saver.
!contains(needs.*.result, 'cancelled') keeps a cancelled run from spinning up new expensive work.
The path-filter clause (needs.changes.outputs.<flag>) stays so a docs-only or out-of-scope PR still skips the job entirely.

A failed gate stays red and blocks the merge on its own; the expensive jobs it gates are skipped (neutral), so branch protection is still satisfied only when the cheap signal is green. The staging is pinned by the drift gate TestCIWorkflow_ExpensiveJobsAwaitFastGates in tests/workspace/ci_workflow_test.go: dropping an expensive job back to needs: [changes], or removing a status guard, fails the PR. When you add a new expensive job, list its fast gates in needs:, add the three status guards to its if:, and extend expensiveJobGates in that test.
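
To reason about the staged if: expression without pushing a commit, its truth table can be mimicked in shell. This is a minimal sketch of the expression shown above and not a GitHub Actions evaluator; should_run is a hypothetical helper name.

```shell
# should_run CODE_FLAG RESULT...
# CODE_FLAG is the changes job's `code` output; each RESULT is one entry of
# needs.*.result (success, failure, cancelled, or skipped).
should_run() {
  local code r
  code=$1
  shift
  # Path-filter clause: needs.changes.outputs.code == 'true'
  [ "$code" = "true" ] || return 1
  for r in "$@"; do
    case $r in
      # The two !contains(needs.*.result, ...) guards.
      failure|cancelled) return 1 ;;
    esac
  done
  # A skipped gate does not block the job: that is what always() buys.
  return 0
}
```

For example, `should_run true success success skipped` succeeds, while a single failure or cancelled result anywhere in the list makes it fail.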

The jobs

Job ID	Trigger	Local command	Required tools	Artefacts	Branch-protection name
`changes`	PR	— (path-filter only)	`dorny/paths-filter` (action)	—	`changes`
`lint`	PR	`make lint`	Go 1.26, golangci-lint (source-built)	—	`lint`
`tidy`	PR	`make tidy-check`	Go 1.26	—	`tidy`
`unit`	PR	`make test`	Go 1.26	Codecov upload (`coverage/coverage.out`)	`unit`
`vuln`	PR	`make vuln`	Go 1.26, `govulncheck`	—	`vuln`
`race`	PR	`make test-race`	Go 1.26	—	`race`
`integration`	PR	`make test-integration`	Go 1.26, Docker engine	`testcontainers-log` artefact	`integration`
`actionlint`	PR	`make actionlint`	Go 1.26, `actionlint`	—	`actionlint`
`authz-lint`	PR	`make authz-lint`	Go 1.26, `zed` (source-built, pinned by `ZED_VERSION`)	—	`authz-lint`
`hadolint`	PR	`make hadolint`	Docker (hadolint image)	—	`hadolint`
`e2e`	PR	`make e2e`	Go 1.26, kind, chainsaw, Docker	—	`e2e`
`openapi-lint`	PR	`make openapi-lint`	Go 1.26, Node (`.nvmrc`), Spectral	—	`openapi-lint`
`generated-drift`	PR	`make generate-check`	Go 1.26, Node (`.nvmrc`)	—	`generated-drift`
`docs-check`	PR	`make docs-check`	Node (`.nvmrc`), lychee (Rust release binary or Docker fallback)	—	`docs-check`
`sbom`	tag push `v*` + `workflow_dispatch` (in `release.yaml`, not `ci.yaml`)	`make sbom`	Go 1.26, Docker, `syft`	`sbom-plexsphere`, `sbom-plexsphere-signer` (SPDX-JSON)	— (release lane, not branch-protection required)
`image-scan`	PR	`make image-scan`	Go 1.26, Docker, `trivy`	SARIF under `bin/scan/*.sarif` (upload to Code Scanning temporarily disabled, see §Reproducing failures)	`image-scan`
`cli`	PR	`make plexctl-build && go test ./cmd/plexctl/...`	Go 1.26	Codecov upload (`coverage/plexctl-coverage.out`, flag `plexctl`)	`cli`
`integration-cli`	PR	`make test-cli-integration`	Go 1.26	—	`integration-cli`
`fuzz-selector`	PR (30s budget) + schedule (5 min budget) + `workflow_dispatch`	`make fuzz-selector`	Go 1.26	Fuzz-corpus cache (`~/.cache/go-build/fuzz`)	`fuzz-selector`
`fuzz-signing-rotation`	PR (30s budget) + schedule (5 min budget) + `workflow_dispatch`	`make fuzz-signing-rotation`	Go 1.26	Fuzz-corpus cache (`~/.cache/go-build/fuzz`)	`fuzz-signing-rotation`
`credentials-broker`	PR (gated by `credentials_broker` paths-filter output)	`make test-credentials-broker`	Go 1.26, Docker engine	—	`credentials-broker`
`cloud-credentials`	PR (gated by `cloud_credentials` paths-filter output)	`make test-cloud-credentials`	Go 1.26, Docker engine	—	`cloud-credentials`
`management-fleet`	PR (gated by `management_fleet` paths-filter output)	`make test-management-fleet`	Go 1.26, Docker engine	—	`management-fleet`
`blueprints`	PR (gated by `blueprints` paths-filter output)	`make test-blueprints`	Go 1.26, Docker engine	—	`blueprints`
`provisioning-broker`	PR (gated by `provisioning_broker` paths-filter output)	`make test-provisioning-broker`	Go 1.26, Docker engine	—	`provisioning-broker`
`resource-adoption-migration`	PR (gated by `resource_adoption_migration` paths-filter output)	`make test-resource-adoption-migration`	Go 1.26, Docker engine	—	`resource-adoption-migration`
`dr`	PR (gated by `dr` paths-filter output)	`make test-dr`	Go 1.26, Docker engine	—	`dr`
`pr-smoke`	PR only	`make lint && make tidy-check && go build ./...`	Go 1.26, golangci-lint	—	`pr-smoke`
`dev-smoke`	schedule (nightly 03:00 UTC) + `workflow_dispatch` + PR only when labelled `dev`	`make dev` (then `make dev-down`)	Go 1.26, kind, kubectl, Docker	—	— (not branch-protection required)
`dev-stack-smoke`	schedule (nightly 03:00 UTC) + `workflow_dispatch` + PR only when labelled `dev-stack`	`make dev` (then golden-flow chainsaw, then `make dev-down`)	Go 1.26, kind, kubectl, Docker, chainsaw	—	— (not branch-protection required)

Every row whose local command starts with make resolves to a target declared in Makefile; the drift test TestCIDocLocalCommandsResolveToMakeTargets enforces that contract. The one non-Make row (pr-smoke) is exempt because its CI shape is guarded by a dedicated test (the pr_smoke_workflow_test.go suite). The dev-smoke row is deliberately NOT in the requiredCIJobs map: it runs on a schedule and on label-gated PRs only, is documented in docs/tutorials/set-up-local-plexsphere.md, and is guarded by TestCIWorkflow_DevSmokeJobShape / TestCIWorkflow_DevSmokeAlwaysRunsTeardown. The dev-stack-smoke row follows the same posture: schedule + opt-in PR label only, NOT branch-protection required, documented alongside the dev-stack contract in docs/reference/dev-stack/index.md, and guarded by the tests/workspace/dev_stack_ci_test.go shape gates.
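
Before editing the table it can help to check by hand that a command resolves to a target. The sketch below is a rough laptop approximation of that contract, not the Go drift test itself: the helper names are hypothetical, only the first `make <target>` on each line is checked, and only simple `target:` rule lines are recognised.

```shell
# make_target_declared MAKEFILE TARGET
# Exit 0 when MAKEFILE has a rule line that starts with "TARGET:".
make_target_declared() {
  local makefile target
  makefile=$1
  target=$2
  grep -Eq "^${target}:" "$makefile"
}

# check_make_commands MAKEFILE
# Reads local commands (one per line) on stdin, reports every leading
# `make <target>` whose target is missing, and exits 1 if any is missing.
check_make_commands() {
  local makefile line target status
  makefile=$1
  status=0
  while IFS= read -r line; do
    case $line in
      "make "*) ;;
      *) continue ;;          # non-make commands are out of scope
    esac
    target=${line#make }      # drop the leading "make "
    target=${target%% *}      # keep only the first word after it
    if ! make_target_declared "$makefile" "$target"; then
      echo "missing target: $target"
      status=1
    fi
  done
  return "$status"
}
```

Usage: `printf '%s\n' 'make lint' 'make test-race' | check_make_commands Makefile`.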

The sbom row lives in a separate workflow, .github/workflows/release.yaml, and is the one job that does NOT run on a pull request. An SBOM is a record of what is actually shipped, so it is generated at the moment a release tag (v*) is cut — not on every code PR, where the artefact had no consumer and paid a full docker-build for nothing. ci.yaml's deliberate "no push trigger" posture reserved exactly this push-gated artifact lane; the release sbom job fills it. It is NOT in the requiredCIJobs map and is NOT branch-protection required; its shape is guarded by tests/workspace/release_workflow_test.go. The durable step — attaching each SBOM to the published image / GitHub Release as an in-toto attestation — lands together with the image-publish pipeline; until then the tag run uploads the two SPDX-JSON files as workflow artefacts.

Caching and runtime budget

The workflows are aggressively cached so a warm PR reruns in a fraction of the cold-run cost. The following table lists each cache layer, the path it covers, the invalidation key, and where the drift-gate lives when one is installed. Removing or narrowing any of the caches below is a runtime regression — revert or open a ticket before landing.

Layer	Jobs	Path	Invalidation key	Drift gate
Go module + build cache	every Go job	`~/go/pkg/mod`, `~/.cache/go-build`	`go.sum` (via `actions/setup-go`'s built-in `cache: true`)	— (setup-go internal)
GOBIN (installed Go tools)	`lint`, `vuln`, `actionlint`, `authz-lint`, `image-scan`, `generated-drift`, `pr-smoke`, release `sbom`	`~/go/bin`	`hashFiles('.go-version', 'Makefile')` per-job scope (`go-bin-<job>-…`)	—
npm global cache	`openapi-lint`, `generated-drift`, `docs-check`	setup-node's managed cache path	Respective `package-lock.json` via `cache: npm` + `cache-dependency-path`	—
Lychee binary	`docs-check`	`~/.local/bin/lychee`	`env.LYCHEE_VERSION`	—
Trivy vulnerability DB	`image-scan`	`~/.cache/trivy`	Stable `trivy-db-${{ runner.os }}` (trivy's own 24 h freshness check rolls the DB forward on-disk)	`TestCIWorkflowTrivyDBCacheKeyIsStable`
Buildx layer cache (GHA backend)	`image-scan` (ci.yaml), `sbom` (release.yaml)	GitHub Actions cache (type=gha)	Per-component scope `docker-plexsphere` / `docker-plexsphere-signer`; `mode=max`	`TestCIWorkflowBuildxCachedJobsSetUpBuildx` (image-scan), `TestReleaseJobSetsUpBuildxAndGHACache` (release sbom)
Concurrency (PR supersession)	every ci.yaml / pr-smoke.yaml job	—	`${{ github.workflow }}-${{ github.ref }}` with `cancel-in-progress: ${{ github.event_name == 'pull_request' }}`	`TestCIWorkflowDeclaresConcurrencyGroup`

Key invariants worth calling out:

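Concurrency cancels superseded PR runs only. Because ci.yaml has no push trigger, the runs this protects are the non-PR ones (the nightly schedule and workflow_dispatch): cancelling those would leave holes in the CI history and hide regressions. The drift gate TestCIWorkflowDeclaresConcurrencyGroup rejects a bare cancel-in-progress: true.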
Per-job GOBIN cache scopes (go-bin-lint-…, go-bin-vuln-…, …) avoid cross-job thrashing. A shared namespace would cause jobs installing different tool subsets to overwrite each other's cache, forcing reinstalls on the next run.
Buildx GHA cache requires docker/setup-buildx-action. The stock docker driver on ubuntu-latest supports --load but not --cache-to type=gha; the image-scan job (and the release lane's sbom job) set up buildx first, then set BUILDX_GHA_CACHE=1 on the make step so Makefile's docker-build target expands per-component --cache-from / --cache-to type=gha,mode=max flags. Dropping either half silently turns the cache into a no-op — the drift gate TestCIWorkflowBuildxCachedJobsSetUpBuildx rejects that.
The lint job checks out at the default depth (1). golangci-lint does not consult git history (no new: / new-from-rev: / revgrep entries in .golangci.yml), so a full-history clone would be pure waste. TestCIWorkflowLintJobOmitsFetchDepth prevents a drift back to fetch-depth: 0.
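
The per-job GOBIN scoping and its content-based invalidation can be illustrated with a small key builder. This is a sketch under assumptions: content_hash and gobin_cache_key are hypothetical helpers, and GitHub's hashFiles computes its digest differently, so the hash values will not equal the real cache keys. Only the shape (go-bin-<job>-<hash>) and the invalidation behaviour carry over.

```shell
# content_hash FILE...
# Digest of the files' contents only; modification times do not count.
content_hash() {
  if command -v sha256sum >/dev/null 2>&1; then
    cat "$@" | sha256sum | cut -d ' ' -f 1
  else
    cat "$@" | shasum -a 256 | cut -d ' ' -f 1
  fi
}

# gobin_cache_key JOB FILE...
# Builds a per-job key so jobs that install different tool subsets never
# share (and never overwrite) a cache entry.
gobin_cache_key() {
  local job
  job=$1
  shift
  printf 'go-bin-%s-%s\n' "$job" "$(content_hash "$@")"
}
```

Running `gobin_cache_key lint .go-version Makefile` and `gobin_cache_key vuln .go-version Makefile` yields two distinct keys from the same inputs, and either key changes only when the contents of one of the hashed files change.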

Clearing a cache while debugging

If a cache entry is ever poisoned (e.g. a bad golangci-lint binary, a corrupt trivy DB shard), rotate the relevant pin rather than trying to delete the cache from the GitHub UI — a fresh key forces a fresh entry and the LRU eviction takes care of the old one:

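GOBIN cache — commit any content change to Makefile (for example a comment). The key is built with hashFiles, which hashes file contents, so a bare touch that only updates the modification time does not rotate it.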
npm cache — re-run npm install locally, commit the updated package-lock.json.
Lychee binary — bump LYCHEE_VERSION in both the docs-check job's job-level env: block and Makefile.
Trivy DB — the cache key is stable (trivy-db-<os>), so a poisoned entry does NOT roll over automatically. Append a suffix to the key in ci.yaml's Cache Trivy vulnerability database step (e.g. trivy-db-${{ runner.os }}-v2) and leave restore-keys untouched so everyday runs keep warming from the prior snapshot. Trivy's own 24 h freshness check refreshes the on-disk DB between key rotations, so this is only needed after a DB-schema break.
Buildx GHA cache — force a cache miss by changing the scope name in Makefile's BUILDX_CACHE_ARGS_* variables, or use the gh cache delete CLI.

Reproducing failures

Each subsection below walks through the minimum steps to reproduce a red run locally. The goal is a sub-minute loop for the fast jobs and a sub-ten-minute loop for the container-heavy ones.

lint

bash

# Install the pinned golangci-lint (same line CI runs — see ci.yaml).
GOTOOLCHAIN=local go install github.com/golangci/golangci-lint/v2/cmd/golangci-lint@v2.11.4

# Reproduce:
make lint

make lint runs golangci-lint on the root module, then make depguard-all, then the meta-tests under tests/workspace/ and tests/docs/. If a per-file finding is too noisy to fix inline, read docs/contributing/toolchain.md#static-analysis-golangci-lint for the //nolint:<name> // <reason> escape hatch — the reason is required, and reviewers will push back on a bare disable.

make depguard-all is a per-module sweep: golangci-lint run ./... is module-scoped, so a single root invocation lints only the root module and never reaches the bounded-context (internal/<ctx>), cmd/, or tests/ modules — each of which is its own Go module. The sweep iterates go list -m and runs golangci-lint with --enable-only depguard inside every workspace module, so the cross-context, persistence, and net/http boundary rules in .golangci.yml are actually enforced where they apply. To reproduce a depguard-only failure without the rest of the lint gate:

bash

make depguard-all
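
The per-module sweep described above has roughly the following shape. This is a sketch and not the Makefile recipe itself: for_each_module is a hypothetical helper, and the real target may iterate or report differently.

```shell
# for_each_module CMD [ARG...]
# Reads module directories (one per line) on stdin, runs CMD inside each,
# and keeps going after a failure so every module gets reported.
for_each_module() {
  local dir status
  status=0
  while IFS= read -r dir; do
    [ -n "$dir" ] || continue
    echo "==> $dir"
    # </dev/null keeps CMD from swallowing the rest of the module list.
    if ! (cd "$dir" && "$@" </dev/null); then
      status=1
    fi
  done
  return "$status"
}
```

From the repository root, something like `go list -m -f '{{.Dir}}' | for_each_module golangci-lint run --enable-only depguard ./...` would then lint each workspace module in turn and exit non-zero if any module fails.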

openapi-lint

bash

# The Node toolchain is pinned by the repo-root .nvmrc the job reads.
# `nvm install` installs the version if it is missing, then activates it.
nvm install "$(cat .nvmrc)"

# Reproduce:
make openapi-lint

make openapi-lint drives Spectral against api/openapi/plexsphere-v1.yaml with the ruleset at tools/openapi/.spectral.yaml. A failure here is authored, not generated — fix the YAML. The authoring workflow is documented at docs/contributing/openapi.md.

kind / chainsaw (e2e)

bash

# The Makefile builds plexsphere images, boots kind, and side-loads
# the e2e images before chainsaw runs — no extra setup needed.
make e2e

A failed chainsaw step prints the failing manifest path and the namespace; re-running with chainsaw test --test-dir <dir> narrows the loop while you iterate. The image-load scripts tests/e2e/bootstrap/kind-load.sh, tests/e2e/openapi/kind-load-no-v1.sh, and tests/e2e/messaging/kind-load.sh are idempotent, so you can rerun them without tearing the cluster down.

docs-check

bash

# Node toolchain from the canonical .nvmrc.
# `nvm install` installs the version if it is missing, then activates it.
nvm install "$(cat .nvmrc)"

# Reproduce:
make docs-check

make docs-check runs markdownlint-cli2 over docs/**/*.md README.md CLAUDE.md and then lychee over the same set to verify every relative link resolves. The lychee binary is resolved in priority order: a local release binary first, then the official Docker image as the fallback. Each failure line names the offending file, the line number, and either the broken link or the markdown rule that fired. Fix the file and re-run the target until the output is clean.

sbom

bash

# Docker must be running — `make sbom` calls `$(MAKE) docker-build`
# to materialise the two images syft scans.
make sbom

make sbom does NOT run on a pull request — it runs in the release lane (.github/workflows/release.yaml) on a v* tag push (or workflow_dispatch), because an SBOM is a record of what is actually shipped and only means something at the tag. The make recipe is identical whether you invoke it on a laptop or the release lane invokes it.

make docker-build tags each built image TWICE: once with the branch-tracking :$(VERSION) reference, and once with the stable :$(CI_TAG) alias (default ci). make sbom scans the :ci tag so the scanned reference is byte-identical across a laptop run and a CI run regardless of the repo's VERSION state. Override the alias with make sbom CI_TAG=<alias> when experimenting, but the committed default is what the workspace tests assert. The target emits bin/sbom/plexsphere.spdx.json and bin/sbom/plexsphere-signer.spdx.json; the release-lane job uploads each as a named artefact (sbom-plexsphere, sbom-plexsphere-signer) so Dependency-Track-style ingestion tools can consume one SBOM per image. A red run still ships whatever syft emitted before the failure — the two upload steps are unconditional on purpose.
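
The double-tagging scheme is easy to see with a tiny helper that prints both references for an image. This is an illustrative sketch only: image_refs is a hypothetical function, and the real tagging happens inside the Makefile's docker-build target.

```shell
# image_refs NAME VERSION [CI_TAG]
# Prints the branch-tracking reference first, then the stable alias.
# CI_TAG defaults to "ci", mirroring the default described above.
image_refs() {
  local name version ci_tag
  name=$1
  version=$2
  ci_tag=${3:-ci}
  printf '%s:%s\n%s:%s\n' "$name" "$version" "$name" "$ci_tag"
}
```

Whatever VERSION resolves to, the second line stays the same, which is why scanning the alias gives the same reference on a laptop and in CI.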

image-scan

bash

# Same docker dependency as sbom — make image-scan shares the
# docker-build prerequisite.
make image-scan

make image-scan runs Trivy with --severity HIGH,CRITICAL --exit-code 1 --ignore-unfixed --ignorefile .trivyignore --format sarif against the stable plexsphere:ci / plexsphere-signer:ci aliases that make docker-build produces (see the sbom section above for the tag rationale) and writes SARIF to bin/scan/<image>.sarif. A non-zero exit is the gate firing on a real finding; suppressing it is a documented policy change that requires a justified entry in .trivyignore (the tests/workspace/trivyignore_test.go gate rejects an unjustified entry).

The CI job would normally upload the SARIF to the Security tab via github/codeql-action/upload-sarif even on a failing run so findings remain visible. That step is temporarily commented out in .github/workflows/ci.yaml because this repository is private and does not have GitHub Advanced Security enabled (upload-sarif fails with "Advanced Security must be enabled for this repository to use code scanning"). The HIGH/CRITICAL gate still fires in CI; only the Security-tab dashboard is unavailable until the repo goes public or Advanced Security is licensed. Re-enable the upload by uncommenting the two Upload Trivy SARIF steps in the image-scan job.
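
To scan a single image by hand while iterating on a fix, the flags above can be assembled into one command line. The sketch below only prints the command; the `trivy image` subcommand and the --output flag are assumptions about how the Makefile wires the scan, and trivy_scan_cmd is a hypothetical helper.

```shell
# trivy_scan_cmd IMAGE [CI_TAG]
# Prints (does not run) a Trivy invocation carrying the flags listed above.
# Assumes the SARIF report goes to bin/scan/<image>.sarif via --output.
trivy_scan_cmd() {
  local image ci_tag
  image=$1
  ci_tag=${2:-ci}
  printf 'trivy image --severity HIGH,CRITICAL --exit-code 1 --ignore-unfixed --ignorefile .trivyignore --format sarif --output bin/scan/%s.sarif %s:%s\n' \
    "$image" "$image" "$ci_tag"
}
```

Inspect the output of `trivy_scan_cmd plexsphere` first, then pipe it to sh once the image exists locally and bin/scan/ has been created.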

generated-drift

bash

# Same Node toolchain as openapi-lint.
# `nvm install` installs the version if it is missing, then activates it.
nvm install "$(cat .nvmrc)"

# Regenerate locally, then re-check:
make generate
make generate-check

A failure means the committed OpenAPI Go artefacts (the generated types, server interface, and client) do not match a fresh make generate run. The fix is always to regenerate, stage the diff, and push — never to hand-edit a generated file. See docs/contributing/openapi.md for the full authoring loop.
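
A drift gate of this kind usually boils down to "regenerate, then require a clean working tree". The sketch below shows that pattern under assumptions: check_generated is a hypothetical helper, and the real make generate-check target may compare files differently.

```shell
# check_generated REPO_DIR GENERATE_CMD [ARG...]
# Runs the generator inside REPO_DIR, then fails when the git working tree
# differs from what is committed.
# Exit codes: 0 clean, 1 drift detected, 2 generator or git error.
check_generated() {
  local repo drift
  repo=$1
  shift
  (cd "$repo" && "$@") || return 2
  drift=$(git -C "$repo" status --porcelain) || return 2
  if [ -n "$drift" ]; then
    echo "generated files are stale:"
    echo "$drift"
    return 1
  fi
  return 0
}
```

For example, `check_generated . make generate` on a clean checkout reports nothing when the committed artefacts are current and lists the stale files otherwise.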

Cross-references

docs/contributing/testing.md — the test pyramid, build tags, and the shared internal/platform/testutil harness.
docs/contributing/toolchain.md — pinned Go and golangci-lint versions, SHA-pinning policy for GitHub Actions, Dependabot wiring.
docs/contributing/openapi.md — the OpenAPI-first authoring workflow that the openapi-lint and generated-drift jobs protect.
docs/contributing/docs-preview.md — laptop-only make docs-preview / make docs-build workflow for the VitePress preview of the docs/ corpus.
CLAUDE.md — top-level contributor rules, including the tests-and-documentation-from-the-start contract this pipeline enforces.
