Module probe_runners

Module probe_runners 

Source
Expand description

Per-kind probe runners (RFC-0007 §3.1). Each runner consumes a ProbeDecl + returns a RunnerOutcome. Uniform strict-mode semantics: any runtime error → ProbeStatus::Fail with a failure_reason string. Per RFC-0007 §6 there is no Unknown or “swallowed error” class.

Runners are pure (modulo I/O and the system clock) — they don’t emit events; the probe worker handles event emission + state tracking. Each runner is Send + 'static so it can be tokio::spawn’d.

Modules§

evidence
Evidence probe runner (RFC-0007 §3.1 + §7). READ-ONLY consumer of the local collector unit’s signed evidence file.
exec
Exec probe runner (RFC-0007 §3.1). Pass iff exit code 0 within timeoutSecs wallclock. Argv runs as the agent’s user; declare absolute paths to avoid PATH surprises.
http
HTTP probe runner (RFC-0007 §3.1). GET <url> with timeoutSecs wallclock budget; Pass iff response status matches expectStatus. Error classes that count as Fail (RFC-0007 §6 uniform strict mode):
tcp
TCP probe runner (RFC-0007 §3.1). Pass iff connect_timeout_secs TCP connect succeeds against host:port. host defaults to 127.0.0.1 if absent.

Structs§

ControlOverrideDecl
Single entry in controlOverrides / controls (RFC-0007 §3.4 per-control granularity). mode is the effective mode for the control; reason is operator-facing audit rationale, surfaced in event_log + dashboards.
ProbeDecl
On-disk probe declaration. Loaded from /etc/nixfleet/agent/health-checks.json (rendered from lib/mk-fleet.nix:effectiveHealthChecks by _agent.nix).
RunnerOutcome
Output of one runner invocation.

Constants§

FAILURE_REASON_MAX_LEN
LOADBEARING: per-failure cap on failure_reason string length keeps the wire body bounded. Without truncation, runners can emit arbitrarily long stderr / response bodies that inflate the outbound queue’s JSON payloads and event-log row sizes. Runners pass their failure-reason strings through truncate_reason before constructing a RunnerOutcome::Fail.
MIN_INTERVAL_SECS
LOADBEARING: floor on probe interval guards against a misconfigured 0/1-second probe DOSing the host. Operator-declared intervalSeconds values below this are rounded up at the worker layer (crate::runtime::workers::probe::spawn clamps via interval_seconds.max(MIN_INTERVAL_SECS)). A weaker .max(1) floor would still let a 1-second HTTP probe issue 60 reqs/min against an operator-unintended backend.

Functions§

default_connect_timeout_secs 🔒
default_evidence_path 🔒
default_expect_status 🔒
default_interval_seconds 🔒
default_timeout_secs 🔒
run
Dispatch on decl.kind. Unknown kinds fail closed.
truncate_reason
Truncate to FAILURE_REASON_MAX_LEN chars; appends "...[truncated]" when truncation fires. UTF-8 safe: bumps end back to the prior char boundary if a multibyte sequence would be split.