Health Check / Heartbeat Service — Detailed
flowchart TB
subgraph Nodes
N1[Node 1]
N2[Node 2]
NN[Node N]
end
subgraph Modes[Probe modes]
ACTIVE[Active checks - HTTP/TCP/gRPC]
PASSIVE[Passive - outcome-based]
HB[Heartbeat push from node]
PHI[Phi-accrual detector]
end
subgraph Service
COL[Collector cluster]
STATE[(Health state KV)]
GOSSIP[Gossip across collectors]
RULES[Status rules]
DEPS[Dependency graph]
end
subgraph Reactions
LB[LB pool update]
ROUTE[Routing change]
ALERT[Alerts / pages]
RUNBOOK[Automated remediation]
end
Nodes --> Modes --> Service --> Reactions
classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
class LB edge;
class N1,N2,NN,ACTIVE,PASSIVE,HB,PHI,COL,GOSSIP,RULES,DEPS,ROUTE,RUNBOOK service;
class STATE datastore;
class ALERT obs;
Failure detection
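- A binary up/down verdict is brittle: a single missed check or a fixed timeout flips the node's state, which causes flapping under jitter or transient packet loss.
- A phi-accrual detector outputs a continuous suspicion value (phi) that grows as the time since the last heartbeat exceeds the observed inter-arrival history. Phi is a suspicion level on a log scale, not a direct probability that the host is dead; reactions trigger when phi crosses a threshold.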
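The phi-accrual idea above can be sketched as follows. This is a minimal illustration, not a production detector: it assumes heartbeat inter-arrival times are roughly normally distributed, and the class name `PhiAccrualDetector`, the `window` size, and the `min_std` floor are hypothetical choices made for the example.

```python
import math
from collections import deque
from statistics import mean, pstdev


class PhiAccrualDetector:
    """Sketch of a phi-accrual detector over heartbeat arrival times (assumed design)."""

    def __init__(self, window=100, min_std=0.1):
        self.intervals = deque(maxlen=window)  # recent inter-arrival times, in seconds
        self.last_arrival = None
        self.min_std = min_std  # floor so a perfectly regular stream never yields std = 0

    def heartbeat(self, now):
        """Record a heartbeat that arrived at time `now` (seconds)."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def phi(self, now):
        """Return the suspicion level at time `now`; 0.0 until there is history."""
        if not self.intervals:
            return 0.0
        mu = mean(self.intervals)
        sigma = max(pstdev(self.intervals), self.min_std)
        elapsed = now - self.last_arrival
        # Probability that a heartbeat arrives later than `elapsed`, under a normal model.
        p_later = 0.5 * math.erfc((elapsed - mu) / (sigma * math.sqrt(2)))
        p_later = max(p_later, 1e-300)  # avoid log10(0) for very long silences
        return -math.log10(p_later)


detector = PhiAccrualDetector()
for t in range(11):            # heartbeats at t = 0, 1, ..., 10 seconds
    detector.heartbeat(float(t))

THRESHOLD = 8.0                # assumed threshold; tune per deployment
suspected = detector.phi(12.0) > THRESHOLD
```

Because phi is continuous, different reactions can use different thresholds on the same signal, for example a lower one for removing a node from the LB pool and a higher one for paging.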
Probe vs heartbeat
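- Active probe = the control plane initiates the check; a success also verifies that the network path from checker to node works.
- Heartbeat = the node initiates a periodic push; cheaper at scale, since the collector only records arrivals instead of scheduling an outbound check per node.
- The two are used together: heartbeats give cheap liveness, and active probes confirm reachability.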
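One way to combine the two signals is sketched below. It is an assumption-laden illustration: `tcp_probe`, `HeartbeatTable`, `classify`, the 3-second `ttl`, and the four status labels are hypothetical names and values, not part of the design above. The only library call used is Python's standard `socket.create_connection`.

```python
import socket
import time


def tcp_probe(host, port, timeout=1.0):
    """Active check: the checker opens a TCP connection to the node."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class HeartbeatTable:
    """Heartbeat side: nodes push, the collector only records arrival times."""

    def __init__(self, ttl=3.0):
        self.ttl = ttl          # seconds a heartbeat stays fresh (assumed value)
        self.last_seen = {}

    def record(self, node_id, now=None):
        self.last_seen[node_id] = time.monotonic() if now is None else now

    def is_fresh(self, node_id, now=None):
        now = time.monotonic() if now is None else now
        seen = self.last_seen.get(node_id)
        return seen is not None and now - seen <= self.ttl


def classify(heartbeat_fresh, probe_ok):
    """Merge both signals into one status (labels are illustrative)."""
    if heartbeat_fresh and probe_ok:
        return "healthy"
    if heartbeat_fresh:
        return "unreachable"    # node is alive, but the checker-to-node path fails
    if probe_ok:
        return "degraded"       # port answers, but heartbeats have stopped
    return "down"
```

A collector could call `classify(table.is_fresh(node_id), tcp_probe(host, port))` on each evaluation tick. The two disagreeing cases are where using both modes pays off, since neither signal alone distinguishes a dead node from a broken path.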
Glossary & fundamentals