Job / Task Scheduler: Detailed (Airflow / distributed cron / Temporal-style)

flowchart TB
  subgraph Author
    DAG[DAG / workflow definition]
    GIT[Git repo]
    UI([Web UI])
  end

  subgraph Sched[Scheduler]
    PARSE([DAG parser])
    PLAN([Run planner<br/>schedules + backfill])
    LEAD[Leader election - HA]
    LOCK[Distributed lock per DAG run]
    TIME([Cron evaluator])
  end

  subgraph Queue
    Q[[Queue per worker pool]]
    PRIO[[Priority queues]]
    DLQ[Dead-letter]
  end

  subgraph Workers
    W1([Worker / executor])
    W2([Worker])
    KEXEC[K8s executor]
    CEXEC[Celery executor]
    SUB[Sub-process / pod per task]
  end

  subgraph State[State + History]
    DB[(Metadata DB)]
    LOG[Task logs]
    TS([Trigger / event store])
    ART[Artifacts]
  end

  subgraph Reliability
    RETRY[Retry policies + backoff]
    SLA[SLA miss alerts]
    IDEMP[Idempotency for tasks]
    CKPT[Checkpoint long tasks]
  end

  subgraph Trigger
    CRON([Cron schedule])
    SENS([Sensors / external triggers])
    API([On-demand trigger API])
  end

  Author --> Sched
  Sched --> Queue --> Workers
  Workers --> State
  Reliability --- Sched
  Trigger --- Sched

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class UI client;
    class DAG,GIT,LEAD,LOCK,DLQ,KEXEC,CEXEC,SUB,ART,RETRY,IDEMP,CKPT service;
    class DB datastore;
    class Q,PRIO queue;
    class PARSE,PLAN,TIME,W1,W2,TS,CRON,SENS,API compute;
    class LOG,SLA obs;
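
The "Run planner (schedules + backfill)" box in the diagram can be illustrated with a small sketch. It assumes a fixed-interval schedule instead of a full cron expression, and it assumes Airflow-style semantics where a run for an interval becomes due only once that interval has closed. The function name `plan_runs` and the `existing` set (standing in for the metadata DB) are hypothetical names chosen for this example.

```python
from datetime import datetime, timedelta, timezone


def plan_runs(start, interval, now, existing, max_runs=None):
    """Return logical dates of runs that are due but not yet recorded.

    `existing` stands in for run records in the metadata DB (assumption).
    """
    due = []
    t = start
    # The run covering [t, t + interval) is due once the interval has closed.
    while t + interval <= now:
        if t not in existing:
            due.append(t)
            # Optional cap so a long backfill does not flood the queue.
            if max_runs is not None and len(due) >= max_runs:
                break
        t += interval
    return due


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
now = datetime(2024, 1, 4, 12, tzinfo=timezone.utc)
# The run for 2024-01-02 already exists, so the planner skips it.
existing = {datetime(2024, 1, 2, tzinfo=timezone.utc)}

runs = plan_runs(start, timedelta(days=1), now, existing)
for logical_date in runs:
    print(logical_date.isoformat())
```

Regular scheduling and backfill share the same loop here: a backfill is the same call with an earlier `start`, and `max_runs` is one possible way to throttle it.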

Correctness patterns

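  • Singleton scheduler: the HA pair runs leader election so that only one scheduler instance plans runs at a time, which avoids double-runs.
  • Idempotent tasks: each task execution is identified by (dag, run_id, task_id, attempt), so a re-execution can be recognized rather than applied blindly.
  • Workers ack work: a task is acknowledged only when it finishes and is re-queued when the worker's heartbeat is lost, so tasks must tolerate at-least-once execution.
  • Backfill: scheduling runs for historical intervals, for example after deploying a new or changed DAG.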
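
The interaction between heartbeat-based re-queueing and idempotent tasks can be sketched with an in-memory queue. This is a minimal illustration, not how any particular scheduler implements it: `LeaseQueue`, `TaskKey`, and `load` are hypothetical names, time is passed in explicitly to keep the example deterministic, and the `sink` dict stands in for an external system that supports upserts. As an assumption of this sketch, the output is keyed by (dag, run_id, task_id) and the attempt number is stored alongside it, so a second attempt overwrites the first instead of duplicating it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaskKey:
    dag: str
    run_id: str
    task_id: str


@dataclass
class LeaseQueue:
    lease_seconds: float
    pending: list = field(default_factory=list)
    leased: dict = field(default_factory=dict)    # key -> lease deadline
    attempts: dict = field(default_factory=dict)  # key -> attempts so far

    def put(self, key):
        self.pending.append(key)

    def take(self, now):
        """Hand out the next task together with its attempt number."""
        self.requeue_expired(now)
        if not self.pending:
            return None
        key = self.pending.pop(0)
        self.attempts[key] = self.attempts.get(key, 0) + 1
        self.leased[key] = now + self.lease_seconds
        return key, self.attempts[key]

    def heartbeat(self, key, now):
        # A live worker extends its lease; a dead one lets it expire.
        if key in self.leased:
            self.leased[key] = now + self.lease_seconds

    def ack(self, key):
        self.leased.pop(key, None)

    def requeue_expired(self, now):
        for key, deadline in list(self.leased.items()):
            if deadline <= now:
                del self.leased[key]
                self.pending.append(key)


sink = {}  # stands in for an external table with upsert semantics


def load(key, attempt):
    # Idempotent write: the same key always lands in the same slot.
    sink[key] = {"rows": 100, "written_by_attempt": attempt}


queue = LeaseQueue(lease_seconds=30)
key = TaskKey("daily_sales", "2024-01-01", "load")
queue.put(key)

k, attempt1 = queue.take(now=0)
load(k, attempt1)              # worker writes, then crashes before ack
idle = queue.take(now=10)      # lease still held, nothing to hand out
k, attempt2 = queue.take(now=31)  # lease expired, task is delivered again
load(k, attempt2)
queue.ack(k)
```

The task ran twice, yet `sink` holds a single record. A task that appended rows instead of upserting would have produced duplicates under the same delivery sequence.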

Workflow systems vs cron

  • Simple cron: a timer plus a job command, with no dependency tracking, retries, or run history.
  • Workflow systems (Airflow / Argo / Temporal / Cadence) add DAGs, retries, sensors, observability, and durable state.

Glossary & fundamentals

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

| Tag | Concept | What it is | Page |
|-----|---------|------------|------|
| HLD | Pub/Sub & message brokers | topics, consumer groups, delivery semantics | pub-sub-pattern |
| HLD | Raft / Paxos consensus | replicated state machine via majority quorum | consensus-raft-paxos |
| HLD | Idempotency & retries | safe re-execution, backoff + jitter | idempotency-retries |
| HLD | Observability | metrics, logs, traces, SLOs | observability |
| HLD | Event sourcing + CQRS | commands -> events; separate read model | event-sourcing-cqrs |
| LLD | Creational patterns | Singleton, Factory, Builder, Prototype | creational-patterns |
| LLD | Behavioural patterns | Strategy, Observer, State, Command, Chain | behavioral-patterns |