Idempotency & Retries: Notes

The core insight

Network errors are ambiguous: when a request times out or the connection drops, the client cannot tell whether the server processed it. Either build idempotency in, or accept duplicated effects.

Where to put the key

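  • Header: Idempotency-Key: <uuid> (Stripe).
  • Request parameter or body field: a client token, e.g. ClientToken (AWS EC2) or ClientRequestToken (DynamoDB transactions).
  • HTTP method semantics: PUT /resource/{id} is idempotent by definition, as long as the handler performs a full replace.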
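
As a rough illustration of the header option, the sketch below builds the headers and body for a POST and reuses one client-generated key across attempts. The function name build_post and the payload shape are hypothetical; only the Idempotency-Key header name comes from the notes above.

```python
import json
import uuid


def build_post(payload, idempotency_key):
    """Return (headers, body) for a POST carrying an idempotency key.

    build_post is a hypothetical helper for illustration, not a library API.
    """
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Idempotency-Key": idempotency_key,
    }
    return headers, body


# Generate the key once per logical operation, on the client...
key = str(uuid.uuid4())

# ...and reuse the same key for the first attempt and every retry of it.
first_headers, first_body = build_post({"amount": 500}, key)
retry_headers, retry_body = build_post({"amount": 500}, key)
```

A new logical operation (a second, distinct charge) would get a fresh key; only retries of the same operation share one.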

Key scope

  • Scope keys per (route, tenant, key) so the same key value cannot collide across endpoints or tenants.
  • Hash the request body and store the hash with the key; reject the same key with a different body with 422.
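
The two scoping rules above might look like the following minimal in-memory sketch. IdempotencyStore, handle, and create_charge are hypothetical names, the dict stands in for a real database, and the sketch ignores concurrency (two simultaneous first requests would both run the handler).

```python
import hashlib
import json


class IdempotencyStore:
    """Hypothetical in-memory store; a real one would live in a database."""

    def __init__(self):
        # (route, tenant, key) -> (body_hash, status, response_body)
        self._records = {}

    @staticmethod
    def _hash_body(body):
        # Canonical JSON so key order does not change the hash.
        canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def handle(self, route, tenant, key, body, handler):
        scope = (route, tenant, key)
        body_hash = self._hash_body(body)
        record = self._records.get(scope)
        if record is not None:
            stored_hash, status, response = record
            if stored_hash != body_hash:
                # Same key, different body: reject instead of guessing.
                return 422, {"error": "idempotency key reused with a different body"}
            # Same key, same body: replay the stored status and body.
            return status, response
        status, response = handler(body)
        self._records[scope] = (body_hash, status, response)
        return status, response


calls = []


def create_charge(body):
    # Stand-in for the real side effect.
    calls.append(body)
    return 201, {"charge_id": len(calls)}


store = IdempotencyStore()
first = store.handle("POST /charges", "tenant-a", "k1", {"amount": 500}, create_charge)
replay = store.handle("POST /charges", "tenant-a", "k1", {"amount": 500}, create_charge)
conflict = store.handle("POST /charges", "tenant-a", "k1", {"amount": 900}, create_charge)
other_route = store.handle("POST /refunds", "tenant-a", "k1", {"amount": 500}, create_charge)
```

Here the replay returns the stored response without calling the handler again, the changed body is rejected with 422, and the same key on a different route is treated as an unrelated request.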

TTL

  • Long enough for any client retry (24–48 hours typical).
  • Short enough to bound table growth.
  • Use TTL-enabled stores (DynamoDB TTL, Redis EXPIRE).
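
To make the expiry behaviour concrete, here is a small sketch of a TTL'd record store with an injectable clock. TTLCache is a hypothetical stand-in for a TTL-enabled store, and it only evicts lazily on read; real stores expire entries in the background, which is what actually bounds table growth.

```python
import time


class TTLCache:
    """Hypothetical stand-in for a TTL-enabled store."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}  # key -> (expires_at, value)

    def put(self, key, value):
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the record so the key is treated as new.
            del self._items[key]
            return None
        return value


# A fake clock (seconds) lets the example run without waiting 24 hours.
now = [0.0]
cache = TTLCache(ttl_seconds=24 * 3600, clock=lambda: now[0])
cache.put("k1", (201, {"charge_id": 1}))

now[0] = 23 * 3600
within_ttl = cache.get("k1")   # still cached: a retry gets the stored response

now[0] = 25 * 3600
after_ttl = cache.get("k1")    # expired: a retry this late would run again
```

The second lookup shows the trade-off in the list above: once the TTL has passed, a late retry is indistinguishable from a new request.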

Anti-patterns

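  • "It's idempotent because it's a SELECT", when the request still triggers a write somewhere downstream.
  • Generating the request UUID server-side: this defeats the purpose, because every retry gets a fresh key. The client must generate it.
  • Storing only the response body, not the status: a retry then cannot replay the original response faithfully and may re-run the operation instead of returning the cached result.
  • Retrying indefinitely without a retry budget: retries pile load onto an already failing service and amplify the outage.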
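
For the last point, one simple way to bound retries is a per-call attempt cap combined with exponential backoff and full jitter, sketched below. The names call_with_retries and RetryBudgetExceeded, and the default values, are assumptions for illustration. A service-wide budget (for example a token bucket limiting retries as a share of traffic) is a stronger control that this sketch does not implement.

```python
import random
import time


class RetryBudgetExceeded(Exception):
    """Hypothetical error raised when the attempt cap is used up."""


def call_with_retries(operation, max_attempts=4, base_delay=0.1,
                      max_delay=2.0, sleep=time.sleep, rng=random.random):
    last_error = None
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError as exc:
            last_error = exc
            if attempt == max_attempts - 1:
                break  # budget spent: do not sleep, just give up
            # Exponential backoff, capped, with full jitter:
            # sleep a random time in [0, min(max_delay, base * 2^attempt)).
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * cap)
    raise RetryBudgetExceeded(
        f"gave up after {max_attempts} attempts"
    ) from last_error


attempts = []


def flaky():
    # Fails twice with an ambiguous network error, then succeeds.
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("timeout")
    return "ok"


delays = []
result = call_with_retries(flaky, sleep=delays.append)  # record delays instead of sleeping
```

Retrying like this is only safe when the operation itself is idempotent, for instance because it carries the same client-generated key on every attempt.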

Refs

  • Stripe API "Idempotent Requests" guide.
  • AWS Architecture Blog: "Exponential Backoff And Jitter."
  • Marc Brooker: "Timeouts, Retries, and Backoff with Jitter."
  • Susan Fowler: "Production-Ready Microservices."