Idempotency & Retries — Notes#
The core insight#
A network error is ambiguous: the client cannot tell whether the request never arrived or was processed and only the response was lost. Either build idempotency in, or accept duplicated effects.
Where to put the key#
- Header: `Idempotency-Key: <uuid>` (Stripe, AWS).
- Body field: `client_token` (for example, `ClientRequestToken` in DynamoDB transactions).
- HTTP method semantics: `PUT /resource/{id}` is naturally idempotent.
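A minimal sketch of the header option from the client's side, assuming a hypothetical `FlakyServer` that stands in for a real API and loses its first response. The point it illustrates is that the key is generated once, before the retry loop, so every attempt carries the same value. Backoff is left out to keep it short.

```python
import uuid


class FlakyServer:
    """Hypothetical stand-in for a remote API; it loses the first response."""

    def __init__(self):
        self.charges = {}  # idempotency key -> stored result
        self.calls = 0

    def charge(self, headers, amount):
        self.calls += 1
        key = headers["Idempotency-Key"]
        # First time the key is seen: perform the effect and remember the result.
        if key not in self.charges:
            self.charges[key] = {"charged": amount}
        if self.calls == 1:
            # The effect happened, but the client never sees the response.
            raise ConnectionError("response lost")
        return self.charges[key]


def charge_with_retries(server, amount, max_attempts=3):
    # Generate the key once, outside the loop, so every retry reuses it.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    for attempt in range(max_attempts):
        try:
            return server.charge(headers, amount)
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise
```

If the key were generated inside the loop, the second attempt would look like a new request and the charge would be applied twice.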
Key scope#
- Scope keys per (route, tenant, key) so the same key value cannot collide across endpoints or tenants.
- Store a hash of the request body alongside the key; if the same key arrives with a different body, reject it with 422.
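The two scoping rules can be sketched as a small in-memory store. This is an illustration under assumptions, not a production design: `IdempotencyStore`, `handle`, and the `handler` callback are hypothetical names, the body is assumed to be JSON-serializable, and there is no locking, so concurrent requests with the same key are not handled.

```python
import hashlib
import json


class IdempotencyStore:
    """Hypothetical in-memory store keyed by (route, tenant, key)."""

    def __init__(self):
        # (route, tenant, key) -> (body_hash, status, response)
        self._records = {}

    @staticmethod
    def _hash_body(body):
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def handle(self, route, tenant, key, body, handler):
        scope = (route, tenant, key)
        body_hash = self._hash_body(body)
        record = self._records.get(scope)
        if record is not None:
            stored_hash, status, response = record
            if stored_hash != body_hash:
                # Same key, different body: refuse instead of guessing intent.
                return 422, {"error": "idempotency key reused with a different body"}
            # Same key, same body: replay the stored status and response.
            return status, response
        # First time this scoped key is seen: run the handler and record the outcome.
        status, response = handler(body)
        self._records[scope] = (body_hash, status, response)
        return status, response
```

Note that the record keeps the status code together with the response, so a replay looks the same as the original answer, and that the same key value on a different route is treated as a separate request.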
TTL#
- Long enough for any client retry (24–48 hours typical).
- Short enough to bound table growth.
- Use TTL-enabled stores (DynamoDB TTL, Redis EXPIRE).
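To make the TTL trade-off concrete, here is a rough in-memory sketch of what a TTL-enabled store does on your behalf. `TTLStore` and its methods are hypothetical names, and the injectable `clock` exists only so that expiry can be exercised without waiting; a real deployment would more likely lean on the store's own expiry feature.

```python
import time


class TTLStore:
    """Hypothetical key store whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=24 * 3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}  # key -> (expires_at, value)

    def put_if_absent(self, key, value):
        """Store value unless a live entry exists; return the value now in effect."""
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        self._items[key] = (now + self._ttl, value)
        return value

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        if entry[0] <= self._clock():
            # Expired: drop it lazily on read.
            del self._items[key]
            return None
        return entry[1]

    def sweep(self):
        """Delete all expired entries; return how many were removed."""
        now = self._clock()
        expired = [k for k, (expires_at, _) in self._items.items() if expires_at <= now]
        for k in expired:
            del self._items[k]
        return len(expired)
```

Once an entry expires, a retry that reuses the key is treated as a brand-new request, which is why the TTL has to outlast the longest client retry window.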
Anti-patterns#
- "Idempotent because it's a SELECT" — but the side-effect was a write somewhere downstream.
- Using a request UUID generated server-side: every retry gets a fresh UUID, so duplicates are never detected. The client must generate the key.
- Storing only the response body and not the status code, or re-running the operation on a retry instead of returning the cached response.
- Infinite retries without a retry budget: retries multiply load on a service that is already struggling and amplify the outage.
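As a counterpoint to the last item, a bounded retry loop might look roughly like the sketch below. It combines capped exponential backoff with full jitter (a delay drawn uniformly from zero up to the capped exponential) with a simple token-bucket retry budget. `RetryBudget`, `call_with_retries`, and the default numbers are assumptions chosen for illustration, not values taken from any particular library.

```python
import random
import time


def full_jitter_delay(attempt, base=0.1, cap=10.0, rng=random):
    """Delay for a 0-based attempt: uniform in [0, min(cap, base * 2**attempt)]."""
    return rng.uniform(0, min(cap, base * (2 ** attempt)))


class RetryBudget:
    """Hypothetical token bucket: requests earn a fraction of a token, retries spend one."""

    def __init__(self, ratio=0.1, max_tokens=10.0):
        self._ratio = ratio
        self._max = max_tokens
        self._tokens = max_tokens

    def record_request(self):
        self._tokens = min(self._max, self._tokens + self._ratio)

    def try_spend(self):
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


def call_with_retries(op, budget, max_attempts=4, sleep=time.sleep):
    budget.record_request()
    for attempt in range(max_attempts):
        try:
            return op()
        except ConnectionError:
            out_of_attempts = attempt == max_attempts - 1
            # Give up when attempts run out or the shared budget is exhausted.
            if out_of_attempts or not budget.try_spend():
                raise
            sleep(full_jitter_delay(attempt))
```

Sharing one `RetryBudget` across all callers of a dependency is what keeps the total retry volume bounded during an outage; a per-call attempt limit alone does not.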
Refs#
- Stripe API "Idempotent Requests" guide.
- AWS Architecture Blog: "Exponential Backoff And Jitter."
- Marc Brooker: "Timeouts, Retries, and Backoff with Jitter."
- Susan Fowler: "Production-Ready Microservices."