Netflix - Detailed
flowchart TB
subgraph Devices
TV([Smart TV / Roku])
PH([Phone])
BR([Browser])
CON([Console])
end
subgraph Edge[Open Connect - Netflix CDN]
OCA[Open Connect Appliances<br/>installed at ISPs]
OCBackbone[Netflix backbone]
PEERED[Peered exchange + transit]
end
subgraph Control[Control Plane on AWS]
GW[Zuul API Gateway]
DISCOVERY[Eureka service discovery]
META[(Catalog Metadata)]
ACCT[Account / Auth]
BILL[Billing]
PERS[Personalization Service]
PLAYBACK[Playback API]
LICENSE[DRM License<br/>Widevine / FairPlay]
SUB[Subtitles]
TRACK[Telemetry / QoE]
end
subgraph Encode[Content Pipeline]
INGEST([Studio master ingest])
CO[Color / mastering tools]
PER_TITLE[Per-title encoding<br/>complexity-aware]
LADDER[Adaptive bitrate ladders<br/>H.264, HEVC, VP9, AV1]
AUDS[Audio: AAC, EAC3, Atmos]
SUBS[Subtitles, dubs, multi-lang]
DRM_PACK[DRM packaging / CMAF]
QC[Automated QC]
end
subgraph Microservices[Microservice mesh - 1000s of services]
HOME[Home rows]
SEARCH[Search]
EVIDENCE[Artwork personalization]
RANK([Ranker DNN])
AB[A/B - Spinnaker rollout]
HYS[Hystrix circuit breakers]
end
subgraph Data
EVCACHE[(EVCache - memcached-based)]
CASS[(Cassandra clusters)]
DYNOMITE[Dynomite - fronts Redis]
BIGDATA[(S3 + Iceberg + Spark)]
KEYSTONE[[Keystone - Kafka + Flink]]
end
subgraph Chaos
SIM[Simian Army<br/>Chaos Monkey/Kong/Gorilla]
GAME[Game days]
end
Devices --> OCA
OCA -. miss .-> OCBackbone --> ORIG[(Origin S3)]
Devices --> GW
GW --> ACCT
GW --> PERS
GW --> PLAYBACK
PLAYBACK --> LICENSE
PLAYBACK --> OCA
PLAYBACK --> TRACK --> KEYSTONE --> BIGDATA
Microservices --> Data
Encode --> ORIG
ORIG -. fill nightly .-> OCA
DISCOVERY -.- Microservices
Chaos --- Microservices
RANK --> EVCACHE
RANK --> CASS
EVIDENCE --> RANK
AB --- Microservices
classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
class TV,PH,BR,CON client;
class GW edge;
class OCA,OCBackbone,PEERED,DISCOVERY,ACCT,BILL,PERS,PLAYBACK,LICENSE,SUB,CO,PER_TITLE,LADDER,AUDS,SUBS,DRM_PACK,QC,HOME,SEARCH,EVIDENCE,AB,HYS,SIM,GAME service;
class META,CASS datastore;
class EVCACHE,DYNOMITE cache;
class KEYSTONE queue;
class INGEST,RANK compute;
class BIGDATA,ORIG storage;
class TRACK obs;
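Open Connect (Netflix's CDN)
- Custom cache appliances (Open Connect Appliances, OCAs) installed inside ISP networks.
- ISPs save on transit costs; Netflix saves on egress costs.
- The catalog is pre-filled during off-peak hours; popular titles stay pinned.
- Anycast routing is not used; the control plane steers each client to specific appliances, and the client picks among them.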
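The steering idea above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Appliance` fields, the 0.9 load cutoff, and the ranking rule (in-ISP first, then least loaded) are hypothetical and not Netflix's actual steering logic.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Appliance:
    """Hypothetical control-plane view of one Open Connect Appliance."""
    name: str
    in_client_isp: bool   # embedded in the client's own ISP?
    healthy: bool
    load: float           # 0.0 (idle) .. 1.0 (saturated)
    titles: frozenset     # title IDs currently cached on this appliance


def steer(title_id: str, appliances: list[Appliance], max_urls: int = 3) -> list[str]:
    """Return a ranked list of appliance names that can serve title_id."""
    candidates = [
        a for a in appliances
        if a.healthy and a.load < 0.9 and title_id in a.titles
    ]
    # Prefer appliances inside the client's ISP, then the least loaded ones.
    candidates.sort(key=lambda a: (not a.in_client_isp, a.load))
    return [a.name for a in candidates[:max_urls]]


fleet = [
    Appliance("isp-oca-1", True, True, 0.70, frozenset({"t1", "t2"})),
    Appliance("isp-oca-2", True, True, 0.20, frozenset({"t1"})),
    Appliance("ix-oca-1", False, True, 0.10, frozenset({"t1", "t2"})),
    Appliance("isp-oca-3", True, False, 0.05, frozenset({"t1"})),  # unhealthy
]
ranked = steer("t1", fleet)
print(ranked)  # ['isp-oca-2', 'isp-oca-1', 'ix-oca-1']
```

The client receives the ranked list and tries the entries in order. An empty list would correspond to the "miss" path in the diagram, where the request falls back toward the origin.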
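Per-title encoding
- Complexity analysis of each title's content → a bitrate ladder tailored to that title, instead of one fixed ladder for the whole catalog.
- Saves roughly 20-50% bandwidth versus a fixed ladder, depending on the title.
- Today: a per-shot dynamic optimizer chooses encoding parameters shot by shot.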
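To make the per-title idea concrete, here is a much-simplified sketch that builds a ladder from trial-encode measurements. The quality scores, the `min_gain` threshold, and the selection rule are all illustrative assumptions, a stand-in for the real analysis rather than a description of Netflix's pipeline.

```python
def build_ladder(measurements, min_gain=2.0):
    """Build a bitrate ladder from (bitrate_kbps, resolution, quality) samples.

    For each bitrate, keep the resolution with the best quality score, then
    drop rungs that improve quality by less than min_gain over the previous
    rung (they would cost bandwidth without a visible benefit).
    """
    best_at_bitrate = {}
    for bitrate, resolution, quality in measurements:
        current = best_at_bitrate.get(bitrate)
        if current is None or quality > current[1]:
            best_at_bitrate[bitrate] = (resolution, quality)

    ladder = []
    last_quality = float("-inf")
    for bitrate in sorted(best_at_bitrate):
        resolution, quality = best_at_bitrate[bitrate]
        if quality - last_quality >= min_gain:
            ladder.append((bitrate, resolution, quality))
            last_quality = quality
    return ladder


# Made-up trial encodes: a simple cartoon and a grainy action film.
cartoon = [
    (500, "480p", 80), (500, "720p", 84),
    (1000, "720p", 90), (1000, "1080p", 92),
    (2000, "1080p", 95),
    (4000, "1080p", 96),
]
action = [
    (500, "480p", 60), (500, "720p", 55),
    (1000, "480p", 68), (1000, "720p", 72),
    (2000, "720p", 80), (2000, "1080p", 82),
    (4000, "1080p", 90),
]
cartoon_ladder = build_ladder(cartoon)
action_ladder = build_ladder(action)
print(cartoon_ladder)
print(action_ladder)
```

With these made-up numbers the cartoon tops out at 2000 kbps because the 4000 kbps rung adds too little quality, while the action film keeps all four rungs. That difference is where the bandwidth saving comes from.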
Microservices in AWS
- Hundreds of services running on EC2.
- EVCache (built on memcached) is the primary cache layer.
- Cassandra for OLTP; S3 + Iceberg + Spark for batch analytics.
- Keystone (Kafka + Flink) handles stream processing.
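A cache layer in front of a durable store is typically read with a cache-aside pattern. The sketch below uses in-memory stand-ins: `TtlCache` plays the role of the cache and a plain dict plays the durable store. Neither is the EVCache or Cassandra client API, and all names are hypothetical.

```python
import time


class TtlCache:
    """In-memory stand-in for a memcached-style cache with per-key TTL."""

    def __init__(self, clock=time.monotonic):
        self._items = {}
        self._clock = clock

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)


def read_through(key, cache, load_from_store, ttl_seconds=60):
    """Cache-aside read: try the cache, fall back to the store, then fill."""
    value = cache.get(key)
    if value is not None:
        return value, "hit"
    value = load_from_store(key)
    if value is not None:
        cache.set(key, value, ttl_seconds)
    return value, "miss"


store = {"profile:42": {"name": "Ada", "plan": "standard"}}  # durable-store stand-in
store_reads = []


def load_from_store(key):
    store_reads.append(key)  # record every trip to the slow path
    return store.get(key)


cache = TtlCache()
first = read_through("profile:42", cache, load_from_store)
second = read_through("profile:42", cache, load_from_store)
print(first[1], second[1], len(store_reads))  # miss hit 1
```

The second read never touches the store, which is the point of putting a cache in front of the OLTP database on hot paths.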
QoE / playback
- Adaptive bitrate (ABR) switching is driven by the client's buffer level and bandwidth estimate.
- Telemetry continuously feeds dashboards and recommendation signals.
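A minimal sketch of buffer-plus-bandwidth rate selection follows. The ladder values, buffer thresholds, and safety factors are illustrative assumptions, not the logic of any real Netflix client.

```python
LADDER_KBPS = [235, 750, 1750, 3000, 5800]  # illustrative rungs only


def ewma(previous, sample, alpha=0.3):
    """Smooth throughput samples so one slow chunk does not force a switch."""
    return sample if previous is None else alpha * sample + (1 - alpha) * previous


def pick_bitrate(throughput_kbps, buffer_seconds, ladder=LADDER_KBPS,
                 low_buffer=5.0, high_buffer=20.0):
    """Pick the highest rung that fits a buffer-dependent bandwidth budget."""
    if buffer_seconds < low_buffer:
        safety = 0.5    # close to a stall: be conservative
    elif buffer_seconds < high_buffer:
        safety = 0.8
    else:
        safety = 0.95   # plenty buffered: use most of the estimate
    budget = throughput_kbps * safety
    affordable = [rung for rung in ladder if rung <= budget]
    return affordable[-1] if affordable else ladder[0]


estimate = None
for sample_kbps in (4000, 1000):       # two measured chunk downloads
    estimate = ewma(estimate, sample_kbps)

print(pick_bitrate(4000, buffer_seconds=25))  # 3000
print(pick_bitrate(4000, buffer_seconds=3))   # 1750
print(pick_bitrate(100, buffer_seconds=30))   # 235
```

The same throughput estimate yields different rungs depending on how much video is buffered, which is why both signals appear in the bullet above.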
Chaos engineering
- Random instance termination in production (Chaos Monkey).
- Region-failover game days (Chaos Kong).
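Random instance termination can be sketched as a selection function. This is an assumption-laden illustration, not the Chaos Monkey implementation: the group names, the opt-out set, the per-group probability, and the one-victim-per-group rule are hypothetical choices, and the function only picks victims rather than terminating anything.

```python
import random


def pick_victims(instances_by_group, probability, rng, opted_out=frozenset()):
    """Choose at most one instance per group to terminate.

    Groups with fewer than two instances are skipped: killing the only
    instance says nothing about redundancy and simply causes an outage.
    """
    victims = {}
    for group, instances in sorted(instances_by_group.items()):
        if group in opted_out or len(instances) < 2:
            continue
        if rng.random() < probability:
            victims[group] = rng.choice(sorted(instances))
    return victims


groups = {
    "playback-api": ["i-a", "i-b", "i-c"],
    "billing": ["i-d"],              # single instance: never selected
    "search": ["i-e", "i-f"],
}
rng = random.Random(7)               # seeded so a run can be replayed
victims = pick_victims(groups, probability=1.0, rng=rng, opted_out={"search"})
print(victims)
```

Passing the random generator in explicitly keeps a run reproducible, which helps when a termination has to be correlated with an incident afterwards.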
Glossary & fundamentals
Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.
| Tag | Concept | What it is | Page |
|---|---|---|---|
| HLD | Load balancer / GSLB | L4/L7 traffic distribution and failover | load-balancer |
| HLD | CDN | edge caching for static assets | cdn |
| HLD | API gateway / BFF | single ingress, auth, rate limit, routing | api-gateway |
| HLD | Pub/Sub & message brokers | topics, consumer groups, delivery semantics | pub-sub-pattern |
| HLD | Resilience patterns | timeout, retry, breaker, bulkhead, backpressure | resilience-patterns |
| HLD | Service mesh | sidecar mesh, mTLS, traffic policy | service-mesh |
| HLD | Batch & stream processing | Lambda vs Kappa, watermarks, windows | batch-stream-processing |