Service Mesh — Detailed#

flowchart TB
  subgraph DataPlane[Data plane - per pod sidecar]
    direction LR
    P1([Sidecar A<br/>Envoy / linkerd2-proxy])
    P2([Sidecar B])
    P3([Sidecar C])
    P1 --> P2
    P2 --> P3
  end

  subgraph ControlPlane[Control plane]
    XDS[xDS server<br/>route + cluster + listener config]
    CA[Cert authority<br/>SPIFFE / SPIRE]
    POLICY[Policy engine<br/>OPA / native]
    TELEM[Telemetry collector]
  end

  subgraph Features[Cross-cutting features]
    MTLS[mTLS everywhere]
    RETRY[Retries + timeouts + circuit breaker]
    TRAFFIC[Traffic split / canary / mirroring]
    AUTHZ[L7 authz]
    OBS[Distributed tracing + metrics]
  end

  ControlPlane -.config.-> DataPlane
  DataPlane --- Features

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class P1 edge;
    class P2,P3,XDS,CA,POLICY,MTLS,TRAFFIC,AUTHZ service;
    class RETRY datastore;
    class TELEM,OBS obs;

Why it exists#

In a microservice system, resilience and security logic ends up duplicated in every language and framework. A mesh moves these concerns out of the application and into a sidecar (or a per-host agent), where they are enforced uniformly:

  • mTLS between every service, certs rotated automatically.
  • Retries with budget, timeouts, circuit breakers, outlier detection.
  • Traffic shifting for canary / blue-green / A-B.
  • Authz by service identity (SPIFFE ID), method, headers.
  • Telemetry: golden signals + traces, no app instrumentation needed.
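
The "retries with budget" item above can be made concrete with a small sketch. This is an illustrative simplification, not the algorithm of any particular mesh: every original request earns a fraction of a retry, and a retry is permitted only once a whole one has been banked. The class name `RetryBudget` and the parameters `retry_percent` and `max_banked_retries` are hypothetical.

```python
class RetryBudget:
    """Allow retries only up to a fixed percentage of original requests.

    Hypothetical sketch: credit is tracked in integer hundredths of a retry,
    so a 20% budget means each original request deposits 20 units and each
    retry costs 100 units.
    """

    RETRY_COST = 100  # units consumed by one retry

    def __init__(self, retry_percent: int = 20, max_banked_retries: int = 10) -> None:
        if retry_percent < 0 or max_banked_retries < 0:
            raise ValueError("retry_percent and max_banked_retries must be >= 0")
        self.deposit = retry_percent
        self.cap = max_banked_retries * self.RETRY_COST
        self.credit = 0

    def record_request(self) -> None:
        """Call once per original (non-retry) request."""
        self.credit = min(self.cap, self.credit + self.deposit)

    def try_retry(self) -> bool:
        """Return True and spend credit if a retry is allowed right now."""
        if self.credit >= self.RETRY_COST:
            self.credit -= self.RETRY_COST
            return True
        return False


budget = RetryBudget(retry_percent=20)
for _ in range(10):
    budget.record_request()          # ten originals bank two retries
print([budget.try_retry() for _ in range(3)])  # [True, True, False]
```

The point of a budget, as opposed to a fixed per-request retry count, is that it caps total retry amplification when a dependency is failing for everyone at once.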

Architecture choices#

| | Sidecar (per pod) | Per-host agent | Sidecarless (eBPF) |
|---|---|---|---|
| Examples | Istio, Linkerd, Consul Connect | early Linkerd 1.x | Cilium Service Mesh |
| Resource cost | +1 container/pod | shared | kernel only |
| Mature | yes | aging | newer |
| Granularity | per app | per host | per socket |

Istio data flow#

sequenceDiagram
  participant A as Service A
  participant SA as Envoy A
  participant SB as Envoy B
  participant B as Service B
  A->>SA: HTTP/gRPC localhost
  SA->>SB: mTLS, retries, tracing
  SB->>B: HTTP/gRPC localhost
  B-->>SB: response
  SB-->>SA: response
  SA-->>A: response

Ingress + mesh#

The mesh is for east-west traffic (service↔service). For north-south (internet ↔ service) you still need an ingress / API gateway. Many meshes ship an ingress gateway component that's just another Envoy.

When to use one#

  • 30 services and growing.
  • Polyglot stack — Java, Go, Python, Node.
  • Zero-trust requirement (mTLS + identity-based authz).
  • You want canary / traffic shifting without app changes.
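
Canary and traffic shifting, as listed above, come down to weighted routing inside the proxy, so the application never sees the split. The sketch below illustrates the idea under stated assumptions: the function name `pick_backend`, the version labels, and the choice to hash a request key (which keeps a given key on the same version) are all hypothetical. A real proxy may instead choose randomly per request.

```python
import hashlib


def pick_backend(request_key: str, weights: dict[str, int]) -> str:
    """Map a request key to a backend in proportion to integer weights.

    Hypothetical sketch: the key is hashed into a bucket in [0, total weight),
    and the bucket is matched against cumulative weight ranges.
    """
    total = sum(weights.values())
    if total <= 0 or any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative and sum to more than 0")

    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total

    upper = 0
    for name, weight in weights.items():
        upper += weight
        if bucket < upper:
            return name
    raise AssertionError("unreachable: bucket is always below the total weight")


# Send roughly 10% of keys to the canary version.
split = {"v1": 90, "v2": 10}
hits = sum(pick_backend(f"user-{i}", split) == "v2" for i in range(10_000))
print(f"canary share: {hits / 10_000:.1%}")
```

Shifting more traffic is then a configuration change to the weights, for example from 90/10 to 50/50, with no redeploy of the calling service.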

When to skip#

  • Monolith or a handful of services — library-based resilience (Resilience4j, Polly) is simpler.
  • Tight latency budget — the sidecar adds 0.5–2 ms per hop.
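
To judge whether the latency cost matters, it helps to multiply it out along a call chain. The sketch below takes the 0.5–2 ms per hop figure from the text and assumes, for illustration, that "hop" means one service-to-service call and that the calls happen sequentially. The function name `added_latency_ms` is hypothetical.

```python
def added_latency_ms(
    sequential_hops: int,
    per_hop_ms: tuple[float, float] = (0.5, 2.0),
) -> tuple[float, float]:
    """Return the (low, high) latency added by the mesh across a call chain.

    Assumes each hop pays the per-hop overhead once and hops do not overlap.
    """
    if sequential_hops < 0:
        raise ValueError("sequential_hops must be >= 0")
    low, high = per_hop_ms
    return sequential_hops * low, sequential_hops * high


# A request that fans through five sequential service calls:
print(added_latency_ms(5))  # (2.5, 10.0)
```

Against a 50 ms end-to-end budget, that five-hop chain would spend between 5% and 20% of the budget on the mesh alone, which is the kind of case where skipping the mesh is worth considering.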

Common pitfalls#

  • mTLS misconfigurations silently drop traffic — start in permissive mode.
  • Resource overhead: 50–200 MB RAM per sidecar at scale.
  • Debug visibility: an extra network hop changes how curl localhost behaves.
  • Upgrades: control plane and proxies must move together; canary it.
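
The resource overhead pitfall is easier to weigh as a fleet-wide number. The sketch below multiplies the 50–200 MB per sidecar range from the text by a pod count. It assumes one sidecar per pod and decimal units (1 GB = 1000 MB). The function name `fleet_sidecar_memory_gb` is hypothetical.

```python
def fleet_sidecar_memory_gb(
    pods: int,
    per_sidecar_mb: tuple[float, float] = (50.0, 200.0),
) -> tuple[float, float]:
    """Return the (low, high) total sidecar memory in GB for a fleet of pods."""
    if pods < 0:
        raise ValueError("pods must be >= 0")
    low_mb, high_mb = per_sidecar_mb
    return pods * low_mb / 1000, pods * high_mb / 1000


print(fleet_sidecar_memory_gb(1000))  # (50.0, 200.0)
```

At 1,000 pods the sidecars alone would account for roughly 50 to 200 GB of RAM across the cluster, before counting CPU or the control plane.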

Glossary & fundamentals#

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

| Tag | Concept | What it is | Page |
|---|---|---|---|
| HLD | Load balancer / GSLB | L4/L7 traffic distribution and failover | load-balancer |
| HLD | API gateway / BFF | single ingress, auth, rate limit, routing | api-gateway |
| HLD | Idempotency & retries | safe re-execution, backoff + jitter | idempotency-retries |
| HLD | Resilience patterns | timeout, retry, breaker, bulkhead, backpressure | resilience-patterns |
| HLD | Observability | metrics, logs, traces, SLOs | observability |
| HLD | Service mesh | sidecar mesh, mTLS, traffic policy | service-mesh |
| LLD | Structural patterns | Adapter, Decorator, Facade, Proxy, Composite | structural-patterns |