Data Structures & Complexity — Detailed#

Cheat sheet (average case)#

Structure	Access	Search	Insert	Delete	Notes
Array	O(1)	O(n)	O(n)	O(n)	contiguous, cache-friendly
Dynamic array (Vector / ArrayList)	O(1)	O(n)	O(1) amortised	O(n)	doubling growth
Linked list	O(n)	O(n)	O(1) at known position	O(1)	cache-unfriendly
Stack	O(1)	O(n)	O(1)	O(1)	LIFO
Queue	O(1)	O(n)	O(1)	O(1)	FIFO
Deque	O(1)	O(n)	O(1)	O(1)	double-ended
Hash map	O(1)*	O(1)*	O(1)*	O(1)*	worst case O(n) on bad hash
Tree map / Red-Black tree	O(log n)	O(log n)	O(log n)	O(log n)	ordered
Heap (binary)	O(1) peek	O(n)	O(log n)	O(log n)	priority queue
Trie	O(k)	O(k)	O(k)	O(k)	k = key length
B-tree / B+ tree	O(log n)	O(log n)	O(log n)	O(log n)	disk-friendly, DB indexes
Skip list	O(log n)	O(log n)	O(log n)	O(log n)	simpler than RB-tree
Bloom filter	O(k)	O(k)	O(k)	n/a	probabilistic; no remove
Disjoint set / Union-Find	O(α(n)) ≈ O(1)	n/a	O(α(n))	n/a	with path compression + union by rank
Graph (adj list)	n/a	varies	O(1) edge	O(deg)	sparse graphs
Graph (adj matrix)	O(1) edge	O(n²)	O(1)	O(1)	dense graphs
LSM tree	O(log n)	O(log n) avg	O(log n)	O(log n)	write-optimised, see storage engines

* hash: amortised; worst case O(n) on collisions or adversarial input.

When to pick what#

flowchart TB
  Q[What's the dominant op?]
  Q --> A[fast lookup by key] --> AH[Hash map]
  Q --> B[ordered iteration] --> BT[Tree map / B-tree]
  Q --> C[get min / max repeatedly] --> CH[Heap]
  Q --> D[prefix search / autocomplete] --> DT[Trie]
  Q --> E[set membership, no removes, ok with FP] --> EB[Bloom filter]
  Q --> F[range queries] --> FT[B-tree, segment tree]
  Q --> G[append-only log] --> GL[Vector / log-structured]
  Q --> H[disjoint groups + union] --> HU[Union-Find]
  Q --> I[graph traversal] --> IG[Adjacency list]

  classDef p fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
  class A,B,C,D,E,F,G,H,I p;

    classDef client fill:#dbeafe,stroke:#1e40af,stroke-width:1px,color:#0f172a;
    classDef edge fill:#cffafe,stroke:#0e7490,stroke-width:1px,color:#0f172a;
    classDef service fill:#fef3c7,stroke:#92400e,stroke-width:1px,color:#0f172a;
    classDef datastore fill:#fee2e2,stroke:#991b1b,stroke-width:1px,color:#0f172a;
    classDef cache fill:#fed7aa,stroke:#9a3412,stroke-width:1px,color:#0f172a;
    classDef queue fill:#ede9fe,stroke:#5b21b6,stroke-width:1px,color:#0f172a;
    classDef compute fill:#d1fae5,stroke:#065f46,stroke-width:1px,color:#0f172a;
    classDef storage fill:#e5e7eb,stroke:#374151,stroke-width:1px,color:#0f172a;
    classDef external fill:#fce7f3,stroke:#9d174d,stroke-width:1px,color:#0f172a;
    classDef obs fill:#f3e8ff,stroke:#6b21a8,stroke-width:1px,color:#0f172a;
    class Q,A,AH,B,BT,C,CH,D,E,EB,F,FT,G,GL,H,HU,I,IG service;
    class DT datastore;

Common interview algos + complexity#

Problem	Algorithm	Time	Space
Sort	Merge / Heap / Quick	O(n log n)	O(n) / O(1)
Shortest path (no neg weights)	Dijkstra	O((V+E) log V)	O(V)
Shortest path (neg weights ok)	Bellman-Ford	O(VE)	O(V)
All-pairs shortest path	Floyd-Warshall	O(V³)	O(V²)
Topological sort	Kahn's algo	O(V+E)	O(V)
Connected components	DFS/BFS or Union-Find	O(V+E) / O(α·E)	O(V)
MST	Prim / Kruskal	O(E log V)	O(V+E)
Cycle detection	DFS with colors	O(V+E)	O(V)
Strongly connected components	Tarjan / Kosaraju	O(V+E)	O(V)
String search	KMP / Rabin-Karp	O(n+m)	O(m)
Longest common subseq	DP	O(nm)	O(nm)

Space-time trade-offs#

Want to optimise	Pay with
Read latency	denormalisation, caches
Write latency	append-only, batch flush
Memory	probabilistic structures, compression
Hot keys	sharding, replication
Network	colocation, batching

Amortised cost#

ArrayList push: O(1) amortised even though occasional resizes are O(n).
Union-Find with path compression: ~O(1).
Hash map resize: O(n) every doubling; amortised O(1) per insert.

Memory model nuance#

Linked list = pointer chasing → many cache misses.
Array of structs > struct of arrays for sequential access.
4 KB page is the typical access granularity from disk.

Glossary & fundamentals#

Concepts referenced in this design. Each row links to its canonical page; the tag column shows whether it is a high-level (HLD) or low-level (LLD) concept.

Tag	Concept	What it is	Page
`HLD`	Sharding	horizontal partitioning across nodes	database-sharding
`HLD`	Leader/follower replication	sync/semi-sync/async replication, failover	replication-leader-follower
`HLD`	LSM vs B-Tree engines	WAL, memtable, SSTables, compaction	storage-engines-lsm-btree
`HLD`	Probabilistic data structures	Bloom, HLL, Count-Min, MinHash, t-digest	probabilistic-data-structures
`LLD`	Data structures & complexity	Big-O, common DS, latency numbers	data-structures-complexity