Graph Fundamentals
TL;DR
Graphs are collections of vertices connected by edges. Adjacency lists are efficient for sparse graphs (O(V + E) space), while adjacency matrices suit dense graphs or frequent lookups (O(V²) space). BFS explores level-by-level using a queue, finding shortest paths in unweighted graphs (O(V + E) time). DFS explores deeply using recursion or stack, useful for detecting cycles and topological ordering (O(V + E) time). Connected components identify disjoint subgraphs. Choose representations and algorithms based on graph density and query patterns.
Core Concepts
Graph Terminology
Graphs model relationships between entities as collections of vertices (nodes) connected by edges.
Key definitions:
- Vertex: Individual element or entity in graph
- Edge: Connection between two vertices, optionally with a weight (cost/distance)
- Path: Sequence of vertices v₁, v₂, ..., vₖ where consecutive vertices are connected by edges
- Cycle: Path starting and ending at same vertex (v₁ = vₖ)
- Simple path: Path with no repeated vertices
- Simple cycle: Cycle with no repeated vertices except endpoints
Graph properties:
- Directed vs Undirected: Directed edges have direction; undirected are bidirectional
- Weighted vs Unweighted: Edges may have associated weights
- Cyclic vs Acyclic: Acyclic graphs contain no cycles (DAGs)
- Connected vs Disconnected: Connected if path exists between any two vertices
- Dense vs Sparse: Dense has O(V²) edges; sparse has O(V) edges
Graph Representation: Adjacency List
Adjacency lists store a list of neighbors for each vertex. Most efficient for sparse graphs.
Structure:
Graph: 0 -- 1
| |
2 -- 3
List representation:
0: [1, 2]
1: [0, 3]
2: [0, 3]
3: [1, 2]
Characteristics:
- Space: O(V + E) - proportional to vertices and edges
- Edge lookup: O(degree) - must iterate neighbor list
- Insertion/deletion: O(1) to O(degree)
- Best for: Sparse graphs, traversal operations
Graph Representation: Adjacency Matrix
Adjacency matrices use 2D array where matrix[i][j] indicates edge from i to j.
Structure:
0 1 2 3
0 [ 0 1 1 0 ]
1 [ 1 0 0 1 ]
2 [ 1 0 0 1 ]
3 [ 0 1 1 0 ]
Characteristics:
- Space: O(V²) - always stores full matrix
- Edge lookup: O(1) - direct array access
- Insertion/deletion: O(1)
- Best for: Dense graphs, frequent edge queries
Breadth-First Search (BFS)
BFS explores graph level-by-level using a queue, discovering vertices in order of distance from start.
Key properties:
- Time complexity: O(V + E) - visits each vertex and edge once
- Space complexity: O(V) - queue stores at most V vertices
- Shortest path: Finds shortest path in unweighted graphs
- Level-order: Explores all vertices at distance k before distance k+1
Algorithm:
- Start from source vertex, mark as visited
- Add to queue
- While queue not empty:
- Dequeue vertex
- Process vertex
- For each unvisited neighbor, mark as visited and enqueue
Depth-First Search (DFS)
DFS explores as deeply as possible before backtracking, using recursion or stack.
Key properties:
- Time complexity: O(V + E) - visits each vertex and edge once
- Space complexity: O(V) - recursion stack or explicit stack
- Applications: Cycle detection, topological sort, strongly connected components
- Discovery order: Depth-first order may differ from breadth-first
Algorithm (Recursive):
- Mark vertex as visited
- Process vertex
- For each unvisited neighbor, recursively call DFS
Connected Components
Connected components are maximal sets of vertices where path exists between any two vertices.
Finding connected components:
- Initialize all vertices as unvisited
- For each unvisited vertex, run DFS/BFS
- Each DFS/BFS traversal marks one connected component
Example: Graph with 3 connected components:
- Component 1: 2
- Component 2: 4
- Component 3: 5
Key Algorithms
BFS Traversal and Shortest Path
Uses queue to explore all neighbors at current distance before moving to next distance. Optimal for unweighted shortest paths.
DFS Traversal and Cycle Detection
Uses stack (implicit or explicit) to explore deeply. Detects cycles by finding back edges during traversal.
Connected Components Identification
Run DFS/BFS from each unvisited vertex, assigning component IDs. Count number of DFS calls to find number of components.
Practical Example
- Python
- Java
- TypeScript
from collections import defaultdict, deque
class Graph:
def __init__(self):
self.graph = defaultdict(list)
def add_edge(self, u, v):
self.graph[u].append(v)
self.graph[v].append(u) # For undirected graph
def bfs(self, start):
visited = set()
queue = deque([start])
visited.add(start)
result = []
while queue:
vertex = queue.popleft()
result.append(vertex)
for neighbor in self.graph[vertex]:
if neighbor not in visited:
visited.add(neighbor)
queue.append(neighbor)
return result
def dfs_recursive(self, vertex, visited, result):
visited.add(vertex)
result.append(vertex)
for neighbor in self.graph[vertex]:
if neighbor not in visited:
self.dfs_recursive(neighbor, visited, result)
def dfs(self, start):
visited = set()
result = []
self.dfs_recursive(start, visited, result)
return result
def dfs_iterative(self, start):
visited = set()
stack = [start]
result = []
while stack:
vertex = stack.pop()
if vertex not in visited:
visited.add(vertex)
result.append(vertex)
stack.extend(reversed(self.graph[vertex]))
return result
def has_cycle(self):
visited = set()
rec_stack = set()
def dfs_cycle(v):
visited.add(v)
rec_stack.add(v)
for neighbor in self.graph[v]:
if neighbor not in visited:
if dfs_cycle(neighbor):
return True
elif neighbor in rec_stack:
return True
rec_stack.remove(v)
return False
for vertex in self.graph:
if vertex not in visited:
if dfs_cycle(vertex):
return True
return False
def connected_components(self):
visited = set()
components = []
def dfs(v, component):
visited.add(v)
component.append(v)
for neighbor in self.graph[v]:
if neighbor not in visited:
dfs(neighbor, component)
for vertex in self.graph:
if vertex not in visited:
component = []
dfs(vertex, component)
components.append(component)
return components
# Test
g = Graph()
edges = [(0, 1), (1, 2), (2, 3), (4, 5)]
for u, v in edges:
g.add_edge(u, v)
print("BFS from 0:", g.bfs(0)) # [0, 1, 2, 3]
print("DFS from 0:", g.dfs(0)) # [0, 1, 2, 3]
print("Has cycle:", g.has_cycle()) # False
print("Connected components:", g.connected_components()) # [[0,1,2,3], [4,5]]
import java.util.*;
public class Graph {
private Map<Integer, List<Integer>> adjacencyList;
public Graph() {
this.adjacencyList = new HashMap<>();
}
public void addEdge(int u, int v) {
adjacencyList.putIfAbsent(u, new ArrayList<>());
adjacencyList.putIfAbsent(v, new ArrayList<>());
adjacencyList.get(u).add(v);
adjacencyList.get(v).add(u); // Undirected
}
public List<Integer> bfs(int start) {
Set<Integer> visited = new HashSet<>();
Queue<Integer> queue = new LinkedList<>();
List<Integer> result = new ArrayList<>();
queue.offer(start);
visited.add(start);
while (!queue.isEmpty()) {
int vertex = queue.poll();
result.add(vertex);
for (int neighbor : adjacencyList.getOrDefault(vertex, new ArrayList<>())) {
if (!visited.contains(neighbor)) {
visited.add(neighbor);
queue.offer(neighbor);
}
}
}
return result;
}
public List<Integer> dfs(int start) {
Set<Integer> visited = new HashSet<>();
List<Integer> result = new ArrayList<>();
dfsRecursive(start, visited, result);
return result;
}
private void dfsRecursive(int vertex, Set<Integer> visited, List<Integer> result) {
visited.add(vertex);
result.add(vertex);
for (int neighbor : adjacencyList.getOrDefault(vertex, new ArrayList<>())) {
if (!visited.contains(neighbor)) {
dfsRecursive(neighbor, visited, result);
}
}
}
public boolean hasCycle() {
Set<Integer> visited = new HashSet<>();
Set<Integer> recStack = new HashSet<>();
for (int vertex : adjacencyList.keySet()) {
if (!visited.contains(vertex)) {
if (hasCycleDFS(vertex, visited, recStack)) {
return true;
}
}
}
return false;
}
private boolean hasCycleDFS(int v, Set<Integer> visited, Set<Integer> recStack) {
visited.add(v);
recStack.add(v);
for (int neighbor : adjacencyList.getOrDefault(v, new ArrayList<>())) {
if (!visited.contains(neighbor)) {
if (hasCycleDFS(neighbor, visited, recStack)) {
return true;
}
} else if (recStack.contains(neighbor)) {
return true;
}
}
recStack.remove(v);
return false;
}
public List<List<Integer>> connectedComponents() {
Set<Integer> visited = new HashSet<>();
List<List<Integer>> components = new ArrayList<>();
for (int vertex : adjacencyList.keySet()) {
if (!visited.contains(vertex)) {
List<Integer> component = new ArrayList<>();
dfsFill(vertex, visited, component);
components.add(component);
}
}
return components;
}
private void dfsFill(int v, Set<Integer> visited, List<Integer> component) {
visited.add(v);
component.add(v);
for (int neighbor : adjacencyList.getOrDefault(v, new ArrayList<>())) {
if (!visited.contains(neighbor)) {
dfsFill(neighbor, visited, component);
}
}
}
}
class Graph {
private adjacencyList: Map<number, number[]> = new Map();
addEdge(u: number, v: number): void {
if (!this.adjacencyList.has(u)) this.adjacencyList.set(u, []);
if (!this.adjacencyList.has(v)) this.adjacencyList.set(v, []);
this.adjacencyList.get(u)!.push(v);
this.adjacencyList.get(v)!.push(u); // Undirected
}
bfs(start: number): number[] {
const visited = new Set<number>();
const queue: number[] = [start];
const result: number[] = [];
visited.add(start);
while (queue.length > 0) {
const vertex = queue.shift()!;
result.push(vertex);
for (const neighbor of this.adjacencyList.get(vertex) || []) {
if (!visited.has(neighbor)) {
visited.add(neighbor);
queue.push(neighbor);
}
}
}
return result;
}
dfs(start: number): number[] {
const visited = new Set<number>();
const result: number[] = [];
this.dfsRecursive(start, visited, result);
return result;
}
private dfsRecursive(vertex: number, visited: Set<number>, result: number[]): void {
visited.add(vertex);
result.push(vertex);
for (const neighbor of this.adjacencyList.get(vertex) || []) {
if (!visited.has(neighbor)) {
this.dfsRecursive(neighbor, visited, result);
}
}
}
hasCycle(): boolean {
const visited = new Set<number>();
const recStack = new Set<number>();
for (const vertex of this.adjacencyList.keys()) {
if (!visited.has(vertex)) {
if (this.hasCycleDFS(vertex, visited, recStack)) {
return true;
}
}
}
return false;
}
private hasCycleDFS(v: number, visited: Set<number>, recStack: Set<number>): boolean {
visited.add(v);
recStack.add(v);
for (const neighbor of this.adjacencyList.get(v) || []) {
if (!visited.has(neighbor)) {
if (this.hasCycleDFS(neighbor, visited, recStack)) {
return true;
}
} else if (recStack.has(neighbor)) {
return true;
}
}
recStack.delete(v);
return false;
}
connectedComponents(): number[][] {
const visited = new Set<number>();
const components: number[][] = [];
for (const vertex of this.adjacencyList.keys()) {
if (!visited.has(vertex)) {
const component: number[] = [];
this.dfsFill(vertex, visited, component);
components.push(component);
}
}
return components;
}
private dfsFill(v: number, visited: Set<number>, component: number[]): void {
visited.add(v);
component.push(v);
for (const neighbor of this.adjacencyList.get(v) || []) {
if (!visited.has(neighbor)) {
this.dfsFill(neighbor, visited, component);
}
}
}
}
// Test
const g = new Graph();
const edges = [[0, 1], [1, 2], [2, 3], [4, 5]];
edges.forEach(([u, v]) => g.addEdge(u, v));
console.log("BFS from 0:", g.bfs(0)); // [0, 1, 2, 3]
console.log("DFS from 0:", g.dfs(0)); // [0, 1, 2, 3]
console.log("Has cycle:", g.hasCycle()); // false
console.log("Connected components:", g.connectedComponents()); // [[0,1,2,3], [4,5]]
Common Pitfalls
Not marking visited vertices during traversal: Revisiting vertices causes infinite loops in cycles. Always mark as visited before processing.
Forgetting to handle disconnected graphs: Graph may have multiple components. Must iterate through all vertices as starting points if finding all connections.
Confusing BFS and DFS use cases: BFS for shortest paths (unweighted), DFS for cycles and topological sort. Using wrong algorithm gives incorrect results.
Adjacency matrix for sparse graphs: O(V²) space wasteful for sparse graphs. Use adjacency list for better efficiency.
Not considering graph direction: Adding edges in only one direction for undirected graphs creates incorrect connectivity. Must add both directions.
Stack/Queue operation errors: Push/pop confusion with stack, enqueue/dequeue with queue. Wrong order ruins traversal.
Self-Check
- When would you use adjacency list vs adjacency matrix? What are space-time tradeoffs?
- How does BFS guarantee shortest path in unweighted graphs but not weighted graphs?
- What property of DFS enables cycle detection? How do you distinguish back edges from tree edges?
One Takeaway
Graphs are fundamental data structures modeling relationships. Master adjacency list (efficient for sparse graphs) and adjacency matrix (efficient for dense graphs) representations. BFS and DFS both run in O(V + E) but serve different purposes: BFS finds shortest paths in unweighted graphs, DFS enables cycle detection and topological sorting. Understanding which representation and algorithm fits your problem determines solution efficiency.
References
- Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms (3rd ed.). MIT Press.
- Sedgewick, R., & Wayne, K. (2011). Algorithms (4th ed.). Addison-Wesley.
- Tarjan, R. E. (1972). "Depth-first search and linear graph algorithms." SIAM Journal on Computing, 1(2), 146-160.