Skip to main content

Binary Tree Problems

TL;DR

LCA (Lowest Common Ancestor) recursively returns first node in both subtrees. Path sum variants include root-to-leaf (DFS tracking), any-path (hash map prefix sums), and maximum (track global max during recursion). Tree diameter uses DFS returning height and tracking max diameter seen. Serialization uses preorder with nulls; deserialization recursively rebuilds from array. Mastering these patterns solves most binary tree interview problems.

Core Concepts

Lowest Common Ancestor (LCA)

LCA definition: Deepest node appearing in both p and q's ancestor chains. If node is LCA, p and q appear in different subtrees (or one is node itself).

Recursive approach: if current node is p or q, return it (one might be ancestor of other). Recursively search left and right. If both return non-null, current node is LCA. If only one non-null, return that.

Advantage over iterative: simplicity and elegance. Handles cases where one node is ancestor of other automatically.

Path Sum Problems

Path Sum I (root-to-leaf): DFS tracking current path sum. When reaching leaf, check if sum equals target.

Path Sum III (any-path): Track prefix sums in hash map. At each node, check if (current_sum - target) exists in map. This finds paths ending at current node with sum = target.

Maximum Path Sum: DFS returning max path through node. Track global maximum. Path can go through any node, not just root-to-leaf.

Tree Diameter and Width

Diameter is longest path between any two nodes. Not necessarily through root. Use DFS to get height and track diameter. For each node, diameter might be: max_path_through_node = left_height + right_height.

Width is maximum nodes at any level. Use level-order BFS, tracking level sizes.

Serialization and Deserialization

Preorder serialization: root, left subtree, right subtree. Null nodes represented explicitly. "1,2,null,null,3" is valid.

Deserialization: parse array into list. Recursively deserialize: consume first element as root, recurse left and right until null.

Complexity: O(n) for both operations.

Key Algorithms and Techniques

LCA Recursive Pattern

Return node if match, search subtrees, combine results.

DFS with Path Tracking

Maintain current state as recursing; check conditions at each node.

Hash Map for Prefix Sum

Track cumulative sums; check if complement exists for range queries.

Height-Based Properties

Calculate heights while finding diameter/width simultaneously.

Detailed Problem Walkthroughs

Lowest Common Ancestor (LCA) - Deep Analysis

Problem: Find the deepest node that is an ancestor of both p and q.

Key Insight: If both p and q are in different subtrees of a node, that node is their LCA. If both are in the same subtree, recurse into that subtree.

Example:

       3
/ \
5 1
/ \ / \
6 2 0 8
/ \
7 4

LCA(5, 1) = 3 (they're in different subtrees)
LCA(5, 4) = 5 (4 is in left subtree of 5; 5 is ancestor of 4)
LCA(6, 4) = 5 (both in subtree rooted at 5)

Algorithm:

1. If node == p or node == q, return node (either is LCA of itself and the other)
2. Search left and right subtrees
3. If both return non-null, current node is LCA (both are in different subtrees)
4. If only one returns non-null, that side has both values; return it

Why This Works: The algorithm automatically handles the case where p is ancestor of q (or vice versa) because when we find the first match, we return it, making it the LCA.

Path Sum Variants

Path Sum I (Root-to-Leaf) Find if path from root to leaf sums to target.

# DFS, track path sum
def pathSum(root, target):
def dfs(node, current_sum):
if not node:
return False
current_sum += node.val

# Must reach a LEAF
if not node.left and not node.right:
return current_sum == target

return dfs(node.left, current_sum) or dfs(node.right, current_sum)
return dfs(root, 0)

Path Sum III (Any Path) Find count of paths with given sum. Paths can start/end anywhere.

# Key insight: prefix sum hash map
# For range [i,j]: if sum[i..j] == target, then path exists
# To find: look for (current_sum - target) in prefix sums

def pathSumIII(root, target):
count = [0]
prefix_sums = {0: 1} # Seed with 0 for paths starting at current node

def dfs(node, current_sum):
if not node:
return

current_sum += node.val

# How many paths ending here sum to target?
if (current_sum - target) in prefix_sums:
count[0] += prefix_sums[current_sum - target]

# Add current sum to prefix map
prefix_sums[current_sum] = prefix_sums.get(current_sum, 0) + 1

dfs(node.left, current_sum)
dfs(node.right, current_sum)

# Remove when backtracking (this sum is only valid in this path)
prefix_sums[current_sum] -= 1

Maximum Path Sum (Any Path) Find maximum sum of any path (not necessarily root-to-leaf).

def maxPathSum(root):
max_sum = [float('-inf')]

def dfs(node):
if not node:
return 0

# Don't include negative contributions
left = max(dfs(node.left), 0)
right = max(dfs(node.right), 0)

# Path through this node: left + node + right
max_sum[0] = max(max_sum[0], left + node.val + right)

# Return max path going down from this node
return node.val + max(left, right)

dfs(root)
return max_sum[0]

Key difference: Path through node can include both subtrees (returning left + node + right), but path going down includes only one subtree (returning node + max(left, right)).

Tree Diameter

Problem: Find the longest path between any two nodes.

       1
/ \
2 3
/
4

Diameter: 4 -> 2 -> 1 -> 3 (length 3)
Not necessarily: longest from root

Algorithm: Track diameter while computing heights.

def treeDiameter(root):
diameter = [0]

def dfs(node):
if not node:
return 0

left_height = dfs(node.left)
right_height = dfs(node.right)

# Diameter through this node
diameter[0] = max(diameter[0], left_height + right_height)

# Height of subtree
return max(left_height, right_height) + 1

dfs(root)
return diameter[0]

Note: Diameter is sum of heights (longest path through node), but return value is height (longest path going down).

Serialization and Deserialization

Why It Matters: Serialize trees for storage/transmission; deserialize to reconstruct.

Preorder Approach:

      1
/ \
2 3

Serialized: "1,2,null,null,3,null,null"
(preorder: node, left, right)

Deserialize Logic:

def deserialize(data):
vals = data.split(',')
def dfs():
val = vals.pop(0)
if val == 'null':
return None
node = TreeNode(int(val))
node.left = dfs() # Preorder: process left before right
node.right = dfs()
return node
return dfs()

The stack automatically tracks where we are in the tree during recursion.

Practical Example

from collections import defaultdict

class TreeNode:
def __init__(self, val=0):
self.val = val
self.left = None
self.right = None

# LCA
def lowestCommonAncestor(root, p, q):
if not root or root == p or root == q:
return root
left = lowestCommonAncestor(root.left, p, q)
right = lowestCommonAncestor(root.right, p, q)
if left and right:
return root
return left if left else right

# Path Sum (root-to-leaf)
def pathSum(root, target):
def dfs(node, current_sum):
if not node:
return False
current_sum += node.val
if not node.left and not node.right:
return current_sum == target
return dfs(node.left, current_sum) or dfs(node.right, current_sum)
return dfs(root, 0)

# Path Sum III (any path)
def pathSumIII(root, target):
count = [0]
prefix_sum = {0: 1}

def dfs(node, current_sum):
if not node:
return
current_sum += node.val
count[0] += prefix_sum.get(current_sum - target, 0)
prefix_sum[current_sum] = prefix_sum.get(current_sum, 0) + 1
dfs(node.left, current_sum)
dfs(node.right, current_sum)
prefix_sum[current_sum] -= 1

dfs(root, 0)
return count[0]

# Tree diameter
def treeDiameter(root):
diameter = [0]

def dfs(node):
if not node:
return 0
left = dfs(node.left)
right = dfs(node.right)
diameter[0] = max(diameter[0], left + right)
return max(left, right) + 1

dfs(root)
return diameter[0]

# Maximum path sum
def maxPathSum(root):
max_sum = [float('-inf')]

def dfs(node):
if not node:
return 0
left = max(dfs(node.left), 0)
right = max(dfs(node.right), 0)
max_sum[0] = max(max_sum[0], left + node.val + right)
return node.val + max(left, right)

dfs(root)
return max_sum[0]

# Serialize
def serialize(root):
result = []
def dfs(node):
if not node:
result.append('null')
else:
result.append(str(node.val))
dfs(node.left)
dfs(node.right)
dfs(root)
return ','.join(result)

# Deserialize
def deserialize(data):
vals = data.split(',')
def dfs():
val = vals.pop(0)
if val == 'null':
return None
node = TreeNode(int(val))
node.left = dfs()
node.right = dfs()
return node
return dfs()

Common Pitfalls

warning

LCA assuming node != p and node != q: If p or q is ancestor of the other, node == p case catches it. Must check equality.

Path sum forgetting leaf check: Path sum root-to-leaf must verify node is actually leaf. Intermediate path sums don't count.

Diameter calculation missing combination: Diameter through node is left + right heights, not max of them.

Serialization null handling: Must explicitly represent nulls. Without them, deserialization can't distinguish structure.

Maximum path sum not tracking global max: Local computation isn't enough; must maintain global maximum during recursion.

Advanced Variations

Variant: All Paths with Given Sum

Return the actual paths, not just count.

def allPathsWithSum(root, target):
result = []

def dfs(node, current_sum, path):
if not node:
return

current_sum += node.val
path.append(node.val)

# Check if this point is target
if current_sum == target:
result.append(list(path))

# Continue searching
dfs(node.left, current_sum, path)
dfs(node.right, current_sum, path)

# Backtrack
path.pop()

dfs(root, 0, [])
return result

Variant: LCA in Binary Search Tree

In BST, you can use the value comparison to prune:

def lcaBST(root, p, q):
# If both p and q less than root, go left
if p.val < root.val and q.val < root.val:
return lcaBST(root.left, p, q)
# If both greater, go right
elif p.val > root.val and q.val > root.val:
return lcaBST(root.right, p, q)
# Otherwise, root is LCA
else:
return root

More efficient than general tree LCA (no need to search both subtrees).

Self-Check

  1. Trace LCA on tree where one node is ancestor of the other.

    • Example: Tree with node A as root, node B as A's left child. LCA(A, B) = A. The algorithm returns A immediately when checking root == p.
  2. Explain why Path Sum III uses prefix sum hash map. What does target - current_sum represent?

    • Answer: If we're at position i with sum = current_sum, and we want a path summing to target, we need a previous position j where sum[j..i] = target. This means sum[i] - sum[j] = target, so sum[j] = current_sum - target. The hash map lets us check if that previous sum exists in O(1).
  3. How does tree diameter differ from height? Why can't we just return max of both subtrees?

    • Answer: Height is longest path going down (node to leaf). Diameter is longest path overall (can go through node in any direction). You need both subtrees to form diameter through node. Returning max(left, right) loses paths that go through the node.
  4. What's the time complexity of serialization and deserialization?

    • Answer: O(n) for both. Traverse every node once. Space: O(n) for the serialized string or recursion stack.
  5. Why do we need to pop() from the prefix sum map during backtracking in Path Sum III?

    • Answer: The prefix sum is only valid for this particular path from root. When we backtrack, this path is no longer active, so the sum should no longer count for other branches.

One Takeaway

info

Binary tree problems showcase recursive thinking. LCA elegantly handles all cases through simple recursion. Path sum patterns with prefix maps solve range sum problems directly. Diameter and serialization reveal how to extract properties while traversing. Master these patterns and most tree problems become straightforward.

References

  • Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms (3rd ed.). MIT Press.
  • McDowell, G. L. (2015). Cracking the Coding Interview (6th ed.). CareerCup.