Problem Analysis

TL;DR

Effective problem solving starts with thorough analysis before coding.

  • Problem Decomposition: Break the problem into smaller subproblems; understand inputs, outputs, and constraints; identify invariants.
  • Pattern Recognition: Problems follow recurring patterns (sliding window for subarray queries, two pointers for paired search, DP for optimization). Recognizing the pattern immediately suggests an approach.
  • Constraint Analysis: Separate hard constraints (must satisfy) from soft constraints (optimize), and numerical constraints (size limits) from structural constraints (ordering, sortedness).
  • Data Structure Selection: Choose a structure matching the access pattern: arrays for random access, linked lists for frequent insertion, hash tables for lookups, trees for ordered traversal.
  • Edge Cases: Empty input, single elements, boundary values, duplicates. Test these thoroughly.

Analysis prevents committing to the wrong approach, and a wrong approach wastes hours. Invest 5 minutes analyzing before 45 minutes implementing.

Core Concepts

Problem Decomposition Framework

Problem decomposition breaks complex problems into manageable pieces:

Input Understanding: What are inputs? Sizes? Types? Value ranges?

Output Specification: What exactly must we compute? Return format?

Core Constraint Identification: What makes problem hard? (e.g., "in-place" means O(1) space)

Invariant Recognition: What property must hold throughout solution? (e.g., two-pointer: elements before left pointer satisfy condition)

Simple Cases First: Solve trivial cases (n=1, n=2) to verify understanding.

Pattern Recognition Guide

Common algorithmic patterns solve recurring problem structures:

Sliding Window: Subarray/substring problems over a contiguous range. Fix the window size, or expand/contract it based on a condition.
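
A minimal sketch of the fixed-size variant (the function name and sample input are illustrative):

```python
def max_window_sum(nums, k):
    """Maximum sum over any contiguous window of size k (assumes len(nums) >= k >= 1)."""
    window = sum(nums[:k])                # sum of the first window
    best = window
    for i in range(k, len(nums)):
        window += nums[i] - nums[i - k]   # slide: add the new element, drop the oldest
        best = max(best, window)
    return best

print(max_window_sum([2, 1, 5, 1, 3, 2], 3))  # 9 -> window [5, 1, 3]
```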

Two-Pointer: Problems with paired elements (sorted array median, remove duplicates, partition). One pointer from each end or same direction with different speeds.
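
A small sketch of the pointers-from-each-end variant, here applied to pair sum in a sorted array (names are illustrative):

```python
def two_sum_sorted(nums, target):
    """Return indices (i, j) with nums[i] + nums[j] == target, or None.
    Assumes nums is sorted in ascending order."""
    left, right = 0, len(nums) - 1
    while left < right:
        s = nums[left] + nums[right]
        if s == target:
            return left, right
        if s < target:
            left += 1       # need a larger sum: move the small end up
        else:
            right -= 1      # need a smaller sum: move the large end down
    return None

print(two_sum_sorted([1, 3, 4, 6, 10], 9))  # (1, 3) -> 3 + 6
```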

Binary Search: When answer is monotonic (increasing/decreasing). Search for boundary where property changes.
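
A minimal sketch of searching for the boundary where a monotone property flips (the predicate and range are illustrative):

```python
def first_true(lo, hi, pred):
    """Smallest x in [lo, hi] with pred(x) True, assuming pred is monotone
    (False ... False True ... True). Returns hi + 1 if pred is never True."""
    while lo <= hi:
        mid = (lo + hi) // 2
        if pred(mid):
            hi = mid - 1    # boundary is at mid or to its left
        else:
            lo = mid + 1    # boundary is strictly to the right
    return lo

# Example: the first x whose square reaches 50
print(first_true(0, 100, lambda x: x * x >= 50))  # 8
```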

Dynamic Programming: When problem has optimal substructure and overlapping subproblems. Memoize to avoid recomputation.
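
A small memoization sketch using Python's functools.lru_cache; the stair-climbing recurrence is just an illustrative case of overlapping subproblems:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def climb(n):
    """Number of ways to climb n stairs taking 1 or 2 steps at a time.
    The overlapping subproblems climb(n-1) and climb(n-2) are cached,
    turning the naive exponential recursion into O(n)."""
    if n <= 1:
        return 1
    return climb(n - 1) + climb(n - 2)

print(climb(10))  # 89
```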

Greedy: When local optimal choices lead to global optimum. Prove correctness with exchange argument.
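
A minimal sketch of the earliest-finish-time greedy for selecting non-overlapping intervals, a classic case where the exchange argument applies (the example data is illustrative):

```python
def max_non_overlapping(intervals):
    """Maximum number of mutually non-overlapping intervals.
    Greedy choice: always keep the interval that finishes earliest."""
    count, last_end = 0, float("-inf")
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_end:   # compatible with everything chosen so far
            count += 1
            last_end = end
    return count

print(max_non_overlapping([(1, 3), (2, 4), (3, 5)]))  # 2 -> (1, 3) and (3, 5)
```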

DFS/BFS: Graph/tree traversal or exploring a state space. DFS is usually lighter on memory (stack bounded by depth); BFS finds shortest paths in unweighted graphs.
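
A small BFS sketch for shortest path length in an unweighted graph (the adjacency-dict representation and names are illustrative):

```python
from collections import deque

def shortest_path_length(graph, source, target):
    """Fewest edges from source to target in an unweighted graph given as an
    adjacency dict, or -1 if target is unreachable. BFS visits nodes in order
    of distance, so the first time target is reached is optimal."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nxt in graph.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return -1

g = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(shortest_path_length(g, "a", "d"))  # 2
```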

Backtracking: Constraint satisfaction or exhaustive search. Try choices, explore, undo.
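
A minimal try/explore/undo sketch that enumerates all subsets (the function names are illustrative):

```python
def subsets(nums):
    """All subsets of nums via the try / explore / undo pattern."""
    result, current = [], []

    def backtrack(i):
        if i == len(nums):
            result.append(current[:])   # record one complete choice sequence
            return
        backtrack(i + 1)                # choice 1: skip nums[i]
        current.append(nums[i])         # choice 2: take nums[i]
        backtrack(i + 1)
        current.pop()                   # undo before returning to the caller

    backtrack(0)
    return result

print(subsets([1, 2]))  # [[], [2], [1], [1, 2]]
```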

Topological Sort: When problem involves dependencies or ordering constraints.
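
A small sketch of Kahn's algorithm for ordering under dependencies (assumes every node appears as a key; names are illustrative):

```python
from collections import deque

def topo_order(graph):
    """Kahn's algorithm: repeatedly emit a node whose prerequisites are all done.
    graph maps each node to the nodes that must come after it. Returns one valid
    ordering, or None if a cycle makes ordering impossible."""
    indegree = {node: 0 for node in graph}
    for node in graph:
        for nxt in graph[node]:
            indegree[nxt] += 1
    queue = deque(node for node, deg in indegree.items() if deg == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    return order if len(order) == len(graph) else None

deps = {"compile": ["test", "package"], "test": ["package"], "package": []}
print(topo_order(deps))  # ['compile', 'test', 'package']
```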

Constraint Analysis Methodology

Constraints shape the solution approach:

Hard Constraints (must satisfy): In-place modification, single pass, no extra data structure. These eliminate certain approaches.

Soft Constraints (optimize): Time complexity, space complexity, code simplicity. These guide implementation details.

Numerical Constraints: Input size (n <= 10^5 suggests O(n log n) acceptable; n <= 10^9 requires O(1) or O(log n)). Value range affects data structure choice (array vs hash table).
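
A rough back-of-envelope check of that rule of thumb, assuming roughly 10^8 simple operations per second (the constant varies by language and operation):

```python
import math

n = 100_000  # n <= 10^5
for label, ops in [("O(n)", n), ("O(n log n)", n * math.log2(n)), ("O(n^2)", n ** 2)]:
    # estimate wall-clock time at ~1e8 simple operations per second
    print(f"{label}: ~{ops:,.0f} operations, ~{ops / 1e8:.2f} s")
# O(n log n) finishes in milliseconds; O(n^2) needs ~100 s and is infeasible
```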

Structural Constraints: Sorted/unsorted, tree/graph properties, duplicate handling. Sorted input suggests binary search; graph suggests graph algorithms.

Data Structure Selection Criteria

Access Pattern: Random access (array/hash table); sequential access (linked list); ordered traversal (balanced tree); repeated access to the minimum or maximum (heap).

Operations Required:

  • Frequent insertion/deletion: linked list or balanced tree
  • Frequent search: hash table or balanced tree
  • Frequent min/max: heap
  • Range queries: segment tree or balanced tree

Space Constraints: Arrays use contiguous memory; linked lists have pointer overhead; hash tables have collision overhead; trees have pointer overhead.

Trade-off Analysis: Hash table gives O(1) average search but O(n) worst case and O(n) space. Balanced tree gives O(log n) guaranteed and O(n) space. Choose based on distribution and constraints.
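
A small sketch of these choices in Python terms: a dict for lookups, heapq for repeated min access, and, since Python has no built-in balanced tree, a sorted list with bisect standing in for ordered/range operations:

```python
import bisect
import heapq

# Frequent search: hash table -> O(1) average lookup
seen = {"alice": 1, "bob": 2}
print("bob" in seen)                 # True

# Frequent min: heap -> O(log n) push, O(1) peek at the minimum
heap = [5, 1, 4]
heapq.heapify(heap)
print(heap[0])                       # 1

# Ordered traversal / range queries: keep a sorted list, locate bounds with bisect
sorted_vals = [3, 8, 15, 21]
lo = bisect.bisect_left(sorted_vals, 5)
hi = bisect.bisect_right(sorted_vals, 20)
print(sorted_vals[lo:hi])            # [8, 15] -> values in the range [5, 20]
```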

Key Algorithms

Problem Analysis Checklist

  1. Read problem statement carefully (twice if needed)
  2. Identify inputs and outputs with specific sizes/ranges
  3. List all constraints (hard and soft)
  4. Work through simple examples (n=1, n=2, small cases)
  5. Recognize algorithmic pattern (sliding window, DP, etc.)
  6. Select data structures matching access patterns
  7. Identify edge cases and corner cases
  8. Estimate complexity (time and space)

Common Pattern Matching

Problem Type                 Pattern               Data Structure       Complexity
Subarray sum, max length     Sliding window        Array, hash table    O(n)
Sorted array search          Binary search         Array                O(log n)
Pair sum, partition          Two pointers          Array                O(n)
Overlapping subproblems      Dynamic programming   Array, hash table    Varies
Graph/tree traversal         DFS/BFS               Graph, queue/stack   O(V+E)
Constraint satisfaction      Backtracking          Recursion            Exponential

Practical Example

PROBLEM: "Find the longest substring without repeating characters"

1. INPUTS & OUTPUTS
- Input: string s
- Output: integer (length of longest substring)
- Size: n = len(s), typically up to 10^5

2. CONSTRAINTS
- Hard: None obvious, "substring" means contiguous
- Soft: Optimize for time (likely O(n) required)
- Structural: Need contiguous characters

3. SIMPLE CASES
- "": output 0
- "a": output 1
- "au": output 2
- "aa": output 1 (repeating 'a')
- "abba": output 2 (substrings "ab" or "ba")

4. PATTERN RECOGNITION
- Subarray/substring with property: "no repeating characters"
- Property changes as we add characters
- Suggests: SLIDING WINDOW

5. CONSTRAINT ANALYSIS
- What defines "longest"? Length
- What makes substring invalid? Repeating character
- How to track repeating? Character frequency

6. DATA STRUCTURE
- Need to track character frequencies: hash table (dict/map)
- Need window pointers: two variables (left, right)
- Time: O(n) with sliding window
- Space: O(min(n, k)), where k is the alphabet size (26 for lowercase letters, 128 for ASCII)

7. EDGE CASES
- Empty string: return 0
- Single character: return 1
- All same characters: return 1
- All different: return length
- Unicode vs ASCII: clarify with interviewer

8. ALGORITHM SKETCH
- Use sliding window with left and right pointers
- Expand right, adding characters to seen set
- When duplicate found, shrink from left until no duplicate
- Track maximum window size
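
A minimal implementation of that sketch (the function name is illustrative):

```python
def length_of_longest_substring(s):
    """Length of the longest substring of s without repeating characters."""
    seen = set()        # characters currently inside the window s[left:right+1]
    best = 0
    left = 0
    for right, ch in enumerate(s):
        while ch in seen:               # duplicate found: shrink from the left
            seen.remove(s[left])
            left += 1
        seen.add(ch)                    # expand right
        best = max(best, right - left + 1)
    return best

for s in ["", "a", "au", "aa", "abba"]:
    print(repr(s), length_of_longest_substring(s))  # 0, 1, 2, 1, 2
```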

Common Pitfalls

warning

Skipping analysis: Jumping to code without thoroughly understanding the problem. Leads to solving the wrong problem or choosing an inefficient approach. Spend 5-10 minutes analyzing first.

Ignoring constraints: Missing an "in-place" requirement or a size limit. Leads to solutions that violate the requirements or time out.

Missing edge cases during analysis: Testing only provided examples. Edge cases (empty, single element, all same, boundary values) often break solutions.

Wrong pattern recognition: Misidentifying problem type. "Looks like DP but it's actually greedy" leads to hours debugging wrong approach.

Over-designing: Choosing overly complex data structure for simple problem. Sometimes simple array is best; don't use segment tree if prefix sum works.

Assuming homogeneous input: An input described as an "array" may contain duplicates, be unsorted, or include negative numbers. Clarify all assumptions.

Self-Check

  1. You see "find max length of contiguous subarray" - what pattern comes to mind? Why?
  2. Problem requires "O(1) space" - what does this eliminate? What data structures remain valid?
  3. Given "n <= 10^8" and "time limit 2 seconds", what complexity is acceptable (assuming 10^9 operations per second)?

One Takeaway

info

Problem analysis is the highest-leverage activity in problem solving. Ten minutes of thorough analysis prevents forty minutes of implementing the wrong solution. Decompose into inputs/outputs/constraints, recognize algorithmic patterns, and select data structures matching access patterns. This systematic approach transforms vague problems into clear strategies before a single line of code.
