Problem Analysis
TL;DR
Effective problem solving starts with thorough analysis before coding. Problem Decomposition: Break problem into smaller subproblems; understand inputs, outputs, constraints; identify invariants. Pattern Recognition: Problems follow patterns (sliding window for subarray queries, two-pointer for paired search, DP for optimization). Recognizing pattern immediately suggests approach. Constraint Analysis: Identify hard constraints (must satisfy) vs soft constraints (optimize); numerical constraints (size limits) vs structural constraints (ordering). Data Structure Selection: Choose structure matching access patterns: arrays for random access, linked lists for frequent insertion, hash tables for lookups, trees for ordered traversal. Edge Cases: Empty input, single elements, boundary values, duplicates. Thoroughly test these. Analysis prevents wrong approach; wrong approach wastes hours. Invest 5 minutes analyzing before 45 minutes implementing.
Core Concepts
Problem Decomposition Framework
Problem decomposition breaks complex problems into manageable pieces:
Input Understanding: What are inputs? Sizes? Types? Value ranges?
Output Specification: What exactly must we compute? Return format?
Core Constraint Identification: What makes problem hard? (e.g., "in-place" means O(1) space)
Invariant Recognition: What property must hold throughout solution? (e.g., two-pointer: elements before left pointer satisfy condition)
Simple Cases First: Solve trivial cases (n=1, n=2) to verify understanding.
Pattern Recognition Guide
Common algorithmic patterns solve recurring problem structures:
Sliding Window: Subarray/substring problems with contiguous subarrays. Fix window size or expand/contract based on condition.
Two-Pointer: Problems with paired elements (sorted array median, remove duplicates, partition). One pointer from each end or same direction with different speeds.
Binary Search: When answer is monotonic (increasing/decreasing). Search for boundary where property changes.
Dynamic Programming: When problem has optimal substructure and overlapping subproblems. Memoize to avoid recomputation.
Greedy: When local optimal choices lead to global optimum. Prove correctness with exchange argument.
DFS/BFS: Graph/tree traversal or exploring state space. DFS for space efficiency, BFS for shortest path.
Backtracking: Constraint satisfaction or exhaustive search. Try choices, explore, undo.
Topological Sort: When problem involves dependencies or ordering constraints.
Constraint Analysis Methodology
Constraints shape the solution approach:
Hard Constraints (must satisfy): In-place modification, single pass, no extra data structure. These eliminate certain approaches.
Soft Constraints (optimize): Time complexity, space complexity, code simplicity. These guide implementation details.
Numerical Constraints: Input size (n <= 10^5 suggests O(n log n) acceptable; n <= 10^9 requires O(1) or O(log n)). Value range affects data structure choice (array vs hash table).
Structural Constraints: Sorted/unsorted, tree/graph properties, duplicate handling. Sorted input suggests binary search; graph suggests graph algorithms.
Data Structure Selection Criteria
Access Pattern: Random access (array/hash table); sequential (linked list); ordered traversal (balanced tree); all smallest elements frequently (heap).
Operations Required:
- Frequent insertion/deletion: linked list or balanced tree
- Frequent search: hash table or balanced tree
- Frequent min/max: heap
- Range queries: segment tree or balanced tree
Space Constraints: Arrays use contiguous memory; linked lists have pointer overhead; hash tables have collision overhead; trees have pointer overhead.
Trade-off Analysis: Hash table gives O(1) average search but O(n) worst case and O(n) space. Balanced tree gives O(log n) guaranteed and O(n) space. Choose based on distribution and constraints.
Key Algorithms
Problem Analysis Checklist
- Read problem statement carefully (twice if needed)
- Identify inputs and outputs with specific sizes/ranges
- List all constraints (hard and soft)
- Work through simple examples (n=1, n=2, small cases)
- Recognize algorithmic pattern (sliding window, DP, etc.)
- Select data structures matching access patterns
- Identify edge cases and corner cases
- Estimate complexity (time and space)
Common Pattern Matching
| Problem Type | Pattern | Data Structure | Complexity |
|---|---|---|---|
| Subarray sum, max length | Sliding window | Array, hash table | O(n) |
| Sorted array search | Binary search | Array | O(log n) |
| Pair sum, partition | Two pointer | Array | O(n) |
| Overlapping subproblems | Dynamic programming | Array, hash table | Varies |
| Graph/tree traversal | DFS/BFS | Graph, queue/stack | O(V+E) |
| Constraint satisfaction | Backtracking | Recursion | Exponential |
Practical Example
- Example Analysis
- Python Pattern Examples
- TypeScript Pattern Examples
PROBLEM: "Find the longest substring without repeating characters"
1. INPUTS & OUTPUTS
- Input: string s
- Output: integer (length of longest substring)
- Size: n = len(s), typically up to 10^5
2. CONSTRAINTS
- Hard: None obvious, "substring" means contiguous
- Soft: Optimize for time (likely O(n) required)
- Structural: Need contiguous characters
3. SIMPLE CASES
- "": output 0
- "a": output 1
- "au": output 2
- "aa": output 1 (repeating 'a')
- "abba": output 2 (substrings "ab" or "ba")
4. PATTERN RECOGNITION
- Subarray/substring with property: "no repeating characters"
- Property changes as we add characters
- Suggests: SLIDING WINDOW
5. CONSTRAINT ANALYSIS
- What defines "longest"? Length
- What makes substring invalid? Repeating character
- How to track repeating? Character frequency
6. DATA STRUCTURE
- Need to track character frequencies: hash table (dict/map)
- Need window pointers: two variables (left, right)
- Time: O(n) with sliding window
- Space: O(min(n, 26)) for character frequencies
7. EDGE CASES
- Empty string: return 0
- Single character: return 1
- All same characters: return 1
- All different: return length
- Unicode vs ASCII: clarify with interviewer
8. ALGORITHM SKETCH
- Use sliding window with left and right pointers
- Expand right, adding characters to seen set
- When duplicate found, shrink from left until no duplicate
- Track maximum window size
# Sliding Window Pattern
def longest_substring_without_repeating(s):
"""Longest substring with no repeating characters"""
char_to_index = {}
max_length = 0
left = 0
for right in range(len(s)):
if s[right] in char_to_index:
# Duplicate found, move left pointer
left = max(left, char_to_index[s[right]] + 1)
char_to_index[s[right]] = right
max_length = max(max_length, right - left + 1)
return max_length
# Two-Pointer Pattern
def two_sum_sorted(arr, target):
"""Find two numbers that sum to target in sorted array"""
left, right = 0, len(arr) - 1
while left < right:
current_sum = arr[left] + arr[right]
if current_sum == target:
return [left, right]
elif current_sum < target:
left += 1
else:
right -= 1
return None
# Pattern Recognition: Identify the pattern
def analyze_problem(problem_description):
"""
Example of problem analysis workflow
"""
patterns = {
'contiguous subarray': 'Sliding Window',
'subarray sum': 'Sliding Window or Prefix Sum',
'sorted array': 'Binary Search',
'pair or triplet': 'Two Pointer or Hash Table',
'overlapping subproblems': 'Dynamic Programming',
'tree/graph traversal': 'DFS/BFS',
'all combinations': 'Backtracking',
'constraint satisfaction': 'Backtracking or CSP',
}
# Scan description for keywords
for keyword, pattern in patterns.items():
if keyword in problem_description.lower():
return pattern
return "Unknown - Analyze Further"
# Tests
print("Longest substring:", longest_substring_without_repeating("abcabcbb")) # 3
print("Two sum:", two_sum_sorted([2, 7, 11, 15], 9)) # [0, 1]
print("Pattern:", analyze_problem("Find max length of contiguous subarray with sum > 5"))
// Sliding Window Pattern
function longestSubstringWithoutRepeating(s: string): number {
const charToIndex = new Map<string, number>();
let maxLength = 0;
let left = 0;
for (let right = 0; right < s.length; right++) {
if (charToIndex.has(s[right])) {
left = Math.max(left, charToIndex.get(s[right])! + 1);
}
charToIndex.set(s[right], right);
maxLength = Math.max(maxLength, right - left + 1);
}
return maxLength;
}
// Two-Pointer Pattern
function twoSumSorted(arr: number[], target: number): [number, number] | null {
let left = 0;
let right = arr.length - 1;
while (left < right) {
const sum = arr[left] + arr[right];
if (sum === target) {
return [left, right];
} else if (sum < target) {
left++;
} else {
right--;
}
}
return null;
}
// Pattern Recognition
interface PatternMatch {
keyword: string;
suggestedPattern: string;
}
function analyzePattern(description: string): PatternMatch[] {
const patterns: Record<string, string> = {
'contiguous subarray': 'Sliding Window',
'subarray sum': 'Sliding Window or Prefix Sum',
'sorted array': 'Binary Search',
'pair': 'Two Pointer or Hash Table',
'overlapping': 'Dynamic Programming',
'traversal': 'DFS/BFS',
'combinations': 'Backtracking',
};
const matches: PatternMatch[] = [];
for (const [keyword, pattern] of Object.entries(patterns)) {
if (description.toLowerCase().includes(keyword)) {
matches.push({keyword, suggestedPattern: pattern});
}
}
return matches;
}
// Tests
console.log("Longest substring:", longestSubstringWithoutRepeating("abcabcbb")); // 3
console.log("Two sum:", twoSumSorted([2, 7, 11, 15], 9)); // [0, 1]
console.log("Pattern matches:", analyzePattern("Find contiguous subarray with max sum"));
Common Pitfalls
Skipping analysis: Jumping to code without understanding problem thoroughly. Leads to solving wrong problem or choosing inefficient approach. Spend 5-10 minutes analyzing first.
Ignoring constraints: Missing "in-place" constraint or size limit. Leads to infeasible solutions that don't compile or time out.
Missing edge cases during analysis: Testing only provided examples. Edge cases (empty, single element, all same, boundary values) often break solutions.
Wrong pattern recognition: Misidentifying problem type. "Looks like DP but it's actually greedy" leads to hours debugging wrong approach.
Over-designing: Choosing overly complex data structure for simple problem. Sometimes simple array is best; don't use segment tree if prefix sum works.
Assuming homogeneous input: Input described as "array" could have duplicates, unsorted, negative numbers. Clarify all assumptions.
Self-Check
- You see "find max length of contiguous subarray" - what pattern comes to mind? Why?
- Problem requires "O(1) space" - what does this eliminate? What data structures remain valid?
- Given "n <= 10^8" and "time limit 2 seconds", what complexity is acceptable (assuming 10^9 operations per second)?
One Takeaway
Problem analysis is the highest-leverage activity in problem solving. Ten minutes of thorough analysis prevents forty minutes of implementing the wrong solution. Decompose into inputs/outputs/constraints, recognize algorithmic patterns, and select data structures matching access patterns. This systematic approach transforms vague problems into clear strategies before a single line of code.
References
- Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms (3rd ed.). MIT Press.
- Kleinberg, J., & Tardos, É. (2005). Algorithm Design. Pearson Education.