Hash Table Applications
TL;DR
Hash tables provide O(1) average-case lookups enabling fast complement checking. Two-sum leverages storing seen values and checking complement = target - current. Frequency counting maps elements to occurrence counts for anagram grouping and duplicate detection. Group anagrams by sorting characters or using character count tuples as keys. These patterns transform naive O(n²) searches into O(n) solutions, making hash tables essential for optimization.
Core Concepts
Frequency Counting
Character frequency counts each unique character's occurrences. Hash map maps character to count; increment when seen. Enables problems like anagram detection, palindrome permutation checking, and character rearrangement.
Element frequency extends character frequency to arrays. Useful for finding mode, detecting duplicates, or validating permutations. Single pass O(n) time, O(k) space where k is unique elements.
Running frequency maintains frequency map as processing data stream. Enables dynamic queries like "is current substring an anagram of target?" without recomputing.
Frequency patterns reveal data structure insights. Majority element uses frequency. Most frequent K uses heap with frequency comparator. Least frequent depends on frequency values.
Complement Lookups and Two-Sum
Two-sum problem finds two indices where nums[i] + nums[j] = target. Naive O(n²) tries all pairs. Hash map approach: for each number, check if complement (target - num) exists in map. O(n) time, O(n) space.
Prerequisite: current number must not be same as complement (avoid using same index twice). Track index or count carefully.
Three-sum and beyond build on two-sum. Three-sum: fix first element, solve two-sum for remaining. O(n²) with sorting. Four-sum similar with nested loops.
Closest pair sum finds pair summing closest to target. Track minimum difference encountered while checking complements. Similar O(n) approach.
Group Anagrams
Anagram definition strings are anagrams if containing same characters in different order. "listen" and "silent" are anagrams.
Sorting approach sorts each string's characters. Anagrams map to identical sorted strings. Use sorted string as key in hash map grouping anagrams.
Counting approach counts each character creating tuple (a_count, b_count, ..., z_count). Anagrams map to identical tuples. More efficient than sorting for large alphabets.
Complexity trade-off: sorting is O(n * m log m) where m is max word length; counting is O(n * m) for m-character strings.
Duplicate Detection
Set-based detection tracks seen elements. First encounter: add to set. Second encounter: found duplicate. O(n) time, O(n) space.
In-place detection exploits array indices as storage. For array containing integers 1 to n, use sign of arr[abs(value)] to mark presence. O(1) space but modifies array.
First duplicate stores first occurrence index. When duplicate found, compare indices to find earliest.
Duplicate count frequency map directly stores count. Iterate once to build map, once more to count duplicates.
Key Algorithms and Techniques
Two-Sum Template
Hash map stores seen elements. For each new element, check if complement exists.
Group Anagrams by Sorted Key
Sort each string; group by sorted key in hash map.
Frequency Counting Pattern
Single pass building hash map. Iterate through data incrementing counts.
Complement Checking
For each element, compute complement; check existence in map before insertion.
Practical Example
- Python
- Java
- TypeScript
from collections import defaultdict, Counter
# Two-sum: find two indices that sum to target
def two_sum(nums, target):
"""Find two indices where nums[i] + nums[j] = target."""
seen = {}
for i, num in enumerate(nums):
complement = target - num
if complement in seen:
return [seen[complement], i]
seen[num] = i
return []
# Group anagrams together
def group_anagrams(strs):
"""Group strings that are anagrams of each other."""
anagram_map = defaultdict(list)
for s in strs:
# Use sorted string as key
key = ''.join(sorted(s))
anagram_map[key].append(s)
return list(anagram_map.values())
# Frequency counting
def is_anagram(s, t):
"""Check if two strings are anagrams using frequency."""
return Counter(s) == Counter(t)
# Find all duplicates in array 1 to n
def find_duplicates(nums):
"""Find duplicate numbers in array containing 1 to n+1."""
seen = set()
duplicates = set()
for num in nums:
if num in seen:
duplicates.add(num)
seen.add(num)
return list(duplicates)
# Most frequent K elements
def top_k_frequent(nums, k):
"""Find k most frequent elements."""
count = Counter(nums)
return [num for num, _ in count.most_common(k)]
# Valid anagram check
def valid_anagram(s, t):
"""Efficient anagram check using frequency map."""
if len(s) != len(t):
return False
freq = {}
for char in s:
freq[char] = freq.get(char, 0) + 1
for char in t:
if char not in freq:
return False
freq[char] -= 1
if freq[char] < 0:
return False
return all(count == 0 for count in freq.values())
# Character frequency in string
def character_frequency(s):
"""Count frequency of each character."""
return dict(Counter(s))
# Test cases
print(f"Two-sum [2,7,11,15] target=9: {two_sum([2,7,11,15], 9)}")
print(f"Group anagrams: {group_anagrams(['eat', 'tea', 'ate', 'bat', 'tab'])}")
print(f"Is 'listen' anagram of 'silent'? {is_anagram('listen', 'silent')}")
print(f"Find duplicates [1,4,4,2,4]: {find_duplicates([1,4,4,2,4])}")
print(f"Top-2 frequent [1,1,1,2,2,3]: {top_k_frequent([1,1,1,2,2,3], 2)}")
print(f"Character freq of 'hello': {character_frequency('hello')}")
import java.util.*;
public class HashTableApplications {
// Two-sum: find two indices that sum to target
static int[] twoSum(int[] nums, int target) {
Map<Integer, Integer> seen = new HashMap<>();
for (int i = 0; i < nums.length; i++) {
int complement = target - nums[i];
if (seen.containsKey(complement)) {
return new int[]{seen.get(complement), i};
}
seen.put(nums[i], i);
}
return new int[]{};
}
// Group anagrams
static List<List<String>> groupAnagrams(String[] strs) {
Map<String, List<String>> anagramMap = new HashMap<>();
for (String s : strs) {
char[] chars = s.toCharArray();
Arrays.sort(chars);
String key = new String(chars);
anagramMap.computeIfAbsent(key, k -> new ArrayList<>()).add(s);
}
return new ArrayList<>(anagramMap.values());
}
// Find duplicates
static List<Integer> findDuplicates(int[] nums) {
Set<Integer> seen = new HashSet<>();
Set<Integer> duplicates = new HashSet<>();
for (int num : nums) {
if (!seen.add(num)) {
duplicates.add(num);
}
}
return new ArrayList<>(duplicates);
}
// Most frequent K elements
static List<Integer> topKFrequent(int[] nums, int k) {
Map<Integer, Integer> count = new HashMap<>();
for (int num : nums) {
count.put(num, count.getOrDefault(num, 0) + 1);
}
// Use PriorityQueue with custom comparator
PriorityQueue<Integer> heap = new PriorityQueue<>(
(a, b) -> count.get(b) - count.get(a)
);
heap.addAll(count.keySet());
List<Integer> result = new ArrayList<>();
for (int i = 0; i < k; i++) {
result.add(heap.poll());
}
return result;
}
// Check if anagram
static boolean isAnagram(String s, String t) {
if (s.length() != t.length()) return false;
int[] freq = new int[26];
for (char c : s.toCharArray()) {
freq[c - 'a']++;
}
for (char c : t.toCharArray()) {
freq[c - 'a']--;
if (freq[c - 'a'] < 0) return false;
}
return true;
}
public static void main(String[] args) {
int[] nums = {2, 7, 11, 15};
System.out.println("Two-sum: " + Arrays.toString(twoSum(nums, 9)));
String[] strs = {"eat", "tea", "ate", "bat", "tab"};
System.out.println("Group anagrams: " + groupAnagrams(strs));
int[] freqNums = {1, 1, 1, 2, 2, 3};
System.out.println("Top-2 frequent: " + topKFrequent(freqNums, 2));
System.out.println("'listen' anagram 'silent'? " + isAnagram("listen", "silent"));
}
}
// Two-sum
function twoSum(nums: number[], target: number): number[] {
const seen = new Map<number, number>();
for (let i = 0; i < nums.length; i++) {
const complement = target - nums[i];
if (seen.has(complement)) {
return [seen.get(complement)!, i];
}
seen.set(nums[i], i);
}
return [];
}
// Group anagrams
function groupAnagrams(strs: string[]): string[][] {
const anagramMap = new Map<string, string[]>();
for (const s of strs) {
const key = s.split('').sort().join('');
if (!anagramMap.has(key)) {
anagramMap.set(key, []);
}
anagramMap.get(key)!.push(s);
}
return [...anagramMap.values()];
}
// Find duplicates
function findDuplicates(nums: number[]): number[] {
const seen = new Set<number>();
const duplicates = new Set<number>();
for (const num of nums) {
if (seen.has(num)) {
duplicates.add(num);
}
seen.add(num);
}
return [...duplicates];
}
// Character frequency
function characterFrequency(s: string): Record<string, number> {
const freq: Record<string, number> = {};
for (const char of s) {
freq[char] = (freq[char] || 0) + 1;
}
return freq;
}
// Is anagram
function isAnagram(s: string, t: string): boolean {
if (s.length !== t.length) return false;
const freq = new Map<string, number>();
for (const char of s) {
freq.set(char, (freq.get(char) || 0) + 1);
}
for (const char of t) {
const count = freq.get(char) || 0;
if (count === 0) return false;
freq.set(char, count - 1);
}
return [...freq.values()].every(c => c === 0);
}
// Test
console.log("Two-sum [2,7,11,15] target=9:", twoSum([2, 7, 11, 15], 9));
console.log("Group anagrams:", groupAnagrams(['eat', 'tea', 'ate', 'bat', 'tab']));
console.log("Find duplicates:", findDuplicates([1, 4, 4, 2, 4]));
console.log("Is anagram:", isAnagram('listen', 'silent'));
Common Pitfalls
Using same element twice in two-sum: Check that complement's index differs from current index. Store index, not just presence.
Anagram grouping key errors: Character counts must be consistent. Sorting is safe; if using tuples, ensure same ordering always.
Frequency off-by-one: Decrementing frequency without checking if element still valid causes false positives. Verify before use.
Hash collision assumptions: Hash tables average O(1) but worst-case O(n). Very large datasets with poor hash functions degrade performance.
Not handling edge cases: Empty strings in anagram grouping, zero target in two-sum, or single-element arrays require explicit handling.
Self-Check
- Why does two-sum run in O(n) with hash tables versus O(n²) with brute force? What's the trade-off?
- How do anagrams map to identical keys? Compare sorting versus frequency counting approaches.
- Explain how frequency counting detects duplicates in O(n) time. What happens with multiple duplicates?
One Takeaway
Hash tables unlock O(1) lookups transforming naive problems from O(n²) to O(n). Complement checking for two-sum, frequency counting for anagrams, and duplicate tracking all leverage this principle. Recognizing these patterns turns interview problems into straightforward implementations.
References
- Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms (3rd ed.). MIT Press.
- McDowell, G. L. (2015). Cracking the Coding Interview (6th ed.). CareerCup.