Complete Guide to Trie Data Structure
Introduction
A Trie (pronounced "try"), also known as a prefix tree or digital tree, is a specialized tree-based data structure used primarily for efficient storage and retrieval of strings. The name "Trie" comes from the word "retrieval," highlighting its primary use case.
Tries are particularly powerful for problems involving string matching, prefix searching, and autocomplete functionality. Unlike binary search trees that compare entire keys, tries leverage the structure of strings by storing characters at each node.
A basic understanding of strings, hash maps/dictionaries, and tree-based data structures will help in learning Trie efficiently.
What is a Trie?
A Trie is a tree data structure where:
- Each node represents a single character of a string
- The path from root to a node represents a prefix
- The path from root to a leaf (or marked end node) represents a complete word
- Each node can have multiple children, typically one for each possible character
Visual Example
Consider storing the words: "cat", "car", "card", "dog", "door"
root
/ \
c d
| |
a o
/ \ |
t r g
| |
d o
|
r
Structure and Properties
Node Structure
Each Trie node typically contains:
- Children: An array or hash map of child nodes (one for each possible character)
- IsEndOfWord: A boolean flag indicating if this node marks the end of a valid word
- Optional: Additional data like frequency count, word meaning, etc.
Key Properties
- Root is empty: The root node doesn't store any character
- Shared prefixes: Words with common prefixes share the same path
- One path per word: Each complete word has exactly one path from root to an end node
- No key collision: Unlike hash tables, tries don't have collision issues
- Alphabetically ordered: Children can be stored in alphabetical order
Basic Operations
1. Insertion
Process:
- Start at the root
- For each character in the word:
- If the character exists as a child, move to that child
- If not, create a new node for that character
- Mark the last node as end of word
Example: Inserting "cat"
Step 1: root -> create 'c'
Step 2: 'c' -> create 'a'
Step 3: 'a' -> create 't', mark as end
2. Search
Process:
- Start at the root
- For each character in the word:
- If the character exists as a child, move to that child
- If not, return false
- After traversing all characters, check if current node is marked as end of word
3. StartsWith (Prefix Search)
Process:
- Similar to search
- But only need to verify all characters exist
- Don't need to check if it's an end of word
4. Deletion
Process:
- Find the word in the trie
- Remove nodes from bottom to top, but only if:
- The node has no other children
- The node is not marking end of another word
- Use recursion for clean implementation
Implementation
class TrieNode {
Map<Character, TrieNode> children;
boolean isEndOfWord;
public TrieNode() {
children = new HashMap<>();
isEndOfWord = false;
}
}
class Trie {
private TrieNode root;
public Trie() {
root = new TrieNode();
}
public void insert(String word) {
TrieNode node = root;
for (char c : word.toCharArray()) {
node.children.putIfAbsent(c, new TrieNode());
node = node.children.get(c);
}
node.isEndOfWord = true;
}
public boolean search(String word) {
TrieNode node = root;
for (char c : word.toCharArray()) {
if (!node.children.containsKey(c)) {
return false;
}
node = node.children.get(c);
}
return node.isEndOfWord;
}
public boolean startsWith(String prefix) {
TrieNode node = root;
for (char c : prefix.toCharArray()) {
if (!node.children.containsKey(c)) {
return false;
}
node = node.children.get(c);
}
return true;
}
}
Time and Space Complexity
Time Complexity
| Operation | Complexity | Explanation |
|---|---|---|
| Insert | O(m) | m = length of word |
| Search | O(m) | m = length of word |
| StartsWith | O(m) | m = length of prefix |
| Delete | O(m) | m = length of word |
Space Complexity
-
Worst case: O(ALPHABET_SIZE × N × M)
- ALPHABET_SIZE: Number of possible characters (26 for lowercase letters)
- N: Number of words
- M: Average length of words
-
Best case: O(N × M) when words share many prefixes
-
Practical: Much better than worst case due to prefix sharing
Applications
00
Autocomplete Systems
Search engines and text editors use tries to suggest word completions as users type. Each keystroke navigates down the trie, and all complete words from that node are potential suggestions.
01
Spell Checkers
Dictionary implementations for spell checking by quickly verifying if a word exists. The trie structure allows for fast lookups and can also suggest corrections by finding similar words in nearby branches.
02
IP Routing
Longest prefix matching in network routers uses trie-like structures. Each IP address is stored as a path in the trie, allowing routers to quickly find the best matching route for incoming packets.
03
T9 Predictive Text
Old mobile phone keyboards used tries for predictive text input. Each number sequence maps to multiple possible letters, and the trie helps predict the most likely word based on the key presses.
04
Genome Analysis
Storing and searching DNA sequences efficiently. Tries can quickly find matching gene sequences, identify patterns, and locate specific genetic markers in large genomic databases.
05
Contact Lists
Phone contact search and filtering by name prefix. As you type a name, the trie structure allows instant filtering of contacts that match the prefix, providing real-time search results.
06
Word Games
Boggle, Scrabble solvers use tries to find valid words quickly. The trie stores a dictionary of valid words, allowing the game to rapidly check if a letter sequence forms a valid word.
Advantages and Disadvantages
Advantages
- Fast prefix search: O(m) time where m is prefix length
- No hash collisions: Unlike hash tables
- Alphabetical ordering: Can retrieve keys in sorted order
- Memory efficient: For large datasets with common prefixes
- Predictable performance: No worst-case hash collisions
Disadvantages
- Memory intensive: Can use more memory than hash tables for small datasets
- Cache performance: Poor cache locality due to pointer-heavy structure
- Implementation complexity: More complex than arrays or hash tables
- Not suitable for dense data: Best for sparse string datasets
Must-Do LeetCode Problems
Beginner Level
| # | Problem | Difficulty | Importance | Key Concepts | Link |
|---|---|---|---|---|---|
| 1 | Implement Trie (Prefix Tree) | Medium | ⭐⭐⭐⭐⭐ | Basic trie implementation, insert, search, startsWith | #208 |
| 2 | Longest Common Prefix | Easy | ⭐⭐⭐ | Trie for prefix matching | #14 |
Why Problem #1 is Critical: This is the foundation for all trie problems. You must master this before attempting any other trie question.
Intermediate Level
| # | Problem | Difficulty | Importance | Key Concepts | Link |
|---|---|---|---|---|---|
| 3 | Design Add and Search Words Data Structure | Medium | ⭐⭐⭐⭐ | Trie with wildcard matching, DFS/backtracking | #211 |
| 4 | Word Search II | Hard | ⭐⭐⭐⭐⭐ | Trie + DFS on 2D grid, optimization | #212 |
| 5 | Map Sum Pairs | Medium | ⭐⭐⭐ | Trie with additional data storage, prefix sum | #677 |
| 6 | Replace Words | Medium | ⭐⭐⭐ | Finding shortest prefix match | #648 |
Why Problem #4 is Critical: This is one of the most common trie questions in technical interviews. It combines trie with backtracking and is excellent for demonstrating optimization skills.
Advanced Level
| # | Problem | Difficulty | Importance | Key Concepts | Link |
|---|---|---|---|---|---|
| 7 | Maximum XOR of Two Numbers in an Array | Medium | ⭐⭐⭐⭐ | Binary trie, XOR problems, bit manipulation | #421 |
| 8 | Search Suggestions System | Medium | ⭐⭐⭐⭐ | Autocomplete, real-world application | #1268 |
| 9 | Longest Word in Dictionary | Medium | ⭐⭐⭐ | Incremental word building | #720 |
| 10 | Prefix and Suffix Search | Hard | ⭐⭐⭐⭐ | Advanced trie with multiple indices | #745 |
Why Problem #7 is Critical: Introduces binary trie concept, which is essential for XOR-related problems. This pattern appears frequently in competitive programming.
Expert Level
| # | Problem | Difficulty | Importance | Key Concepts | Link |
|---|---|---|---|---|---|
| 11 | Maximum XOR With an Element From Array | Hard | ⭐⭐⭐⭐ | Binary trie with constraints, offline queries | #1707 |
| 12 | Distinct Echo Substrings | Hard | ⭐⭐⭐ | Suffix trie/array concepts, advanced string matching | #1316 |
| 13 | Count Pairs With XOR in a Range | Hard | ⭐⭐⭐⭐ | Binary trie for range queries, complex counting | #1803 |
Common Patterns
Pattern 1: Basic Trie with DFS
Used in: Word Search II, Design Add and Search Words
def dfs(node, path):
if node.is_end:
results.append(path)
for char, child in node.children.items():
dfs(child, path + char)
Pattern 2: Binary Trie for XOR
Used in: Maximum XOR problems
class BinaryTrieNode:
def __init__(self):
self.children = [None, None] # 0 and 1
def insert_binary(root, num):
node = root
for i in range(31, -1, -1):
bit = (num >> i) & 1
if not node.children[bit]:
node.children[bit] = BinaryTrieNode()
node = node.children[bit]
Pattern 3: Trie with Additional Data
Used in: Map Sum Pairs, Autocomplete Systems
class TrieNode:
def __init__(self):
self.children = {}
self.value = 0 # Additional data
self.suggestions = [] # Additional data
self.frequency = 0 # Additional data
Problem Priority Guide
If you only have time for a few problems, focus on these in order:
- #208 - Implement Trie (Must do - foundation)
- #212 - Word Search II (Most common in interviews)
- #421 - Maximum XOR (Learn binary trie pattern)
- #211 - Add and Search Words (Wildcard matching)
- #1268 - Search Suggestions (Real-world application)
Conclusion
Mastering tries requires understanding both the data structure itself and the problems it solves best. Start with the basic implementation, progress through the must-do problems, and focus on recognizing when a trie is the right tool for the job.
The problems marked with ⭐⭐⭐⭐⭐ and ⭐⭐⭐⭐ are particularly important for technical interviews. Remember that trie problems often combine multiple concepts (DFS, backtracking, bit manipulation), so practice integrating these skills.
Additional Resources
- LeetCode Trie Tag
- Visualgo Trie Visualization
- Pattern Recognition: Most trie problems fall into 3 categories:
- String prefix/suffix matching
- Word search optimization
- Binary/XOR operations