Complete Guide to Trie Data Structure

Introduction

A Trie (pronounced "try"), also known as a prefix tree or digital tree, is a specialized tree-based data structure used primarily for efficient storage and retrieval of strings. The name "Trie" comes from the word "retrieval," highlighting its primary use case.

Tries are particularly powerful for problems involving string matching, prefix searching, and autocomplete functionality. Unlike binary search trees that compare entire keys, tries leverage the structure of strings by storing characters at each node.

Prerequisite

A basic understanding of strings, hash maps/dictionaries, and tree-based data structures will help in learning Trie efficiently.

What is a Trie?

A Trie is a tree data structure where:

  • Each node represents a single character of a string
  • The path from root to a node represents a prefix
  • The path from root to a leaf (or marked end node) represents a complete word
  • Each node can have multiple children, typically one for each possible character

Visual Example

Consider storing the words: "cat", "car", "card", "dog", "door"

        root
       /    \
      c      d
      |      |
      a      o
     / \     |
    t   r    g
        |    |
        d    o
             |
             r

Structure and Properties

Node Structure

Each Trie node typically contains:

  1. Children: An array or hash map of child nodes (one for each possible character)
  2. IsEndOfWord: A boolean flag indicating if this node marks the end of a valid word
  3. Optional: Additional data like frequency count, word meaning, etc.

Key Properties

  1. Root is empty: The root node doesn't store any character
  2. Shared prefixes: Words with common prefixes share the same path
  3. One path per word: Each complete word has exactly one path from root to an end node
  4. No key collision: Unlike hash tables, tries don't have collision issues
  5. Alphabetically ordered: Children can be stored in alphabetical order

Basic Operations

1. Insertion

Process:

  • Start at the root
  • For each character in the word:
    • If the character exists as a child, move to that child
    • If not, create a new node for that character
  • Mark the last node as end of word

Example: Inserting "cat"

Step 1: root -> create 'c'
Step 2: 'c' -> create 'a'
Step 3: 'a' -> create 't', mark as end

Process:

  • Start at the root
  • For each character in the word:
    • If the character exists as a child, move to that child
    • If not, return false
  • After traversing all characters, check if current node is marked as end of word

Process:

  • Similar to search
  • But only need to verify all characters exist
  • Don't need to check if it's an end of word

4. Deletion

Process:

  • Find the word in the trie
  • Remove nodes from bottom to top, but only if:
    • The node has no other children
    • The node is not marking end of another word
  • Use recursion for clean implementation

Implementation

class TrieNode {
  Map<Character, TrieNode> children;
  boolean isEndOfWord;

  public TrieNode() {
      children = new HashMap<>();
      isEndOfWord = false;
  }
}

class Trie {
  private TrieNode root;

  public Trie() {
      root = new TrieNode();
  }

  public void insert(String word) {
      TrieNode node = root;
      for (char c : word.toCharArray()) {
          node.children.putIfAbsent(c, new TrieNode());
          node = node.children.get(c);
      }
      node.isEndOfWord = true;
  }

  public boolean search(String word) {
      TrieNode node = root;
      for (char c : word.toCharArray()) {
          if (!node.children.containsKey(c)) {
              return false;
          }
          node = node.children.get(c);
      }
      return node.isEndOfWord;
  }

  public boolean startsWith(String prefix) {
      TrieNode node = root;
      for (char c : prefix.toCharArray()) {
          if (!node.children.containsKey(c)) {
              return false;
          }
          node = node.children.get(c);
      }
      return true;
  }
}

Time and Space Complexity

Time Complexity

OperationComplexityExplanation
InsertO(m)m = length of word
SearchO(m)m = length of word
StartsWithO(m)m = length of prefix
DeleteO(m)m = length of word

Space Complexity

  • Worst case: O(ALPHABET_SIZE × N × M)

    • ALPHABET_SIZE: Number of possible characters (26 for lowercase letters)
    • N: Number of words
    • M: Average length of words
  • Best case: O(N × M) when words share many prefixes

  • Practical: Much better than worst case due to prefix sharing

Applications

00

Autocomplete Systems

Search engines and text editors use tries to suggest word completions as users type. Each keystroke navigates down the trie, and all complete words from that node are potential suggestions.

01

Spell Checkers

Dictionary implementations for spell checking by quickly verifying if a word exists. The trie structure allows for fast lookups and can also suggest corrections by finding similar words in nearby branches.

02

IP Routing

Longest prefix matching in network routers uses trie-like structures. Each IP address is stored as a path in the trie, allowing routers to quickly find the best matching route for incoming packets.

03

T9 Predictive Text

Old mobile phone keyboards used tries for predictive text input. Each number sequence maps to multiple possible letters, and the trie helps predict the most likely word based on the key presses.

04

Genome Analysis

Storing and searching DNA sequences efficiently. Tries can quickly find matching gene sequences, identify patterns, and locate specific genetic markers in large genomic databases.

05

Contact Lists

Phone contact search and filtering by name prefix. As you type a name, the trie structure allows instant filtering of contacts that match the prefix, providing real-time search results.

06

Word Games

Boggle, Scrabble solvers use tries to find valid words quickly. The trie stores a dictionary of valid words, allowing the game to rapidly check if a letter sequence forms a valid word.

Advantages and Disadvantages

Advantages

  1. Fast prefix search: O(m) time where m is prefix length
  2. No hash collisions: Unlike hash tables
  3. Alphabetical ordering: Can retrieve keys in sorted order
  4. Memory efficient: For large datasets with common prefixes
  5. Predictable performance: No worst-case hash collisions

Disadvantages

  1. Memory intensive: Can use more memory than hash tables for small datasets
  2. Cache performance: Poor cache locality due to pointer-heavy structure
  3. Implementation complexity: More complex than arrays or hash tables
  4. Not suitable for dense data: Best for sparse string datasets

Must-Do LeetCode Problems

Beginner Level

#ProblemDifficultyImportanceKey ConceptsLink
1Implement Trie (Prefix Tree)Medium⭐⭐⭐⭐⭐Basic trie implementation, insert, search, startsWith#208
2Longest Common PrefixEasy⭐⭐⭐Trie for prefix matching#14

Why Problem #1 is Critical: This is the foundation for all trie problems. You must master this before attempting any other trie question.

Intermediate Level

#ProblemDifficultyImportanceKey ConceptsLink
3Design Add and Search Words Data StructureMedium⭐⭐⭐⭐Trie with wildcard matching, DFS/backtracking#211
4Word Search IIHard⭐⭐⭐⭐⭐Trie + DFS on 2D grid, optimization#212
5Map Sum PairsMedium⭐⭐⭐Trie with additional data storage, prefix sum#677
6Replace WordsMedium⭐⭐⭐Finding shortest prefix match#648

Why Problem #4 is Critical: This is one of the most common trie questions in technical interviews. It combines trie with backtracking and is excellent for demonstrating optimization skills.

Advanced Level

#ProblemDifficultyImportanceKey ConceptsLink
7Maximum XOR of Two Numbers in an ArrayMedium⭐⭐⭐⭐Binary trie, XOR problems, bit manipulation#421
8Search Suggestions SystemMedium⭐⭐⭐⭐Autocomplete, real-world application#1268
9Longest Word in DictionaryMedium⭐⭐⭐Incremental word building#720
10Prefix and Suffix SearchHard⭐⭐⭐⭐Advanced trie with multiple indices#745

Why Problem #7 is Critical: Introduces binary trie concept, which is essential for XOR-related problems. This pattern appears frequently in competitive programming.

Expert Level

#ProblemDifficultyImportanceKey ConceptsLink
11Maximum XOR With an Element From ArrayHard⭐⭐⭐⭐Binary trie with constraints, offline queries#1707
12Distinct Echo SubstringsHard⭐⭐⭐Suffix trie/array concepts, advanced string matching#1316
13Count Pairs With XOR in a RangeHard⭐⭐⭐⭐Binary trie for range queries, complex counting#1803

Common Patterns

Pattern 1: Basic Trie with DFS

Used in: Word Search II, Design Add and Search Words

def dfs(node, path):
    if node.is_end:
        results.append(path)
    for char, child in node.children.items():
        dfs(child, path + char)

Pattern 2: Binary Trie for XOR

Used in: Maximum XOR problems

class BinaryTrieNode:
    def __init__(self):
        self.children = [None, None]  # 0 and 1

def insert_binary(root, num):
    node = root
    for i in range(31, -1, -1):
        bit = (num >> i) & 1
        if not node.children[bit]:
            node.children[bit] = BinaryTrieNode()
        node = node.children[bit]

Pattern 3: Trie with Additional Data

Used in: Map Sum Pairs, Autocomplete Systems

class TrieNode:
    def __init__(self):
        self.children = {}
        self.value = 0  # Additional data
        self.suggestions = []  # Additional data
        self.frequency = 0  # Additional data

Problem Priority Guide

If you only have time for a few problems, focus on these in order:

  1. #208 - Implement Trie (Must do - foundation)
  2. #212 - Word Search II (Most common in interviews)
  3. #421 - Maximum XOR (Learn binary trie pattern)
  4. #211 - Add and Search Words (Wildcard matching)
  5. #1268 - Search Suggestions (Real-world application)

Conclusion

Mastering tries requires understanding both the data structure itself and the problems it solves best. Start with the basic implementation, progress through the must-do problems, and focus on recognizing when a trie is the right tool for the job.

The problems marked with ⭐⭐⭐⭐⭐ and ⭐⭐⭐⭐ are particularly important for technical interviews. Remember that trie problems often combine multiple concepts (DFS, backtracking, bit manipulation), so practice integrating these skills.

Additional Resources