642. Design Search Autocomplete System
At a Glance
- Topic: system-design-coding
- Pattern: Trie for prefix reachability + Top-K / Heap or offline sort per query
- Difficulty: Hard
- Companies: Google, Meta, Amazon, Uber, Dropbox
- Frequency: Low
- LeetCode: 642
Problem (one-liner)
Streaming character input builds a query buffer; after each character (except '#'), return up to three historical sentences with that prefix, ranked by higher hotness (times), then lexicographically smaller; '#' submits the sentence, increments its count, clears the buffer.
Recognition Cues
- Prefix while typing — trie over sentence characters (including spaces per problem charset)
- Top 3 with composite ordering — tuple sort
(-hotness, sentence)or min-heap of size 3 over negated key '#'sentinel — commit buffer as new training example
Diagram
At-a-glance flow (replace with problem-specific Mermaid as you refine this note). camelCase node IDs; no spaces in IDs.
Approaches
- Scan full corpus each keystroke —
O(N)per char — too slow for largeN. - Trie + collect all completions + sort —
O(P + M log M)wherePis prefix length andMis matches — acceptable for interview corpora; production uses bounded heaps per trie node (heavier implementation). - Optimal (interview) — global
sentence -> count; trie navigates prefix; DFS gathers candidate sentences under node; sort with comparator; take 3.
Optimal Solution
Go
package main
import "sort"
type trieNode struct {
children map[byte]*trieNode
}
type AutocompleteSystem struct {
root *trieNode
sentenceTimes map[string]int
currentBuffer []byte
}
func Constructor(sentences []string, times []int) AutocompleteSystem {
system := AutocompleteSystem{
root: &trieNode{children: make(map[byte]*trieNode)},
sentenceTimes: make(map[string]int),
currentBuffer: []byte{},
}
for index := range sentences {
sentence := sentences[index]
system.sentenceTimes[sentence] += times[index]
}
for sentence := range system.sentenceTimes {
system.insertSentence(sentence)
}
return system
}
func (system *AutocompleteSystem) insertSentence(sentence string) {
node := system.root
for position := 0; position < len(sentence); position++ {
byteValue := sentence[position]
if _, exists := node.children[byteValue]; !exists {
node.children[byteValue] = &trieNode{children: make(map[byte]*trieNode)}
}
node = node.children[byteValue]
}
}
func (system *AutocompleteSystem) gatherUnderPrefix(node *trieNode, prefix string, matches *[]string) {
if system.sentenceTimes[prefix] > 0 {
*matches = append(*matches, prefix)
}
for byteValue, child := range node.children {
system.gatherUnderPrefix(child, prefix+string([]byte{byteValue}), matches)
}
}
func (system *AutocompleteSystem) Input(character byte) []string {
if character == '#' {
sentence := string(system.currentBuffer)
system.sentenceTimes[sentence]++
system.insertSentence(sentence)
system.currentBuffer = system.currentBuffer[:0]
return []string{}
}
system.currentBuffer = append(system.currentBuffer, character)
prefixNode := system.root
for position := 0; position < len(system.currentBuffer); position++ {
byteValue := system.currentBuffer[position]
next, exists := prefixNode.children[byteValue]
if !exists {
return []string{}
}
prefixNode = next
}
matches := []string{}
system.gatherUnderPrefix(prefixNode, string(system.currentBuffer), &matches)
sort.Slice(matches, func(left, right int) bool {
leftCount := system.sentenceTimes[matches[left]]
rightCount := system.sentenceTimes[matches[right]]
if leftCount != rightCount {
return leftCount > rightCount
}
return matches[left] < matches[right]
})
if len(matches) > 3 {
matches = matches[:3]
}
return matches
}JavaScript
class TrieNode {
constructor() {
this.children = new Map();
}
}
/**
* @param {string[]} sentences
* @param {number[]} times
*/
var AutocompleteSystem = function (sentences, times) {
this.root = new TrieNode();
this.sentenceTimes = new Map();
for (let index = 0; index < sentences.length; index++) {
const sentence = sentences[index];
this.sentenceTimes.set(sentence, (this.sentenceTimes.get(sentence) || 0) + times[index]);
this._insert(sentence);
}
this.buffer = '';
};
AutocompleteSystem.prototype._insert = function (sentence) {
let node = this.root;
for (let position = 0; position < sentence.length; position++) {
const character = sentence[position];
if (!node.children.has(character)) {
node.children.set(character, new TrieNode());
}
node = node.children.get(character);
}
};
AutocompleteSystem.prototype._gather = function (node, prefix) {
const results = [];
const dfs = (current, built) => {
if (this.sentenceTimes.has(built) && this.sentenceTimes.get(built) > 0) {
results.push(built);
}
for (const [character, child] of current.children) {
dfs(child, built + character);
}
};
dfs(node, prefix);
return results;
};
/**
* @param {character} character
* @return {string[]}
*/
AutocompleteSystem.prototype.input = function (character) {
if (character === '#') {
const sentence = this.buffer;
this.sentenceTimes.set(sentence, (this.sentenceTimes.get(sentence) || 0) + 1);
this._insert(sentence);
this.buffer = '';
return [];
}
this.buffer += character;
let node = this.root;
for (let position = 0; position < this.buffer.length; position++) {
const characterAt = this.buffer[position];
if (!node.children.has(characterAt)) {
return [];
}
node = node.children.get(characterAt);
}
const matches = this._gather(node, this.buffer);
matches.sort((left, right) => {
const leftTimes = this.sentenceTimes.get(left);
const rightTimes = this.sentenceTimes.get(right);
if (leftTimes !== rightTimes) {
return rightTimes - leftTimes;
}
return left.localeCompare(right);
});
return matches.slice(0, 3);
};Indexed top-k (production follow-up): maintain, at each trie node, a min-heap of size 3 (or sorted multiset) of sentences reachable below that prefix, ordered by (-times, sentence). Update heaps lazily on # along the inserted sentence’s path — amortized cheaper than collecting every match when the corpus under a prefix is huge (same spirit as Top-K / Heap).
Walkthrough
Corpus: "i love you" (5), "island" (3). Query "i " → prefix "i " navigates trie; gather sentences sharing prefix; sort by times then string → ["i love you", ...] up to 3.
Invariant: sentenceTimes always holds authoritative counts; trie mirrors inserted sentence keys so DFS does not scan unrelated branches.
Edge Cases
- Prefix with no completions — return
[] - Fewer than three matches — return fewer
'#'on empty buffer — often inserts empty string (platform-dependent; guard if spec requires nonempty)- Tie on hotness — lexicographically smaller sentence wins
Pitfalls
- Sorting only by frequency — must break ties lexicographically as stated
- Rebuilding trie on every
#from scratch — wasteful; incremental insert per sentence suffices - Character set includes space — trie depth spans whole sentences, not just tokens
Similar Problems
- 1268. Search Suggestions System — product catalog prefix + lex order — Medium foundational (different ranking rules)
- 208. Implement Trie (Prefix Tree) — trie spine only — foundational Medium
- 355. Design Twitter — merge multiple ranked streams — design cousin
Variants
- Heap per trie node — cache top 3 at each prefix for
O(1)output after amortized updates - Personalization — user-specific
sentenceTimespartitions or features blended with global counts - Fuzzy / edit-distance — trigram index or BK-tree layered on top of exact trie
Mind-Map Tags
#trie #prefix-search #top-k #custom-comparator #streaming-input #autocomplete
Last updated on
Spotted something unclear or wrong on this page?