THN Interview Prep

030. Substring with Concatenation of All Words

At a Glance

  • Topic: sliding-window
  • Pattern: Sliding Window (fixed multi-unit window + hash map)
  • Difficulty: Hard
  • Companies: Amazon, Google, Meta, Microsoft
  • Frequency: Medium
  • LeetCode: 30

Problem (one-liner)

Given string source and array words where every word has the same length, return all starting indices of substrings that are a concatenation of each word in words exactly once (any order). Words may repeat in words.

Recognition Cues

  • All words same length — unlocks stepping wordLength at a time
  • Total window length = len(words) * wordLength
  • Permutation of multiset — compare counts, not order

Diagram

At-a-glance flow (replace with problem-specific Mermaid as you refine this note). camelCase node IDs; no spaces in IDs.

Loading diagram…

Approaches

  • Brute force — try every start index, partition into numWords slices of length wordLength, compare multiset — O(n · numWords) string compares per start → O(n²)–scale in practice.
  • Acceptable — Rabin–Karp rolling hash per slot + hash multiset equality — reduces compare cost but hash collisions need tie-break.
  • Optimal — fix wordLength offset streams: for each offset ∈ [0, wordLength), slide one window of numWords consecutive pieces along indices offset, offset+L, …; shrink when counts overflow expected — O(n) total across offsets with careful map updates (each character enters/leaves O(1) amortized per offset pass).

Optimal Solution

Go

func findSubstring(source string, words []string) []int {
	if len(words) == 0 {
		return nil
	}
	wordLength := len(words[0])
	numWords := len(words)
	n := len(source)
	expected := make(map[string]int)
	for _, word := range words {
		expected[word]++
	}
	out := []int{}

	for offset := 0; offset < wordLength; offset++ {
		leftPointer := offset
		current := make(map[string]int)
		matchedSlots := 0
		for rightEnd := offset + wordLength; rightEnd <= n; rightEnd += wordLength {
			piece := source[rightEnd-wordLength : rightEnd]
			if expected[piece] == 0 {
				current = make(map[string]int)
				matchedSlots = 0
				leftPointer = rightEnd
				continue
			}
			current[piece]++
			matchedSlots++
			for current[piece] > expected[piece] {
				leftPiece := source[leftPointer : leftPointer+wordLength]
				current[leftPiece]--
				matchedSlots--
				leftPointer += wordLength
			}
			if matchedSlots == numWords {
				out = append(out, leftPointer)
			}
		}
	}
	return out
}

JavaScript

function findSubstring(source, words) {
	if (words.length === 0) {
		return [];
	}
	const wordLength = words[0].length;
	const numWords = words.length;
	const expected = new Map();
	for (const word of words) {
		expected.set(word, (expected.get(word) || 0) + 1);
	}
	const out = [];

	for (let offset = 0; offset < wordLength; offset++) {
		let leftPointer = offset;
		const current = new Map();
		let matchedSlots = 0;
		for (let rightEnd = offset + wordLength; rightEnd <= source.length; rightEnd += wordLength) {
			const piece = source.slice(rightEnd - wordLength, rightEnd);
			if (!expected.has(piece) || expected.get(piece) === 0) {
				current.clear();
				matchedSlots = 0;
				leftPointer = rightEnd;
				continue;
			}
			current.set(piece, (current.get(piece) || 0) + 1);
			matchedSlots++;
			while (current.get(piece) > expected.get(piece)) {
				const leftPiece = source.slice(leftPointer, leftPointer + wordLength);
				current.set(leftPiece, current.get(leftPiece) - 1);
				matchedSlots--;
				leftPointer += wordLength;
			}
			if (matchedSlots === numWords) {
				out.push(leftPointer);
			}
		}
	}
	return out;
}

Walkthrough

source = "barfoothefoobarman", words = ["foo","bar"]wordLength=3, numWords=2

Try offset 0: slide pieces bar, foo …When window multiset matches {"foo":1,"bar":1}, record start.

Invariant for one offset line: current never exceeds expected for any word; excess triggers shrink from left by whole words.

Edge Cases

  • Words with repetition in words list — counts must match exactly
  • words = [""] edge case (problem constraints usually exclude)
  • source shorter than total concatenation length — no matches
  • Many duplicates — map counts critical

Pitfalls

  • Only scanning starts multiple of wordLength — wrong; need all offsets 0..wordLength-1
  • Treating permutation as sorted join — order-free; multiset only
  • Integer overflow not relevant; time blow-up if naive nested loops

Similar Problems

Variants

  • Different word lengths — NP-hard in general; ask constraints (often not interview-sized).
  • Count concatenations instead of list indices — same traversal, push count.
  • Streaming text — restart offset schedules; heavy engineering variant.

Mind-Map Tags

#sliding-window #fixed-window #multiset #word-grid-offset #same-length-words #hard-string

Last updated on

Spotted something unclear or wrong on this page?

On this page