THN Interview Prep

217. Contains Duplicate

Quick Identifier

  • Topic: arrays-hashing
  • Pattern: Hash Set (Membership Checking)
  • Hint: Don't compare every number to every other number. Just keep a mathematical "Set" of numbers you've seen so far. Before you add a number to the Set, ask: "Is it already in here?"
  • Difficulty: Easy
  • Companies: Amazon, Apple, Google, Bloomberg, Microsoft
  • LeetCode: 217

Quick Sights

  • Time Complexity: O(n). We iterate through the array at most once. Hash Set lookups and insertions take $O(1)$ amortized time.
  • Space Complexity: O(n). In the worst-case scenario (all elements are unique), the Hash Set will grow to contain all $n$ elements.
  • Core Mechanism: Create an empty Hash Set. Loop through the array. For each num, check if it exists in the Hash Set. If it does, you've found a duplicate—return true. If it doesn't, add num to the Hash Set and continue. If the loop finishes without returning true, all numbers must be unique—return false.

Concept Explanation

As a senior engineer, this is the canonical introduction to the space-time tradeoff.

  1. The Time Penalty (Brute Force): If you aren't allowed to use extra memory, you have to compare nums[0] with every other number. Then nums[1] with every other number. This is a nested loop and takes $O(n^2)$ time. It's too slow.
  2. The Sorting Penalty: You could sort the array in-place ($O(1)$ space) taking $O(n \log n)$ time. Once sorted, duplicates are guaranteed to be right next to each other. You just scan the array once looking for nums[i] == nums[i-1]. This is better, but still mathematically suboptimal.
  3. The Space Tradeoff (Optimal): To achieve true $O(n)$ linear time, you must spend $O(n)$ memory. You create a Hash Set. A Hash Set acts as a highly optimized physical checklist. Checking the checklist takes $O(1)$ time.

Diagram

This flowchart outlines the simple $O(n)$ Hash Set membership check.

Loading diagram…

Study Pattern (SDE3+)

  • 0–3 min: This is usually a warm-up question. Immediately explain the three tiers: Brute Force ($O(n^2)$), Sorting ($O(n \log n)$), and Hash Set ($O(n)$). Mention the space-time tradeoff.
  • Implementation pass: In JavaScript, you can use the syntax new Set(nums).size !== nums.length. While clever and compact, writing out the for loop is often preferred in an interview because the for loop returns true the instant it finds a duplicate (early exit). Converting the entire array to a Set processes all $N$ elements even if the duplicate is at index 0 and 1.
  • 5 min extension: Discuss what happens if memory is extremely tight but the numbers are small and bounded (e.g., all numbers are between 1 and 100). You could use a boolean array or a Bitmask integer instead of a heavy Hash Set, drastically reducing the constant factor of the space complexity.

Approaches

  • Brute force — Compare all pairs. Time: $O(n^2)$, Space: $O(1)$.
  • Sorting — Sort array, check adjacent elements. Time: $O(n \log n)$, Space: $O(1)$ (depends on sort).
  • Hash Set — Store seen elements. Time: $O(n)$, Space: $O(n)$. (Always pick this)

Optimal Solution

Go

func containsDuplicate(nums []int) bool {
	// Use map[int]struct{} as a set in Go.
	// struct{} takes 0 bytes of memory.
	seen := make(map[int]struct{}, len(nums))

	for _, num := range nums {
		// If the number is already in the map, we found a duplicate
		if _, exists := seen[num]; exists {
			return true
		}
		// Otherwise, record that we've seen it
		seen[num] = struct{}{}
	}

	return false
}

JavaScript

function containsDuplicate(nums) {
	const seen = new Set();

	for (const num of nums) {
		// If the Set already has the number, we found a duplicate
		if (seen.has(num)) {
			return true;
		}
		// Otherwise, record that we've seen it
		seen.add(num);
	}

	return false;
}

// Alternative (One-Liner):
// return new Set(nums).size !== nums.length;
// Note: This does not "early exit" like the for-loop does.

Walkthrough

Input: nums = [1, 2, 3, 1]

numActionIs num in seen?seen Set State
1CheckNo{1}
2CheckNo{1, 2}
3CheckNo{1, 2, 3}
1CheckYes{1, 2, 3}

Output: true (Algorithm terminates early upon finding the second 1).

Edge Cases

  • Array length of 1 (or 0). The loop finishes without finding duplicates, safely returning false.
  • Array consists of entirely the same number ([1, 1, 1]). Returns true on the second element.

Pitfalls

  • Using a plain Array/List as your "seen" tracker and using .includes(). Array.includes() is an $O(n)$ operation. Doing this inside an $O(n)$ loop degrades your entire algorithm back to $O(n^2)$. You must use a Set or a Hash Map for $O(1)$ lookups.

Similar Problems

  • 242. Valid Anagram — Also uses a Hash Map/Set mechanism, but tracks frequencies instead of just existence.
  • 219. Contains Duplicate II — The sliding window variant. You need a Hash Map to track where you last saw the duplicate to calculate distance.

Mind-Map Tags

#arrays #hashset #membership-check #early-exit

Mark this page when you finish learning it.

Last updated on

Spotted something unclear or wrong on this page?

On this page