https://leetcode.com/problems/find-the-index-of-the-first-occurrence-in-a-string/
Given two strings
needleandhaystack.
Return the index of the first occurrence ofneedleinhaystack, or-1ifneedleis not part ofhaystack.
-
"Can you restate the problem in your own words?"
β "We need to find where a substring (needle) first appears in a larger string (haystack), and return that starting position. If it doesn't appear at all, return -1." -
"What should we return if the needle is empty?"
β "If needle is empty, we should return 0 (it's found at the beginning by convention)." -
"Are we looking for the first occurrence or all occurrences?"
β "Just the first occurrence. Once we find it, we can stop searching." -
"Is the search case-sensitive?"
β "Yes, 'A' and 'a' are different characters." -
"What if the needle is longer than the haystack?"
β "Then it's impossible for needle to be in haystack, so we return -1." -
"Can you give an example?"
β "If haystack is 'hello' and needle is 'll', we return 2 because 'll' starts at index 2."
-
"What type of problem is this?"
β "This is a string searching or pattern matching problem." -
"Have you seen similar problems before?"
β "Yes, this is the classic substring search problem. It's similar to how Ctrl+F works in a text editor." -
"What approaches do you know for string searching?"
β "The naive approach checks every possible position. More advanced algorithms include KMP (Knuth-Morris-Pratt), Boyer-Moore, and Rabin-Karp." -
"What would be the simplest approach?"
β "Check each position in haystack to see if needle starts there. Use a sliding window of size equal to needle's length." -
"Does Python have a built-in for this?"
β "Yes, thestr.find()method does exactly this, but we should implement it ourselves for the interview."
-
"Walk me through your approach step by step."
β "Iterate through each possible starting position in haystack (up to haystack length minus needle length). At each position, check if the substring matches needle. Return the index if we find a match, or -1 if we finish without finding it." -
"How many positions do we need to check?"
β "We need to check positions 0 through (len(haystack) - len(needle)). Beyond that, there's not enough room for needle to fit." -
"How do we check if needle matches at a given position?"
β "We can either: (1) slice haystack and compare to needle, or (2) use a nested loop to compare character by character." -
"Which approach would you prefer and why?"
β "Using Python's string slicing is cleaner and leverages optimized built-in code. The nested loop gives more control but is more verbose." -
"What's the early termination condition?"
β "As soon as we find a match, we return that index immediately."
def strStr(haystack: str, needle: str) -> int:
# Edge case: empty needle
if not needle:
return 0
# Edge case: needle longer than haystack
if len(needle) > len(haystack):
return -1
# Check each possible starting position
for i in range(len(haystack) - len(needle) + 1):
# Check if needle matches at position i
if haystack[i:i + len(needle)] == needle:
return i
# No match found
return -1def strStr(haystack: str, needle: str) -> int:
if not needle:
return 0
if len(needle) > len(haystack):
return -1
# Try each starting position
for i in range(len(haystack) - len(needle) + 1):
# Check if needle matches starting at position i
match = True
for j in range(len(needle)):
if haystack[i + j] != needle[j]:
match = False
break
if match:
return i
return -1def strStr(haystack: str, needle: str) -> int:
if not needle:
return 0
h_len, n_len = len(haystack), len(needle)
if n_len > h_len:
return -1
# Try each valid starting position
for i in range(h_len - n_len + 1):
# Use two pointers to compare
h_ptr = i
n_ptr = 0
while n_ptr < n_len and haystack[h_ptr] == needle[n_ptr]:
h_ptr += 1
n_ptr += 1
# If we matched all of needle, we found it
if n_ptr == n_len:
return i
return -1-
"What edge cases should we test?"
β "Empty needle, empty haystack, needle longer than haystack, needle not in haystack, needle at the beginning, needle at the end, single character strings, needle equals haystack." -
"Can you trace through an example?"
β "Let's trace 'sadbutsad' searching for 'sad': Start at index 0: 'sad' == 'sad' β Return 0." -
"Can you trace through a non-match example?"
β "Let's trace 'leetcode' searching for 'leeto': Index 0: 'leeto' vs 'leetc' β, Index 1: 'eeto' (only 4 chars left, need 5) β return -1." -
"Could you write some test cases?"
def test_strStr():
# Test case 1: Needle found at beginning
assert strStr("sadbutsad", "sad") == 0
# Test case 2: Needle found in middle
assert strStr("hello", "ll") == 2
# Test case 3: Needle not found
assert strStr("aaaaa", "bba") == -1
# Test case 4: Empty needle
assert strStr("hello", "") == 0
# Test case 5: Needle longer than haystack
assert strStr("abc", "abcd") == -1
# Test case 6: Needle equals haystack
assert strStr("abc", "abc") == 0
# Test case 7: Single character match
assert strStr("a", "a") == 0
# Test case 8: Single character no match
assert strStr("a", "b") == -1
# Test case 9: Needle at end
assert strStr("mississippi", "issip") == 4
# Test case 10: Multiple occurrences (return first)
assert strStr("ababcabc", "abc") == 2-
"What's the time and space complexity of your solution?"
β "Time: O(n * m) where n is haystack length and m is needle length. In the worst case, we check each of the n-m+1 positions and compare m characters. Space: O(1) if we use the character-by-character approach, or O(m) if we use slicing (creates temporary strings)." -
"Can this be optimized further?"
β "Yes! The KMP algorithm achieves O(n + m) time by preprocessing the needle to avoid redundant comparisons. Boyer-Moore can be even faster in practice by skipping characters." -
"When would you use the naive approach vs. KMP?"
β "For short strings or when needle length is small, the naive approach is simpler and sufficient. KMP is worth the complexity for large texts with long patterns, especially with many partial matches." -
"What about using built-in functions?"
β "In production code, usehaystack.find(needle)which is optimized in C. In interviews, implement it yourself unless told otherwise." -
"How does string slicing compare to character-by-character?"
β "Slicing is cleaner and often faster due to optimization, but creates temporary strings (O(m) space). Character comparison uses O(1) space but may be slower for long needles." -
"What if we needed to find all occurrences, not just the first?"
β "Continue the loop instead of returning early, and collect all matching indices in a list. Time complexity remains O(n * m)."
def strStr_KMP(haystack: str, needle: str) -> int:
if not needle:
return 0
# Build failure function (LPS array)
def build_lps(pattern):
lps = [0] * len(pattern)
length = 0
i = 1
while i < len(pattern):
if pattern[i] == pattern[length]:
length += 1
lps[i] = length
i += 1
elif length > 0:
length = lps[length - 1]
else:
lps[i] = 0
i += 1
return lps
lps = build_lps(needle)
i = j = 0
while i < len(haystack):
if haystack[i] == needle[j]:
i += 1
j += 1
if j == len(needle):
return i - j
elif j > 0:
j = lps[j - 1]
else:
i += 1
return -1Time Complexity: O(n + m)
Space Complexity: O(m) for LPS array