Minimum Window Substring in Python

Minimum Window Substring

The Minimum Window Substring problem in Python is a popular challenge that requires you to find the smallest substring in a given string s that contains all the characters of another string t. This problem is usually asked in coding interviews because it tests your ability to optimize string traversal and manage data structures like hash maps. Solving this problem teaches you key techniques, such as the sliding window approach and character frequency tracking, which are widely used in real-world applications. Let’s break this problem into simple steps and learn how to solve it efficiently.

Steps to Solve Minimum Window Substring

  • Understand the Problem:
    • The task is to find the smallest substring in string s that contains all the characters from string t (including duplicates).
  • Use Hash Maps for Character Count:
    • Create a hash map (dictionary) to store the frequency of characters in string t.
  • Apply the Sliding Window Technique:
    • Use two pointers (start and end) to define a sliding window in string s.
    • Expand the window by moving the end pointer to include characters in the window.
  • Check Window Validity:
    • Use another hash map to track the frequency of characters in the current window.
    • If the current window contains all characters of t with the required frequency, mark it as valid.
  • Shrink the Window:
    • Move the start pointer to minimize the window size while still keeping it valid.
    • Update the result if the current window is smaller than the previous result.
  • Handle Edge Cases:
    • If no valid window exists, return an empty string.
    • Ensure the solution handles cases where t has characters not in s.
  • Optimize for Efficiency:
    • Use O(n) time complexity by ensuring each character is processed at most twice (once by end and once by start).
  • Test with Examples:
    • Example:
      • Input: s = "ADOBECODEBANC", t = "ABC"
      • Output: "BANC"
    • Explanation: "BANC" is the smallest substring in s that contains all characters of t.
Minimum Window Substring

The Minimum Window Substring problem has many practical applications. For example:

  • In text processing, it can help find the smallest section of text that contains all the keywords.
  • In DNA sequencing, it can help find the shortest segment of DNA that contains a specific set of genes.
  • In data analysis, it can be used to find the smallest window in a time series that contains all necessary data points for a specific analysis.

Problem statement

Given two strings s and t, find the minimum window in s which will contain all the characters in t.

Input

  • Two strings s and t.

Output

  • A string representing the minimum window in s which contains all characters in t. If there is no such window, return an empty string.

Example

python
1s = "ADOBECODEBANC"
2t = "ABC"
3# Output: "BANC"

How to solve

To solve the Minimum Window Substring problem efficiently, we can use the sliding window technique. The idea is to expand the window by moving the right pointer to find a valid window, then contract it by moving the left pointer to find the minimum window.

Full code in Python

Here's the complete Python code to solve the Minimum Window Substring problem:

python
1from collections import Counter, defaultdict
2
3def minWindow(s, t):
4    if not s or not t:
5        return ""
6
7    dict_t = Counter(t)
8    window_counts = defaultdict(int)
9
10    required = len(dict_t)
11    formed = 0
12
13    left, right = 0, 0
14    min_len = float("inf")
15    min_left, min_right = 0, 0
16
17    while right < len(s):
18        char = s[right]
19        window_counts[char] += 1
20
21        if char in dict_t and window_counts[char] == dict_t[char]:
22            formed += 1
23
24        while left <= right and formed == required:
25            char = s[left]
26
27            if right - left + 1 < min_len:
28                min_len = right - left + 1
29                min_left, min_right = left, right
30
31            window_counts[char] -= 1
32            if char in dict_t and window_counts[char] < dict_t[char]:
33                formed -= 1
34
35            left += 1
36
37        right += 1
38
39    if min_len == float("inf"):
40        return ""
41
42    return s[min_left:min_right + 1]
43
44# Example usage
45s = "ADOBECODEBANC"
46t = "ABC"
47print("Minimum window substring:", minWindow(s, t))  # Output: "BANC"

Explanation of the code

  • Initialize Variables: The dict_t stores the frequency of characters in t. window_counts keeps track of the characters in the current window. required is the number of unique characters in t that need to be present in the window.
  • Expand the Window: The right pointer is used to expand the window by adding characters to window_counts.
  • Contract the Window: Once the window contains all characters of t, the left pointer is used to contract the window and find the minimum length.
  • Update Result: Update the minimum length and the result if the current window is smaller than the previously found windows.
  • Return the Result: If a valid window is found, return it. Otherwise, return an empty string.

Time Complexity

The time complexity of this solution is O(n+m), where n is the length of s and m is the length of t. This is because each character in s and t is processed at most twice.

Frequently Asked Questions

Related Articles

Sign-in First to Add Comment

Leave a comment 💬

All Comments

No comments yet.