Given a string and a positive integer k, find all distinct substrings of any length containing exactly k distinct characters.

Please note that the problem specifically targets substrings that are contiguous (i.e., occupy consecutive positions) and inherently maintains the order of elements.

 
For example,

Input:  str = abcbd, k = 3
Output: [abc, abcb, bcbd, cbd]
 
 
Input:  str = abcadce, k = 4
Output: [abcad, abcadc, bcad, bcadc, cadce, adce]
 
 
Input:  str = aa, k = 1
Output: [a, aa]

Practice this problem

A simple solution would be to generate all substrings of the given string and return substrings containing exactly k distinct characters. The time complexity of this solution is O(n3) since it takes O(n2) time to generate all substrings for a string of length n and O(n) time to process each substring.

 
We can improve the time complexity using extra space. The idea is to process substrings starting with each index i in the input string. In each iteration of the loop, add each character present in the processed substring to a hash table. If the count of distinct characters becomes k at index j at any point of time, include substring str[i…j] in the result.

Following is the C++, Java, and Python program that demonstrates it:

C++


Download  Run Code

Output:

adce
abcadc
bcadc
abcad
bcad
cadce

Java


Download  Run Code

Output:

[abcad, abcadc, cadce, adce, bcad, bcadc]

Python


Download  Run Code

Output:

{‘bcad’, ‘adce’, ‘cadce’, ‘bcadc’, ‘abcad’, ‘abcadc’}

The time complexity of the above solution is O(n2) and requires O(n) extra space, where n is the length of the input string.

 
Exercise: Modify the solution to find substrings of length k containing exactly k distinct characters.