Return k’th largest element in a stream
Given an infinite stream of integers, return the element representing the k'th
largest element in the stream.
In the previous post, we have discussed several methods to find k'th
largest element in an array. In this post, we are given an infinite stream instead of a fixed-length array.
For example,
4
5
12
8
9
10
20
42
…
Output:
The k’th largest element is NA.
The k’th largest element is NA.
The k’th largest element is 4
The k’th largest element is 5
The k’th largest element is 8
The k’th largest element is 9
The k’th largest element is 10
The k’th largest element is 12
…
1. Brute-Force Solution
A simple solution is to maintain an array of size k
for storing the top k
largest elements among all elements encountered so far in the stream. The array should be sorted in ascending order to find the k'th
largest element in constant time. Now, the first element in the sorted array is the k'th
largest element.
The idea is to compare each incoming element n
in the stream with the current k'th
largest element x
. If n
is more than x
, remove x
from the sorted array and push n
into the array in sorted order. If n
is less than x
, ignore it.
The time complexity of processing a new element in the stream using this approach is O(k) since we need to sort the array every time. The k'th
largest element can be retrieved in O(1) time. The additional space used by the program is O(k).
2. Using Self-balancing BST
Another solution is to use a self-balancing BST. The basic idea for processing a new element remains the same. The k'th
largest element will be the smallest element in the BST. The time complexity of processing a new element improves to O(log(k)). However, the time taken for finding the k'th
largest element increases to O(log(k)). The additional space used by the program remains O(k).
3. Using Priority Queue
The recommended solution to target this problem is using the priority queue. The idea is to keep track of the top k
largest elements using a min-heap of size k
. The advantage of using the min-heap is that the k'th
largest element is always present at the root and can be retrieved in constant time.
The idea is to compare each incoming element n
in the stream with the root element of min-heap r
. If n
is more than r
, replace r
with n
and call the Heapify procedure on the modified heap’s root. If n
is less than r
, ignore it.
The time complexity of processing a new element in the stream using this approach is O(log(k)), while the time taken for finding the k'th
largest element improves to O(1). The additional space used by the program is O(k).
Following is the implementation in C++, Java, and Python based on the min-heap solution:
C++
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 |
#include <iostream> #include <queue> using namespace std; // Comparison object to order the min-heap struct comp { bool operator()(int a, int b) const { return a > b; } }; class MinHeap { priority_queue<int, vector<int>, comp> pq; int k; public: MinHeap(int k) { this->k = k; } int add(int n) { // if the min-heap's size is less than `k`, push the current element // into the min-heap if (pq.size() < k) { pq.push(n); } // otherwise, if the current element is more than the smallest element // in the min-heap, remove the smallest element from the heap and // push the current element else if (n > pq.top()) { pq.pop(); pq.push(n); } // if the size of the min-heap reaches `k`, return the top element if (pq.size() == k) { return pq.top(); } else { return -1; } } }; class Solution { public: MinHeap* pq == nullptr; // Function to find the k'th largest element in a stream int findKthLargest(int k, int nextInt) { // create a min-heap first time when function is invoked if (pq == nullptr) { pq = new MinHeap(k); } return pq->add(nextInt); } }; int main() { int k = 3; Solution s; int n; // loop till the end-of-file (EOF) is reached while (cin >> n) { int x = s.findKthLargest(k, n); if (x != -1) { cout << "The k'th largest element is " << x << endl; } } return 0; } |
Java
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 |
import java.util.NoSuchElementException; import java.util.PriorityQueue; import java.util.Scanner; class MinHeap { final PriorityQueue<Integer> pq; final int k; public MinHeap(int k) { this.k = k; pq = new PriorityQueue<>(k); } public int add(int n) { // if the min-heap's size is less than `k`, push the current element // into the min-heap if (pq.size() < k) { pq.add(n); } // otherwise, if the current element is more than the smallest element // in the min-heap, remove the smallest element from the heap and // push the current element else if (pq.peek() < n) { pq.poll(); pq.add(n); } // if the size of the min-heap reaches `k`, return the top element if (pq.size() == k) { return pq.peek(); } else { return -1; } } } class Solution { public static MinHeap pq = null; // Function to find the k'th largest element in a stream public static int findKthLargest(int k, int nextInt) { // create MinHeap class instance first time when function is invoked if (pq == null) { pq = new MinHeap(k); } return pq.add(nextInt); } } class Main { public static void main(String[] args) { int k = 3; Scanner in = new Scanner(System.in); // loop till the end-of-file (EOF) is reached while (true) { try { int nextInt = in.nextInt(); int x = Solution.findKthLargest(k, nextInt); if (x != -1) { System.out.println("The k'th largest element is " + x); } } catch (NoSuchElementException e) { break; } } in.close(); } } |
Python
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 |
import heapq class MinHeap: def __init__(self, k): self.pool = [] self.k = k def add(self, n): # if the min-heap's size is less than `k`, push the current element # into the min-heap if len(self.pool) < self.k: heapq.heappush(self.pool, n) # otherwise, if the current element is more than the smallest element # in the min-heap, remove the smallest element from the heap and # push the current element elif self.pool[0] < n: heapq.heappushpop(self.pool, n) # if the size of the min-heap reaches `k`, return the top element if len(self.pool) == self.k: return self.pool[0] else: return -1 class Solution: def __init__(self): self.pq = None # Function to find the k'th largest element in a stream def findKthLargest(self, k, nextInt): # create MinHeap class instance first time when function is invoked if not self.pq: self.pq = MinHeap(k) return self.pq.add(nextInt) if __name__ == '__main__': k = 3 s = Solution() # loop till the end-of-file (EOF) is reached while True: try: x = s.findKthLargest(k, int(input())) if x != -1: print('The k\'th largest element is', x) except EOFError: break |
Thanks for reading.
To share your code in the comments, please use our online compiler that supports C, C++, Java, Python, JavaScript, C#, PHP, and many more popular programming languages.
Like us? Refer us to your friends and support our growth. Happy coding :)