Find maximum occurring word in given set of strings

Given a huge set of strings with duplicate strings present, find the maximum occurring word in it. If two words have same count, return any one of them.


 

For example,


Input:  

keys = ["code", "coder", "coding", "codable", "codec", "codecs", "coded",
        "codeless", "codec", "codecs", "codependence", "codex", "codify",
        "codependents", "codes", "code", "coder", "codesign", "codec",
        "codeveloper", "codrive", "codec", "codecs", "codiscovered"]

Output:

Maximum occurring word is: codec
It's count is: 4

 

 
The idea is to use Trie (Prefix Tree) to solve this problem. We start by inserting each key into the trie and store its count so far (along with the key itself) in the leaf nodes. After all nodes are inserted into the trie, we perform its pre-order traversal (depth first search) and find maximum frequency word by comparing the count present at leaf nodes. Note that we can also use a map to solve this problem.

 
Below is C++ and Java implementation of the idea:

C++

Download   Run Code

Output:

Word : codec
Count: 4

Java

Download   Run Code

Output:

Word : codec
Count: 4

 
Exercise:

1. Extend this solution to print all maximum occurring word (having same count).

2. Extend this solution to print first k maximum occurring word.

 
Thanks for reading.

Please use our online compiler to post code in comments.
Like us? Please spread the word and help us grow. Happy coding 🙂
 



Leave a Reply

avatar
  Subscribe  
newest oldest most voted
Notify of
arund
Guest

Is there any particluar reason to use trie. Couldnt we use hashmap to store count and iterate and push to heap or use heap itself.

Kevin
Guest

Since we store the count at each leaf node when we call insert(), can we just return the count in this function? In the main() function, we keep track of the max count and the word seen so far. We don’t have to perform pre-order traversal