Find maximum occurring word in given set of strings

Given a huge set of strings with duplicate strings present, find the maximum occurring word in it. If two words have same count, return any one of them.


For example,


keys = ["code", "coder", "coding", "codable", "codec", "codecs", "coded",
        "codeless", "codec", "codecs", "codependence", "codex", "codify",
        "codependents", "codes", "code", "coder", "codesign", "codec",
        "codeveloper", "codrive", "codec", "codecs", "codiscovered"]


Maximum occurring word is: codec
It's count is: 4


The idea is to use Trie (Prefix Tree) to solve this problem. We start by inserting each key into the trie and store its count so far (along with the key itself) in the leaf nodes. After all nodes are inserted into the trie, we perform its pre-order traversal (depth first search) and find maximum frequency word by comparing the count present at leaf nodes. Note that we can also use a map to solve this problem.

Below is C++ and Java implementation of the idea:


Download   Run Code


Word : codec
Count: 4


Download   Run Code


Word : codec
Count: 4


1. Extend this solution to print all maximum occurring word (having same count).

2. Extend this solution to print first k maximum occurring word.

1 Star2 Stars3 Stars4 Stars5 Stars (3 votes, average: 4.67 out of 5)


Thanks for reading.

Please use our online compiler to post code in comments. To contribute, get in touch with us.
Like us? Please spread the word and help us grow. Happy coding 🙂

Leave a Reply

newest oldest most voted
Notify of

Is there any particluar reason to use trie. Couldnt we use hashmap to store count and iterate and push to heap or use heap itself.


Since we store the count at each leaf node when we call insert(), can we just return the count in this function? In the main() function, we keep track of the max count and the word seen so far. We don’t have to perform pre-order traversal