Given an array with many duplicated elements, write an algorithm to efficiently sort it in linear time where the order of equal elements doesn’t matter.

**Input: ** { 4, 2, 40, 10, 10, 1, 4, 2, 1, 10, 40 }

**Output:** { 1, 1, 2, 2, 4, 4, 10, 10, 10, 40, 40 }

Simple solution would to use efficient sorting algorithms like Merge Sort, QuickSort, Heap Sort, etc that can solve this problem in O(nlog(n)) time, but those will not take advantage of the fact that there are many duplicated values in the array.

A better approach is to use Counting sort. This will bring down the time complexity to O(n + k) where k is the range of the input. Below is implementation in C –

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 |
#include <stdio.h> #include <string.h> #define RANGE 100 // Function to efficiently sort an array with many duplicated values // using Counting Sort algorithm void customSort(int arr[], int n) { // create a new array to store counts of elements in the input array // freq[i] stores the number of items with key equal to i int freq[RANGE]; // initialize array by 0 memset(freq, 0, sizeof(freq)); // using value of elements in the input array as index, // update their frequencies in the new array for (int i = 0; i < n; i++) freq[arr[i]]++; // overwrite the input array with sorted order int k = 0; for (int i = 0; i < RANGE; i++) { while (freq[i]--) arr[k++] = i; } } // Efficiently sort an array with many duplicated values int main() { int arr[] = { 4, 2, 40, 10, 10, 1, 4, 2, 1, 10, 40 }; int n = sizeof(arr) / sizeof(arr[0]); customSort(arr, n); for (int i = 0; i < n; i++) printf("%d ", arr[i]); return 0; } |

`Output: `

1 1 2 2 4 4 10 10 10 40 40

The problem with the counting sort is its high space complexity O(k) when k >> n i.e. variation in keys is significantly greater than the number of items. Here is another implementation that calculates the exact range of array elements instead of making any assumption by calculating the difference between the maximum and minimum key values.

We can also use hash table to effectively solve this problem. The idea is to iterate through the array once and use a hash table the count the number of occurrences of the individual elements. Then we sort the unique elements present in the hash table according to the natural ordering. Finally, we overwrite the input array with sorted elements depending on frequencies stored in the hash table.

The time complexity of this solution would be O(nlog(k)) where k is the number of unique elements present in an array of size n, which can be much less than O(nlog(n)) if range of input elements is small. Auxiliary space used by the program is O(k). Below is implementation in C++ –

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 |
#include <iostream> #include <map> // Function to efficiently sort an array with many duplicated values void customSort(int arr[], int n) { // create an empty map to store frequency of array elements std::map<int, int> freq; // store distinct values in the input array as key and // their respective counts as values for (int i = 0; i < n; i++) freq[arr[i]]++; // sort the map according to the natural ordering of its keys // since we have used std::map instead of std::unordered_map, // keys are already sorted by default // traverse the sorted map & overwrite the input array with sorted elements int i = 0; for (auto it : freq) while (it.second--) arr[i++] = it.first; } // Efficiently sort an array with many duplicated values int main() { int arr[] = { 4, 2, 40, 10, 10, 1, 4, 2, 1, 10, 40 }; int n = sizeof(arr) / sizeof(arr[0]); customSort(arr, n); for (int i = 0; i < n; i++) std::cout << arr[i] << " "; return 0; } |

`Output: `

1 1 2 2 4 4 10 10 10 40 40

**Thanks for reading.**

Please use ideone or C++ Shell or any other online compiler link to post code in comments.

Like us? Please spread the word and help us grow. Happy coding 🙂

## Leave a Reply