Problem:
You have an array of a million numeric keys (e.g. 32-bit unsigned integers) and need the largest 11 of them.
Solutions:
- You could sort this array of keys using a comparison-based sorting algorithm (e.g. C++ std::sort, C# Array.Sort) and then read the top 11 array elements. This takes O(nlgn) time.
- You could sort using a non-comparison-based sorting algorithm (e.g. C++ ParallelAlgorithms RadixSortLSDPowerOf2Radix_unsigned_TwoPhase_DeRandomize, C# HPCsharp SortRadixLsd) and then read the top 11 array elements. This takes O(n) time.
- You could use the Top-K Radix Selection algorithm described in this blog, and then read the top 11 array elements. This takes O(n) time.
The first two solutions do more work than is necessary, as they sort all of the array elements, including the top 11 elements. This was not required by the problem statement. The last solution does not sort the array and does not provide the top 11 elements in sorted order. If the top 11 array elements need to be in sorted order, it’s much quicker to sort these 11 elements than to sort the entire array. The last solution is many times faster and is presented in this blog.
Example Implementation
K-top Radix Selection algorithm implementation is available in C# in the HPCsharp repository: https://github.com/DragonSpit/HPCsharp/blob/master/HPCsharp/SelectionKtopRadix.cs
C++ implementation is under development and will be part of the https://github.com/DragonSpit/ParallelAlgorithms repo.