Update: A follow on blog Even Faster Sorting in C# is taking shape.
C# has several built-in algorithms for sorting arrays, lists and other containers. Array.Sort and List.Sort have many variations for sorting full or partial containers, using an algorithm that runs on a single CPU core. Array.Sort and List.Sort uses a hybrid sorting algorithm called Introspective Sort, which is a mixture of Insertion Sort, Heap Sort and Quicksort. This hybrid approach is also used by C++ STL sort, performing well while staying generic.
C# also provides the Linq library, which can also sort various containers using a single CPU core or multiple cores in parallel. It’s also generic, able to sort containers holding built-in numeric data types and user defined types/classes.
These are powerful, flexible, high performance capabilities. If you need to sort even faster in C# (.NET), using modern multi-core processors, this blog describes high performance sorting algorithms provided in a free, open source library called HPCsharp.
HPCsharp library is a NuGet package, available on nuget.org, with source code on GitHub. Currently, it provides Insertion Sort, Merge Sort, and Radix Sort. Merge Sort has a parallel and a serial implementation. Parallel implementation perform substantially faster than sorting using Linq.AsParallel(). Radix Sort runs on a single core, at the moment, and is competitive versus Array.Sort, List.Sort and even parallel Linq.
These sorting algorithms in HPCsharp provide the same familiar interfaces of the standard C# sorting, making it simple to check these out.
Parallel Merge Sort
The above graph compares performance of Linq.AsParallel() sorting versus HPCsharp Parallel Merge Sort. The x-axis is the size of the UInt32 array being sorted. The y-axis is time in seconds it takes to sort that array. For example, parallel Linq takes 28 seconds to sort a 150 million element array, while HPCsharp parallel Merge Sort takes 1.4 seconds. This is a speedup of about 20 times, not 20%, but 20 times faster.
The benchmarks in the above graph were run on a 36-core machine in AWS – c5.18xlarge node. The graph shows that the parallel Merge Sort scales much better. On a quad-core CPU parallel Merge Sort is at least 4X faster than parallel Linq.
Parallel Merge Sort is not limited to only sorting arrays of integers, but can sort arrays and lists of user defined classes – in general any type that can be compared. You can provide your own comparison method. The sort is also stable, whereas C# built-in sort for arrays and lists is not. Linq sort can be stable when .AsStable is used.
Serial implementation of Merge Sort is also provided, which is slower than C# Array.Sort and List.Sort, but is faster than Linq sort, serial and parallel. In this case, Merge Sort is using a single core, while parallel Linq is using all cores on a quad-core CPU, and serial Merge Sort is faster.
The above graph shows speedup of Parallel Merge Sort versus Parallel Linq. The x-axis is the array size of UInt32’s. The y-axis is the speedup provided by the Parallel Merge Sort. Most of the points land in the 15X to 30X range, showing that the Parallel Merge Sort is that many times faster.
Another fast sorting algorithm is Radix Sort. HPCsharp provides two variations: LSD (Least Significant Digit) and MSD (most significant digit). LSD Radix Sort is the faster of the two, is a stable sort, but is not in-place. MSD Radix Sort is slower, can be in-place. Both versions are O(N) sorting algorithm – that’s right, linear in time.
See Faster LSD Radix Sort blog post for some old and new ideas on how to optimize this amazing algorithm.
At the heart of the parallel Merge Sort is a parallel Merge algorithm. This algorithm is O(N) linear and parallelizes incredibly well. For details on this wonderful algorithm see my Parallel Merge blog. HPCsharp nuget package provides parallel Merge as a generic algorithm, and bases many other parallel algorithms on it.
C# Built-in Sorting Algorithms Details
C# Array.Sort and List.Sort uses a hybrid sorting algoritm called introspective sort, which is a mixture of Insertion Sort, Heap Sort and Quicksort. “This implementation performs an unstable sort; that is, if two elements are equal, their order might not be preserved. In contrast, a stable sort preserves the order of elements that are equal. On average, this method is an O(n log n) operation, where n is Count; in the worst case it is an O(n ^ 2) operation” (msdn).