Like the Selection algorithm, described in several previous blogs, Partition is another algorithm closely related to sorting. Given a single value or an array of values, the Partition algorithm splits the array into sections with useful statistical properties, without sorting it, in linear time – i.e. O(n).
Single Value Partition
For example, C++ nth_element function rearranges the array in such as way, that the n-th array element has a value as if the array has been sorted. It also rearranges the array elements in such a way that array elements to the left of the n-th array index are all smaller than or equal to the values to the right of the n-th index.
In other words, the element at the n-th index partitions the array into smaller or equal values on the left and equal or larger values on the right. The contents of each of the two resulting partitions are not in sorted order.
Multiple Value Partition
Extending the Partition concept to more than one partition element, where one partition element leads to two partition regions (to the left and to the right or the partition element) is natural. Two partition elements results in three partition regions (left, middle and right), and three partition elements leads to four partition regions, and so on…
Each partition region has elements whose values are in between the two partition elements that are on the left and right sides of that region. However, the elements within a partition region are not in sorted order.
Distinct Array Value
When array values are distinct, Partition splits the array precisely.
Using Comparisons
Traditional method used for Partitioning is based on the QuickSelect algorithm, which uses comparisons. Partitioning is a biproduct of the selection process. Or, it could be that partitioning is a necessary step to achieve selection when using comparisons. It would be powerful to develop a proof of this.
Using Radix to Partition
Comparisons is not the only way to achieve partitioning in linear time. In-place MSD Radix Sort is linear-time O(N) and is a good starting point.
| Algorithm | Random | Presorted | Constant |
| nth_element | 3.60 seconds | 2.6 seconds | 0.70 seconds |
| In-Place MSD Radix Sort | 2.90 seconds | 2.5 seconds | 2.6 seconds |
The above table is for processing an array of 100 Million 32-bit unsigned integers. The two algorithms are nearly the same performance. In-Place MSD Radix Sort lags for the particular case of all array elements being constant.
However, the In-Place MSD Radix Sort performs too much work by sorting all elements of the array. To create a partition algorithm out of it, after performing the first level of the In-Place MSD Radix Sort, which based on the most significant digit, recurse only into those bins where the partitioning indexes are.
| Algorithm | Random | Presorted | Constant |
| nth_element | 3.60 seconds | 2.6 seconds | 0.70 seconds |
| Radix Partition | 0.70 seconds | 0.53 seconds | 2.22 seconds |
Radix Partition is significantly faster (5X) for random and presorted data distributions, but is slower for the array filled with constants. It is linear time also.
Partition More Ways
Another useful partition variant, which Python implements, is to partition the array based of several values. Radix Partition supports this variant also. The implementation links provided below support single or multiple way partition, extending nth_element method to multi-way partitioning.
Implementations
C++ implementation is available here. C# implementation is available here.
More Detailed Explanation
Coming soon…