# Hybrid In-Place Merge Sort

Two truly simple sorting algorithms are shown in my previous blog https://duvanenko.tech.blog/2021/11/26/two-simple-sorting-algorithms/. These algorithms are both in-place and perform surprisingly well compared to standard C++ sort, especially when sufficient memory is available. In this blog, let’s see if we can raise the performance bar higher by adding slightly more complexity.

## Hybrid Bottom Up Merge Sort

Pure algorithms usually perform well, but hybrids have been used time and time again to perform even better. For instance, C++ standard combines Insertion Sort, Heap Sort, and Quicksort, into a high performance in-place sorting algorithm – called Introspective Sort.

Pure Bottom Up In-Place Merge Sort can also be combined with Insertion Sort, to create a higher performing sorting algorithm. Insertion Sort is in-place, which fits perfectly, for in-place usage. Insertion Sort is good at handling small sub-arrays, and that’s how we use it.

``````template< class _Type >
inline void merge_sort_bottom_up_inplace_hybrid(_Type* src, size_t l, size_t r)
{
if ((r - l) <= 32) {
insertion_sort(src + l, r - l + 1);
return;
}
size_t m = 32;
for (size_t i = l; i <= r; i += m)
insertion_sort(src + i, __min(m, r - m + 1));
for (; m <= r - l; m = m + m)
for (size_t i = l; i <= r - m; i += m + m)
std::inplace_merge(src + i, src + i + m, src + __min(i + m + m, r + 1));
}
``````

The above implementation starts out by splitting the input array into 32-element sub-array chunks, sorting each of them, before continuing on with the pure bottom up merge algorithm.

### Hybrid Divide-And-Conquer Merge Sort

Divide-and-conquer algorithms are made for hybrids, especially to process the leaf nodes of the recursive tree, as shown in the implementation below:

``````template< class _Type >
inline void merge_sort_hybrid_inplace(_Type* src, size_t l, size_t r)
{
if (r <= l) return;
if ((r - l) <= 48) {
insertion_sort(src + l, r - l + 1);
return;
}

size_t m = l + (r - l) / 2;   // computes the average without overflow

merge_sort_hybrid_inplace(src, l,     m);
merge_sort_hybrid_inplace(src, m + 1, r);

std::inplace_merge(src + l, src + m + 1, src + r + 1);
}
``````

Insertion Sort is added to the implementation to handle sub-array chunks once recursive subdivision reaches less than 48 elements.

### Performance

Let’s see how well these two In-Place Merge Sort algorithms perform.

The bottom up in-place Merge Sort algorithm performance has been raised to be nearly identical to the standard C++ sort implementation. Divide-and-conquer algorithm performance also increased substantially and is equivalent to performance of the standard C++ sort. Both algorithms are in-place and both use a hybrid approach to raise performance by terminating recursion early.

### Conclusion

Two hybrid in-place Merge Sort algorithms have been developed. These are simple and competitive in performance to C++ standard sort algorithm, whenever sufficient memory resources are available, due to the adaptive nature of inplace_merge implementation. The algorithms shown run on a single processor core.

However, when memory resources are not available the above algorithms most likely perform worse than the standard C++ sort.

This is a great starting point for future Parallel In-Place Merge Sort implementations, which will run on many processor cores, and will take performance even higher.