Parallel Sort of Byte Arrays

This blog progressively develops a simple parallel algorithm described in Chapter 6 of a soon to be released book, as a fun parallelism exercise. The goal of this exercise is to sort a large array of unsigned bytes as quickly as possible, using multiple cores if that provides performance improvement, in-place if possible, targeting the […]

Read more "Parallel Sort of Byte Arrays"

Two Writes Make a Read

This a reprint of my September 1997 letter to IEEE Computer magazine’s Open Channel. It’s still relevant today. For instance, when designing add-in computer cards or chips that connect over PCI-express connection. Preface Back in the 1990’s when designing add-in cards for PC’s and Apple computers various busses such as PCI (precursor to PCI Express), […]

Read more "Two Writes Make a Read"

In-Place N-bit Radix Sort

This blog is a repost of an article I wrote in Dr. Dobb’s Journal in November of 2009, since Dr. Dobb’s Journal has ceased operation and its website has broken since then. The algorithm developed is a novel In-Place Radix Sort, which sorts in Linear Time. The article is still available through the amazing web […]

Read more "In-Place N-bit Radix Sort"

In-Place Binary Radix Sort

This blog is a repost of an article I wrote in Dr. Dobb’s Journal in October of 2009, since Dr. Dobb’s Journal has ceased operation and its website has broken since then. The article is still available through the amazing web archival of the Wayback machine, as provided below: V.J. Duvanenko, “Algorithm Improvement through Performance […]

Read more "In-Place Binary Radix Sort"

Benchmarking Block Swapping Algorithms

This is a re-print of my Dr. Dobb’s article from April of 2012. The original article used to be available: https://www.drdobbs.com/parallel/benchmarking-block-swapping-algorithms/232900395 It is still available https://web.archive.org/web/20130130063936/http://www.drdobbs.com/parallel/benchmarking-block-swapping-algorithms/232900395 Below is the recreated version, with code fixes to one of the algorithms. Swapping two elements within an array quickly is simple, requiring a single temporary storage element. Swapping […]

Read more "Benchmarking Block Swapping Algorithms"

Two Simple Sorting Algorithms

Merge Sort was invented by John von Neumann in 1945. Many implementations have been shown in books and on the internet. However, in-place implementations have been difficult to accomplish, due to the need for an in-place merge, which only recently became available in standard C++, and is not available in other languages. In this blog, […]

Read more "Two Simple Sorting Algorithms"

Parallel In-Place Merge

This is a re-post of my 2012 article in Dr. Dobb’s Journal, that is still available through the amazing WayBackMachine web archive. In my article on parallel merge, I developed and optimized a generic parallel merge algorithm. It utilized multiple CPU cores and scaled well in performance. The algorithm was stable, leading directly to Parallel Merge Sort. […]

Read more "Parallel In-Place Merge"

Average Fun

Computing an average or two numbers seems simple and innocent enough of an operation. Here is an example from a C++ computer science text book: int m = (a+b) / 2 where a and b are also int’s, which are signed 32-bit values on most computers. This code seems innocent enough: add two integers, divide […]

Read more "Average Fun"

Giga Sort – Sorting at 10X Faster Than C++ Parallel Sort

In my previous blog https://duvanenko.tech.blog/2020/02/03/faster-c-sorting/ I benchmarked standard C++ Sort algorithm on a single core at 11 Million 32-bit integers per second. Parallel C++ Sort runs at 93 Million integers per second on a 48-core Xeon CPU on AWS. I also benchmarked Parallel Merge Sort on the same machine, which reaches over 600 Million integers […]

Read more "Giga Sort – Sorting at 10X Faster Than C++ Parallel Sort"