Parallel Standard Deviation

Standard deviation is a core statistical measure of the variability of a data set, and it is used extensively in data science. The computation uses summation twice: once to compute the mean (average) of the data set, and once more to sum the square of […]
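As a rough illustration of the two summations mentioned above, here is a minimal sequential sketch in C++. It is not the post's parallel implementation. The function name standard_deviation is hypothetical, it assumes the population form (dividing by n rather than n - 1), and it assumes an empty input should return 0.

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Two-pass population standard deviation (hypothetical helper, sequential sketch).
double standard_deviation(const std::vector<double>& data) {
    if (data.empty()) return 0.0;  // assumption: treat the empty case as 0
    const double n = static_cast<double>(data.size());
    // First summation: compute the mean.
    const double mean = std::accumulate(data.begin(), data.end(), 0.0) / n;
    // Second summation: add up the squared differences from the mean.
    const double sum_sq = std::accumulate(
        data.begin(), data.end(), 0.0,
        [mean](double acc, double x) { return acc + (x - mean) * (x - mean); });
    return std::sqrt(sum_sq / n);
}
```

Both passes are plain reductions over the same array, which is why the summation step is the natural place to look for parallelism.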


Memory Access

We think of computer system memory as RAM (random access memory). Originally, this meant that accessing any random location took the same amount of time. This is true of certain types of memory, such as static RAM (SRAM). In this blog I will show that current computer system memory has significantly deviated from being […]


Git by Task

This blog is a slightly different kind of cheat sheet. It is organized around common tasks that we do using the git command line. Hopefully, it will reduce your memorization overload.

To Clone a Repository

git clone https://repositoryName.git

Grab the repository.git part from the git web UI for cloning a repository. This command will […]


Parallel LSD Radix Sort

I’ve made several attempts at parallelizing the LSD Radix Sort algorithm. This blog provides the key concepts and details of the latest attempt, which has succeeded and is beginning to pay dividends in higher performance.

Performance Summary

The following table shows performance of three variations of the LSD Radix Sort on two different multi-core […]


Faster C++ Sorting

C++ has included wonderful implementations of sorting algorithms over the years. Recently, with C++17 support for parallelism, sorting performance has skyrocketed by running on all of the available cores. The number of cores is predicted to grow by a double-digit percentage per year, as competition between Intel, AMD, ARM and other processor vendors heats up. The […]


Parallel Divide-and-Conquer in C#

C# has several fundamental abstractions that make parallel programming possible. By splitting work across multiple processor cores, gains in performance can be quite significant. And with AMD and Intel shipping processors with more and more cores, using all of these compute resources is critical. Some of C#'s parallel facilities are: Parallel.Invoke() which runs several functions […]


Faster Checked Addition in C# (Part 2)

In the initial blog I developed a faster checked summation of ulong arrays. Using integer operations when no arithmetic overflow is detected helped propel performance 15X higher, to 682 MegaUlongAdditions/second for a typical case. However, the worst case was still as slow as summation of Decimal arrays – 48 MegaUlongAdditions/second. In this blog I’ll develop […]
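To make the idea of overflow-aware summation concrete, here is a small C++ sketch. It is not the C# method from the post: instead of falling back to Decimal, it assumes a different technique, counting carries into a second 64-bit word. The names WideSum and checked_sum are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical result type: a 128-bit total held as two 64-bit halves.
struct WideSum {
    std::uint64_t high;  // number of times the low word wrapped around
    std::uint64_t low;   // low 64 bits of the total
};

// Sum unsigned 64-bit values with integer operations only,
// detecting each overflow and recording it as a carry.
WideSum checked_sum(const std::vector<std::uint64_t>& values) {
    WideSum total{0, 0};
    for (std::uint64_t v : values) {
        total.low += v;       // unsigned addition wraps modulo 2^64
        if (total.low < v) {  // a wrapped result is smaller than the operand just added
            ++total.high;     // record the carry instead of losing it
        }
    }
    return total;
}
```

The common case costs one add and one compare per element. The carry branch is taken only when an overflow actually occurs.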
