Parallel Standard Deviation

Standard deviation is a core statistical measure of the variability of a data set, and it is used extensively in data science. The computation uses summation twice: once to compute the mean (average) of the data set, and once to sum the square of […]
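To make the two summations concrete, here is a minimal serial sketch in C++. The excerpt is cut off, so this assumes the second summation is over squared differences from the mean, and that the population form (dividing by N) is intended. The function name standard_deviation is hypothetical and is not taken from the post. Nothing here is parallel; the two std::accumulate calls are simply the reductions that a parallel version would target.

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Two-pass population standard deviation (divides by N, an assumption).
// Returns 0.0 for an empty input to avoid dividing by zero.
double standard_deviation(const std::vector<double>& data) {
    if (data.empty()) return 0.0;
    const double n = static_cast<double>(data.size());

    // First summation: the mean of the data set.
    const double mean = std::accumulate(data.begin(), data.end(), 0.0) / n;

    // Second summation: squared differences from the mean (assumed).
    const double sum_sq = std::accumulate(
        data.begin(), data.end(), 0.0,
        [mean](double acc, double x) {
            const double d = x - mean;
            return acc + d * d;
        });

    return std::sqrt(sum_sq / n);
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the squared differences sum to 32, so the function returns sqrt(32 / 8) = 2.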

Memory Access

We think of computer memory as RAM (random access memory), which means that any location can be accessed with nearly the same latency. This is true of certain kinds of memory, such as static RAM (SRAM). As we shall see in this blog, when system memory is made of […]

Git by Task

This blog is a slightly different kind of cheat sheet. It is organized around common tasks that we do using the git command line. Cloning a Repository: git clone https://something.git (grab the something.git part from the git web UI for cloning a repository). This command will create a directory with the same name as […]

Parallel LSD Radix Sort

I’ve made several attempts at parallelizing the LSD Radix Sort algorithm. This is a conceptual description of the latest attempt, which hopefully will work out well. My latest implementation of a partially parallel version of LSD Radix Sort is performing very well, running at around 150 MegaInt32/sec implemented in C++ and nearly the same speed in […]

Faster C++ Sorting

C++ has included wonderful implementations of sorting algorithms over the years. Recently, with C++17 support for parallelism, sorting performance has skyrocketed by running on all of the available cores. The number of cores is predicted to grow by a double-digit percentage per year as competition between Intel and AMD heats up, giving these parallel algorithms wil […]
