In this blog I’ll assemble a variety of resources for code performance optimization – from videos to blog posts to StackOverflow topics to university class instructions to books. I’ll update this post from time to time. Videos Books Blogs Gallery of Processor Cache Effects Read more "Code Performance Optimization Resources"
The LSD Radix Sort algorithm has been around for a long time, with its first computer usage in the 1950s and 1960s to sort punch cards by performing several passes over the stack of cards. It has resurfaced in the last decade with GPUs having hundreds and even thousands of processors with internal local truly random access memories […] Read more "Faster LSD Radix Sort"
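To make the "several passes" idea concrete, here is a minimal C++ sketch of an LSD radix sort. It is not the implementation from the post: the function name `lsd_radix_sort`, the 8-bit digit size, and the restriction to 32-bit unsigned keys are all assumptions chosen for illustration. Each pass plays the role of one run through the card stack, distributing keys by one digit while preserving the order set by earlier passes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the post): sorts 32-bit unsigned keys in
// four passes, one per 8-bit digit, starting at the least significant digit.
void lsd_radix_sort(std::vector<std::uint32_t>& keys) {
    std::vector<std::uint32_t> buffer(keys.size());
    for (int shift = 0; shift < 32; shift += 8) {
        // Count how many keys fall into each of the 256 digit bins.
        std::array<std::size_t, 256> count{};
        for (std::uint32_t k : keys) {
            ++count[(k >> shift) & 0xFF];
        }
        // Exclusive prefix sum: turn counts into each bin's start index.
        std::size_t offset = 0;
        for (std::size_t& c : count) {
            const std::size_t n = c;
            c = offset;
            offset += n;
        }
        // Stable scatter: keys with equal digits keep their previous order,
        // which is what lets later passes build on earlier ones.
        for (std::uint32_t k : keys) {
            buffer[count[(k >> shift) & 0xFF]++] = k;
        }
        keys.swap(buffer);
    }
}
```

Calling `lsd_radix_sort(v)` on a `std::vector<std::uint32_t>` sorts it in ascending order. The sketch uses one temporary buffer of the same size as the input, and no key comparisons are performed.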
Computer scientists develop algorithms for all kinds of tasks, such as sorting numbers, searching the web for information, showing web pages and interacting with them, and so on – thousands of different algorithms. Even a task such as sorting a bunch of numbers has more than 20 different algorithms. When you need to choose one […] Read more "Big-O by a Concrete Example"
Parallel algorithms are here! Parallel algorithms are now standard, accessible in Visual Studio 2017 (version 15.8). According to a wonderful Microsoft blog, C++17 parallel algorithms are no longer experimental. These include algorithms such as sort, for_each, reduce, equal, count, and many more. This gives us a standard and portable way to use all of the cores in a multi-core processor. […] Read more "Standard Parallel Algorithms Have Arrived"
Current processors offer some variety in the types of computational engines. They consist of multiple identical general-purpose cores, embedded multi-core graphics cores (GPUs), and accelerators for video and audio compression. Recently, Intel has integrated a Field Programmable Gate Array (FPGA) into its server-class processors, for even more customizable computational flexibility. Modern CPUs (Intel, AMD, ARM, […] Read more "What We Need Is More Heterogeneity"
Merging two pre-sorted arrays into a single array is a core part of the merge sort algorithm. In this blog I’ll discuss a couple of high-performance implementations of a serial merge, and a way to reduce the number of comparisons. static public void Merge(int a, Int32 aStart, Int32 aEnd, int b, Int32 bStart, […] Read more "Faster Merge in C# and C++"
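As a rough illustration of a serial merge, the following C++ sketch merges two sorted vectors into a new one. It is a textbook two-pointer merge, not the post's `Merge` method, whose full signature is truncated above; the name `merge_sorted` and the use of `std::vector<int>` are assumptions. Once one input is exhausted, the remainder of the other is copied without further comparisons.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the post): merges two ascending-sorted
// vectors into one ascending-sorted vector.
std::vector<int> merge_sorted(const std::vector<int>& a,
                              const std::vector<int>& b) {
    std::vector<int> out;
    out.reserve(a.size() + b.size());
    std::size_t i = 0;
    std::size_t j = 0;
    // Main loop: one comparison per output element while both inputs remain.
    while (i < a.size() && j < b.size()) {
        // Using <= takes from `a` on ties, which keeps the merge stable.
        if (a[i] <= b[j]) {
            out.push_back(a[i++]);
        } else {
            out.push_back(b[j++]);
        }
    }
    // At most one of these copies anything; neither needs comparisons.
    out.insert(out.end(), a.begin() + static_cast<std::ptrdiff_t>(i), a.end());
    out.insert(out.end(), b.begin() + static_cast<std::ptrdiff_t>(j), b.end());
    return out;
}
```

For production C++ code, the standard library's `std::merge` from `<algorithm>` provides the same behavior over iterator ranges.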