C# is a wonderfully powerful object-oriented language: it supports many modern constructs and a variety of abstractions, runs in a managed environment, offers numerous libraries, and has one of the most enjoyable and productive development environments. Yet at each level of abstraction, as one of my friends reminded me, we can lose performance, such as when going from assembly language to a modern programming language, to generic constructs, to class hierarchies, and so on, in our world of 100,000 software onion layers.
As our code becomes more general and more abstract, it becomes useful in more cases. Yet each abstraction layer can add its own slight overhead, leading to a loss of performance. I'm currently reading the book "Writing High-Performance .NET Code" by Ben Watson, a 10-year Microsoft veteran. This book is filled with high-performance tips. One gem of a quote is, "In-depth performance optimization will often defy code abstractions."
For instance, chapter 5, in the section "for vs. foreach", takes a deep dive comparing a for loop over an array, a foreach loop over an array, and a foreach loop over an IEnumerable-based array. It shows the generated IL and demonstrates that once the IEnumerable iterator abstraction gets involved, virtual method calls (get_Current and MoveNext) appear, along with memory allocations. As he states, "It uses more CPU and more memory!"
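To make the three loop forms concrete, here is a small sketch (my own illustration, not the book's code). The first two compile down to a plain indexed loop; the third goes through an enumerator, which means an allocation plus interface calls to MoveNext() and get_Current on every element:

```csharp
using System;
using System.Collections.Generic;

class LoopComparison
{
    // Indexed for loop over an array: no iterator object, and the JIT
    // can often eliminate the bounds check.
    static long SumFor(int[] a)
    {
        long sum = 0;
        for (int i = 0; i < a.Length; i++)
            sum += a[i];
        return sum;
    }

    // foreach over an array: the C# compiler lowers this to an indexed
    // loop, so it performs essentially the same as SumFor.
    static long SumForeachArray(int[] a)
    {
        long sum = 0;
        foreach (int x in a)
            sum += x;
        return sum;
    }

    // foreach over IEnumerable<int>: an enumerator is obtained, and each
    // element costs interface calls to MoveNext() and get_Current.
    static long SumForeachEnumerable(IEnumerable<int> a)
    {
        long sum = 0;
        foreach (int x in a)
            sum += x;
        return sum;
    }

    static void Main()
    {
        int[] data = { 1, 2, 3, 4, 5 };
        Console.WriteLine(SumFor(data));                // 15
        Console.WriteLine(SumForeachArray(data));       // 15
        Console.WriteLine(SumForeachEnumerable(data));  // 15
    }
}
```

All three produce the same answer; the difference is entirely in the machine code and memory behavior behind them, which is exactly why measurement matters.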
Motivations for the HPCsharp Package
This book echoes my performance measurement findings, and serves as one of the main motivations for the HPCsharp NuGet package: to give developers the flexibility to choose the level of abstraction based on performance needs. When performance is not critical in a section of code, a more abstract, more compact, and most likely more readable implementation can be used. When performance is critical, a less abstract, closer-to-the-silicon, more performant implementation can be chosen.
The HPCsharp library helps developers gain performance where they need it by offering similar algorithms at various levels of abstraction. For example, Linq has many useful algorithms, such as Min, Max, Equal, Sum, and more. These algorithms operate on a variety of containers, as long as those containers provide an IEnumerable interface. Linq is a powerfully abstract library that the author of "Writing High-Performance .NET Code" cautions about, since it contains lots of hidden code that can get in the way of performance. Linq is fairly simple to use and highly abstract, making the code readable. For high-performance operations, however, developers need to be cautious with Linq, and not assume that Linq code will be performant.
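Linq's generality is easy to see in practice: the same extension methods work on any IEnumerable&lt;T&gt; source, whether it is an array, a List, or a lazily generated sequence. A quick illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class LinqGenerality
{
    static void Main()
    {
        int[] array = { 3, 1, 4, 1, 5 };
        var list = new List<int> { 3, 1, 4, 1, 5 };

        // The same extension methods apply to any IEnumerable<T> source.
        Console.WriteLine(array.Min());               // 1
        Console.WriteLine(list.Max());                // 5
        Console.WriteLine(array.Sum());               // 14
        Console.WriteLine(array.SequenceEqual(list)); // True

        // Including sequences that are generated lazily, with no
        // backing container at all.
        Console.WriteLine(Enumerable.Range(1, 5).Sum());  // 15
    }
}
```

That uniformity is Linq's strength, but every one of those calls runs through the enumerator machinery shown earlier, which is the hidden cost the book warns about.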
The HPCsharp library provides generic versions of the same extension methods that Linq has, but at a lower level of abstraction. The same Min, Max, Equal, Sum, and others are provided. However, the IEnumerable interface is not used, making the library less general but more performant. It currently works only on standard containers, such as Array and List, with support for more standard containers to follow. These containers are generic, able to hold any data type, which covers many common use cases. This gives developers the flexibility to use either Linq for a higher level of abstraction, or HPCsharp for higher performance.
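The idea can be sketched as follows. This is my own illustrative code, not the actual HPCsharp source, and the method name SumDirect is hypothetical; it simply shows the lower-abstraction approach of writing the extension method directly against concrete container types so that no enumerator is involved:

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch of the lower-abstraction approach (hypothetical
// names, not the real HPCsharp API): overloads bound to concrete
// container types instead of a single IEnumerable<T> overload.
static class SumExtensions
{
    public static long SumDirect(this int[] a)
    {
        long sum = 0;                       // long accumulator also
        for (int i = 0; i < a.Length; i++)  // sidesteps int overflow
            sum += a[i];                    // on large inputs
        return sum;
    }

    public static long SumDirect(this List<int> a)
    {
        long sum = 0;
        for (int i = 0; i < a.Count; i++)   // plain indexed loop: no
            sum += a[i];                    // enumerator allocation,
        return sum;                         // no interface calls
    }
}

class Demo
{
    static void Main()
    {
        int[] data = { 1, 2, 3, 4 };
        var list = new List<int> { 1, 2, 3, 4 };
        Console.WriteLine(data.SumDirect());  // 10
        Console.WriteLine(list.SumDirect());  // 10
    }
}
```

The trade-off is explicit: one overload per supported container type instead of one method for every IEnumerable, in exchange for code the JIT can turn into a tight loop.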
The HPCsharp NuGet package is available on nuget.org and is open source in the following GitHub repo. A working Visual Studio 2017 solution with usage examples and performance comparisons is also included in the repo. Contributions and donations, along with constructive feedback, are welcome.
Platform for More Algorithms
The HPCsharp package can also serve as a platform to grow the number and diversity of algorithms. New algorithms can be added and distributed to many developers, implemented at several levels of abstraction to continue providing the flexibility to trade off abstraction against performance.
The author of "Writing High-Performance .NET Code" continually reminds the reader to "Measure, Measure, Measure". This resonates with my Dr. Dobb's Journal column, "Algorithm Improvement thru Performance Measurement", where all algorithmic performance improvements were backed by and based on measurements.