Faster Copying in C#

Copying List to Array

To convert a List to an Array, C# provides a convenient ToArray function:

var listSource = new List<int> { 5, 7, 16, 3 };

int[] arrayDestination1 = listSource.ToArray();    // C# standard conversion
int[] arrayDestination2 = listSource.ToArrayPar(); // HPCsharp parallel conversion

ToArray comes up often: some method requires an Array, but you are using a List because you needed a collection that can grow or shrink. Many C# standard functions and APIs take Arrays as inputs, because Arrays are a more efficient collection to work with.

Notice how in the example above a new Array is created and returned by ToArray. The HPCsharp NuGet package provides a parallel multi-core ToArrayPar function that is about 3X faster than ToArray, with the same usage.

C# also provides a function to copy a List to an Array:

var listSource = new List<int> { 5, 7, 16, 3 };
int[] arrayDestination = new int[4];

listSource.CopyTo(arrayDestination);    // C# standard List to Array copy
listSource.CopyToPar(arrayDestination); // HPCsharp parallel copy

CopyTo copies the List to a pre-existing Array. Notice how arrayDestination has already been created, with the right size, for CopyTo to copy the List into. CopyTo can perform much better than ToArray, especially when the destination Array is re-used several times. I’ll show performance numbers shortly.
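Here is a minimal sketch of the re-use pattern, using only the standard List.CopyTo. The class name, the SumBatches helper and the batch data are hypothetical, made up for illustration; the point is only that the destination Array is allocated once and then copied into many times.

```csharp
using System;
using System.Collections.Generic;

public static class ReuseDestinationExample
{
    // Sums every batch by copying it into one shared scratch Array,
    // instead of calling ToArray(), which allocates a new Array each time.
    public static long SumBatches(IEnumerable<List<int>> batches, int maxBatchSize)
    {
        int[] scratch = new int[maxBatchSize];   // allocated once, re-used for every batch
        long total = 0;

        foreach (List<int> batch in batches)
        {
            batch.CopyTo(scratch);               // throws ArgumentException if the batch does not fit
            for (int i = 0; i < batch.Count; i++)
                total += scratch[i];             // only the first batch.Count elements are valid
        }
        return total;
    }

    public static void Main()
    {
        var batches = new List<List<int>>
        {
            new List<int> { 5, 7, 16, 3 },
            new List<int> { 1, 2 },
        };
        Console.WriteLine(SumBatches(batches, 4)); // prints 34
    }
}
```

Note that after a shorter batch is copied, the tail of the scratch Array still holds values from an earlier batch, so the code must track how many elements are valid.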

Copying Array to Array

C# provides the same methods for copying one Array to another. For an Array, ToArray comes from Linq (using System.Linq):

int[] arraySource = new int[] { 5, 7, 16, 3 };

int[] arrayDestination1 = arraySource.ToArray();    // C# standard conversion
int[] arrayDestination2 = arraySource.ToArrayPar(); // HPCsharp parallel conversion

In the above case, ToArray creates a new Array and copies contents to it. HPCsharp has a parallel multi-core ToArrayPar function that is about 2.5X faster than ToArray.

C# also provides CopyTo for Array to Array copy, just like List to Array copy. CopyTo copies to a pre-existing Array.
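A short sketch of the Array to Array case follows. Unlike List.CopyTo, the standard Array.CopyTo requires the starting index in the destination as a second argument. The class and method names here are hypothetical.

```csharp
using System;

public static class ArrayCopyToExample
{
    // Copies all of source into the start of an existing destination Array.
    public static void CopyInto(int[] source, int[] destination)
    {
        source.CopyTo(destination, 0); // second argument: index in destination where copying begins
    }

    public static void Main()
    {
        int[] arraySource = new int[] { 5, 7, 16, 3 };
        int[] arrayDestination = new int[4];

        CopyInto(arraySource, arrayDestination);
        Console.WriteLine(string.Join(", ", arrayDestination)); // prints 5, 7, 16, 3
    }
}
```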

Two Cases

When copying, the source Array or List has already been filled with data, so it has been paged into system memory. For the destination, two possible cases arise:

  • the destination Array is brand new (just allocated)
  • the destination Array has been used before (being re-used)

The above cases differentiate between an Array that has not been paged into system memory yet (the first case), and one which has been paged in. The performance difference on a laptop with dual-channel system memory is dramatic: about 1 GigaInts/sec copy, versus about 2.2 GigaInts/sec. Counting both the read and the write of each 4-byte int, that is 8 GigaBytes/sec versus 17.6 GigaBytes/sec. The paged-in case is the fastest. I’ll show a way to make both of these cases go faster.
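The two cases can be observed with a rough timing sketch like the one below. It is not the benchmark behind the numbers above: the array size, the class name and the helper names are assumptions, and results will vary by machine and runtime. The GigaBytes/sec conversion counts 4 bytes read plus 4 bytes written per int.

```csharp
using System;
using System.Diagnostics;

public static class FreshVersusReusedDestination
{
    // Times one Array.Copy of the whole source; returns the rate in GigaInts/sec.
    public static double MeasureGigaIntsPerSec(int[] source, int[] destination)
    {
        Stopwatch timer = Stopwatch.StartNew();
        Array.Copy(source, destination, source.Length);
        timer.Stop();
        return source.Length / timer.Elapsed.TotalSeconds / 1e9;
    }

    // Each copied int is read once and written once: 2 * sizeof(int) bytes of traffic.
    public static double ToGigaBytesPerSec(double gigaIntsPerSec)
    {
        return gigaIntsPerSec * sizeof(int) * 2;
    }

    public static void Main()
    {
        const int Length = 50 * 1000 * 1000;     // hypothetical size, about 200 MB per Array
        int[] source = new int[Length];
        for (int i = 0; i < source.Length; i++)
            source[i] = i;                       // writing every element pages the source in

        MeasureGigaIntsPerSec(new int[16], new int[16]); // warm-up, so JIT time is not measured

        int[] destination = new int[Length];     // brand new: nothing has written to it yet
        double fresh  = MeasureGigaIntsPerSec(source, destination);
        double reused = MeasureGigaIntsPerSec(source, destination); // same Array, now paged in

        Console.WriteLine("fresh:  {0:F2} GigaInts/sec ({1:F1} GigaBytes/sec)", fresh,  ToGigaBytesPerSec(fresh));
        Console.WriteLine("reused: {0:F2} GigaInts/sec ({1:F1} GigaBytes/sec)", reused, ToGigaBytesPerSec(reused));
    }
}
```

A single timed run is noisy, so repeating the measurement several times and taking the best or the median gives a more trustworthy comparison.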

One Core Is Not Enough

https://stackoverflow.com/questions/56803987/memory-bandwidth-for-many-channels-x86-systems

The above link shows that on Xeon and desktop processors a single thread is not sufficient to use all of the system memory bandwidth. On desktop systems with dual-channel memory, two threads are necessary to saturate system memory. On Xeon workstation and cloud systems, many more threads and cores are needed.
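To illustrate the idea of using two threads, here is a minimal sketch that splits a copy into two halves with the standard Parallel.Invoke and Array.Copy. This is not HPCsharp's implementation; the class and method names are hypothetical, and Parallel.Invoke only allows the two halves to run concurrently, it does not guarantee it.

```csharp
using System;
using System.Threading.Tasks;

public static class TwoThreadCopy
{
    // Copies source into destination as two independent halves,
    // which the runtime may execute on two different cores.
    public static void CopyInTwoHalves(int[] source, int[] destination)
    {
        if (destination.Length < source.Length)
            throw new ArgumentException("Destination is too small.", nameof(destination));

        int half = source.Length / 2;
        Parallel.Invoke(
            () => Array.Copy(source, 0,    destination, 0,    half),
            () => Array.Copy(source, half, destination, half, source.Length - half));
    }

    public static void Main()
    {
        int[] arraySource = new int[] { 5, 7, 16, 3, 9 };
        int[] arrayDestination = new int[arraySource.Length];

        CopyInTwoHalves(arraySource, arrayDestination);
        Console.WriteLine(string.Join(", ", arrayDestination)); // prints 5, 7, 16, 3, 9
    }
}
```

The two halves write to disjoint ranges of the destination, so no locking is needed. For small Arrays the cost of starting the parallel work will likely outweigh any gain.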

Multi-Core

C# Linq provides an .AsParallel() option that can be prepended to a function, such as List.AsParallel().ToArray() or Array.AsParallel().CopyTo(), which uses multiple processor cores to do the work. However, for these C# Linq copy functions, performance drops dramatically, by about 5X or more. Even when the number of cores doing the work is limited to two, performance is still drastically reduced. Somehow, C# Linq copy functions are not multi-core friendly.
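For reference, a parallel Linq copy limited to a given number of cores can be written as in the sketch below. AsParallel, AsOrdered, WithDegreeOfParallelism and ToArray are standard PLINQ calls; the class and method names are hypothetical. AsOrdered() is an addition here to guarantee that the result keeps the source order, and it may not match what was measured above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PlinqCopyExample
{
    // Converts a List to an Array through PLINQ, using at most the given number of cores.
    public static int[] ToArrayWithPlinq(List<int> source, int degreeOfParallelism)
    {
        return source.AsParallel()
                     .AsOrdered()                                  // keep elements in source order
                     .WithDegreeOfParallelism(degreeOfParallelism) // upper limit on concurrent tasks
                     .ToArray();
    }

    public static void Main()
    {
        var listSource = new List<int> { 5, 7, 16, 3 };
        int[] arrayDestination = ToArrayWithPlinq(listSource, 2);
        Console.WriteLine(string.Join(", ", arrayDestination)); // prints 5, 7, 16, 3
    }
}
```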

HPCsharp Parallel Copy Functions

The HPCsharp NuGet package, on nuget.org, provides several high-performance multi-core copy functions that speed up both of the cases above:

  • destination Array is brand new: 2.5X to 3X speedup
  • destination Array is being re-used: about 15% speedup

HPCsharp copy functions provide the familiar interfaces of List.ToArray() and List.CopyTo(). For Array, the same functions are also provided. When a new Array is returned by List.ToArray() or Array.ToArray(), that Array is brand new and has not been paged into system memory yet. Because multiple cores are being used, the HPCsharp versions of these functions are nearly 3X faster on a quad-core laptop with two memory channels.

For the second case, when the destination Array has already been paged in (being re-used), HPCsharp copy functions provide a slight speedup of about 15%, as the performance on my laptop is already near the available system memory bandwidth.

On systems with more memory channels, such as Intel Xeon or AMD EPYC workstation/cloud processors, these functions may provide a much higher speedup.

HPCsharp Examples

Each function has a self-documenting interface that explains all of the arguments and discusses its unique attributes. This is a good place to start.

I’ve added examples of copy function usage to the Examples Visual Studio 2017 solution included with the HPCsharp open source repository, to demonstrate usage and to show that the interfaces are the same as C# Linq.

 
