To In-Place or To Not-In-Place

What Is It?

Computing constantly presents space-time trade-offs. To make an algorithm faster, more space can be used. Or, if space is at a premium, a slower but more space-efficient algorithm can be used. This kind of trade-off occurs in every aspect of computing, from software development to chip design.

In algorithms, an in-place algorithm is one that uses “no” extra space to accomplish its task. No extra space doesn’t really mean zero extra space; it means a constant amount of extra space is used, no matter the size of the problem. In other words, as the problem size grows, the amount of extra space used by the algorithm does not.

Examples of in-place algorithms are: Insertion Sort, Heap Sort, QuickSort, Most Significant Digit (MSD) Radix Sort (can be), Block Swap.
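
As a concrete illustration, here is a minimal sketch of Insertion Sort (from the list above) in C++: no matter how large the array is, it only ever uses one temporary element and a couple of indices, i.e., a constant amount of extra space.

#include <cstddef>
#include <vector>

// In-place example: Insertion Sort. The only extra storage is 'key' and
// two indices, so extra space is constant (O(1)) regardless of array size.
void insertionSort(std::vector<int>& a)
{
    for (std::size_t i = 1; i < a.size(); ++i)
    {
        int key = a[i];                     // single temporary element
        std::size_t j = i;
        while (j > 0 && a[j - 1] > key)     // shift larger elements right
        {
            a[j] = a[j - 1];
            --j;
        }
        a[j] = key;                         // drop the element into place
    }
}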

A not-in-place algorithm uses extra space to perform its task, potentially faster. The amount of extra space is directly related to the size of the problem. For example, a not-in-place algorithm might use an internal array the same size as the input array provided. Some algorithms may use several times more space.

Examples of not-in-place algorithms are: Merge Sort, Least Significant Digit (LSD) Radix Sort.
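
For contrast, here is a sketch of the merge step at the heart of a not-in-place Merge Sort, assuming the two halves of the array are already sorted: the temporary buffer is the same size as the input, so the extra space grows linearly with the problem size.

#include <algorithm>
#include <cstddef>
#include <vector>

// Not-in-place example: merging two sorted halves [0, mid) and [mid, n)
// uses a temporary buffer as large as the input array (O(n) extra space).
void mergeSortedHalves(std::vector<int>& a, std::size_t mid)
{
    std::vector<int> tmp(a.size());                  // same-size extra buffer
    auto middle = a.begin() + static_cast<std::ptrdiff_t>(mid);
    std::merge(a.begin(), middle,                    // left  sorted half
               middle, a.end(),                      // right sorted half
               tmp.begin());
    a = std::move(tmp);                              // adopt the merged result
}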

When It Is Beneficial

Not-in-place algorithms can be substantially faster than their in-place counterparts. For example, LSD Radix Sort is 2X faster than MSD Radix Sort, and the standard not-in-place Merge Sort is more than 2X faster than in-place Merge Sort.

At times, not-in-place algorithms can be quite a bit simpler than in-place ones. Merge Sort is a good example of this. Most algorithms books don’t discuss in-place Merge Sort because of its complexity.

The main benefit of in-place algorithms is the ability to handle larger data sets than not-in-place ones before exhausting memory resources. For instance, in-place Radix Sort can work on an array twice as large as not-in-place Radix Sort can, since the not-in-place sort needs a destination array separate from the source array.
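
As a rough illustration: with 32 GBytes of system memory and 4-byte array elements, an in-place sort could handle on the order of 8 billion elements, while a not-in-place sort that needs a same-size destination array tops out around 4 billion (ignoring memory used by the operating system and the rest of the application).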

When It Must Be Used

As the problem size grows and starts bumping into the memory size limit, allocation of additional memory by a not-in-place algorithm starts to fail. There is only so much memory available, and when it’s exhausted it usually means program failure. Most modern programs don’t handle running out of memory gracefully, as they don’t know what to do in that case, except to stop as cleanly as they can.

Switching dynamically from a not-in-place to an in-place algorithm is a wonderful way to keep a software application from exhausting memory resources. When the faster not-in-place algorithm tries to allocate additional memory and that allocation fails, a switch to an in-place algorithm is made instead, and the program is able to continue and produce results.
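
A minimal sketch of this pattern in C++ is shown below. This is not how the STL implements its fallback internally: std::partial_sort_copy merely stands in for a genuinely faster not-in-place sort (such as LSD Radix Sort), and the heap-based fallback stands in for any in-place algorithm.

#include <algorithm>
#include <new>
#include <vector>

// Sketch only: try the faster not-in-place path, which needs a full-size
// extra buffer; if that buffer cannot be allocated, fall back to an
// in-place algorithm (Heap Sort) that needs only constant extra space.
void adaptiveSort(std::vector<int>& a)
{
    try
    {
        std::vector<int> buffer(a.size());          // may throw std::bad_alloc
        std::partial_sort_copy(a.begin(), a.end(),  // stand-in for a faster
                               buffer.begin(),      // not-in-place sort that
                               buffer.end());       // writes into 'buffer'
        a.swap(buffer);                             // adopt the sorted copy
    }
    catch (const std::bad_alloc&)
    {
        std::make_heap(a.begin(), a.end());         // in-place Heap Sort:
        std::sort_heap(a.begin(), a.end());         // constant extra space
    }
}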

This dynamic switching methodology is used by C++ inside the Standard Template Library (STL). It’s also coming to C# inside the HPCsharp NuGet package.

When It Should Be Used

Most operating systems utilize virtual memory, which uses disk space to give one or more applications the illusion of more memory than the system memory actually provides. When applications allocate memory, the most recently and most often used memory is paged into system memory, where it is used. This paging in and out is much slower than accessing system memory.

When an application is being paged in and out, it doesn’t know paging is happening, but its performance drops dramatically. Under this condition, the application is not running out of memory: no out-of-memory exceptions are thrown when paging starts, since paging could begin simply because another application needs memory. No warning is provided by the operating system. The application has no idea this is happening, except that it’s running substantially slower; it’s still getting the memory it requested and most likely can request even more.

Going with a slower in-place algorithm may be the path to higher overall performance. The ideal scenario is for the application to avoid paging altogether. By using a slower in-place algorithm, paging is less likely to happen, and as long as paging is avoided, the slower algorithm makes the overall application faster, since paging causes a much bigger drop in performance.
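
One possible heuristic, sketched below for Linux/glibc only, is to compare the extra buffer a not-in-place algorithm would need against the physical memory currently available, and choose the in-place variant when the buffer would likely push the system into paging. The halving safety margin is an arbitrary, illustrative choice.

#include <unistd.h>   // sysconf (Linux/glibc)
#include <cstddef>

// Sketch only: estimate available physical memory and decide whether a
// not-in-place algorithm's extra buffer is likely to cause paging.
bool extraBufferFitsInRam(std::size_t bufferBytes)
{
    long pages    = sysconf(_SC_AVPHYS_PAGES);   // available physical pages (glibc)
    long pageSize = sysconf(_SC_PAGESIZE);       // bytes per page
    if (pages < 0 || pageSize < 0)
        return true;                             // query failed: assume it fits
    std::size_t availableBytes =
        static_cast<std::size_t>(pages) * static_cast<std::size_t>(pageSize);
    return bufferBytes < availableBytes / 2;     // leave headroom for the OS
}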

To Virtual Memory or Not

Some operating systems, such as Linux, allow configurations where paging to disk (swap) is disabled. This setup avoids paging altogether, along with the drop in performance that comes with it. In this case, an out-of-memory exception does indicate that the system is running out of memory, since virtual memory is then no larger than system memory.

With the latest generation of Solid State Drives (SSDs), which are several orders of magnitude faster than traditional mechanical hard drives, paging performance may not drop as substantially. The latest SSDs connect over M.2, which is PCI Express, supporting 5X the bandwidth of the older SATA connection and bringing disk bandwidth to 2.5 GBytes/sec. Gen4 M.2 SSDs ratchet performance up by another 2X (between 5 GBytes/sec and 7 GBytes/sec), claiming to be 12X faster than SATA SSDs. This is only a single order of magnitude away from system memory bandwidth. More exploration and benchmarking are needed to measure paging on these modern systems, as it may not be as severe as it used to be on older systems using hard drives with SATA or SCSI.
