What We Need Is More Heterogeneity

Current processors offer some variety in their types of computational engines. They consist of multiple identical general-purpose cores, embedded multi-core graphics processors (GPUs), and accelerators for video and audio compression. Recently, Intel has integrated a Field Programmable Gate Array (FPGA) into some server-class processors, for even more customizable computation.

Modern CPUs (Intel, AMD, ARM, MIPS and others) support multiple types of parallelism. Multiple cores are provided to support efficient execution by multiple threads. Each core supports serial and data-parallel computation. Serial computation operates on one value at a time, up to 64 bits wide. Data-parallel computation operates on registers up to 512 bits wide, split into 8-bit, 16-bit, 32-bit, or 64-bit values. Thus, eight 64-bit values can be operated on in parallel, or sixty-four 8-bit values in parallel.
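The lane arithmetic above can be checked with a few lines of Python. This is only a sketch of the counting, not of real vector instructions; the names `VECTOR_WIDTH_BITS` and `lanes_per_vector` are made up for illustration, and the 512-bit width is simply the widest case mentioned above.

```python
# Hypothetical helper: counts how many elements fit in one vector register.
VECTOR_WIDTH_BITS = 512  # widest data-parallel width mentioned in the text


def lanes_per_vector(element_bits, vector_bits=VECTOR_WIDTH_BITS):
    """Return how many elements of element_bits fit in one vector register."""
    if element_bits <= 0 or vector_bits % element_bits != 0:
        raise ValueError("element width must evenly divide the vector width")
    return vector_bits // element_bits


# Lane counts for the element sizes listed in the text.
lane_counts = {bits: lanes_per_vector(bits) for bits in (8, 16, 32, 64)}

for bits, lanes in lane_counts.items():
    print(f"{bits:2d}-bit elements: {lanes} values per operation")
```

Narrower registers give proportionally fewer lanes: passing `vector_bits=128` to the same helper yields four 32-bit values per operation.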

Specialized workloads such as video compression are supported by built-in video compression/decompression hardware units, which improve the mobile experience of watching online videos by lowering power consumption when decoding video. Specialized instructions are available to accelerate data encryption with the AES algorithm.

It’s amazing how diverse the computational hardware in current processors has become.

Graphics processors (GPUs) provide just as much compute variety as CPUs, with support for many threads, wide parallel computation and specialized computational engines for video compression/decompression. The latest GPUs add specialized hardware to support Artificial Intelligence and Ray Tracing.

Cloud vendors have been augmenting their compute nodes with large FPGAs, GPUs and Tensor Processing Units (TPUs) to support Artificial Intelligence and other highly demanding applications.


Software has also grown, matured and diversified in support of this hardware variety. Multi-threaded software is maturing with libraries, parallel patterns, thread pools and support for tasks in several modern languages, making it much simpler to write robust and consistent parallel code. Data-parallel computing is mainly supported by specialized libraries, with limited automatic compiler support.
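As a small illustration of the thread pool pattern mentioned above, here is a sketch using Python's standard `concurrent.futures` module. The helper names `chunked` and `parallel_sum`, the worker count, and the chunk size are arbitrary choices for the example. In CPython the global interpreter lock means threads will not speed up pure-Python arithmetic like this, so treat it as a demonstration of the pattern rather than of performance.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, chunk_size):
    """Yield consecutive slices of values, each at most chunk_size long."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


def parallel_sum(values, workers=4, chunk_size=1000):
    """Sum values by handing each chunk to a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk becomes one task; the pool schedules tasks onto threads.
        partial_sums = pool.map(sum, chunked(values, chunk_size))
        # Combine the per-chunk results into the final total.
        return sum(partial_sums)


data = list(range(10_000))
total = parallel_sum(data)
print(total)
```

The caller never creates or joins a thread directly; the pool owns the threads and the code only describes the tasks, which is much of what makes this style easier to get right.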
