Since it was originally based on an earlier GPU design (codenamed "Larrabee") by Intel[6] that was cancelled in 2009,[7] it shared application areas with GPUs.
The first products took the form of PCI Express-based add-on cards; a second-generation product, codenamed Knights Landing, was announced in June 2013.
[12] The Larrabee microarchitecture (in development since 2006[14]) introduced very wide (512-bit) SIMD units to an x86 architecture based processor design, extended to a cache-coherent multiprocessor system connected via a ring bus to memory; each core was capable of four-way multithreading.
Because the design was intended for GPU as well as general-purpose computing, the Larrabee chips also included specialised hardware for texture sampling.
[17] Another contemporary Intel research project implementing the x86 architecture on a many-core processor was the 'Single-chip Cloud Computer' (prototype introduced 2009[18]), a design mimicking a cloud computing datacentre on a single chip with multiple independent cores. The prototype design included 48 cores per chip, with hardware support for selective frequency and voltage control of cores to maximize energy efficiency, and incorporated a mesh network for inter-chip messaging.
[22][23] Intel's Many Integrated Core (MIC) prototype board, named Knights Ferry and incorporating a processor codenamed Aubrey Isle, was announced on 31 May 2010.
The product was stated to be a derivative of the Larrabee project and other Intel research including the Single-chip Cloud Computer.
[29] Initial developers included CERN, Korea Institute of Science and Technology Information (KISTI) and Leibniz Supercomputing Centre.
[36] On 18 June 2012, Intel announced at the 2012 Hamburg International Supercomputing Conference that Xeon Phi would be the brand name used for all products based on its Many Integrated Core architecture.
[45] An important component of the Intel Xeon Phi coprocessor's core is its vector processing unit (VPU).
The VPU also supports fused multiply-add (FMA) instructions, each of which counts as two floating-point operations (a multiply and an add), and hence can execute 32 SP or 16 DP floating-point operations per cycle.
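The per-cycle figures can be reproduced with simple arithmetic. The sketch below assumes a 512-bit vector width (the width given above for the Larrabee SIMD units) and counts each FMA as two operations per lane; the constant and function names are illustrative only and do not come from any Intel tool or API.

```python
# Illustrative arithmetic only; all names here are hypothetical.
VECTOR_WIDTH_BITS = 512  # assumed VPU vector register width
OPS_PER_FMA = 2          # a fused multiply-add counts as one multiply plus one add


def flops_per_cycle(element_bits: int, vector_bits: int = VECTOR_WIDTH_BITS) -> int:
    """Peak floating-point operations per cycle for one VPU issuing one FMA per cycle."""
    lanes = vector_bits // element_bits  # number of elements that fit in one vector
    return lanes * OPS_PER_FMA


sp = flops_per_cycle(32)  # single precision: 32-bit elements, 16 lanes
dp = flops_per_cycle(64)  # double precision: 64-bit elements, 8 lanes
print(f"SP: {sp} ops/cycle, DP: {dp} ops/cycle")
```

Under these assumptions the result is 32 single-precision or 16 double-precision operations per cycle, matching the figures quoted above. This is a peak rate for a single core's VPU and says nothing about sustained throughput.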
[78] The National Energy Research Scientific Computing Center announced that Phase 2 of its newest supercomputing system "Cori" would use Knights Landing Xeon Phi coprocessors.
[86] Knights Mill is Intel's codename for a Xeon Phi product specialized in deep learning,[99] initially released in December 2017.
[102] Knights Hill was expected to be used in the United States Department of Energy Aurora supercomputer, to be deployed at Argonne National Laboratory.
[105][106] In 2017, Intel announced that Knights Hill had been canceled in favor of another architecture, built from the ground up, intended to enable exascale computing in the future.
[109] Other studies in various domains, such as life sciences[110] and deep learning,[111] have shown that exploiting the thread- and SIMD-parallelism of Xeon Phi achieves significant speed-ups.