QPACE

The main goal was the design of an application-optimized, scalable architecture that outperforms commercial products in terms of compute performance, price-performance ratio, and energy efficiency.

The system architecture is also suitable for other applications that mainly rely on nearest-neighbor communication, e.g., lattice Boltzmann methods.
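In such applications each lattice site is updated using only the values of its immediate neighbors, so inter-node traffic stays local. As an illustrative sketch of that communication pattern (plain serial Python, not QPACE or lattice-QCD code), a relaxation sweep on a periodic one-dimensional lattice:

```python
# Illustrative sketch of a nearest-neighbor (stencil) update on a
# periodic 1D lattice -- the communication pattern QPACE targets.
# This is a serial model for exposition, not actual QPACE software.

def stencil_step(field):
    """One relaxation sweep: each site is averaged with its two
    nearest neighbors, using periodic (torus-like) boundaries."""
    n = len(field)
    return [
        (field[(i - 1) % n] + field[i] + field[(i + 1) % n]) / 3.0
        for i in range(n)
    ]

field = [0.0, 0.0, 3.0, 0.0, 0.0]
field = stencil_step(field)
# after one sweep, information has spread only to nearest neighbors
```

On a parallel machine the lattice is partitioned across nodes, and each sweep requires exchanging only the boundary sites with the adjacent nodes, which is why a torus of nearest-neighbor links suffices.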

[2] QPACE topped the Green500 list of the most energy-efficient supercomputers in November 2009 and defended the title in June 2010, when the architecture achieved an energy efficiency of 773 MFLOPS per watt in the Linpack benchmark.
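The Green500 figure of merit is simply sustained Linpack performance divided by total power draw. A minimal sketch of that arithmetic, with hypothetical numbers chosen for illustration (not QPACE's measured performance or power):

```python
# Green500-style energy-efficiency metric:
# MFLOPS per watt = sustained Linpack performance / total power.
# The example numbers below are hypothetical, for illustration only.

def mflops_per_watt(linpack_gflops, power_watts):
    """Convert sustained GFLOPS and watts to MFLOPS per watt."""
    return linpack_gflops * 1000.0 / power_watts

# e.g. a hypothetical machine sustaining 773 GFLOPS at 1 kW
efficiency = mflops_per_watt(773.0, 1000.0)   # 773.0 MFLOPS per watt
```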

[6][7][8] The PowerXCell 8i is one of the building blocks of the IBM Roadrunner cluster, which was the first supercomputer architecture to break the PFLOPS barrier.

Cluster architectures based on the PowerXCell 8i typically rely on IBM BladeCenter blade servers interconnected by industry-standard networks such as InfiniBand.

Each node card hosts one PowerXCell 8i, 4 GB of DDR2 SDRAM with ECC, one Xilinx Virtex-5 FPGA and seven network transceivers.

The QPACE network co-processor is implemented on a Xilinx Virtex-5 FPGA, which is directly connected to the I/O interface of the PowerXCell 8i.

[9][10] The functional behavior of the FPGA is defined by a hardware description language and can be changed at any time at the cost of rebooting the node card.

Each node card is mounted to a thermal box, which acts as a large heat sink for heat-critical components.