Titan (supercomputer)

Titan, or OLCF-3, was a supercomputer built by Cray at Oak Ridge National Laboratory for use in a variety of science projects.

Titan was eclipsed at Oak Ridge by Summit in 2019, which was built by IBM and featured fewer nodes with much greater GPU capability per node, as well as local per-node non-volatile caching of file data from the system's parallel file system.

Titan employed AMD Opteron CPUs in conjunction with Nvidia Tesla GPUs to improve energy efficiency while providing an order of magnitude increase in computational power over Jaguar.

Titan was available for any scientific purpose; access depended on the importance of the project and its potential to exploit the hybrid architecture.

The code modifications required to exploit the GPUs typically increased the degree of parallelism, given that GPUs offer many more simultaneous threads than CPUs.
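
As an illustrative sketch only (the loop, names, and sizes below are hypothetical, not code from any Titan application), such a port often amounted to restructuring a loop so that each iteration could map to its own GPU thread, here expressed with an OpenACC-style directive of the kind associated with Titan's applications:

    #define N 1000000

    /* Serial version: a single CPU thread walks the whole array. */
    void saxpy_cpu(float *y, const float *x, float a) {
        for (long i = 0; i < N; i++)
            y[i] = a * x[i] + y[i];
    }

    /* GPU version: the directive asks the compiler to offload the loop,
       so each of the N iterations can run as its own GPU thread. */
    void saxpy_gpu(float *restrict y, const float *restrict x, float a) {
        #pragma acc parallel loop copy(y[0:N]) copyin(x[0:N])
        for (long i = 0; i < N; i++)
            y[i] = a * x[i] + y[i];
    }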

ORNL's external ESnet connection was upgraded from 10 Gbit/s to 100 Gbit/s, and the system interconnect (the network over which CPUs communicate with each other) was updated.

Beginning on September 13, 2012, Nvidia K20X GPUs were fitted to all of Jaguar's XK7 compute blades, including the 960 TitanDev nodes.

Titan used Jaguar's 200 cabinets, covering 404 square meters (4,352 sq ft), with replaced internals and upgraded networking.

Power was provided to each cabinet as three-phase 480 V. Because the higher voltage draws less current for the same power than the US-standard 208 V, it allowed thinner cables, saving $1 million in copper.

Although the GPUs had a slower clock speed than the CPUs, each GPU contained 2,688 CUDA cores at 732 MHz,[43] resulting in a faster overall system.
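
As a rough, back-of-the-envelope illustration (treating core count times clock rate as a crude proxy for parallel throughput, and assuming the 16-core, 2.2 GHz Opteron 6274 that Titan used, a figure not given in this section; CUDA cores and CPU cores are not directly comparable):

    \[
    \underbrace{2688 \times 0.732\ \text{GHz}}_{\text{one K20X GPU}} \approx 1968\ \text{core-GHz}
    \qquad\text{versus}\qquad
    \underbrace{16 \times 2.2\ \text{GHz}}_{\text{one Opteron CPU}} \approx 35\ \text{core-GHz}
    \]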

Consequently, the CPUs' cores were used to allocate tasks to the GPUs rather than directly processing the data as in conventional supercomputers.
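
A minimal sketch of that division of labor (hypothetical names and stencil, in the same illustrative OpenACC-style C as above, not code from an actual Titan application): the CPU core drives the time-stepping loop and keeps the data resident on the GPU, while the GPU performs the bulk arithmetic in each step.

    #define N     1048576
    #define STEPS 100

    /* One-dimensional relaxation: the CPU only orchestrates. */
    void relax(float *restrict cur, float *restrict nxt) {
        /* Keep both arrays on the GPU for the whole run; the CPU loop
           below just dispatches work and never touches the elements. */
        #pragma acc data copy(cur[0:N]) create(nxt[0:N])
        for (int step = 0; step < STEPS; step++) {  /* CPU: control flow */
            #pragma acc parallel loop               /* GPU: bulk compute */
            for (long i = 1; i < N - 1; i++)
                nxt[i] = 0.5f * (cur[i - 1] + cur[i + 1]);
            #pragma acc parallel loop               /* GPU: carry result into next step */
            for (long i = 1; i < N - 1; i++)
                cur[i] = nxt[i];
        }
    }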

Fan noise was so loud that hearing protection was required for people spending more than 15 minutes in the machine room.

In 2009, the Oak Ridge Leadership Computing Facility, which managed Titan, narrowed the fifty candidate applications for first use of the supercomputer down to six "vanguard" codes, chosen for the importance of the research and for their ability to fully utilize the system.

OLCF formed the Center for Accelerated Application Readiness (CAAR) to aid with the adaptation process.

It held developer workshops at Nvidia headquarters to educate users about the architecture, compilers, and applications on Titan.

CAAR worked on compilers with Nvidia and code vendors to integrate directives for GPUs into their programming languages.
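
For instance (an illustrative sketch, assuming the OpenACC-style directives that grew out of this effort; the function itself is hypothetical), a single annotation can offload an otherwise unchanged loop, and a compiler without directive support simply ignores the pragma and compiles the same code for the CPU:

    /* Dot product: one added directive offloads the loop to the GPU
       and combines the per-thread partial sums. */
    float dot(const float *restrict a, const float *restrict b, long n) {
        float sum = 0.0f;
        #pragma acc parallel loop reduction(+:sum) copyin(a[0:n], b[0:n])
        for (long i = 0; i < n; i++)
            sum += a[i] * b[i];
        return sum;
    }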

"[56] Moab Cluster Suite is used to prioritize jobs to nodes to keep utilization high; it improved efficiency from 70% to approximately 95% in the tested software.

According to Dr. Messer of the NRDF project, only a small percentage of his code ran on GPUs, because the calculations involved are relatively simple but are processed repeatedly and in parallel.
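
This pattern, in which a small fraction of the source code accounts for most of the parallel work, is commonly framed with Amdahl's law (a general formula, not a figure from the Titan project): if a fraction \(p\) of a program's runtime can be accelerated by a factor \(s\), the overall speedup is bounded by

    \[
    S = \frac{1}{(1 - p) + p/s},
    \]

so the serial remainder \(1 - p\) limits the benefit no matter how fast the GPU portion runs.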