Google data centers are the large data center facilities Google uses to provide their services, which combine large drives, computer nodes organized in aisles of racks, internal and external networking, environmental controls (mainly cooling and humidification control), and operations software (especially as concerns load balancing and fault tolerance).
To handle this peak load, they built a compute cluster with ~15,000 commodity-class PCs instead of expensive supercomputer hardware to save money.
There were several CPU generations in use, ranging from single-processor 533MHz Intel-Celeron-based servers to dual 1.4GHz Intel Pentium III.
How this is measured is unclear, but it is likely to incorporate running costs of the entire server, and CPU power consumption could be a significant factor.
[80] Servers as of 2009–2010 consisted of custom-made open-top systems containing two processors (each with several cores[81]), a considerable amount of RAM spread over 8 DIMM slots housing double-height DIMMs, and at least two SATA hard disk drives connected through a non-standard ATX-sized power supply unit.
According to CNET and a book by John Hennessy, each server had a novel 12-volt battery to reduce costs and improve power efficiency.
In order to run such a large network, with direct connections to as many ISPs as possible at the lowest possible cost, Google has a very open peering policy.
This public network is used to distribute content to Google users as well as to crawl the internet to build its search indexes.
These custom switch-routers are connected to DWDM devices to interconnect data centers and point of presences (PoP) via dark fiber.
From an operation standpoint, when a client computer attempts to connect to Google, several DNS servers resolve www.google.com into multiple IP addresses via Round Robin policy.
[96] To support fault tolerance, increase the scale of data centers and accommodate low-radix switches, Google has adopted various modified Clos topologies in the past.
[97] One of the largest Google data centers is located in the town of The Dalles, Oregon, on the Columbia River, approximately 80 miles (129 km) from Portland.
Codenamed "Project 02", the complex was built in 2006 and is approximately the size of two American football fields, with cooling towers four stories high.
[98][99] The site was chosen to take advantage of inexpensive hydroelectric power, and to tap into the region's large surplus of fiber optic cable, a remnant of the dot-com boom.
[100] In February 2009, Stora Enso announced that they had sold the Summa paper mill in Hamina, Finland to Google for 40 million Euros.
[105] In 2013, the press revealed the existence of Google's floating data centers along the coasts of the states of California (Treasure Island's Building 3) and Maine.
[115] Google has acknowledged that Python has played an important role from the beginning, and that it continues to do so as the system grows and evolves.
[91] To lessen the effects of unavoidable hardware failure, software is designed to be fault tolerant.
Initially, the index was being served from hard disk drives, as is done in traditional information retrieval (IR) systems.
This switch "radically changed many design parameters" of their search system, and allowed for a significant increase in throughput and a large decrease in latency of queries.
[136] In October 2013, The Washington Post reported that the U.S. National Security Agency intercepted communications between Google's data centers, as part of a program named MUSCULAR.