According to Google, PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is.
A PageRank value results from a mathematical algorithm based on the webgraph, in which all World Wide Web pages are nodes and hyperlinks are edges, taking into consideration authority hubs such as cnn.com or mayoclinic.org.
The goal is to find an effective means of ignoring links from documents with falsely influenced PageRank.
[9][10] The eigenvalue problem was also suggested in 1976 by Gabriel Pinski and Francis Narin, who worked on the scientometric ranking of scientific journals,[11] in 1977 by Thomas Saaty in his concept of the Analytic Hierarchy Process, which weighted alternative choices,[12] and in 1995 by Bradley Love and Steven Sloman as a cognitive model for concepts, the centrality algorithm.
[13][14] A search engine called "RankDex" from IDD Information Services, designed by Robin Li in 1996, developed a strategy for site-scoring and page-ranking.
[19][20] Google founder Larry Page referenced Li's work as a citation in some of his U.S. patents for PageRank.
[21][17][22] Larry Page and Sergey Brin developed PageRank at Stanford University in 1996 as part of a research project about a new kind of search engine.
An interview with Héctor García-Molina, Stanford Computer Science professor and advisor to Sergey Brin,[23] provides background on the development of the PageRank algorithm.
[5] Rajeev Motwani and Terry Winograd co-authored with Page and Brin the first paper about the project, describing PageRank and the initial prototype of the Google search engine, published in 1998.
[5][31] The PageRank algorithm outputs a probability distribution used to represent the likelihood that a person randomly clicking on links will arrive at any particular page.
It is assumed in several research papers that the distribution is evenly divided among all documents in the collection at the beginning of the computational process.
The PageRank values are the entries of the dominant right eigenvector of the modified adjacency matrix rescaled so that each column adds up to one.
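As a minimal sketch of what a modified adjacency matrix with columns summing to one can look like, the following pure-Python snippet builds such a matrix for a hypothetical four-page web. The damping factor d = 0.85, the function names, the example links, and the treatment of pages without out-links (a uniform jump to any page) are assumptions made for illustration, not details taken from the text above.

```python
def google_matrix(links, n, d=0.85):
    """Return the n x n modified adjacency matrix as a list of rows.

    links[j] lists the pages that page j links to. Column j describes
    where a surfer on page j goes next, so every column sums to one.
    """
    matrix = [[0.0] * n for _ in range(n)]
    for j in range(n):
        targets = links.get(j, [])
        for i in range(n):
            if targets:
                # Fraction of page j's out-links that point at page i.
                follow = targets.count(i) / len(targets)
            else:
                # Assumption: a page with no out-links jumps anywhere.
                follow = 1.0 / n
            # Follow a link with probability d, otherwise jump uniformly.
            matrix[i][j] = d * follow + (1.0 - d) / n
    return matrix


def column_sums(matrix):
    """Sum each column; a column-stochastic matrix gives all ones."""
    n = len(matrix)
    return [sum(matrix[i][j] for i in range(n)) for j in range(n)]


# Hypothetical 4-page web: 0 -> 1 and 2, 1 -> 2, 2 -> 0, 3 has no out-links.
links = {0: [1, 2], 1: [2], 2: [0], 3: []}
M = google_matrix(links, 4)
sums = column_sums(M)
```

Under these assumptions, a PageRank vector r is one that satisfies M·r = r with entries summing to one, which is the dominant right eigenvector described above.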
Because of the large eigengap of the modified adjacency matrix above,[33] the values of the PageRank eigenvector can be approximated to within a high degree of accuracy within only a few iterations.
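The iterative approximation can be illustrated with a short power-iteration sketch. It starts from the evenly divided distribution mentioned above and repeatedly redistributes rank along the links until the change between rounds falls below a tolerance. The damping factor 0.85, the tolerance, the function name `pagerank`, the example graph, and the handling of pages without out-links are all assumptions chosen for illustration; this is not the implementation used by any search engine.

```python
def pagerank(links, n, d=0.85, tol=1e-10, max_iter=1000):
    """Approximate PageRank by power iteration.

    links[j] lists the pages that page j links to.
    Returns (rank, iterations_used).
    """
    rank = [1.0 / n] * n  # evenly divided starting distribution
    for iteration in range(1, max_iter + 1):
        # Rank held by pages without out-links is spread uniformly (assumption).
        dangling = sum(rank[j] for j in range(n) if not links.get(j))
        new_rank = [(1.0 - d) / n + d * dangling / n] * n
        for j in range(n):
            targets = links.get(j, [])
            for i in targets:
                # Page j passes an equal share of its rank along each out-link.
                new_rank[i] += d * rank[j] / len(targets)
        # L1 distance between successive estimates.
        change = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if change < tol:
            return rank, iteration
    return rank, max_iter


# Hypothetical 4-page web: 0 -> 1 and 2, 1 -> 2, 2 -> 0, 3 has no out-links.
links = {0: [1, 2], 1: [2], 2: [0], 3: []}
rank, iterations = pagerank(links, 4)
```

The returned `iterations` value shows how many rounds were needed for this toy graph; it depends on the tolerance and damping factor chosen here and says nothing about web-scale graphs.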
Google's founders, in their original paper,[31] reported that the PageRank algorithm for a network consisting of 322 million links (in-edges and out-edges) converges to within a tolerable limit in 52 iterations.
[34] Various strategies to manipulate PageRank have been employed in concerted efforts to improve search results rankings and monetize advertising links.
These strategies have severely impacted the reliability of the PageRank concept,[citation needed] which purports to determine which documents are actually highly valued by the Web community.
For such graphs two related positive or nonnegative irreducible matrices corresponding to vertex partition sets can be defined.
Sarma et al. describe two random walk-based distributed algorithms for computing PageRank of nodes in a network.
In both algorithms, each node processes and sends a number of bits per round that is polylogarithmic in n, the network size.
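To give a feel for how random walks can estimate PageRank, here is a single-machine Monte Carlo sketch. It is not the distributed algorithm of Sarma et al.: there is no message passing and no per-round bit budget. Each page simply starts a fixed number of walks, every walk stops with probability 1 − d at each step (or when it reaches a page without out-links), and visit counts are normalized into a distribution. The names, the damping factor d = 0.85, the number of walks, the fixed seed, and the example graph are assumptions for illustration.

```python
import random


def random_walk_pagerank(links, n, d=0.85, walks_per_node=2000, seed=0):
    """Estimate PageRank from visit counts of terminating random walks."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    visits = [0] * n
    for start in range(n):
        for _ in range(walks_per_node):
            page = start
            while True:
                visits[page] += 1
                targets = links.get(page, [])
                # Stop with probability 1 - d, or when there is no out-link.
                if rng.random() > d or not targets:
                    break
                page = rng.choice(targets)
    total = sum(visits)
    return [v / total for v in visits]


# Hypothetical 4-page web: 0 -> 1 and 2, 1 -> 2, 2 -> 0, 3 has no out-links.
links = {0: [1, 2], 1: [2], 2: [0], 3: []}
estimate = random_walk_pagerank(links, 4)
```

Because each walk only needs to know the out-links of the page it currently occupies, this style of computation lends itself to settings where every node holds only local information, which is the situation the distributed algorithms address.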
[42] In March 2016 Google announced it would no longer support this feature, and the underlying API would soon cease to operate.
The SERP rank of a web page is a function not only of its PageRank, but of a relatively large and continuously adjusted set of factors (over 200).
Search engine optimization (SEO) is aimed at influencing the SERP rank for a website or a set of web pages.
[49] When Google elaborated on the reasons for PageRank deprecation in its March 2016 Q&A, it named links and content as the top ranking factors.
For search engine optimization purposes, some companies offer to sell high PageRank links to webmasters.
[54] Even though PageRank has become less important for SEO purposes, the existence of back-links from more popular websites continues to push a webpage higher up in search rankings.
This model is based on a query-dependent PageRank score of a page, which, as the name suggests, is also a function of the query.
[59][60] In any ecosystem, a modified version of PageRank may be used to determine species that are essential to the continuing health of the environment.
[61] A similar newer use of PageRank is to rank academic doctoral programs based on their records of placing their graduates in faculty positions.
[62] A version of PageRank has recently been proposed as a replacement for the traditional Institute for Scientific Information (ISI) impact factor,[63] and implemented at Eigenfactor as well as at SCImago.
[citation needed] In 2005, in a pilot study in Pakistan, Structural Deep Democracy (SD2)[69][70] was used for leadership selection in a sustainable agriculture group called Contact Youth.