Elo rating system

The Elo[a] rating system is a method for calculating the relative skill levels of players in zero-sum games such as chess or esports.

The difference between the ratings of the winner and loser determines the total number of points gained or lost after a game.

[4] The USCF used a numerical ratings system devised by Kenneth Harkness to enable members to track their individual progress in terms other than tournament wins and losses.

[5] At about the same time, György Karoly and Roger Cook independently developed a system based on the same principles for the New South Wales Chess Association.

A statistical endeavor, by contrast, uses a model that relates the game results to underlying variables representing the ability of each player.

Elo's central assumption was that the chess performance of each player in each game is a normally distributed random variable.

To simplify computation even further, Elo proposed a straightforward method of estimating the variables in his model (i.e., the true skill of each player).

One could calculate relatively easily from tables how many games players would be expected to win based on comparisons of their ratings to those of their opponents.
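The table lookup described above can be sketched in code. This is a minimal illustration, assuming the logistic curve with a 400-point scale rather than Elo's original normal-distribution tables. The function names `expected_score` and `expected_total` are hypothetical and do not come from any federation's software.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of player A against player B.

    Assumption: logistic curve with a 400-point scale; Elo's original
    tables were based on the normal distribution instead.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def expected_total(rating: float, opponent_ratings: list[float]) -> float:
    """Number of games a player is expected to win against a list of opponents."""
    return sum(expected_score(rating, r) for r in opponent_ratings)
```

Under these assumptions, a 1400-rated player facing an 1800-rated opponent has an expected score of about 0.09, and two equally rated players each have an expected score of 0.5.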

Elo's original suggestion, which is still widely used, was a simple linear adjustment proportional to the amount by which a player over-performed or under-performed their expected score.
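The linear adjustment can be written as a one-line update. The sketch below assumes a logistic expected score with a 400-point scale and a default K of 32. The names `expected_score` and `update_rating` are illustrative, not taken from any official implementation.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B (assumed logistic curve, 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_rating(rating: float, expected: float, actual: float, k: float = 32.0) -> float:
    """Linear adjustment: move the rating by K times (actual - expected) score."""
    return rating + k * (actual - expected)


# Example: a 1400-rated player A upsets an 1800-rated player B (actual score 1 for A).
e_a = expected_score(1400, 1800)
new_a = update_rating(1400, e_a, 1.0)        # A over-performed, so A gains points
new_b = update_rating(1800, 1.0 - e_a, 0.0)  # B under-performed, so B loses the same amount
```

With K = 32, the upset moves A up by roughly 29 points and B down by the same amount, while the expected result (B winning) would shift each rating by only about 3 points.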

Note that while two wins, two losses, and one draw may seem like a par score, it is worse than expected for player A because their opponents were lower rated on average.

And if the K-value is too low, the sensitivity will be minimal, and the system will not respond quickly enough to changes in a player's actual level of performance.
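The effect of a low K-value can be illustrated with a deliberately simplified sketch. It assumes a player whose results correspond to a strength of 1700 but who starts at 1500, facing only 1500-rated opponents whose ratings stay fixed. It also replaces random game results with their mean score, so it is not a realistic simulation. All names and numbers are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B (assumed logistic curve, 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def games_to_converge(k: float, start: float = 1500.0, true_strength: float = 1700.0,
                      opponent: float = 1500.0, tolerance: float = 50.0,
                      max_games: int = 10_000) -> int:
    """Count games until the rating is within `tolerance` of the true strength.

    Simplification: each game contributes the player's mean score instead of
    a random win, draw, or loss, and the opponent's rating never changes.
    """
    rating = start
    mean_score = expected_score(true_strength, opponent)
    for games in range(1, max_games + 1):
        rating += k * (mean_score - expected_score(rating, opponent))
        if true_strength - rating <= tolerance:
            return games
    return max_games


slow = games_to_converge(k=10)  # low K: many games needed to catch up
fast = games_to_converge(k=40)  # high K: far fewer games needed
```

In this setup the rating with the smaller K takes several times as many games to approach the player's actual level, which is the sluggishness the paragraph above describes.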

The USCF (which makes use of a logistic distribution as opposed to a normal distribution) formerly staggered the K-factor according to three main rating ranges. Currently, the USCF uses a formula that calculates the K-factor based on factors including the number of games played and the player's rating.

The above expressions can now be formally derived by exploiting the link between the Elo rating and the stochastic gradient update in logistic regression.
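A sketch of that derivation for a game with a binary outcome follows. The symbols for the loss, the step size, and the scale are notational assumptions introduced here: \ell is the log-loss, \eta is the gradient step size, s is the rating scale, S_A \in \{0, 1\} is the actual score of player A, and E_A is the expected score.

```latex
\begin{align}
E_A &= \sigma\!\left(\frac{R_A - R_B}{s}\right), \qquad
\sigma(z) = \frac{1}{1 + 10^{-z}}, \qquad s = 400 \\
\ell &= -\left[ S_A \ln E_A + (1 - S_A) \ln (1 - E_A) \right] \\
\frac{\partial \ell}{\partial R_A} &= -\frac{\ln 10}{s}\,(S_A - E_A) \\
R_A' &= R_A - \eta\,\frac{\partial \ell}{\partial R_A}
      = R_A + \underbrace{\frac{\eta \ln 10}{s}}_{K}\,(S_A - E_A)
\end{align}
```

The third line uses \sigma'(z) = \ln 10 \cdot \sigma(z)\,(1 - \sigma(z)). Under these assumptions, one stochastic gradient step on the log-loss of a single game reproduces the linear update, with the K-factor playing the role of a rescaled step size.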

Since the very beginning, the Elo rating has also been used in chess, where we observe wins, losses, or draws; to deal with the latter, a fractional score value of 0.5 is introduced.

To address these difficulties, and to derive the Elo rating in the ternary games, we will define the explicit probabilistic model of the outcomes.

This approach to pairing certainly maximizes the rating risk of the higher-rated participants, who may face very stiff opposition from players below 3000, for example.

[46] Some methods, used in Norway for example, differentiate between juniors and seniors and use a larger K-factor for the young players, even boosting the rating progress by 100% when they score well above their predicted performance.

This also combats deflation, but the chairman of the USCF Ratings Committee has been critical of this method because it does not feed the extra points to the improving players.

Jeff Sagarin of USA Today publishes team rankings for most American sports, which include Elo system ratings for college football.

[55] In pool, an Elo-based system called Fargo Rate is used to rank players in organized amateur and professional competitions.

The MOBA game League of Legends used an Elo rating system prior to the second season of competitive play.

[67] MechWarrior Online instituted an Elo system for its new "Comp Queue" mode, effective with the June 20, 2017 patch.

In 1998, an online gaming ladder called Clanbase[72] was launched, which used the Elo scoring system to rank teams.

[74] A similar alternative site was launched in 2016 under the name Scrimbase,[75] which also used the Elo scoring system for ranking teams.

Comparative descriptions were utilized alongside the Elo rating system to provide robust and discriminative 'relative measurements', permitting accurate identification.

The Elo rating system has also been used in biology for assessing male dominance hierarchies,[78] and in automation and computer vision for fabric inspection.

[84] The YouTuber Marques Brownlee and his team used the Elo rating system when they let people vote between digital photos taken with different smartphone models launched in 2022.

[85] The Elo rating system has also been used in U.S. revealed preference college rankings, such as those by the digital credential firm Parchment.

[91] The Elo rating system was featured prominently in the 2010 film The Social Network during the algorithm scene where Mark Zuckerberg released Facemash.

In the scene Eduardo Saverin writes mathematical formulas for the Elo rating system on Zuckerberg's dormitory room window.

Arpad Elo, the inventor of the Elo rating system
Graphs of probabilities and Elo rating changes (for K = 16 and 32) of expected outcome (solid curve) and unexpected outcome (dotted curve) vs. initial rating difference. For example, player A starts with a 1400 rating and B with 1800 in a tournament using K = 32 (brown curves). The blue dash-dot line denotes the initial rating difference of 400 (1800 − 1400). The probability of B winning, the expected outcome, is 0.91 (intersection of black solid curve and blue line); if this happens, A's rating decreases by 3 (intersection of brown solid curve and blue line) to 1397 and B's increases by the same amount to 1803. Conversely, the probability of A winning, the unexpected outcome, is 0.09 (intersection of black dotted curve and blue line); if this happens, A's rating increases by 29 (intersection of brown dotted curve and blue line) to 1429 and B's decreases by the same amount to 1771.