How Is Elo Rating Calculated

ELO Rating Calculator

Calculate the new ELO ratings for two players after a match using the standard ELO system

How Is ELO Rating Calculated: The Complete Guide

The ELO rating system is a method for calculating the relative skill levels of players in competitor-versus-competitor games such as chess, esports, and other head-to-head competitions. Developed by Hungarian-American physics professor Arpad Elo in the 1960s, this system has become the standard for rating players in various competitive fields.

Understanding the ELO Rating System

The ELO system is based on the principle that the performance of a player in a game is a normally distributed random variable. The core idea is that:

  • Each player has a true skill level that determines their probability of winning against other players
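  • Beating a higher-rated opponent earns more points than beating a lower-rated one, because the win was less expected
  • The system is zero-sum when both players share the same K-factor: the points one player gains are the points the other loses, so the total stays constant (excluding new players)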

The Basic ELO Formula

The fundamental ELO formula for calculating new ratings after a game is:

New Rating = Old Rating + K × (Result – Expected Score)

Where:

  • K is the development coefficient (K-factor)
  • Result is 1 for a win, 0.5 for a draw, and 0 for a loss
  • Expected Score is the score a player is predicted to earn based on current ratings: the probability of winning plus half the probability of drawing
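The update rule above translates almost directly into code. The following is a minimal sketch; the function name `update_rating` and its parameter names are illustrative choices, not part of any standard library.

```python
def update_rating(old_rating: float, k: float, result: float, expected: float) -> float:
    """Apply New Rating = Old Rating + K * (Result - Expected Score)."""
    return old_rating + k * (result - expected)


# An evenly matched game (expected score 0.5) with K = 20:
win = update_rating(1500, 20, 1.0, 0.5)   # 1510.0
draw = update_rating(1500, 20, 0.5, 0.5)  # 1500.0
loss = update_rating(1500, 20, 0.0, 0.5)  # 1490.0
```

A draw against an equal opponent changes nothing, because the result matches the expectation exactly.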

Calculating Expected Score

The expected score for Player A against Player B is calculated using:

$$E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}}$$

Where $R_A$ and $R_B$ are the current ratings of Player A and Player B respectively.
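As a quick check of the expected score formula, here is a small sketch in Python. The function name `expected_score` is an illustrative choice.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B (between 0 and 1)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


e_a = expected_score(1600, 1500)
e_b = expected_score(1500, 1600)
print(round(e_a, 3), round(e_b, 3))  # 0.64 0.36
```

The two expected scores always add up to 1, so only one of them needs to be computed in practice.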

The K-Factor: Development Coefficient

The K-factor determines how much a player’s rating can change after each game. Different organizations use different K-factors:

Player Type          | Typical K-Factor | Description
Beginners            | 30-40            | New players with fewer than 30 games
Intermediate Players | 20-30            | Players with 30-100 games
Established Players  | 10-20            | Players with 100+ games
Masters              | 10               | Players rated 2400+

In our calculator, we’ve included options for K-factors of 10, 20, 30, and 40 to cover different scenarios. A K-factor of 40 is selected by default; it makes ratings respond quickly, which suits new players, so choose a lower value when rating established players.
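A tiered K-factor schedule can be expressed as a small lookup function. The sketch below is only an illustration: the function name `select_k_factor` and the exact thresholds (30 games, 2400 rating) and values (40, 20, 10) are assumptions, and each organization defines its own rules.

```python
def select_k_factor(games_played: int, rating: float) -> int:
    """Pick a K-factor from a tiered schedule (illustrative thresholds)."""
    if games_played < 30:
        return 40  # provisional players: let the rating move quickly
    if rating >= 2400:
        return 10  # masters: keep ratings stable
    return 20      # everyone else


print(select_k_factor(games_played=5, rating=1500))    # 40
print(select_k_factor(games_played=200, rating=1800))  # 20
print(select_k_factor(games_played=500, rating=2450))  # 10
```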

Practical Example of ELO Calculation

Let’s walk through a concrete example to understand how ELO ratings are calculated:

  1. Initial Ratings: Player A has 1600, Player B has 1500
  2. K-factor: 32 (common in chess)
  3. Match Result: Player A wins

Step 1: Calculate Expected Scores

$$E_A = \frac{1}{1 + 10^{(1500 - 1600)/400}} = \frac{1}{1 + 10^{-0.25}} \approx 0.64$$

$$E_B = 1 - E_A \approx 0.36$$

Step 2: Determine Actual Results

Player A wins (1 point), Player B loses (0 points)

Step 3: Calculate New Ratings

New Rating A = 1600 + 32 × (1 – 0.64) = 1611.52

New Rating B = 1500 + 32 × (0 – 0.36) = 1488.48

After rounding, Player A would have 1612 and Player B would have 1488. Player A gains exactly the points Player B loses.
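The walkthrough above can be reproduced in a few lines of Python. This is a sketch under the same inputs (1600 vs. 1500, K = 32); the helper names `expected_score` and `play` are illustrative. It uses unrounded expected scores, so its results may differ slightly from figures rounded by hand.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def play(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the unrounded new ratings of A and B after one game."""
    e_a = expected_score(rating_a, rating_b)
    e_b = 1.0 - e_a
    new_a = rating_a + k * (score_a - e_a)
    new_b = rating_b + k * ((1.0 - score_a) - e_b)
    return new_a, new_b


# Favorite wins: a small gain for A, a small loss for B.
new_a, new_b = play(1600, 1500, score_a=1.0)
print(f"A wins: {new_a:.1f} / {new_b:.1f}")

# Upset: the same pairing, but B wins, so the swing is larger.
upset_a, upset_b = play(1600, 1500, score_a=0.0)
print(f"B wins: {upset_a:.1f} / {upset_b:.1f}")
```

Comparing the two printed lines shows why an upset moves ratings more than an expected result.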

ELO Rating Systems in Different Games

The ELO system has been adapted for various competitive games and sports:

Game/Sport              | Typical K-Factor | Initial Rating                        | Special Rules
Chess (FIDE)            | 10-40            | 1200 (beginners), 1500 (intermediate) | Different K-factors based on player level
League of Legends       | Varies by tier   | 1200 (Iron) to 2500+ (Challenger)     | LP system with promotion series
FIFA (EA Sports)        | Varies           | 1000-1500                             | Skill rating with divisions
American Football (NFL) | 20-30            | 1500                                  | Team ratings with home field advantage
eSports (General)       | 30-50            | 1000-1500                             | Often modified with additional factors

Common Misconceptions About ELO

Despite its widespread use, there are several misunderstandings about the ELO system:

  1. ELO measures absolute skill: ELO only measures relative skill within a specific player pool. A 2000-rated chess player isn’t necessarily twice as good as a 1000-rated player.
  2. Higher K-factor is always better: While a higher K-factor means faster rating changes, it can lead to more volatility. Most systems reduce the K-factor as players become more established.
  3. ELO accounts for all factors: The basic ELO system doesn’t consider home advantage, player fatigue, or other contextual factors that might affect performance.
  4. All rating systems are ELO: Many games use modified or completely different systems (like Glicko or TrueSkill) that build on ELO’s principles but add features of their own.

Advanced ELO Variations

While the basic ELO system works well for many applications, several variations have been developed to address specific needs:

Glicko Rating System

Developed by Mark Glickman, the Glicko system extends ELO by adding a ratings deviation (RD) that measures the reliability of a player’s rating. Key features:

  • Accounts for rating uncertainty
  • Ratings change more dramatically when a player has few games
  • RD decreases as more games are played
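To make the role of RD concrete, here is a sketch of a single rating-period update following the original Glicko (Glicko-1) formulas as commonly described by Glickman. It is a simplified illustration, not a complete implementation: it omits the step that increases RD after periods of inactivity, and the names `g`, `glicko_update`, and the `results` tuple layout are assumptions made for this example.

```python
import math

Q = math.log(10) / 400  # scaling constant used throughout Glicko-1


def g(rd: float) -> float:
    """Discount factor: the less certain the opponent's rating, the smaller g."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def glicko_update(rating: float, rd: float, results):
    """Update one player from a list of (opp_rating, opp_rd, score) tuples."""
    if not results:
        return rating, rd
    info = 0.0   # accumulates 1 / d^2
    delta = 0.0  # accumulates g * (score - expected)
    for opp_rating, opp_rd, score in results:
        g_j = g(opp_rd)
        e_j = 1.0 / (1.0 + 10.0 ** (-g_j * (rating - opp_rating) / 400.0))
        info += Q**2 * g_j**2 * e_j * (1.0 - e_j)
        delta += g_j * (score - e_j)
    denom = 1.0 / rd**2 + info
    return rating + (Q / denom) * delta, math.sqrt(1.0 / denom)


# Same win against the same opponent, but different uncertainty:
uncertain = glicko_update(1500, 350, [(1500, 50, 1.0)])
settled = glicko_update(1500, 50, [(1500, 50, 1.0)])
print(uncertain, settled)
```

The player with the large RD gains far more from the same win, and both players come out with a smaller RD than they started with.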

TrueSkill System

Developed by Microsoft Research for Xbox Live, TrueSkill is a Bayesian skill rating system that:

  • Handles team games and multiplayer matches
  • Provides uncertainty measurements
  • Supports draw probabilities

Elo-MMR Systems

Many modern games combine ELO with Matchmaking Rating (MMR) systems that:

  • Use ELO as a base but add additional factors
  • Often hide the exact rating from players
  • May use different algorithms for different skill brackets

Mathematical Foundations of ELO

The ELO system is based on several mathematical principles:

Logistic Function

The expected score formula uses a logistic function to convert rating differences into probabilities. The formula:

$$E_1 = \frac{1}{1 + 10^{-(R_1 - R_2)/400}}$$

This creates an S-shaped curve where:

  • A 0-point difference gives a 50% win probability
  • A 400-point difference gives about 91%/9% probabilities
  • The relationship is symmetric: the two expected scores always sum to 1, so if A is 200 points higher than B and expects about 0.76, B expects about 0.24
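The shape of the curve is easy to see by tabulating a few rating differences. This is a small sketch; the function name `expected_from_diff` is an illustrative choice.

```python
def expected_from_diff(diff: float) -> float:
    """Expected score for the player who is `diff` points higher (or lower if negative)."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


for diff in (0, 100, 200, 400, 800):
    print(f"{diff:>4}  {expected_from_diff(diff):.3f}")
```

The printed values climb from 0.500 at a 0-point difference to 0.909 at 400 points and 0.990 at 800 points, flattening out as the gap grows.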

Zero-Sum Property

In a two-player game where both players share the same K-factor, the total rating points remain constant (excluding any bonus points for new players):

$$\Delta R_1 + \Delta R_2 = 0$$

Every point one player gains is a point the other loses. If the two players have different K-factors, the changes no longer cancel exactly.
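The zero-sum property follows directly from the update rule. In the short derivation below, $S_1$ and $S_2$ denote the actual results of the two players (symbols introduced here for illustration), and both players are assumed to share the same K-factor $K$. The results of one game always sum to 1, and so do the two expected scores.

```latex
\begin{aligned}
\Delta R_1 + \Delta R_2
  &= K\,(S_1 - E_1) + K\,(S_2 - E_2) \\
  &= K\,\bigl[(S_1 + S_2) - (E_1 + E_2)\bigr] \\
  &= K\,(1 - 1) = 0
\end{aligned}
```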

Rating Inflation and Deflation

While the two-player system is zero-sum, real-world implementations often experience:

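  • Inflation: When new players enter with ratings above their actual strength and lose the surplus points to the rest of the pool, causing established players’ ratings to rise over time
  • Deflation: When players enter underrated and take points from established players as they improve, or leave the system holding more points than they entered with, causing ratings to drop for those who remain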

Many rating systems include mechanisms to control inflation/deflation, such as:

  • Adjusting initial ratings for new players
  • Periodic rating resets or adjustments
  • Bonus points for high-performance players

Implementing ELO in Your Own Applications

If you’re developing a game or competitive system, here’s how to implement ELO:

  1. Choose initial ratings: Common starting points are 1200-1500, depending on your player base.
  2. Select K-factors: Consider different K-factors for different skill levels.
  3. Handle new players: Decide whether to give new players bonus points to encourage participation.
  4. Implement the formula: Use the standard ELO formula or a variation.
  5. Store historical data: Keep records of rating changes for analysis.
  6. Prevent abuse: Implement measures against rating manipulation.

Here’s a simple Python implementation:

def calculate_elo(rating1, rating2, result, k_factor):
    """
    Calculate new ELO ratings
    :param rating1: Current rating of player 1
    :param rating2: Current rating of player 2
    :param result: 1 if player 1 wins, 0.5 for draw, 0 if player 2 wins
    :param k_factor: Development coefficient
    :return: Tuple of (new_rating1, new_rating2)
    """
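    # Expected scores from the logistic curve; the two always sum to 1
    expected1 = 1 / (1 + 10 ** ((rating2 - rating1) / 400))
    expected2 = 1 / (1 + 10 ** ((rating1 - rating2) / 400))

    # Move each rating toward the actual result, scaled by the K-factor
    new_rating1 = rating1 + k_factor * (result - expected1)
    new_rating2 = rating2 + k_factor * ((1 - result) - expected2)

    # Round to whole rating points for display
    return round(new_rating1), round(new_rating2)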
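Several of the steps listed earlier (initial ratings, storing history, basic input checks) can be combined in a small container class. The following is a sketch under stated assumptions: the class name `Ladder`, its fields, and the defaults of 1500 and K = 32 are hypothetical choices for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Ladder:
    """Keeps unrounded ratings and a log of every rated game."""
    initial_rating: float = 1500.0
    k_factor: float = 32.0
    ratings: dict[str, float] = field(default_factory=dict)
    history: list[tuple] = field(default_factory=list)

    def rating(self, player: str) -> float:
        # New players are added at the initial rating on first lookup.
        return self.ratings.setdefault(player, self.initial_rating)

    def record(self, player_a: str, player_b: str, score_a: float) -> None:
        """Record one game; score_a is 1 (A wins), 0.5 (draw), or 0 (B wins)."""
        if player_a == player_b:
            raise ValueError("a player cannot play against themselves")
        if score_a not in (0.0, 0.5, 1.0):
            raise ValueError("score_a must be 0, 0.5, or 1")
        r_a, r_b = self.rating(player_a), self.rating(player_b)
        expected_a = 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))
        change = self.k_factor * (score_a - expected_a)
        self.ratings[player_a] = r_a + change
        self.ratings[player_b] = r_b - change  # same K, so the update is zero-sum
        self.history.append((player_a, player_b, score_a, change))


ladder = Ladder()
ladder.record("alice", "bob", 1.0)
ladder.record("bob", "carol", 0.5)
print({name: round(r) for name, r in ladder.ratings.items()})
```

Storing unrounded ratings and rounding only for display avoids rounding errors accumulating over many games.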

Criticisms and Limitations of ELO

While widely used, the ELO system has some limitations:

  1. Assumes performance is normally distributed: In reality, player performance might follow different distributions.
  2. Doesn’t account for team dynamics: Basic ELO struggles with team games where individual performance is hard to measure.
  3. Ignores contextual factors: Factors like home advantage, player fatigue, or equipment differences aren’t considered.
  4. Sensitive to initial conditions: The starting ratings can significantly affect long-term rating distributions.
  5. Encourages conservative play: Players might avoid risky strategies to protect their rating.

Despite these limitations, ELO remains popular due to its simplicity and effectiveness for most competitive scenarios.
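Some of these limitations can be patched with small modifications. For example, one common way to account for home advantage is to add a fixed bonus to the home side’s rating before computing the expected score. The sketch below illustrates the idea; the function name `expected_home_score` and the bonus of 50 points are hypothetical, and a real system would estimate the bonus from its own match data.

```python
HOME_ADVANTAGE = 50.0  # hypothetical bonus in rating points, not a standard value


def expected_home_score(home_rating: float, away_rating: float,
                        home_advantage: float = HOME_ADVANTAGE) -> float:
    """Expected score of the home side, with the bonus applied to its rating."""
    boosted = home_rating + home_advantage
    return 1.0 / (1.0 + 10.0 ** ((away_rating - boosted) / 400.0))


print(expected_home_score(1500, 1500))     # slightly above 0.5
print(expected_home_score(1500, 1500, 0))  # 0.5, plain ELO
```

The bonus only affects the prediction; the ratings that get stored and updated remain the unadjusted ones.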

ELO in the Real World: Case Studies

Chess

The World Chess Federation (FIDE) uses a modified ELO system where:

  • K-factors vary by player level (10 for top players, up to 40 for beginners)
  • Ratings are published monthly
  • A player must complete at least 9 games to receive an official rating
  • The minimum rating is 1000; there is no theoretical maximum, but the top ratings in practice sit around 2800

As of 2023, the highest FIDE rating ever achieved was 2882 by Magnus Carlsen.

League of Legends

Riot Games uses a proprietary system based on ELO principles for their ranked ladder:

  • Players are divided into tiers (Iron, Bronze, Silver, Gold, Platinum, Diamond, Master, Grandmaster, Challenger)
  • Each tier has divisions (IV to I)
  • LP (League Points) are used within divisions
  • The system considers both individual and team performance

Unlike pure ELO, the League system includes promotion series and demotion shields to reduce volatility.

FIFA Video Game

EA Sports’ FIFA series uses a modified ELO system for its Ultimate Team online matches:

  • Players start at 1000 skill rating
  • Divisions range from 10 (lowest) to 1 (highest), plus Elite division
  • Weekend League qualifiers use a separate ELO-like system
  • The matchmaking tries to pair players with similar skill ratings

Academic Research on Rating Systems

The ELO system has been extensively studied in academic literature. Notable research includes:

  1. Elo’s Original Book: “The Rating of Chessplayers, Past and Present” (1978) by Arpad Elo remains the foundational work.
  2. Glickman’s Glicko System: “The Glicko System” (1999) introduced the concept of ratings deviation to address uncertainty in player ratings.
  3. TrueSkill Paper: “TrueSkill™: A Bayesian Skill Rating System” (2007) by Herbrich et al. from Microsoft Research presented a more sophisticated probabilistic model.

For those interested in the mathematical foundations, the American Mathematical Society has published several papers on rating systems and their statistical properties.

Future of Rating Systems

As competitive gaming and esports continue to grow, rating systems are evolving:

  • Machine Learning Approaches: Some systems now use ML to predict outcomes more accurately by considering more factors.
  • Behavioral Analysis: New systems incorporate player behavior metrics to identify smurfs and boosters.
  • Dynamic K-factors: Some games now adjust K-factors in real-time based on performance consistency.
  • Cross-game Rating: Efforts are underway to create rating systems that work across multiple games.

The National Institute of Standards and Technology (NIST) has shown interest in standardized rating systems for competitive integrity in esports.

Conclusion

The ELO rating system remains one of the most influential and widely used methods for measuring relative skill in competitive games. Its simplicity, combined with its effectiveness at predicting match outcomes, has ensured its longevity across diverse applications from chess to video games to sports.

While more sophisticated systems like Glicko and TrueSkill have addressed some of ELO’s limitations, the core principles of the ELO system continue to form the foundation of most competitive rating systems today. Understanding how ELO works not only helps players comprehend their rating changes but also provides valuable insights into the mathematics of competitive balance.

Whether you’re a competitive player looking to understand your rating fluctuations, a game developer implementing a ranking system, or simply curious about the mathematics behind competitive gaming, the ELO system offers a fascinating intersection of statistics, psychology, and game theory.
