How Elo Is Calculated

Comprehensive Guide: How ELO Rating is Calculated

The ELO rating system (the name is a surname, not an acronym, so it is also written “Elo”) was developed by Hungarian-American physics professor Arpad Elo around 1960 and has become a standard way to measure relative skill in competitive games. Originally designed for chess, it’s now used in video games, sports, and even academic ranking systems. This guide explains the mathematical foundation, practical applications, and nuances of ELO calculations.

1. The Core ELO Formula

The ELO system operates on several key principles:

  1. Rating Difference Determines Probability: The greater the difference between two players’ ratings, the higher the probability that the higher-rated player will win.
  2. Points Exchange After Matches: After each match, points are transferred from the loser to the winner, with the amount depending on the upset magnitude.
  3. Dynamic Adjustment: Ratings continuously adjust based on match outcomes, creating a self-correcting system.

The fundamental ELO formula for calculating the expected score (E) of Player A against Player B is:

E_A = 1 / (1 + 10^{(R_B - R_A)/400})

Where:

  • E_A = Expected score for Player A
  • R_A = Current rating of Player A
  • R_B = Current rating of Player B
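
As a quick numerical check of the formula, here is a minimal Python sketch. The function name expected_score and the sample ratings are illustrative choices, not part of any official specification.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B on the standard 400-point logistic scale."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


e_a = expected_score(1600, 1500)  # A is rated 100 points higher
e_b = expected_score(1500, 1600)  # the same pairing seen from B's side

print(f"E_A = {e_a:.3f}, E_B = {e_b:.3f}, sum = {e_a + e_b:.3f}")
```

The two expected scores always add up to 1, so a single call is enough to describe both sides of a pairing.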

2. Rating Adjustment After a Match

After a match, ratings are updated based on:

  1. The actual result (1 for win, 0.5 for draw, 0 for loss)
  2. The expected result (calculated from the formula above)
  3. The K-factor (development coefficient)

The new rating is calculated as:

R_A(new) = R_A + K × (S_A - E_A)

Where:

  • R_A(new) = New rating for Player A
  • K = K-factor (development coefficient)
  • S_A = Actual result (1, 0.5, or 0)
  • E_A = Expected result from the first formula

| Rating Difference | Expected Score for Higher-Rated Player | Expected Score for Lower-Rated Player | Points Exchanged if Higher-Rated Player Wins (K=32) |
| --- | --- | --- | --- |
| 0 | 0.50 | 0.50 | 16.00 |
| 100 | 0.64 | 0.36 | 11.52 |
| 200 | 0.76 | 0.24 | 7.69 |
| 300 | 0.85 | 0.15 | 4.83 |
| 400 | 0.91 | 0.09 | 2.91 |
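
The update rule can be tabulated directly. The sketch below assumes K = 32 and a lower-rated player fixed at 1500 (both arbitrary choices for illustration) and prints the points the favourite gains for a win and loses for an upset at several rating gaps.

```python
def expected_score(rating_a, rating_b):
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def rating_change(rating, opponent, score, k=32):
    """Points gained (positive) or lost (negative) for one game."""
    return k * (score - expected_score(rating, opponent))


for diff in (0, 100, 200, 300, 400):
    higher, lower = 1500 + diff, 1500
    win = rating_change(higher, lower, 1.0)   # favourite wins
    loss = rating_change(higher, lower, 0.0)  # favourite is upset
    print(f"diff={diff:3d}  favourite wins: {win:+6.2f}  favourite loses: {loss:+6.2f}")
```

The asymmetry grows with the gap: a 400-point favourite gains under 3 points for a win but drops about 29 for a loss.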

3. The K-Factor Explained

The K-factor determines how much a player’s rating can change after a single match. Different organizations use different K-factors:

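  • FIDE (Chess): 40 for new players, 20 for established players rated below 2400, 10 for players who have reached 2400
  • USCF (Chess): Historically 32 below 2100, 24 from 2100 to 2400, and 16 above 2400; the current formula derives K from the number of games played
  • FIFA (Soccer): A match-importance coefficient that ranges from 5 for minor friendlies to 60 for late-stage World Cup matches
  • League of Legends: Not officially published; commonly described as varying by tier (higher in lower tiers)
  • Chess.com: Glicko-based system in which the effective K varies with rating certainty and game type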

Higher K-factors mean:

  • More volatile ratings
  • Faster convergence to “true” rating
  • Greater rewards for upsets
  • More punishment for losses

Lower K-factors mean:

  • More stable ratings
  • Slower adjustment to true skill level
  • Less dramatic changes from single matches
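
To make the trade-off concrete, the following sketch runs the same hypothetical streak (a 1500-rated player beating five 1700-rated opponents in a row) under three K-factors. The scenario and the helper name rating_after_wins are assumptions chosen purely for illustration.

```python
def expected_score(rating_a, rating_b):
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def rating_after_wins(k, wins=5, start=1500.0, opponent=1700.0):
    """Rating after a run of consecutive wins against identically rated opponents."""
    rating = start
    for _ in range(wins):
        rating += k * (1.0 - expected_score(rating, opponent))
    return rating


for k in (10, 20, 40):
    print(f"K={k:2d}: rating after 5 wins = {rating_after_wins(k):.0f}")
```

A larger K moves the underrated player toward 1700 much faster, but the same sensitivity would also amplify a lucky or unlucky streak.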

4. Practical Examples

Let’s examine some concrete examples to understand how ELO works in practice:

Example 1: Evenly Matched Players

Player A: 1500
Player B: 1500
K-factor: 32
Result: Player A wins

Calculation:

  1. Expected score for A: 1 / (1 + 10^{(1500 - 1500)/400}) = 0.5
  2. New rating for A: 1500 + 32 × (1 – 0.5) = 1516
  3. New rating for B: 1500 + 32 × (0 – 0.5) = 1484

Example 2: Upset Victory

Player A: 1800
Player B: 2200
K-factor: 24
Result: Player A wins (upset)

Calculation:

  1. Expected score for A: 1 / (1 + 10^{(2200 - 1800)/400}) = 1/11 ≈ 0.09
  2. New rating for A: 1800 + 24 × (1 - 0.09) ≈ 1822
  3. New rating for B: 2200 + 24 × (0 - 0.91) ≈ 2178

Example 3: Expected Victory

Player A: 2000
Player B: 1600
K-factor: 32
Result: Player A wins (as expected)

Calculation:

  1. Expected score for A: 1 / (1 + 10^{(1600 - 2000)/400}) = 10/11 ≈ 0.91
  2. New rating for A: 2000 + 32 × (1 - 0.91) ≈ 2003
  3. New rating for B: 1600 + 32 × (0 - 0.09) ≈ 1597
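
All three examples end in a win, but the same formula covers draws. The sketch below reuses the pairing from Example 2 (1800 vs 2200, K = 24) and scores the game as a draw; the helper name update_pair is an illustrative choice.

```python
def expected_score(rating_a, rating_b):
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_pair(rating_a, rating_b, score_a, k):
    """Return both new ratings; score_a is 1, 0.5, or 0 from A's point of view."""
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Same pairing as Example 2, but the game is drawn instead of won.
new_a, new_b = update_pair(1800, 2200, 0.5, k=24)
print(f"A: {new_a:.1f}, B: {new_b:.1f}")
```

Because the underdog was expected to score only about 0.09, a draw still earns Player A nearly 10 points at the favourite's expense.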

5. Mathematical Properties of ELO

The ELO system has several important mathematical properties:

  1. Zero-Sum Game: When both players share the same K-factor, the points one player gains equal the points the other loses, so the total in the system stays constant (ignoring new players).
  2. Convergence: With sufficient games, players’ ratings tend to converge toward their “true” skill levels, provided that skill is stable.
  3. Shift Invariance: Adding the same constant to all ratings doesn’t change the expected scores, because only rating differences enter the formula.
  4. Transitivity: If A is favored over B and B over C, then A is favored over C (though not necessarily by the sum of the individual advantages).
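
Two of these properties are easy to verify numerically. The sketch below assumes both players use the same K-factor (32 here) and uses arbitrary sample ratings; it checks that the pool total is unchanged by a game and that adding a constant to every rating leaves the expected score alone.

```python
def expected_score(rating_a, rating_b):
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_pair(rating_a, rating_b, score_a, k=32):
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Zero-sum: the pool total is unchanged when both players share the same K.
a, b = 1650.0, 1480.0
new_a, new_b = update_pair(a, b, 0.0)  # the higher-rated player loses
total_before, total_after = a + b, new_a + new_b

# Adding a constant to every rating leaves the expected score unchanged.
base = expected_score(a, b)
shifted = expected_score(a + 1000, b + 1000)

print(total_before, round(total_after, 6), round(base, 6), round(shifted, 6))
```

If the two players had different K-factors, the two deltas would no longer cancel and the total would drift.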

6. Common Variations and Extensions

While the basic ELO system is powerful, many organizations have developed variations:

Elo-MOD (Modified ELO)

Used in some sports, this modification:

  • Considers margin of victory
  • Uses different K-factors based on the importance of the match
  • May include home-field advantage
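
A rough sketch of how these three adjustments might fit together follows. Everything specific here is an assumption for illustration: the multiplier tiers loosely follow the style used by some football ELO sites, and the 100-point home advantage, base K of 20, and importance weight are hypothetical defaults rather than any federation's official values.

```python
def expected_score(rating_a, rating_b):
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def mov_multiplier(goal_diff):
    """Illustrative margin-of-victory multiplier (assumed tiers, not a standard)."""
    n = abs(goal_diff)
    if n <= 1:
        return 1.0
    if n == 2:
        return 1.5
    return 1.75 + (n - 3) / 8.0


def modified_update(home, away, home_goals, away_goals,
                    base_k=20.0, importance=1.0, home_advantage=100.0):
    # Home advantage shifts the expectation only; it is not stored in the rating.
    e_home = expected_score(home + home_advantage, away)
    if home_goals > away_goals:
        score = 1.0
    elif home_goals == away_goals:
        score = 0.5
    else:
        score = 0.0
    k = base_k * importance * mov_multiplier(home_goals - away_goals)
    delta = k * (score - e_home)
    return home + delta, away - delta


narrow = modified_update(1500.0, 1500.0, 1, 0)  # 1-0 home win
wide = modified_update(1500.0, 1500.0, 3, 0)    # 3-0 home win
print(narrow, wide)
```

With these assumed settings the 3-0 win is worth 1.75 times as many points as the 1-0 win, and the home side gains less than it would on neutral ground because it was already favoured.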

Glicko and Glicko-2

Developed by Mark Glickman, these systems:

  • Include a ratings deviation (RD) that measures reliability
  • Account for competitive inactivity
  • Are used by online chess platforms such as Lichess (Glicko-2) and Chess.com (Glicko)
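
For readers who want to see how the ratings deviation enters the arithmetic, here is a minimal sketch of a single rating-period update in the original Glicko system, written from the published update equations. It deliberately omits the step that inflates RD after inactivity, and the function names are illustrative.

```python
import math

Q = math.log(10) / 400.0


def g(rd):
    """Discount factor for an opponent's rating deviation."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def glicko_update(rating, rd, games):
    """One rating-period update; games is a list of (opp_rating, opp_rd, score)."""
    d2_inv = 0.0  # accumulates 1 / d^2
    total = 0.0   # accumulates g(RD_j) * (s_j - E_j)
    for opp_rating, opp_rd, score in games:
        g_j = g(opp_rd)
        e_j = 1.0 / (1.0 + 10.0 ** (-g_j * (rating - opp_rating) / 400.0))
        d2_inv += Q**2 * g_j**2 * e_j * (1.0 - e_j)
        total += g_j * (score - e_j)
    denom = 1.0 / rd**2 + d2_inv
    new_rating = rating + (Q / denom) * total
    new_rd = math.sqrt(1.0 / denom)
    return new_rating, new_rd


# A 1500-rated player with RD 200 wins one game and loses two.
new_r, new_rd = glicko_update(1500, 200, [(1400, 30, 1), (1550, 100, 0), (1700, 300, 0)])
print(f"rating {new_r:.0f}, RD {new_rd:.1f}")
```

For these inputs the rating falls to about 1464 while RD shrinks from 200 to roughly 151, showing that every game played makes the rating more certain regardless of the result.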

TrueSkill (Microsoft)

Used in Xbox Live, this Bayesian system:

  • Models uncertainty with Gaussian distributions
  • Handles team games better than basic ELO
  • Considers both skill and consistency
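
The Gaussian modelling can be illustrated with a heavily simplified sketch of the two-player, no-draw case. This is not Microsoft's implementation: it omits the draw margin, the dynamics term that slowly re-inflates uncertainty, and all team handling, and it assumes the commonly quoted defaults of μ = 25, σ = 25/3, and β = 25/6.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def trueskill_1v1(winner, loser, beta=25.0 / 6.0):
    """Simplified two-player, no-draw update; each player is a (mu, sigma) tuple."""
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser
    c = math.sqrt(2.0 * beta**2 + sigma_w**2 + sigma_l**2)
    t = (mu_w - mu_l) / c
    v = STD_NORMAL.pdf(t) / STD_NORMAL.cdf(t)  # size of the mean shift
    w = v * (v + t)                            # size of the variance shrink
    new_mu_w = mu_w + sigma_w**2 / c * v
    new_mu_l = mu_l - sigma_l**2 / c * v
    new_sigma_w = sigma_w * math.sqrt(1.0 - sigma_w**2 / c**2 * w)
    new_sigma_l = sigma_l * math.sqrt(1.0 - sigma_l**2 / c**2 * w)
    return (new_mu_w, new_sigma_w), (new_mu_l, new_sigma_l)


new_player = (25.0, 25.0 / 3.0)
winner, loser = trueskill_1v1(new_player, new_player)
print(f"winner: mu={winner[0]:.3f}, sigma={winner[1]:.3f}")
print(f"loser:  mu={loser[0]:.3f}, sigma={loser[1]:.3f}")
```

Both players end up with a smaller σ after the game, and the size of the μ shift depends on how uncertain each rating was, which is the key difference from a fixed K-factor.
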

| System | Key Features | Primary Use Cases | Rating Range |
| --- | --- | --- | --- |
| Basic ELO | Simple pairwise comparisons, zero-sum | Chess, 1v1 games | Typically 100-3000 |
| Glicko | Includes ratings deviation, handles inactivity | Online gaming, sports | Starts at 1500 (with RD) |
| Glicko-2 | Improved volatility measurement | Competitive gaming, esports | Starts at 1500 (with RD) |
| TrueSkill | Bayesian, handles teams, uncertainty modeling | Xbox Live, team games | Starts at μ = 25 (with σ) |
| Elo-MOD | Considers score difference, match importance | Sports (soccer, basketball) | Varies by sport |

7. Real-World Applications

The ELO system and its variants are used in numerous fields:

Chess

The original and most famous application. FIDE (World Chess Federation) maintains official ELO ratings for competitive players worldwide. The system helps:

  • Match players of similar skill levels
  • Determine tournament seedings
  • Identify rising talents
  • Track player development over time

Video Games

Many competitive games use some form of ELO, although the exact algorithms are often unpublished:

  • League of Legends: Uses a hidden matchmaking rating generally described as a modified ELO system
  • Dota 2: Uses a Glicko-based system for its ranked mode
  • Counter-Strike: Uses a proprietary system inspired by ELO
  • Rocket League: Uses a modified ELO with uncertainty measurements

Sports

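Sports federations and analysts use ELO or similar systems:

  • FIFA: Has used an ELO-based formula for the men’s world ranking since 2018
  • World Rugby: Uses a points-exchange system similar to ELO
  • NBA: Some analysts use ELO to predict game outcomes
  • NFL: Analysts such as FiveThirtyEight have published ELO-based team ratings, though these are not official league rankings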

Academic and Other Uses

ELO principles are applied in:

  • University ranking systems
  • Product recommendation algorithms
  • Political election forecasting
  • Peer review systems for academic papers

8. Strengths and Limitations

Strengths

  • Simplicity: Easy to understand and implement
  • Self-correcting: Naturally adjusts to player improvement
  • Scalable: Works for any number of players
  • Fair matching: Encourages competitive balance
  • Predictive power: Good at estimating match outcomes

Limitations

  • Assumes equal variance: Every rating is treated as equally reliable, however few games it is based on
  • No margin of victory: Basic ELO only considers win, draw, or loss
  • Initial ratings matter: Starting points can bias long-term ratings
  • Inflation/deflation: Without proper controls, average rating can drift
  • Team games difficult: Basic ELO struggles with team dynamics

9. How to Improve Your ELO

Understanding how ELO works can help you improve your rating:

  1. Play Against Higher-Rated Opponents: Winning against higher-rated players gives more points than expected wins.
  2. Avoid Unnecessary Losses: Losing to lower-rated players costs more points than losing to higher-rated ones.
  3. Focus on Consistency: In systems with ratings deviation (like Glicko), consistent performance reduces uncertainty.
  4. Understand the Meta: In games, knowing the current optimal strategies helps win “unwinnable” matches.
  5. Analyze Your Games: Reviewing losses against higher-rated players can reveal improvement areas.
  6. Play Regularly: Inactivity can increase your ratings deviation in some systems.
  7. Manage Tilt: Emotional control prevents losing streaks that devastate ratings.

10. Common Misconceptions

Several myths persist about ELO systems:

  1. “ELO measures absolute skill”: It only measures relative skill within the rated population.
  2. “Higher K-factor is always better”: Higher K-factors lead to more volatile ratings, which may not reflect true skill.
  3. “You can’t improve your ELO after a certain point”: Ratings can always change with sufficient evidence (matches).
  4. “ELO is only for 1v1 games”: While basic ELO is pairwise, extensions handle teams well.
  5. “The rating number itself is meaningful”: Only differences between ratings matter (the scale is arbitrary).

11. Advanced Topics

Rating Inflation and Deflation

Over time, rating systems can experience:

  • Inflation: Average rating increases (for example when rating floors or bonus points add points to the pool)
  • Deflation: Average rating decreases (for example when improving players enter underrated and take points from established players)

Solutions include:

  • Periodic rating resets
  • Dynamic K-factors that adjust based on rating
  • Bonus pools for new players
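
A small sketch shows one way the pool total can drift at all. It assumes a hypothetical rating floor of 100: when a player already at the floor loses, the winner still collects points, so new points enter the system.

```python
def expected_score(rating_a, rating_b):
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_pair_with_floor(rating_a, rating_b, score_a, k=32, floor=100.0):
    """Standard update, except that no rating may drop below the floor."""
    delta = k * (score_a - expected_score(rating_a, rating_b))
    new_a = max(floor, rating_a + delta)
    new_b = max(floor, rating_b - delta)
    return new_a, new_b


# A player sitting exactly on the floor loses to an equally rated opponent.
before = (100.0, 100.0)
after = update_pair_with_floor(*before, score_a=0.0)
print(sum(before), sum(after))  # 200.0 216.0
```

The loser stays at 100 while the winner rises to 116, so the pool grows by 16 points. Repeated across many games, adjustments like this are why systems need explicit controls on the average rating.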

Handling New Players

New players present challenges:

  • Initial Rating: Where to start? Too high/low causes problems.
  • Uncertainty: New players have unknown true skill levels.
  • Exploitation: Smurf accounts (high-skilled players with new accounts).

Common solutions:

  • Provisional ratings with higher K-factors
  • Placement matches before entering the main pool
  • Behavioral analysis to detect smurfs
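
The first of these solutions, a provisional period with a higher K-factor, might be sketched as follows. The thresholds (30 provisional games, K of 40, 20, and 10, a 2400 cut-off) are hypothetical values loosely modeled on tiered schemes like the chess ones listed earlier, not a specific organization's rules.

```python
def expected_score(rating_a, rating_b):
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def k_factor(games_played, rating, provisional_games=30):
    """Hypothetical schedule: large K while provisional, smaller K once established."""
    if games_played < provisional_games:
        return 40
    if rating < 2400:
        return 20
    return 10


# A newcomer starts at 1500 and wins three games against 1500-rated opponents.
rating = 1500.0
for games_played in range(3):
    k = k_factor(games_played, rating)
    rating += k * (1.0 - expected_score(rating, 1500.0))
print(round(rating, 1))
```

The large provisional K lets a strong newcomer (or a smurf account) leave the beginner pool quickly, which limits the damage to genuinely new players.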

Team ELO Calculations

For team games, systems typically:

  1. Calculate each team’s average rating
  2. Treat the match as a comparison between these averages
  3. Distribute the team’s rating change to individual players

Example for a 5v5 game:

Team A average: 1800
Team B average: 1700
Result: Team A wins

Team A gains fewer points than if they’d won against a team rated 1800, because they were favored.
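
The three steps can be sketched in a few lines. The individual ratings below are invented so that the team averages match the example (1800 and 1700), and giving every member the full team delta is just one simple convention; real systems may weight the change by individual rating or performance.

```python
def expected_score(rating_a, rating_b):
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def team_update(team_a, team_b, score_a, k=32):
    """Compare team averages, then apply the same delta to every member."""
    avg_a = sum(team_a) / len(team_a)
    avg_b = sum(team_b) / len(team_b)
    delta = k * (score_a - expected_score(avg_a, avg_b))
    return [r + delta for r in team_a], [r - delta for r in team_b], delta


team_a = [1900, 1850, 1800, 1750, 1700]  # average 1800
team_b = [1800, 1750, 1700, 1650, 1600]  # average 1700
new_a, new_b, delta = team_update(team_a, team_b, score_a=1.0)
print(f"each Team A player gains {delta:.2f} points")
```

With K = 32, each Team A player gains about 11.5 points, compared with 16 for beating an equally rated team.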

12. Historical Context and Development

The ELO system was developed around 1960 by Arpad Elo, a physics professor at Marquette University. Key milestones:

  • 1960: The United States Chess Federation (USCF) adopts Elo’s rating system
  • 1970: FIDE adopts the ELO system for international chess
  • 1978: Elo publishes “The Rating of Chessplayers, Past and Present”
  • 1990s: Online gaming begins using ELO for matchmaking; Mark Glickman introduces Glicko
  • 2000s: Glicko-2 and TrueSkill systems developed
  • 2010s: ELO principles applied to machine learning and recommendation systems

Elo’s book (available through AMS) remains a widely cited work on competitive rating systems.

13. Implementing Your Own ELO System

To implement a basic ELO system:

  1. Choose initial ratings (often 1500 for new players)
  2. Select a K-factor (32 is common for new systems)
  3. After each match:
    1. Calculate expected scores for both players
    2. Determine actual results (1, 0.5, or 0)
    3. Update ratings using the formula
  4. Consider adding:
    • Rating floors/ceilings
    • Inactivity decay
    • Provisional status for new players
    • Margin of victory considerations

Here’s a simple Python implementation:

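def expected_score(rating_a, rating_b):
    """Return A's expected score (between 0 and 1) against B."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def update_ratings(rating_a, rating_b, result_a, k_factor=32):
    """Return the new ratings (new_a, new_b) after one game."""
    # result_a: 1 for win, 0.5 for draw, 0 for loss (from A's point of view)
    ea = expected_score(rating_a, rating_b)
    new_a = rating_a + k_factor * (result_a - ea)
    # B's result is 1 - result_a and B's expected score is 1 - ea
    new_b = rating_b + k_factor * ((1 - result_a) - (1 - ea))
    return new_a, new_b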

# Example usage:
rating_a, rating_b = 1500, 1600
new_a, new_b = update_ratings(rating_a, rating_b, 1)  # Player A wins
print(f"New ratings: A={new_a:.1f}, B={new_b:.1f}")
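
Building on the functions above, a small rating table for a whole pool of players might look like the sketch below. The class name EloTable, the player names, and the match list are all hypothetical; the defaults (1500 starting rating, K = 32) follow the suggestions in the steps above.

```python
from collections import defaultdict


class EloTable:
    """Tracks ratings for a pool of players; unseen players start at `initial`."""

    def __init__(self, initial=1500.0, k=32.0):
        self.k = k
        self.ratings = defaultdict(lambda: initial)

    def expected(self, a, b):
        return 1.0 / (1.0 + 10.0 ** ((self.ratings[b] - self.ratings[a]) / 400.0))

    def record(self, a, b, score_a):
        """Record one game; score_a is 1, 0.5, or 0 from player a's point of view."""
        delta = self.k * (score_a - self.expected(a, b))
        self.ratings[a] += delta
        self.ratings[b] -= delta


table = EloTable()
matches = [("alice", "bob", 1), ("bob", "carol", 0.5), ("alice", "carol", 1)]
for a, b, score_a in matches:
    table.record(a, b, score_a)

# Print the leaderboard, highest rating first.
for name, rating in sorted(table.ratings.items(), key=lambda item: -item[1]):
    print(f"{name}: {rating:.1f}")
```

Keeping the ratings in a single object makes it easier to bolt on the optional features from step 4, such as floors or provisional status, without touching the calling code.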

14. ELO in the Age of Big Data

Modern implementations often combine ELO with:

  • Machine Learning: To detect anomalies or predict rating changes
  • Behavioral Analysis: To identify smurfs or boosted accounts
  • Performance Metrics: Incorporating in-game statistics beyond win/loss
  • Real-time Updates: Some systems update ratings during matches

The Stanford University paper on “Elo Rating and Trueskill” explores these advanced applications in depth.

15. Ethical Considerations

Rating systems raise important questions:

  • Privacy: Should individual ratings be public?
  • Fairness: Do systems disadvantage certain groups?
  • Transparency: Should the exact algorithms be open?
  • Manipulation: How to prevent gaming the system?
  • Mental Health: Can rating systems cause undue stress?

The Association for Computing Machinery has published guidelines on ethical ranking systems.

Conclusion

The ELO rating system remains one of the most elegant and effective methods for measuring competitive skill over 60 years after its invention. Its simplicity belies its mathematical sophistication, and its adaptability has allowed it to thrive in domains far beyond its original chess application.

Understanding ELO principles can help competitors in any rated system make better decisions about who to play, how to improve, and how to interpret rating changes. For developers, the system provides a robust foundation that can be extended with modern statistical techniques to handle increasingly complex competitive environments.

As competitive gaming and esports continue to grow, we’ll likely see further innovations in rating systems, but the core principles of ELO will undoubtedly remain influential for decades to come.
