Hi all,

This goes out to the actual developers of the SR system of Overwatch. But others with experience in statistical analysis of strategy or similar games will also find this discussion interesting.

I will try to keep this fairly non-technical so everyone can follow, but I feel this is a very important change for a game like Overwatch to adopt: it would make the competitive SR ranking far more intuitive, as well as more mathematically rigorous.

It is a well-known problem with ELO ratings that they are based on a constant variance term, say V, a term which signifies how uncertain a player's rating is. Of course, such a term is far from constant in reality. A player returning to the game after a long hiatus will have a larger value of V. Players who have not played enough games for their ranking to be determined accurately (hence 10 placement games instead of just 1 or 2) would also have a larger V. However, a player's skill level could be extremely high even if they haven't played the game for very long. Or a strong player might have a second or third account. Or a weaker player took a long break from comp, practiced a lot in quick play / training / custom games, and ended up getting really strong in a few months. In such cases, the SR system needs to gauge their performance correctly and place them in the right skill bracket as quickly as possible. Unfortunately, ELO doesn't achieve that, because every game moves the rating by a similar, limited amount no matter how uncertain the rating is.
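To make the slow convergence concrete, here is a minimal sketch of the textbook ELO update. The K-factor of 32 and the starting rating of 1500 are illustrative assumptions of mine, not values Overwatch is known to use, and the numbers are in generic rating points, not SR.

```python
def elo_expected(r_a, r_b):
    """Expected score of player A against player B under the standard Elo curve."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_update(r_a, r_b, score_a, k=32.0):
    """New rating for A after one game; k is a fixed step size (assumed value)."""
    return r_a + k * (score_a - elo_expected(r_a, r_b))


# A badly underrated player who wins 10 games in a row.
# Matchmaking is assumed to always find an opponent with the same rating.
rating = 1500.0
for _ in range(10):
    rating = elo_update(rating, rating, 1.0)

print(rating)  # 1660.0: each win against an equal opponent is worth k / 2 = 16 points
```

Because k never changes, a new or returning player gains the same 16 points per win as someone with thousands of games on record, so closing a large gap takes a long time.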

Since a (proper) matchmaking system exists in Overwatch, one other problem with the ELO system is somewhat circumvented. Still, it is not mathematically accurate to say that two players with around the same ELO are of the same skill, because a player's skill variance is an important factor missing from this calculation.
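As a rough illustration of why the rating alone is not enough, suppose each rating also carried a deviation (RD), as it does in Glicko. The helper name and the example numbers below are made up for illustration; the 1.96 multiplier is just the usual 95% interval for a normal distribution.

```python
def rating_interval(rating, rd, z=1.96):
    """Approximate 95% interval for a player's true skill, given rating and deviation."""
    return (rating - z * rd, rating + z * rd)


# Two hypothetical players with the same rating but different certainty.
veteran = rating_interval(2500.0, 50.0)    # many recent games, small deviation
returning = rating_interval(2500.0, 300.0)  # long break, large deviation

print(veteran)
print(returning)
```

Both players show the same number, but the veteran's true skill is pinned down to within about 100 points, while the returning player could plausibly be anywhere in a range of more than 1000 points.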

Also, a thing like ELO hell - if it exists - exists only in the ELO system.

Glicko and TrueSkill are systems where a player's skill rating is calculated together with a variance term, which decreases as more games are played (so the rating becomes more and more accurate) and increases slightly with inactivity. Strong / returning / quickly improving players can therefore climb the ranks faster (1000 SR to 3000 SR may be possible in 5 games). With the SR resetting every two months, this works out great, as the rankings towards the end of the season accurately reflect the true skill levels of all players. Even if the SR doesn't reset, the variance can be prevented from dropping below a minimum value, allowing for steady progress while keeping the rankings as accurate as possible.
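For anyone curious how this plays out, here is a minimal sketch of the Glicko-1 update as I understand it from Glickman's published description. The value of c (how fast the deviation grows with inactivity) is a tuning constant I picked arbitrarily, and the ratings are in generic Glicko points, not SR.

```python
import math

Q = math.log(10) / 400.0
MAX_RD = 350.0  # conventional deviation for a completely unrated player


def g(rd):
    """Discount factor: results against uncertain opponents count for less."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def expected(r, r_opp, rd_opp):
    """Expected score against an opponent, given the opponent's deviation."""
    return 1.0 / (1.0 + 10 ** (-g(rd_opp) * (r - r_opp) / 400.0))


def inflate_rd(rd, periods_inactive, c=35.0):
    """Deviation grows with inactivity; c is an assumed tuning constant."""
    return min(math.sqrt(rd**2 + c**2 * periods_inactive), MAX_RD)


def glicko_update(r, rd, results):
    """results: list of (opp_rating, opp_rd, score), score 1 = win, 0 = loss."""
    if not results:
        return r, rd
    d2_inv = 0.0  # accumulates 1 / d^2
    delta = 0.0   # accumulates g * (actual score - expected score)
    for r_opp, rd_opp, score in results:
        e = expected(r, r_opp, rd_opp)
        d2_inv += Q**2 * g(rd_opp) ** 2 * e * (1.0 - e)
        delta += g(rd_opp) * (score - e)
    denom = 1.0 / rd**2 + d2_inv
    return r + (Q / denom) * delta, math.sqrt(1.0 / denom)


# Same rating, same single win against a 1500-rated opponent (RD 50),
# but one player is brand new (RD 350) and the other is settled (RD 50).
opponent = [(1500.0, 50.0, 1.0)]
new_r, new_rd = glicko_update(1500.0, 350.0, opponent)
settled_r, settled_rd = glicko_update(1500.0, 50.0, opponent)

print(round(new_r - 1500.0), round(new_rd))
print(round(settled_r - 1500.0), round(settled_rd))
```

By my arithmetic the new player gains roughly 175 points from that one win, while the settled player gains roughly 7, and the new player's deviation drops sharply after just one game. That is exactly the "climb fast while uncertain, then stabilise" behaviour described above.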

Lichess uses Glicko-2, a successor to Glicko, in its chess rating system. TrueSkill was developed by Microsoft for ranking in Xbox games. ELO has usually been adopted in games with some sort of SR system, largely for legacy reasons, but it's time we moved away from ELO and adopted a more rigorous system that considers more than just wins and losses.

I’d love to hear your thoughts and opinions on this!