Frustration
For several years I was frustrated with MMR/SBMM (matchmaking rating, skill-based matchmaking) in Overwatch, my favorite multiplayer game and, in my opinion, undoubtedly the best-designed team-based FPS of all time (better than Counter-Strike, sorry).
Why was I frustrated?
First, I only solo queue: I just can’t find people to play with, or people who share my poor level.
I felt that I was either being carried by my team or carrying it. Despite this, my MMR was never really impacted by my own performance: my teammates and I would get about the same reward, regardless of how each of us contributed to the loss or victory.
Realization
So a very good player got about the same reward as a much weaker player.
After many matches, I felt there was a large gap in individual skill between players on the same team, even though the MMR system is designed (in my mind) to sort players by how good they are, so that they can play against people of similar skill.
Arguing
I argued a lot on forums, and came up with a system where individual performance would give a bonus to players.
Opponents argued back that this was a bad idea, since it would encourage players to aim for individual performance instead of team performance: solo-queueing players would stop caring whether their team wins or loses, since the game would reward individual performance more.
I answered that my system could be configured to balance individual performance against the Elo gain or loss from the match result.
I also argued that the Elo system was invented for chess, a one-on-one game, and was not designed for team-based matches. Why does that matter?
Pondering about trust and toxicity
Algorithms that assemble a team automatically based on an MMR/Elo score are quite recent. Such an algorithm takes players of similar rank and matches them against another team with a similar average Elo score.
This is new because usually, in team sports or online, players just find teammates they trust or teammates they believe share the same skill.
It is true that there are individual Elo rankings for sports like tennis, but not for games with larger teams, like soccer, handball, etc.
For sports like soccer, handball, basketball, and so on, athletic skill is evaluated by humans who watch the athlete play, not by an algorithm that only checks whether the athlete was on the winning or losing team. Humans then decide which players to put on the team.
Automatic matchmaking obviously leads to players having little trust in teammates who are strangers when they don’t feel those teammates are pulling their weight. Beyond toxicity, players feel that they make an effort and are not rewarded, or make no effort and get rewarded anyway. Matchmaking then feels random.
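To make the point concrete, here is a minimal sketch of a rating that only looks at wins and losses. It assumes the textbook Elo formulas (scale of 400, K-factor of 32) applied to team averages. Overwatch's real matchmaking is not public, so the function names `expected_score` and `team_elo_update` and all the numbers are illustrative assumptions, not the game's actual algorithm.

```python
def expected_score(rating_a, rating_b, scale=400):
    """Textbook Elo expected score of A against B (between 0 and 1)."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / scale))


def team_elo_update(team_a, team_b, a_won, k=32):
    """Assumption: each team is treated as one player rated at its average."""
    avg_a = sum(team_a) / len(team_a)
    avg_b = sum(team_b) / len(team_b)
    result_a = 1.0 if a_won else 0.0
    delta = k * (result_a - expected_score(avg_a, avg_b))
    # Every member of a team receives exactly the same change.
    return [r + delta for r in team_a], [r - delta for r in team_b]


# Hypothetical ratings: team A is very uneven, team B is uniform,
# but both average 1500.
team_a = [1700, 1600, 1500, 1400, 1300]
team_b = [1500, 1500, 1500, 1500, 1500]
new_a, new_b = team_elo_update(team_a, team_b, a_won=True)
print(new_a)
print(new_b)
```

With equal averages the expected score is 0.5, so every winner gains 16 points and every loser drops 16, whatever each of them actually did during the match.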
Heroes
Overwatch makes it even harder to evaluate individual skill because it lets players pick among 40 heroes with a wide range of abilities.
Previously, Overwatch gathered statistics on hero performance (things like damage, crowd control, damage absorbed by shields, healing done, etc.), compared them to the average performance of the same hero in the same skill division, and calculated a hero-relative performance score.
Obviously, those statistics are quantitative, not qualitative, and don’t really allow an accurate evaluation of how a player contributed to a loss or a win, but they are better than nothing, especially in lower ranks. Opponents argue that game sense, positioning, timing, etc. cannot be measured but matter much more than quantitative stats.
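A hero-relative score of the kind described above could be sketched like this. This is only a guess at the general idea: the stat names, the numbers, and the choice to average the ratios with equal weight are all assumptions, and `hero_relative_score` is a hypothetical helper, not anything from the game.

```python
def hero_relative_score(player_stats, hero_averages):
    """Average ratio of a player's stats to the hero's average stats.

    1.0 means "exactly average for this hero in this division".
    Assumption: every stat counts equally.
    """
    ratios = [
        player_stats.get(name, 0) / average
        for name, average in hero_averages.items()
        if average > 0  # skip stats the hero never produces
    ]
    return sum(ratios) / len(ratios) if ratios else 1.0


# Made-up per-match averages for one support hero in one skill division.
division_average = {"damage": 6000, "healing": 8000}
# Made-up stats for one player on that hero.
player = {"damage": 9000, "healing": 8000}

print(hero_relative_score(player, division_average))  # 1.25
```

Here the player did 1.5 times the average damage and exactly the average healing, which averages out to 1.25. A number like this could then be used as the individual score fed into a reward split.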
Balancing
I then started writing a system to reward a mix of individual and team performance.
-
This system is designed to be “zero sum”, in the sense that the points added and subtracted don’t increase the total quantity of points across the whole player population. In total, the winning team is rewarded 100 points and the losing team loses 100 points. What matters is how they are distributed among the players of a team.
-
It is possible to configure one variable to balance how much individual performance weighs on the Elo reward or penalty. I call this win_loss_base. When this number is low, e.g. 0.1, individual performance weighs a lot on the score. When it’s high, e.g. 2, it weighs much less. A higher value discourages players from aiming for individual performance, since it’s more worthwhile for them to aim for team victory instead.
-
As you can see in the chart, when a player wins, they always earn MMR, and when a player loses, they always lose MMR. What changes is the quantity of points earned or lost, depending on the performance weight. This sort of “mitigates” carrying.
Calculation
Here is the Python code:
# static part of the reward/penalty every player gets for the win or loss itself
win_loss_base = 0.5
# the individual scores of the winning players and losing players
scores_winner = "30 20 10 10 0"
scores_loser = "20 10 5 5 0"
# converting to integers
scores_winner = [int(a) for a in scores_winner.split()]
scores_loser = [int(a) for a in scores_loser.split()]
# fraction of the team's total score that each winning player contributed
weighed_bonus_w = [a / sum(scores_winner) for a in scores_winner]
# same for the losing team, except it's "mirrored" with 1 - fraction, so better players lose less
weighed_bonus_l = [1 - a / sum(scores_loser) for a in scores_loser]
# we add the base
weighed_score_w = [win_loss_base + a for a in weighed_bonus_w]
weighed_score_l = [win_loss_base + a for a in weighed_bonus_l]
# we normalize again so the shares sum to 1 for the winners and -1 for the losers
scores1 = [a / sum(weighed_score_w) for a in weighed_score_w]
scores2 = [-a / sum(weighed_score_l) for a in weighed_score_l]
# shares of the team's pool (multiply by 100 to get points)
print(scores1)
print(scores2)
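The script prints fractions, while the tables below show points out of 100. As a sketch, the same calculation can be wrapped in one function that scales to the pool and lets us check the zero-sum property. The function name `split_points`, the `pool` parameter, and the equal split when a team's scores are all zero (which would otherwise divide by zero) are my own assumptions for illustration.

```python
def split_points(scores, win_loss_base, pool=100.0, won=True):
    """Split `pool` points among one team's players.

    Winners get positive points, losers negative ones.
    """
    total = sum(scores)
    if total == 0:
        # Assumption: with no measurable performance, treat everyone equally.
        fractions = [1 / len(scores)] * len(scores)
    else:
        fractions = [s / total for s in scores]
    if not won:
        # Mirror the fractions so better losers are penalized less.
        fractions = [1 - f for f in fractions]
    weighted = [win_loss_base + f for f in fractions]
    sign = 1 if won else -1
    return [sign * pool * w / sum(weighted) for w in weighted]


winners = split_points([30, 20, 10, 10, 0], 0.5, won=True)
losers = split_points([20, 10, 5, 5, 0], 0.5, won=False)
print([round(p, 2) for p in winners])  # [26.53, 22.45, 18.37, 18.37, 14.29]
print([round(p, 2) for p in losers])   # [-15.38, -19.23, -21.15, -21.15, -23.08]

# Zero sum: what the winners gain is exactly what the losers give up.
assert abs(sum(winners) + sum(losers)) < 1e-9
```

Because each team's shares are normalized separately, the totals are always +100 and -100 no matter what the individual scores or win_loss_base are.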
I wrote a longer script that outputs HTML to visualize the results for different team scores (fair, unbalanced, balanced teams) and various win_loss_base values, but I cannot post links or images here, so I used the table function instead:
This is with a win_loss_base of 0.5:
loser | winner |
---|---|
loser scores: 20 10 5 5 0 | winner scores: 30 20 10 10 0 |
■■■■■■ -15.38 | ■■■■■■■■■■ 26.53 |
■■■■■■■ -19.23 | ■■■■■■■■ 22.45 |
■■■■■■■■ -21.15 | ■■■■■■■ 18.37 |
■■■■■■■■ -21.15 | ■■■■■■■ 18.37 |
■■■■■■■■■ -23.08 | ■■■■■ 14.29 |
–
This is with a win_loss_base of 2:
loser | winner |
---|---|
loser scores: 20 10 5 5 0 | winner scores: 30 20 10 10 0 |
■■■■■■■ -17.86 | ■■■■■■■■ 22.08 |
■■■■■■■ -19.64 | ■■■■■■■■ 20.78 |
■■■■■■■■ -20.54 | ■■■■■■■ 19.48 |
■■■■■■■■ -20.54 | ■■■■■■■ 19.48 |
■■■■■■■■ -21.43 | ■■■■■■■ 18.18 |
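One way to read the two tables is to look at the gap between the best and the worst performer on the winning team: 26.53 - 14.29 = 12.24 points with a win_loss_base of 0.5, against 22.08 - 18.18 = 3.90 points with a win_loss_base of 2. The sketch below sweeps a few more values to show how that gap shrinks. The helper names `points` and `spread` and the extra win_loss_base values are my own picks for illustration, not part of the longer HTML script.

```python
def points(scores, win_loss_base, won=True):
    """Points out of a 100-point pool, same calculation as the script."""
    total = sum(scores)
    fractions = [s / total for s in scores]
    if not won:
        fractions = [1 - f for f in fractions]
    weighted = [win_loss_base + f for f in fractions]
    return [100 * w / sum(weighted) for w in weighted]


def spread(scores, win_loss_base, won=True):
    """Gap in points between the most and least rewarded player."""
    pts = points(scores, win_loss_base, won)
    return max(pts) - min(pts)


winner_scores = [30, 20, 10, 10, 0]
for base in (0.1, 0.5, 1, 2, 5):
    print(f"win_loss_base={base}: gap of {spread(winner_scores, base):.2f} points")
```

The gap never reaches zero, so individual performance always counts a little, but the higher the win_loss_base, the closer the result gets to a flat 20 points per player.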