A proposal to mitigate player frustration when solo-queuing

Frustration

For several years I was frustrated with MMR/SBMM (matchmaking rating / skill-based matchmaking) in Overwatch, my favorite multiplayer game and, in my opinion, undoubtedly the best-designed team-based FPS of all time (better than Counter-Strike, sorry).

Why was I frustrated?

First, I only solo queue: I just can’t find people to play with, or people who share my (poor) skill level.

I felt that I was either being carried by my team, or carrying it. Despite this, my MMR was never really impacted by my own performance: my teammates and I would get about the same reward, regardless of how each of us contributed to the loss or the victory.

Realization

So the very good player got about the same reward as the much weaker player.

After many matches, I felt there was a large gap in individual skill between players on the same team, despite the MMR system being designed (in my mind) to sort players according to how good they are, so they play against people of similar skill.

Arguing

I argued a lot on forums, and came up with a system where individual performance would give a bonus to players.

Opponents argued back that this was a bad idea, since it would encourage players to aim for individual performance instead of team performance: solo-queueing players would stop caring whether their team wins or loses, since the game would reward individual performance more.

I answered that my system could be configured to balance the calculation between individual performance and the Elo gain or loss from the match result.

I also argued that the Elo system was invented for chess, a solo game, and was not designed for team-based matches. Why does that matter?

Pondering about trust and toxicity

Algorithms that assemble teams automatically based on an MMR/Elo score are quite recent. Such an algorithm takes players of similar rank and creates a match against another team with a similar average Elo score.
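As a rough sketch of that idea (entirely hypothetical; Blizzard’s actual matchmaker is not public), such an algorithm might take a pool of similarly rated players and split them into two teams with near-equal average MMR:

# Hypothetical sketch of average-MMR team assembly. Blizzard's real
# matchmaker is not public, so this greedy heuristic is purely illustrative.

def assemble_match(players, team_size=5):
    """Split 2 * team_size (name, mmr) tuples into two teams with
    near-equal average MMR, using a simple greedy heuristic."""
    team_a, team_b = [], []
    # strongest remaining player joins whichever team is currently weaker
    for player in sorted(players, key=lambda p: -p[1]):
        if len(team_a) == team_size:
            team_b.append(player)
        elif len(team_b) == team_size:
            team_a.append(player)
        elif sum(m for _, m in team_a) <= sum(m for _, m in team_b):
            team_a.append(player)
        else:
            team_b.append(player)
    return team_a, team_b

# ten solo-queue players of similar rank
pool = [("a", 2500), ("b", 2510), ("c", 2480), ("d", 2530), ("e", 2490),
        ("f", 2520), ("g", 2505), ("h", 2495), ("i", 2515), ("j", 2485)]
team_a, team_b = assemble_match(pool)
print(sum(m for _, m in team_a) / 5, sum(m for _, m in team_b) / 5)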

This is new because, traditionally, in team sports or online, players just found teammates they trusted, or teammates they believed shared their skill level.

It is true that there are individual Elo-style rankings for sports like tennis, but not for games with larger teams, like soccer or handball.

For team sports like soccer, handball, basketball, and so on, athletic skill is evaluated by humans who watch the athlete play, not by an algorithm that only records whether the athlete was on the winning or losing side. Humans then decide which players to put on the team.

This obviously leads to players having little trust in their stranger teammates when they don’t feel those teammates are pulling their weight. Beyond toxicity, players feel they make an effort and are not rewarded, or make no effort and get rewarded anyway. Matchmaking ends up feeling random.

Heroes

Overwatch makes it even harder to evaluate individual skill because it lets players pick from 40 heroes with a wide range of different abilities.

Previously, Overwatch gathered statistics about hero performance, compared them to the average performance of the same hero in the same skill division, and calculated a hero-relative performance score from things like damage, crowd control, damage absorbed by shields, healing done, etc.
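As an illustration, here is a minimal sketch of what such a comparison might look like; the stat names and division averages are invented, since the real formula was never published:

# Hypothetical hero-relative performance score: compare a player's stats
# to the average for the same hero in the same skill division.
# All stat names and numbers below are invented for illustration.

player_stats = {"damage": 9500, "healing": 0, "eliminations": 22}
division_avg = {"damage": 8000, "healing": 0, "eliminations": 18}

def relative_performance(player, average):
    """Mean ratio of the player's stats to the division average,
    skipping stats this hero doesn't produce (average of 0)."""
    ratios = [player[k] / average[k] for k in average if average[k] > 0]
    return sum(ratios) / len(ratios)

print(relative_performance(player_stats, division_avg))  # > 1.0 means above average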

Obviously, those stats are quantitative, not qualitative, and don’t really allow one to accurately evaluate how a player contributed to a loss or a win, but it’s better than nothing, especially in the lower ranks. Opponents argue that game sense, positioning, timing, etc., cannot be measured but matter much more than quantitative stats.

Balancing

I then started sketching a system that rewards a mix of individual and team performance.

  1. This system is designed to be “zero sum”, in the sense that the points added and subtracted don’t increase the total quantity of points across the whole player population. In total, the winning team is rewarded 100 points and the losing team loses 100 points; what matters is how those points are distributed among the players of each team.

  2. One variable can be configured to balance how much individual performance weighs on the Elo reward or penalty. I call it the win_loss_base. When this number is low, e.g. 0.1, individual performance weighs a lot on the score; when it’s high, e.g. 2, it weighs much less. A higher value discourages players from aiming for individual performance, since aiming for team victory becomes more rewarding.

  3. As you can see in the tables below, a player who wins always earns MMR, and a player who loses always loses MMR. What changes is the quantity of points earned or lost, depending on the performance weight. This “mitigates” carrying, at least somewhat.

Calculation

Here is the Python code:

# how much static Elo weight a player earns/loses after a win or loss
win_loss_base = 0.5

# the individual performance scores of the winning and losing players
scores_winner = "30 20 10 10 0"
scores_loser = "20 10 5 5 0"
# convert to integers
scores_winner = [int(a) for a in scores_winner.split()]
scores_loser = [int(a) for a in scores_loser.split()]

# fraction of the team's total performance contributed by each winner
weighted_bonus_w = [a / sum(scores_winner) for a in scores_winner]

# mirrored with (1 - fraction) for the losing team, so the best
# performers on the losing side lose the least
weighted_bonus_l = [1 - (a / sum(scores_loser)) for a in scores_loser]

# add the base weight: winners always gain, losers always lose
weighted_score_w = [win_loss_base + a for a in weighted_bonus_w]
weighted_score_l = [win_loss_base + a for a in weighted_bonus_l]

# normalize so the winning team gains 100 points in total and the
# losing team loses 100 points in total
scores_w = [100 * a / sum(weighted_score_w) for a in weighted_score_w]
scores_l = [-100 * a / sum(weighted_score_l) for a in weighted_score_l]

print(scores_w)  # ≈ [26.53, 22.45, 18.37, 18.37, 14.29]
print(scores_l)  # ≈ [-15.38, -19.23, -21.15, -21.15, -23.08]
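To sanity-check the “zero sum” property from point 1 and the flattening effect of a larger win_loss_base from point 2, here is a quick sketch that wraps the same logic in a function (the distribute helper is mine, not part of the original script):

# Wrap the calculation above in a function and verify that the points
# always sum to zero, and that a larger base flattens the distribution.

def distribute(points, perf_scores, base, winning):
    bonus = [a / sum(perf_scores) for a in perf_scores]
    if not winning:
        bonus = [1 - b for b in bonus]       # mirrored for the losing team
    weighted = [base + b for b in bonus]
    sign = 1 if winning else -1
    return [sign * points * w / sum(weighted) for w in weighted]

for base in (0.1, 0.5, 2):
    w = distribute(100, [30, 20, 10, 10, 0], base, winning=True)
    l = distribute(100, [20, 10, 5, 5, 0], base, winning=False)
    print(base, round(sum(w) + sum(l), 9))  # always 0.0: zero sum
    print(base, round(max(w) - min(w), 2))  # spread shrinks as base grows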

I wrote a longer script that outputs HTML to visualize the scores for different team compositions (fair, unbalanced, balanced) and various win_loss_base values, but I cannot post links or images here, so I used the table function instead:

This is with a win_loss_base of 0.5:

loser (scores: 20 10 5 5 0)   winner (scores: 30 20 10 10 0)
■■■■■■ -15.38                 ■■■■■■■■■■ 26.53
■■■■■■■ -19.23                ■■■■■■■■ 22.45
■■■■■■■■ -21.15               ■■■■■■■ 18.37
■■■■■■■■ -21.15               ■■■■■■■ 18.37
■■■■■■■■■ -23.08              ■■■■■ 14.29


This is with a win_loss_base of 2:

loser (scores: 20 10 5 5 0)   winner (scores: 30 20 10 10 0)
■■■■■■■ -17.86                ■■■■■■■■ 22.08
■■■■■■■ -19.64                ■■■■■■■■ 20.78
■■■■■■■■ -20.54               ■■■■■■■ 19.48
■■■■■■■■ -20.54               ■■■■■■■ 19.48
■■■■■■■■■ -21.43              ■■■■■■■ 18.18

I remember someone on here saying they did this at some point but removed it because of the team having an impact on the individual player’s score.

This was in OW1 for a long time. They removed it for OW2, and according to a video I watched where a dev discussed it, the argument is that it’s just too hard to accurately assign value to individual performance. These are my words, not theirs (I forget what example they used), but for instance: a Mercy rezzing a DPS player instead of a tank. Maybe rezzing the DPS player in that instance is the right move to help your team win, or maybe the tank is the better move. There are just too many variables to accurately assign points to individual plays. It didn’t work well in OW1, so they removed it in OW2.

Wow OP. This is some PhD-level work.

If I were you I would simply write:

“Make the Matchmaker skill based because I said so”


Ok, so you didn’t really say how you’re going to weigh individual performance.

Overwatch stats are very contextual.

Man, that’s quite an essay

Overwatch already measures a lot of things for each hero, as I wrote: there are hero-relative, division-relative, team-relative performance measurements, where the game looks at the hero’s stats (see the sketch after this list):

  • it compares the hero’s stats with the average for that hero in the same tier/division
  • it could then take that performance score and compare it with the performance of the other teammates, to get a hero-relative, division-relative, team-relative performance score
  • that score then affects the final score as a small bonus
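As an illustration of the second and third bullets, here is a minimal sketch with invented numbers (the real comparison formula is not public); it reuses the same base mechanism as the code above:

# Hypothetical second layer: make hero-relative scores team-relative,
# then apply them as a small bonus. All values here are invented.

hero_relative = [1.20, 0.88, 1.03, 0.81, 1.00]  # one score per teammate

# team-relative: each player's share of the team's combined performance
team_relative = [s / sum(hero_relative) for s in hero_relative]

# distribute the team's 100-point win reward with a heavy flat base,
# so the performance bonus only nudges each player's final score
base = 2  # same role as win_loss_base: higher means flatter distribution
weighted = [base + t for t in team_relative]
final = [100 * w / sum(weighted) for w in weighted]
print([round(f, 2) for f in final])  # all close to 20, nudged by performance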

The weight of the individual performance score on the final score can be kept small to avoid impacting the score too much; I showed examples with a “base” factor.

The goal is mainly to have an effect in situations where a player obviously performs very well or very poorly, to “mitigate” carrying, at least a bit.

In other words, the goal is to avoid situations where a very good or very bad player (compared to his teammates) gets a reward that is too similar to his teammates’.

In Bronze, Silver, and maybe Gold, the performance gap between players is much larger, which could explain the “stuck in Silver” feeling for solo queuers.

Of course, I am not arguing that this system should be enabled for Platinum and above; it is relevant to the lower skill tiers, maybe not Gold, but certainly Bronze and Silver.

Some skill expressions are not necessarily tied to raw stats.

This stats-driven approach to determining performance is flawed and can be abused and exploited.

Overwatch is too chaotic.

For example, the concept of space creation cannot really be quantified. Space creation is fundamental in Overwatch and can win games.
There are many ways to create space that are not apparent on the scoreboard.

I agree with that.

Although, like I said, the impact on the score would be minimal, since the “strength” of its effect on the final score can be configured, as shown in the code, and it would only be applied to the lower tiers, where those “situational” or qualitative aspects are less impactful, in my view.

I agree that the gap can be too large in low ranks.

In my opinion, the ranking system should be mainly about wins and losses, without taking performance into consideration.

If they blindly match people into a lobby based on SR, without trying to make the match even, there will be stomps at the beginning, but with time things will settle down (the standard Elo update sketched below shows why):

The good players will win more than they lose.
The bad players will lose more than they win.
Everyone will end up in their natural place.
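For reference, this is roughly how a pure win/loss rating update works; the sketch uses the textbook Elo formula from chess, since Overwatch’s actual SR formula is not public:

# Standard Elo update: the rating moves on wins and losses only, so over
# many matches it converges toward true skill. Overwatch's real formula
# is not public; this is the classic chess version with K = 32.

def elo_update(rating, opponent_rating, won, k=32):
    """Return the new rating after one match (won = 1 for a win, 0 for a loss)."""
    expected = 1 / (1 + 10 ** ((opponent_rating - rating) / 400))
    return rating + k * (won - expected)

# an underrated player who keeps beating equally rated opponents climbs fast
rating = 1500
for _ in range(10):
    rating = elo_update(rating, 1500, won=1)
print(round(rating))  # well above 1500 after ten straight wins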

Making the matchmaker stats-related will only lead to inflation and deflation, because that’s equivalent to artificially boosting people based on specific metrics that don’t necessarily reflect their actual contribution to the outcome of the match.

Skill-based matchmaking can only make sense if they are able to develop a very sophisticated judging system that goes way beyond the usual standard stats.

An advanced artificial intelligence would be needed in that case.
Basically, an algorithm that can analyze players’ decisions play by play, like an actual coach.
