How is the game calculating SR gains / losses below Platinum?
TLDR: There is a mathematical basis for claiming to be ‘hardstuck’; we will examine how that can happen.
TLDR for the TLDR: Make the MMR calculation only use the most recent 100 games (for example)
[EDIT: added better links to the images of the stats]
When someone comes onto the forum and complains of being ‘stuck’ at some rank, the response of ‘git gud’ is derisive, dismissive, and more importantly not helpful. Offers to review VODs are usually genuine and I’ve had folks here do me the favor of giving me constructive advice / criticism that has actually helped my gameplay.
Background
As an older player (50+) I know I’ll never be a hitscan god and have no illusions of ever climbing to the above-average ranks like Diamond or above. After I started playing the ‘Competitive’ (in quotes on purpose but we’ll come back to that at a later time) mode, I lost pretty consistently, quickly sank to Bronze, and found myself at <500, where you can find the joys of deranking stacks, one-trick ponies, troll Roadhogs, super accurate Anas, and smurfs galore. It’s easy to fill your ‘bad sportsmanship’ plate from the smorgasbord of toxic play available down there.
Coming to the forums looking for answers, I found both an extravaganza of worthless opinions and a lack of any explanations based on math (with the exception of how using the Elo system to rank players may be problematic in a team game; again, we’ll come back to that, I promise), so I began to research how SR is calculated. I know, I know, everyone here is an expert: “win more”, “don’t die”, “don’t feed”, “don’t play tilted”, “you are in the rank you ‘deserve’”, etc. That last one is my least favorite because it both anthropomorphizes a computer algorithm and makes a value judgement on the game experience of a human player. I won’t get into an argument with those who defend the status quo; there are too many forum trolls who feel their place on the ladder is indicative of their worth and sometimes their hundreds of hours of practice (read: grind), wearing their rank like a badge and using it as a proxy for authority over other players. Those who have ‘faith’ in the game’s design can keep their faith. My task is to peer into the numbers behind the game and possibly shed some light on why certain anecdotes continue to cause debate here, primarily the ‘hardstuck’ player.
Scenario: a new player falls to Bronze and then, even though they play much better now, finds it exceptionally hard to rank up.
Assumptions
Assumption: The MatchMaker™ attempts to find a ‘balanced’ match of two teams, defined as one in which either team has a 50% probability of winning. There is much Sturm und Drang expended over this topic. No judgement is made here and the discussion is left for another thread.
Assumption: The MatchMakingRating / MatchMakerRating (MMR) is a real number roughly between -3 and +3 where each whole integer marks 1 standard deviation away from the mean. See this graph of where each group falls in the total population:
www.statisticshowto.com/wp-content/uploads/2013/02/standard-normal-distribution.jpg
From what Blizzard has said in the past, it is easy to infer that each whole MMR value can be considered a Standard Deviation and maps to a rank threshold, -3 for Bronze, -2 for Silver, -1 for Gold, 0 for Platinum, +1 for Diamond, +2 for Masters, and +3 for Grand Masters. This isn’t exact, but works for our purposes because we are only interested in the bottom ranks, the exact thresholds can move around without affecting the conclusions.
Assumption: Player performance across the whole population follows a normal distribution (a bell curve).
Assumption: There should be roughly 68% of all players who fall between -1 and +1 standard deviation of the mean.
Assumption: Bronze makes up the lowest 8% of players (this probably varies between seasons, but it is a useful figure and the exact value doesn’t affect the analysis).
Assumption: MMR is based on statistical methods like Standard Deviations, Confidence Intervals, Mean, etc., and is a proxy for a player’s overall ability.
Assumption: The game can only adjust your MMR based on the statistics available to it, like win / loss, damage done, time on the objective, etc.
Assumption: The game is using Machine Learning tools, such as a Bayesian Classifier type approach, to calculate how to adjust the MMR of a player based on their performance.
Assumption: Performance-based SR (PBSR) is calculated the same way regardless of map. One of the replies mentioned that two maps had different calculations; that might be true, but it is the first mention of it I’ve encountered. If you know more about it, please share links, etc.
Assumption: Career stats are tracked relative to Hero and very likely each map. I’ve had consecutive games where my elims were above the stated ‘career avg’ but the avg went down and other games where the opposite happened. At first I wondered if by ‘career’ they meant like the last 100 games, but that makes less sense than tracking stats for each map.
Assumption: The collection of stats on Overbuff is representative of the population of Overwatch players in general for the purposes of examining relative behavior. Looking at just the Reinhardt players, they have over 4000 accounts being tracked. Yes, they will not be the same as the real game, but they provide us with a good enough spread of values to make some overall judgements about the possible weightings of different performance levels.
I’ll add more assumptions later but this will be enough to help with the hypotheses below.
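For anyone who wants to sanity-check the bell-curve assumptions above, here is a minimal sketch using Python's standard library. It takes the post's assumptions at face value (a standard normal population with MMR measured in standard deviations); the function names are mine, not anything from the game.

```python
from statistics import NormalDist

# Assumption from the post: skill is normally distributed and MMR is
# measured in standard deviations from the mean (mean 0, SD 1).
std_normal = NormalDist(mu=0.0, sigma=1.0)


def share_between(low_z: float, high_z: float) -> float:
    """Fraction of the population whose MMR lies between two z-scores."""
    return std_normal.cdf(high_z) - std_normal.cdf(low_z)


def z_for_bottom_share(share: float) -> float:
    """z-score below which the given fraction of the population falls."""
    return std_normal.inv_cdf(share)


within_one_sd = share_between(-1.0, 1.0)
bronze_cutoff_z = z_for_bottom_share(0.08)

print(f"Share within +/-1 SD: {within_one_sd:.1%}")
print(f"z-score of the bottom 8% cutoff: {bronze_cutoff_z:.2f}")
```

Under these assumptions the bottom 8% cutoff lands near -1.4 standard deviations, which is one more reason to treat the whole-number rank thresholds as approximate rather than exact.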
Hypothesis 1
They are simply a bad player. I’m including this for completeness but recognize it is not really an explanation for the sheer numbers of reports of the scenario. If you are really that bad then, ok, please have fun as best you can, no judgement here. I don’t ever say “you ‘deserve’ the rank you are at”.
Hypothesis 2
Players are underperforming while they learn the game and its heroes. While this has the most truth to it, it doesn’t explain players who have improved but still can’t climb.
Hypothesis 3
Players are manipulating the ladder by throwing games with the intent of being ranked at a level well below their actual skill. It is tempting to assign a motivation to these players, I will refrain as the inclusion of this hypothesis is more to describe a commonly reported subject and also to be able to exclude it specifically. This is to say, for the purpose of this post, I’m not taking a position over how often this happens, whether it is bad / good, indifferent, how it may be addressed, etc. Simply put, it is beyond the scope of the math I’m wanting to put forth.
Hypothesis 4: This one is the focus of this post; the others are included so trolls can’t say they weren’t considered. The game is basing the MMR on AVERAGE performance over a large number of games. While this makes sense, it also has some rather interesting implications. If every game were to have an equal probability of being a win vs a loss, then the factor that more heavily affects the rise / fall of MMR is individual performance. This topic will be the subject of an in-depth thread of its own later; for now, let us ask the question: what do you mean by ‘average performance’?
Now, to answer this question, I want to include a personal observation, not to claim any expertise but as a way of expressing how I came to this particular conclusion about ‘average performance’.
I’ve been using the website ‘OverBuff’ https://www.overbuff.com/
so I could get a better understanding of the numbers breakdown of my performance. Below you’ll find the actual numbers from 2 consecutive games played on the same account on the same day that each moved my SR by the same value (21 SR, down for the loss, up for the win).
Observation: The fact that this 3rd-party site only gets updated when you exit the client makes me wonder if the same thing applies to MMR updates. This might explain the advice to keep playing until you lose. I would also like to ask those ‘unranked to GM’ people about their play sessions, how many games per session, etc.
**Pro-Tip**
If you exit the client after every match, you can actually see a history of how you did both statistically and for your SR using Overbuff. This was important to me because otherwise it was hard to really understand how my performance in any single match was converted into an SR gain / loss afterwards.
----
What did I learn? Well, after a win (win of 21 SR) followed immediately by a loss (loss of 21 SR), I could examine both the values for each match separately and how they compared to my career averages when both games had the same magnitude effect on my MMR.
SR for a loss
Seeing the numbers for the loss
image found here: https://imgur.com/Ywpd06V
Loss Stats (in Percentiles)
Elims: 24%
Obj Kills: 41%
Obj Time: 96%
Solo Kills: 95%
Final Blows: 92%
DMG Blocked: 7%
Damage: 14%
Deaths: 63%
Charge Kills: 29%
Shatter Kills: 15%
Fire Kills: 7%
E:D Ratio: 34%
Voting Cards: 34%
Medals: 44%
Gold Medals: 50%
Silver Medals: 63%
Bronze Medals: 35%
You can see that my percentiles were all above the 8% threshold we set for a ‘bronze player’ except for Deaths and E:D Ratio, but that is not surprising in a loss. What is more interesting is if you look at the career averages (see below) and ask “is this a bronze player”? The simple way to approach this is to look at the percentile averages and see if any of them are below that 8th-percentile threshold, which two are (Dmg Blocked @ 7% and Fire Kills @ 7%). When you look at the other stats, they are much closer to average, and some are well above average (Solo Kills, Obj Time, Final Blows). Now, if we ask how much each stat is weighted when calculating performance-based SR, a couple of observations are hard to refute. Those stats on the high end of the scale (Solo Kills, Obj Time, Final Blows) must be weighted much lower than the below-average stats; otherwise the win would have earned more SR than the loss cost, instead of both being worth the same 21 SR.
SR for a win
Now let’s turn our attention to a win.
image found here: https://imgur.com/jLh1C5Q
Win Stats (in Percentiles)
Elims: 81%
Obj Kills: 54%
Obj Time: 80%
Solo Kills: 93%
Final Blows: 99%
DMG Blocked: 99%
Damage: 60%
Deaths: 78%
Charge Kills: 98%
Shatter Kills: 68%
Fire Kills: 32%
E:D Ratio: 90%
Voting Cards: 4%
Medals: 1%
Gold Medals: 1%
Silver Medals: 87%
Bronze Medals: 1%
Ignoring the stats for medals (generally good advice), we notice that all of the stats are way above that 8% score we set as representative of a ‘Bronze’ player. One of the two stats that were low in the loss is now at the high end of the range (Dmg Blocked @ 99%), and the other is much closer to average (Fire Kills @ 32%). The SR for this win is exactly the value of the previous loss (21 SR for both), so it would be easy to conclude that both performances had a nearly equal effect on MMR. In the loss we wondered if Dmg Blocked had a higher weighting in the calculation, but that is hard to explain in light of the 99th-percentile value achieved in the win. This could still be true, but the weights would have to be different between a win and a loss, and that would be harder to explain from a pure statistical standpoint.
So what do we have so far? A win with relatively high performance having the same magnitude effect on SR as a reasonably average performance in a loss, especially relative to our 8% threshold of ‘Bronze’ players. This is the basis for players asking, “why do I seem to need to play as well as someone two ranks higher just to start heading towards the next rank?”
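To put a rough number on ‘relatively high’ versus ‘reasonably average’, here is a small sketch that averages the percentiles from the two stat lists above and counts how many fall under the 8% Bronze threshold. The equal weighting is purely my assumption for illustration; nobody outside Blizzard knows the real weights, and medals and voting cards are left out.

```python
from statistics import mean

# Percentiles copied from the loss and win stat lists above
# (medals and voting cards deliberately excluded).
loss = {
    "Elims": 24, "Obj Kills": 41, "Obj Time": 96, "Solo Kills": 95,
    "Final Blows": 92, "DMG Blocked": 7, "Damage": 14, "Deaths": 63,
    "Charge Kills": 29, "Shatter Kills": 15, "Fire Kills": 7, "E:D Ratio": 34,
}
win = {
    "Elims": 81, "Obj Kills": 54, "Obj Time": 80, "Solo Kills": 93,
    "Final Blows": 99, "DMG Blocked": 99, "Damage": 60, "Deaths": 78,
    "Charge Kills": 98, "Shatter Kills": 68, "Fire Kills": 32, "E:D Ratio": 90,
}

BRONZE_THRESHOLD = 8  # the post's assumed share of players in Bronze


def summarize(percentiles):
    """Return the equal-weight mean percentile and the stats under the threshold."""
    below = sorted(name for name, p in percentiles.items() if p < BRONZE_THRESHOLD)
    return mean(list(percentiles.values())), below


loss_mean, loss_below = summarize(loss)
win_mean, win_below = summarize(win)

print(f"Loss: mean percentile {loss_mean:.1f}, below threshold: {loss_below}")
print(f"Win:  mean percentile {win_mean:.1f}, below threshold: {win_below}")
```

With equal weights the win averages well above the loss, yet both games moved SR by the same 21 points, which is the puzzle the rest of this post is about.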
How do your Averages affect your SR gain / loss?
Here are my average stats for the last 99 games as the same Hero.
image found here: https://imgur.com/yxxC9Hu
Single Player Averages (in Percentiles)
Elims: 24%
Obj Kills: 41%
Obj Time: 96%
Solo Kills: 95%
Final Blows: 92%
DMG Blocked: 7%
Damage: 14%
Deaths: 63%
Charge Kills: 29%
Shatter Kills: 15%
Fire Kills: 7%
E:D Ratio: 34%
Voting Cards: 34%
Medals: 44%
Gold Medals: 50%
Silver Medals: 63%
Bronze Medals: 35%
Note: This average snapshot was saved 9 games (4 wins 5 losses) after the two games shown above but on the same day. Chalk the difference up to my realization of the significance of the +/- 21 games later in the play session. I don’t expect those 9 games to have affected the averages (of 99 games) enough to change any conclusion.
Discussion
How do we earn more SR for a win than we lose for a loss? Using the stats for these two games, we can say that we’d have to play better than Gold (the lowest stat in the loss) in order for our PBSR to slowly inch our way out of Bronze. If you look at my averages, the lowest percentiles are Fire Kills (7%) and Dmg Blocked (7%), so it is obvious those areas are worth improving.
Interesting question #1, if my lowest stats are already near the threshold for the next rank, why is my SR so low?
Interesting question #2, if my averages are already so close to the thresholds that mark the next rank, why do really good performances not affect my SR more than average losses?
Interesting question #3, how much better would my stats have to have been during that win to eke out a single additional SR? For the stat geeks, this would be the marginal SR cost (in performance).
Interesting question #4, if I could maintain that rate, is it accurate that I would have to play 2000 games to go from 500 to 1500? Remember the 50% win / loss balance, so we have to play twice as many total games so half are wins that net 1 SR.
Interesting question #5, if these are my averages after 99 games, playing 2000 more games at that same performance level will only make it that much harder to alter my overall average?
Interesting question #6, is it reasonable to expect a player in this situation (MMR of -3 but with mostly average stats) to need that many games to ‘improve’ from the lowest rank when they already have near average stats?
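The arithmetic behind questions #4 and #5 can be sketched in a few lines. This assumes an exact 50% win rate (so games come in win / loss pairs) and that ‘average’ means a plain lifetime mean; both are simplifications of mine, and the function names are made up for illustration.

```python
import math


def games_needed(sr_gap: int, net_sr_per_win_loss_pair: float) -> int:
    """Total games to close an SR gap when each win/loss pair nets a fixed SR."""
    pairs = math.ceil(sr_gap / net_sr_per_win_loss_pair)
    return 2 * pairs  # one win plus one loss per pair


def average_shift(games_played: int, current_avg: float, new_game_value: float) -> float:
    """How far one more game moves a plain lifetime average."""
    return (new_game_value - current_avg) / (games_played + 1)


# Question #4: climbing from 500 to 1500 while netting 1 SR per win/loss pair.
total_games = games_needed(sr_gap=1500 - 500, net_sr_per_win_loss_pair=1)

# Question #5: one 99th-percentile game against a 7th-percentile average,
# first after 99 games played, then after 2000 more.
shift_after_99 = average_shift(99, 7, 99)
shift_after_2099 = average_shift(2099, 7, 99)

print(total_games)
print(f"{shift_after_99:.2f} vs {shift_after_2099:.3f} percentile points")
```

If the averages really work like a lifetime mean, the same standout game moves the average roughly twenty times less after those extra 2000 games, which is the ‘anchor’ effect in a nutshell.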
Conclusions
How do we earn more SR for a win than we lose for a loss? Using the stats for these two games, we can say that we’d have to play better than Gold (the lowest stat in the loss) in order for our Performance Based SR (PBSR) to slowly inch our way out of Bronze.
I would like to address those interesting questions, but that might take a while and this post is already long enough, for now take them as exercises for the curious minds out there.
I think we can conclude a few points from the details I’ve shared here today.
#1 The more games an account has played, the harder it is to move it between ranks, because the system has a high ‘confidence’ in your performance and thus won’t adjust your SR as much for each win / loss. The more confidence the system has, the less PBSR you’ll get and thus your climb goes into the hundreds (if not thousands, see above) of games to advance ranks.
#2 The performance of an individual game has less influence on your MMR than how much you move your averages. Want to climb out of Bronze? You’ll have to fight against those 100 poor games that are effectively an anchor on your SR, keeping it from moving much.
#3 Stop insisting on an ‘SR reset’. This already exists and is called creating a new account; the performance averages will mostly result in the same rankings for the vast majority of players, and the demand ignores the financial incentive Blizzard has in players creating new accounts.
#4 Last but not least, if you ask ‘what are we supposed to do about it?’ I have a simple answer: Petition Blizzard to limit the MMR calculation to the most recent X games, where X can be a reasonable amount like 100 or 200 games. That way, those games where you were just learning how to use heroes’ abilities, map layout, etc. eventually stop counting against you.
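To show what point #4 would change, here is a rough sketch comparing a lifetime average with a ‘most recent X games’ average. The game history is entirely made up (100 games at the 20th percentile while learning, then 100 at the 60th), and whether the real system uses anything like a lifetime mean is my hypothesis, not something Blizzard has confirmed.

```python
from collections import deque


def lifetime_and_windowed_means(values, window):
    """After each game, return (lifetime mean, mean of the last `window` games)."""
    recent = deque(maxlen=window)  # old games drop off automatically
    total = 0.0
    results = []
    for count, value in enumerate(values, start=1):
        total += value
        recent.append(value)
        results.append((total / count, sum(recent) / len(recent)))
    return results


# Hypothetical history: 100 learning games at the 20th percentile,
# followed by 100 improved games at the 60th percentile.
history = [20] * 100 + [60] * 100
means = lifetime_and_windowed_means(history, window=100)

lifetime_after_150, windowed_after_150 = means[149]
lifetime_after_200, windowed_after_200 = means[199]

print(f"After 150 games: lifetime {lifetime_after_150:.1f}, last-100 {windowed_after_150:.1f}")
print(f"After 200 games: lifetime {lifetime_after_200:.1f}, last-100 {windowed_after_200:.1f}")
```

In this toy example the windowed average fully reflects the improved play after 100 new games, while the lifetime average is still being dragged down by the learning games.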
Thank you for reading this far, cheers and I hope this generates a healthy and civil discussion.