# How Competitive Skill Rating Works (Season 11)

I’d like to add something about the difference between accuracy and precision, if I may. The common usage mixes the terms but to discuss what I want to discuss I need to separate the ideas.

If I were to tell you that my location was California, that would be accurate, but not very precise. Similarly, to say that your skill is between 2200 and 3400 also is likely accurate, but not very precise.

You are talking more about precision. The question then becomes one of the interface between you and the measuring system (SR). If you were to try to find me, I could likely give you my address and that would be sufficiently precise to accomplish your task. What I couldn’t do is just give you my zip code. You’d never find me that way. That wouldn’t be precise enough. On the other hand, I could also give you a 12 digit latitude/longitude coordinate. There are literally billions of those coordinates that would accurately describe my position, and any one of them would work as long as I didn’t move.

If I were walking around my house, or even doing yardwork, then by the time you got to the 12 digit coordinate it would likely be wrong. You’d be better served by the address.

Kaa’s experiment in reference (27) assumes that the precision is 4 digits and thus discusses accuracy in terms of the fact that the two accounts were 476 SR apart. He’s not wrong (I mean, SR is 4 digits, so it’s most reasonable to conclude that that precision is meaningful), but he would be able to say that the system gave the same (accurate) results if both accounts ended up in Gold tier. If we consider that SR is likely over-precise, then the question of accuracy becomes a bit different. How different is 476 SR really? The answer probably depends on where that 476 range is on the overall scale.

This is where your hero and playstyle come in (and perhaps the other factors Kaa gives). If you, personally, aren’t that consistent or you play in a way that isn’t that consistent, or you play a hero that relies on factors beyond your control, then what you are doing is very much like moving around your house while someone is looking for you with a 12 digit coordinate. You keep updating the number, but a few games later it’s wrong.

Measurement systems should never be more precise than they are accurate. You wouldn’t want to use the 12 digit coordinate to find me in my house. It would only be right by sheer dumb luck. If you look at coordinate x,y but you only find empty space, you are simply inaccurate. Wrong. You of course want measurements to be meaningful (using a zip code wouldn’t work), but generally it’s important to match your precision to your desired accuracy.

The SR system fails dramatically in this respect. I’ve never heard anyone claim they could tell a difference in skill level below a 200 SR difference. There are 5000 SR levels currently, so we could just take 5000, divide it by 200, and get essentially 25 meaningful tiers of game play in the MOST PRECISE version. According to Kaawumba’s experiment, we should take that 5000 and divide it by 500 to 1000. Consider whether it is meaningful to have an SR difference of 1. If not, there probably shouldn’t BE the possibility of an SR difference of 1. It means absolutely nothing.
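The arithmetic above can be sketched in a few lines. This is only an illustration: `meaningful_tiers` is a hypothetical helper name, while the 5000-point scale and the 200, 500, and 1000 SR resolutions are the figures from the paragraph above.

```python
def meaningful_tiers(sr_range: int, resolution: int) -> int:
    """Number of distinguishable skill bands if differences smaller
    than `resolution` SR carry no real information."""
    return sr_range // resolution


SR_RANGE = 5000  # total SR levels, as stated in the post

# Resolutions discussed in the post: 200 SR (most precise claim),
# and 500 to 1000 SR (suggested by Kaawumba's experiment).
tiers = {res: meaningful_tiers(SR_RANGE, res) for res in (200, 500, 1000)}

for res, count in tiers.items():
    print(f"resolution {res:>4} SR -> {count} tiers")
```

At a resolution of 500 to 1000 SR, the scale collapses to somewhere between 5 and 10 bands, which is roughly the number of named tiers the game already shows.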

In a sense, we already have these tiers. If you forget the over-precise 4 digit number and understand your rank only in terms of tiers, I think you would find your results to be a bit more accurate. But THEN you have to consider not just YOUR consistency, but what “2500” or “Platinum Tier” even MEANS. The 3 possibilities Kaa just posted are all variations on the same theme, which is that the skill level that “2500” corresponds to may change throughout the season. It’s a ranking system. A RELATIVE number. You’re not a Platinum player; you’re a player better than the people in Gold but not quite as good as those in Diamond. It may seem like a pedantic difference, but it will help you to understand.

So even if the system were perfectly accurate and precise to 4 digits and you played perfectly consistently, it’s still possible for your SR to go up and down through the season. Not that it necessarily would do so dramatically, but it could certainly be a factor.

Hope that helps you to understand it.

I play almost every hero; my most played hero is Pharah, though.

SR is very accurate. On my original account (this one) I have only made it above <500 once. On my other account I am high silver low gold at the start of the season. So within 2000 SR means it’s accurate, right?

Playing too many heroes can lead to higher volatility as well. The more consistent your play (including hero choice) the more stable your rating. It would be easier for me to analyze if you make your profile public. You may find that you have low win rates with certain heroes, and play them too much.

Please post from your other account, and make sure that its profile is public.

I continue to gather data.
Have a look at teams SR difference here:

`https://docs.google.com/spreadsheets/d/1TBdpG3ahtD31QZ0Xn6HMygxruuIM1285n2cHCNHsNwg/edit?usp=sharing`

The tendency has clearly changed since my previous comment on the issue. I am not playing at my usual SR right now, thanks to an insane losing streak. My hypothesis stands: the matchmaker manipulates your win chance by shifting team SR. It is actually pushing me up now; the games are MUCH easier and more coordinated than they were even 3 days ago. Explanation: the team SR difference is in my favor almost constantly.

This is really good. For a multi-million dollar company Blizz doesn’t like to properly post statistics or documentation regarding Competitive all around or Character Information. The hero gallery should have a breakdown of the damage amount of every ability and standard damage per character. A small gif showing what the ability does. (I saw new D.va’s that didn’t know anything about DM) Your entire post should be pinned to the top of the Competitive Forum or have a Blizzard version made, but the mods are too busy deciding if some slightly argumentative post needs to be locked.

Again:

You have enough data so that we can start to see this in action.

Here is your data from season 11:

As your team SR becomes higher [lower] than your enemy SR, you gain less [more] points on a win, and lose more [less] points on a loss. The trend lines are clear, though noisy due to the effect of performance SR. The trend lines are also fairly slight, reflecting the rather slow change in expected win percentage with differences in SR.
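A trend line like the ones described above can be fit with ordinary least squares. The sketch below is only an illustration: `fit_line` is a hypothetical helper, and the data points are synthetic and noise-free, generated from a slope of 0.0697 SR per point of differential (the value read off the first plot later in this thread) and a 24 SR baseline (the neutral-game value quoted later). Real match data would scatter around the line because of performance SR.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# SYNTHETIC wins, not real match data.
# x: enemy team SR minus your team SR (positive = you are the underdog)
# y: SR gained on the win
sr_diff = [-30, -20, -10, 0, 10, 20, 30]
sr_gain = [24 + 0.0697 * d for d in sr_diff]

slope, intercept = fit_line(sr_diff, sr_gain)
print(f"slope = {slope:.4f} SR per SR of differential, intercept = {intercept:.1f} SR")
```

The positive slope matches the direction described above: the further your team's SR sits below the enemy's, the more you gain on a win.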

If I divide this into quadrants, you have:
Top Right: Wins where you were expected to win: 23 +/- 4.8
Bottom Right: Losses where you were expected to win: 24 +/- 4.9
In theory you should have won more than you lost, but the error bars are big.

Top Left: Wins where you were expected to lose: 21 +/- 4.6
Bottom Left: Losses where you were expected to lose: 32 +/- 5.7
It seems you have lost more than you should have, but the error bars are big.
Maybe you are tilting when you see that you are underwater with respect to SR. But the error bars are pretty large. Maybe you’ve just been unlucky.
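The quoted error bars match the square root of each count, which is what simple Poisson counting statistics would give. That is my reading of the numbers, not something stated in the post. A minimal sketch under that assumption (the helper names are hypothetical):

```python
import math

# Quadrant counts quoted in the post.
quadrants = {
    "win, expected to win": 23,
    "loss, expected to win": 24,
    "win, expected to lose": 21,
    "loss, expected to lose": 32,
}


def poisson_error(count: int) -> float:
    """One-sigma uncertainty on a raw count, assuming Poisson statistics."""
    return math.sqrt(count)


def difference_in_sigmas(a: int, b: int) -> float:
    """How many combined standard errors separate two independent counts."""
    return abs(a - b) / math.sqrt(a + b)


errors = {name: round(poisson_error(n), 1) for name, n in quadrants.items()}
for name, n in quadrants.items():
    print(f"{name}: {n} +/- {errors[name]}")

print(f"expected-to-lose quadrants differ by {difference_in_sigmas(32, 21):.1f} sigma")
```

By this rough measure, 32 losses against 21 wins is only about 1.5 combined standard errors apart, which fits the "maybe you’ve just been unlucky" reading.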

Breaking these numbers into bins of 10 SR:

I’d love for this plot to have the clear expected trend line, but it doesn’t and the error bars are way too big. But hey, if you’d like to play a couple hundred more games and give me the data, I’ll happily crunch it. I’ve wanted plots like these for a long time, and this is the first time I’ve seen data that has even started to approximate what Scott is describing.

Your data representation is very informative, thanks for the effort.
By “you were expected to win (lose)”, do you mean games where the SR difference was in my favor (against me)? It seems Scott has something like that in mind.
Your second graph clearly shows that:

• Either PBSR bonuses and team difference bonuses are insufficient to compensate for SR differential.

• Or it is a huge mistake to allow an SR differential past 10-15, because average players like myself suffer from it.

I strongly argue for both. For example:

`https://imgur.com/a/Driy7vg`

Two games, same heroes, same SR. The team SR difference actually robbed me of 6 SR (I just can’t find other words). Was it necessary to create a game with so much difference in team SR in the first place? My personal performance was obliterated, but I know I was very good in both matches, on fire most of the time. The second match actually had worse stats in K/D and overall eliminations, but I got rewarded for it. And this is the most obvious example. Another one:

`https://imgur.com/a/kfmtwd4`

Matches without an SR difference (1 or -1 isn’t much). The one with insane performance is lost. What’s the bonus? NONE. The one with good performance is won. What’s the bonus? At least 4 SR. What is the incentive to put effort into games?

I’ll continue to update my spreadsheet, the link is permanent. You are free to use the data as you see fit, but I would love to get mentioned somewhere near your analysis of it.

Yes.

The limits on win probability are 40% and 60%. That seems restrictive enough to me. I don’t know what SR differential that represents.

`(enemy_SR_B - team_SR_B) - (enemy_SR_A - team_SR_A) = 20 SR`. Reading from my first plot, that means the difference in expected win percentage contributed ~20 * 0.0697 = 1.4 SR to your change in SR upon victory. Since there was actually 6 SR of difference, the majority of that likely came from performance metrics. I haven’t been able to quantify performance metrics, despite substantial effort. It’s certainly not as simple as K/D, and the developers have said that “on fire” is not how they derive the number. See the “Performance Modifier” section of my original post for more information. Performance metrics generally just show up as noise in my plots. They show up here as a significant amount of scatter in the first plot.
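The split described above can be written out as a short calculation. This is a sketch of the arithmetic only; the variable names are mine, and the remainder is attributed to performance metrics only because the post suggests that is the likely source.

```python
SLOPE = 0.0697          # SR gained per SR of team differential, read off the first plot
DIFFERENTIAL_GAP = 20   # difference in (enemy SR - team SR) between the two games
OBSERVED_GAP = 6        # difference in SR awarded between the two games

# Part of the gap explained by the change in expected win percentage.
from_expected_win = DIFFERENTIAL_GAP * SLOPE

# What is left over, which the post attributes mostly to performance metrics.
unexplained = OBSERVED_GAP - from_expected_win

print(f"explained by team SR differential: {from_expected_win:.1f} SR")
print(f"left over (likely performance):    {unexplained:.1f} SR")
```

Roughly 1.4 of the 6 SR comes from the team SR differential, leaving about 4.6 SR unexplained by it.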

A “neutral” game, with no expected win differential, and no performance modifier, no other funny business, gives +/- 24 SR (at low diamond, probably true here as well). So you gained 2 SR due to performance on your loss, and gained 3 on your win. So, more than none. But making the performance modifier bigger tends to reward stat hunting rather than winning and team play. Again, see the “Performance modifier” section in the original post.

Because you like winning? Because you’ll rank up if you win more? Because putting effort in makes it more fun and valuable to you?

Thanks. I always reference my source data, but sometimes it is behind a link or two.

@kaawumba I got my NiteOwl alt during this last sale. I haven’t played on it yet (I moved recently, don’t even have a desk right now) but before I relegate it to my “I know I’m playing like crap right now” alt I’d be more than willing to run an experiment or two. Just let me know. So far zero games played in any mode. Honestly, I don’t even think I’ve logged into OW with the acct. I bought it and shut down my computer…

Probably the most useful thing for me would be if you try-hard when doing placements and up until you start gaining/losing ~ 24 SR per game. Maybe 30 games, including placements. Do it all in one season. Also write down the ranks (gold/plat/unranked/etc.) of each opponent and ally. Unfortunately, with private profiles, you can’t really track everyone’s SR reliably.

Essentially, you’d be adding more data to this thread, “Initial Competitive Skill Rating, Decrypted”, and to its Google Sheets spreadsheet of the same name. Though as I’ve said, you can’t track the SR of other players anymore, so I wouldn’t bother with going through profiles like I did.

I don’t care how you level to 25, but let me know what you do.

LFG adds unknown factors (I’m currently either exploiting the LFG system, or finally getting the recognition I deserve, or LFG just uniquely suits my play style, or being lucky, not sure which), so please avoid that for the 30 games, and solo queue only.

Thanks.

One other thing: it’s easiest to record team and enemy SR and ranks by taking screenshots (printscr on windows). Just don’t be doing something on your other monitor when trying to take the screenshot or it won’t work.

Kaawumba, I have 171 games now in my spreadsheet. Could you please update your graph for SR change vs SR differential?

Here you go:

I made the histogram in Matlab instead of Excel this time, for more power and flexibility.

Thank you.
I think we can come up with a recommendation for players now: if you see that the enemy team has lower SR, everyone should play their mains and try harder.

That’s not how debates, forums, or the internet in general work.

I personally think MMR is still HEAVILY stat-based. I’ve learnt to recognize lose streaks right after a “bad stats” performance and win streaks after a good one. In my opinion this is definitely true even if Blizzard denies it, because admitting it would incentivize character swapping to farm stats. I personally focus on stat farming when I realize I’ve been placed in unwinnable matches twice in a row, and this definitely shortens the lose streaks. Someone could say I’m a thrower, but the reality is that I’m only trying to do my best while ignoring team composition. Is it selfish? Yes, but it works, so it’s Blizzard’s fault, not mine, and for sure I won’t stop doing it until the day they stop calling a 500 SR fluctuation “perfectly normal”.
My suggestion is to fill when the team works and to ignore team comp completely when you realize that it doesn’t; in that case, work out which hero would give you the most stats and go for it. Sometimes I even surprisingly carry the game to an unexpected win.

P.S. With this attitude I’ve achieved master rank and drastically reduced the duration of the lose streaks that could eventually have sent me back to the worst place on the entire ladder, low diamond.

P.P.S. Of course MMR is hidden, because Blizzard still doesn’t trust it and because you would easily realize what I just told you.

> If a player’s MMR is wrong and too low…
> …which will cause his MMR to rise over many games played…
> …will then be placed with stronger and stronger opponents (and stronger and stronger allies) until his MMR is correct…

The above is concerning - as it seems like this is the main reason why people get stuck in lower ranks. Especially if you start getting matched to players in the lower rank that are closer or equal to your actual skill. If this happens it is highly possible that you’ll never leave the rank since you’ll win a few and lose a few (around the 50%). Unless the amount of SR you gain for each win is massive and far more than the loss, progress will be zero.
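The "win a few, lose a few" argument can be made concrete with an expected-value calculation. This is a simplified sketch: it ignores performance modifiers and any change in matchmaking as you climb, the function name is hypothetical, and the 24 SR figure is the neutral-game value quoted earlier in the thread. The 55% win rate is just an example input.

```python
def expected_sr_per_game(win_rate: float, gain_per_win: float, loss_per_loss: float) -> float:
    """Average SR change per game for a fixed win rate and fixed SR swings."""
    return win_rate * gain_per_win - (1 - win_rate) * loss_per_loss


# Even matches with symmetric +/- 24 SR: no net progress on average.
even = expected_sr_per_game(0.50, 24, 24)

# Example: a player who is slightly better than the lobby (55% win rate).
slight_edge = expected_sr_per_game(0.55, 24, 24)

print(f"50% win rate: {even:+.1f} SR per game")
print(f"55% win rate: {slight_edge:+.1f} SR per game")
```

With symmetric gains and losses, a 50% win rate averages out to zero, while even a small edge adds up over many games (about +2.4 SR per game at 55%, or roughly 240 SR over 100 games).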

Factor in some bad luck and people fall all the time from proper locations to whole ranks below - and then remain there for the duration.

I’m thinking it makes far more sense to allow players to stomp lower players and send them up the ranks vs leaving them in the ranks and matching them with similar players in that rank. Let the similar players meet up when they all reach the proper rank.

I might be missing something but it seems that change would send smurfs up ricky-tick - and while lower ranks might get a few serious beatings at the beginning of the season it would in theory over time balance. Especially if the win/lose SR at their proper rank drops keeping properly matched people closer together for longer periods.

Nothing more frustrating than being in a lower rank and having to constantly battle and carry a lot of the work for a few games only to start getting matched to better players, and ending up in the win 1 lose 1 back and forth.

On top of that, placements feel like teams are randomly matched and being tracked by individual performance. However, your performance is seriously underrated a lot of times because … a lot of times … the team comps and skills are so out of whack that there is zero teamwork.

shrug… all the math and many measurements and what not… and it really just doesn’t feel like it works… maybe thinking simpler instead is the actual solution?

BTW - great post, great information

Well, you always want your team to try-hard and the other team to goof off. What actually happens varies of course.