SR, MMR, and Role Queue

Kaedi-11739 · August 27, 2019, 5:17am

Might as well ask, do you want to do season 18 for the skill rating guide thread? I know you handed it off with things being dead quiet in the last few seasons, but obviously with changes now…

…it would also help me out as I have something important going on (which I can’t discuss at this time).

Kaawumba-1133 · August 27, 2019, 1:19pm

Sure. Maybe I just needed a few months off. I came back for the replays, and stuck around for the role queue. Also, it needs a full rewrite with the changes, and you’d probably be uncomfortable with all the deletions that need to be made.

Can you say if it is Blizzard/Overwatch related or personal?

ElobaCarcen-2521 · August 27, 2019, 2:09pm

@Kaawumba

Hi, Noxifer here. I know you usually compile info like this in an attempt to combat criticism towards the system. That’s all good and well, but I really gotta ask you:

Can you not see how something like this can be particularly problematic and it would explain the one-sided streaks many players are complaining about?

In WoW they used a 4 digit number, same as SR. That MMR served the purpose of giving you an estimation of where you belong, and they used it for finding enemies only (because there you created teams, so you always played as a full premade).

Here you have a 1 digit number between -3 and +3, it’s not given to you to see, and it’s also used for finding allies (essentially making it into a self fulfilling prophecy). And, especially if it can shift quickly from +3 to -3 after a single match, no wonder why they don’t display it and no wonder why the system is so volatile.

Razihell-11734 · August 27, 2019, 2:37pm

Exactly what is happening in Heroes of the Storm.

You win, your SR rises, then you get matched with potatoes (MMR doesn’t move as fast as SR).

I quit HotS for this reason, and I’m quitting OW for it too, even if 2-2-2 was a really good move. BYE

cash-21865 · August 27, 2019, 2:56pm

He addressed this before. So instead of 1 it is probably more like 1.001203 and the jumps are far smaller.

ElobaCarcen-2521 · August 27, 2019, 3:14pm

Let me share a story from HotS.

A friend of mine, who is really good at the game, reached rank 4 (that was either preseason or season 1, when rank 1 was the highest).
At rank 4, instead of the games becoming more skilled, he would literally play with people who messed around and threw games, so he would swing between rank 5 and 2, unable to reach rank 1. I think he dropped to like rank 6 and got pissed off.

Meanwhile, I was ~rank 30 and was steadily climbing. My friend asked me if I could help him out, so that he knows he’ll have at least one person who can help him.
On the 2nd time we tried, we did manage to pull off a win streak (I don’t know, however many games it was to get him from like rank 5 to rank 1) and he did reach rank 1 and got his mount. Surprisingly, I did pull my weight as rank 30 or lower, with the lowest rank on the enemy team being 11.

We never bothered with that trash of a Comp system again… or not in that game anyways.

ElobaCarcen-2521 · August 27, 2019, 3:35pm

This does not reflect the initial state of the game, where SR swings of ~800 were a common occurrence. It used to be a meme, that one day someone is in Silver, the next day he’s in Plat, so how can SR represent skill in any way.

And while SR swings are not as crazy nowadays, streaks remain very much a thing.
As to why they are happening, I was about to suggest that the personal performance factor is at fault, but then again streaks also happen for people in Master and Grandmaster, where supposedly the personal performance factor doesn’t apply anymore and the system is pure Elo. (I don’t know if that’s true, but it can be tested easily: have two Diamond-or-higher players queue together and see whether they win or lose the exact same amount of points at the end of a match.)
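For reference, a minimal sketch of what a textbook Elo update would predict in that test. Everything here is an assumption for illustration: the function names, the K-factor of 32, the 400-point scale, and the idea of rating a team by its average are standard chess-style Elo conventions, not anything Blizzard has confirmed for Overwatch.

```python
def expected_score(rating, opponent_rating, scale=400.0):
    """Textbook Elo expected score (win probability) for `rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / scale))


def elo_delta(team_rating, enemy_rating, won, k=32.0):
    """Rating change for every member of a team under pure team-average Elo."""
    score = 1.0 if won else 0.0
    return k * (score - expected_score(team_rating, enemy_rating))


# Two grouped players are on the same team, so an update that only looks at
# the team averages hands both of them exactly the same change.
team_avg, enemy_avg = 3200.0, 3150.0  # hypothetical team averages
delta_for_player_a = elo_delta(team_avg, enemy_avg, won=True)
delta_for_player_b = elo_delta(team_avg, enemy_avg, won=True)
print(delta_for_player_a == delta_for_player_b)
```

If two grouped players see different changes after the same match, something beyond this kind of update (for example a per-player performance or uncertainty term) would have to be involved.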

Also, numbers can be translated. They could be showing you what is a numerical representation of what SR you are estimated to be able to reach, just like they did in WoW.

They are not hiding it, because “people will abuse the system”. They are hiding it, because people will point out irregularities, especially for Plat and below with personal performance affecting your MMR and SR gains/losses.

Streaks can be easily explained. If for whatever reason the system decides you underperformed (even if you did what was necessary), you end up on a downward spiral, because the system assumes you can maintain consistent performance under any circumstances, which in a team game is not really the case. What others do or don’t do can and will affect your performance.

I bet you, that if we could see numbers, there will be these cases where a player does everything right, but the system decides he underperformed. Maybe he underperformed when it comes to pure numbers, devoid of the context of the match, but with a replay you can show, that he couldn’t possibly do any better under the circumstances.

Other issues could be related to smurfs artificially raising the bar of what the standard performance should be on certain rating.

I’m telling you, there’s something very fishy with this system. It simply does not behave like what one would expect based on experience with other games that use Elo, TrueSkill, or whatever variation we have here.
The worst part is, Blizzard do not have the financial incentive to improve the system, because there are all these morons willing to buy multiple copies of the game to try and play in a higher league.

ElobaCarcen-2521 · August 27, 2019, 3:53pm

This is my take on a particular quote from Mercer in the Role Queue Update thread. It’s indicative of what I think about the current state of the matchmaking system.

Role Queue Update

This is another way of saying “You’re not as smart as you think you are” or within the context of Overwatch “You’re not as good at the game as you may think you are”.

This is the same level of arrogance displayed by Jay Wilson, who was claiming, that people don’t adequately remember the previous Diablo games, when in reality he was the one remembering things wrong and sometimes he was outright fabricating lies.

It’s also similar to the infamous “You think you do, but you don’t” by J. Allen Brack.

I’m pretty sure that if you guys were to adequately display MMR for each player on a team, as well as how personal performance affects MMR and SR gains/losses at the end of a match, then with our current ability to watch replays, players would be able to spot irregularities in the system that are so stupid that you bringing up the Dunning-Kruger effect would end up just as ironic as the statements mentioned above.

Kaawumba-1133 · August 27, 2019, 5:25pm

This is a common misconception. My goal is primarily to give information about how the system works. I have no problem with criticism if it criticizes the system as it is actually implemented. I criticize the system as well. I get frustrated when people hallucinate how the system works and criticize that hallucination.

MMR is not an integer, and will move a small amount (approximately 0.03 for established players) with each game.

Regarding streaks: Win probability changes slowly with rank because there are so many random factors in each individual match. Unfortunately, it follows from this that frequent and long streaks will occur, and a player’s rank will oscillate widely. Essentially, a player will tend to bounce between the range of where he is nearly guaranteed to win and where he is nearly guaranteed to lose. Historically, the range varied from player to player, but +/- 250 SR was common and +/- 500 was possible. This problem can be analyzed in depth, mathematically (Overwatch Forums). With role queue, and several months of data, streaks should be reduced in frequency and duration, and total SR swing should be less. How much less will have to be seen.
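A toy simulation of that argument, as a minimal sketch. The logistic win curve, the 1000 SR scale, the flat 25 SR per game, and every function name are assumptions chosen for illustration, not Blizzard’s actual model. It gives a player a fixed true skill, lets the win chance change only slowly as displayed SR drifts away from it, and records how far SR wanders and how long the longest streak gets.

```python
import random


def win_probability(current_sr, true_sr, scale=1000.0):
    """Logistic win chance; a large `scale` makes it change slowly with SR."""
    return 1.0 / (1.0 + 10 ** ((current_sr - true_sr) / scale))


def simulate_season(true_sr=2500, games=500, sr_per_game=25, seed=0):
    """Play `games` matches and return (SR history, list of win/loss bools)."""
    rng = random.Random(seed)
    sr = true_sr
    history = [sr]
    results = []
    for _ in range(games):
        won = rng.random() < win_probability(sr, true_sr)
        results.append(won)
        sr += sr_per_game if won else -sr_per_game
        history.append(sr)
    return history, results


def longest_streak(results):
    """Length of the longest run of identical consecutive results."""
    best = run = 0
    previous = None
    for result in results:
        run = run + 1 if result == previous else 1
        previous = result
        best = max(best, run)
    return best


history, results = simulate_season()
print("SR range visited:", min(history), "to", max(history))
print("Longest streak:", longest_streak(results))
```

Changing `scale` is the interesting knob: the flatter the win curve, the further SR can drift before the odds pull it back.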

It behaves exactly as expected from TrueSkill; see “Why Match Quality is Frequently Poor.”

SR is not used in matchmaking. Only MMR is. There is no “you’re doing well, time to carry a sack of potatoes” in Overwatch. I don’t comment on HoTS because I’ve barely played it.

Thorny-11568 · August 27, 2019, 5:40pm

MMR is measured in standard deviations, got it. Still not really useful information, because we don’t know if it’s standard deviations as a modifier of your SR or range, or standard deviations as a modifier to the average player overall.

Measuring it in regards to the average player makes sense in the context of fair matchmaking, by using it to push people toward their SR and setting the goalposts of 0 and 5000 at the highest attained values(or slightly above), you get a system that properly models the bell curve, which SR appears to do. You also get the effect we see where it becomes increasingly hard to gain SR approaching 5000, and to lose it approaching 0, because maintaining performance above or below 3 stdevs from the norm is extremely difficult.
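To put numbers on the 3-standard-deviation point, a small sketch assuming skill is normally distributed (an assumption; the thread only says MMR is expressed in standard deviations). The function name is made up for illustration.

```python
import math


def fraction_below(z):
    """Share of a normal population sitting below z standard deviations."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for z in (-3, -1, 0, 1, 3):
    print(f"{z:+d} stdev: {fraction_below(z):.2%} of players are below")
```

Under that assumption only about 0.13% of players sit above +3, which fits the observation that the extreme ends of the ladder are hard to reach and hard to hold.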

However, in the context of the fixed system suggested, it still makes sense. If you have a measure of a player’s stdev from their rank, it’s easy to pair them with opposite players, match their performance with similar players on the other team, etc, to ensure a more fair and balanced game.

While the devs have said matchmaking is done based on MMR, not SR, that is a single line and not some code we’ve actually been able to examine. It would be perfectly reasonable for them to drop you into a pool(likely containing dozens or hundreds of queued players) based on the SR range the game will be, then match players out of the pool based on MMR. You would still be matching based on MMR, it’s just within the SR range.
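The two-stage idea described above can be sketched as follows. This is purely hypothetical: the class, the function, the 250 SR window, and the 12-player match size are invented for illustration, and nothing here is confirmed behavior of the Overwatch matchmaker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueuedPlayer:
    name: str
    sr: int     # visible skill rating
    mmr: float  # hidden rating, in standard deviations


def build_match(queue, target_sr, target_mmr=0.0, sr_window=250, size=12):
    """Stage 1: keep players inside the SR window.
    Stage 2: of those, take the ones whose MMR is closest to the target.
    Returns None when the pool is too small to fill a match."""
    pool = [p for p in queue if abs(p.sr - target_sr) <= sr_window]
    if len(pool) < size:
        return None
    pool.sort(key=lambda p: abs(p.mmr - target_mmr))
    return pool[:size]
```

A matchmaker built this way would select on MMR, yet SR would still decide who is eligible in the first place, which is the ambiguity being pointed out.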

‘Performance’ is not a metric easily measured in standard deviations. A DPS playing with amazing tanks is going to be doing much more damage, getting many more eliminations, and suffering far fewer deaths than the same exact DPS playing with a potato Roadhog and DPS Sigma. In this way, it’s quite possible that hitting a few one-sided wins would result in your stats appearing to have vastly improved compared to the average player, who is getting one-sided wins, one-sided losses, and a few fair games. Your stats in a one-sided win can easily be 5x or more your stats in a one-sided loss.

If MMR is easy enough to shift in that context, then it would make perfect sense that winning a few stomps results in you being paired with poor players. The resulting stomps will confuse your MMR, and perhaps you end up back in a winning streak.

Again, while I don’t buy into anecdotal evidence, there is a ton of it to be had. Smurfs largely agree that they will always end up paired with enemy smurfs if overperforming. Hundreds if not thousands of players feel they’re being forced into win and loss streaks.

Games in a coordinated stack feel utterly insane compared to normal games at the same SR. While stack matchmaking is weighted to be matched against other stacks, it’s clearly also got a skill factor. When you could still see stacks, my 4 stack would always be put against enemy 4-6 stacks. That’s fair, but they were always coordinated as well, resulting in long and fair games. I would expect that if there were no bias in the internal balancing, we would see the stacks that result from idiots using ‘stay as team’ as well, and it didn’t appear so.

The average ability of an opponent when queueing as a 2050~ sr 4 stack is equal or higher than the average ability of an opponent when queueing at 2700~ sr solo. This is something I do have enough games to be confident is past the observational bias mark. Whether it’s because stacks that don’t work well together break, or because there’s an internal bias, I can’t say.

In the context of what blizzard employees have posted, you can make a convincing case for either way. That said, my disclaimer on this topic still applies. While I personally am inclined to believe this system does exist, there is no proof one way or the other.

Further, if you’re using the potential existence of this system to claim you can’t climb, you are just plain wrong. If you are good enough for the matchmaker to try to balance downwards around you or put enemy smurfs against you, you are raking in performance-based SR like crazy. If you can’t win half your games, or your performance-based SR gains don’t outweigh your losses, any potential argument about being sandbagged falls apart because you’re obviously not doing great in the eyes of the system.

tldr; Maybe the games are handicapped, but if you can’t climb that certainly isn’t why.

Kaawumba-1133 · August 27, 2019, 5:59pm

I don’t really know what you’re trying to say here, but MMR is not “a modifier of your SR range”, nor is it a “modifier to the average player”. It is a value that is used to place people in matches.

MMR is not used to push people anywhere.

If you can’t trust what the developers say about the system, there is not much to discuss. All you can say from looking at match data alone is “Hey, it sorta looks like Microsoft’s TrueSkill, which is well documented”.

Thorny-11568 · August 27, 2019, 6:04pm

Poor word choice, perhaps. But, if MMR is measured in standard deviations, they’re obviously using standard deviations in some way. Thus, MMR should either be ‘standard deviations from your current rank’ or ‘standard deviations from the average player’. It makes absolutely no sense to model a system in units of standard deviation if it is not actually a standard deviation of some sort.

This is objectively false. If your MMR is significantly below your SR, climbing will be very difficult until such time as you can shift it upwards. If your MMR is significantly above your SR, climbing will be very easy until it has time to drop. Your SR gains are clearly impacted by where the matchmaker thinks you should be, particularly at the start of a new season.

The developers intentionally keep everything to short and often cryptic notes. I have not called them liars, simply said that their statement is open to interpretation.

If they limit the pool of potential players based on SR, then match based on MMR, they could say that they match based on MMR, not SR, without telling an outright lie.

Gibberish-11316 · August 27, 2019, 6:07pm

I have been pushing the line that SR and MMR are separate things in response to a number of threads, but I will say that the language Scott used in the role queue update was confusing, specifically:

I am hoping that he really meant to say MMR there and in a few other places in the update.

BTW Kaawumba, did you see this talk? It’s very enlightening:

Kaawumba-1133 · August 27, 2019, 6:15pm

Yeah, Scott pretending MMR doesn’t exist is very frustrating. I think he’s trying to keep it simple for the plebs, but it isn’t really working. At this point, I consider Seagull & Jeff Kaplan: NEW Hero Confirmed! - YouTube to be definitive that SR has no effect on matchmaking.

I have seen the talk. It is good. http://www.moserware.com/2010/03/computing-your-skill.html is also good.

ElobaCarcen-2521 · August 27, 2019, 6:35pm

SR might not have effect on matching (technically it shouldn’t), but it’s also indicative to a point.
A game with 5 people in gold, and 1 DPS in Silver. Can you guess who failed to deliver kills and refused to switch?
No one can convince me, that in the particular example, said DPS had similar MMR as that of the other participants.

IMO this could be the issue. That the system takes one number for performance for said rating and assumes you can fulfill that number under any circumstances.

As a healer, sometimes in a winning game you can pull off insane numbers. Other times you don’t have that opportunity (because the enemy team might tilt, start arguing, run in one by one and don’t do anything). You can’t heal, when your allies don’t lose any health.

Same thing with losses. Sometimes you can do insane numbers, top both healing and damage, despite losing. Other times, you get stomped so hard, you don’t even have the time to perform.

In all of these scenarios, the player is not doing anything differently than what he usually does, yet his performance will be vastly different in terms of numbers, that are devoid of context.

If your MMR is a standard deviation, with the standard itself being a flat number, the game can assume you overperformed or underperformed despite you playing the way you usually do, and as a result, you end up on streaks.

Gibberish-11316 · August 27, 2019, 6:46pm

The best answer I can give is to watch the video, but it’s long (about 1 hour).

Therefore I will cut to the chase.
In your particular example the most likely scenario is that the DPS in question was grouped with a higher MMR player, hence the DPS’s SR and MMR were both lower than yours.

Kaawumba-1133 · August 27, 2019, 6:46pm

It’s this one.

What I meant is that MMR is not used to rig matches. The MMR/SR differential is used to affect SR gains and losses, yes.

What is this, 2016? They got rid of that ages ago.

Certainly things could be better documented. But Blizzard’s statements are usually clear and fit with the matchmaking theory I’ve seen from independent (non-Blizzard) sources. Most of the “interpretations” I’ve seen are people trying to find some excuse for why they have poor teammates other than the obvious: that they themselves are a poor teammate, and they are matched fairly, on average.

Thorny-11568 · August 27, 2019, 7:25pm

Everyone intelligent is aware you believe it to be that one. You obviously are one of the smartest posters here, can you objectively provide any sort of proof beyond saying it is so?

Seems a bit contradictory, but ok.

That is true of many posts, and it’s easy to see from the tone. However, their ulterior motive and general lack of awareness does not indicate that it is or is not the case. Suggesting so is a pretty severe fault in your logic. As I said in my first post, were this truly the case, anyone who belongs higher than they are would still be climbing due to PBSR. It still matters because it severely impacts match quality if that is how matchmaking is done.

Kaawumba-1133 · August 27, 2019, 9:59pm

Here’s the actual quote: https://www.youtube.com/watch?v=bn8-aWPvLwE&t=1h13m30s. You can interpret however you want. However, with sufficient math background, only one explanation is reasonable.

I was referring to your “particularly at the start of a new season”.

I have demonstrated that the problematic behavior can be understood by simulating a conventional, well-designed system. https://www.reddit.com/r/OverwatchUniversity/comments/aatezy/why_match_quality_is_frequently_poor/. By Occam’s Razor, I reject more obscure explanations.

Thorny-11568 · August 28, 2019, 3:04am

I’m not sure the ‘sufficient math background’ has any relevance here besides an attempt to invalidate my opinions via a self-proclaimed superiority. The question is not a mathematical one, but one of linguistics and trust.

Fair enough.

Occam’s Razor is a way of thinking, not empirical evidence. It might mean that it’s likely to be correct, but we both know it does nothing to actually back up that assertion.

My counterpoint: How do you measure standard deviations from the norm in the context of matches against varying opponents? We know MMR is measured in standard deviations. We know MMR is weighted based on wins, losses, and performance. By trying to measure these things against all players when different ranks will have drastically different performance in a non-linear fashion due to playstyles, you introduce mountains of factors that could destroy your data.

For example, more damage is almost always better. However, doing 2000 damage/min in a bush league game is much more likely than doing the same number in a Grandmaster game, where players try to limit ultimate feed and targets out of position are quickly killed. Likewise, achieving high healing numbers is much more difficult when your tanks aren’t constantly taking non-lethal chip damage and are instead focus fired and killed. Given a player in Gold and a player in GM, many of their stats are liable to be comparable, but obviously the GM player’s stats mean more, given they came against GM opponents.

Wins, losses, and performance are all irrelevant until placed in the context of the opponents they were against. Measuring MMR as standard deviations within your range of skill(perhaps done by rank: bronze/silver/gold/etc) allows you to assess whatever variables you determine to be most relevant in the context of the level of gameplay you’re playing at. Thus, as a gamer and software developer myself(likely with less mathematical background than you), I would be inclined to measure in the context of a rank. This allows you to easily compare all performance between peers, whilst removing the bias due to the major game differences at varying levels of play.
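A minimal sketch of that per-rank comparison. The function name and the damage samples are made up for illustration; this only shows what “standard deviations within a rank” would mean arithmetically, not how Blizzard computes anything.

```python
from statistics import mean, pstdev


def z_score_within_rank(value, peer_values):
    """Standard deviations between `value` and the mean of same-rank peers."""
    mu = mean(peer_values)
    sigma = pstdev(peer_values)
    if sigma == 0:
        return 0.0  # no spread among peers, so nothing to compare against
    return (value - mu) / sigma


# Made-up damage-per-minute samples for two ranks.
damage_per_min = {
    "gold": [1400, 1800, 2000, 2200, 2600],
    "grandmaster": [900, 1100, 1300, 1500, 1700],
}

for rank, peers in damage_per_min.items():
    print(rank, round(z_score_within_rank(2000, peers), 2))
```

With these invented samples, the same 2000 damage/min is exactly average among the gold peers but well above the grandmaster peers, which is the bias a single global baseline would hide.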

If I were tasked with creating a system that achieved games as close as possible, I would want to identify overperformers and underperformers and ensure that they were distributed as evenly as possible within the provided constraints. Keep in mind that at its core, Blizzard is a company interested in making money, and similar tactics have been employed by many game companies in many ways.

All I’m getting from your recent posts is that you are unable to substantiate anything beyond that single line from Jeff. It’s a bit disappointing, since the response to all of these topics is ‘You’re crazy, go listen to Kaawumba!’.

So, allow me to ask you straight out. I acknowledge that any evidence in favor of weighted matchmaking is largely anecdotal and unsubstantiated. I acknowledge that Jeff has claimed they do not look at SR for matchmaking. Do you have anything else to offer me besides subjective viewpoints and ad-hominem? A shred of real evidence that the matchmaking is how you say it is? From what I can see, there is no objective proof that they DO or DO NOT ‘handicap’ matches, simply evidence to the possibility of either.