The effect of skill, quantified.

There’s so much more going on than that, though, right?

All the parts can move independently. The deck lists are different, the opponent meta is different, and the pilots may or may not be different (people now in Legend went through your D4-1 bracket and are included in that data to a greater or lesser extent, too).

If all the parts can move, picking one part and giving it credit without evidence is guessing.

No, but there is every reason to believe control priest is better in a smaller meta, where it can tailor its deck choices to win one or two prevalent matchups, than in the wider D4-1 meta. Or are you going to dispute this, too? I’m sure honest control priest players will tell you this without hesitation.

If the deck is different and the meta is different, then skill isn’t comparable across different decks and metas… thus the arguments fall flat.

They are 100% not good enough to talk about skill for any single deck. No. Not at all. None.

If you wanted to do a viable, rough estimate of skill changes from one point to the next, the only thing that would work from this data is to look at the whole set of points, not individual decks.

You know the population mean is 50% because the game is binary… for every win there is a loss somewhere. Knowing this, the distance from this mean becomes meaningful in aggregate (not at the single-deck level, but measured across the whole domain).

I would guess the total variance (S²) for D4-1 would be larger than the same statistic for top 1k Legend, which would suggest the spread between best and worst is smaller at top 1k, meaning your opponents there are more skilled as a group.

What would this do other than prove what we already know? Nothing. But it would at least be a reasonable approach that factors the faults of the data sets into the calculation, making it a somewhat valid rough estimate of the difference.

If you then wanted to look at differences in standard terms between metas, that would be more appropriate but still not great.

If you don’t understand why comparisons across different populations are made in standard terms, you should look that up.
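To make that concrete, here is a rough Python sketch of both ideas (variance around the known 50% mean, then standardized distances). The winrate lists are invented purely for illustration, not real ladder data:

```python
# Hypothetical per-deck winrates (%) in each bracket -- invented numbers.
d4_1_winrates  = [43.0, 46.5, 48.0, 50.0, 52.5, 55.0, 58.0]
top1k_winrates = [46.0, 48.0, 49.5, 50.0, 50.5, 52.0, 54.0]

# The population mean is 50% by construction (every win is someone's loss),
# so measure spread around 50 rather than around the sample mean.
def s_squared(winrates):
    return sum((w - 50.0) ** 2 for w in winrates) / (len(winrates) - 1)

s2_d4, s2_t1k = s_squared(d4_1_winrates), s_squared(top1k_winrates)
print(f"S^2 at D4-1:   {s2_d4:.2f}")   # larger spread expected here
print(f"S^2 at top 1k: {s2_t1k:.2f}")  # tighter spread: more uniform skill

# Comparing a deck across metas is done in standard units: distance from
# the 50% mean divided by that bracket's own standard deviation.
z_d4  = (55.0 - 50.0) / s2_d4 ** 0.5
z_t1k = (52.0 - 50.0) / s2_t1k ** 0.5
print(f"z at D4-1: {z_d4:.2f}, z at top 1k: {z_t1k:.2f}")
```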


No, that’s always been true of control priest. It has so many situational tools that it’s very easy to build it to hard counter basically any one deck. It has always been harder to make control priest good in a wide meta.

That said, what scrotie did in the original analysis (applying the diamond win rates to the meta you find in high legend) is a pretty good way to isolate that out. Priest isn’t seeing the kind of gains it is JUST because of the narrower meta; the meta only accounts for a small portion of control priest’s observed gains. Sometimes the meta explains a larger portion of the gains (control warrior).

More specifically, what scrotie’s original analysis measured was how much of the change in win rate isn’t well explained by the narrower meta.

The assumption is that it is primarily skill that’s driving the difference, because there isn’t much else.
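To sketch what that isolation step looks like: a minimal Python version, assuming invented matchup winrates, popularities, and an invented observed legend winrate (none of these are real numbers):

```python
# Hypothetical matchup winrates (%) for one deck, measured at D4-1.
diamond_matchup_wr = {"Mech Rogue": 58.0, "Enrage Warrior": 44.0, "Rainbow Mage": 51.0}

# Hypothetical opponent popularity in each bracket (fractions summing to 1).
diamond_popularity = {"Mech Rogue": 0.40, "Enrage Warrior": 0.35, "Rainbow Mage": 0.25}
legend_popularity  = {"Mech Rogue": 0.15, "Enrage Warrior": 0.45, "Rainbow Mage": 0.40}

def weighted_wr(matchup_wr, popularity):
    # Overall winrate = sum over opponents of matchup winrate x popularity.
    return sum(matchup_wr[d] * popularity[d] for d in matchup_wr)

wr_diamond   = weighted_wr(diamond_matchup_wr, diamond_popularity)
wr_meta_only = weighted_wr(diamond_matchup_wr, legend_popularity)  # meta shift alone
wr_legend    = 53.5  # hypothetical winrate actually observed at top legend

print(f"winrate at diamond:             {wr_diamond:.1f}%")
print(f"expected at legend (meta only): {wr_meta_only:.1f}%")
# Whatever the meta shift fails to explain is the residual the argument
# attributes (mostly) to pilot skill.
print(f"unexplained residual:           {wr_legend - wr_meta_only:+.1f} points")
```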

The data we have obviously isn’t perfect, for a lot of reasons (it’s opt-in from players and requires running a deck tracker, which misses the entirety of mobile play, etc.).

Any conclusions drawn from what we have will be similarly lower in confidence, but I’d still generally agree with the statement that some decks improve in power with player skill (and some decrease), and that this kind of analysis is a decent way of identifying those decks, even if the exact value of the gains/losses is a bit off due to weaker data sets.


If Neon prefers, we can label the two factors meta and “not meta.” Or meta and “the matchup winrate factor.” I don’t care if he refuses to label the cause of matchup winrate differences as something known; he could label it as unknown, just so long as he freaking labels it somehow. I’m not pointing to something that doesn’t exist here.

And he can’t in good conscience say that I’m not isolating the effect of the meta here.

I’m glad to see you’ve sharpened your pencil a little.

You can say that skill is represented in these numbers, but it’s not isolated. Which means there is some probabilistic value in using this analysis to decide which decks to select if you want to play a high-skill deck. Foolproof? No.


But it isn’t. It doesn’t do that. It’s an abuse of what this data is to try to extrapolate to another group, because that’s prediction, which we have all agreed this data doesn’t support.

The change is axiomatic. You’ve conceded as much several times.

It isn’t though. It’s an artificial transformation that doesn’t really work.

Again, I mentioned to you that you need to measure different populations in standard units and there are really strong theoretical underpinnings for this that are beyond the scope of this analysis.

No, this is YOUR assumption because you have a vested interest in this being correct or your entire time here has been wasted.

There are a stack of very good alternatives that you can’t just handwave away no matter how inconvenient they are to you.

I can agree with this and still disagree with the rest of the sentence.

The analysis is garbage.

So we have no confidence other than knowing it’s in there somewhere; we have no idea exactly how much.

I can say there is more skilled play in the T1K group, but I don’t think these numbers are a reliable indicator of literally anything beyond statistical artifact. None.

They are no more accurate than a child’s cardboard box is a race car: it may have the paint and the shape, but it won’t get you anywhere.

And furthermore, I’m sad that none of them even comprehends how sample variance would be a better indicator of skill change between metas and would open up better analysis across metas.

There really just isn’t a long list of other things that it could be.

We can analyze how much the meta impacts the win rates. That’s easy. You can see how a deck performs against others on average across all of the brackets.

You can see trends in that as the top legend brackets are approached, if you want to see whether it’s a continuation of a trend or an outlier.

This is all data we have pretty easy access to (although HSR puts it behind a paywall). It’s not perfect data, but it’s not particularly bad data either.

The second major pillar is RNG effects, but that can be evaluated by looking at week-over-week data for specific matchups when dealing with the lower sample sizes of high legend. If in-game RNG were driving it, you’d see noticeable shifts in matchups from week to week; seeing that these don’t change hugely from week to week lowers the risk of it being other kinds of noise.
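A rough sketch of that sanity check, with invented weekly numbers and an arbitrary stability threshold:

```python
# Hypothetical weekly winrates (%) for one matchup at top legend.
weekly_wr = [57.0, 55.5, 58.0, 56.5]

# Week-over-week shifts: big swings would point at RNG/small-sample noise.
shifts = [abs(b - a) for a, b in zip(weekly_wr, weekly_wr[1:])]
print(f"week-over-week shifts: {shifts}")

# Illustrative threshold only -- the point is stability, not the exact cutoff.
if max(shifts) < 3.0:
    print("matchup is stable week to week; noise/RNG is a less likely driver")
```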

That leaves everything else that we can’t directly evaluate, but the rest of it basically comes down to things you could reasonably call “skill,” as it’s how the decks are being piloted.

You don’t need perfect numbers to learn useful information. Even if you don’t want to call it skill, it doesn’t really change that at higher ranks, things like arcane hunter are going to do noticeably worse, even against decks it farmed on the way up there, and that’s not just a meta effect. I’m not going to get hung up on the label, and I’m not going to just go “oops, these numbers mean nothing.”


The artificial transformation is literally a meta with diamond matchup winrates and top Legend deck popularity. Even if you say “matchup winrate isn’t skill,” how on earth can you say that the difference doesn’t show that the meta fails to fully explain what happens in Legend? Is deck popularity no longer a valid measurement of deck popularity?

All you are is someone who’s decided that anything I ever do will be garbage.


FYI, I am subscribed to HSreplay, in case anyone wants specific data, like popularity. This is a very interesting debate. Very educational. :blush:

Even if the list isn’t long, you have no way of saying which factor is most responsible for a given deck… the amount of cause each piece contributes changes without any consistency, meaning the entire analysis is just crap.


I was using Vicious Syndicate. I’m not a fan of how HSR does things, except for the few things HSR does that VS doesn’t, and even then I use it only because I don’t have an alternative.

Bruh.

In Excel-speak, overall winrate = SUMPRODUCT(matchup winrate, opponent deck popularity).
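Or the same formula as a rough Python sketch, with invented numbers:

```python
# Matchup winrates (%) and opponent popularity (fractions summing to 1).
matchup_wr = {"Mech Rogue": 58.0, "Enrage Warrior": 44.0, "Rainbow Mage": 51.0}
popularity = {"Mech Rogue": 0.40, "Enrage Warrior": 0.35, "Rainbow Mage": 0.25}

# SUMPRODUCT: overall winrate is the popularity-weighted sum of matchup winrates.
overall_wr = sum(matchup_wr[d] * popularity[d] for d in matchup_wr)
print(f"overall winrate: {overall_wr:.1f}%")
```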

This isn’t my big idea, it’s literally the founding principle of Vicious Syndicate. Go ask them if you don’t believe me. If it’s wrong, every winrate they’ve ever published is wrong. Every archetype tier list for every rank, every week.


I tend to agree. Just wanted to throw that out there.

You can though…

Meta forces can be compared by looking at deck popularities and matchup data. That’s what scrotie did in the original post. Sometimes that explains the changes well (control warrior); other times it doesn’t (control priest).

Noise/RNG is removed with large sample sizes. For smaller ones, you collect the data over longer periods of time instead of a single-week snapshot, but even in small data sets like top legend we don’t often see decks jumping from one tier to another without clear explanations.

The third umbrella, skill-based effects, is the one we can’t directly measure, but we can get a rough estimate of the first two, so we don’t exactly need to. It is whatever is left over.

There are definitely ways that scrotie’s analysis can be improved (mostly by doing this over several weeks of data to reduce noise), but the basis of his idea is sound.
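For instance, a minimal sketch of that multi-week version, assuming invented weekly residuals (observed winrate minus the meta-only expectation):

```python
import statistics

# Hypothetical weekly residuals for one deck: observed legend winrate minus
# the meta-only expectation, one value per VS report week.
weekly_residuals = [2.1, 1.7, 2.6, 1.9]

mean_res = statistics.mean(weekly_residuals)
sd_res = statistics.stdev(weekly_residuals)

# A residual that stays positive across several weeks is much harder to
# write off as noise than a single-week snapshot.
print(f"mean residual: {mean_res:+.2f} points (sd {sd_res:.2f})")
```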


Neon almost certainly has me on ignore, but here’s what he needs to understand…

Vicious Syndicate does NOT simply look at every game that Rainbow Mage plays at top Legend, count the total games, count the wins, and report winrate as wins divided by total. Repeat: they do NOT do this (HSR does, though). They are proud of not doing this. They consider it prone to sampling errors, and so do I.

What they do instead is a three-phase process.

  1. They look at every game that Rainbow Mage plays against Mech Rogue (with Rainbow Mage as either the recorder or the recorder’s opponent). They count the total of those games, divide wins by total, and get a matchup winrate. They repeat this process for all of Rainbow Mage’s other matchups.

  2. They count the number of times their recorders queue up AGAINST a particular archetype, regardless of what the recorder is playing. (This is opponent-only data.) They use these numbers to determine deck popularity.

  3. Finally, they calculate overall winrate for Rainbow Mage to be equal to matchup winrate (Rainbow Mage vs Mech Rogue) × popularity (Mech Rogue) + matchup winrate (Rainbow Mage vs Enrage Warrior) × popularity (Enrage Warrior) + … and so on, for every opponent deck archetype. Like the SUMPRODUCT function in Excel.

THIS IS HOW THE SAUSAGE IS MADE. Under the rather explicit premise that winrate CAN be broken down into components, that those two components are matchup winrates and deck popularity, and that the relationship between them and winrate is a defined arithmetical formula. As I said all the way back in part 2 of the opening post, over a hundred posts ago, this is NOT my idea. This is how the INDUSTRY STANDARD does statistics. It is NOT supposed to be controversial.
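For anyone who wants the recipe end to end, here is a rough Python sketch of those three phases over a toy game log; every record in it is invented:

```python
from collections import Counter

# Toy game log: (recorder_deck, opponent_deck, recorder_won) -- invented data.
games = [
    ("Rainbow Mage", "Mech Rogue", True),
    ("Rainbow Mage", "Mech Rogue", False),
    ("Rainbow Mage", "Enrage Warrior", True),
    ("Enrage Warrior", "Rainbow Mage", False),  # a Rainbow Mage win
    ("Mech Rogue", "Rainbow Mage", True),       # a Rainbow Mage loss
]

# Phase 1: matchup winrates for Rainbow Mage, counting games from either side.
wins, totals = Counter(), Counter()
for recorder, opponent, recorder_won in games:
    if recorder == "Rainbow Mage":
        totals[opponent] += 1
        wins[opponent] += recorder_won
    elif opponent == "Rainbow Mage":
        totals[recorder] += 1
        wins[recorder] += not recorder_won
matchup_wr = {deck: wins[deck] / totals[deck] for deck in totals}

# Phase 2: popularity from opponent-only counts, whatever the recorder plays.
opponent_counts = Counter(opponent for _, opponent, _ in games)
total_opponents = sum(opponent_counts.values())
popularity = {deck: n / total_opponents for deck, n in opponent_counts.items()}

# Phase 3: overall winrate = SUMPRODUCT(matchup winrate, popularity).
# (Mirror matches are omitted from matchup_wr in this toy example.)
overall = sum(matchup_wr[deck] * popularity.get(deck, 0.0) for deck in matchup_wr)
print(f"Rainbow Mage overall winrate: {overall:.1%}")
```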


As an OG in jail would say: It’s time to make the donuts! :laughing:

Ew



There really isn’t.

That analysis yields useless gibberish. If you can’t reliably identify exactly which factor, and in what proportion, you have nothing.

To someone who doesn’t understand the why of statistics, it seems that way, for sure. It is sufficiently technical as to be unapproachable for most people, which lends it a false air of credibility, and that’s what it’s trading on here.

This discussion is less about the math and more about the theory of why that math doesn’t do what you and others believe it does. Understanding the math behind, for example, the sum of squares, is a separate issue from understanding what it measures, why it is important, how it is used, and what factors make it change. Someone could do the calculations properly, but if the underlying data set failed certain conditions the outcome would have no meaning. That’s what we have here.

The OP is what the Dunning-Kruger effect looks like in numbers.

It’s been a long day - I built a new toy shed today so I’m tired and sore. I need to run the electrical tomorrow, so I’m going to hit the rack.

I’m also done with this topic. You folks glad-hand each other and be wrong all you like.


Then if I have nothing, VS also has nothing. I am literally using their formula for winrate.

Suspicious timing. Good riddance though.

To help us ignorant laymen better see through this illusion, consider taking the data (despite its scarcity) and correcting the OP’s math / interpretation. :blush:

I will say one thing in advance: there will be some variation. It’s not as if the Diamond 4-1 population or the top 1000 Legend remains static throughout a month. At the beginning of the month, Diamond is higher skill, as it includes more people who will end the month at Legend; the top 1000 stays pretty even, at least once it fills up (it’s more elite when it’s only top 400 because there are only 400 Legends). Kind of like how the distance from the Earth to the sun is “1 AU” but actually the distance fluctuates over the course of a year, the “distance” between D4-1 and T1KL fluctuates over the course of a month.

If we ever get a meta that lasts for an entire month, then we can compare similar times of months, but 5-6 VS reports between balance patches isn’t exactly common. It’ll probably happen eventually with patience, but… could be a while.


Prepare to spend a lot of your life being sad. There are countless things to learn, and expertise is a narrow, narrow thing.

You have the same amount of proof that you demand. Either you can draw conclusions or you can’t. Personally I think it’s very low probability that the value of “skill” is zero but you’re welcome to your guess :slight_smile:
