Scaling and/or dataset for game balance blog post

I have been widely criticized for questioning the accuracy of the estimates Blizzard gave in their game balance post for their modeled/scaled 5K paragon player. I fully realize that Blizzard has more data than I do; however, at least two things in the data they reported do not make logical sense to me.

5K Paragon “Average” Player       | Barb | Crusader | DH  | Monk | Necro | WD  | Wizard
Non-season/Era 12                 | 130  | 138      | 125 | 130  | 123   | 130 | 130
Season 19                         | 135  | 136      | 124 | 134  | 118   | 120 | 130
Benefit of Pandemonium Buff (GRs) | +5   | -2       | -1  | +4   | -5    | -10 | +0

Logically, it is hard to imagine how the Pandemonium buff would reduce the clear potential of the “average” 5K witch doctor (after their data scaling/transformation) by 10 GRs in season relative to non-season, which is functionally equivalent to roughly a 5X decrease in DPS, in the data they analyzed in early December. This leads me to believe that something is off in their scaling methods and/or that the analyzed dataset is not informative enough to draw accurate conclusions.
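For anyone wondering where the "roughly 5X" figure comes from, here is a rough sketch of the arithmetic. It assumes that each GR tier multiplies monster health by about 1.17 (my assumption, not something stated in Blizzard's post), and the function name is just for illustration.

```python
# Assumption: each Greater Rift tier multiplies monster health by about 1.17.
HEALTH_GROWTH_PER_GR = 1.17


def dps_factor(gr_difference, growth=HEALTH_GROWTH_PER_GR):
    """Effective DPS multiplier implied by a gap of `gr_difference` GR tiers."""
    return growth ** gr_difference


# Season-minus-non-season gaps, taken from the table above.
gaps = {"Barb": 5, "Crusader": -2, "DH": -1, "Monk": 4,
        "Necro": -5, "WD": -10, "Wizard": 0}

for cls, gap in gaps.items():
    print(f"{cls:9s} {gap:+3d} GRs -> x{dps_factor(gap):.2f} effective DPS")
```

Under that assumption, a 10 GR gap works out to a factor a little under 5, which is why a -10 for WD looks so implausible as the effect of a buff.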

Note: The difference for necromancers between season and non-season is now rather dramatic; however, it is important to note that the seasonal GR 150 clears in S19 generally occurred more recently.

In addition to this issue, one can use the worldwide leaderboard to look at actual GR clears for 5K paragon players and check whether the model makes sense (even if you restrict the analysis to the top 200 of the worldwide leaderboard, across all 4 regions combined). I have done this for WD using both the final era 11 and the current era 12 leaderboards. Comparing the actual data (4-6K paragon players in the top 200, i.e. the superstars) with the model highlights a discrepancy.

Or are there just not enough WDs playing this season compared with the number playing in non-season?

Have you considered that people are better geared in non-season, and that people in season and non-season may behave differently?
More people might have jumped into non-season before the season started, on one of their old characters, and made a few fast high GRift runs to try out a new OP build. Then they started the season and have been slowly playing through it, without yet having reached that same point of pushing GRifts.

I would certainly question Blizzard’s conclusions here. I can imagine the variation in the data is much larger than what they are trying to measure (differences between classes with an accuracy of 1 GR). Maybe instead of Barb 130 NS, 135 S, it is really Barb 125-135 NS and Barb 125-145 S within a 95% confidence interval or whatever.
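To show what I mean by the intervals mattering more than the point estimates, here is a minimal sketch. The clear levels are invented purely for illustration (they are NOT real leaderboard data), and the interval uses a simple normal approximation, which is crude for samples this small.

```python
import math
import statistics


def mean_ci(samples, z=1.96):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    m = statistics.mean(samples)
    se = statistics.stdev(samples) / math.sqrt(len(samples))
    return m - z * se, m + z * se


def intervals_overlap(a, b):
    """True if the two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Invented clear levels, for illustration only (hypothetical, not real data).
ns_clears = [122, 126, 130, 131, 134, 137]  # mean 130
s_clears = [120, 128, 133, 136, 141, 152]   # mean 135

ns_ci = mean_ci(ns_clears)
s_ci = mean_ci(s_clears)
print("NS:", ns_ci, "S:", s_ci, "overlap:", intervals_overlap(ns_ci, s_ci))
```

With spreads like these, the two intervals overlap even though the averages are 5 GRs apart, so the "+5" would not tell you much on its own.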

They have access to all the data, but the players who pushed GRifts in December 2019 do not represent the whole population of people who might do so.
As Blizzard also notes, some classes might suffer from having too few people playing them currently (because people know that they underperform), making them look weaker in the data than they really are.
But one could also imagine the opposite effect. If crusader is hyped as the OP class in a season, more people jump into the game and pick a crusader. That might mean more ‘lower skill/dedication players’ join the pool of crusaders, “dragging down” the average GRift clearing ability of the class in the data set.
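The “dragging down” effect is just a weighted average. A tiny sketch with made-up numbers (the player counts and averages below are hypothetical, only there to show the mechanism):

```python
def pooled_average(groups):
    """Average clear over several groups given as (player_count, average_clear)."""
    total_players = sum(count for count, _ in groups)
    return sum(count * avg for count, avg in groups) / total_players


# Hypothetical: 100 dedicated players averaging GR 138,
# then 300 more casual players averaging GR 130 pick up the hyped class.
before = pooled_average([(100, 138)])
after = pooled_average([(100, 138), (300, 130)])
print(before, after)
```

The class itself did not get any weaker, yet its average in the data set drops by several GRs simply because the player mix changed.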

I have no idea if any of the above has any merit. My point is mostly that the uncertainty in the data could easily be big enough that the numbers can be correct without meaning that anyone’s GRift clearing potential decreases due to the season buff (which, as you say, is of course not the case).

So I agree that the data probably is not very useful to draw any conclusions from. But then I also have to assume Blizzard isn’t drawing any conclusions solely based on those two data sets.

Looking only at the top 200 players on the leaderboards addresses some of the issues of looking at all the data, but also introduces other issues. So using both is likely not a bad idea.

I have considered this and other possibilities. That is why I also mentioned looking at how well their non-season model 5K data compares to real data in non-season. Using the current era 12 worldwide leaderboard, one would have to use the upper echelon of WD GR clears to match the model. Specifically, there are only 18 WDs in era 12 in the entire world with 6K or less paragon who have cleared GR 130 (or above). One could argue that there just are not many 4-6K paragon WDs on the leaderboard, but the numbers on a per-region basis are much higher. I need to double check the precise numbers.
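For anyone who wants to repeat the count, the check amounts to a simple filter over the leaderboard entries. This is only a sketch: the record layout (`paragon`, `gr`) and the sample rows are hypothetical placeholders, not the real leaderboard format or real entries.

```python
def count_clears(entries, min_gr=130, max_paragon=6000):
    """Count entries that cleared at least `min_gr` with at most `max_paragon` paragon."""
    return sum(
        1 for e in entries
        if e["gr"] >= min_gr and e["paragon"] <= max_paragon
    )


# Placeholder rows for illustration only (hypothetical, not real players).
sample_entries = [
    {"paragon": 5200, "gr": 131},
    {"paragon": 9800, "gr": 140},  # excluded: too much paragon
    {"paragon": 4700, "gr": 127},  # excluded: clear too low
    {"paragon": 6000, "gr": 130},
]

print(count_clears(sample_entries))
```

Running the same filter per region and for the combined worldwide list is what lets you compare the real number of qualifying clears against what the model 5K player is supposed to achieve.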

Also if popularity affects their metrics dramatically, it raises doubts about how they corrected for this, as we know that certain classes are being played preferentially in the current era.

You could imagine that it could also have the opposite effect. Many of the top players want to play the class with the top GR potential. These players would flock to perceived OP classes. As a result casual players would be less likely to earn a spot on the leaderboard and at least relative to the leaderboard, things would be “higher” skill…