In the discussions around the relative performance of Ret and Fury, one idea has come up that has merit on the surface.
The idea is to look at parses in the 99th and 100th percentiles, on the theory that these capture the best players with the best gear and therefore show how a class scales with gear.
This has been applied to comparisons between Ret and Fury: at the 95th percentile and lower Ret outperforms Fury, but Fury retakes the lead at the 99th and 100th percentiles.
Example shown here:
100th
https://classic.warcraftlogs.com/zone/statistics/1017#dataset=100&sample=7
99th
https://classic.warcraftlogs.com/zone/statistics/1017#dataset=99&sample=7
95th
https://classic.warcraftlogs.com/zone/statistics/1017#dataset=95&sample=7
However, I think this analysis is flawed because it doesn’t account for classes with higher, gear-independent RNG peaks in damage, meaning classes like Fire Mage and Fury whose damage leans on abilities with a set percentage chance to proc. The 99th and 100th percentiles will always select for the lucky proc runs, which gives these classes much better results there than in the lower percentile ranges. They have a higher peak, and the 99th and 100th percentile parses capture that peak. These parses should be treated as outliers, not as indicative of class scaling.
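To make the selection effect concrete, here is a minimal toy simulation. Everything in it is an assumption for illustration: the two specs ("steady" and "swingy"), their average DPS and spread, and the choice of a normal distribution are made up and not taken from any log. Gear is held fixed, so the only thing separating one parse from another is luck.

```python
import math
import random

def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list (p from 1 to 100)."""
    rank = math.ceil(p / 100 * len(sorted_values))
    return sorted_values[max(rank, 1) - 1]

def simulate_parses(mean_dps, spread, n, rng):
    """Draw n made-up parses from a normal distribution and sort them."""
    return sorted(rng.gauss(mean_dps, spread) for _ in range(n))

def percentile_table(n=50_000, seed=0):
    rng = random.Random(seed)
    # Hypothetical numbers: every parse of a spec uses the same gear,
    # so the spread comes from RNG alone.
    steady = simulate_parses(5400, 100, n, rng)  # higher average, low variance
    swingy = simulate_parses(5000, 300, n, rng)  # lower average, high variance
    return {p: (percentile(steady, p), percentile(swingy, p))
            for p in (50, 95, 99, 100)}

for p, (steady, swingy) in percentile_table().items():
    leader = "steady" if steady > swingy else "swingy"
    print(f"{p:>3}th: steady={steady:.0f} swingy={swingy:.0f} leader={leader}")
```

With these made-up parameters the steady spec should lead at the 50th and 95th percentiles and the swingy spec should take over at the 99th and 100th, even though nobody’s gear changed. That is the pattern I am arguing the top percentiles can produce on their own.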
How do I prove this? Isn’t it just one conjecture versus another?
Currently, yes. However, let’s set up a hypothesis to test the theory that Fury just naturally has higher peaks.
Hypothesis:
If this is true, and Fury’s much better performance at the 99th and 100th percentiles comes from higher peaks rather than superior gear in that bracket, then we should see the same thing in end-of-phase Nax, where pretty much everyone has phase-max gear.
Let’s see if that is the case:
As expected, at the 95th percentile Fury comes in at 17th:
https://classic.warcraftlogs.com/zone/statistics/1015/#region=1&dataset=95
At the 99th percentile Fury is still at 17th (though there is a sharp jump at this point):
https://classic.warcraftlogs.com/zone/statistics/1015/#region=1&dataset=99
And at the 100th percentile Fury jumps up to 15th:
https://classic.warcraftlogs.com/zone/statistics/1015/#region=1&dataset=100
Sample sizes in the Nax data are much larger at every percentile than what we have for one week of Ulduar (covering the Ret buff period). Even so, there is a clear bump in Fury’s performance relative to other classes at the 99th and 100th percentiles, with an actual rank change at the 100th percentile.
This provides clear evidence that Fury has an intrinsically higher RNG damage ceiling than many other classes, independent of gear scaling. This week’s Ulduar parses fit the same pattern as end-of-phase Nax, so they don’t need better gear scaling, as some have conjectured, to explain them.
Conclusion:
Results confirm Fury has higher RNG peaks and that the higher percentiles are not sufficient data points to provide evidence for Fury’s superior scaling with gear.
Note - This data doesn’t do anything to disprove the conjecture that Fury scales better with gear; it only provides evidence that the 99th and 100th percentile parses can’t be used to accurately show scaling performance. I also think the use of 99th and 100th percentile data to approximate scaling has been done in good faith, but it is a fundamentally flawed approach.
Further analysis can be done to bolster or weaken this case by analyzing the number of parses for each gear bracket in each of the percentiles to determine if there actually is a significant gear difference between the 95ers and the 99ers.
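A rough sketch of what that follow-up could look like, assuming you have already pulled a list of (percentile, gear bracket) pairs for one spec by whatever means. The input format, the band cutoffs, and the bracket labels are all hypothetical; this does not call any Warcraft Logs API.

```python
from collections import Counter, defaultdict

def count_by_gear_bracket(parses):
    """Count parses per gear bracket within two percentile bands.

    `parses` is an iterable of (percentile, gear_bracket) pairs; this
    shape is an assumption about how the data was exported.
    """
    counts = defaultdict(Counter)
    for percentile, gear_bracket in parses:
        if percentile >= 99:
            band = "99+"
        elif percentile >= 95:
            band = "95-98"
        else:
            continue  # lower percentiles are not part of this comparison
        counts[band][gear_bracket] += 1
    return counts

def bracket_shares(counter):
    """Turn raw counts for one band into the fraction in each bracket."""
    total = sum(counter.values())
    return {bracket: n / total for bracket, n in counter.items()}

# Placeholder rows only, not real parses.
example = [(99, "bracket A"), (100, "bracket A"), (95, "bracket A"), (96, "bracket B")]
for band, counter in count_by_gear_bracket(example).items():
    print(band, bracket_shares(counter))
```

If the bracket shares in the two bands come out about the same on real data, gear can’t be what separates the 95ers from the 99ers. If the 99+ band is heavily skewed toward the top bracket, that would weaken my case.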
Tldr:
Selecting only from the 99th and 100th percentiles is selecting from outliers and is a misuse of the data set. There is strong selection bias for good RNG in the 99th and 100th percentiles.