The parses themselves are inherently inaccurate and, hilariously, haven’t been subjected to normal data science scrutiny. The data itself is not accurate and is producing poor predictive results. I want to see the data pipeline used in those processes, or at least some semblance of transparency. As the chief data evangelist at Looker said, “When you know you don’t know, you’re cautious. But if bad data fools you into thinking you do know something, you’re liable to charge ahead based on that (false) knowledge. And that’s where the real danger is.”
The value itself is based on a percentile parse across all (maybe?) runs. Meaning, “When someone says they got a rank one or even a 96 percentile parse or something like that what they mean is that compared to all players of a similar class and specialization in a specific encounter and on a specific difficulty they performed that well compared to other players”
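To make that definition concrete, here is a minimal sketch of a percentile computed inside one bracket (same class, spec, encounter and difficulty). The function name, the "runs strictly beaten" convention and the sample numbers are all my own assumptions for illustration; I have not seen the actual formula the site uses.

```python
from bisect import bisect_left


def percentile_parse(player_dps, bracket_dps):
    """Percent of runs in the bracket that the player strictly beat.

    Assumption: a parse is a plain percentile rank within the bracket.
    The real site may handle ties or sampling differently.
    """
    if not bracket_dps:
        raise ValueError("bracket is empty")
    ranked = sorted(bracket_dps)
    beaten = bisect_left(ranked, player_dps)  # count of runs below the player
    return 100.0 * beaten / len(ranked)


# Hypothetical DPS values for one bracket, made up for this example.
sample_bracket = [9800, 11200, 12500, 13100, 14000,
                  14900, 15600, 16800, 17500, 19000]

print(percentile_parse(15000, sample_bracket))
```

Notice that the only input is DPS within the bracket. Nothing about gear quality enters the calculation at all.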
Now the raw value on the left is more or less meaningless because it’s comparing irrespective of gear, so you tend to look at the ilevel normalization. The problem with ilevel normalization is that it doesn’t take into account the QUALITY of your gear. Not all 375 level items are equal. Having a socket on your item may as well add another 50 item levels depending on the slot. As an example, my 370 weapon (+41 crit, +74 haste = htt.ps://wowhead.com/item=162017) is worlds worse than this dagger, which is also a 370 (+46 crit, +69 haste = h.ttps://wowhead.com/item=159131), because that dagger has a +40 haste socket.
To be analytical here, let’s take the combined value so we can draw comparisons. If you add the crit and haste values on each 370 dagger, both come to a +115 stat boost (that way we can normalize haste/crit). Now let’s look at the combined stat boost from a 385 dagger: it’s +122. Linearly extrapolate from those two points [y(x) = y1 + (x - x1) * (y2 - y1) / (x2 - x1), where x is the combined stat boost and y is the item level] and the socketed dagger, with its extra +40 (+155 total), comes out at 370 + 40 * 15 / 7 ≈ 455.7. Yes, you read that correctly: that dagger, stats-wise, has an effective item level of roughly 456. I don’t have the time right now, but you can easily sit down and calc an effective item level based on sockets/stat priority. The obvious gap here is that I’m not accounting for primary stats, so you could denormalize those values a bit once you know the attribute values.
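If you want to redo that arithmetic yourself, the two-point extrapolation can be sketched like this. The function name and default arguments are mine; the two reference points are the ones quoted above (+115 combined stats at 370, +122 at 385). It assumes item level scales linearly with combined secondary stats, which is a big simplification, so the result is only as good as those two data points.

```python
def effective_ilevel(stat_total, low=(115, 370), high=(122, 385)):
    """Extrapolate an item level from a combined secondary-stat total.

    low and high are (stat_total, item_level) reference points.
    Assumption: the relationship is a straight line through both points.
    """
    x1, y1 = low
    x2, y2 = high
    if x1 == x2:
        raise ValueError("reference points need different stat totals")
    return y1 + (stat_total - x1) * (y2 - y1) / (x2 - x1)


plain_dagger = 41 + 74          # crit + haste, no socket
socketed_dagger = 46 + 69 + 40  # crit + haste + the +40 haste socket

print(effective_ilevel(plain_dagger))
print(round(effective_ilevel(socketed_dagger), 2))
```

Swap in different reference items and the slope changes, which is exactly why a single ilevel number hides so much.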
Stat priority is another big factor here because not every stat is equal; for rogues (as an example), haste has a higher attribute value than, say, mastery. That is masked entirely by item level: 375 boots with versatility/mastery are not the same as 375 boots with haste/crit.
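A rough way to see this is to weight each stat before adding it up. The weights and boot stats below are made-up placeholders, not real rogue stat weights or real items. You would plug in values from a sim for your own character.

```python
# Hypothetical weights, only for illustration. Real values come from simming.
HYPOTHETICAL_WEIGHTS = {
    "haste": 1.0,
    "crit": 0.9,
    "versatility": 0.7,
    "mastery": 0.6,
}


def weighted_stats(item_stats, weights=HYPOTHETICAL_WEIGHTS):
    """Sum an item's secondary stats, scaled by how much each stat is worth."""
    return sum(weights[stat] * amount for stat, amount in item_stats.items())


# Two made-up pairs of boots with the same item level and the same raw total.
boots_haste_crit = {"haste": 60, "crit": 40}
boots_vers_mastery = {"versatility": 60, "mastery": 40}

print(weighted_stats(boots_haste_crit))
print(weighted_stats(boots_vers_mastery))
```

Both pairs carry 100 raw secondary stats, yet the weighted totals come out clearly different, and item level normalization cannot see that.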
The next important aspect that is not captured is that the parse does not represent an equal distribution curve, which a lot of people tend to assume it follows. I googled and I couldn’t find any data to prove that it does. How many rogues are parsing 14k-18k dps? How many are parsing 12k-13.999k, etc.? Where is this data? If you don’t have this data and you just blindly say “well not being in the 90th percentile is BAD”, you leave a lot on the table to be discussed. At the most elementary level, what is the standard deviation that we are talking about here?
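If the raw numbers were available, answering those questions would take a few lines. The sketch below uses made-up DPS values (I don’t have the real data, which is the whole point) and a bucket width I picked arbitrarily.

```python
from collections import Counter
from statistics import mean, pstdev


def dps_buckets(dps_values, width=2000):
    """Count runs per DPS bucket, keyed by each bucket's lower bound."""
    counts = Counter((int(v) // width) * width for v in dps_values)
    return dict(sorted(counts.items()))


# Hypothetical sample, NOT real parse data.
sample_dps = [9800, 11200, 12500, 13100, 14000,
              14900, 15600, 16800, 17500, 19000]

for lower, count in dps_buckets(sample_dps).items():
    print(f"{lower}-{lower + 1999}: {count} runs")

print("mean:", mean(sample_dps))
print("std dev:", round(pstdev(sample_dps), 1))
```

With a real export you could see at a glance whether the curve is bunched up in the middle or spread out, and how many dps actually separate one percentile band from the next.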
Without diving into the data, it’s hard to draw any clear conclusions. Just like any data science work, you HAVE to do feature engineering. Blindly trusting data is a recipe for disaster. There are at least 30 features that need to be discussed before we even get into the beginnings of covariance or correlation.
With all that said, the parses themselves should help as a ballpark to understand that maybe something can be done in terms of optimized rotations or positioning.