Yeah. Here’s a concrete example from D3 that makes it clear what would actually happen if you used that “averages” method:
Let’s compare two builds.
“Build #1” has peak adjusted clear 151.1, avg of top ten 150.5, and 100th place 147.7.
“Build #2” has peak adjusted clear 145.0, avg of top ten 140.6, and 100th place 125.7.
The actual full data pool for each build is 400-600 clears. So, let’s do something like what’s been suggested, and drop the top 10% and bottom 10% off both pools, then average the rest. Doing that, we get 146.3 for Build #1, and 120.8 for Build #2.
So, Build #2 should get a buff of 146.3 - 120.8 = 25.5 tiers, in order to achieve parity with Build #1, right?
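To make the "averages" method concrete, here is a minimal Python sketch of the trimmed average described above, plus the naive buff it implies. The function name `trimmed_mean` and its `trim_fraction` parameter are my own illustrative choices, and the full 400-600 clear pools are not reproduced here, so the two trimmed averages are simply plugged in from the numbers quoted above.

```python
def trimmed_mean(values, trim_fraction=0.10):
    """Drop the top and bottom trim_fraction of values, then average the rest.

    Hypothetical helper: the name and signature are illustrative only.
    """
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number of entries cut from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Trimmed averages as quoted in the post (the raw clear pools are not shown here).
build_1_trimmed = 146.3
build_2_trimmed = 120.8

# The naive method says: buff Build #2 by the gap between the two averages.
naive_buff_tiers = round(build_1_trimmed - build_2_trimmed, 1)
print(naive_buff_tiers)
```

With real data you would pass each build's full list of adjusted clears to `trimmed_mean` instead of hard-coding the results.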
Wrong. If we did that, Build #2 would be taking down GR150 at 0 paragon, probably in less than 10 minutes, and quite possibly in as little as 5 minutes.
How do I know that? Well, because Build #1 and Build #2 are the same build, Masquerade Bone Spear Necro, before and after it got nerfed. And, we know exactly how much that nerf was worth, because it’s purely mathematical, taking the build down from a 3111x multiplier to a 671x multiplier, which equates to -9.8 tiers.
So, the first 10 tiers or so of our 25.5 tier buff take the build exactly back to where it was before it got nerfed, which was one of the most powerful builds ever seen in the game, taking down 150 at around 4500 paragon. And THEN we’re stacking another 15.5 tier buff on top of that!
And that is enough of a buff to take the paragon requirement for 150 down from 4500 to 0, and the clear time down to about 5:00.
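For anyone who wants to check the tier math, here is a rough Python sketch. It assumes each Greater Rift tier multiplies monster health by 1.17, which is not stated in this post but is the figure that makes 3111x to 671x come out at about -9.8 tiers. The constant and function names are hypothetical, picked just for this example.

```python
import math

# Assumption (not stated in the post): each GR tier is worth 17% more monster health.
HEALTH_GROWTH_PER_TIER = 1.17


def multiplier_to_tiers(ratio):
    """Convert a damage-multiplier ratio into an equivalent number of GR tiers."""
    return math.log(ratio) / math.log(HEALTH_GROWTH_PER_TIER)


# The nerf: 3111x multiplier down to 671x.
nerf_tiers = multiplier_to_tiers(671 / 3111)

# The naive "averages" buff from the trimmed pools.
naive_buff_tiers = 25.5

# Whatever is left after undoing the nerf is pure excess power.
excess_tiers = naive_buff_tiers + nerf_tiers

# Express that excess as a flat damage multiplier on the pre-nerf build.
excess_multiplier = HEALTH_GROWTH_PER_TIER ** excess_tiers

print(round(nerf_tiers, 1), round(excess_tiers, 1))
```

Under that 1.17 assumption the excess comes out a little above the 15.5 quoted above, since that figure rounds the nerf to an even 10 tiers. Either way it works out to roughly a 12x damage multiplier stacked on top of the pre-nerf build.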
So, that’s a case where we KNOW, basically with complete certainty, what would happen if we were to adopt this balancing method: an epic disaster. And the exact same thing would happen if we tried to use that method to balance LoD DH, Trag’oul Necro, Raiment Monk, etc.