Thoughts on GOAT

Alright, so computing this by hand is extremely annoying and cumbersome, so it's easier to write a program to do it:

  1. Take the number of wins/losses between player A and player B, and compute the Z score (X - U)/O, where X is A's number of wins, U is the expected number of wins under equal skill, (wins+losses)/2, and O is the standard deviation, sqrt((wins+losses)*0.5^2).
  2. Repeat this for every player combination.
  3. Combine all these scores using Stouffer’s method.
  4. Sort players based on combined score.
  5. The combined score is a Z score: it reflects how improbable this player's win-rate against these opponents would be if they were all of equal skill.
  6. The top player has the highest score, meaning their win-rate against these opponents has the lowest probability of occurring, assuming this player is equal in skill to all other players in the test.
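The steps above can be sketched roughly like this in Python. This is only a minimal sketch, not the actual program behind the numbers below: the function names, the `records` layout, and the three-player sample data are all made up for illustration, and it assumes the plain unweighted form of Stouffer's method (sum of Z scores divided by the square root of how many there are).

```python
import math
from collections import defaultdict

def pairwise_z(wins, losses):
    """Z score of a head-to-head record under the equal-skill assumption (p = 0.5)."""
    n = wins + losses
    expected = n / 2                 # U: expected wins if both players are equal
    std_dev = math.sqrt(n * 0.5**2)  # O: binomial standard deviation with p = 0.5
    return (wins - expected) / std_dev

def stouffer(z_scores):
    """Unweighted Stouffer combination (an assumption; the post does not say if weights were used)."""
    return sum(z_scores) / math.sqrt(len(z_scores))

def rank_players(records):
    """records maps (player_a, player_b) -> (a_wins, b_wins). Returns [(player, score)], best first."""
    per_player = defaultdict(list)
    for (a, b), (a_wins, b_wins) in records.items():
        if a_wins + b_wins == 0:
            continue  # no games played, nothing to score
        per_player[a].append(pairwise_z(a_wins, b_wins))
        per_player[b].append(pairwise_z(b_wins, a_wins))
    combined = {p: stouffer(zs) for p, zs in per_player.items()}
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical head-to-head records, purely for illustration.
sample_records = {
    ("A", "B"): (8, 2),
    ("A", "C"): (6, 4),
    ("B", "C"): (5, 5),
}

for player, score in rank_players(sample_records):
    print(f"{player}: {score:.6f}")
```

One thing this makes obvious is that a pairing with only a handful of games counts just as much as one with hundreds, since the unweighted combination treats every opponent equally.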

Serral (485): 5.824957
Dark (76): 5.155443
INnoVation (48): 4.275851
Maru (49): 4.081305
Zest (1658): 2.606580
herO (233): 2.471097
Stats (309): 1.013453
Rogue (1662): 0.936411
soO (125): 0.496307
TY (63): 0.284821
PartinG (5): -0.685497
Solar (1793): -0.711255
ByuN (47): -0.908748
Classic (186): -1.372828
Trap (177): -1.606986
Cure (1665): -1.864295
DRG (4): -5.292103
RagnaroK (117): -6.620646
Bunny (1517): -8.083868

This doesn’t weight the games to compensate for balance, which would be highly relevant with nydus/brood/proxy abuse. Taking the numbers at face value, Serral is the GOAT, and INnoVation is the best Terran. I think there’s a fair chance that if you were to compensate for balance issues (doing it properly would take a herculean effort), INnoVation would come out as the strongest player of all time. That’s simply because his performance is spread out across many iterations of the game, while Dark’s and Serral’s results are clustered around nydus all-ins (Dark) and skyzerg (Serral).

Z scores are hard to interpret on their own. Taken as a one-sided normal tail, Serral’s Z score of about 5.82 corresponds to roughly a 0.0000003% chance of his win-rate occurring if he were equal in skill to his opponents. INnoVation’s 4.28 works out to roughly 0.001%. So there are really big differences in skill just between these top players. The difference between Serral and Bunny is absolutely monstrous.
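If you want to turn a Z score into a probability yourself, here is a small sketch. It assumes the probability in question is the one-sided upper tail of a standard normal distribution, and `upper_tail` is just a name picked for this example.

```python
import math

def upper_tail(z):
    """P(Z >= z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Z scores taken from the list above.
for name, z in [("Serral", 5.824957), ("INnoVation", 4.275851), ("herO", 2.471097)]:
    p = upper_tail(z)
    print(f"{name}: p = {p:.3e} ({p * 100:.10f}%)")
```

Using `math.erfc` instead of `1 - cdf` keeps the result accurate for large Z scores, where the tail is tiny and a plain subtraction would lose precision.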

Another thing to note is that the top 2 Protoss absolutely suck compared to the top 2 Zerg/Terran. This is exactly why Protoss has loads of second-place finishes in premier tournaments. Also, lol @ herO (Z score of 2.5) stampeding Maru (Z score of 4.1) in the recent Code S finals.
