What I learned making a Replay Analyzer


Over the past few months, I’ve been working on a website that analyzes uploaded replays and gives each player an individual score based on how well they did, with a breakdown of why they gained or lost points. This differs from tools like HotsLogs and HeroesProfile in that I’m actually looking at what the players are doing in-game. HL and HP simply grab the endgame stats and a few other match details, which are largely meaningless outside of looking at global popularity stats.

I’ve written a summary about it in the link below, which I won’t repeat here, but I’ll give you a TL;DR and some random facts:

• The best way to win is to have more healing than the other team. No other stat even comes close to influencing the match outcome as much as healing does. Healing is actually more important than being good.
• My ranking system is 83% accurate at predicting match outcomes solely based on each player’s StormStats rating, making it infinitely more accurate than the actual game, and over 6 times more accurate than GhostDunk’s AI (link below).
• The in-game matchmaker only creates fair matches about 3% of the time (fair meaning each team has a 40-60% chance to win).
• By the time I stopped caring, my system could correctly predict match outcomes 63% of the time based on hero composition alone.
• The balance in this game is absolute garbage, by pretty much every metric you look at. The community agrees, even if they don’t admit it, because Johanna is banned 18128% more often than Probius.
• Orb Li-Ming is 12% better than Calamity Li-Ming, across all skill levels. Stop playing teleport.
• Banning nothing is the 9th most popular ban.
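
To make the "fair match" and prediction-accuracy figures above concrete, here is a minimal sketch of how numbers like these could be computed. It assumes an Elo-style logistic curve over the difference in average team rating; the function names, the 400-point scale, and the sample ratings are all invented for illustration, and none of this is confirmed to be how StormStats actually works.

```python
def win_probability(team_a, team_b, scale=400.0):
    """Chance that team A wins, using an assumed Elo-style logistic curve."""
    avg_a = sum(team_a) / len(team_a)
    avg_b = sum(team_b) / len(team_b)
    return 1.0 / (1.0 + 10.0 ** ((avg_b - avg_a) / scale))


def is_fair(team_a, team_b, low=0.40, high=0.60):
    """A match is 'fair' if team A's win chance falls inside [low, high]."""
    return low <= win_probability(team_a, team_b) <= high


def prediction_accuracy(matches):
    """Share of matches where the team favored by rating actually won.

    matches: list of (team_a_ratings, team_b_ratings, a_won) tuples.
    An exact 50/50 match counts as a prediction for team B.
    """
    correct = sum(
        (win_probability(a, b) > 0.5) == a_won for a, b, a_won in matches
    )
    return correct / len(matches)


# Hypothetical ratings, five players per team.
even = win_probability([1500] * 5, [1500] * 5)        # 0.5
lopsided = win_probability([1900] * 5, [1500] * 5)    # about 0.909
```

With a curve like this, a match stops being "fair" once the average rating gap is large enough to push one side past 60%, which is why the scale parameter matters as much as the ratings themselves.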

StormStatsBlog (read this for all the juicy details):

GhostDunk’s AI (and other analysis):


wow, this is amazing

The “I trust in science and authority until someone presents inconvenient facts” community will have potentially life-threatening cognitive dissonance from this report.


Nice. Actual data is often sorely missing in conversations.

Without 100% of matches logged, there is a risk that the data itself is skewed, because the sort of person who uploads their replays might also tend to play in a certain way.

That said, I do expect matches to be unbalanced, given that HotS is no longer receiving balance updates or development, and its player pool isn’t growing.


Wow, so you repeated the same biases you showed before, and then tried to call it a ‘scientific’ approach. Kinda hard to declare you’ve done something when you don’t actually know what that would entail, or rather, why biases are usually something ‘science’ tries to account for and then remove from analysis. Also, you don’t actually define the means of your analysis (e.g. power ratings), so you pretty much talk about how great you are and how bad Blizzard is, and assert that that has more meaning than self-satisfaction for yourself.

  1. Your link on PPUS does not actually pull up whatever grievance you’re trying to voice.

You’ve been told about that before and some even demonstrated a better way to link to those. So far, it seems like your ability to utilize information that doesn’t agree with your initial premise is in the negative.

  2. The basis of how you’re trying to convey ‘balance’ rests on a false equivalence. Complaining that Probius isn’t banned as much as Johanna would be like declaring StarCraft can’t be balanced because players will make more drones than ultralisks.

A “balanced” game isn’t going to have an equal distribution of bans, because the player meta influences the perception of gameplay beyond just statistical numbers; not all heroes post the same numbers, contribute the same thing, or have the same opportunity in their function. Drones don’t do the same things as Ultralisks, and pawns don’t do the same thing as queens. Chess (a thing you use for examples) does not have equal reporting on openers, midgame tactics, or closers (for people that know the names for those things), but rather has shifts in what is played despite few changes to the ‘balance’ of the game. Favoring an opener advised by GothicChess is going to skew the moves players make more than the ‘balance’ of the game would indicate at divergent Elo.

  3. Conflating cause and effect.

Part of the issue of actual statistical analysis is determining the difference between cause and effect. What you are doing is trying to find ‘statistically significant’ information, which is what pseudoscience does for clickbait. Those sorts also have issues posting reliable links, applying the scientific method, and doing more than looping confirmation bias to declare they have results without actually doing the work. But they have a bottom line, and there isn’t much place for much else, and that seems to be about the same for you.

  4. Your process is influenced by disdain and assumptions, rather than actual information.

For instance, you go on about the harassment (way to keep updated on that btw) and assume ‘interns’ did maps, but you don’t actually convey all the ‘victory’ conditions for the maps in question. ToD core isn’t just damaged by claiming an active shrine: it is damaged by mercenaries, the boss, and controlling all towers on the map. You aren’t informed, you’re just angry and trying to act otherwise.

  5. Your power rating puts Mal’ganis at the top.

Mal’ganis is not the top tank, he’s not even the tank your comparisons can use to show how ‘unbalanced’ the game can be. Instead, you drop LoL references on an audience that probably knows less about LoL than you do about HotS and you go around to a number of other characters instead of Mal’ganis. Your metrics are flawed, unexplained, and those things were pointed out to you before you even posted this.

What you report on a ‘balance’ change does not convey a process for useful information. The regen rate on a hero is based on the max HP, so when one value is adjusted, so is the other, and you neglect that information. That was pointed out to you when you complained about it here, and it looks like that’s just more stuff you’re ignoring. You are reveling in misinformation, and then trying to fault something else for that. That isn’t ‘science’.

Your methods are flawed; they’re just the same excuse to flaunt the same grievances you keep voicing from one game to another. You have a hate-induced-hard-on for CC, and can point at healing numbers, so Mal’ganis has a high power rating, but with little explanation other than an analogy to Nova.

Cool, maybe you have 10,000 HotS WTF replays of Mal’ganis using his HP swap, but your methods don’t actually align with the grievances you raise, so you go off on tangents instead and don’t see an issue in the contradiction between claim and conduct.

Were you hoping someone wouldn’t actually read this, or are you going to run off and edit another wiki to lambast me and then act like I’m the one who doesn’t know how a ‘wiki’ works?


Yeah, the commentary was a little over-emotional, but when somebody puts together a pile of data and makes it public, one can draw one’s own conclusions. I always appreciate the effort involved in gathering data.


That’s true. I just post overly long walls, while this does look to have functional replays on the site and even posts game chat, so it has something to show for all its talk. Posters that claim to want to police replays to ban players could use that as a much better tool than the game replay system.

That’s why the site is up, for people to upload.

Good information.
Seeing it with concrete data surprises me,
although I think that all the players deep down know it.

Correlation =/= causation.
Better players will heal more than bad ones.
Almost all the time, the winning team will have higher healing numbers due to being at higher lvls and because their uptime is higher.
Not to mention that not every Healer is equally easy to play, so ppl who try out new, hard-to-play Healers will play badly, which means lower heals by default, and they’d do worse even setting heals aside.

Tldr: ‘winning → higher heals’ and not ‘higher heals → winning more’
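
A toy simulation makes the point: if skill drives both healing and winning, the winner will usually have more healing even when healing has zero effect on who wins. Everything here is invented for illustration (the function name, the skill distribution, the healing numbers); it is a sketch of confounding in general, not a model of real HotS matches.

```python
import math
import random


def share_of_wins_with_more_healing(matches=10_000, heal_per_skill=10_000.0, seed=1):
    """Fraction of simulated matches where the winner also out-healed the loser."""
    rng = random.Random(seed)
    count = 0
    for _ in range(matches):
        skill_a = rng.gauss(0.0, 1.0)
        skill_b = rng.gauss(0.0, 1.0)
        # Healing depends on skill plus noise (made-up numbers).
        heal_a = 50_000.0 + heal_per_skill * skill_a + rng.gauss(0.0, 5_000.0)
        heal_b = 50_000.0 + heal_per_skill * skill_b + rng.gauss(0.0, 5_000.0)
        # The winner is decided by skill only; healing is never consulted.
        p_a_wins = 1.0 / (1.0 + math.exp(-2.0 * (skill_a - skill_b)))
        a_wins = rng.random() < p_a_wins
        if a_wins == (heal_a > heal_b):
            count += 1
    return count / matches


confounded = share_of_wins_with_more_healing()                  # well above 0.5
control = share_of_wins_with_more_healing(heal_per_skill=0.0)   # close to 0.5
```

Setting `heal_per_skill` to 0 cuts the link between skill and healing, and the "winner healed more" share drops back to roughly a coin flip, even though the win logic is unchanged.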

Sidenote: otherwise my Uther wouldn’t have 66% wr. (Btw, did you take dmg mitigation into account?)

That value is basically still in the range of a coin guess.

I really have no desire to get into the rest of it – my science sense is tingling – so I’ll just extend kudos for undertaking a very interesting and ambitious project and seeing it through.

OP does get mad props for what he’s created. I just wish it wasn’t framed in the way that it is.


LoL, reading the chasm of difference between the two links in terms of objectivity and methodology is funny. Can pretty safely assume the OP didn’t actually read the second link.


I added ARAM in the last week because it was mostly easy, and I know a lot of people want an ARAM ranking system. Right now the ARAM rating is very bad because a lot of the code is assuming you’re on a real map. I’m going to improve it. I also planned on adding icons for merc camps and objectives on the maps so you can see when each team captures them.

I’m thinking about resuming work on the tool that will show player picks/history/winrates in the lobby. When Blizzard made SC2 they did a relatively good job of making it hard to debug, so I’ve been fighting with how difficult it is to scan HotS’s memory for useful information.

I am open to feature suggestions. Keep in mind I spend maybe 5 hours a week on projects like this, so it will be a while until big updates get deployed.

Just found this post in my search for an AI tool that could watch back a game replay. I was pretty sure no one has made anything like this yet, but I think AI would be a great way to do something like this, rather than programming it with rules. Had you considered anything like this? I’d be interested in collaborating on a project like this.

Nice to see this thread come up again, confirms healer bans are correct as I figured
