How does hotslogs or hots api actually work?
If, let's say, 1% of the hots playerbase registers an account, does that mean hotslogs only tracks and shows the stats of that 1%?
Because that is highly unrepresentative…
In a hots (or other game) replay file, there are a number of ways information is stored for the game to read. People can write scripts or programs to browse through the file and pull some of that information out to collect game and player stats. (One such parser is available on GitHub if you want to parse replays yourself without using hotslogs or heroesprofile.)
The information pulled is from replay files that are voluntarily uploaded, so people have to intentionally upload their batch of replays or enable the autouploader. Since only 1 person per game needs to upload, it doesn't take as wide a group of uploaders to get some representation of the player base going.
For stat representation, it's generally held that small samples can indicate larger groups; if someone is polling 1 question, a sample of 500 responses could potentially represent 15 million people if the pollster is willing to accept a wide margin of error. It's one thing to conclude that 40-60% of a population has heard of a product; it's another to conclude that a hero will win 40-60% of their games, since that range is too wide to say much about balance. The further the true rate is from 50%, the higher the confidence one can have in the polled sample, meaning that there's a bit more potential uncertainty here since the goal of the game is to be near 50% XD.
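To put rough numbers on that, here is a small sketch using the usual normal-approximation margin of error for a proportion. The function name is just for illustration, and it assumes a genuinely random sample, which uploaded replays may not be.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a normal-approximation confidence interval for a proportion.

    n: sample size, p: assumed true rate, z: z-score (1.96 is roughly 95% confidence).
    Assumes a simple random sample.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Same sample of 500, different underlying rates:
for p in (0.5, 0.6, 0.8):
    moe = margin_of_error(500, p)
    print(f"rate {p:.0%}, n=500: about +/- {moe * 100:.1f} percentage points")
```

The margin is widest at a 50% rate and shrinks as the rate moves away from it, which is why win rates that sit near 50% are the hardest case.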
Part of the issue with something like game balance, however, is that it isn't a single polled question; it's a matter of analyzing combinations against other variables. Using mathisfun.com:
If there are 90 heroes, 10 are picked, no dupes, no order variations, there's something like 5.72e+12 variations for team compositions
(that's 5.72x10^12, or about 5,720,000,000,000: trillions of combinations), and that's just picking the heroes, not accounting for talent variation.
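If you'd rather check that count yourself than use a web calculator, it is a plain "90 choose 10". A minimal sketch, where the roster size of 90 is the post's assumption rather than an exact hero count:

```python
import math

HEROES = 90   # roster size assumed in the post
PICKED = 10   # heroes in one game: no duplicates, order ignored

picks = math.comb(HEROES, PICKED)
print(f"{picks:,}")      # 5,720,645,481,903
print(f"{picks:.2e}")    # 5.72e+12
```

This only counts which 10 heroes appear in a game. It doesn't split them into two teams or account for talents, so the real space is bigger still.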
If I'm willing to be 5-10% off of the polled sample on the success of these heroes, it might only need a couple hundred games… per talent variation. If I want to be 99% certain I'm within 1-2%, then it could take thousands to tens of thousands of games. Some heroes do actually get enough recorded samples to give a good impression from these sites when looking at the broad picture. However, when it comes to sampling by ranks, the system can't verify that those polled are actually in the correctly represented group.
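Those game counts come from the standard sample-size formula for a proportion. A rough sketch, assuming the worst case of a 50% win rate and a random sample (the function name is made up for this example):

```python
import math
from statistics import NormalDist

def games_needed(margin, confidence=0.95, p=0.5):
    """Games needed so the observed win rate is within +/- margin of the true
    rate at the given confidence level (normal approximation, random sample)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided z-score
    return math.ceil(z * z * p * (1 - p) / margin ** 2)

for margin, confidence in [(0.10, 0.95), (0.05, 0.95), (0.02, 0.99), (0.01, 0.99)]:
    n = games_needed(margin, confidence)
    print(f"within {margin:.0%} at {confidence:.0%} confidence: {n:,} games")
```

Keep in mind that's per thing you're measuring, so each hero, talent build, or rank bracket needs its own pile of games.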
Having 1% of the population can be large enough to get data on some trends, but part of the issue is the lack of accurate representation needed to genuinely be a 'random sample'. Since the basis is participation, it's more likely to get concentrated samples rather than 'random' ones, so there can be some concerns about skewed sampling.
So that said, the sample size needed for representation could be smaller than some people think, but on the other hand, the confidence intervals will be much wider than others acknowledge because they don't really consider how many variables there are to map.
When 1 person uploads, data gets uploaded from all 10 players in the game.
Therefore it's not hard to see that even 20-30% of players uploading will create very large and accurate coverage.
I have not uploaded a single game to Heroesprofile, yet it shows my SL rank accurately because enough of the people I played against (or with) have uploaded.
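A quick back-of-the-envelope for the "1 upload covers all 10 players" point: if a fraction of players upload, the chance that a given game has at least one uploader is 1 minus the chance that none of the 10 do. This sketch assumes uploaders are spread evenly and independently across games, which is a simplification, since people who upload may cluster in certain ranks or modes.

```python
def game_coverage(upload_rate, players_per_game=10):
    """Chance that at least one player in a game uploads the replay,
    assuming each player uploads independently with probability upload_rate."""
    return 1 - (1 - upload_rate) ** players_per_game

for rate in (0.01, 0.10, 0.20, 0.30):
    print(f"{rate:.0%} of players uploading -> about {game_coverage(rate):.0%} of games covered")
```

Under that assumption, even 1% of players uploading captures close to 10% of games, and 20-30% captures the large majority.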