Visualizing balance in the simplest way possible

tEhbAtZ-1845 · July 14, 2020, 4:29am

I can just send it to you if you’d like.

VichelChaos-2623 · July 14, 2020, 4:33am

You mean the SQL software? If it’s that, I just downloaded it.
If it’s some other code or whatever, sure. Vict cl. 51@ gmail . com

tEhbAtZ-1845 · July 14, 2020, 4:34am

I can just send you a CSV dump of the data. Aligulac is pretty challenging to work with. You have to filter out the garbage based on number of games played, date of last play, etc.

Cheezecake-1895 · July 14, 2020, 4:34am

This is generally true of all Elo distributions, so it’s not really saying much. They will all have a forced average (generally 1500) and be approximately normally distributed. The fact that Elo is a zero-sum game with a cap on points gained/lost would also make the standard deviations pretty similar.

Eternity-11719 · July 14, 2020, 4:35am

Of course, if your customers want to found an e-sports club or gamble, the sample should include data from many pro players.

Cheezecake-1895 · July 14, 2020, 4:37am

These are quite important; they’re the basis for a lot of the improvements that Glicko/TrueSkill made over Elo.

VichelChaos-2623 · July 14, 2020, 4:44am

I was just thinking more along the lines of grabbing the rank distributions for mirror matches, getting the mean and standard deviation for each mirror matchup, and playing with some comparisons between players. At this point there isn’t much more that I’m willing to do.

That would actually be super helpful, I have much more versatility like that. Thanks!

tEhbAtZ-1845 · July 14, 2020, 4:46am

I actually had to implement a custom SQL parser because PostgreSQL had issues loading his database.

Cheezecake-1895 · July 14, 2020, 4:49am

Couldn’t you just get all the information through the API? Might take a long time though.

VichelChaos-2623 · July 14, 2020, 4:49am

Then it is even more appreciated. I’m more interested in playing with it than any real analysis because the graphs you sent pretty much confirm my suspicions that the rank distributions of players of each mirror matchup are pretty much identical.

tEhbAtZ-1845 · July 14, 2020, 4:50am

I think you’ll find it funny how low the kurtosis is. You’ll know what I mean when you see it. The selection bias of the tournament processes is to blame!

Cheezecake-1895 · July 14, 2020, 4:54am

There’s an inherent cap on how far outliers can go, because points gained/lost are based on your opponents. You can run a simulation to see it for yourself.
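A minimal sketch of the kind of simulation suggested here. Everything in it is an assumption for illustration: random matchmaking inside one pool, a K-factor of 32, a starting rating of 1500, and hidden "true" skills drawn from a normal distribution. It is not Aligulac's rating model.

```python
import random
import statistics

def expected_score(r_a, r_b):
    """Standard Elo expected score of A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def simulate_pool(n_players=200, n_games=50_000, k=32, skill_sd=300, seed=1):
    rng = random.Random(seed)
    # Hidden "true" skills decide who actually wins; every rating starts at 1500.
    true_skill = [rng.gauss(1500, skill_sd) for _ in range(n_players)]
    rating = [1500.0] * n_players
    for _ in range(n_games):
        a, b = rng.sample(range(n_players), 2)
        a_wins = rng.random() < expected_score(true_skill[a], true_skill[b])
        score_a = 1.0 if a_wins else 0.0
        # Whatever A gains, B loses, and one game can never move more than k points.
        delta = k * (score_a - expected_score(rating[a], rating[b]))
        rating[a] += delta
        rating[b] -= delta
    return true_skill, rating

true_skill, rating = simulate_pool()
print("mean rating:", round(statistics.fmean(rating), 6))
print("rating sd:  ", round(statistics.pstdev(rating), 1))
print("highest / lowest:", round(max(rating)), round(min(rating)))
```

Because every update is zero-sum, the pool average stays at the starting value no matter how the true skills are spread. Changing `skill_sd` or `k` is an easy way to see how much the rating spread and the outliers respond.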

VichelChaos-2623 · July 14, 2020, 4:56am

Honestly, I’m just extremely interested in grabbing, say, TvP, and then doing the following analysis:
1-Taking the Elo distribution of TvT and, using standardization, finding the equivalent Elo in PvP at each point of the curve (say from 0 to 2000 or whatever). Then calculating the predicted winrate based on the Elo difference, and seeing how far that predicted winrate is from the actual winrate in Aligulac for that matchup at that Elo.

Say the equivalent of 1000 Elo in PvP is 1200 in TvT. I then want to calculate the winrate that a 200-point difference would produce, and compare it to the winrate of TvP at 1000 Elo and 1200 Elo. And from there, see where the path takes me. Edit: Now that I think about it, I can standardize TvP Elo as well (or PvT, but it’s important to compare everything on the same basis, so TvP if I started with TvT and PvT if I started with PvP), and then check the winrate at that Elo directly.
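A rough sketch of the two steps described above: map a rating from one mirror distribution to another by matching z-scores, then turn the rating gap into a predicted winrate with the standard Elo expected-score formula. The rating lists are made-up placeholders chosen to reproduce the 1000 to 1200 example, and `equivalent_rating` is a hypothetical helper name, not anything from Aligulac.

```python
import statistics

def expected_score(r_a, r_b):
    """Standard Elo expected score of A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def equivalent_rating(rating, source, target):
    """Map a rating from `source` to `target` by matching z-scores."""
    z = (rating - statistics.fmean(source)) / statistics.pstdev(source)
    return statistics.fmean(target) + z * statistics.pstdev(target)

# Placeholder ratings, NOT real Aligulac data.
pvp = [800, 900, 1000, 1100, 1200]
tvt = [900, 1050, 1200, 1350, 1500]

tvt_equiv = equivalent_rating(1000, pvp, tvt)  # PvP 1000 expressed on the TvT scale
gap = tvt_equiv - 1000
predicted = expected_score(tvt_equiv, 1000)    # winrate implied by that gap

print(f"equivalent TvT rating: {tvt_equiv:.0f}")
print(f"rating gap: {gap:.0f}, predicted winrate: {predicted:.3f}")
```

With these placeholder lists the gap is 200 points, which the Elo formula turns into a predicted winrate of roughly 0.76. The last step of the plan would be comparing that number with the observed cross-matchup winrate at the same ratings.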

If my suspicion is right that balance has very little to do with winrates at lower ranks, then I’m very sure that the difference between the predicted winrates at a given Elo and the actual winrates would be extremely small. And then we can safely tell all those whiners that balance really is a non-factor.

tEhbAtZ-1845 · July 14, 2020, 5:03am

It should be on its way.

VichelChaos-2623 · July 14, 2020, 5:04am

Thanks! I’ll get to it later and see if I can share some results of playing around.

tEhbAtZ-1845 · July 14, 2020, 5:05am

It’s the whole database of players so you can filter it and process it as you see fit. There are 17709 players. Lmao. I can also dump ranking histories, tournaments, and match histories as well, for example. So if you want them just say so.

tEhbAtZ-1845 · July 14, 2020, 5:18am

https://i.imgur.com/OoydWie.png

This is why you have to filter out players with too few games played.
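A small sketch of that kind of filter, using only the standard library. The column names (`tag`, `race`, `rating`, `games`, `last_played`), the sample rows, and both thresholds are invented for illustration; the real dump may be laid out differently.

```python
import csv
import io
from datetime import date

# Stand-in for the CSV dump; columns and rows are hypothetical.
SAMPLE = """tag,race,rating,games,last_played
PlayerA,T,1720,412,2020-06-30
PlayerB,Z,1510,3,2014-02-11
PlayerC,P,1385,57,2020-05-02
PlayerD,P,1990,140,2016-09-19
"""

def keep(row, min_games=30, active_since=date(2019, 7, 14)):
    """Keep players with enough games and a recent enough last game."""
    enough_games = int(row["games"]) >= min_games
    recent = date.fromisoformat(row["last_played"]) >= active_since
    return enough_games and recent

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
filtered = [r for r in rows if keep(r)]
print([r["tag"] for r in filtered])
```

The cutoffs of 30 games and one year of inactivity are arbitrary. Plotting the distribution at a few different cutoffs is a quick check on how sensitive the shape is to them.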

Cheezecake-1895 · July 14, 2020, 5:21am

This is not a valid analysis.

Elo ratings across different matchups are not comparable. Let’s say that the average zerg player has a true skill of 1500, the average terran player has a true skill of 2500, and both T/Z skill distributions have an SD of 500 points. The way Elo works is that it doesn’t know what your “true” skill rating is; the only way it can estimate your skill is by seeing how many times an individual wins relative to the other players, and it creates a forced average rating (usually 1500) for the distribution of ratings. Now what does this mean? It means that if we have a terran player whose true skill is 2 standard deviations above the average terran (3500), his Elo rating will only be 2500 (in TvT). If we have a zerg player whose true skill is 2 standard deviations above the average zerg, his Elo rating will also be 2500. Since the zerg and terran players both have the same Elo rating of 2500, does this mean they are equally skilled? No: they have the same rating, but the terran player is more skilled.

Sources: https://en.wikipedia.org/wiki/Pairwise_comparison
https://en.wikipedia.org/wiki/Elo_rating_system
http://www.glicko.net/research/acjpaper.pdf
http://www.glicko.net/research/glicko.pdf

The basis for every single paired comparison model is comparing two objects to see which is preferred, not measuring true “skill”.
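The shift-invariance part of this argument can be checked with a small sketch. It assumes two isolated mirror pools, random matchmaking, K = 32 and a 1500 starting rating, and it gives both pools the same seed so the only difference is the average true skill (1500 versus 2500, as in the example). None of this is Aligulac's actual model.

```python
import random

def expected_score(r_a, r_b):
    """Standard Elo expected score of A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def mirror_pool_ratings(true_mean, n_players=100, n_games=20_000, k=32, seed=7):
    rng = random.Random(seed)
    # Integer offsets keep skill differences exact in floating point.
    offsets = [round(rng.gauss(0, 500)) for _ in range(n_players)]
    true_skill = [true_mean + o for o in offsets]
    rating = [1500.0] * n_players
    for _ in range(n_games):
        a, b = rng.sample(range(n_players), 2)
        # Outcomes depend only on the DIFFERENCE in true skill.
        a_wins = rng.random() < expected_score(true_skill[a], true_skill[b])
        score_a = 1.0 if a_wins else 0.0
        delta = k * (score_a - expected_score(rating[a], rating[b]))
        rating[a] += delta
        rating[b] -= delta
    return rating

zerg = mirror_pool_ratings(true_mean=1500)
terran = mirror_pool_ratings(true_mean=2500)
print(zerg == terran)
```

The two rating lists come out identical, since within a pool only skill differences ever enter the calculation. That shows mirror ratings alone cannot reveal a shift between pools; whether such a shift actually exists between races is a separate question.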

tEhbAtZ-1845 · July 14, 2020, 5:27am

Cheezecake:

Elo ratings across different matchups are not comparable. Let’s say that the average zerg player has a true skill of 1500, the average terran player has a true skill of 2500, and both T/Z skill distributions have an SD of 500 points. The way Elo works is that it doesn’t know what your “true” skill rating is; the only way it can estimate your skill is by seeing how many times an individual wins relative to the other players, and it creates a forced average rating (usually 1500) for the distribution of ratings. Now what does this mean? It means that if we have a terran player whose true skill is 2 standard deviations above the average terran (3500), his Elo rating will only be 2500 (in TvT). If we have a zerg player whose true skill is 2 standard deviations above the average zerg, his Elo rating will also be 2500. Since the zerg and terran players both have the same Elo rating of 2500, does this mean they are equally skilled? No: they have the same rating, but the terran player is more skilled.

This is a circular argument. Whether they are comparable depends on whether the population distributions are similar or not. You created a situation where they are dissimilar and said this is proof that they are dissimilar.

A t-test is what is used to compare the means of two samples to see if they came from the same parent distribution.

In probability and statistics, Student’s t-distribution (or simply the t-distribution) is any member of a family of continuous probability distributions that arises when estimating the mean of a normally distributed population in situations where the sample size is small and the population standard deviation is unknown. It was developed by William Sealy Gosset under the pseudonym Student.

The t-distribution plays a role in a number of widely used statistical analyses, including Student’s t-test for assessing the statistical significance of the difference between two sample means, the construction of confidence intervals for the difference between two population means
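As a sketch of what that comparison looks like in practice, here is the test statistic for Welch's unequal-variance variant of the two-sample t-test, written with the standard library only. Welch's version is my choice for the example, not something specified in the thread, and the two rating lists are placeholders rather than real data.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Sample variances (n - 1 denominator), each divided by its sample size.
    se_a = statistics.variance(sample_a) / n_a
    se_b = statistics.variance(sample_b) / n_b
    t = (statistics.fmean(sample_a) - statistics.fmean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df

# Placeholder ratings, NOT real Aligulac data.
a = [1400, 1500, 1600, 1450, 1550]
b = [1420, 1480, 1610, 1500, 1540]

t, df = welch_t(a, b)
print(f"t = {t:.3f}, df = {df:.1f}")
```

Turning `t` and `df` into a p-value needs the t-distribution's CDF, which the standard library does not provide; `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and reports the p-value directly. Note that the test only compares means, so two distributions with equal means but different shapes would still pass it.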

VichelChaos-2623 · July 14, 2020, 5:31am

This is where I am going to start inserting some very strong assumptions for the sake of actually being able to work with the data in such a way. My first assumption is that players of the three races have the exact same average true skill and standard deviation, because they are subsets of the same set (the human population; sorry if I’m using the wrong vocabulary, believe me, this is harder than it seems…), and that there is no factor that makes people with different “skill potential” (let’s say reflexes, multitasking, hand-eye coordination, situation analysis, etc.) pick different races. I’m making this strong assumption because, well, I want to test whether it can give me results that, even if not accurate, at least let me draw some conclusions.

Then, I’m taking mirror matchups as the way to place each player on this “greater skill line”, so to speak. That is, using mirror matchups, where balance and matchup-specific meta are not an issue, to remove those factors, and then assuming that the standard deviation will take care of the randomness of the matchup.