Visualizing balance in the simplest way possible

Simple question: why do they have different Elo ratings for different versions of chess, such as blitz?

Answer: Because the games are different, just as a PvP is not comparable to a PvT.

3 Likes

Second question: If there are three players, {A, B, C}, and in any given 1v1 match A>B>C>A, can Elo tell you which player will win, even though we know A>B>C>A will always be the outcome?

Answer: No
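A minimal sketch of why, assuming plain logistic Elo with K = 32 and a 1500 starting rating (these constants and the function names are illustrative assumptions, not anything taken from Aligulac or the ladder):

```python
def expected_score(r_a, r_b):
    """Standard logistic Elo expectation for a player rated r_a against r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def play_cycle(rounds=1000, k=32.0, start=1500.0):
    """Simulate A>B>C>A, where the same player wins each pairing every time."""
    ratings = {"A": start, "B": start, "C": start}
    # (winner, loser): A always beats B, B always beats C, C always beats A.
    pairings = [("A", "B"), ("B", "C"), ("C", "A")]
    for _ in range(rounds):
        for winner, loser in pairings:
            e_w = expected_score(ratings[winner], ratings[loser])
            delta = k * (1.0 - e_w)  # winner scored 1, was expected to score e_w
            ratings[winner] += delta
            ratings[loser] -= delta
    return ratings


ratings = play_cycle()
for winner, loser in [("A", "B"), ("B", "C"), ("C", "A")]:
    p = expected_score(ratings[winner], ratings[loser])
    print(f"{winner} always beats {loser}, but Elo says P({winner} wins) = {p:.2f}")
```

Each player wins one game and loses one game per round, so all three ratings stay close to 1500 and every predicted win probability stays near 0.5, even though each result is fully determined.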

3 Likes

That’s a black-and-white fallacy. You leap from “being different” to “incomparable,” and that’s an unsubstantiated claim. The Elo algorithm is a comparative algorithm, meaning the predictions it makes only apply to the rating pool in which it was established UNLESS the other rating pool is the same. You can test whether two rating pools are similar using a t-test, and, if they are, then comparisons are valid.

So what you are saying is that PvT and PvP are identical matches? This is exactly why you’re biased, bro. This is a pretty dim statement to make.

4 Likes

Nope. I am saying that someone with X skill will get the same rank regardless of which matchup they play. The skill distributions are what we are sampling, so whether the ZvZ/TvT/PvP distributions are similar enough to compare is a question a t-test could answer.
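As a rough sketch of what such a test looks like, here is Welch's t-test written with only the standard library. The ratings are synthetic (made up for illustration, NOT Aligulac data), and the p-value uses a normal approximation that is only reasonable for large samples:

```python
import math
import random
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    se2 = var_a / n_a + var_b / n_b  # squared standard error of the mean difference
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2)
    df = se2 ** 2 / ((var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1))
    return t, df


def approx_two_sided_p(t):
    """Normal approximation to the two-sided p-value (large samples only)."""
    return 2.0 * (1.0 - statistics.NormalDist().cdf(abs(t)))


rng = random.Random(0)
# Synthetic rating pools: one baseline, one with the same mean, one shifted by 40.
pvp = [rng.gauss(1500, 200) for _ in range(2000)]
pvt_same = [rng.gauss(1500, 200) for _ in range(2000)]
pvt_shifted = [rng.gauss(1540, 200) for _ in range(2000)]

results = {}
for label, other in [("same mean", pvt_same), ("shifted +40", pvt_shifted)]:
    t, df = welch_t(pvp, other)
    results[label] = (t, df, approx_two_sided_p(t))
    print(f"{label}: t = {t:.2f}, df = {df:.0f}, p ~ {results[label][2]:.4f}")
```

Note that a t-test only compares the means of the two pools. It says nothing about whether the rest of the two distributions match.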

So you don’t think there are some players who are better at ZvT than ZvZ?

LOL. Thank you for making such an absurd statement that just proves how biased you are.

1 Like

Not over a large sample.

In probability theory, the central limit theorem (CLT) establishes that, in some situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution (informally a bell curve) even if the original variables themselves are not normally distributed.

This is actually a completely wrong statement for a different reason.

The sample of games in the Aligulac database is not randomly selected. Games are selected at an arbitrary cutoff point by anyone who submits one. If a masters player is on a lucky win streak and gets a few games in GM recorded on Aligulac, then he will inflate the points pool for the matchups that he plays, and that inflation will not be distributed equally across the matchups, seeing as the distribution of races is not equal at GM level.

1 Like

Protoss is broken op, just admit it man…

So then what happens if one matchup is imbalanced and the other one isn’t because of a new patch?

Answer: There is bias that would affect the Elo ratings for everyone in that specific matchup regardless of skill level. In other words, the ratings are not comparable across different matchups.

2 Likes

The data passes a normality test: https://i.imgur.com/tsoPq31.png

In statistics, normality tests are used to determine if a data set is well-modeled by a normal distribution and to compute how likely it is for a random variable underlying the data set to be normally distributed.

An informal approach to testing normality is to compare a histogram of the sample data to a normal probability curve. The empirical distribution of the data (the histogram) should be bell-shaped and resemble the normal distribution.

Measuring that bias is literally what this test does. That’s literally the point of the thread.

You do understand that a normal distribution does not mean that there isn’t a bias in the data right? right? Please tell me you understand the difference.
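A small sketch of the difference, using synthetic ratings and a hypothetical 50-point inflation applied to everyone in one matchup (both are assumptions for illustration). The Jarque-Bera statistic here is one common normality measure, used only as a stand-in for whatever normality test was run on the real data:

```python
import random
import statistics


def jarque_bera(sample):
    """Jarque-Bera statistic: near 0 for bell-shaped data, large for skewed or heavy-tailed data."""
    n = len(sample)
    mean = statistics.fmean(sample)
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


rng = random.Random(1)
# Synthetic "true skill" ratings, drawn from a normal distribution.
true_skill = [rng.gauss(1500, 200) for _ in range(5000)]
# Hypothetical bias: every rating in this matchup is inflated by 50 points.
biased = [x + 50.0 for x in true_skill]

jb_true = jarque_bera(true_skill)
jb_biased = jarque_bera(biased)
mean_shift = statistics.fmean(biased) - statistics.fmean(true_skill)

print(f"JB statistic, unbiased sample: {jb_true:.3f}")
print(f"JB statistic, biased sample:   {jb_biased:.3f}")
print(f"difference in means:           {mean_shift:.1f}")
```

The statistic depends only on the shape of the sample, so a constant shift leaves it unchanged: the biased sample looks exactly as normal as the unbiased one while its mean is off by the full 50 points.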

1 Like

If the data is normally distributed, it means you can make comparisons with other data sets that are also normally distributed, and the chance of an error is exceedingly small. The point of showing that it has converged on a normal distribution is to show that the bias is too small to cause meaningful error. The fact that I have shown that is definitive proof I am aware of bias and have eliminated the possibility.

Elo distributions will always tend to approach a normal distribution. Bias is NOT about normality:

https://en.wikipedia.org/wiki/Bias_(statistics)#:~:text=Statistical%20bias%20is%20a%20feature,underlying%20quantitative%20parameter%20being%20estimated.

2 Likes

Nope. It’s logistic. It’s clearly defined right here.

https://i.imgur.com/0D5bBjf.png

It is about normality. When sampling a population, there will be some variability within your sample because your sample is only a sub-set of the whole population. You can get sub-clusters. Showing that the sample is normal shows that it has approximated the original distribution and thus accurately represents it (aka has no bias).

In terms of Elo, the question is whether the distribution of varying skill in the real human population has been accurately matched by the Elo values themselves, which it has. That is true, at least, for comparisons about the mean.

The probability of a player winning is logistic, yes. But the probability of a player winning is NOT the same as the distribution of Elo ratings. Even the performance of any individual player is normally distributed:

Elo's central assumption was that the chess performance of each player in each game is a [normally distributed](https://en.wikipedia.org/wiki/Normal_distribution) [random variable](https://en.wikipedia.org/wiki/Random_variable). Although a player might perform significantly better or worse from one game to the next, Elo assumed that the mean value of the performances of any given player changes only slowly over time. Elo thought of a player's true skill as the mean of that player's performance random variable.

4 Likes

In the original version, yes, but later versions were based on the logistic curve. Furthermore, it’s a moot point because the distributions are similar enough that comparisons between the means are still valid.

The probability of the outcome of any one match is logistic; it says so RIGHT ON YOUR LINK.

This is not the same as the distribution of the Elo ratings.
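For reference, the usual logistic form of Elo can be written as follows. The base 10 and scale 400 are the conventional chess constants and are assumed here; the symbols $R_A$, $R_B$ (current ratings), $S_A$ (actual score: 1, 0.5, or 0), and $K$ (update step size) are notation introduced for this sketch, and the system behind the linked image may differ in detail:

```latex
% Expected score of player A against player B (logistic in the rating difference)
E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}}

% Rating update for A after one game
R_A' = R_A + K \, (S_A - E_A)
```

The logistic curve appears only in $E_A$, as a function of the rating difference between two specific players. Nothing in either formula specifies how the ratings $R$ are spread across the whole player pool.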

Distributions being normal=/=unbiased

And for the record, even if a method is unbiased, we never expect any one sample to be unbiased; only the sampling distribution is unbiased.

1 Like

And the adjustment of the rank is based on the probability of winning, which means the rank is logistic as well.

I can only explain this to you so many times. When a sample is normal, the mean has a very high probability of matching the mean of the parent population. If the means of the Terran and Zerg Elo values have matched the parent population (the distribution of skill in the human race), you can make comparisons between the two, such as doing a t-test. Those comparisons have a very high probability of being valid. If there is bias, then it will fail the t-test. That’s why you do the t-test.

Look, you can agree with Batz or not, but you are putting words in his mouth that he never said.

He simply said that PvT and PvP are comparable. That is not the same as saying that they are identical. This is not a situation in which different = incomparable and thus identical = comparable. In the vast majority of cases, the point of comparing things is to look for differences between them.

3 Likes