I wasn’t going to respond to you, since you are clearly a troll, but this comment is particularly egregious and it really demonstrates your dishonesty. You are smart enough to understand certain statistical principles but you deliberately misapply them.
You are accusing me of trying to tailor the sample toward my intended conclusion, yet that is impossible, because my goal is an increase in the sample size. Increasing the sample size causes the average to converge on the true trend within the data (you don’t have any control over it; the average goes wherever it goes). If you want to misrepresent the data, you want to do the opposite: you want a smaller sample, where you can pick a cluster of data that fits your narrative.
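Here’s a five-line sketch of what I mean. The 55% “true” winrate is a made-up number, purely to show the convergence as the sample grows:

```python
import random

random.seed(0)
TRUE_WINRATE = 0.55  # hypothetical "true" winrate we are trying to estimate

# Simulate the sample average over samples of increasing size.
for n in (10, 100, 1_000, 10_000, 100_000):
    wins = sum(random.random() < TRUE_WINRATE for _ in range(n))
    print(f"n={n:>6}: sample winrate = {wins / n:.3f}")  # drifts toward 0.55
```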
You are defending a small sample size because it confirms your conclusion, while you accuse me of wanting to tailor the sample toward my conclusion BECAUSE I want a larger sample? Wow. The hypocrisy and dishonesty are just astounding.
The only thing I pointed out, in this regard, is that their sample may be too small. ~100 games is not enough to estimate MMR when MMR is known to fluctuate radically depending on the meta and what opponents you face. This is a reasonable position that doesn’t favor ANY conclusion. The bot may well be 6k MMR in skill, or it may be 4k MMR in skill. Asking for a larger sample does not favor any conclusion. All I am saying is that the data isn’t convincing toward their conclusion. Yet, despite this, you are accusing me of favoring a conclusion when that isn’t the case at all. I am simply asking for more data.
That’s literally not how sampling works. This is a problem you didn’t understand last time and something you still don’t understand.
You are looking at this sample and saying “it doesn’t have enough lurker games, so it’s a bad sample.” That’s not how it works: as long as the sampling method is unbiased, you stick with the sample data and accept its conclusions.
Now the other argument you are making is that the sample size is not large enough, which is something you can easily check by calculating a confidence interval for the true MMR/win rate.
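For the win rate part, that’s only a few lines (a Wilson score interval; the 60-wins-out-of-100 numbers below are made up for illustration):

```python
import math

def wilson_ci(wins: int, games: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a true winrate, given wins out of games."""
    p = wins / games
    denom = 1 + z**2 / games
    center = (p + z**2 / (2 * games)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / games + z**2 / (4 * games**2))
    return center - half, center + half

# e.g. 60 wins in 100 games: the interval is wide, roughly (0.50, 0.69)
print(wilson_ci(60, 100))
```

With only ~100 games the interval spans almost 20 percentage points, which is exactly why the sample-size question is worth settling with numbers instead of insults.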
I am defending the sampling METHOD.
Your argument might have some merit if they had only got GM on A SINGLE account, but they didn’t; they got it on 3 accounts.
You could simply go ahead and prove it if you want; the data is in their published paper, which I linked earlier, so nothing is stopping you from disproving their claim. But you won’t, because you know you are wrong.
Your reasoning for increasing the sample size has been made perfectly clear; you already repeated yourself several times, so there can be no mistake.
You can’t get out of this one, bro. You are trying to cherry-pick data because you didn’t like that the sample didn’t have enough games vs players who go lurkers.
This is incredibly unethical, immoral, and just plain dumb, and it shows you don’t understand statistics at all.
This is what I was just talking about. The things you are saying are just plain wrong. You want your sample to be robust enough to mirror the base rates within the population. If you have a bag of red, green, and blue marbles, take a sample, and get no red marbles, then your sample is not robust enough to represent the population. The same thing applies here. If a certain strategy happens X% of the time on average, about X% of the games in your sample should have that strategy. If not, your sample is too small; it didn’t represent the population of strategies.
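To put a number on the marble example: if a strategy shows up in a fraction p of all games, the chance that a sample of n games contains none of it is (1 − p)^n. A quick sketch with a hypothetical 5% strategy:

```python
# If a strategy appears in p of all games, the probability that a sample
# of n games contains ZERO of them is (1 - p) ** n. Illustrative numbers only.
p = 0.05  # hypothetical: strategy appears in 5% of games
for n in (20, 50, 100, 300):
    print(f"n={n:>3}: P(strategy absent from sample) = {(1 - p) ** n:.3f}")
```

A 20-game sample misses a 5% strategy over a third of the time; rarer strategies need far larger samples.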
This is called Lindeberg’s condition. Google it. You are saying this isn’t how sampling works, when in fact it is exactly how it works. Could you please stop insulting my knowledge of statistics while you butcher basic principles of statistics? It’s really not that much to ask.
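For anyone who actually wants to look it up, here is the condition in its standard textbook form (this is the general statement, not anything specific to this thread):

```latex
% Lindeberg's condition, for independent X_k with means \mu_k, variances
% \sigma_k^2, and s_n^2 = \sum_{k=1}^{n} \sigma_k^2: for every \varepsilon > 0,
\lim_{n \to \infty} \frac{1}{s_n^2} \sum_{k=1}^{n}
  \mathbb{E}\!\left[ (X_k - \mu_k)^2 \,
  \mathbf{1}_{\{ |X_k - \mu_k| > \varepsilon s_n \}} \right] = 0 ,
% in which case the CLT holds:
% \frac{1}{s_n} \sum_{k=1}^{n} (X_k - \mu_k) \xrightarrow{d} \mathcal{N}(0, 1).
```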
I legitimately never get triggered this hard. Ever.
There you go again trying to sound smarter than you actually are.
You could’ve easily suggested a different sampling method, such as stratified sampling, but you went ahead and tried to overcomplicate things because you don’t understand the basics of how sampling works.
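For reference, stratified sampling is only a few lines; the strategy labels and counts below are invented for illustration:

```python
import random

random.seed(0)

# Hypothetical game records tagged by the opponent's strategy.
games = [{"id": i, "strategy": random.choice(["bio", "mech", "lurkers"])}
         for i in range(1000)]

def stratified_sample(records, key, per_stratum):
    """Draw the same number of records from each stratum."""
    strata = {}
    for r in records:
        strata.setdefault(r[key], []).append(r)
    return [r for group in strata.values()
            for r in random.sample(group, per_stratum)]

sample = stratified_sample(games, "strategy", per_stratum=20)
print(len(sample))  # 60: 20 games per strategy, rare strategies guaranteed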
Kid, you literally just said I didn’t know how sampling works because I used the proper method of sampling. You in essence called Lindeberg’s condition wrong.
- Doesn’t understand Lindeberg’s condition (a.k.a. the basics of the CLT).
- Accuses someone else of “resorting to stuff he literally doesn’t understand”.
This goes to show that just because you took a class on machine learning and learned to plug some data into a Python script doesn’t mean you understand statistical theory even in the slightest. You are like the car mechanic who thinks he is an engineer.
Uh, no. This is “making sure the requirements of the CLT are met,” which is really important if you want to make a claim based on the CLT. You don’t understand the basics of sampling because you called Lindeberg’s condition wrong, when Lindeberg’s condition is one of the ways to check whether the requirements of the CLT have been satisfied.
I’ve been saying repeatedly that their sample may not be robust enough to include a good representation of all possible strategies. That is what I have been saying from square one. This is Lindeberg’s condition. You called it wrong. You said I was trying to tailor my sample toward my conclusion. What you said is complete nonsense.
Mate, you understand that while you may look smart to people who have no experience with statistics, anyone with more than a couple of introductory statistics courses can tell how full of BS you are, right?
You aren’t going to bait me into this argument (and it’s not even an argument when you clearly have no idea what you’re talking about) when you don’t even understand the basics of unbiased sampling.
Once again you call the central limit theorem wrong while claiming that I am the one trying to fool everyone. Asking for a larger sample causes the distribution to normalize, not to become biased. Yet you are repeatedly saying that a larger sample equates to a biased sample. That is literally the exact opposite of how the central limit theorem works. From the Wikipedia page:
By the law of large numbers, the sample averages converge in probability and almost surely to the expected value µ as n → ∞. The classical central limit theorem describes the size and the distributional form of the stochastic fluctuations around the deterministic number µ during this convergence. More precisely, it states that as n gets larger, the distribution of the difference between the sample average Sₙ and its limit µ, when multiplied by the factor √n (that is, √n(Sₙ − µ)), approximates the normal distribution with mean 0 and variance σ². For large enough n, the distribution of Sₙ is close to the normal distribution with mean µ and variance σ²/n. The usefulness of the theorem is that the distribution of √n(Sₙ − µ) approaches normality regardless of the shape of the distribution of the individual Xᵢ.
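You can check this numerically in a few lines. This sketch draws the Xᵢ from an exponential distribution (heavily skewed, decidedly non-normal) with µ = 1 and σ² = 1; the sample sizes are arbitrary:

```python
import random
import statistics

random.seed(0)
n, trials = 1_000, 2_000

# X_i ~ Exponential(1): mean mu = 1, variance sigma^2 = 1, heavily skewed.
scaled = []
for _ in range(trials):
    s_n = statistics.fmean(random.expovariate(1.0) for _ in range(n))
    scaled.append(n ** 0.5 * (s_n - 1.0))  # sqrt(n) * (S_n - mu)

# Despite the skewed X_i, this should be close to N(0, sigma^2) = N(0, 1).
print(f"mean  ~ {statistics.fmean(scaled):.3f}  (expect ~0)")
print(f"stdev ~ {statistics.stdev(scaled):.3f}  (expect ~1)")
```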
For me, AlphaStar does these “stupid actions” because they actually don’t matter that much: they don’t really affect the win rate, so it can’t figure out that it shouldn’t do them. It’s all about statistics, and to figure out that something is a bad action it would need an enormous sample of games where it did that, since there are tons of other variables it also needs to consider, some of which probably have much more impact.
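To put rough numbers on “enormous sample”: a back-of-the-envelope power calculation for detecting a small shift in a ~50% winrate (the effect sizes are my own guesses, purely illustrative):

```python
import math

# Games needed per condition to detect that an action shifts a winrate of p
# by delta, at 95% confidence and 80% power (two-proportion z-test, rough).
def games_needed(p: float, delta: float,
                 z_alpha: float = 1.96, z_beta: float = 0.84) -> int:
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * p * (1 - p) / delta ** 2)

for delta in (0.05, 0.01, 0.005):
    print(f"detect a {delta:.1%} winrate shift: "
          f"~{games_needed(0.5, delta):,} games per condition")
```

A 0.5% winrate effect needs on the order of 150,000 games to detect, which is why a tiny inefficiency like losing one reaper barely registers in the training signal.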
On the other side, humans don’t use only rational thinking to make decisions, but also emotions. And a conservative instinct is probably one of the most strongly selected behaviors in evolution.
So when we control units, we don’t need to run thousands of experiments where we lose the unit to figure out that we should optimize the survival of our units. It relies on our own fear of death, and on our instinct to avoid situations that could kill us, which we transpose onto our units.
And it’s striking that when commentators see reapers killing drones, or reapers being killed, they emphasize it like it’s a big deal, but actually it isn’t really; it won’t change the outcome of the game much.
And emotions often bypass our rational way of thinking. It is considered a weakness most of the time, but it’s actually quite useful in certain ways.
People who have a medical condition that makes them unable to feel emotions are really affected in their decision making, and get stuck when facing really simple decisions, like which T-shirt to wear today.
So we could easily “improve” AlphaStar and implement a rule to keep its units/buildings alive (a survival-instinct rule). But the work on AlphaStar goes beyond SC2. It’s an unsupervised algorithm, so they want to see whether it finds that by itself, or whether it doesn’t care because the individual lives of its units don’t matter for winning at SC2 (which could have some ethical implications in applications more serious than SC2: if we decide to hand power to some AIs, they could be prone to sacrificing humans to achieve their goals).
Well, how do we know it wouldn’t learn how to counter the lurkers if all we have is one game it lost to them? I like your ideas about placing limits on its mechanics to improve its game sense, though. That makes perfect sense to me. I’ll have to run it by my brother; he’s more of an expert on this type of thing. I’m just a lowly artist.
I agree with rabi on this one. While it is impressive how far AlphaStar got, their algorithms still have a long way to go. It is a complex topic, and certain things are hard to interpret or judge, but it still fails to predict the future, doesn’t understand many things, and makes bad decisions. In its current state it clearly doesn’t fully achieve their long-term goal of developing AI applicable to the real world. Real life is much more complex than SC2; if it has this many problems in SC2, it is not that good! But the thing is, even if it is not perfect, it is kind of working already, and they didn’t stop AlphaStar development, right? And even if they did, they are using SC2 only as a stepping stone to test their algorithms. Their goal isn’t to make a perfect SC2 AI, but to improve their algorithms. And even if it is not perfect (I am not an AlphaStar creator), it was successful at some things, so if it was published in Nature it probably had some scientific value, otherwise it would not have gotten through peer review (btw, I didn’t check this fact).

As someone said, it sometimes makes stupid decisions, or doesn’t use that much strategy, because SC2 is a strongly mechanical game. The interesting thing is that after they restricted APM, it started playing more strategically and using a greater variety of units. There is still no excuse for not remaking observers after seeing 15 lurkers and having them in your base, while your army could clear that, and instead expanding away. Or for some other controversial decisions. So it is baffling that they simply claim it got better than 99.8% of players without even mentioning that it has ridiculous macro and 264 EPM in the first place. Same as before, when it played MaNa: they claimed only that it was a success, but completely ignored all the things at which it failed. Because it is far from perfect and it clearly doesn’t work that well, even if it is a huge stepping stone!
That’s the problem, though. Against a human opponent, mistakes like these would drive your win rate to 0%. If it’s too dumb to build observers vs lurkers, every person would go lurkers against it. None of the opponents in the pool this bot was trained against has the capacity to take advantage of these weaknesses. Ergo, a win rate is a product of how good your strategy is versus the pool of potential strategies. The human pool of strategies is constantly shifting as players try to counter one another, so a given strategy’s win rate will go up and down depending on the day.
No, actually, the bots wouldn’t necessarily have a large sample, because that assumes there is another bot capable of executing strategies that beat this bot’s current strategy. Yet this bot has zero capacity to deal with lurkers; ergo, the pool of opponents it was trained against didn’t have a player who went lurkers.
This is the fundamental problem with trying to train this thing to understand strategy when you are inferring from win rates. A strategy can have a high win rate not because the strategy is good, but because the pool of opponents is bad, or because the pool of opponents isn’t prepared with the right counter-strategies.
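To illustrate with toy numbers (the matchup probabilities below are invented), the same strategy can look great or terrible depending only on the mix of opponents it faces:

```python
# Toy model: one strategy's pooled winrate against two opponent pools.
# P(win | opponent strategy) -- invented numbers for illustration.
win_prob = {"bio": 0.70, "mech": 0.60, "lurkers": 0.05}

def pooled_winrate(pool_mix: dict[str, float]) -> float:
    """Winrate averaged over the mix of strategies in the opponent pool."""
    return sum(pool_mix[s] * win_prob[s] for s in pool_mix)

naive_pool   = {"bio": 0.6, "mech": 0.4, "lurkers": 0.0}  # nobody goes lurkers
adapted_pool = {"bio": 0.1, "mech": 0.1, "lurkers": 0.8}  # meta has adjusted

print(f"vs naive pool:   {pooled_winrate(naive_pool):.0%}")   # 66%
print(f"vs adapted pool: {pooled_winrate(adapted_pool):.0%}") # 17%
```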
That is clearly the problem here. It has really big flaws in how it plays that a human would be able to exploit to drive the bot’s win rate down to 0%. The training is struggling to generate bots that can strategically exploit one another.
It really emphasizes that the pool of opponents you played against has a huge impact on your win rate. It lost vs lurkers while refusing to make an observer, and it did this against someone below 5.3k MMR. That means if its entire pool of opponents went lurkers, it would have a 0% win rate and an MMR below 5.3k.
Humans will adjust their strategies to counter yours, which means that over time this thing’s Elo rank would drop.
I hope they publish their entire dataset, as I would gladly run their bots on my home server to prove this point: win rates are unstable over time because the meta shifts.
Yeah, I don’t know much about machine learning, but because there are too many possible moves, it depends a lot on supervised learning.
Another thing: in the Nature article, they said it too:
“AlphaStar is very impressive, and is definitely the strongest AI system for any StarCraft game to date,” he says. “That being said, StarCraft is nowhere near being ‘solved’, and AlphaStar is not yet even close to playing at a world champion level.”
Yeah, I take issue with that claim as well. There was a guy from, I believe it was, MIT who created a bot that did one-base blink stalker all-ins, and I guarantee it would be rank #1 worldwide just due to the insane cost efficiency of perfect blink micro. It is hard to imagine that someone would claim that AlphaStar is the most powerful AI in StarCraft 2. Perhaps it is the most powerful self-learning AI, but almost certainly not the most powerful AI.
I think you misunderstand the point of the AI; think about it in the simplest terms possible. The goal stated in their paper was to ‘match the skill’ of pro players, and in that sense we have a clear criterion: simply ask, did the AI win games against pros? And we already have our answer to that.
SC2 is an incredibly difficult and complex game with so many different decisions going on at the same time, and we can’t expect it to play the game perfectly like rabbi is asking, but we can clearly see that the AI was able to win a series of random games with an APM cap that wasn’t too insane, which is completely astonishing.
It also beat Serral. And once again, you’re imposing your own criteria on what a ‘successful’ SC2 player looks like; we have to look at it from the criteria it is judging itself on, which is: can it consistently win games in the very complicated game of SC2?
I suggest you do not listen to Rabbi and instead read their paper, because rabbi is lying about their stated goals, which you can clearly read in the paper.