AlphaStar: reaches GM with all 3 races
DeepMind’s marketing team: Success! It has MASTERED SC2!
Also AlphaStar:
https://i.imgur.com/NXFhx4H.png
https://i.imgur.com/NniBzfy.png
https://i.imgur.com/HfU2QHq.png
These are only a few of dozens of examples. This thing butchers the basics of the game, but it has nice micro and amazing macro. I’ve done a fair bit of AI development, and I am not impressed with this thing. Here is why. Macro and micro are very easy things to model; there were AIs programmed by teenagers with nice micro/macro. The point of creating a self-learning AI is to tackle problems that are difficult to model. A simple question like “When do you drop mules?” is very difficult to answer because it depends on the situation, and understanding situations and how they evolve into future situations is very difficult.
Yet here it is, dropping mules on a planetary that is going to die. That suggests it has essentially no ability to predict even near-future outcomes. It couldn’t tell that the planetary was going to die and that it shouldn’t drop mules, even though that would have been clear to a human before the battle even started. There are far more difficult questions in SC2 than “Will this planetary die?”, yet here it is dropping mules on a dead planetary.
I think it is pretty clear that this AI wins games on merits that AIs are already good at, while it struggles with the basics of the game. It builds factories in the middle of nowhere, places depots where they block its mineral line, throws its reapers away with NO regard for how important the first reaper is for scouting, and the list goes on and on.
These are huge problems from a game-understanding perspective. If a Terran throws his reaper away, creep spread gets a huge boost, early-game liberator/hellion harass will fail, and mid-game tank pushes won’t work. This fast-tracks the Zerg to hive tech. This AI doesn’t understand this. It has very limited game understanding. It doesn’t understand how one situation evolves into another. It just spams units and has nice micro.
I think the problem here is that the DeepMind team isn’t imposing the right restrictions on the AI. Since macro and micro are so easy to model, their bots will have a natural inclination to try to win the game on those grounds while ignoring the more complex aspects of the game. The way you work around this is to impose restrictions on the bot. You could limit its APM, for example, and that would restrict its reliance on micro. They have added some restrictions like this, but I think it is clear that they haven’t added enough.
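To make the APM idea concrete, here is a minimal Python sketch of a rolling-window action cap. The class name `ApmLimiter`, its parameters, and the idea of forcing a no-op when the budget is spent are my own assumptions for illustration. This is not a description of DeepMind's actual limiter, and it isn't tied to any real SC2 API.

```python
from collections import deque


class ApmLimiter:
    """Hypothetical cap on actions inside a rolling time window."""

    def __init__(self, max_apm, window_s=60.0):
        self.max_apm = max_apm      # actions allowed per window
        self.window_s = window_s    # window length in game seconds
        self._stamps = deque()      # times of recently allowed actions

    def allow(self, now_s):
        # Forget actions that have fallen out of the rolling window.
        while self._stamps and now_s - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_apm:
            return False            # budget spent: the bot must do nothing
        self._stamps.append(now_s)
        return True
```

The bot would call `allow(game_time)` before every action and skip the action when it returns `False`. A cap like this only limits how many actions happen, not when they happen, which is why I don't think it does much against macro.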
First, I think the micro restrictions are sufficient, with the exception of focus-fire. The bot has perfect focus-fire, while imperfect focus-fire is a limitation that humans have to deal with. Sometimes even Maru’s hands get sweaty.
Second, I don’t think the restrictions on APM are sufficient to limit macro. Macro requires very little APM; it is more about timing than APM. Bots can and will nail perfect timing every time. They should add a timing restriction to the bot’s macro.
Third, the bot can select units on the edge of the screen. Humans can’t do that because it would cause the camera to scroll. Add that restriction: if the bot tries to select a unit on the edge of the screen, the camera scrolls and it misclicks.
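Here is a rough Python sketch of what these three restrictions could look like. Everything in it is hypothetical: the function names, the error rate, the delay distribution, and the pixel values are placeholders I made up to show the idea, not tuned numbers and not anything from DeepMind's setup.

```python
import random


def noisy_target(intended, candidates, error_rate, rng):
    """Focus-fire: with probability error_rate, hit a different candidate."""
    others = [c for c in candidates if c != intended]
    if others and rng.random() < error_rate:
        return rng.choice(others)
    return intended


def jittered_time(ideal_s, rng, mean_delay_s=0.5, sd_s=0.3):
    """Macro timing: the ideal moment plus a random, never-negative delay."""
    return ideal_s + max(0.0, rng.gauss(mean_delay_s, sd_s))


def landed_world_pos(click_xy, camera_xy, screen_wh, margin_px=4, scroll_px=40):
    """Edge scroll: a click in the margin moves the camera first, so it lands off target."""
    x, y = click_xy
    w, h = screen_wh
    # Scroll left/up near the low edges, right/down near the high edges.
    dx = -scroll_px if x < margin_px else (scroll_px if x >= w - margin_px else 0)
    dy = -scroll_px if y < margin_px else (scroll_px if y >= h - margin_px else 0)
    cam_x, cam_y = camera_xy
    # World position under the cursor after the camera has moved.
    return (cam_x + dx + x, cam_y + dy + y)
```

For example, `landed_world_pos((1919, 300), (1000, 1000), (1920, 1080))` hits a point 40 units to the right of what the bot aimed at, because the click was inside the right-hand scroll margin. Passing a seeded `random.Random` as `rng` keeps the noise reproducible between training runs.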
These are the problems that I see. The goal of a self-learning AI is to model difficult scenarios in order to answer difficult questions. Their bots are naturally inclined to win games on the strengths that bots already have, yet that isn’t the goal of a self-learning AI, and it impedes the bot’s ability to focus on what it is supposed to learn. The restrictions they have added try to push the bot to win on other merits, but those restrictions are insufficient, and the massive advantages that computers have in APM and precise timing bleed through. More aggressive and more strategic restrictions are required to force their bot to win games on the basis of game knowledge and strategy. Until this happens, their bot is only marginally more impressive than traditional AIs.