@Aieres No. I’m certainly not as much of an expert as a DeepMind employee, but I use, and have hand-coded, some ML algorithms in my research.
You can’t teach an AI to play from replays on a patch that hasn’t gone out yet; there won’t be any… Furthermore, replay-based learning is only the first step in training the current AlphaStar AI.
Secondly, you are far too optimistic about how the AI will react to patches. Your assumption that massive game-breaking changes are no different from minor ones is exactly the problem. The learning often gets stuck in local minima: if you’re optimizing a non-convex objective, you have no guarantee that the strategies you learn are optimal, and the agent can stagnate on non-optimal strategies. This is the point I think you’re missing. ML algorithms usually stop meaningfully learning after some amount of training, and in fact letting them run too long can actually reduce their effectiveness.
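To make the stagnation concrete, here’s a toy sketch of my own (nothing to do with AlphaStar’s actual training loop, and the function is one I made up): plain gradient descent on a non-convex function settles into whichever local minimum its starting point happens to sit above.

```python
# Toy non-convex objective with two minima; the deeper (global) one is near x = -1.04.
def f(x):
    return (x**2 - 1)**2 + 0.3 * x

def grad_f(x):
    return 4 * x * (x**2 - 1) + 0.3

def gradient_descent(x, lr=0.05, steps=500):
    """Plain gradient descent: converges to the minimum of whatever basin x starts in."""
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

# Start to the right of the barrier (~x = 0.08) and you settle into the shallow
# local minimum near x = 0.96, never finding the deeper one near x = -1.04.
x_local = gradient_descent(0.5)    # ends near 0.96
x_global = gradient_descent(-0.5)  # ends near -1.04, where f is lower
```

Same algorithm, same function; only the starting point differs, and the first run is permanently stuck at the worse solution.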
There are various tricks ML practitioners use to try to get around being stuck in local minima (adding second-derivative information, trying a variety of cost functions, random restarts, etc.), but none of these guarantee reaching a global minimum as opposed to a local one.
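As one example, random restarts (sketched here on a toy function of my own choosing, not anything AlphaStar-specific) improve your odds by sampling several starting points, but with finitely many starts there is still no guarantee of landing in the global minimum’s basin:

```python
import random

def f(x):
    return (x**2 - 1)**2 + 0.3 * x  # toy non-convex objective, global minimum near -1.04

def grad_f(x):
    return 4 * x * (x**2 - 1) + 0.3

def descend(x, lr=0.05, steps=500):
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

def random_restarts(n=10, seed=0):
    """Run gradient descent from n random starts and keep the best endpoint.

    Helps in practice, but it only probes finitely many basins, so there is
    still no guarantee any start lands in the global minimum's basin.
    """
    rng = random.Random(seed)
    return min((descend(rng.uniform(-2, 2)) for _ in range(n)), key=f)

best = random_restarts()  # with these starts, reaches the deeper minimum near -1.04
```

With ten starts this happens to find the deeper minimum; with one unlucky start it wouldn’t, which is exactly the lack of guarantee I mean.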