Use DeepMind to balance races?

After DeepMind is fleshed out and fully functional with SC2, I would assume it wouldn’t be a massive change to adjust its interface slightly and give it a different RTS game to chew on. If they can do that, we can await our balance changes from an overworked AI and not have to worry about whether the war3 team’s balance decisions are good or not. It would also save the team a lot of time to work on more content (or just reduce labour costs).

2 Likes

Thanks Einstein, but DeepMind, for now, only works on a specific map, in a specific matchup, with this specific game, and they have actually worked on it for quite some time now (over two years, iirc).

I can only assume that it is extremely complicated to program a self-learning AI like this, and in the end human beings will play differently: with flaws, with micro and macro mistakes, with so-called instincts, and so on. Not to mention that several matchups and strategies are very much influenced by the battleground/map. It can help to collect some data, for sure, but I would not count 100% on it and say: “Yes, this RTS is completely balanced because the AI showed it”. Also, it is very likely that this is not gonna happen for WC3.

4 Likes

Thanks Hippocrates. It can work with other maps; they are using one map because it was a popular map that was considered well-balanced, and the AI learns its strats on a per-map basis. Of course it’s more difficult than just switching a few variables, but far less difficult than moving on to another game. It is Blizzard’s interface that the AI uses, so they could replicate that source code for war3. And actually yes, given enough time the AI would find the best way to play the game and would account for all known strategies. The current DeepMind AI just needs more work to get there.

You need to understand the purpose of an interface… the AI doesn’t care that it’s playing SC2. It is just given the objective of the game, and it uses Blizzard’s interface to move units and kill stuff until it wins. The same can be done with War3, seeing as it has the exact same controls (given the current version of the DeepMind-SC2 interface). In fact, if they coded it correctly, the AI wouldn’t even notice that they were different games.
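To illustrate the interface point, here’s a rough sketch (all names here are hypothetical, nothing to do with DeepMind’s actual code) of an agent loop written against a game-agnostic interface, so the agent never knows which RTS sits behind it:

```python
from abc import ABC, abstractmethod

class RTSEnvironment(ABC):
    """Hypothetical game-agnostic interface: the agent only ever talks to this,
    so in principle SC2 and WC3 could both be plugged in behind it."""

    @abstractmethod
    def observe(self):
        """Return the current observation (units, resources, map state)."""

    @abstractmethod
    def act(self, action):
        """Apply one action (move, attack, build); return (reward, done)."""

def run_episode(env: RTSEnvironment, policy):
    """The agent loop never references a specific game."""
    done, total = False, 0.0
    while not done:
        obs = env.observe()
        reward, done = env.act(policy(obs))
        total += reward
    return total
```

Whether a real WC3 backend could satisfy the same observation/action contract as SC2 is exactly the disputed question in this thread; the sketch only shows where the seam would be.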

3 Likes

You still cannot compare them. If the game is balanced for neural networks with no limitations on APM, it might still not be balanced for human noobs.

I agree with that statement. I’ve also been hoping for a vs-DeepMind AI ladder (both solo and co-op) on SC2, because that’d be fun. However, it has the same issue: the APM question is one of the things they still need to resolve, and I see it being a blocker for any sort of functional use case.

They don’t even provide the classic team with enough resources to do proper QA, and you want them to use DeepMind to balance WC3? Funny.

2 Likes

Because their team is also doing balancing. :stuck_out_tongue:

…of course another option would be to outsource the balancing to pros. That option wouldn’t save money in the long run though.

OP is right. That neural network is designed to play SC-like RTS games. If they provide W3 with an interface for it, it can play it. They could even use the agent that won against pro players, because it already knows how to control an RTS.
As for APM: if you watched the video, they mentioned that for fairness they capped APM at around 300, which is less than pro players have. So I can imagine them releasing it to the public with different difficulties at various APM caps, like 100, 150 or 200.
For balancing purposes, it could be awesome. They only need to let it learn to play all possible matchups. Then they apply balance changes, see the results on hundreds of thousands of samples, and test those changes properly. It will not be 100% balanced for humans, but they will be able to discover flaws before releasing those changes to the public. And it would take much less time to do balance changes, because this AI is capable of playing many more games than humans can. So they might be able to release balance changes backed by hundreds of thousands of test games every month.

And that’s what I’d love to see. War3 is much more difficult to balance than SC2, just due to there being heroes (and neutral heroes) and a fourth race. And they put the classic team on tackling this daunting task while keeping a dedicated balance team for SC2. It just doesn’t make sense not to pull the DeepMind AI over to war3…

However, yes, the current DeepMind is APM-capped. That doesn’t mean the AI will use that APM ineffectively (as was demonstrated with the crazy stalker micro). So they still need to make the AI’s control and focus imperfect.

It is a massive change. You have to design an API that allows a program to run a wc3 game. You have to develop sensors/events that signal stuff like damage, unit collision, and vision, and mimic ability effects (like chrono boosting a building), etc. (there is a reason only one matchup is possible so far, and that it is a mirror). And on top of that, you have to design it in a way that fits machine learning.

Then you have to find a research team that actually wants to develop a DeepMind network for wc3. You also have to cross your fingers that wc3 collected game data like SC2 does (I highly doubt it), to train the neural network. If you just let the networks train against each other without any human data, you will get two AIs that are good at one thing: beating each other, but not necessarily humans.

Even if you accomplish that, how do you think this is gonna be usable for balance changes? A neural network is basically a function that uses gradient descent to find a local minimum for a given problem, minimizing the error the network produces. The important part here is local minimum. If you are still familiar with a bit of calculus, you know that a local minimum can be, but doesn’t have to be, the global minimum. So there is no guarantee that the strategy a DeepMind AI finds is actually the best possible way to play.
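The local-vs-global-minimum point can be shown with a toy gradient descent run. The function below is arbitrary, chosen only because it has one global basin and one merely local basin; which one the descent settles in depends entirely on where it starts:

```python
def grad_descent(f_grad, x, lr=0.01, steps=5000):
    """Plain gradient descent: follow the slope downhill until it settles."""
    for _ in range(steps):
        x -= lr * f_grad(x)
    return x

# f(x) = (x^2 - 1)^2 + 0.3x has a global minimum near x = -1
# and a merely local one near x = +1.
f = lambda x: (x**2 - 1) ** 2 + 0.3 * x
grad = lambda x: 4 * x * (x**2 - 1) + 0.3

x_left = grad_descent(grad, -0.5)   # starts in the good basin
x_right = grad_descent(grad, 0.5)   # starts in the bad basin, stays stuck there
print(f(x_left), f(x_right))        # the right basin settles at a worse value
```

The analogy to the thread: a trained agent is one “start point”, and nothing guarantees the strategy basin it converged into is the globally best way to play.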
Even if it did, this network is not trained to make balance changes. It is trained to play the game. Where do you get the mass of data for the supervised learning part, to teach it what a good balance change is? Which balance changes is the AI allowed to make? How would you design and train such a network? Maybe it’s possible, but I can guarantee you that researching this would take a long time.

And in the end, games are mostly about fun. If you actually find a way to train this wonder balancing AI to make good, automated balance changes, who says that those changes are actually fun? Maybe the AI comes to the conclusion that the best possible solution is that only mirrors should be permitted. Or just one race.

I will stop here; there are many more complex problems such a task would have to take into consideration that I didn’t mention, and that I probably don’t have the expertise to. But trust me, that’s not a thing you just do with a bit of extra work.

TL;DR:

  • The AI is not fully fleshed out; it only plays one matchup on a specific map.
  • The matchup is a mirror, so no matter how unfun that mirror is, it is balanced anyway.
  • Neural networks are not like humans (yet). They are trained for a very specific context, and it’s not necessarily easy to change that context.
  • It’s probably more expensive than just developing Warcraft 4 or any other game.
4 Likes

Despite winning, the AI was still doing some weird stuff like odd unit movement and funky unit control. We simply aren’t there yet to depend on AI to balance for human players. We still have yet to understand how AI thinks and how its strats can be applied to players. Even if the AI deems the game balanced, players may not.

2 Likes

lol, people don’t know anything about AI.
Unless you make it FOR balance, it won’t help in any way and will probably just make things worse.

With respect, I don’t think you know anything about how they’ve built DeepMind. At a high level, it is built to learn from historical records, not from scripted events. When it plays against itself, it needs to try alternative methods to come out on top. Then it records the results. It keeps doing that until you tell it to stop. Fast forward 500-1000 years (of simulated time), and you will have an AI that knows all human strategies, plus some of its own that we haven’t thought of.
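The loop described above — try alternative methods, record the results, repeat — can be caricatured in a few lines. This is a toy with three made-up strategies and rock-paper-scissors payoffs, nothing resembling AlphaStar’s actual training; it only shows the “exploit history, occasionally explore” shape of self-play:

```python
import random

random.seed(0)

# Toy payoff structure: each strategy beats exactly one other (like RPS).
STRATS = ["rush", "turtle", "tech"]
BEATS = {"rush": "tech", "tech": "turtle", "turtle": "rush"}

def pick(counts, eps=0.2):
    """Epsilon-greedy: mostly play the historically best strategy, sometimes explore."""
    if random.random() < eps:
        return random.choice(STRATS)
    return max(STRATS, key=lambda s: counts[s])

wins = {s: 0 for s in STRATS}
for _ in range(10_000):
    a, b = pick(wins), pick(wins)
    if BEATS[a] == b:
        wins[a] += 1
    elif BEATS[b] == a:
        wins[b] += 1  # mirror games are draws and record nothing
print(wins)
```

Because the payoffs cycle, the “best” strategy keeps shifting as the records accumulate — a miniature of the meta churn the self-play league produces.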

@Saith
Good points, mostly. I don’t mean the AI will do the balancing, because it knows nothing about what would make the game playable and fun for humans. That would be an insane task. I mean: let humans make some changes, then run them through a massive simulation using DeepMind. Then record the metrics for win/loss and build order to determine what likely needs tweaking. It essentially ends up being the same thing as how we iterate balance changes now, except using AI to verify there are no huge unforeseen gaps.

Also, it sounds like you are more familiar than I am with the low-level implementation of the interface for SC2. Did they actually code things like collision and pathing into the interface? Or do they let the AI determine that on its own?

2 Likes

But that data is mostly inapplicable to humans, as it results from an AI doing whatever an AI does based on observing humans play. Even if it spits out data showing its use of Undead is winning 635/1000 games against humans, we aren’t really going to be able to use that as proof that Undead have an advantage. The AI doesn’t abide by any player meta, and would be creating a new meta for players to follow rather than players adapting to balance updates. I don’t think that will be fun for anyone.

1 Like

Don’t focus on the API implementation. That is something that can be done. I’m not going to speculate how much work it would be, but it is certainly possible.

Now for the results of AI games: if you teach the AI to play on the current patch and give it enough time to learn to play like human pro players, then when you make a change to the game balance and let it play again, you can observe how those changes shifted the AI meta. Of course it will not provide you with a 100% correct prediction of how the change will shift the real meta for pro players, but it may point you to unforeseen backfires of the change.
For example, I am sure that if this was applied to the recent changes to KotG and NE, the AI would adapt, and the devs would see that the NE win rate rapidly increased and that KotG is the most used hero in the game. And so they would be able to adjust those changes before releasing them to the public. Perhaps NE would still be much stronger than before, but it would not be so significant.

Simply put, in my opinion, the current results of DeepMind may lead to the creation of tools for developers to test their games much more thoroughly before release, and that will lead to higher quality and better balance in the future.

2 Likes

Exactly. It’s the prospect of improving quality in a more efficient manner that I’m interested in. AI certainly wouldn’t be perfect, but it would trend towards the best strategy to counter the opponent. What would be interesting for me to see is whether that strategy changes against the same build depending on what the AI already has.

@Triceron
If we aren’t playing the game to its fullest potential, then yeah, we totally should switch the meta to reflect what the AI finds more optimal. The meta is only the meta because it’s what players think is the best way to play. It changes whenever somebody figures out a new effective build. The same applies to AI.

The problem is we might never reach that AI meta. Nor is it likely that pros will adopt the AI meta strats, which we have no access to seeing or knowing. We would just be expected to trust that the game has been balanced while still experimenting, and all the while we will see it as still imbalanced. We are human, we are impatient beings. Instead of adapting strong anti-KotG strats, we simply say KotG is OP and needs a nerf. Not to say that KotG is balanced, but we are not doing what the AI does, and thus the AI meta is simply inapplicable.

Think about it: if the game was fully balanced around strategies that took thousands of years’ worth of games to reach, how long would it take us to reach that level of trial and error and counterplay?

Also, one drawback is potentially having the AI balance itself around a glaring flaw. For example, if there was a bug in the KotG micro AI vs. getting stunned, then it might suggest CL first with Impale, MK first, and TC first picks as immediate counters. The AI stops using KotG, expecting it to be countered, when it was just a bug in the AI. Players would never pick up on this nuance. Devs might not pick up on it either, for lack of understanding why stun became an immediate counter, and just assume it worked. Based on an AI meta, we should all be trying stun to counter KotG, and we would be left trying it out with half the people saying the AI is right and it works, and the other half saying the AI is wrong, fix the game. Let’s say the AI even determines Dark Ranger + Blood Mage is the de facto KotG counter; how long would it take us not only to figure out that combo, but to figure out the strats to use for the two?

1 Like

You better start coding now

With significant work, a binary could be made to train a modified AI on a game like wc3. However:

  1. DeepMind, not Blizzard, is doing the AI work. There’s no incentive for them to put in the extra work of changing the input/output of their model, and there’s no incentive for Blizzard to put in the extra work to make the research binary so that the AI can train in a reasonable amount of time.

  2. These things are trained on the Google cloud, not servers that Blizzard owns.

  3. If you introduce a change to the game, and you want to see effects from that change, you have to retrain from some base-level (for the sc2 bot the base level is the result of learning from replays). This takes a significant amount of time (order of a week) even with the massive amount of computing power available to DeepMind.

  4. There’s not necessarily any understandable relationship between the winning strategies learned after retraining and the previous winning strategies. The learning surface is obviously incredibly complicated; we have no clue what effect a patch will have on it. Convergence analysis is usually incredibly difficult.

  5. Human players on a PTR actively target the units which the patch is affecting, while the AI doesn’t. Using humans is probably more efficient than anything suggested in this thread.

Vykuu - To your points 3-5: that is not correct. You teach the AI how to play based on replays, and once it knows how to play, it only adapts its strategies based on their success. That means that when you, let’s say, give ghouls +10 dmg and +500 hp, the AI will discover, sooner or later, that this unit is OP and so ghoul rush is the best strategy. The same goes for the little changes, +1 dmg here and -2 strength there: the AI will discover that previous tactics are not as efficient as they used to be, and it will develop new ones. And as I have said before, it may not be accurate, but when you see that KotG-first ramps up from 10% to 95% of games, it is a clear indication that something is wrong.
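To see why even a naive learner would notice a buff like the hypothetical +10 dmg / +500 hp ghoul, consider a toy alternating-strike duel. The unit stats below are illustrative, not real WC3 numbers; the point is only that the fight outcome flips, which any win-rate-driven learner would eventually pick up on:

```python
def duel(hp_a, dmg_a, hp_b, dmg_b):
    """Toy alternating-strike combat: A strikes first; returns who survives."""
    while True:
        hp_b -= dmg_a
        if hp_b <= 0:
            return "A"
        hp_a -= dmg_b
        if hp_a <= 0:
            return "B"

# Illustrative pre-patch numbers: ghoul (A) vs footman (B).
print(duel(340, 13, 420, 12))   # -> "B": the ghoul loses the exchange
# Same fight with the hypothetical +10 dmg / +500 hp ghoul buff:
print(duel(840, 23, 420, 12))   # -> "A": now the ghoul wins comfortably
```

A self-play agent doesn’t need to be told about the patch: games involving ghouls simply start winning more, the strategy’s recorded success climbs, and ghoul-heavy play gets selected for.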

To Triceron - your argument about an AI bug is irrelevant. This kind of AI can’t have bugs like that, because it does not have any game-specific code. It knows its inputs and it has some win-condition measurement algorithms, but there is nothing in the AI about how to use stuns. It simply learns that this ability deals damage to the opponent and removes his movement/casting/damage ability, and so it is good to use it to disable or kill dangerous enemies. That is the reason it needs literally hundreds of years of experience to mimic real player behavior: it does not know anything at the beginning, and it iterates on its attempts. If you remove the stun from that ability, the AI will need another thousands of games to discover and adapt to the change, because you don’t tell it that you removed it; you let it discover the change for itself.
They mention that in the video as well. They started the league with x different agents and let them branch into different strategies. I suppose there were thousands of agents in the end, and they picked the 5 most successful branches.

1 Like