Use genetic algorithms to improve enemy AI over time in video games

I’ve played my fair share of video games in the past twenty years, and ever since I played for the first time, I have been fascinated by how someone could make these games work. More often than not, a special mechanic in a game left me wondering how the programmers implemented a certain feature. However, I often notice patterns in enemy behavior, which makes it far too easy for the player to predict how enemies will react in certain situations. While this may partly be due to limited AI design, I recently had the idea of experimenting with genetic algorithms to make enemies adapt to the player’s style over time, making it more difficult for the player to achieve their goals. My theory is that as the enemies adapt to the player, the player is in turn forced to try different strategies to succeed. This article summarizes my experiments, discusses the results, and shows how they could be applied in game development.

An overview of genetic algorithms

In short, genetic algorithms (GAs) are a class of algorithms that is especially suitable for solving hard search problems with large (potentially infinite) search spaces. Such problems include k-colorability and the knapsack problem (PDF download). The main idea is the same as in natural evolution: survival of the fittest. The solution that works best in a given environment is the one that should survive the longest and produce the most offspring. Over time, the overall population gets closer to the optimal solution with each new generation.

Each member of the population is characterized by a genome, which describes the attributes of that particular individual. Typically, the first generation starts with random solutions (i.e., random genome values). These can also follow a heuristic: for example, if you already know that a certain part of the genome matters more than another in a given environment, that gene could start with a high value.

Another important part (if not the most important part) of a GA is the fitness function, which quantifies how good a particular solution is. In other words, the fitness function describes how well an individual of a population fits into the current environment. In my experiments, I found that this was probably the hardest part to design properly. As I’ll demonstrate later, the fitness function heavily depends on the goal you want to achieve; for different goals, it can vary drastically. Note that the fitness function is the same for all members of all generations.

Then, the best solutions of a generation are selected according to their fitness and produce offspring (in nature, that would be mating). The idea is to select only the best individuals (in my case, those with above-average fitness) and use their genomes to create the next generation, which is hopefully better than the previous one. This is called the crossover step of a genetic algorithm. After each crossover, there is also a chance of a random mutation, which typically changes only a single gene in an individual’s genome, although variations exist. Furthermore, the old generation can either be destroyed or kept alive after the crossover; each option works better for certain scenarios.
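The steps above can be sketched in a minimal generation loop. Note that the genome length, population size, mutation rate, placeholder fitness function, and the single-point crossover are illustrative assumptions on my part; only the above-average selection and the single-gene mutation follow the description above.

```python
import random

# All numeric parameters here are illustrative assumptions.
GENOME_LENGTH = 8
POPULATION_SIZE = 40
MUTATION_RATE = 0.05

def random_genome():
    # First generation: random genome values.
    return [random.random() for _ in range(GENOME_LENGTH)]

def fitness(genome):
    # Placeholder: in a game this would measure e.g. calories eaten.
    return sum(genome)

def crossover(parent_a, parent_b):
    # Single-point crossover: a prefix from one parent, the suffix
    # from the other.
    point = random.randint(1, GENOME_LENGTH - 1)
    return parent_a[:point] + parent_b[point:]

def mutate(genome):
    # Small chance to replace a single gene with a new random value.
    if random.random() < MUTATION_RATE:
        genome[random.randrange(GENOME_LENGTH)] = random.random()
    return genome

def next_generation(population):
    # Only individuals with above-average fitness become parents.
    scores = [fitness(g) for g in population]
    average = sum(scores) / len(scores)
    parents = [g for g, s in zip(population, scores) if s >= average]
    # The old generation is destroyed and fully replaced.
    return [
        mutate(crossover(random.choice(parents), random.choice(parents)))
        for _ in range(POPULATION_SIZE)
    ]

population = [random_genome() for _ in range(POPULATION_SIZE)]
for _ in range(50):
    population = next_generation(population)
```

Each call to `next_generation` replaces the population entirely, i.e. this sketch takes the "destroy the old generation" option.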

Thinking about life goals

So, from the above we know that our game needs the following things to utilize GAs to improve enemies over time:

  • A genome that describes the enemies
  • Some form of goal
  • A fitness function that quantifies how close a single enemy comes to the optimum solution (i.e. the goal).

At first, this seems pretty straightforward. But what exactly is the goal of the enemies? Answering this question is the hardest part of designing a genetic algorithm for a scenario without a clear goal. Let’s look at it from this angle: the 0-1 knapsack problem, introduced above, has a clear goal, namely to maximize the value of the items in the backpack by combining items without exceeding the backpack’s capacity. But what exactly is the goal of enemies in a computer game? Is it to cause as much damage to the player as possible? To prevent the player from reaching a certain goal? To avoid getting killed?

As you can see, this is not easy to answer, and the solution heavily depends on the type of game you’re designing. The problems don’t stop once you’ve defined the goal for the enemies, either: how do you put that goal into numbers a computer can understand? And how exactly can the fitness function quantify how good each individual of a population is with respect to the defined goals?

Therefore, it’s extremely important to think about the goals of the AI before starting to design any other component of the GA.

Designing a simple example game

To demonstrate how (or whether) genetic algorithms actually influence AI in a game, I built the following small competitive game:

Figure 1: The most important components of my example game

The game might not look like much, but it suffices to demonstrate whether the genetic algorithm works and to experiment with it. The smaller dark-green cubes represent food, a shared resource that the entities compete to reach first. The entities are the larger light-green and yellow cubes. They have two parameters that we will focus on for now: perception and speed. The perception defines how far an entity can see; it can’t see food outside of this range. The speed defines how quickly an entity can move. The goal of each entity is to maximize the calories it eats.

The game starts with a random population of 40 entities and some food cubes. The food cubes spawn at random positions at regular intervals, and they spoil after a certain amount of time. Besides that, the entities also die of old age after a couple of generations.
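As a sketch, the food seekers could be modelled like this. Only the perception range, the speed, and the calorie-based fitness come from the game description; the class layout, attribute defaults, and value ranges are assumptions on my part.

```python
import random

class Entity:
    """A food seeker with a two-gene genome: perception and speed."""

    def __init__(self, perception=None, speed=None):
        # The first generation starts with random genome values.
        # The value range 1.0..10.0 is an illustrative assumption.
        self.perception = perception if perception is not None else random.uniform(1.0, 10.0)
        self.speed = speed if speed is not None else random.uniform(1.0, 10.0)
        self.calories_eaten = 0.0

    def can_see(self, distance_to_food):
        # Food outside the perception range is invisible to the entity.
        return distance_to_food <= self.perception

    def fitness(self):
        # The goal of each entity: maximize the calories eaten.
        return self.calories_eaten
```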

With only this description, let’s look at a few results from the experiments. Naturally, entities with better eyesight and a higher movement speed should have an advantage over the others, and these are the results of five experiments:

As expected, the speed of the food seekers went up, because this is a trait that helps an individual eat more in a competitive environment: it can reach food faster than the others and, therefore, eat more in the same time. The value, however, doesn’t exceed eight over time. I suspect this has to do with inertia: the entities can’t turn fast enough to move towards the food when necessary, and they pass it more often than they manage to reach it.

The perception, on the other hand, didn’t seem to make as much of a difference as expected; there is no clear trend in the data. I think this is because there’s enough food available on the relatively small playing field. However, we can also see that the population with the lowest perception (experiment 2) died out after a while. Therefore, the perception should still stay above a certain minimum for a population to succeed.

I also added a figure with the average life expectancy of each generation; the life expectancy decreased in only one out of five experiments. This makes sense, because an entity that lives longer can, on average, accumulate more calories over its life than one that dies earlier.

The last graph displays the average hunger of each generation. The name might be a bit misleading: in my game, the hunger describes how eagerly an entity searches for food. An entity only searches for food if its health value drops below its hunger value. The maximum health is 100, and we can observe how quickly the average hunger value climbs above that value, making the entities look for food permanently.
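The hunger mechanic boils down to a single comparison. The function name is hypothetical; the maximum health of 100 comes from the description above.

```python
MAX_HEALTH = 100.0

def searches_for_food(health, hunger):
    # An entity only looks for food while its health is below its
    # hunger threshold. A hunger value above MAX_HEALTH therefore
    # makes the entity look for food permanently, which is the
    # behavior the experiments converged to.
    return health < hunger
```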

Therefore, I can conclude that this mechanism does indeed work in my game, and the best strategy to win in this game is to be fast and hungry.

The inclusion of enemies

Next, I added enemies to the game. Their goal is to kill as many of the existing entities as possible. To make it simple, enemies kill an entity as soon as they touch it. It makes sense to assume that faster entities will survive longer on average, because they can outrun most enemies. And, in most cases, the result looks similar to this:

Figure 2: The typical development of the average speed of enemies and entities

Note that this is one of five experiments, but three of the other graphs looked practically indistinguishable. What’s important is that the trend is clearly visible: both average speeds increased over time, but speed was a more important trait for the entities than for the enemies. There was, however, one instance where the effect of a random mutation can be seen:

Figure 3: A different outcome thanks to a random mutation in the enemy genome

In fig. 3, a random mutation happened towards the end of the experiment. One of the enemy genes mutated, and that enemy became more and more successful, therefore producing more offspring. Figure 3 shows the only experiment in which the enemies managed to destroy all the entities in the game.

For comparison, here is one of the experiment results for the average perception of enemies and entities:

Figure 4: Average perception of enemies and entities

Interestingly, the perception of the entities always started high and remained relatively high. I could observe a slight decrease over time, but no data sample was significant enough to make a reliable prediction. The enemies, on the other hand, always started with an extremely low perception, which increased over time.

To clarify, I should mention a few things. Between the first series of experiments (without enemies) and the second one, I drastically increased the size of the playing field. Furthermore, I decreased the hunger damage for all entities so that they would still have enough time to flee from enemies. Note that the hunger consistently increased over time in all experiments; I think this is because the entities still had the same goal as before (maximize the calories eaten in a lifetime). Besides the changed playing field, the number of enemies was limited to 100 (compared to 800 entities) so that the enemies wouldn’t kill all entities right away. I think this explains why the enemy values normally did not change as quickly and drastically as the entity values, although the trend is still visible. Finally, the enemies had no predators of their own; they could only starve. This means that only the worst enemies (the ones that couldn’t hunt down even a single entity) died of hunger. I think all these factors explain why the enemy values didn’t increase that much in a typical run. It also shows, however, how drastically a positive random mutation can change the outcome of a genetic algorithm.

What does all this have to do with video games?

Not much, so far. I have only shown that genetic algorithms work and that they typically produce better offspring with each new generation. Now let’s look at a control group, if you want to call it that.

In this control run, I removed the crossover phase and the chance of random mutations. The mating works as before (same number of children, same selection procedure), but the new generation receives random genome values instead of the ones from the parent genomes.
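The difference between the GA runs and the control run comes down to how a child genome is produced. As a sketch (the genome length and the single-point crossover are assumptions on my part):

```python
import random

GENOME_LENGTH = 8  # illustrative

def crossover_offspring(parent_a, parent_b):
    # GA runs: the child inherits a prefix from one parent and the
    # suffix from the other (single-point crossover).
    point = random.randint(1, GENOME_LENGTH - 1)
    return parent_a[:point] + parent_b[point:]

def control_offspring(parent_a, parent_b):
    # Control run: the parents are selected exactly as before, but
    # the child ignores their genes and gets fresh random values.
    return [random.random() for _ in range(GENOME_LENGTH)]
```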

Then, I compiled the following chart that contains the average fitness of all experiments in all three runs:

Figure 5: Average fitness of all experiments in all runs

Remember: run one was entities only, run two contained entities and enemies, and run three was like run two but with random solutions only. Runs one and two contained five experiments each, run three contained three. Furthermore, remember that the fitness functions are cumulative (eaten calories and killed entities, respectively, over time), and that the calories per food item were changed after run one. Therefore, run one is only included for comparison and is not considered in the conclusion.

The run without the genetic algorithm (blue) starts off very strong (remember: random values only!), but the average fitness steadily decreases over time, which should not happen with a cumulative fitness function. To me, this shows that the average fitness worsens because the entities don’t manage to adjust to the environment. Initially, there’s enough food and there aren’t many enemies, so there’s enough to eat for everyone, regardless of how fast they are or how far they can see. After some time, that food is gone and the enemies have produced offspring. The entities, however, failed to adjust to this, and they consume less food, on average, with each new generation.

With the genetic algorithm approach (green), we can observe the same spike in the beginning, followed by a fast decrease. This is the same problem: initially, there’s plenty of food, which is gone after a while. Then, unlike in the random population, the decrease stops. This is the point where only the better solutions start to survive. The weaker individuals die and gradually make room for better offspring with each generation, which is where the green line starts increasing.

The trends are obvious: the random solution (blue line) decreases after the initial spike, while the genetic algorithm (green line) increases over time and overtakes the blue line at the end of the experiment.

For the enemies, the trend is less apparent: both curves are practically linear. The genetic algorithm, however, produces a steeper slope than the random data. I think the enemy curves actually show the difference between the random solution and the genetic algorithm even more clearly: for the enemies, the food (i.e., the entities) is always plentiful, so the quality of the solution makes the only difference.

Possible use-cases

I think employing genetic algorithms in games makes sense whenever the game is relatively open. Bad examples are chess and Tetris: those games have very strict rules and leave little to no room for experiments. Furthermore, a typical game of chess takes quite some time.

Employing GAs in games makes sense wherever enemy/opponent attributes can be tweaked between rounds, and where such rounds occur often. One example would be round-based survival games (e.g., Call of Duty zombies). Here, each round of zombies could contain the top 50% of the previous round and 50% random solutions. Over time, the enemies would, theoretically, get gradually better at attacking the player.
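Such a round-based scheme could look like the following sketch. The zombie attributes are hypothetical; only the 50/50 split between the best survivors and fresh random solutions comes from the text.

```python
import random

def random_zombie():
    # Hypothetical genome; attribute names and ranges are assumptions.
    return {"speed": random.uniform(1.0, 10.0), "aggression": random.uniform(0.0, 1.0)}

def next_round(zombies, fitness, population_size):
    # Keep the top 50% of the previous round's zombies and fill the
    # rest of the population with random newcomers.
    ranked = sorted(zombies, key=fitness, reverse=True)
    survivors = ranked[: population_size // 2]
    newcomers = [random_zombie() for _ in range(population_size - len(survivors))]
    return survivors + newcomers
```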

Conclusion and Summary

Genetic algorithms are a class of algorithms for solving, or approximating solutions to, hard problems with large search spaces. In this article, I discussed a few experiments conducted in a self-made game that employs genetic algorithms to improve the game AI over time.

In conclusion, I can say that the solutions did indeed improve over time. We saw that the initial population contained good and bad solutions, as they were created randomly. The GA, however, clearly improved the average fitness of the solutions over time, while the average fitness of the randomly created populations steadily decreased.

In game development, this technique can be useful in round-based player-versus-computer games. The overhead of implementing a GA to alter enemy values over time, however, might not be worth the effort. Usually, it’s pretty obvious which attributes should increase with time to offer a more difficult gaming experience (e.g., enemy speed, enemy hit points, etc.), and it would be easier to simply increase those values over time, for example with every new wave of enemies. Furthermore, GAs traditionally include mutations, which can lead to a very unbalanced experience, as we saw. This could also go the other way, for example if the random starting population is bad. Employing GAs in games would, however, definitely make every run of the game a unique experience, and it could offer a better experience (in combination with a good AI) than manual tweaking of enemy attributes.

Finally, I’d say that developers could achieve the best results if the AI has different strategies (e.g., hide and attack, run towards the player and attack, attack from the side, etc.) and lets the genetic algorithm decide which approach to choose. More successful approaches would be chosen more often as long as they work, and different ones would be chosen once the old strategy stops working as well.

Resources

Game source code (use the tags for different experiments) – GitHub

Sources

Solving the 0-1 knapsack problem with genetic algorithms – micsymposium.org
Graph coloring – wikipedia.org
