Learning AI – Genetic Algorithms

C++ Portfolio ShowcasedPublished August 23, 2011 at 1:11 am No Comments

In this demonstration I have used various artificial intelligence techniques to create a “find safest path” learning AI simulation.  The below image outlines the map used to test the simulation, however the map can be modified so the enemies are in different locations and the map itself is different.

Learning AI Map

Map used in the simulation.

The scenario I setup consists primarily of two entities: the cars and the cops.  The cars spawn at the top-left most node (node 0) and try to reach the bottom-right most node (node 17).  Initially they go the shortest path using regular A* logic.  In addition to the learning aspect of A* they also use Manhattan Distance for the heuristics.  The shortest path, however, is one of the most dangerous paths.  There are cops placed throughout the city armed with laser beams ready to zap any cars passing by (because screw logic).  Since the cars are dumb to begin with, they will continuously go that route anyway.  After a while they try different paths and learn that just about any other path they take is safer than the shortest path.  Once many generations have passed (with each new car essentially being their own generation) the cars will try to stick with what works and will travel safer paths, which ultimately makes them appear a lot less stupid.
The movement in the simulation is fairly simple.  I’m using seek algorithms to go from node to node.  I think a simple seek does a good job at emulating a real car speeding through the city while attempting to avoid laser beam shooting cops.

From there we have the map.  I always thought it to look a bit strange to have a map in the middle of no where for things to follow, so the grass and buildings are purely to set the mood.  The neat thing about the map is that it’s loaded from an external text file.  This saved me having to hardcode in a bunch of values and also makes it so the map could be changed by someone who doesn’t have the source code.  It’s also incredibly nifty to press a key and it reload the map, allowing easy placement of cops and buildings.  But what is the point of having buildings if your A.I. doesn’t avoid them! That’s where A* comes in.  The A* supports Euclidean, Squared and Manhattan distances, however I just stuck with the Manhattan Distance.  Keeping it simple.

All of the learning is stored within the bots but it really affects the A* algorithm.  Each edge (or ‘arc’) is assigned an index.  The edge from 0 to 3 is index 0, from 0 to 6 is index 1, and so on.  The bots store added weights for each of these edges within them, then when they call the A* to get their path they give the algorithm these addition weights.  These weights are determined by genetic algorithms, but we’ll get to that soon enough.  The mutation process of the genetic algorithm lets the bots randomly decide to “hate” certain paths.  The hope is that eventually one of them hates the most efficient path, since it’s the most dangerous.  Once this happens they will take a different route, and if it works out better (judged by some fitness score), then they remember it.  Future generations will use this information, while continuously trying new things.  Let’s take a further look at how the genetic algorithm works.

I’m going to assume you know what a genetic algorithm is.  If not, refer to my article here.  I wish to only focus on how I am using it in this post.  I start by having each bot be their own “individual.”  The first few bots are created with 0 additional weights on the A*.  This is why they go the shortest path because they do not “hate” any paths yet.  This goes on for a while just to store some information about their current, unmodified progress.  Each individual has a ranking determined by a fitness score.  Those who do well rank higher up, while those who rank poorly rank further down.  On top of that, there is only five individuals stored in the ranking at a time, so those who perform poorly simply die out.  Think survival of the fittest.  This is important because this governs how the children are created.  I am using the roulette algorithm to randomly pick two of the five ranking individuals.  Then I’m using the crossover technique to create a new child.  This process happens every time a new car is spawned.  Doing this alone makes absolutely no progress, so let’s talk mutation.

If we constantly make perfect children of two parents, then there’s no variation in society.  Everyone knows something but no one can learn anything new.  They only know what their parents knew.  This is why mutation is important, and I found this to be the trickiest part of the entire project.  If the individuals mutate too much, then they try new things so often that they never do what actually works.  If the mutation is too little then they, eventually, after a couple hundred generations, will maybe find the right solution.  Balancing out the two wasn’t easy and required that I just play around with the values a bit until something looked promising.  It could probably be further improved but it works pretty well at the moment so I’m good with that.  I ended up settling with a 30% chance of mutation for every edge, with a random 0-50 increase in weight.

The fitness for the genetic algorithm was simply the health remaining for each individual.  For some reason this was a tricky part as well.  Once the bot is created they are not considered part of the population yet, it’s after they’ve ran their course that they be added to the population (and maybe removed immediately after for doing a horrible job).

That’s pretty much it.  Below is a video demonstration of the project.  As you can see the bots start off going the worst possible path and by the end go the safest path.

Leave a Reply

(required)

(required)