## 6229 Starting with Neat, the XOR problem, some questions

• Jan 2, 2014
Hi,

I have decided to hold back on the CPPN project because I feel that I am still missing something in my basic understanding of Neat. As a result, I am now trying to reproduce the classic XOR problem as documented in "Evolving Neural Networks through Augmenting Topologies", which is available at http://nn.cs.utexas.edu/downloads/papers/stanley.ec02.pdf

I have read and reread the document several times and am part of the way through implementing some code that attempts to solve the XOR problem. Whilst the document is excellent in its overview of Neat, there are a few implementation details missing for me to be able to reproduce the XOR results exactly.

When I create a new population (say 150 genome members), all of the initial population will be the same species because all of the population will have each input node connected to the single output node, with no hidden nodes. In the initial population I have all nodes connected so my nodes will have ids of 0,1,2,3 representing the Bias,Input1,Input2,Output. For the genes connecting these nodes, I will have innovation ids of 0,1,2 (3 connections between inputs and the output).
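The starting population described above can be sketched as follows. This is only a minimal illustration: the `ConnectionGene` layout, the function names and the initial weight range of [-1, 1] are assumptions, not details taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class ConnectionGene:
    in_node: int
    out_node: int
    weight: float
    enabled: bool
    innovation: int


# Node ids as described above: 0 = bias, 1 = input 1, 2 = input 2, 3 = output.
BIAS, INPUT_1, INPUT_2, OUTPUT = 0, 1, 2, 3


def make_initial_genome(rng: random.Random) -> list:
    """Minimal genome: bias and both inputs wired straight to the output."""
    return [
        # Innovation ids 0, 1, 2 are shared by every genome in the population.
        # The weight range [-1, 1] is an assumption.
        ConnectionGene(src, OUTPUT, rng.uniform(-1.0, 1.0), True, innovation)
        for innovation, src in enumerate((BIAS, INPUT_1, INPUT_2))
    ]


def make_population(size: int = 150, seed: int = 0) -> list:
    """Create `size` structurally identical genomes with random weights."""
    rng = random.Random(seed)
    return [make_initial_genome(rng) for _ in range(size)]
```

Every genome here has the same three genes and differs only in its weights, which is why the whole initial population falls into one species.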

I understand that each new gene needs to have a new unique innovation id. Does the same rule apply to the nodes? I have assumed not and so continue as per below.

When I add a new node to a genome, I understand the concept of disabling the existing connection and adding 2 new connections. For each connection I have to check if the innovation id matches any existing genes or if a new one needs to be created. For the first new node I create on a random genome, the genes will have 2 new innovation ids, 3 and 4. Let's say that I create the new node between the bias and the output: my new node will be id 4 and my gene innovation ids will be 3 (the bias to node 4) and 4 (node 4 to the output). Let's say that in another population I create a new node between input 2 and the output. This new node will also be node id 4 because it is the first new node for this population. This means that I have 2 populations both with a node 4, but the nodes have different input connections. This new node will have a new innovation id for the new connection between input 2 and the new node, but the output connection gene will have the same innovation id as the other population because it is from node 4 to the output. Is this correct? I assume this is an important point, so guidance here would be useful.
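One possible way to make this scenario concrete is sketched below. It rests on an assumption that the paper does not spell out: a single registry, shared by all genomes, hands out both connection innovation ids (keyed by the pair of nodes) and new node ids (keyed by the innovation id of the connection being split). `InnovationRegistry` and `add_node` are hypothetical names. The weights of the two new connections (1.0 into the new node, the old weight out of it) follow the paper's description of the add-node mutation.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionGene:
    in_node: int
    out_node: int
    weight: float
    enabled: bool
    innovation: int


@dataclass
class InnovationRegistry:
    """Hypothetical registry shared by every genome (an assumption)."""
    next_innovation: int = 3  # 0, 1, 2 belong to the initial genes
    next_node: int = 4        # 0..3 are bias, input 1, input 2, output
    # (in_node, out_node) -> innovation id, seeded with the initial genes
    connections: dict = field(
        default_factory=lambda: {(0, 3): 0, (1, 3): 1, (2, 3): 2})
    # innovation id of the split connection -> id of the node that split it
    splits: dict = field(default_factory=dict)

    def connection_id(self, in_node: int, out_node: int) -> int:
        key = (in_node, out_node)
        if key not in self.connections:
            self.connections[key] = self.next_innovation
            self.next_innovation += 1
        return self.connections[key]

    def node_id(self, split_innovation: int) -> int:
        if split_innovation not in self.splits:
            self.splits[split_innovation] = self.next_node
            self.next_node += 1
        return self.splits[split_innovation]


def add_node(genome: list, gene_index: int, registry: InnovationRegistry) -> int:
    """Split genome[gene_index] with a new node and return the node id."""
    old = genome[gene_index]
    old.enabled = False  # the split connection is disabled, not removed
    node = registry.node_id(old.innovation)
    genome.append(ConnectionGene(old.in_node, node, 1.0, True,
                                 registry.connection_id(old.in_node, node)))
    genome.append(ConnectionGene(node, old.out_node, old.weight, True,
                                 registry.connection_id(node, old.out_node)))
    return node
```

Under this assumed scheme, splitting bias-to-output yields node 4 with innovation ids 3 and 4, as in the example above, while splitting input 2-to-output yields node 5 with innovation ids 5 and 6. Two nodes that split different connections therefore never share an id, and the same split made in two genomes reuses the same ids.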

When it comes to the actual reproduction of the genomes I am a bit lost and am just making a few assumptions regarding the order that the reproduction is performed.

The document states the following rules :

The champion of each species with more than five networks was copied into the next generation unchanged. There was an 80% chance of a genome having its connection weights mutated, in which case each weight had a 90% chance of being uniformly perturbed and a 10% chance of being assigned a new random value. There was a 75% chance that an inherited gene was disabled if it was disabled in either parent. In each generation, 25% of offspring resulted from mutation without crossover. The interspecies mating rate was 0.001. In smaller populations, the probability of adding a new node was 0.03 and the probability of a new link mutation was 0.05.
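The quoted rules can be collected into a parameter table, and the weight-mutation and disabled-gene rules sketched as code. The rates below are the ones quoted above. The perturbation step, the range for new random weights, and all function and key names are assumptions added for illustration.

```python
import random

# Rates quoted from the paper's XOR experiment description.
PARAMS = {
    "elitism_min_species_size": 5,   # champion copied if species has more than 5
    "weight_mutation_rate": 0.80,
    "weight_perturb_rate": 0.90,     # otherwise (10%) a new random value
    "inherit_disabled_rate": 0.75,
    "mutation_only_rate": 0.25,
    "interspecies_mating_rate": 0.001,
    "add_node_rate": 0.03,
    "add_link_rate": 0.05,
}

# Assumptions: the quoted text does not give these magnitudes.
PERTURB_STEP = 0.5
WEIGHT_RANGE = 1.0


def mutate_weights(weights, rng, params=PARAMS):
    """Return a mutated copy of `weights` following the quoted 80/90/10 rule."""
    if rng.random() >= params["weight_mutation_rate"]:
        return list(weights)  # this genome's weights are left alone
    mutated = []
    for w in weights:
        if rng.random() < params["weight_perturb_rate"]:
            mutated.append(w + rng.uniform(-PERTURB_STEP, PERTURB_STEP))
        else:
            mutated.append(rng.uniform(-WEIGHT_RANGE, WEIGHT_RANGE))
    return mutated


def inherited_enabled(enabled_a, enabled_b, rng, params=PARAMS):
    """Decide whether an inherited gene is enabled in the offspring."""
    if enabled_a and enabled_b:
        return True
    # Disabled in either parent: 75% chance of staying disabled.
    return rng.random() >= params["inherit_disabled_rate"]
```

The order in which these steps are applied during reproduction is not fixed by the quoted passage, so the sketch keeps each rule as an independent function.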

So, trying to turn that into a set of rules has proved to be a bit of a challenge.

I guess that to start with, each genome needs to have its fitness evaluated. As the first population is all of the same species, that is not a problem. Once each genome has its fitness calculated, do you simply do a roulette crossover?

For the interspecies mating values along with the probabilities of a new node or connection being made, are these values (0.001, 0.03 and 0.05) representing percentages?
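If these values are read as probabilities in the range 0 to 1 (an assumption, though it matches the quoted wording "the probability of adding a new node was 0.03"), they convert to percentages by multiplying by 100, and each would be tested against a uniform random draw. The helper names below are hypothetical.

```python
import random

ADD_NODE_RATE = 0.03
ADD_LINK_RATE = 0.05
INTERSPECIES_MATING_RATE = 0.001


def occurs(probability: float, rng: random.Random) -> bool:
    """True with the given probability (0.03 means roughly 3 times in 100)."""
    return rng.random() < probability


def as_percent(probability: float) -> float:
    """Convert a probability in [0, 1] to a percentage."""
    return probability * 100.0
```

Under this reading, 0.03 is 3%, 0.05 is 5% and 0.001 is 0.1%.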

In my basic example so far, I am doing a roulette crossover 75% of the time, with the remaining 25% just performing mutation. The mating pool contains all species, with the number of genomes from each species a multiple of the normalized fitness of the total population. Each population then has a 5% chance of having a node added, a new connection made, or an existing connection disabled.
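The 75%/25% split with roulette selection described above might look like the sketch below. It assumes non-negative fitness values and picks parents by index. `roulette_select` and `plan_offspring` are illustrative names, and the sketch ignores species boundaries.

```python
import random


def roulette_select(fitnesses, rng):
    """Pick an index with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        # Degenerate case: no fitness signal, fall back to a uniform pick.
        return rng.randrange(len(fitnesses))
    pick = rng.random() * total  # uniform in [0, total)
    running = 0.0
    for index, fitness in enumerate(fitnesses):
        running += fitness
        if pick < running:
            return index
    return len(fitnesses) - 1  # guard against floating-point round-off


def plan_offspring(fitnesses, rng, mutation_only_rate=0.25):
    """Decide how one offspring is produced: mutation only, or crossover."""
    first = roulette_select(fitnesses, rng)
    if rng.random() < mutation_only_rate:
        return ("mutate", first, None)
    second = roulette_select(fitnesses, rng)
    return ("crossover", first, second)
```

Nothing here prevents both crossover parents from being the same genome. Whether that should be allowed is a design choice the sketch leaves open.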

Regarding the population size, I find that after about 100 generations I end up with 150 different species (the same number as the population size). This is because each time I mutate or add a new connection/node I am forming a new species. Once I have this many species, do I just continue to crossover and mutate etc. as normal until the network fitness is optimal?
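For comparison, the paper assigns species with a compatibility distance, δ = c1·E/N + c2·D/N + c3·W̄, checked against a threshold, rather than treating every structural change as a new species. The sketch below uses the coefficients the paper reports for its experiments (c1 = 1.0, c2 = 1.0, c3 = 0.4, threshold 3.0) and its note that N can be set to 1 for genomes with fewer than 20 genes. Representing a genome as a dict from innovation id to weight is an assumption made to keep the example short.

```python
COMPATIBILITY_THRESHOLD = 3.0


def compatibility_distance(genome_a, genome_b, c1=1.0, c2=1.0, c3=0.4):
    """Distance between two genomes given as {innovation id: weight} dicts."""
    # Genes beyond the other genome's highest innovation id count as excess.
    cutoff = min(max(genome_a, default=-1), max(genome_b, default=-1))
    matching = genome_a.keys() & genome_b.keys()
    non_matching = genome_a.keys() ^ genome_b.keys()
    excess = sum(1 for innovation in non_matching if innovation > cutoff)
    disjoint = len(non_matching) - excess
    # N is the gene count of the larger genome, or 1 for small genomes.
    n = max(len(genome_a), len(genome_b))
    if n < 20:
        n = 1
    if matching:
        mean_weight_diff = sum(
            abs(genome_a[i] - genome_b[i]) for i in matching) / len(matching)
    else:
        mean_weight_diff = 0.0
    return (c1 * excess + c2 * disjoint) / n + c3 * mean_weight_diff


def same_species(genome_a, genome_b, threshold=COMPATIBILITY_THRESHOLD):
    return compatibility_distance(genome_a, genome_b) < threshold
```

With these values, a genome that gains one new node (two new genes) sits at distance 2.0 from its parent, below the threshold of 3.0, so a single add-node mutation alone would not create a new species.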

Sorry again for all the questions,

Regards,

Simon
