August 15, 2007

Digital sex is complicated

It turns out digital sex is a lot more complicated than I had originally thought. A key aspect to genetic algorithms is crossover between pairs of individuals. This crossover is made to reflect the genetic pairing seen in most eukaryotes. It is an artificial way to distribute genes from the parents to the children. Crossover is important because without it, a genetic algorithm is little more than a random search. Crossover (theoretically) lets the organisms traverse the fitness landscape quicker because common building blocks are passed from parents to children.

Techniques, thoughts and a cool adaptive study after the jump.

The most common form of crossover used in genetic algorithms is one-point crossover. A “line” is drawn down the chromosome for both parents. One parent donates the left half, the other donates the right half.


1-point crossover, image taken from Wikipedia


One-point crossover can be extended to two-point, where two lines are drawn and genetic material swapped accordingly. Crossover techniques can be three-point, four-point, etc. Another popular technique is called Uniform Crossover. Each gene in the chromosome has a uniformly equal chance of being passed to the child. The algorithm walks down the chromosome, flipping a coin at each gene. If the coin is heads, the father donates the gene. If the coin is tales, the mother donates the gene. In this way, a chromosome of L genes will average L/2 crossovers.

The big controversy arises when debating which technique is better. Proponents of x-point crossovers claim that crossover allows fundamental building blocks to be passed from parents to children. It is believed that certain configurations of genes work synergistically together and (for the most part) are independent of other genes. Because of this, crossover allows these groups of genes to migrate en masse to the child. X-point crossover enthusiasts claim uniform crossover is nothing more than glorified random mutation.

Proponents of uniform crossover contend that building blocks are over hyped and unsubstantiated claims. It has also been shown that uniform crossover helps break out of local maxima on the fitness landscape (ie. temporarily solutions that the population tends to converge on, preventing it from moving to a better fitness level). Because uniform crossover tends to break up “chunks” of genetic material, it is more capable of introducing novel genetics into the population to help stimulate new growth.

An interesting paper I just found (postscript) confronts this problem and presents a solution. They devised an experiment that allowed the simulation itself to decide which method was the best. In theory, the better method will produce more fit organisms. The method of crossover, therefore, was embedded as another gene in the chromosome of organisms.

They found that small populations benefited very little from the adaptive crossover technique while large populations responded well. It was theorized this was because the small population size lacked the diversity to make a difference in crossover choice. More importantly, they found that 2-point crossover performed better (and therefore survived genetically) than uniform distribution. The study is not entirely conclusive but a good starting point. For instance, it is possible that a particularly fit organism evolved early and happened to use 2-point crossover, from which further generations were created. Proliferation of 2-point does not necessarily conclude that it is a better method, although it does highly suggest it.

Which brings us back to the Distributed Neuron project. I was confused on which technique to pick. X-point crossover relies on building blocks but DN doesn’t have a large chromosome. The placement of genes in a chromosome affects the algorithm because different “building blocks” are separated. Should I try to place similar elements together? Where should the chromosomes be cut? Should it be random?

Furthermore, many of the genes are broad spectrum (ie. number of neurons, number of synapses per neuron, etc). Other sets of genes (like growth hormones and transmitter seeds) are very localized and could benefit immensely from the building block theory of crossover.

I’ve yet to decide on an approach but I may take the course I have with the entire project - let the simulation figure it out. The idea of embedding the crossover method in the chromosome itself is interesting and in the spirit of the DN project. Luckily, this is one of the design decisions that can be reversed and changed quite easily. Many genetic algorithm projects routinely switch the crossover and fitness schemes when local maxima are found. I have hope, however, that letting natural selection pick the best method will work out for DN.

2 Responses to “Digital sex is complicated”

  1. Anm Says:

    While it has been awhile since I’ve gotten my hands dirty with GAs, I’ve wanted to try a totally adaptive system. First, it would require a sparse genetic encoding. In other words, not every codon maps to part of some gene that is realized in the organism. That gives you room to attach control commands to some of those in-between codons. Commands like: increase/decrease mutation rate or increase/decrease crossover likelihood. The idea is to encode within the genetic representation which areas of the genome are critical and which areas could benefit from the most experimentation. There is evidence (that I’m not going look up at the moment) that such meta-mutation control exists within natural genomes.

    That said, you may need a counter balance to prevent the genome from locking itself from mutation. Perhaps each generation will increase mutation rates very very slightly, requiring active mutation to keep certain segments locked down. Or perhaps control commands that decrease mutation rates should be more likely to mutate themselves. Or maybe fitness functions will weed out any species that completely lock themselves from evolution.

  2. Zach Says:

    Hmm, that is a really good idea. Pathways (actions that are triggered by various neuron activities, such as downregulating transmitter production) and their distribution throughout the network are also completely mutatable. I might as well go all the way and make the system completely adaptable by including aspects like mutation rate and crossover probability/technique.

    I was also playing with the idea of adding extra “superficial” genes. These genes would be included but serve no functional purpose (roughly analogous to blue vs green eyes). These genes could be detected by the vision system of organisms, giving them the chance to build social networks of other organism they recognize. I’d like to eventually implement real sexual reproduction, where organisms choose their own mate instead of a computer algorithm. Perhaps one value for a superficial gene becomes dominant in fit organisms, leading to better reproductivity.

    Interesting things to think about. Thanks for the comment :)

Leave a Reply