Catallaxy Services | @feaselkl |
|
Curated SQL | ||
We Speak Linux |
There are four major evolutionary algorithms:
In today's talk, we will cover the first two.
Evolutionary algorithms are great at solving a particular type of problem:
Evolutionary algorithms also perform extremely well when the environment (and thus the best solution) changes regularly.
Good uses for Evolutionary Algorithms: NP-Hard problems.
Evolutionary Algorithms help solve min-max problems and can avoid hill-climbing issues.
Evolutionary Algorithms tend to solve hill-climbing problems:
Many algorithms tend to get stuck at local maxima/minima; EAs often don't.
There are several good solution heuristics:
Evolutionary algorithms guarantee zero of these properties.
Going further, evolutionary algorithms have a number of limitations:
But...they do tend to get the job done.
Evolutionary Algorithms are modeled after simplified forms of biological processes.
Creative Commons. Author: John Gould
Organisms have chromosomes, whose purpose is to carry genes. In Evolutionary Algorithms, "organisms" (candidate solutions) typically have one chromosome.
Creative Commons. Author: Merlin G. Butler, et al.
Genes make up DNA sequences called genotypes. Each gene has a number of alternative forms, called alleles.
Creative Commons. Courtesy: National Human Genome Research Institute.
Genotypes are possible genetic structures. Given 2 alleles for each of 32 genes, we would have 2^32 or 4,294,967,296 genotypes.
Phenotypes are observable physical characteristics based on specific genotypes. This might be coloration, height, size, or beak structure.
Creative Commons. Author: Madeline Price Ball
BEWARE: there is not a 1:1 correspondence between genotypes and phenotypes, as phenotypes often depend upon a specific combination of alleles. When even one allele is missing, the desired effect may not appear. Close doesn't count with genetics.
The environmental niche is a set of features which certain phenotypes might be able to exploit. Example: thicker fur for northern climes.
Creative Commons. Author: Julio Reis
Genetic algorithms were popularized with John Holland's work on the topic, particularly Adaptation in Natural and Artificial Systems.
Applying terms:
Most simple genetic algorithms are arrays of genes with binary alleles:
We build up a population of organisms, large enough to ensure genetic diversity.
After building the population, we score each chromosome using the same fitness function:
Now each chromosome has its own score and can compare to other chromosomes.
For each organism (or pair of organisms) in the next generation, determine two "parents" based on some algorithm. A simple algorithm is roulette wheel, where we pull based on percentage fitness.
We then apply crossover with some probability. If that RNG pull succeeds, we cross over at some starting and ending point (also RNG determined):
Typically, there is one crossover range per organism pair. Sometimes, the crossover check fails and we keep the two parents. Usually we perform crossover with p = [0.6, 0.8].
For each gene in each new organism, we apply some mutation operation:
Mutation is a tool to help us get out of local maxima, but too much mutation and you have random search. Usually we perform mutation with p = [0.0001, 0.01]
Now we have a new organism, which is hopefully better than the old organisms in terms of solving our fitness function:
Repeat for each organism in the next generation, and repeat the process.
Great problems for genetic algorithms include the knapsack problem, portfolio selection, the traveling salesman problem, and the Holyfield problem.
The Holyfield Problem comes from Evander Holyfield's Real Deal Boxing for the Sega Genesis.
You have several options to help make your boxer the best.
Result: genetic algorithms let us choose the best set of options, making our boxer the best there is.
We want to maximize returns for a portfolio while minimizing risk.
We want to minimize total travel distance while hitting every city exactly once.
John Koza popularized Genetic Programming with his eponymous series of books.
Genetic programming is an extension of genetic algorithm concepts. Genetic algorithms feature fixed-sized chromosomes with data-representational alleles.
Genetic programming takes the mechanisms of genetic algorithms and applies it to a higher-level problem.
Similarities with genetic algorithms:
But there's one major difference: the organisms/chromosomes are entire programs.
Suppose we want to solve a problem where the fitness function is (56 - x)^2. One candidate solution:
In practice, genetic programs are significantly less parsimonious.
Genetic programs are great for deriving mathematical relationships.
Genetic programs can help us derive non-linear function results, similar to a regression model. If you are regressing, beware that genetic programs will overtrain on the current data; they're much more useful for identifying formulas.
Another use for genetic programming is building regression trees, which use decision tree mechanics but solve the same problem as a regression model: predicting the dependent variable's value.
Our regression tree was not better than a linear regression based on the attributes we selected, but was more understandable to a layman. In other cases, we can end up with better results.
Related to mathematical formulas, we can generate predictions for time series data.
How many points we have available and how many points we want to predict will determine how closely our genetic program's predictions can hew to actual data.
Evolutionary algorithms, including genetic algorithms and genetic programming, offer up an interesting approach to navigating large hill-climbing type search spaces with adequate efficiency and relatively high likelihood of avoiding local minima/maxima.
To learn more, go here: http://CSmore.info/on/gia
And for help, contact me: feasel@catallaxyservices.com | @feaselkl