A genetic algorithm is a metaheuristic used for generating high quality solutions to optimization and search problems in machine learning. GA are mainly using biological operators as they are relying on the theory of natural selection.
[Picture 1 -The evolutionary Generational Cycle]
Introduction
In order to understand how GA works, we need to be familiar with some terms that we will be using:
- Population: This is a subset of all the probable solutions that can solve the given problem.
- Chromosomes: A chromosome is one of the solutions in the population.
- Gene: Consider it as an element in a chromosome.
- Fitness function: This is a function that in order to produce an improved output, a specific input(the solution) is used.
- Genetic operators: A genetic operator is an operator used to guide the algorithm towards a solution to a problem. The three main types of operators are mutation, crossover and selection.
So, now that we have seen some of the terminology that we will be using, lets get started. Genetic algorithms use the evolutionary generational cycle(See the picture 1) to produce high-quality solutions. As you can see, they use various operations that increase or replace the population with the aim to provide an improved fit solution.
In Genetic Algorithms, the following phases are followed:
Initialization
The genetic algorithm starts by generating an initial population. This initial population consists of all the probable solutions to the given problem. Generally, for initialization we use random binary strings.
Fitness assignment
Due to the fact that we should determine whether we should choose an individual for reproduction or not, a fitness score is assigned to all individuals in the population. The higher the fitness score of an individual, the higher the chances of being chosen for reproduction.
Selection
In this phase, individuals are selected for the reproduction of offspring by their scores. The selected individuals are then arranged in pairs of two to enhance reproduction. These individuals pass on their genes to the next generation, as there is a high chance that they will generate the best solution to the problem( *better than the previous one).
Reproduction
This phase involves the creation of a child population. Despite that it seems easy to do so, the algorithm employs variation operators that are applied to the parent population. The two main operators in this phase are the following:
a. The Crossover operator: This operator swaps the genetic information of two parents to reproduce an offspring. It is performed on randomly selected parent pairs in order to generate a child population of equal size as the parent population.
b. The Mutation operator: This operator adds new genetic information to the new child population. This is achieved by flipping some bits in the chromosome.
ex.
--First parent's chromosome: 000000
--Second parent's chromosome: 111111
--(One point Crossover)
--offspring1: 111000
--offspring2: 000111
ex2
--First parent's chromosome: 101011
--Second parent's chromosome: 110110
--(Multi point Crossover)
--offspring1: 100110
--offspring2: 111011
[Picture 2 - The crossover and muatation operations]
Replacement
In this phase, generational replacement takes place (the old population is replaced with the new child population). As we discussed earlier, the new population probably consists of higher fitness scores than the old population, so an improved solution is generated.
Termination
This generational process is repeated until a termination condition has been reached (upon the requirements we have set have been satisfied).
Pseudocode
While Pseudocode is not a programing language, it can help you understand the structure and the basic logic of a program. In our case, a GA example would be the following:
START
Generate a population
while not terminated condition
Compute fitness
REPEAT
Selection
Crossover
Mutation
Compute fitness
UNTIL termination condition
STOP
Now, that we have theoretically a basic understanding of what is going on, you can see the example code and try to understand it.
GA Example
This is an example showing a GA. I know that it might be difficult to understand due to the length and the way this program is written, but you can see how a GA actually is .
-- Let's start with our settings
problemSize = 64 --The size of the problem
mutationRate = 0.005 -- The frequency of new mutations
crossoverRate = 0.98 -- the frequency of new crossovers
populationSize = 6 -- the size of our population
maxGenerations = 100 -- The maximum generations
selectionTournamentSize = 3
seed = os.time()
-- Our Crossover function
function crossover(mum, dad)
if math.random() > crossoverRate then
return ""..mum
end
local cut = math.random(mum:len()-1)
local s = ""
for i=1, cut do
s = s..mum:sub(i,i)
end
for i=cut+1, dad:len() do
s = s..dad:sub(i,i)
end
return s
end
-- Our mutation function
function mutation(bitstring)
local s = ""
for i=1, bitstring:len() do
local c = bitstring:sub(i,i)
if math.random() < mutationRate then
if c == "0"
then s = s.."1"
else s = s.."0" end
else s = s..c end
end
return s
end
-- Our selection function
function selection(population, fitnesses)
local pop = {}
repeat
local bestString = nil
local bestFitness = 0
for i=1, selectionTournamentSize do
local selection = math.random(#fitnesses)
if fitnesses[selection] > bestFitness then
bestFitness = fitnesses[selection]
bestString = population[selection]
end
end
table.insert(pop, bestString)
until #pop == #population
return pop
end
-- the Ρeproduction phase!
function reproduce(selected)
local pop = {}
for i, p1 in ipairs(selected) do
local p2 = nil
if (i%2)==0 then p2=selected[i-1] else p2=selected[i+1] end
local child = crossover(p1, p2)
local mutantChild = mutation(child)
table.insert(pop, mutantChild);
end
return pop
end
-- Fitness function
function fitness(bitstring)
local cost = 0
for i=1, bitstring:len() do
local c = bitstring:sub(i,i)
if(c == "1") then cost = cost + 1 end
end
return cost
end
function random_bitstring(length)
local s = ""
while s:len() < length do
if math.random() < 0.5
then s = s.."0"
else s = s.."1" end
end
return s
end
--
function getBest(currentBest, population, fitnesses)
local bestScore = currentBest==nil and 0 or fitness(currentBest)
local best = currentBest
for i,f in ipairs(fitnesses) do
if(f > bestScore) then
bestScore = f
best = population[i]
end
end
return best
end
-- The Evolution function
function evolve()
local population = {}
local bestString = nil
-- initialize the popuation random pool
for i=1, populationSize do
table.insert(population, random_bitstring(problemSize))
end
for i=1, maxGenerations do
-- evaluate
local fitnesses = {}
for i,v in ipairs(population) do
table.insert(fitnesses, fitness(v))
end
-- update best
bestString = getBest(bestString, population, fitnesses)
-- select
local tmpPop = selection(population, fitnesses)
-- reproduce
population = reproduce(tmpPop)
print(">gen", i, "best cost=", fitness(bestString), bestString, "\n")
end
return bestString
end
-- run
best = evolve()
print("Finished!\nBest solution found had the fitness of", fitness(best), best, "\n")
The result should be something like this:
Additional Resources
- If want to learn more and are good at algebra then you can also see this: carrjk.pdf (whitman.edu)
Conclusion
After following this tutorial, you should have a basic understanding regarding the Genetic Algorithms. This tutorial, by no means covers everything about the Genetic Algorithms; yet its purpose is to make you interested in learning more by yourself. All in all, do not forget that it is an “Introduction” not an in-depth scientific handbook and should not be treated otherwise.
Notes
--Note1: This is my first even tutorial so any constructive feedback would be highly appreciated.
--Note2: Feel free to ask any questions about anything.
--Note3: If you think that I have forgotten something, explained anything wrong or have any sort of mistake feel free to mention it.