Skip to content

Propagator rework

No due date 0% complete
  • Make usage and extension of propagators more intuitive.
  • Introduce common (selection) operators (e.g., Roulette wheel, rank, steady-state, tournament, elitist, Boltzmann).
  • Provide new propagator class Evolutionary with a parameters selection, crossover, and mutation, where the user can specify the selection, crossover, and mutation strategy, in addition t…
  • Make usage and extension of propagators more intuitive.
  • Introduce common (selection) operators (e.g., Roulette wheel, rank, steady-state, tournament, elitist, Boltzmann).
  • Provide new propagator class Evolutionary with a parameters selection, crossover, and mutation, where the user can specify the selection, crossover, and mutation strategy, in addition to their respective hyperparameters (as a dict)

Propagator Rework

Check out DEAP documentation: https://deap.readthedocs.io/en/master/api/tools.html#module-deap.tools

Propulate can handle continuous (real-valued, float --> real-coded genetic algorithm), ordinal (int), and categorical (str) variables / traits.
Thus, we need to provide variants of selection, crossover, and mutation for each of these variable types.

Selection

Tournament Selection
Run several "tournaments" among a few individuals chosen at random from the (breeding) population. The winner of each tournament, i.e., the one with the best fitness, is selected for breeding. Selection pressure can be adjusted by adjusting the tournament size. The larger the tournament size, the smaller the chance of weak individuals to be selected, because, if a weak individual is selected to be in a tournament, there is a higher probability that a stronger individual is also in that tournament. Pseudo code:

choose k (tournament size) individuals from the (breeding) population at random (with or without replacement)
choose the best individual from the tournament with probability p
choose the second best individual with probability p*(1-p)
choose the third best individual with probability p*(1-p)^2
and so on

The tournament process is repeated multiple times until the required number of parents is selected for reproduction.

In deterministic tournament selection, the individual with the highest fitness is always selected as the winner and thus as a parent for the next generation.

Roulette-Wheel Selection
The probability of choosing an individual for breeding of the next generation is proportional to its fitness, the better the fitness, the higher the chance for that individual to be chosen. Choosing individuals can be depicted as spinning a roulette that has as many pockets as there are individuals in the (breeding) population, with sizes depending on their probability.
With the fitness of individual i, $f_i$, and the size of the current generation, $N$, the probability of choosing individual $i$ is:

$p_i = \frac{f_i}{\sum_{j=1}^N f_j}$

In this method, one individual can be drawn multiple times.
Note that roulette-wheel selection by definition cannot be used for minimization or when the fitness can be smaller or equal to 0.

Random Selection
Randomly choose an individual from the (breeding) population.

Best Selection
Select the $k$ best individuals in terms of their fitness from the (breeding) population.

Worst Selection
Select the $k$ worst individuals in terms of their fitness from the (breeding) population.

Rank Selection
The selection probability does not depend directly on the fitness, but on the fitness rank of an individual within the (breeding) population. This puts large fitness differences into perspective. Moreover, the exact fitness values do not have to be available, only a sorting of the individuals according to quality.
In linear ranking, the selection pressure can be set by a parameter $sp \in \left[1.0, 2.0\right]$, corresponding to no and high selection pressure, respectively. The probability $p$ for rank position $r_i$ is:

$p\left(r_i\right) = \frac{1}{n} \cdot \left(sp - \left(2sp - 2\right) \cdot \frac{i - 1}{n - 1}\right),\ 1 \leq i \leq n,\ 1 \leq sp \leq 2$ with $p(r_i) \geq 0,\ \sum_{i=1}^n p\left(r_i\right) = 1$

Boltzmann Selection
The Boltzmann selection operator introduces the concept of temperature, similar to the approach used in simulated annealing. The idea is to control the selection pressure dynamically over time, allowing the algorithm to explore the search space more broadly in the early stages and gradually focus on exploitation as it converges.
The probability of selecting an individual is influenced not only by its fitness but also by a "temperature" parameter, $T$.
Initially, the temperature is set high, which reduces the difference in selection probability between individuals with different fitness levels. This allows the algorithm to explore a wide range of solutions. As the temperature decreases, the selection pressure increases, favoring fitter individuals more strongly. The selection probability is modeled after the Boltzmann distribution (a probability distribution in statistical mechanics), which is used to describe the distribution of energy states in a system at thermal equilibrium. The selection probability for an individual $i$ with fitness $f_i$ is:

$p_i = \frac{\exp{f_i / T}}{\sum_j \exp{f_j / T}}$

NSGA / SPEA-II Selection
Only relevant in the context of multi-objective optimization.

Crossover

n-Point Crossover
Straightforward for all types of traits with two parents. Also known as discrete recombination.
How to handle more than two parents? Choose each segment from a parent with probability 1/num_parents?

Uniform Crossover
Works for all types of variables, with an arbitrary number of parents. The selection probability for each trait is 1/num_parents. Also known as discrete recombination.

Intermediate Crossover
Mix traits of two parents, $a^i_{P1}$ and $a^i_{P2}$, to obtain trait of resulting child, $a^i$. Only works for real-valued traits!

$a^i = a^i_{P1} \cdot \beta_i + a^i_{P2} \cdot \left(1 - \beta_i\right)$ with $\beta_i \in \left[-d, 1 + d\right]$ randomly equally distributed per trait $i$

Recommended: $d = 0.25$

Blend Crossover
Given two parents with traits $a^i_{P1}$ and $a^i_{P2}$ where $a^i_{P1} < a^i_{P2}$, randomly select child trait $a^i$ in the range
$a^i \in \left[a^i_{P1} - \beta_i \cdot \left(a^i_{P2} - a^i_{P1}\right), a^i_{P2} + \beta_i \cdot \left(a^i_{P2} - a^i_{P1}\right)\right]$

Recommended: $\beta_i = 0.5$

Simulated Binary Crossover (SBX)
Generate offspring by combining traits of two parents such that the distribution of the offspring values is centered around the parent values, with a probability distribution similar to that seen in binary crossover.

For each trait $a^i_{P1}$ and $a^i_{P2}$ in the parent vectors, a predefined, constant crossover distribution index $\eta_c$ is used to control the spread of the offspring values around the parents. The lower $\eta_c$, the larger the spread (more exploration); the higher $\eta_c$, the closer the offspring to the parents (more exploitation). With a uniformly distributed random number $u \in \left[0, 1\right]$, the crossover parameter $\beta_i$ is calculated as:

$\beta_i = \left(2u\right)^\frac{1}{\eta_c + 1}$ if $u \leq 0.5$ and $\beta_i = \frac{1}{2(1-u)}^\frac{1}{\eta_c + 1}$ if $u > 0.5$

The traits of the two children are then generated symmetrically around the midpoint of the parent traits:

$a^i_{C1} = 0.5 \cdot \left[\left(1 + \beta\right) \cdot a^i_{P1} + \left(1 - \beta_i\right) \cdot a^i_{P2}\right]$
$a^i_{C2} = 0.5 \cdot \left[\left(1 - \beta\right) \cdot a^i_{P1} + \left(1 + \beta_i\right) \cdot a^i_{P2}\right]$

If the generated offspring traits exceed the predefined bounds of variables, they are clipped to remain within the allowed range.

Partially Matched Crossover (PMC), Order Crossover (OX)
Recombination operators for combinatorial tasks / permutations probably not relevant for Propulate.

Mutation

Gaussian Mutation
Apply Gaussian mutation with standard deviation $\sigma$ around a current trait of the input individual. Only works for float-type traits.

Point Mutation
Point mutate given number of randomly chosen traits with given probability. Choose random value within provided search-space limits.

Loading