| Algorithm-Evolve documentation | Contained in the Algorithm-Evolve distribution. |
str_crossover( $string1, $string2 [, $N ] )arr_crossover( \@array1, \@array2 [, $N ] )str_agreement( $string1, $string2 )arr_agreement( \@array1, \@array2 )str_mutate( $string1 [, $num [, \@alphabet ]] )arr_mutate( \@array1 [, $num [, \@alphabet ]] )str_random( $size [, \@alphabet ] )arr_random( $size [, \@alphabet ] )
Algorithm::Evolve::Util - Some useful utility functions for use in evolutionary algorithms.
use Algorithm::Evolve::Util ':str';
use Algorithm::Evolve::Util ':arr';
At the moment, this module only provides string- and array-mangling utilities. They can be imported with the use arguments ':str' and ':arr' respectively.
In the following descriptions, a gene refers to either a string or an array reference. A position in the gene refers to a single character for string genes and an array element for array genes.
str_crossover( $string1, $string2 [, $N ] )arr_crossover( \@array1, \@array2 [, $N ] )Returns a random N-point crossover between the two given genes. $N defaults
to 2. The two inputs should be the same length, although this is not enforced.
$N must be also less than the size of the genes.
If you are unfamiliar with the crossover operation, it works like this: Lay
down the two genes on top of each other. Pick N positions at random, and cut
both genes at each position. Now swap every other pair of segments, and tape
the genes back up. So one possible 2-point crossover on the string genes
aaaaaa and bbbbbb would produce the two genes abbaaa and baabbb
(the two "cuts" here were after the 1st and 3rd positions).
str_agreement( $string1, $string2 )arr_agreement( \@array1, \@array2 )Returns the number of positions in which the two genes agree. Does not enforce that they have the same size, even though the result is somewhat meaningless in that case.
String gene comparison is done in a non-unicode-friendly way. To override this
and use a (slower) unicode-friendly string comparison, set
$Algorithm::Evolve::Util::UNICODE_STRINGS to a true value.
In array genes, the comparison of individual elements is done with eq.
Note that this is the Hamming metric, and not the edit distance metric. Edit distance may be an interesting fitness to use as well. There are at least two modules (Text::Levenshtein and Text::LevenshteinXS) that I know of which calculate the edit distance of two strings.
str_mutate( $string1 [, $num [, \@alphabet ]] )arr_mutate( \@array1 [, $num [, \@alphabet ]] )Returns a random mutation of the gene according to the given alphabet
(defaulting to {0,1}). If $num is less than 1, it performs probabilistic
mutation, with each position having a $num probability of being mutated. If
$num is greater than or equal to 1, it performs N-point mutation: exactly
$num positions are chosen at random and mutated. $num defaults to 1. A
convenient rule of thumb is start with a mutation rate of 1/gene_length.
A mutation will always change the character in question: an 'a' will never be chosen to replace an existing 'a' in a mutation. The following identity holds for N-point mutations:
str_agreement( str_mutate($some_string, $n, \@alph), $some_string )
== length($some_string) - $n;
The alphabet for a string gene should consist of only single characters unless you know what you're doing. Conceivably, you can implement an 'add' and 'remove' mutation by using an alphabet that contains strings with length != 1. But this seems a little hackish to me. For array genes, the alphabet can be just about anything meaningful to you.
str_random( $size [, \@alphabet ] )arr_random( $size [, \@alphabet ] )Returns a random gene of the given size over the given alphabet, defaulting to {0,1}.
StringEvolver.pm in the examples/ directory uses the utilities in Algorithm::Evolve::Util to implement a completely generic simple string evolver critter class in very few lines of code.
Algorithm::Evolve is written by Mike Rosulek <mike@mikero.com>. Feel free to contact me with comments, questions, patches, or whatever.
Copyright (c) 2003 Mike Rosulek. All rights reserved. This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
| Algorithm-Evolve documentation | Contained in the Algorithm-Evolve distribution. |
package Algorithm::Evolve::Util; use strict; use base 'Exporter'; use List::Util qw/shuffle/; use Carp qw/croak carp/; our $VERSION = '0.03'; our $UNICODE_STRINGS = 0; our %EXPORT_TAGS = ( str => [qw/str_crossover str_mutate str_agreement str_random/], arr => [qw/arr_crossover arr_mutate arr_agreement arr_random/], ); our @EXPORT_OK = map { @$_ } values %EXPORT_TAGS; sub str_crossover { my ($s1, $s2, $n_point) = @_; $n_point ||= 2; my $len = length($s1); croak "Can't do ${n_point}-point crossover on length $len string" if $n_point >= $len; ## this allows for duplication of indices. maybe a fixme my @points = sort { $a <=> $b } map { int(rand $len) } 1 .. $n_point; push @points, $len if $n_point % 2; for (0 .. @points/2 - 1) { my ($x, $y) = @points[2*$_, 2*$_+1]; (substr($s1, $x, $y-$x+1), substr($s2, $x, $y-$x+1)) = (substr($s2, $x, $y-$x+1), substr($s1, $x, $y-$x+1)); } return ($s1, $s2); } sub str_agreement { my ($s1, $s2) = @_; ## substr is safe for unicode; xor'ing characters is not. But ## xor is about 30 times faster on longish strings... if ($UNICODE_STRINGS) { my $tally = 0; for (0 .. length($s1)-1) { $tally++ if substr($s1, $_, 1) eq substr($s2, $_, 1); } return $tally; } my $xor = $s1 ^ $s2; return $xor =~ tr/\x0/\x0/; } sub str_mutate { my ($string, $n, $alphabet) = @_; $n ||= 1; $alphabet ||= [0,1]; croak "Invalid alphabet" unless ref $alphabet eq 'ARRAY' and @$alphabet > 1; my $len = length($string); my @mutate_indices = $n < 1 ? map { rand() < $n ? $_ : () } 0 .. $len-1 : (shuffle 0 .. $len-1)[ 0 .. int($n)-1 ]; for my $idx (@mutate_indices) { my $char = substr($string, $idx, 1); my @different = grep { $char ne $_ } @$alphabet; substr($string, $idx, 1) = $different[ int(rand @different) ]; } return $string; } sub str_random { my ($length, $alphabet) = @_; $alphabet ||= [0,1]; return join '', map { $alphabet->[ rand @$alphabet ] } 1 .. $length; } ########################################## sub arr_crossover { my ($a1_ref, $a2_ref, $n_point) = @_; $n_point ||= 2; my @a1 = @$a1_ref; my @a2 = @$a2_ref; my $len = @a1; croak "Can't do ${n_point}-point crossover on length $len array" if $n_point >= $len; ## this allows for duplication of indices. maybe a fixme my @points = sort { $a <=> $b } map { int(rand $len) } 1 .. $n_point; push @points, $len-1 if $n_point % 2; for (0 .. @points/2 - 1) { my ($x, $y) = @points[2*$_, 2*$_+1]; my @tmp = @a1[$x .. $y]; @a1[$x .. $y] = @a2[$x .. $y]; @a2[$x .. $y] = @tmp; } return (\@a1, \@a2); } sub arr_agreement { my ($a1, $a2) = @_; my $tally = 0; for (0 .. $#{$a1}) { $tally++ if $a1->[$_] eq $a2->[$_]; } return $tally; } sub arr_mutate { my ($arr_ref, $n, $alphabet) = @_; $n ||= 1; $alphabet ||= [0,1]; my @arr = @$arr_ref; croak "Invalid alphabet" unless ref $alphabet eq 'ARRAY' and @$alphabet > 1; my $len = scalar @arr; my @mutate_indices = $n < 1 ? map { rand() < $n ? $_ : () } 0 .. $len-1 : (shuffle 0 .. $len-1)[ 0 .. int($n)-1 ]; for my $idx (@mutate_indices) { my $char = $arr[$idx]; my @different = grep { $char ne $_ } @$alphabet; $arr[$idx] = $different[ int(rand @different) ]; } return \@arr; } sub arr_random { my ($length, $alphabet) = @_; $alphabet ||= [0,1]; return [ map { $alphabet->[ rand @$alphabet ] } 1 .. $length ]; } ########################################## ########################################## ########################################## 1; __END__