This module is intended to duplicate the ‘main()’ function found in other
languages such as C++ and Java. In order to run an experiment, this module
should be passed to your interpreter. In the interest of speed and consistency
we suggest that PyPy 1.9.0 be used to run this code, although
Python 2.7 should be able to handle it correctly.
To see a full description of this modules command line arguments, run
pypy main.py -h.
Provided with this code should be the cfg folder which contains some
configuration files useful for running experiments. These files can be passed
to main along with other configuration information in order to recreate
the experiments performed in the paper Analysis of Cartesian Genetic Programming's
Evolutionary Mechanisms. For example, the following command performs
one run of the Parity problem using seed number 13 with a genome size of
2000. Also, verbose output is printed to the display,
the reorder variant is used, and the results are output to output/test_run.dat.
pypy main.py cfg/once.cfg cfg/parity.cfg -seed 13 -g 200 -v -ordering reorder
-out test_run.dat
For any support questions email brianwgoldman@acm.org.
-
main.all_runs(config)[source]
Perform all of the requested runs on a given problem. Returns a two part
tuple:
- list of the dictionaries returned by one_run.
- frequency information collected by one_run.
Will give results for all
completed runs if a keyboard interrupt is received.
Parameters:
- config: Dictionary containing all configuration information required
to perform all of the necessary runs of the experiment. Should contain
values for:
- All configuration information needed by one_run
- problem: The name of which problem from the problem module to
run experiments on.
- runs: How many runs to perform
-
main.combine_results(results)[source]
Given a list of result dictionaries, return analysis information such as
the median values of each statistic as well as the median absolute
deviation.
Parameters:
- results: A list of dictionaries containing similar key values.
-
main.frequencies_to_vector(config, frequencies)[source]
Returns a flattened version of the frequencies dictionary.
-
main.one_run(evaluator, config, frequencies)[source]
Performs a single run of the given configuration. Returns a dictionary
containing results.
Parameters:
- evaluator: An object with the function get_fitness that takes an
individual and returns its fitness value
- config: A dictionary containing all of the configuration information
required to perform a experimental run, including:
- All information required to initialize an individual.
- All information required to run evolution.multi_independent.
- verbose: Boolean value for if extra runtime information should
be displayed.
- max_evals: The maximum number of evaluations allowed before
termination.
- max_fitness: The fitness required to cause a “successful”
termination.
- frequencies: Dictionary used to return information about how often
individuals of different lengths are evolved. Set by evolution.generate.
Handles how to perform all of the actual evolution.
-
class evolution.Individual(graph_length, input_length, output_length, max_arity, function_list, **_)[source]
Bases: object
An individual object used to combine gene fitness with genomes, as
well methods for manipulating those genomes.
-
all_active()[source]
Function that always makes all nodes in the genome active. Useful
when the fitness function analyzes nodes directly when combined with
Single mutation.
-
asym_phenotypic_difference(other)[source]
Determine how many genes would have to be mutated in order to make
the other individual phenotypically identical to self.
Parameters:
- other: The individual to compare with.
-
connections(node_index)[source]
Return the list of connections that a specified node has.
Parameters
- node_index: The index of the node information is being requested
for. Note this is different from gene index.
-
dag_determine_active_nodes()[source]
Determines which nodes are currently active and sets self.active
to the sorted list of active genes in DAG individuals.
Automatically called by gene manipulating member functions.
-
dag_random_gene(index, invalid=None)[source]
Determines a random gene value given a gene index of a full DAG.
If optional invalid option is used, the returned value
will only be invalid if the gene has no other valid values.
Parameters:
- index: The gene index who’s value is being set.
- invalid: Value to avoid returning if possible
-
determine_active_nodes()[source]
Determines which nodes are currently active and sets self.active
to the sorted list of active genes. Automatically called by gene
manipulating member functions.
-
dump()[source]
Returns a json ready dictionary representation of the entire individual
-
dump_genes()[source]
Returns a file read/writable representation of the individuals genes.
-
evaluate(inputs)[source]
Given a list of inputs, return a list of outputs from executing
this individual.
Parameters:
- inputs: The list of input values for the individual to process.
-
load(data)[source]
Recovers a “dump”ed individual from the dictionary,
overwriting the calling individual.
-
mutate(mutation_rate)[source]
Mutates the calling individual’s genes using the give mutation rate.
Parameters:
- mutation_rate: The probability that a specific gene will mutate.
-
new(modification_method, *args, **kwargs)[source]
Return a copy of the individual. Note that individuals are shallow
copied except for their list of genes.
-
one_active_mutation(_)[source]
Mutates the calling individual using the Single mutation method.
-
random_gene(index, invalid=None)[source]
Determines a random gene value given a gene index. If optional
invalid option is used, the returned value will only be invalid
if the gene has no other valid values.
Parameters:
- index: The gene index who’s value is being set.
- invalid: Value to avoid returning if possible
-
static reconstruct_individual(data, test_inputs)[source]
Recovers an individual’s connections and never active node listing.
Used when post processing individuals from previous runs.
-
reorder()[source]
Reorder individual’s genes randomly without
changing any of the actual connection information.
-
show_active()[source]
Prints the active portions of the individual in a somewhat readable
way.
-
simplify()[source]
Using the semantic information recorded for each node, this
will modify the individual to ensure only 1 node produces each
unique semantic.
-
valid_reconnect(node_index, invalid=None)[source]
When using a DAG individual, find a random connection location that
does not depend on the current node.
Parameters:
- node_index: The index of the node who’s connection is being reset
- invalid: Value to avoid returning if possible
-
evolution.generate(config, output, frequencies)[source]
An Individual generator that will yield a never ending supply of
Individual objects that need to have their fitness set before the
next Individual can be yielded.
Parameters:
- config: A dictionary containing all configuration information
required to generate initial individuals. Should include values
for:
- All configuration information required to initialize an Individual.
- dag: If DAG based individuals should be used.
- reorder: If the parent should be reordered before making offspring.
- mutation_rate: The probably to use for mutation.
- off_size: The number of offspring to produce per generation.
- output_length: The number of output variables.
- max_arity: The maximum arity used by any function.
- duplicate: String specifying the way to handle duplicate
individual creation, either normal'', ``skip'', ``accumulate, or
single.
- problem: The problem these individuals are solving. Used on in
the case where problems require unusual individual modification.
- output: Dictionary used to return information about evolution, will
send out:
- skipped: The number of evaluations skipped by Skip.
- estimated: The estimated number of evaluations that are skippable.
- inactive_bits_changed: Keeps track of nodes that were active, became
inactive, and have become active again, looking at how many bits in their
semantics have changed. Is a dictionary of frequencies mapping change -> count.
- reactiveated_nodes: Counts the number of nodes that were active,
became inactive, and have become active again.
- active_nodes_changed: Counts how many nodes that were active before and
after mutation had at least one bit in their semantics change. Is a dictionary
mapping number changed in a single mutation -> count.
- active_bits_changed: Counts how many bits were changed when an active
node was changed by mutation but remained active.
- child_replaced_parent: Counts how often the parent is replaced by the offspring.
- parent_not_replaced: Counts how often no offspring in a generation replaced the parent.
- frequencies: Dictionary used to return information about how often
individuals of different lengths are evolved.
-
evolution.multi_indepenedent(config, output, frequencies)[source]
Allows for multiple parallel independent populations to be evolved
at the same time. Will generate one individual from each population
before repeating a population. Not used in the paper Analysis of
Cartesian Genetic Programming's Evolutionary Mechanisms.
Parameters:
- config: A dictionary containing all of the configuration information
required to generate multiple populations worth of individuals. Should
include values for:
- All configuration information required by generate.
- pop_size: The number of parallel populations to use.
- output: Used to return information about evolution. Shared by all
parallel populations. Will contain all information output by
generate.
- frequencies: Dictionary used to return information about how often
individuals of different lengths are evolved. Shared by all parallel
populations. Will contain all information output by generate.
Defines each of the benchmark problems used as well as the function sets
for those problems.
-
class problems.Active(config)[source]
Bases: problems.Problem
Defines the Active problem, in which all individuals receive fitness
based on how many active nodes they have.
The only operator in this function is ‘None’, meaning only
connection genes actually evolve.
-
get_fitness(individual)[source]
Returns the percentage of nodes that are active.
-
max_arity = 2
-
operators = [None]
-
class problems.Binary_Decode(config)[source]
Bases: problems.Bounded_Problem, problems.Binary_Mixin
Defines the Binary Decode problem.
-
problem_function(inputs)[source]
Returns a 1 on the output line specified by the binary input index
-
class problems.Binary_Encode(config)[source]
Bases: problems.Bounded_Problem, problems.Binary_Mixin
Defines the Binary Encode problem.
-
static data_range(config)
Creates the list of all possible binary strings of specified length
with exactly one set bit. config should specify the input_length.
-
problem_function(inputs)[source]
Returns the binary encoding of which input line contains a one.
-
class problems.Binary_Mixin[source]
Bases: object
Inheritance mixin useful for setting the class attributes of
binary problems.
-
static data_range(config)
Given a dictionary specifying the input_length, returns all binary
values of that length.
-
max_arity = 2
-
operators = [<built-in function or_>, <built-in function and_>, <function nand at 0xa01210c>, <function nor at 0xa012144>]
-
class problems.Binary_Multiply(config)[source]
Bases: problems.Bounded_Problem, problems.Binary_Mixin
Defines the Binary Multiplier problem.
-
problem_function(inputs)[source]
Return the result of performing a binary multiplication of the first
half of the inputs with the second half. Will always have the same
number of output bits as input bits.
-
class problems.Binary_Multiply_Miller(config)[source]
Bases: problems.Binary_Multiply
-
operators = [<built-in function and_>, <function and_neg_in at 0xa01217c>, <built-in function xor>, <built-in function or_>]
-
class problems.Binary_Multiply_Torresen(config)[source]
Bases: problems.Binary_Multiply
-
operators = [<built-in function and_>, <built-in function xor>]
-
class problems.Bounded_Problem(config)[source]
Bases: object
Base object for any problem with a known set of test cases. Stores a
map for all possible inputs to their correct outputs so they only
have to be evaluated once.
-
get_fitness(individual)[source]
Return the fitness of an individual as applied to this problem.
Parameters:
- individual: The individual to be evaluated.
-
problem_function(_)[source]
Designed to force children of this class to implement this function.
Children use this function to define how to translate an input value
into an output value for their problem.
-
class problems.Breadth(config)[source]
Bases: problems.Bounded_Problem, problems.Binary_Mixin
Defines the Breadth problem.
-
static data_range(config)
Creates the list of all possible binary strings of specified length
with exactly one set bit. config should specify the input_length.
-
operators = [<built-in function or_>]
-
problem_function(inputs)[source]
Returns true as long as at least one input is true.
-
class problems.Demultiplexer(config)[source]
Bases: problems.Bounded_Problem, problems.Binary_Mixin
Defines the Demultiplexer (DEMUX) Problem.
-
problem_function(inputs)[source]
Returns the last input bit on the output line specified by the binary
index encoded on all inputs except the last bit.
-
class problems.Depth(config)[source]
Bases: problems.Problem
Defines the Depth problem.
-
get_fitness(individual)[source]
Returns the fitness of the individual as a percentage of maximum
fitness.
-
max_arity = 2
-
operators = [<function <lambda> at 0xa012844>]
-
class problems.Even_Parity(config)[source]
Bases: problems.Bounded_Problem, problems.Binary_Mixin
Defines the Even Parity problem.
-
problem_function(inputs)[source]
Return the even parity of a list of boolean values.
-
class problems.Flat(_)[source]
Bases: problems.Problem
Defines the Flat problem, in which all individuals receive fitness
based on how many connection genes are connected to the input.
The only operator in this function is ‘None’, meaning only
connection genes actually evolve.
-
get_fitness(individual)[source]
Returns the percentage of connection genes connected to the input.
-
max_arity = 2
-
operators = [None]
-
class problems.Koza_1(config)[source]
Bases: problems.Bounded_Problem, problems.Regression_Mixin
Defines the Koza-1 problem.
-
koza_quartic(inputs)[source]
Return the result of Koza-1 on the specified input. Expects the input
as a single element list and returns a single element list.
-
class problems.Multiplexer(config)[source]
Bases: problems.Bounded_Problem, problems.Binary_Mixin
Defines the Multiplexer (MUX) Problem.
-
problem_function(inputs)[source]
Uses the first k bits as a selector for which of the remaining bits to
return.
-
class problems.Neutral(_)[source]
Bases: problems.Problem
Defines the Neutral problem, in which all individuals receive the same
fitness. The only operator in this function is ‘None’, meaning only
connection genes actually evolve.
-
get_fitness(_)[source]
Returns the fitness of passed in individual, which is always 0.
-
max_arity = 2
-
operators = [None]
-
class problems.Novel(config)[source]
Bases: problems.Problem, problems.Binary_Mixin
Defines the Novel problem, which evaluates individuals based on how many
unique semantics the individual can create.
-
get_fitness(individual)[source]
-
class problems.Pagie_1(config)[source]
Bases: problems.Bounded_Problem, problems.Regression_Mixin
Defines the Pagie-1 problem.
-
static data_range(config)
Returns a multidimensional grid of points in the input space.
Parameters:
- config: A dictionary containing information about the input space.
- All configuration information required by float_range.
- input_length: How many dimensions are in the input space.
-
pagie(inputs)[source]
Returns the result of Pagie-1 on the specified inputs.
-
class problems.Problem(config)[source]
Bases: object
The abstract base of a problem
-
get_fitness(individual)[source]
Designed to force children of this class to implement this function.
Children use this function evaluate an individual and
return its fitness.
-
class problems.Regression_Mixin[source]
Bases: object
Inheritance mixin useful for setting the class attributes of
regression problems.
-
static data_range(config)
Returns a incremental range of a floating point value. Like range() for
floats.
Parameters:
- config: A dictionary containing information about the input space.
- min: The minimum valid value in the space.
- max: The maximum valid value in the space.
- step: The distance between sample points.
-
max_arity = 2
-
operators = [<function add at 0xa012224>, <function sub at 0xa01225c>, <function mul at 0xa012294>, <function div at 0xa0122cc>]
-
class problems.TwoFloor(config)[source]
Bases: problems.Bounded_Problem, problems.Binary_Mixin
Defines the Two Floor Problem.
-
static data_range(config)
Creates the list of all possible binary strings of specified length
with exactly one set bit. config should specify the input_length.
-
operators = [<built-in function or_>]
-
problem_function(inputs)[source]
Returns a string of bits half as long as the input string, where
the only set output bit is at the index // 2 of the set input bit.
-
problems.and_neg_in(x, y)[source]
-
problems.arity_controlled(desired)[source]
Decorator used to make functions take any number of inputs while only
using the first desired number of arguments. For example, you can
pass 10 arguments to a function that takes only 1 if desired=1 and
the first of the arguments will actually be used. Currently unused.
Parameters:
- desired: The actual arity of the wrapped function.
-
problems.binary_range(config)[source]
Given a dictionary specifying the input_length, returns all binary
values of that length.
-
problems.float_range(config)[source]
Returns a incremental range of a floating point value. Like range() for
floats.
Parameters:
- config: A dictionary containing information about the input space.
- min: The minimum valid value in the space.
- max: The maximum valid value in the space.
- step: The distance between sample points.
-
problems.float_samples(config)[source]
Returns random samples of the input space.
Parameters:
- config: A dictionary containing information about the input space.
- min: The minimum valid value in the space.
- max: The maximum valid value in the space.
- input_length: The number of input variables.
- samples: The number of samples to draw.
-
problems.n_dimensional_grid(config)[source]
Returns a multidimensional grid of points in the input space.
Parameters:
- config: A dictionary containing information about the input space.
- All configuration information required by float_range.
- input_length: How many dimensions are in the input space.
-
problems.nand(x, y)[source]
Simple Nand function for inclusion in function sets.
-
problems.nor(x, y)[source]
Simple Nor function for inclusion in function sets.
-
problems.protected(function)[source]
Decorator that ensures decorated functions always have a valid output.
If an exception occurs or infinity is returned, the first argument of the
function will be returned.
Parameters:
- function: The function to be decorated.
-
problems.single_bit_set(config)[source]
Creates the list of all possible binary strings of specified length
with exactly one set bit. config should specify the input_length.
Collection of utility functions with no other obvious home.
-
util.bitcount(integer)[source]
Counts the number of set bits in an integer
-
util.diff_count(data1, data2)[source]
Count the number of differences is two sets of data
-
util.find_median(data)[source]
Returns the median of the data.
-
util.load_configurations(filenames)[source]
Given a list of files containing json encoded dictionaries, combined
the data into a single dictionary. Will attempt to use file extension
to detect correct file type.
Parameters:
- filenames: The list of files paths.
-
util.meanstd(data)[source]
Returns the mean and standard deviation of the given data.
Parameters:
- data: The data to find the mean and standard deviation of.
-
util.median_deviation(data, median=None)[source]
Returns the median and the median absolute deviation of the data.
Parameters:
- data: The data to find the medians of.
- median: If the median is already known you can pass it in to save
time.
-
util.open_file_method(filename)[source]
This chooses the proper way to open a file by examining its extension.
Returns the function to use, either “open” or “gzip”
-
util.save_configuration(filename, data)[source]
Write a dictionary to the specified file in json format. Will attempt to
use file extension to detect correct file type.
Parameters
- filename: The path to write to.
- data: The data to be written.
-
util.save_list(filename, data)[source]
Write a list of dictionaries to the file in a more human readable way.
Will attempt to use file extension to detect correct file type.
Parameters
- filename: The path to write to.
- data: The list of dictionaries to be written.
-
util.set_fonts()[source]
Configures matplotlib to use only Type 1 fonts, and sets the figure size
such that those fonts will be legible when the figure is inserted in
a publication.
Takes file names from the final/ folder and parses the information to
create the bar charts in Table IV. Use this module as an
executable to process all result information for a single problem, such as:
python bar_plot.py final/multiply_accumulate_normal_*.dat.gz
Do not mix problems in a single run.
NOTE: You CANNOT use pypy for this as pylab is current unsupported. Use
python 2.7 instead.
Takes file names from the final/ folder as command line arguments
and parses the semantic information to produce the contents of Table VI.
Use this module as an executable to process each problem’s results:
python bit_behavior final/decode_*.dat.gz
Note: Do not mix results from different problems.
This will parse through all files in the final folder and produce
R parsable output for use in analysis. Run as a stand alone
executable in the form of:
python make_rdata.py final/*.dat.gz > rdata.csv
Takes file names from the final/ folder and parses the information to
create the black and white graphs in Table III. Use this module as an
executable to process all result information for a single problem, such as:
python never_actives.py final/multiply_accumulate_normal_*.dat.gz
Do not mix problems in a single run.
NOTE: You CANNOT use pypy for this as pylab is current unsupported. Use
python 2.7 instead.
-
never_actives.cmap_discretize(cmap, N)[source]
Return a discrete colormap from the continuous colormap cmap.
cmap: colormap instance, eg. cm.jet.
N: number of colors.
Takes file names from the final/ folder and parses the information into
readable values and produces statistical measures used in Table II. Use this module as an
executable to process all result information for a single problem, such as:
python stats.py final/multiply*.dat.gz
Do not mix problems in a single run.
NOTE: You CANNOT use pypy for this as scipy is current unsupported. Use
python 2.7 instead.
-
stats.make_rectangular(data, fill)[source]
Given data, a list of lists, square off the rows and return a matrix,
with empty spaces being filled.