`main`¶

This module is intended to duplicate the ‘main()’ function found in other languages such as C++ and Java. In order to run an experiment, this module should be passed to your interpreter. In the interest of speed and consistency we suggest that PyPy 1.9.0 be used to run this code, although Python 2.7 should be able to handle it correctly.

To see a full description of this modules command line arguments, run pypy main.py -h.

Provided with this code should be the cfg folder which contains some configuration files useful for running experiments. These files can be passed to main along with other configuration information in order to recreate the experiments performed in the paper Analysis of Cartesian Genetic Programming's Evolutionary Mechanisms. For example, the following command performs one run of the Parity problem using seed number 13 with a genome size of 2000. Also, verbose output is printed to the display, the reorder variant is used, and the results are output to output/test_run.dat.

pypy main.py cfg/once.cfg cfg/parity.cfg -seed 13 -g 200 -v -ordering reorder -out test_run.dat

For any support questions email brianwgoldman@acm.org.

main.all_runs(config)[source]¶

Perform all of the requested runs on a given problem. Returns a two part tuple:

list of the dictionaries returned by one_run.
frequency information collected by one_run.

Will give results for all completed runs if a keyboard interrupt is received.

Parameters:

config: Dictionary containing all configuration information required to perform all of the necessary runs of the experiment. Should contain values for:
- All configuration information needed by one_run
- problem: The name of which problem from the problem module to run experiments on.
- runs: How many runs to perform

main.combine_results(results)[source]¶

Given a list of result dictionaries, return analysis information such as the median values of each statistic as well as the median absolute deviation.

Parameters:

results: A list of dictionaries containing similar key values.

main.frequencies_to_vector(config, frequencies)[source]¶: Returns a flattened version of the frequencies dictionary.

main.one_run(evaluator, config, frequencies)[source]¶

Performs a single run of the given configuration. Returns a dictionary containing results.

Parameters:

evaluator: An object with the function get_fitness that takes an individual and returns its fitness value
config: A dictionary containing all of the configuration information required to perform a experimental run, including:
- All information required to initialize an individual.
- All information required to run evolution.multi_independent.
- verbose: Boolean value for if extra runtime information should be displayed.
- max_evals: The maximum number of evaluations allowed before termination.
- max_fitness: The fitness required to cause a “successful” termination.
frequencies: Dictionary used to return information about how often individuals of different lengths are evolved. Set by evolution.generate.

`evolution`¶

Handles how to perform all of the actual evolution.

class evolution.Individual(graph_length, input_length, output_length, max_arity, function_list, **_)[source]¶

Bases: object

An individual object used to combine gene fitness with genomes, as well methods for manipulating those genomes.

all_active()[source]¶: Function that always makes all nodes in the genome active. Useful when the fitness function analyzes nodes directly when combined with Single mutation.

asym_phenotypic_difference(other)[source]¶

Determine how many genes would have to be mutated in order to make the other individual phenotypically identical to self.

Parameters:

other: The individual to compare with.

connections(node_index)[source]¶

Return the list of connections that a specified node has.

Parameters

node_index: The index of the node information is being requested for. Note this is different from gene index.

dag_determine_active_nodes()[source]¶: Determines which nodes are currently active and sets self.active to the sorted list of active genes in DAG individuals. Automatically called by gene manipulating member functions.

dag_random_gene(index, invalid=None)[source]¶

Determines a random gene value given a gene index of a full DAG. If optional invalid option is used, the returned value will only be invalid if the gene has no other valid values.

Parameters:

index: The gene index who’s value is being set.
invalid: Value to avoid returning if possible

determine_active_nodes()[source]¶: Determines which nodes are currently active and sets self.active to the sorted list of active genes. Automatically called by gene manipulating member functions.

dump()[source]¶: Returns a json ready dictionary representation of the entire individual

dump_genes()[source]¶: Returns a file read/writable representation of the individuals genes.

evaluate(inputs)[source]¶

Given a list of inputs, return a list of outputs from executing this individual.

Parameters:

inputs: The list of input values for the individual to process.

load(data)[source]¶: Recovers a “dump”ed individual from the dictionary, overwriting the calling individual.

mutate(mutation_rate)[source]¶

Mutates the calling individual’s genes using the give mutation rate.

Parameters:

mutation_rate: The probability that a specific gene will mutate.

new(modification_method, *args, **kwargs)[source]¶: Return a copy of the individual. Note that individuals are shallow copied except for their list of genes.

one_active_mutation(_)[source]¶: Mutates the calling individual using the Single mutation method.

random_gene(index, invalid=None)[source]¶

Determines a random gene value given a gene index. If optional invalid option is used, the returned value will only be invalid if the gene has no other valid values.

Parameters:

index: The gene index who’s value is being set.
invalid: Value to avoid returning if possible

static reconstruct_individual(data, test_inputs)[source]¶: Recovers an individual’s connections and never active node listing. Used when post processing individuals from previous runs.

reorder()[source]¶: Reorder individual’s genes randomly without changing any of the actual connection information.

show_active()[source]¶: Prints the active portions of the individual in a somewhat readable way.

simplify()[source]¶: Using the semantic information recorded for each node, this will modify the individual to ensure only 1 node produces each unique semantic.

valid_reconnect(node_index, invalid=None)[source]¶

When using a DAG individual, find a random connection location that does not depend on the current node.

Parameters:

node_index: The index of the node who’s connection is being reset
invalid: Value to avoid returning if possible

evolution.generate(config, output, frequencies)[source]¶

An Individual generator that will yield a never ending supply of Individual objects that need to have their fitness set before the next Individual can be yielded.

Parameters:

config: A dictionary containing all configuration information required to generate initial individuals. Should include values for:
- All configuration information required to initialize an Individual.
- dag: If DAG based individuals should be used.
- reorder: If the parent should be reordered before making offspring.
- mutation_rate: The probably to use for mutation.
- off_size: The number of offspring to produce per generation.
- output_length: The number of output variables.
- max_arity: The maximum arity used by any function.
- duplicate: String specifying the way to handle duplicate individual creation, either normal'', ``skip'', ``accumulate, or single.
- problem: The problem these individuals are solving. Used on in the case where problems require unusual individual modification.
output: Dictionary used to return information about evolution, will send out:
- skipped: The number of evaluations skipped by Skip.
- estimated: The estimated number of evaluations that are skippable.
- inactive_bits_changed: Keeps track of nodes that were active, became inactive, and have become active again, looking at how many bits in their semantics have changed. Is a dictionary of frequencies mapping change -> count.
- reactiveated_nodes: Counts the number of nodes that were active, became inactive, and have become active again.
- active_nodes_changed: Counts how many nodes that were active before and after mutation had at least one bit in their semantics change. Is a dictionary mapping number changed in a single mutation -> count.
- active_bits_changed: Counts how many bits were changed when an active node was changed by mutation but remained active.
- child_replaced_parent: Counts how often the parent is replaced by the offspring.
- parent_not_replaced: Counts how often no offspring in a generation replaced the parent.
frequencies: Dictionary used to return information about how often individuals of different lengths are evolved.

evolution.multi_indepenedent(config, output, frequencies)[source]¶

Allows for multiple parallel independent populations to be evolved at the same time. Will generate one individual from each population before repeating a population. Not used in the paper Analysis of Cartesian Genetic Programming's Evolutionary Mechanisms.

Parameters:

config: A dictionary containing all of the configuration information required to generate multiple populations worth of individuals. Should include values for:
- All configuration information required by generate.
- pop_size: The number of parallel populations to use.
output: Used to return information about evolution. Shared by all parallel populations. Will contain all information output by generate.
frequencies: Dictionary used to return information about how often individuals of different lengths are evolved. Shared by all parallel populations. Will contain all information output by generate.

`problems`¶

Defines each of the benchmark problems used as well as the function sets for those problems.

class problems.Active(config)[source]¶

Bases: problems.Problem

Defines the Active problem, in which all individuals receive fitness based on how many active nodes they have. The only operator in this function is ‘None’, meaning only connection genes actually evolve.

get_fitness(individual)[source]¶: Returns the percentage of nodes that are active.

max_arity = 2¶

operators = [None]¶

class problems.Binary_Decode(config)[source]¶

Bases: problems.Bounded_Problem, problems.Binary_Mixin

Defines the Binary Decode problem.

problem_function(inputs)[source]¶: Returns a 1 on the output line specified by the binary input index

class problems.Binary_Encode(config)[source]¶

Bases: problems.Bounded_Problem, problems.Binary_Mixin

Defines the Binary Encode problem.

static data_range(config)¶: Creates the list of all possible binary strings of specified length with exactly one set bit. config should specify the input_length.

problem_function(inputs)[source]¶: Returns the binary encoding of which input line contains a one.

class problems.Binary_Mixin[source]¶

Bases: object

Inheritance mixin useful for setting the class attributes of binary problems.

static data_range(config)¶: Given a dictionary specifying the input_length, returns all binary values of that length.

max_arity = 2¶

operators = [<built-in function or_>, <built-in function and_>, <function nand at 0xa01210c>, <function nor at 0xa012144>]¶

class problems.Binary_Multiply(config)[source]¶

Bases: problems.Bounded_Problem, problems.Binary_Mixin

Defines the Binary Multiplier problem.

problem_function(inputs)[source]¶: Return the result of performing a binary multiplication of the first half of the inputs with the second half. Will always have the same number of output bits as input bits.

class problems.Binary_Multiply_Miller(config)[source]¶

Bases: problems.Binary_Multiply

operators = [<built-in function and_>, <function and_neg_in at 0xa01217c>, <built-in function xor>, <built-in function or_>]¶

class problems.Binary_Multiply_Torresen(config)[source]¶

Bases: problems.Binary_Multiply

operators = [<built-in function and_>, <built-in function xor>]¶

class problems.Bounded_Problem(config)[source]¶

Bases: object

Base object for any problem with a known set of test cases. Stores a map for all possible inputs to their correct outputs so they only have to be evaluated once.

get_fitness(individual)[source]¶

Return the fitness of an individual as applied to this problem.

Parameters:

individual: The individual to be evaluated.

problem_function(_)[source]¶: Designed to force children of this class to implement this function. Children use this function to define how to translate an input value into an output value for their problem.

class problems.Breadth(config)[source]¶

Bases: problems.Bounded_Problem, problems.Binary_Mixin

Defines the Breadth problem.

static data_range(config)¶: Creates the list of all possible binary strings of specified length with exactly one set bit. config should specify the input_length.

operators = [<built-in function or_>]¶

problem_function(inputs)[source]¶: Returns true as long as at least one input is true.

class problems.Demultiplexer(config)[source]¶

Bases: problems.Bounded_Problem, problems.Binary_Mixin

Defines the Demultiplexer (DEMUX) Problem.

problem_function(inputs)[source]¶: Returns the last input bit on the output line specified by the binary index encoded on all inputs except the last bit.

class problems.Depth(config)[source]¶

Bases: problems.Problem

Defines the Depth problem.

get_fitness(individual)[source]¶: Returns the fitness of the individual as a percentage of maximum fitness.

max_arity = 2¶

operators = [<function <lambda> at 0xa012844>]¶

class problems.Even_Parity(config)[source]¶

Bases: problems.Bounded_Problem, problems.Binary_Mixin

Defines the Even Parity problem.

problem_function(inputs)[source]¶: Return the even parity of a list of boolean values.

class problems.Flat(_)[source]¶

Bases: problems.Problem

Defines the Flat problem, in which all individuals receive fitness based on how many connection genes are connected to the input. The only operator in this function is ‘None’, meaning only connection genes actually evolve.

get_fitness(individual)[source]¶: Returns the percentage of connection genes connected to the input.

max_arity = 2¶

operators = [None]¶

class problems.Koza_1(config)[source]¶

Bases: problems.Bounded_Problem, problems.Regression_Mixin

Defines the Koza-1 problem.

koza_quartic(inputs)[source]¶: Return the result of Koza-1 on the specified input. Expects the input as a single element list and returns a single element list.

class problems.Multiplexer(config)[source]¶

Bases: problems.Bounded_Problem, problems.Binary_Mixin

Defines the Multiplexer (MUX) Problem.

problem_function(inputs)[source]¶: Uses the first k bits as a selector for which of the remaining bits to return.

class problems.Neutral(_)[source]¶

Bases: problems.Problem

Defines the Neutral problem, in which all individuals receive the same fitness. The only operator in this function is ‘None’, meaning only connection genes actually evolve.

get_fitness(_)[source]¶: Returns the fitness of passed in individual, which is always 0.

max_arity = 2¶

operators = [None]¶

class problems.Novel(config)[source]¶

Bases: problems.Problem, problems.Binary_Mixin

Defines the Novel problem, which evaluates individuals based on how many unique semantics the individual can create.

get_fitness(individual)[source]¶

class problems.Pagie_1(config)[source]¶

Bases: problems.Bounded_Problem, problems.Regression_Mixin

Defines the Pagie-1 problem.

static data_range(config)¶

Returns a multidimensional grid of points in the input space.

Parameters:

config: A dictionary containing information about the input space. - All configuration information required by float_range. - input_length: How many dimensions are in the input space.

pagie(inputs)[source]¶: Returns the result of Pagie-1 on the specified inputs.

class problems.Problem(config)[source]¶

Bases: object

The abstract base of a problem

get_fitness(individual)[source]¶: Designed to force children of this class to implement this function. Children use this function evaluate an individual and return its fitness.

class problems.Regression_Mixin[source]¶

Bases: object

Inheritance mixin useful for setting the class attributes of regression problems.

static data_range(config)¶

Returns a incremental range of a floating point value. Like range() for floats.

Parameters:

config: A dictionary containing information about the input space. - min: The minimum valid value in the space. - max: The maximum valid value in the space. - step: The distance between sample points.

max_arity = 2¶

operators = [<function add at 0xa012224>, <function sub at 0xa01225c>, <function mul at 0xa012294>, <function div at 0xa0122cc>]¶

class problems.TwoFloor(config)[source]¶

Bases: problems.Bounded_Problem, problems.Binary_Mixin

Defines the Two Floor Problem.

static data_range(config)¶: Creates the list of all possible binary strings of specified length with exactly one set bit. config should specify the input_length.

operators = [<built-in function or_>]¶

problem_function(inputs)[source]¶: Returns a string of bits half as long as the input string, where the only set output bit is at the index // 2 of the set input bit.

problems.and_neg_in(x, y)[source]¶

problems.arity_controlled(desired)[source]¶

Decorator used to make functions take any number of inputs while only using the first desired number of arguments. For example, you can pass 10 arguments to a function that takes only 1 if desired=1 and the first of the arguments will actually be used. Currently unused.

Parameters:

desired: The actual arity of the wrapped function.

problems.binary_range(config)[source]¶: Given a dictionary specifying the input_length, returns all binary values of that length.

problems.float_range(config)[source]¶

Returns a incremental range of a floating point value. Like range() for floats.

Parameters:

config: A dictionary containing information about the input space. - min: The minimum valid value in the space. - max: The maximum valid value in the space. - step: The distance between sample points.

problems.float_samples(config)[source]¶

Returns random samples of the input space.

Parameters:

config: A dictionary containing information about the input space. - min: The minimum valid value in the space. - max: The maximum valid value in the space. - input_length: The number of input variables. - samples: The number of samples to draw.

problems.n_dimensional_grid(config)[source]¶

Returns a multidimensional grid of points in the input space.

Parameters:

config: A dictionary containing information about the input space. - All configuration information required by float_range. - input_length: How many dimensions are in the input space.

problems.nand(x, y)[source]¶: Simple Nand function for inclusion in function sets.

problems.nor(x, y)[source]¶: Simple Nor function for inclusion in function sets.

problems.protected(function)[source]¶

Decorator that ensures decorated functions always have a valid output. If an exception occurs or infinity is returned, the first argument of the function will be returned.

Parameters:

function: The function to be decorated.

problems.single_bit_set(config)[source]¶: Creates the list of all possible binary strings of specified length with exactly one set bit. config should specify the input_length.

`util`¶

Collection of utility functions with no other obvious home.

util.bitcount(integer)[source]¶: Counts the number of set bits in an integer

util.diff_count(data1, data2)[source]¶: Count the number of differences is two sets of data

util.find_median(data)[source]¶: Returns the median of the data.

util.load_configurations(filenames)[source]¶

Given a list of files containing json encoded dictionaries, combined the data into a single dictionary. Will attempt to use file extension to detect correct file type.

Parameters:

filenames: The list of files paths.

util.meanstd(data)[source]¶

Returns the mean and standard deviation of the given data.

Parameters:

data: The data to find the mean and standard deviation of.

util.median_deviation(data, median=None)[source]¶

Returns the median and the median absolute deviation of the data.

Parameters:

data: The data to find the medians of.
median: If the median is already known you can pass it in to save time.

util.open_file_method(filename)[source]¶: This chooses the proper way to open a file by examining its extension. Returns the function to use, either “open” or “gzip”

util.save_configuration(filename, data)[source]¶

Write a dictionary to the specified file in json format. Will attempt to use file extension to detect correct file type.

Parameters

filename: The path to write to.
data: The data to be written.

util.save_list(filename, data)[source]¶

Write a list of dictionaries to the file in a more human readable way. Will attempt to use file extension to detect correct file type.

Parameters

filename: The path to write to.
data: The list of dictionaries to be written.

util.set_fonts()[source]¶: Configures matplotlib to use only Type 1 fonts, and sets the figure size such that those fonts will be legible when the figure is inserted in a publication.

`bar_plot`¶

Takes file names from the final/ folder and parses the information to create the bar charts in Table IV. Use this module as an executable to process all result information for a single problem, such as:

python bar_plot.py final/multiply_accumulate_normal_*.dat.gz

Do not mix problems in a single run.

NOTE: You CANNOT use pypy for this as pylab is current unsupported. Use python 2.7 instead.

`bit_behavior`¶

Takes file names from the final/ folder as command line arguments and parses the semantic information to produce the contents of Table VI. Use this module as an executable to process each problem’s results:

python bit_behavior final/decode_*.dat.gz

Note: Do not mix results from different problems.

`make_rdata`¶

This will parse through all files in the final folder and produce R parsable output for use in analysis. Run as a stand alone executable in the form of:

python make_rdata.py final/*.dat.gz > rdata.csv

`never_actives`¶

Takes file names from the final/ folder and parses the information to create the black and white graphs in Table III. Use this module as an executable to process all result information for a single problem, such as:

python never_actives.py final/multiply_accumulate_normal_*.dat.gz

Do not mix problems in a single run.

NOTE: You CANNOT use pypy for this as pylab is current unsupported. Use python 2.7 instead.

never_actives.cmap_discretize(cmap, N)[source]¶: Return a discrete colormap from the continuous colormap cmap. cmap: colormap instance, eg. cm.jet. N: number of colors.

`stats`¶

Takes file names from the final/ folder and parses the information into readable values and produces statistical measures used in Table II. Use this module as an executable to process all result information for a single problem, such as:

python stats.py final/multiply*.dat.gz

Do not mix problems in a single run.

NOTE: You CANNOT use pypy for this as scipy is current unsupported. Use python 2.7 instead.

stats.make_rectangular(data, fill)[source]¶: Given data, a list of lists, square off the rows and return a matrix, with empty spaces being filled.

`main`¶

`evolution`¶

`problems`¶

`util`¶

`bar_plot`¶

`bit_behavior`¶

`make_rdata`¶

`never_actives`¶

`stats`¶

Table Of Contents

Previous topic

This Page

Navigation

Previous topic

This Page

Quick search

Navigation