Reinforcement Learning in Coevolution Models

QINXIN PAN

1. Introduction

1.1 Coevolution

In biology, coevolution is "the change of a biological object triggered by the change of a related object." 1 Each party in a coevolutionary relationship consumes natural resources and produces metabolic products, and thereby influences the evolution of the other parties. The two best-known relationships are competitive coevolution and collaborative coevolution. Competitive coevolution mostly arises between species that depend on similar environmental resources, while collaborative coevolution arises between species that benefit each other and together accomplish something larger than either could alone.

1.2 Coevolution algorithm

Coevolutionary algorithms are a class of algorithms used for generating artificial life as well as for optimization, game learning, etc. Daniel Hillis coevolved sorting networks and Karl Sims coevolved virtual creatures. 2 3 Similar research has mainly used genetic algorithms and reinforcement learning. A genetic algorithm mimics the process of evolution under natural selection, while reinforcement learning mimics the mutual influence between species in a coevolutionary relationship.

1.3 Random Boolean networks (RBN)

A Boolean network consists of a set of Boolean variables whose states are determined by other variables in the network. It is a model commonly used in the study of genetic regulatory networks. 4
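To make the definition concrete, here is a minimal sketch of a classical random Boolean network with synchronous updates. The names make_rbn and rbn_step, the choice of k random inputs per node, and the lookup-table encoding are illustrative assumptions, not part of the cited model description.

```python
import random
from typing import List, Tuple

State = Tuple[int, ...]


def make_rbn(n: int, k: int, rng: random.Random) -> Tuple[List[List[int]], List[List[int]]]:
    """Build a random network: each node gets k distinct inputs and a random truth table."""
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables


def rbn_step(state: State, inputs: List[List[int]], tables: List[List[int]]) -> State:
    """Update every node at once from the current state (synchronous update)."""
    nxt = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for j in node_inputs:
            # Read the input bits as a binary number, first input = highest bit.
            index = (index << 1) | state[j]
        nxt.append(table[index])
    return tuple(nxt)


# Hand-written example: node 0 = node1 AND node2, node 1 = node0 OR node2,
# node 2 = NOT node0.
example_inputs = [[1, 2], [0, 2], [0]]
example_tables = [[0, 0, 0, 1], [0, 1, 1, 1], [1, 0]]
next_state = rbn_step((1, 0, 1), example_inputs, example_tables)
```

In this encoding the topology lives in the input lists and the dynamics live in the truth tables, so the two can be varied independently.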

2. Goals of the project

A recent study in Nature Physics showed that networks with different topologies, such as scale-free and Poisson, have different learning abilities. 5 This sheds some light on why most biological networks, such as gene regulatory networks and protein interaction networks, exhibit scale-free topology. In this project I ask how topology influences the coevolution process. I will construct two populations of threshold RBNs with heterogeneous topologies exhibiting specific topological properties, and then apply an evolutionary algorithm using tournament selection with replacement. I will also apply reinforcement learning through the fitness function of the evolutionary algorithm. The final goal of this project is to compare how scale-free & scale-free, scale-free & Poisson, and Poisson & Poisson population pairs coevolve under the competition and collaboration relationships.
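The proposal does not yet fix how the two topology classes will be generated. One possible sketch, under the assumption that an Erdős-Rényi random graph stands in for the Poisson topology and a Barabási-Albert style preferential-attachment graph stands in for the scale-free topology, is shown below. The function names and parameters are hypothetical, and the edges are undirected; regulatory direction would still have to be assigned.

```python
import random
from typing import List, Tuple

Edge = Tuple[int, int]


def poisson_graph(n: int, mean_degree: float, rng: random.Random) -> List[Edge]:
    """Erdős-Rényi graph: each pair is linked with probability p,
    giving an approximately Poisson degree distribution."""
    p = mean_degree / (n - 1)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]


def scale_free_graph(n: int, m: int, rng: random.Random) -> List[Edge]:
    """Preferential attachment: each new node links to m distinct existing
    nodes, chosen with probability proportional to their current degree."""
    edges: List[Edge] = []
    pool: List[int] = []  # every node appears once per incident edge
    # Seed with a fully connected core of m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            edges.append((i, j))
            pool += [i, j]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(pool))
        for target in chosen:
            edges.append((target, new))
            pool += [target, new]
    return edges


rng = random.Random(0)
er_edges = poisson_graph(50, 4.0, rng)
sf_edges = scale_free_graph(50, 2, rng)
```

Matching the mean degree of the two graph types (here roughly 4 in both cases) keeps the comparison about the shape of the degree distribution rather than about edge density.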

3. Methods

3.1 Representation of the coevolving population.

A threshold RBN model will be used to represent the two populations. Each node represents one gene and is either on or off, indicated by 1 or 0. Edges between nodes indicate regulatory relationships, and each node has a threshold function that determines its next state from the states of its regulators. Because each node has only two states and the number of nodes is finite, the state space is finite, so under deterministic updates every network eventually falls into a cycle called an attractor. I will use this attractor to represent the phenotype.
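The threshold update and the attractor search can be sketched as follows. The specific rule used here (a node turns on when the weighted sum of its active regulators exceeds its threshold) and the names threshold_step and find_attractor are assumptions for illustration; other threshold conventions exist.

```python
from typing import Dict, List, Tuple

State = Tuple[int, ...]


def threshold_step(state: State, weights: List[Dict[int, int]], thresholds: List[int]) -> State:
    """Synchronous update. weights[i] maps each regulator j of node i to its
    weight (+1 for activation, -1 for inhibition in this sketch)."""
    nxt = []
    for i, regulators in enumerate(weights):
        total = sum(w * state[j] for j, w in regulators.items())
        nxt.append(1 if total > thresholds[i] else 0)
    return tuple(nxt)


def find_attractor(state: State, weights: List[Dict[int, int]], thresholds: List[int]) -> List[State]:
    """Iterate until a state repeats, then return the states on the cycle."""
    seen: Dict[State, int] = {}
    trajectory: List[State] = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = threshold_step(state, weights, thresholds)
    # Everything from the first visit of the repeated state onward is the cycle.
    return trajectory[seen[state]:]


# Two genes that activate each other.
weights = [{1: 1}, {0: 1}]
thresholds = [0, 0]
cycle = find_attractor((1, 0), weights, thresholds)
fixed_point = find_attractor((1, 1), weights, thresholds)
```

Here the initial state (1, 0) lands on a cycle of length two, while (1, 1) is a fixed point, which is an attractor of length one.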

3.2 Evolution with reinforcement learning

I will apply tournament selection with replacement. Selection will favor better fitness, defined as how close a phenotype is to a given target function. To model coevolution, reinforcement learning will be used. In the collaboration model, the two populations are supposed to jointly achieve a larger target function, so an individual network is rewarded when it collaborates better. In the competition model, the two populations compete for limited resources, so if individuals from the two populations are too similar to each other, a penalty is applied to the individual's fitness; conversely, if they are different enough, a reward is assigned. In this way, the two populations are tuned to collaborate or compete.
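A minimal sketch of the selection step and of the two fitness variants is given below. It assumes, for illustration only, that a phenotype can be flattened to a fixed-length bit tuple, that closeness is the fraction of matching bits, that the joint target in the collaboration model is compared against the concatenation of both phenotypes, and that the penalty weight is 0.5. "With replacement" is read here as drawing tournament contenders with replacement; all names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")
Bits = Tuple[int, ...]


def similarity(a: Bits, b: Bits) -> float:
    """Fraction of positions at which two equal-length bit tuples agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def competition_fitness(own: Bits, target: Bits, rival: Bits, weight: float = 0.5) -> float:
    """Closeness to the target, penalized by similarity to the rival phenotype."""
    return similarity(own, target) - weight * similarity(own, rival)


def collaboration_fitness(own: Bits, partner: Bits, joint_target: Bits) -> float:
    """Closeness of the combined phenotype to the larger joint target."""
    return similarity(own + partner, joint_target)


def tournament_select(population: Sequence[T], fitnesses: Sequence[float],
                      k: int, rng: random.Random) -> List[T]:
    """Fill the next generation by running one size-k tournament per slot."""
    selected = []
    for _ in range(len(population)):
        # Contenders are drawn with replacement from the current population.
        contenders = rng.choices(range(len(population)), k=k)
        winner = max(contenders, key=lambda i: fitnesses[i])
        selected.append(population[winner])
    return selected


same = competition_fitness((1, 1, 0, 0), (1, 1, 0, 0), rival=(1, 1, 0, 0))
different = competition_fitness((1, 1, 0, 0), (1, 1, 0, 0), rival=(0, 0, 1, 1))
joint = collaboration_fitness((1, 0), (0, 1), joint_target=(1, 0, 0, 1))
```

In the competition example, an individual that matches its target perfectly scores 0.5 when the rival has the same phenotype and 1.0 when the rival is its complement, which is the reward and penalty behavior described above.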

3.3 Analyze the effects of topology on the learning process.

I will compare the coevolution outcomes across the topology pairings and try to explain the differences using network topology properties such as degree distribution and modularity.
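As one example of such a property, the empirical degree distribution of an undirected edge list can be computed as sketched below. The function name and the edge-list representation are assumptions for illustration; modularity is not covered here.

```python
from collections import Counter
from typing import Dict, List, Tuple


def degree_distribution(edges: List[Tuple[int, int]], n: int) -> Dict[int, float]:
    """Return P(k): the fraction of the n nodes that have degree k."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree)
    return {k: counts[k] / n for k in sorted(counts)}


# A star with hub 0: one node of degree 3 and three nodes of degree 1.
star = [(0, 1), (0, 2), (0, 3)]
dist = degree_distribution(star, 4)
```

A scale-free network would show a heavy tail in this distribution (a few high-degree hubs), whereas a Poisson network would concentrate around its mean degree.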

4. Datasets

I will use synthetic data. To a certain extent, the output of one population will train the other population, and vice versa.

5. Timeline

(1) Implement the coevolution process. 04/12-04/22
(2) Run experiments to compare different topologies. 04/22-04/30
(3) Write up milestone report.
(4) Analyze the experimental results.
(5) Write up final report.