Reinforcement Learning in Coevolution Models
QINXIN PAN
1. Introduction
1.1 Coevolution
In biology, coevolution is "the change of a biological object triggered
by the change of a related object." [1] Each party in a coevolutionary
relationship consumes natural resources and produces metabolic
products, and thereby influences the evolution of the other parties. The
two best-known relationships are competitive coevolution and
collaborative coevolution. Competitive coevolution mostly occurs between
species that depend on similar environmental resources, while
collaborative coevolution occurs between species that benefit each other
and together accomplish a larger task.
1.2 Coevolutionary algorithms
Coevolutionary algorithms are a class of algorithms used for generating
artificial life as well as for optimization, game learning, etc. Daniel
Hillis coevolved sorting networks and Karl Sims coevolved virtual
creatures. [2][3] Similar research has been done mainly using genetic
(evolutionary) algorithms and reinforcement learning. Genetic algorithms
mimic the process of evolution under natural selection, while
reinforcement learning mimics the mutual influence between species
within a coevolutionary relationship.
1.3 Random Boolean networks (RBN)
A Boolean network consists of a set of Boolean variables, each of whose
state is determined by other variables in the network. It is a model
commonly used in the study of genetic regulatory networks. [4]
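As an illustration of the definition above, here is a minimal sketch of a
random Boolean network with synchronous updates. The helper names
(make_rbn, step), the fixed number of inputs k per node, and the use of
random truth tables are assumptions made for this example; they are not
taken from the cited work.

```python
import random

def make_rbn(n, k, rng):
    """Build a random Boolean network (illustrative construction).

    Each of the n nodes gets k distinct input nodes and a random truth
    table with 2**k entries, one output bit per input combination.
    """
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronously update every node from the current state of its inputs."""
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        # Pack the input bits into an index into this node's truth table.
        index = 0
        for src in node_inputs:
            index = (index << 1) | state[src]
        new_state.append(table[index])
    return tuple(new_state)

rng = random.Random(0)
inputs, tables = make_rbn(n=5, k=2, rng=rng)
state = (1, 0, 0, 1, 0)
for _ in range(3):
    state = step(state, inputs, tables)
```

The state is kept as a tuple so that it can later be used as a
dictionary key, for example when detecting repeated states.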
2. Goals of the project
A recent study in Nature Physics showed that networks with different
topologies, such as scale-free or Poisson, show different learning
abilities. [5] This sheds some light on why most biological networks,
such as gene regulatory networks and protein interaction networks,
exhibit scale-free topology. In this project I ask how network topology
influences the coevolution process. I will construct two populations of
threshold RBNs with heterogeneous topologies that have specified
topological properties, and then apply an evolutionary algorithm based
on tournament selection with replacement. I will also apply
reinforcement learning through the fitness function of the evolutionary
algorithm. The final goal of this project is to compare how scale-free &
scale-free, scale-free & Poisson, and Poisson & Poisson population pairs
coevolve under competitive and collaborative relationships.
3. Methods
3.1 Representation of the coevolving populations
A threshold RBN model will be used to represent the two populations.
Each node represents one gene, which is either on (1) or off (0). Edges
between nodes indicate regulatory relationships, and each node has a
threshold function that determines its next state from the states of its
regulators. Since each node has only two states and the number of nodes
is finite, the state space is finite; with deterministic updates, each
network therefore falls into a cycle, called an attractor, after a
finite number of steps. I will use this attractor to represent the
phenotype.
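The attractor argument can be made concrete with a minimal sketch. It
assumes one common form of the threshold rule (a node is on when the
weighted sum of its inputs exceeds its threshold) and synchronous
updates; the function names, the weight matrix layout, and the example
network are hypothetical choices for illustration only.

```python
def threshold_step(state, weights, thresholds):
    """One synchronous update of a threshold network.

    weights[i][j] is the influence of node j on node i (positive for
    activation, negative for repression, 0 for no edge). Assumed rule:
    node i switches on when its weighted input sum exceeds thresholds[i].
    """
    n = len(state)
    new_state = []
    for i in range(n):
        total = sum(weights[i][j] * state[j] for j in range(n))
        new_state.append(1 if total > thresholds[i] else 0)
    return tuple(new_state)

def find_attractor(state, weights, thresholds):
    """Iterate until a state repeats and return the cycle of states."""
    seen = {}          # state -> position at which it first appeared
    trajectory = []
    state = tuple(state)
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = threshold_step(state, weights, thresholds)
    # Everything from the first occurrence of the repeated state onward
    # is the attractor; earlier states are the transient.
    return trajectory[seen[state]:]

# Two-gene example: gene 1 represses gene 0, gene 0 activates gene 1.
weights = [[0, -1],
           [1, 0]]
thresholds = [-1, 0]
attractor = find_attractor((0, 0), weights, thresholds)
```

Because the state space has at most 2^n states, the loop is guaranteed
to terminate. The returned list of states is one possible encoding of
the phenotype.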
3.2 Evolution with reinforcement learning
I will apply tournament selection with replacement. Selection acts on
fitness, defined as how close a phenotype is to a given target function.
To model coevolution, reinforcement learning will be used. In the
collaboration model, the two populations are supposed to jointly achieve
a larger target function, so a reward will be given to individual
networks that collaborate better. In the competition model, the two
populations compete for limited resources: if two individuals are too
similar to each other, a penalty is applied to the individual's fitness;
conversely, if they are different enough, a reward is assigned. In this
way, the two populations are tuned to collaborate or compete.
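The selection scheme described above can be sketched as follows. This is
a minimal illustration under several assumptions that the proposal does
not fix: phenotypes are reduced to bit tuples, similarity is the fraction
of matching bits, the reward or penalty is a constant bonus, the
collaboration reward is based on how many target bits the pair covers
together, and "with replacement" is read as allowing the same individual
to be drawn more than once. All names and parameter values are
hypothetical.

```python
import random

def match(a, b):
    """Fraction of positions where two equal-length bit tuples agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def coevolution_fitness(own, partner, target, mode, bonus=0.2, cutoff=0.5):
    """Base fitness plus a reward or penalty that depends on the partner."""
    base = match(own, target)
    if mode == "competition":
        # Too similar to the competitor: penalty; different enough: reward.
        if match(own, partner) > cutoff:
            return base - bonus
        return base + bonus
    if mode == "collaboration":
        # Assumed measure: share of target bits matched by at least one
        # of the two individuals.
        covered = sum((o == t) or (p == t)
                      for o, p, t in zip(own, partner, target)) / len(target)
        return base + bonus * covered
    raise ValueError(f"unknown mode: {mode}")

def tournament_select(population, fitnesses, rng, size=2):
    """Draw `size` contestants with replacement and return the fittest."""
    contestants = [rng.randrange(len(population)) for _ in range(size)]
    winner = max(contestants, key=lambda i: fitnesses[i])
    return population[winner]

def next_generation(population, fitnesses, rng, size=2):
    """Fill a new population of the same size by repeated tournaments."""
    return [tournament_select(population, fitnesses, rng, size)
            for _ in population]

rng = random.Random(1)
target = (1, 1, 0, 0)
pop_a = [(1, 1, 0, 0), (0, 0, 0, 0), (1, 0, 1, 0)]
partner = (0, 0, 1, 1)   # a single individual from the other population
fitnesses = [coevolution_fitness(ind, partner, target, "competition")
             for ind in pop_a]
new_pop_a = next_generation(pop_a, fitnesses, rng)
```

In a full coevolution loop, the same evaluation would be run for the
second population with the roles of the two populations swapped.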
3.3 Analysis of the effects of topology on the learning process
I will compare the outcomes across the topology pairings and try to
explain the differences through network topology properties such as
degree distribution and modularity.
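To show what a degree-distribution comparison might look like, here is
a minimal sketch that samples degree sequences for the two topology
classes and summarizes them. It assumes that the scale-free class is
modeled by a truncated power law P(k) proportional to k^(-gamma), and
that only degree sequences are needed; wiring the actual networks and
measuring modularity are out of scope here. The function names and the
parameter values are hypothetical.

```python
import math
import random
from collections import Counter

def poisson_degrees(n, mean, rng):
    """Draw n degrees from a Poisson distribution (Knuth's method)."""
    limit = math.exp(-mean)
    degrees = []
    for _ in range(n):
        k, product = 0, rng.random()
        while product > limit:
            k += 1
            product *= rng.random()
        degrees.append(k)
    return degrees

def power_law_degrees(n, gamma, k_max, rng):
    """Draw n degrees with P(k) proportional to k**-gamma for 1 <= k <= k_max."""
    ks = list(range(1, k_max + 1))
    weights = [k ** -gamma for k in ks]
    return rng.choices(ks, weights=weights, k=n)

def summarize(degrees):
    """Mean, population variance, maximum, and histogram of a degree list."""
    n = len(degrees)
    mean = sum(degrees) / n
    variance = sum((d - mean) ** 2 for d in degrees) / n
    return {"mean": mean, "variance": variance,
            "max": max(degrees), "histogram": Counter(degrees)}

rng = random.Random(42)
poisson_summary = summarize(poisson_degrees(1000, mean=2.0, rng=rng))
scale_free_summary = summarize(power_law_degrees(1000, gamma=2.5, k_max=100, rng=rng))
```

A heavy-tailed degree sequence typically shows a much larger maximum
degree and variance than a Poisson sequence with a similar mean, which
is one of the properties the comparison would look at.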
4. Datasets
I will use synthetic data. To a certain extent, the output from one
population will train the other population, and vice versa.
5. Timeline
(1) Implement the coevolution process. 04/12-04/22
(2) Run experiments to compare different topologies. 04/22-04/30
(3) Write up the milestone report.
(4) Analyze the experimental results.
(5) Write up the final report.