Biologically Inspired Learning in Artificial Neural Networks
Updated: May 27, 2021
1. Levels of Organization in Nature
Biological organisms and systems have come to exhibit intelligent behaviors that solve the problems they encounter in their environment. These behaviors emerged over billions of years of evolution, helping organisms cope with their environments and ultimately survive.
It is reasonable to distinguish three levels of organization observed in nature, referred to as the evolutionary (phylogenetic), developmental (ontogenetic) and lifetime (epigenetic) levels [1]. The evolutionary level is concerned with the evolutionary process of species, which operates over generations and allows organisms to adapt to their environment through variation and selection. The developmental level refers to the developmental process of a multicellular organism starting from a single cell. Environmental factors can influence the developmental process of an individual; however, the process is mainly directed by the genetic code. Finally, the lifetime level refers to the changes that occur during an individual's lifetime, usually in response to environmental factors, which allow individuals to adapt and learn.
2. Evolving Artificial Neural Networks
Inspired by the evolutionary, developmental and lifetime levels of organization in biological neural networks (BNNs), the field known as neuroevolution [2,3] employs evolutionary algorithms to optimize artificial neural networks (ANNs). In neuroevolution, the parameters of ANNs (topology, connectivity and weights) are encoded in the genotype of the individuals. For instance, all the connection weights of a network can be encoded in the genotype as a real-valued vector. The algorithm starts with a randomly generated population of genotypes, evaluates each individual by converting it into an ANN and testing it on the task, selects the individuals that perform better than the others, and reproduces the selected individuals to generate a new population by applying mutation and crossover operators inspired by the evolutionary process. By repeating the evaluation, selection and reproduction steps for a certain number of iterations, we expect to find ANNs that produce satisfactory results on the task.
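The evaluate, select and reproduce loop described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the setup used in the referenced work: the tiny fixed-topology network, the XOR task, the truncation selection with elitism, and all parameter values (`pop_size`, `elite`, `sigma`) are hypothetical choices made for the example.

```python
import math
import random

# Hypothetical toy task (XOR), chosen only to have something to evaluate on.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def forward(genes, x):
    """Decode a 9-gene genotype into a 2-2-1 tanh network and run it on input x."""
    h1 = math.tanh(genes[0] * x[0] + genes[1] * x[1] + genes[2])
    h2 = math.tanh(genes[3] * x[0] + genes[4] * x[1] + genes[5])
    return math.tanh(genes[6] * h1 + genes[7] * h2 + genes[8])

def fitness(genes):
    """Negative squared error on the task (higher is better)."""
    return -sum((forward(genes, x) - y) ** 2 for x, y in XOR)

def evolve(pop_size=50, n_genes=9, generations=100, elite=10, sigma=0.3, seed=0):
    rng = random.Random(seed)
    # Start from a randomly generated population of real-valued genotypes.
    population = [[rng.uniform(-1, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        # Evaluation and selection: rank by fitness, keep the top `elite`.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:elite]
        best_history.append(fitness(parents[0]))
        # Reproduction: uniform crossover followed by Gaussian mutation.
        children = []
        while len(children) < pop_size - elite:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]
            child = [g + rng.gauss(0, sigma) for g in child]
            children.append(child)
        # Parents are carried over unchanged (elitism).
        population = parents + children
    return max(population, key=fitness), best_history

best, history = evolve()
print("best fitness per generation went from", history[0], "to", history[-1])
```

Because the selected parents are copied into the next generation unchanged, the best fitness recorded in `history` can never decrease from one generation to the next. Other selection schemes would not necessarily give that guarantee.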
One of the key aspects of neuroevolution is the approach used for encoding the ANNs. In the case of direct encoding, the topology and/or weights of the ANNs are directly represented within the genotype of the individuals. However, this approach may not be biologically plausible, considering the amount of information required to encode the possible configurations of BNNs relative to the comparatively small number of genes present in the genotype. Moreover, searching this large space of possible network configurations using direct encoding is likely to lead to scalability issues [4]. In addition, ANNs optimized with this approach do not exhibit lifetime learning capabilities. A complementary approach, known as indirect encoding, aims instead to find optimal rules for developing ANNs, or learning mechanisms for training them during their lifetime [5].
3. Evolution of Biologically Inspired Learning
A fundamental property of BNNs is their plasticity, which allows them to modify their internal configurations during their lifetime. According to the current physiological understanding, these changes are performed on synapses (connections between two neurons) based on the local interactions of the neurons [6]. This form of learning may lead to a high level of adaptation (as observed in BNNs) due to its distributed and self-organized nature. However, further research is needed to understand the emergence of coherent learning behavior from local learning rules.
Hebbian learning has been proposed to model plasticity in ANNs [7]. According to the basic formalism of Hebbian learning, the synaptic efficiency between two neurons is increased or decreased depending on the correlation of their activations. However, this formalization is likely to suffer from instability, as it allows synaptic efficiencies to grow or shrink without bound. Therefore, the plasticity rules may require further optimization to properly capture the dynamics needed for adjusting the network parameters.
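The instability of the plain Hebbian update can be seen in a very small example. The sketch below assumes the common textbook form of the rule, in which the weight change is the learning rate times the product of the pre- and post-synaptic activations, applied to a single linear neuron with a constant input. The function names and parameter values are hypothetical and chosen only for illustration.

```python
def hebbian_update(w, pre, post, lr=0.1):
    """Plain Hebbian rule: strengthen the synapse in proportion to co-activation."""
    return w + lr * pre * post

def simulate(steps=100, w0=0.1, x=1.0, lr=0.1):
    """Single linear neuron y = w * x driven by a constant input x."""
    w = w0
    history = [w]
    for _ in range(steps):
        y = w * x                       # post-synaptic activation
        w = hebbian_update(w, x, y, lr)  # the weight feeds back into its own growth
        history.append(w)
    return history

weights = simulate()
print(weights[0], weights[10], weights[-1])
```

With these settings each step multiplies the weight by 1 + lr, so it grows exponentially and nothing in the rule itself bounds it. This is why practical variants add some form of normalization, decay or clipping.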
We employ genetic algorithms to evolve Hebbian-based discrete synaptic plasticity rules that perform synaptic changes during the lifetime of the ANNs [8]. These rules perform synaptic changes locally at each synapse, based on the pairwise binary activation states of the pre- and post-synaptic neurons and a binary (reward/punishment) reinforcement signal that is received from the environment after each action. Thus, as shown in the figure below, the goal of the genetic algorithm is to find one of three possible synaptic changes (increase, decrease, stable) for each possible combination of the binary states of the pre- and post-synaptic neurons and the reinforcement signal. Since there are 2^3 = 8 possible states and three possible changes for each, there are 3^8 = 6561 possible rules. Additionally, we included a continuous value in the genotype of the individuals so that the learning rate is optimized as well.
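One way to picture such a rule is as a lookup table with eight entries, one per combination of pre-synaptic state, post-synaptic state and reinforcement signal, each holding -1 (decrease), 0 (stable) or +1 (increase). The sketch below is a minimal illustration under stated assumptions rather than the implementation from the referenced work: the index layout in `state_index`, the weight clipping range, and the example rule are all hypothetical.

```python
CHANGES = (-1, 0, 1)  # decrease, stable, increase

def state_index(pre, post, reward):
    """Map the three binary values to a table index in 0..7 (assumed layout)."""
    return (pre << 2) | (post << 1) | reward

def apply_rule(weights, pre_acts, post_acts, reward, rule, lr,
               w_min=-1.0, w_max=1.0):
    """Update every synapse locally; weights[i][j] connects pre i to post j.

    Clipping to [w_min, w_max] is an assumption added here to keep weights bounded.
    """
    updated = []
    for i, pre in enumerate(pre_acts):
        row = []
        for j, post in enumerate(post_acts):
            dw = lr * rule[state_index(pre, post, reward)]
            row.append(min(w_max, max(w_min, weights[i][j] + dw)))
        updated.append(row)
    return updated

# Size of the discrete search space: three choices for each of the 2^3 states.
N_RULES = len(CHANGES) ** (2 ** 3)

# Hypothetical rule: when both neurons are active, strengthen on reward
# (index 7) and weaken on punishment (index 6); otherwise leave unchanged.
example_rule = [0, 0, 0, 0, 0, 0, -1, 1]
w = apply_rule([[0.0, 0.0]], pre_acts=[1], post_acts=[1, 0],
               reward=1, rule=example_rule, lr=0.1)
print(N_RULES, w)
```

In this picture, a genotype would consist of the eight table entries plus one continuous value for `lr`, which matches the description above of evolving the rule together with the learning rate.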
4. Foraging Task
We test the learning and adaptation capabilities of the networks on a foraging task with changing environmental conditions. An agent with a randomly initialized feed-forward ANN is required to learn to navigate within an enclosed environment and to collect or avoid the correct types of items. After each action of the agent, a reinforcement signal is provided to guide the learning process. We introduced two seasons, between which the reinforcement signals for the types of items to be collected or avoided are reversed. In both seasons, the agent is required to learn to explore the environment and avoid the walls. In this scenario, the agent is required to adapt automatically to each season by continuously updating the synaptic connections of its network based only on the reinforcement signals. The performance of the agent can be seen in the video below.
This article is based on [8]. For more detailed results and discussion, please see: Evolving plasticity for autonomous learning under changing environmental conditions. Evolutionary Computation. 2020. DOI: 10.1162/evco_a_00286. arXiv preprint arXiv:1904.01709.
References
[1] Sipper, M., Sanchez, E., Mange, D., Tomassini, M., Pérez-Uribe, A., & Stauffer, A. (1997). A phylogenetic, ontogenetic, and epigenetic view of bio-inspired hardware systems. IEEE Transactions on Evolutionary Computation, 1(1), 83-97.
[2] Floreano, D., Dürr, P., & Mattiussi, C. (2008). Neuroevolution: from architectures to learning. Evolutionary Intelligence, 1(1), 47-62.
[3] Stanley, K. O., Clune, J., Lehman, J., & Miikkulainen, R. (2019). Designing neural networks through neuroevolution. Nature Machine Intelligence, 1(1), 24-35.
[4] Yaman, A., Mocanu, D. C., Iacca, G., Fletcher, G., & Pechenizkiy, M. (2018, July). Limited evaluation cooperative co-evolutionary differential evolution for large-scale neuroevolution. In Proceedings of the Genetic and Evolutionary Computation Conference (pp. 569-576). ACM.
[5] Kowaliw, T., Bredeche, N., & Doursat, R. (2014). Growing Adaptive Machines. Springer.
[6] Trachtenberg, J. T., Chen, B. E., Knott, G. W., Feng, G., Sanes, J. R., Welker, E., & Svoboda, K. (2002). Long-term in vivo imaging of experience-dependent synaptic plasticity in adult cortex. Nature, 420(6917), 788.
[7] Hebb, D. O. (1949). The organization of behavior: A neuropsychological theory.
[8] Yaman, A., Mocanu, D. C., Iacca, G., Coler, M., Fletcher, G., & Pechenizkiy, M. (2020). Evolving plasticity for autonomous learning under changing environmental conditions. Evolutionary Computation. DOI: 10.1162/evco_a_00286.