
Neural networks

There are two main genetic operations: mutation and crossover. When a chromosome mutates, one allele is negated: for example, 11010010 becomes 11010000, where the seventh bit has switched from 1 to 0 (Dewdney, 1993). A crossover requires two 'parent' chromosomes, which together produce two offspring: both parents are cut at the same point and the segments after the cut are swapped. For example, a crossover of 10110101 and 00101101 between bits 3 and 4 would result in the offspring 10101101 and 00110101 (Mathew, 2012).

Before any genetic operations are carried out, each chromosome in the population must be evaluated. This can be achieved using the loss function of the network: the chromosome representing the network that generates the lowest loss on a set of test samples is considered the fittest in the population. An algorithm combining the mutation and crossover operations creates a new generation of chromosomes, containing more chromosomes than the original population. A subset of this new generation is then selected to be the new population by taking the n fittest chromosomes in the generation, where n is the original population size. This repeats with each new generation, and the fitness of the population as a whole increases until it reaches a steady state, at which point it plateaus and there is little to no further improvement (Dewdney, 1993). Although an actual implementation of a genetic algorithm for a neural network would be much more complicated than this, the principles are the same; the sketch at the end of this section illustrates them.

Gradient descent and genetic algorithms proceed similarly in that both must repeatedly adjust and improve the parameters of the network to reach a solution. The ways in which they achieve this, however, are very different: gradient descent takes an analytical approach grounded in mathematics, while genetic algorithms look instead to the natural world for inspiration. In general, optimization methods such as gradient descent produce more accurate results than genetic algorithms, although this depends on the type of problem being analysed, and in certain cases the opposite is true. One advantage of genetic algorithms is that there is no need to find gradients at any point, so they can be very useful in situations where computing the gradients is difficult or even impossible. Their main disadvantage is that, for networks large enough to solve complicated problems, genetic algorithms require enormous amounts of processing power to train, which restricts their use. Gradient descent is also much easier to implement in terms of the code required for simple projects. Ultimately, it is likely that a combination of both training methods will lead to advances in performance in the future (Hulstaert, 2017).
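To make these operations concrete, here is a minimal sketch of a generational genetic algorithm in Python. The bit-string length, population size, and loss function are illustrative assumptions rather than details from the sources cited above; in a real system the loss function would decode each chromosome into network weights and measure the network's loss on a set of test samples.

    import random

    BITS = 8        # chromosome length (illustrative)
    POP_SIZE = 20   # n: number of chromosomes kept each generation

    def loss(chromosome):
        # Stand-in fitness measure: counts bits differing from an arbitrary
        # target so the example is self-contained. A real implementation
        # would evaluate the network encoded by the chromosome instead.
        target = [1, 0, 1, 1, 0, 1, 0, 1]
        return sum(a != b for a, b in zip(chromosome, target))

    def mutate(chromosome):
        # Negate one randomly chosen allele, e.g. 11010010 -> 11010000.
        child = chromosome[:]
        i = random.randrange(len(child))
        child[i] = 1 - child[i]
        return child

    def crossover(parent_a, parent_b, point):
        # Swap the tails of the parents after the crossover point, e.g.
        # 10110101 and 00101101 cut after bit 3 -> 10101101 and 00110101.
        return (parent_a[:point] + parent_b[point:],
                parent_b[:point] + parent_a[point:])

    population = [[random.randint(0, 1) for _ in range(BITS)]
                  for _ in range(POP_SIZE)]

    for generation in range(50):
        # Enlarge the population with mutants and crossover offspring ...
        offspring = [mutate(random.choice(population)) for _ in range(POP_SIZE)]
        for _ in range(POP_SIZE // 2):
            a, b = random.sample(population, 2)
            offspring.extend(crossover(a, b, random.randrange(1, BITS)))
        # ... then keep only the n fittest (lowest-loss) chromosomes.
        population = sorted(population + offspring, key=loss)[:POP_SIZE]

    print('fittest:', population[0], 'loss:', loss(population[0]))

Truncation selection, used here for simplicity, is only one possible scheme; practical implementations often use roulette-wheel or tournament selection, and encode real-valued weights rather than single bits.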

Bibliography

Abdi, H., 1994. 'A Neural Network Primer', Journal of Biological Systems, 2(3), pp. 247-283.

Bottou, L., 2012. 'Stochastic Gradient Descent Tricks', in: G. Montavon, G. Orr & K. Müller, eds. Neural Networks: Tricks of the Trade. Berlin: Springer, pp. 421-436.

Darmochwał, A., 1991. 'The Euclidean Space', Journal of Formalized Mathematics, Volume 3.

Dewdney, A., 1993. The New Turing Omnibus. 2nd ed. New York: Holt Paperbacks.

Hulstaert, L., 2017. Gradient descent vs. neuroevolution. [Online] Available at: https://towardsdatascience.com/gradient-descent-vs-neuroevolution-f907dace010f [Accessed 23 August 2021].

Leung, F. H. F. et al., 2003. 'Tuning of the structure and parameters of a neural network using an improved genetic algorithm', IEEE Transactions on Neural Networks, 14(1), pp. 79-88.

Mathew, T., 2012. Genetic Algorithm, Mumbai: Indian Institute of Technology Bombay.

