
Methods of training neural networks

Rory Probert

Neural networks are an increasingly common technique for processing data efficiently. However, in order to operate properly, a network must first be trained to produce accurate results. This essay aims to discuss two different training methods: gradient descent and genetic algorithms. Before discussing these, one must have a basic understanding of what a neural network actually is, and so a brief explanation follows. A neural network is a graph, defined as a set of objects (nodes) related to each other through connections (edges) which may be weighted. It attempts to mimic the structure of neurons in the brain, hence ‘neural’ network.
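As a rough illustration of that definition (not taken from the essay itself), a weighted graph can be written down in Python as a mapping from each node to its neighbours and the weights of the connecting edges; the node names here are arbitrary.

```python
# A minimal sketch of a weighted graph: each node maps to the nodes it is
# connected to, together with the weight of the connecting edge.
weighted_graph = {
    "A": {"B": 0.5, "C": -1.2},  # node A connects to B and C
    "B": {"C": 2.0},
    "C": {},                     # C has no outgoing connections
}
```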

The general structure of most neural networks is as follows. The graph is divided into several layers of nodes: the first layer represents the input neurons, followed by an arbitrary number of medial layers of neurons and, finally, the output neurons (Dewdney, 1993). The objective of the network is to take several input values (one for each input neuron) and convert them into several output values (one for each output neuron). Every node in layer i is connected to every node in layer i + 1 (and therefore each node in layer i + 1 is connected to each node in layer i), and each connection is weighted. For each node in a layer, and for each of that node’s connections to a node in the next layer, the value of the first node is multiplied by the connection’s weight and the result is added to the value of the node in the next layer (which starts at 0). Then, for each node in the new layer, a bias is added to its value. Finally, this value is fed into an activation function, f, which introduces nonlinearity into the system. Two popular choices of activation function are the linear rectifier (ReLU) and the hyperbolic tangent function (Dewdney, 1993).
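The computation just described — multiply each node’s value by the connection weight, sum into the next node, add a bias, then apply an activation function — can be sketched layer by layer. The following Python snippet is an illustrative sketch only, not code from the essay; the function name forward and the use of NumPy weight matrices are assumptions made for clarity.

```python
import numpy as np

def relu(x):
    """Linear rectifier (ReLU): max(0, x), applied element-wise."""
    return np.maximum(0.0, x)

def forward(inputs, weights, biases, activation=np.tanh):
    """Propagate input values through a fully connected network.

    weights[k] has shape (n_k, n_{k+1}): entry [i, j] is the weight of the
    edge from node i in layer k to node j in layer k + 1.
    biases[k] has shape (n_{k+1},): one bias per node in layer k + 1.
    """
    values = np.asarray(inputs, dtype=float)
    for W, b in zip(weights, biases):
        # Each node in the next layer sums (previous value * edge weight),
        # then a bias is added and the activation function is applied.
        values = activation(values @ W + b)
    return values

# Example: 3 input neurons -> 4 medial neurons -> 2 output neurons.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 4)), rng.normal(size=(4, 2))]
biases = [np.zeros(4), np.zeros(2)]
print(forward([0.1, -0.5, 0.8], weights, biases, activation=relu))
```

Swapping the activation argument between relu and np.tanh corresponds to the two choices mentioned above; the weights and biases here are random placeholders, since choosing good values is precisely what the training methods discussed in this essay are for.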

