Chapter 2: Our first Neural Network
First, let us start by answering one simple question: What is a Neuron? Let us look into a formal definition:
A neuron, also known as a neurone (British spelling) and nerve cell, is an electrically excitable cell that receives, processes, and transmits information through electrical and chemical signals. [1]
As you can see, there are 3 main functions of a neuron:
Receive signals
Process signals and
Transmit the result (after processing)
These 3 functions perfectly match 3 key components of a real Neuron, just look at the following picture of a Neuron:
These three components map one-to-one to the main functions that a neuron performs:
Dendrites - receive signals and send them to the body
Soma (body) - processes the signal
Axon - sends the result to other neurons
Now, we can create a mathematical model of the Neuron:
Weights - each weight represents a dendrite and is responsible for receiving a signal from the neighborhood. Since each dendrite has its own size (shape), there is a certain deviation between the signal that is delivered to the dendrite and the signal that the dendrite delivers to the soma. This is why we need a weight for each connection: the weight basically tells us how the dendrite transforms the signal. As soon as all the signals are collected, they are multiplied by their weights, summed up, and transmitted to the soma.
Activation Function - a mathematical function that emulates the neuron body (soma). Its only job is to convert the incoming signal into the signal that will be transmitted to the next neurons in the network. You can pick any function you like, but there are, of course, functions that are more or less suitable for this role. We will talk more about the most popular activation functions later in this book.
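Putting the two pieces together, the whole model of a single neuron can be written as one formula, where f is the activation function, the w_i are the weights, and the x_i are the incoming signals:

```latex
\mathrm{output} = f\Big(\sum_{i=1}^{n} w_i \, x_i\Big)
```

The sum plays the role of the dendrites delivering weighted signals to the soma, and f plays the role of the soma itself.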
With all the knowledge discussed, we are actually ready to do some real coding.
Now let us work through an actual example: the case about the party. Let us model it with pen and paper: 3 input neurons and 1 output neuron with a very simple step function for activation.
In order to make it work we need to define 2 types of neurons:
Input neurons;
Connected neurons;
Input neuron:
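The input-neuron snippet itself is not reproduced in this extract, so here is a minimal sketch of what it could look like (the class and method names here are my own assumptions, not necessarily those of the book's repository):

```java
// An input neuron does no processing at all: it simply stores the raw
// input value and hands it over to the neurons connected to it.
// (Class and method names are illustrative assumptions.)
public class InputNeuron {
    private double signal;

    public void setSignal(double signal) {
        this.signal = signal;
    }

    // The "output" of an input neuron is just its raw signal.
    public double calculateOutput() {
        return signal;
    }
}
```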
Before we can proceed with the implementation for our first real Neuron we need to define several things first:
An interface for the Activation Function;
Our first simple Activation Function;
The interface is going to be simple:
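The snippet itself is not included in this extract; a plausible reconstruction (the interface and method names are my assumption) could be:

```java
// One double in, one double out - that is all we need to emulate
// the soma. (The exact interface name is my assumption.)
public interface ActivationFunction {
    double calculate(double x);
}
```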
Nothing hard here, right? A double in and a double out. For our first implementation we will use a step function:
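The implementation is missing from this extract; a sketch of a step function could look as follows. The 0.5 threshold is my assumption, and in the book's code this class would implement the activation-function interface defined earlier:

```java
// Step function: the neuron stays silent (0) until the incoming
// signal reaches a threshold, then fires at full strength (1).
// The 0.5 threshold is an assumption on my part.
public class StepFunction {
    private static final double THRESHOLD = 0.5;

    public double calculate(double x) {
        return x >= THRESHOLD ? 1.0 : 0.0;
    }
}
```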
If you plot the function it looks like this:
Now we are ready to discuss the most complex part of this chapter - the implementation of our first real Neuron. We will take it piece by piece. First, let me describe all the fields that we will need:
At this point it should be obvious how adding a connection works:
And the main function that is doing all the magic:
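Since the actual snippets are not reproduced in this extract, here is one compact sketch covering all three pieces: the fields, the connection adding, and the calculation method. Every name below is my own guess at what the book's repository contains, not a copy of it:

```java
import java.util.ArrayList;
import java.util.List;

// A sketch of the "connected" neuron described above.
// All field, method, and interface names are my assumptions.
public class Neuron {

    // the soma: double in, double out
    interface Activation { double calculate(double x); }

    private final List<Neuron> connections = new ArrayList<>(); // dendrites
    private final List<Double> weights = new ArrayList<>();     // one weight per dendrite
    private final Activation activation;
    private Double signal; // set directly only for input neurons

    public Neuron(Activation activation) {
        this.activation = activation; // may be null for input neurons
    }

    // input neurons get their value from the outside world
    public void setSignal(double signal) {
        this.signal = signal;
    }

    // adding a connection means adding a dendrite plus its weight
    public void addConnection(Neuron source, double weight) {
        connections.add(source);
        weights.add(weight);
    }

    // the "magic": weighted sum of all incoming signals -> activation
    public double calculateOutput() {
        if (signal != null) {
            return signal; // input neuron: just the raw signal
        }
        double sum = 0.0;
        for (int i = 0; i < connections.size(); i++) {
            sum += connections.get(i).calculateOutput() * weights.get(i);
        }
        return activation.calculate(sum);
    }
}
```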
One thing that we are omitting here is the Builder. Since the Builder pattern has nothing to do with Deep Learning, we will just assume that a Builder exists for our class without showing the code snippet here.
Now we can create our first NN. First, let's describe the neural network that we are going to build. Our first NN is going to predict whether one should go to a party or not. In order to build the network we need data to educate it with. Since we do not know how to train a network yet, we will do it manually: by observing patterns in our data and hand-crafting weights that allow the network to make correct predictions. So here is what we know about the factors impacting the party experience:
Whether our best friend will be there or not;
Availability of our favorite drink (let's call it "Vodka" :) );
The weather (sunny or not).
Now we can create combinations of these factors and the desired behaviour for each of them. To make the representation simpler, we use 1 or 0 to represent each factor: 1 means the factor is present, 0 that it is not. Since we are talking about 3 factors, the input to our NN will be 3 digits (1 or 0 each). For example, the following array represents an input where the best friend will be at the party, but there will be no favorite drink and the weather will not be sunny - and the outcome of these factors: one should not go to the party:
[1, 0, 0] => 0
One more example: the best friend will not be there, but there will be the drink and it is sunny outside - one should go to the party:
[0, 1, 1] => 1
Now let us build all possible combinations of the data and their outcomes:
[0, 0, 0] => 0
[1, 0, 0] => 0
[0, 1, 0] => 0
[0, 0, 1] => 0
[1, 1, 0] => 1
[1, 0, 1] => 1
[0, 1, 1] => 1
[1, 1, 1] => 1
What will our NN look like? It is actually straightforward: we are going to have 3 input neurons and 1 output neuron. Let us jump immediately to the picture of the NN so it is simpler to understand:
This network is called a "fully connected 1-layer network". I know it sounds scary - I think many people in science deliberately try to make it sound so :) But anyway, let us decouple the name into sub-parts and discuss them one by one. "Fully connected" simply means that each neuron in a layer has a connection to each of the neurons in the previous layer. A layer is an abstraction that does not exist in a real brain, just a simplification that allows engineers to represent neural networks more efficiently. Each layer is a collection of neurons that have no connections between each other; neurons of a layer receive signals only from the previous layer (if such a layer exists) and send signals to the next layer (if such a layer exists). Looking at our picture, one might think that this is a 2-layer network. The thing is, historically the input layer does not get counted (or maybe we are counting from 0 - then it actually makes sense). If we do not count the input layer, this network is indeed a 1-layer network.
At this point we have absolutely everything we need to proceed with the implementation of our first NN: the data, the architecture of the network, and even the weights that we will be using. Let's actually implement it:
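The implementation itself is not included in this extract; below is a self-contained sketch of the whole party predictor. The 0.3 weights and the 0.5 step threshold are hand-picked assumptions of mine that happen to satisfy the table above: any two factors together (0.6) cross the threshold, while a single factor (0.3) does not:

```java
import java.util.Arrays;

// Fully connected 1-layer party predictor: 3 inputs, 1 output neuron.
// The weights (0.3 each) and the step threshold (0.5) are my own
// hand-picked values, not necessarily those of the book's repository.
public class PartyNetwork {

    // the soma: a step activation with a 0.5 firing threshold
    static double step(double x) {
        return x >= 0.5 ? 1.0 : 0.0;
    }

    // one output neuron, fully connected to the three input neurons
    static double predict(double[] input, double[] weights) {
        double sum = 0.0;
        for (int i = 0; i < input.length; i++) {
            sum += input[i] * weights[i]; // dendrite: signal * weight
        }
        return step(sum); // soma: activation of the weighted sum
    }

    public static void main(String[] args) {
        double[] weights = {0.3, 0.3, 0.3};
        double[][] data = {
            {0, 0, 0}, {1, 0, 0}, {0, 1, 0}, {0, 0, 1},
            {1, 1, 0}, {1, 0, 1}, {0, 1, 1}, {1, 1, 1}
        };
        for (double[] input : data) {
            System.out.printf("%s => %.0f%n",
                    Arrays.toString(input), predict(input, weights));
        }
    }
}
```

With these weights the network reproduces every row of the table above by hand, without any training.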
*All the code that we are going to discuss is available in the following branch:
[1]