Chapter 2: Our first Neural Network

What is a Neuron?

First, let us start by answering one simple question: what is a neuron? Let us look at a formal definition:

A neuron, also known as a neurone (British spelling) and nerve cell, is an electrically excitable cell that receives, processes, and transmits information through electrical and chemical signals. [1]

As you can see, there are 3 main functions of a neuron:

  • Receive signals

  • Process signals and

  • Transmit the result (after processing)

These 3 functions map perfectly onto 3 key components of a real Neuron; just look at the following picture of a Neuron:

These three components have a 1:1 mapping to the main functions that a Neuron performs:

  • Dendrites - receive signals and send them to the body

  • Soma (body) - processes the signals

  • Axon - sends the resulting signal to other neurons

Now, we can create a mathematical model of the Neuron:

Weights

- Each weight represents a dendrite and is responsible for receiving a signal from a neighbouring neuron. Since each dendrite has its own size (shape), there is a certain deviation between the signal that is delivered to the dendrite and the signal that the dendrite delivers to the soma. This is why we need a weight for each connection: the weight basically tells us how the dendrite transforms the signal. Once all the signals have been collected, they are multiplied by their weights, summed, and the sum is transmitted to the soma.

Activation Function - a mathematical function that emulates the neuron body (soma). The only thing it is supposed to do is convert the incoming signal into the signal that will be transmitted to the next neurons in the network. You can pick any function you like, but there are, of course, functions that are more or less suitable for this role. We will talk more about the most popular activation functions later in this book.
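
Putting these two parts together gives us the complete mathematical model of the neuron (the same formula that will appear as "W * X + b" in the code later in this chapter):

output = f(x1*w1 + x2*w2 + ... + xn*wn + b)

Here x1...xn are the incoming signals, w1...wn are the weights of the corresponding dendrites, b is a bias term that we will meet in the code below, and f is the activation function.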

Our First Neuron

With all of this knowledge discussed, we are ready to do some real coding.

*All the code that we are going to discuss is available in the following branch: https://github.com/DeepJavaUniverse/DJ-core/tree/chapter_2

Now let us work with an actual example: the party case that we will describe in detail a bit later. Let us model it with pen and paper: 3 input neurons and 1 output neuron with a very simple step function for activation. We will start by defining a Neuron interface:

public interface Neuron {

    /**
     * Should be called when a Neuron receives input signal from the connected neuron.
     * For example, let's look at the following network:
     * NeuronA
     *         \
     *          \
     *           \
     * NeuronB --- NeuronD
     *           /
     *          /
     *         /
     * NeuronC
     *
     * If NeuronA, NeuronB or NeuronC sends a signal to NeuronD, this method should be
     * called.
     *
     * @param from the Neuron that sends the signal.
     * @param value the value of the signal that is being sent.
     */
    void forwardSignalReceived(Neuron from, Double value);

    default void connect(Neuron neuron, Double weight) {
        this.addForwardConnection(neuron);
        neuron.addBackwardConnection(this, weight);
    }

    void addForwardConnection(Neuron neuron);

    void addBackwardConnection(Neuron neuron, Double weight);
}

In order to make it work we need to define 2 types of neurons:

  • Input neurons;

  • Connected neurons;

Input neuron:

import java.util.HashSet;
import java.util.Set;

public class InputNeuron implements Neuron {

    private Set<Neuron> connections = new HashSet<>();

    @Override
    public void forwardSignalReceived(final Neuron from, final Double value) {
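        // An input neuron has no weights and no activation function: it simply relays the received
        // value to every neuron it is connected to.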
        connections.forEach(n -> n.forwardSignalReceived(this, value));
    }

    @Override
    public void addForwardConnection(final Neuron neuron) {
        connections.add(neuron);
    }

    @Override
    public void addBackwardConnection(final Neuron neuron, final Double weight) { } // No op
}

Before we can proceed with the implementation of our first real Neuron, we need to define two more things:

  • An interface for the Activation Function;

  • Our first simple Activation Function;

The interface is going to be simple:

public interface ActivationFunction {

    Double forward(final Double x);
}

Nothing hard here, right? A double in and a double out. For our first implementation we will use the step function:

public class StepFunction implements ActivationFunction {

    @Override
    public Double forward(final Double x) {
        return x >= 1. ? 1. : 0.;
    }
}
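
A quick sanity check of the step function (a hypothetical snippet, not part of the repository code):

    ActivationFunction step = new StepFunction();
    System.out.println(step.forward(0.5)); // prints 0.0: below the threshold of 1
    System.out.println(step.forward(1.0)); // prints 1.0: at the threshold
    System.out.println(step.forward(2.3)); // prints 1.0: above the threshold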

If you plot the function it looks like this:

Now we are ready to discuss the most complex part of this chapter: the implementation of our first real Neuron, the ConnectedNeuron class. We will take it piece by piece. First, let me describe all the fields that we will need:

    private final ActivationFunction activationFunction;

    /**
     * Represents the connections from this neuron to the neurons that it receives signals from. For example, in the
     * following network:
     * NeuronA ___
     *            \ weight1 = -0.1
     *             \
     * weight2 = 0.1\
     * NeuronB ------ NeuronD
     *              /
     *             /
     *            /  weight3 = 0.8
     * NeuronC ---
     * the backwardConnections map will look like this:
     * NeuronA => -0.1
     * NeuronB => 0.1
     * NeuronC => 0.8
     */
    private final Map<Neuron, Double> backwardConnections = new HashMap<>();

    /**
     * Represents the set of Neurons to which the current neuron sends signals. There is no need for weights here.
     */
    private final Set<Neuron> forwardConnections = new HashSet<>();

    /**
     * inputSignals is used to store the signals received from other Neurons. The keys in this Map should be identical
     * to the keys in {@link #backwardConnections}. As soon as all the signals have been received, the Neuron can
     * start processing them.
     */
    private final Map<Neuron, Double> inputSignals = new HashMap<>();

    /**
     * Number of signals that have been received so far. As soon as this number reaches the size of the
     * {@link #inputSignals} map, the neuron is ready to process the input signals and send a signal forward.
     */
    private volatile int signalReceived;
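
    /**
     * The bias that is added to the weighted sum of the input signals before the activation function is applied.
     */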
    private final double bias;

    /**
     * Stores the result of the latest signal that was sent from this Neuron to other Neurons. This is mostly needed
     * for output Neurons, since they do not have any other Neurons to send a signal to, and at the same time there
     * should be a way of getting this value.
     */
    private volatile double forwardResult;

At this point it should be obvious how adding connections works:

    @Override
    public void addForwardConnection(final Neuron neuron) {
        forwardConnections.add(neuron);
    }

    @Override
    public void addBackwardConnection(final Neuron neuron, final Double weight) {
        backwardConnections.put(neuron, weight);
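        // Double.NaN is just a placeholder here; it will be overwritten by the real signal value in
        // forwardSignalReceived.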
        inputSignals.put(neuron, Double.NaN);
    }

And the main method that does all the magic:

    @Override
    public void forwardSignalReceived(final Neuron from, final Double value) {
        signalReceived++;
        inputSignals.put(from, value);
        // The following if checks whether the current signal was the last remaining signal to receive. If so, all
        // incoming signals have been received and the Neuron can start processing them and issue a new signal itself.
        if (backwardConnections.size() == signalReceived) {
            // 4 steps need to happen when Neuron processes the input signals:
            // 1. Calculate input = W * X + b
            // 2. Calculate output = f(input), where f is activation function
            // 3. Send output to other neurons
            // 4. invalidate state

            // Step #1
            // Calculating W * X + b: the sum of all input signals, each signal multiplied by its corresponding weight.
            // The bias is added at the end.
            double forwardInputToActivationFunction
                    = backwardConnections
                        .entrySet()
                        .stream()
                        .mapToDouble(connection ->
                                // inputSignals stores the actual signal, while connection.getValue() gives you the
                                // weight that the signal should be multiplied by. Therefore this part is X * W.
                                inputSignals.get(connection.getKey())
                                        * connection.getValue())
                        .sum() + bias;

            // Step #2
            double signalToSend
                    = activationFunction.forward(
                            forwardInputToActivationFunction);
            forwardResult = signalToSend;

            // Step #3: now that the signal is calculated we can send it to other neurons.
            forwardConnections
                    .stream()
                    .forEach(connection ->
                        connection
                                .forwardSignalReceived(
                                        ConnectedNeuron.this,
                                        signalToSend)
                    );
            // Step #4
            signalReceived = 0;
        }
    }
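
To make these steps concrete, take the network from the backwardConnections Javadoc above and assume a bias of 0. If NeuronA, NeuronB and NeuronC each send a signal of 1.0, step #1 computes 1.0 * (-0.1) + 1.0 * 0.1 + 1.0 * 0.8 + 0 = 0.8; in step #2 the step function turns 0.8 into 0.0 (because 0.8 < 1), and in step #3 that value is sent to all forward connections.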

One thing that we are omitting here is the Builder. Since the Builder pattern has nothing to do with deep learning, we will just assume that a Builder exists for our class without showing the full code snippet here.
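
For the curious, here is a minimal sketch of what such a Builder could look like as a nested class inside ConnectedNeuron. This is only an illustration under the assumption that ConnectedNeuron has a constructor taking the activation function and the bias; the real Builder lives in the repository and may differ:

    public static class Builder {

        private double bias = 0.;
        private ActivationFunction activationFunction;

        public Builder bias(final double bias) {
            this.bias = bias;
            return this;
        }

        public Builder activationFunction(final ActivationFunction activationFunction) {
            this.activationFunction = activationFunction;
            return this;
        }

        public ConnectedNeuron build() {
            // Assumes a ConnectedNeuron(ActivationFunction, double) constructor exists.
            return new ConnectedNeuron(activationFunction, bias);
        }
    }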

Now we can create our first NN. First let's describe the neural network that we are going to build. Our first NN is going to predict whether one should go to a party or not. In order to build the network we need data that we will use for teaching the network. Since we do not know how to train a network yet, we will do it manually: by observing patterns in our data and trying to hand-craft weights that allow the network to make correct predictions. So here is what we know about the factors that impact the party experience:

  1. The fact that our best friend will be there or not;

  2. Availability of the favorite drink (let's call it "Vodka" :) );

  3. The weather (sunny or not).

Now we can create combinations of all these factors and the desired behaviour for each of them. To make the representation simpler we use 1 or 0 for each factor: 1 means that the factor is present, 0 that it is not. Since we are speaking about 3 factors, the input to our NN will be 3 digits (1 or 0 each). For example, the following array represents an input where the best friend will be at the party, but there will be no favorite drink and the weather will not be sunny, and the outcome of these factors is that one should not go to the party:

[1, 0, 0] => 0

One more example: the best friend will not be there, but the drink will be available and it is sunny outside - one should go to the party:

[0, 1, 1] => 1

Now let us build the combinations of the data and their outcomes:

[1, 0, 0] => 0
[0, 1, 0] => 0
[0, 0, 1] => 0
[1, 1, 0] => 1
[1, 0, 1] => 1
[0, 1, 1] => 1
[1, 1, 1] => 1

What will our NN look like? It is actually straightforward: we are going to have 3 input neurons and 1 output neuron. Let's jump straight to the picture of the NN so it is simpler to understand:

This network is called a "fully connected 1-layer network". I know it sounds scary; I think many people in science deliberately try to make it sound so :) But anyway, we are going to break the name into parts and discuss them one by one. Fully connected simply means that each neuron in a layer has a connection to each of the neurons in the previous layer. A layer is an abstraction that does not exist in a real brain, just a simplification that allows engineers to represent neural networks more efficiently. Each layer is a collection of neurons that have no connections between each other; neurons in a layer receive signals only from the previous layer (if such a layer exists) and send signals to the next layer (if such a layer exists). Looking at our picture one might think that it is a 2-layer network. The thing is, historically the input layer does not get counted (or maybe we are counting from 0, then it actually makes sense). If we do not count the input layer, this network is indeed a 1-layer network.

At this point we have absolutely everything that we need to proceed with the implementation of our first NN: data, the architecture of the network, and even the weights that we will be using. Looking at the table above, one should go to the party whenever at least two of the three factors are present. This suggests giving each input a weight of 0.5 and using a bias of 0: with our step function firing at 1, two or more active inputs produce a sum of at least 1.0, while a single active input produces only 0.5. Let's actually implement it:

    // Creating 3 input neurons
    InputNeuron inputFriend = new InputNeuron();
    InputNeuron inputVodka = new InputNeuron();
    InputNeuron inputSunny = new InputNeuron();
    // Our output neuron
    ConnectedNeuron outputNeuron
                // We have not discussed the Builder; it is a simple Java Builder. If needed, you can look
                // at the implementation in our repository.
                = new ConnectedNeuron.Builder()
                    // With weights of 0.5 and a step function that fires at 1, a bias of 0 gives us
                    // exactly the behaviour from the table above.
                    .bias(0.)
                    .activationFunction(new StepFunction())
                    .build();

    // Creating connections between neurons
    inputFriend.connect(outputNeuron, 0.5);
    inputVodka.connect(outputNeuron, 0.5);
    inputSunny.connect(outputNeuron, 0.5);

    // Sending signal to the network
    inputFriend.forwardSignalReceived(null, 1.);
    inputVodka.forwardSignalReceived(null, 1.);
    inputSunny.forwardSignalReceived(null, 0.);

    // Getting result and printing it:
    double result = outputNeuron.getForwardResult();
    System.out.printf("Prediction: %.3f\n", result);
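
With these inputs the weighted sum is 1 * 0.5 + 1 * 0.5 + 0 * 0.5 + 0 = 1.0, the step function turns it into 1, and the program prints a prediction of 1: the best friend and the favorite drink are there, so, just as our table says ([1, 1, 0] => 1), we should go to the party.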

[1] https://en.wikipedia.org/wiki/Neuron
