Neural Networks are a fascinating method of solving various problems. Fascinating partly due to their curious name :)
Warning: this text is very general and verbose, and is going to be improved in the future. If you already have some understanding of the topic, feel free to skip it and go straight to solving the corresponding tasks: task 1, task 2
But what are they? And are they good enough to solve any problem? In this article we are going to answer a few general questions about them, while a couple of problems will help you understand exactly how they work.
Properly we call them Artificial Neural Networks to distinguish them from the networks of real neurons in living organisms. However the latter meaning is not so widely used, and thus "artificial" is often omitted. The abbreviations are NN and ANN.
The name comes from the fact that an NN is represented by a network (or graph) of small objects (really, math functions) passing values between them. So they distantly resemble the connections of neurons (nerve cells) in a living organism, passing signals from the brain to the muscles, for example.
This resemblance, and the fact that NNs are supposed to solve some tasks, led to the choice of such a name.
Beware of marketing!!!
A part (probably a large one) of the fame NNs enjoy is due to the name. The general public tends to consider them "something ultra-clever" because of it. However the technology is about 50 years old, and nowadays different methods are often preferred.
However it is harder to impress the public with the use of Support Vector Machines than with Neural Networks (or a Genetic Algorithm, likewise).
In short: it is just a mathematical function converting some input values to some output values.
What makes NNs magical is that we don't need to know how to construct the necessary function; instead we can "train" the function using sample data. As we often don't know the exact dependencies between things in real problems, this is convenient in many applications.
Let's look at a simple example. Consider a tram:
Suppose we want to create an intelligent safety stopping algorithm for such a tram. There will be an ultrasonic sensor in front, which detects when something (some careless pedestrian) suddenly appears on the way. So at this moment we shall know two parameters - the speed of the tram V and the allowed distance for deceleration S. Our goal is to calculate the voltage U applied to the motors of the tram (the tram stops by applying reverse voltage to the motors).
We don't want too much voltage (as it damages the motors, and an instant stop may hurt passengers). Let it be just enough to stop in exactly S meters.
The problem is complicated because applying the voltage won't immediately make the motors rotate backwards: they have their own inertia, there is also friction, and the inductance of the motor coils won't allow the current to change instantly.
So, again, we want some function to calculate the proper voltage depending on the inputs:
U = f(S, V)
However it is very difficult to figure out the precise mathematical law.
As many of the first electronic and mechanical calculators were created for military purposes, let's discuss a simplified task of hitting an enemy aircraft with an explosive shell:
The cannon is at the origin, and we detect an aircraft at distance S and height H, approaching with speed V. These three values will be the inputs.
We want to know the angle A to which the barrel of the cannon should be raised and the timeout T to which the timer on the shell should be set, so that the explosion happens in proximity of the airplane.
There is the braking force of the atmosphere, depending non-linearly on the projectile speed. Also gravity decreases with height. There could be other factors. So the function (with two outputs):
(A, T) = f(S, H, V)
becomes complex. Add a third dimension to this picture, and the ability of the aircraft to fly non-horizontally - and the matter becomes even worse.
We won't discuss here how an NN is built internally (this can be learnt further from our problems) - but consider that it has many parameters, or coefficients, inside, which we can manipulate to tune it, i.e. change its behavior to our needs.
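Without going into how a real NN is organized, the idea of "a function with coefficients inside" can be illustrated with a deliberately tiny stand-in. The name `tiny_network`, the layout of the 9 coefficients and the use of `tanh` are assumptions made only for this sketch; they are not taken from the problems mentioned above.

```python
import math

def tiny_network(inputs, coeffs):
    """Toy stand-in for an NN: 2 inputs -> 2 intermediate values -> 1 output.

    coeffs is a flat list of 9 numbers (hypothetical layout): two weights
    and an offset for each intermediate value, then the same for the output.
    """
    x1, x2 = inputs
    h1 = math.tanh(coeffs[0] * x1 + coeffs[1] * x2 + coeffs[2])
    h2 = math.tanh(coeffs[3] * x1 + coeffs[4] * x2 + coeffs[5])
    return coeffs[6] * h1 + coeffs[7] * h2 + coeffs[8]

same_inputs = (0.5, 0.8)
# The same inputs give different outputs once the coefficients change.
print(tiny_network(same_inputs, [0.0] * 9))
print(tiny_network(same_inputs, [1, -1, 0, 0.5, 0.5, 0, 2, 1, 0.1]))
```

The point is only that the shape of the function stays fixed while the coefficients decide what it actually computes.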
Then it becomes a question of how to "tune" these coefficients to make the NN "work well" for our problem. This is done by the process of training. With the tram example it may look like this:
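1. Choose a training set - several pairs of inputs (S, V).
2. For each pair, calculate the output of the NN (U), perform an experiment on a live system with such parameters, and remember the error E - for example, the square of the difference between the real travel distance Sreal and the expected S.
3. Change the coefficients slightly to reduce the error for every (S, V) included in our training set.
4. Repeat the previous two steps until the error is acceptable.
The main question is, of course, how to "change coefficients to reduce error". One of the popular algorithms used with NNs is "backpropagation" - i.e. backward propagation of the error according to the derivatives of the internally used mathematical functions.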
However we can even do without complicated math. This will be demonstrated in the second problem on NNs.
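As a rough illustration of training without derivatives, here is a minimal sketch that nudges the coefficients at random and keeps a change only when the total error drops. This is one simple option chosen for the example, not necessarily the approach used in the problems. Everything named here is hypothetical: `network` is a toy 9-coefficient function, and `experiment` is an invented formula standing in for the live tram (it is not real physics); all values are in arbitrary scaled units.

```python
import math
import random

def network(s, v, coeffs):
    """Toy stand-in for an NN: inputs (s, v) -> voltage u, 9 coefficients."""
    h1 = math.tanh(coeffs[0] * s + coeffs[1] * v + coeffs[2])
    h2 = math.tanh(coeffs[3] * s + coeffs[4] * v + coeffs[5])
    return coeffs[6] * h1 + coeffs[7] * h2 + coeffs[8]

def experiment(v, u):
    """Hypothetical replacement for the live tram: stopping distance at
    speed v under braking voltage u. Invented formula, not real physics."""
    return v * v / (1.0 + max(u, 0.0))

def total_error(coeffs, pairs):
    """Sum over the training set of (real distance - expected distance)^2."""
    return sum((experiment(v, network(s, v, coeffs)) - s) ** 2
               for s, v in pairs)

def train(pairs, steps=3000, seed=1):
    """Random tweaking: keep a perturbed coefficient set only if it is better."""
    rng = random.Random(seed)
    coeffs = [rng.uniform(-1.0, 1.0) for _ in range(9)]
    initial = best = total_error(coeffs, pairs)
    for _ in range(steps):
        candidate = [c + rng.gauss(0.0, 0.1) for c in coeffs]
        err = total_error(candidate, pairs)
        if err < best:
            coeffs, best = candidate, err
    return coeffs, initial, best

# Training pairs (S, V); S is generated from known voltages only so that
# every pair is reachable in this toy model.
pairs = [(experiment(v, u), v) for v in (0.3, 0.6, 0.9) for u in (0.2, 0.5, 0.8)]
coeffs, initial, final = train(pairs)
print("error before training:", initial)
print("error after training: ", final)
```

Because a change is accepted only when it lowers the error, the final error can never exceed the initial one, though nothing guarantees it reaches zero.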
As the neural network is a math function, its input and output values are numbers. This leads to several important points:
Values should be scaled to a common range. For example, if S is expected to be 5 ... 50 meters, the speed 10 ... 100 miles per hour and the voltage in the range 100 ... 800 volts, let's divide them by 50, 100 and 800 correspondingly, so all values are in the range 0.0 ... 1.0.
A different range can be chosen instead, -1.0 ... 1.0 for example.
Really, how to choose the inputs and how to convert them to values is a large part of solving a problem with a neural network. It is never enough just to say "use NN" - one should explain how it could be conveniently used.
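The scaling of the tram values described above could be sketched as follows. The divisors 50, 100 and 800 come from the ranges in the text; the function names are made up for this example, and the mapping to -1.0 ... 1.0 is just one straightforward way to obtain the alternative range.

```python
# Upper bounds of the expected ranges, as given in the text.
S_MAX = 50.0    # meters
V_MAX = 100.0   # miles per hour
U_MAX = 800.0   # volts

def scale_inputs(s_meters, v_mph):
    """Convert real-world inputs to values in the range 0.0 ... 1.0."""
    return s_meters / S_MAX, v_mph / V_MAX

def unscale_output(u_scaled):
    """Convert a network output in 0.0 ... 1.0 back to volts."""
    return u_scaled * U_MAX

def to_symmetric(x):
    """Map a value from 0.0 ... 1.0 to -1.0 ... 1.0 (assumed linear mapping)."""
    return 2.0 * x - 1.0

print(scale_inputs(25.0, 40.0))
print(unscale_output(0.5))
print(to_symmetric(0.25))
```

The same divisors have to be used both when training and when the trained function is later applied, otherwise the numbers it sees would mean something different.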
For example, if we want to create a character recognition algorithm, we should first create a separate algorithm for splitting the image into text characters, then convert them to monochrome and scale them to some small grid (say, 8*8 pixels). Let's treat a white pixel as the value 0 and a black one as 1 (while gray shades go in between).
So we shall have 64 inputs (one per pixel of the 8*8 grid) in the range 0 ... 1, but this is probably still not a complete solution. For example, we may want to choose the optimal inner structure of the NN so that we don't waste time on unrelated calculations.
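Turning an already extracted 8*8 character image into network inputs could look roughly like this. The sketch assumes the pixels arrive as brightness values from 0 (black) to 255 (white), which is a common convention but is not stated in the text; the function name is hypothetical, and the splitting and scaling steps are not covered.

```python
def grid_to_inputs(grid, max_brightness=255):
    """Flatten an 8*8 grid of brightness values into a list of inputs.

    Assumes brightness 0 = black and max_brightness = white, and follows
    the text's convention: white -> 0.0, black -> 1.0, gray in between.
    """
    if len(grid) != 8 or any(len(row) != 8 for row in grid):
        raise ValueError("expected an 8*8 grid")
    return [1.0 - pixel / max_brightness for row in grid for pixel in row]

# A white image with a black vertical stroke in column 3.
image = [[0 if col == 3 else 255 for col in range(8)] for _ in range(8)]
inputs = grid_to_inputs(image)
print(len(inputs))
print(inputs[:8])
```

Each pixel becomes exactly one input, so an 8*8 grid gives one input per cell, row by row.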
But enough considerations - let's go and study Neural Networks in practice: