Lets quickly remind how does Self-Organizing Map works and which are building blocks of it.
In my implementation I have processor named SomLearningProcessor which does all main algorithm steps. After you created instance of that class you are able to call void Learn() method, which goes through all iterations finds best matching neuron with method int FindBestMatchingNeuron(double dataVector) and then it accommodates network with using dataVector and found BMN – this is done with method void AccommodateNetworkWeights(int bestNeuronNum, double dataVector, int
iteration). I will return back to this processor with more details, I just wanted to give quick vision what is it.
So lets move to building blocks of our learning algorithm:
First of all it has Network, which I inject into our SomLearningProcessor through the INetwork interface:
As you see it has list of Neurons which should fill the leaning grid, which is provided through the Topology interface. The default implementation of the network interface supplies possibility to randomize neurons with specifying minWeights and maxWeights boundaries for randomizing.
Also we are injecting IActivationFunction for our neurons, so they will calculate reaction basing on this interface. Interface provides method double GetResult(double inputValue);
There are following concrete implementations of the interface IActivationFunction:
For Self-Organizing Maps I’m using TransparentActivationFunction, which just returns summarized value.
Another interesting thing that we are using during learning process is Topology. Interface looks like this:
As you see Topology consists with Rows and Columns as usual matrix, but the main point of interest is the method: Dictionary<int, double>
GetNeuronsInRadius(int neuronNumber, double radius); What does it do?
For the found best matching neuron neuronNumber – it should find all neurons which are located in boundaries of provided radius. Please also note that I’m returning pairs <NeuronNumber, DistanceFromItToBMN> in result dictionary. This allow increase performance a bit, because no need to double calculate distances.
Currently I have two finished implementations of ITopology they are SimpleMatrixTopology and BoundMatrixTopology.
Difference between them is that BoundMatrixTopology is closed cover like on the picture below, for matrix 6×5 and winning neuron number 22 with radius 2.1 it returns following neighbor neurons.
As you see it also returned 4 as connected neuron.
Same situation for SimpleMatrixToplogy returns next:
So far we have visual explanation how it searches for the neighbor neurons. On the picture below we could see which distances are returned by the
For my research I’m using SimpleMatrixTopology because it is much faster than BoundMatrixTopology.
A bit about Learning Process
Learning process is build upon changing weights of the neurons in order to be more close to input data vector. But also with increasing iteration number we should shrink the radius where we search for neighbor neurons and we should decrease learning rate. Also those neurons which are close to winner should be modified more than those which are far from it. This purposes are supplied by three interfaces. They are IRadiusProvider, INeighbourhoodFunction and ILearningFactorFunction.
Answers for the question which radius we should take on the n-th iteration. Currently I have one implementation of the interface, which looks like below:
As you see when we create instance of it we define TimeConstant and that stands for:
when we calculate radius we are using next formula:
In those two formulas δ0 is half of the initial Grid radius and I setup it in constructor of Topology like: Math.Max(ColCount, RowCount) / 2.0, but of course we could choose wider initial value.
The most commonly used is Gauss Neighbourhood function, because it is quite smooth.
We provide two parameters one of them is distance between neuron i and neuron j, another parameter is current radius, calculated by RadiusProvider. So the formula of it is:
And as we said before, learning rate should decrease with time. For this purpose I have few implementations of the interface ILearningFactorFunction. I’m using ExponentionalFactorFunction
which has method:
The formula looks like:
where τ2 is equal to maximum iterations count and η0 is starting learning rate. I set it to the 0.07.
During learning process I’m using few other interfaces. They are:
ShuffleProvider – provides method using which I shuffle input space vectors on each epoch.
LearningDataProvider – gives us possibility to get i-th input vector and answers for questions how many there are vectors and what is the dimension of input space. Using this interface I can mask providing of completely random input vectors as I did in this application. Or I can provide DataPersister into it, so it will be able read data from file or whatever.
MetricFunction – is just distance measurement for our grid. I have CityBlocks and Euclidean implementations.
With mix of all described above we can build our Learning Processor. Below is full code snippet of SomLearningProcessor:
Lets talk a bit about each of the methods.
It is the main public method which could be used in outside world, so ti will learn Network using all stuff that we injected in constructor. After Lean has executed usage code could use learned Network through the public property. What this method does is: it goes through all iterations up to MaxIterationsCount, then shuffles input space (or not shuffles, it depends on shuffling provider), finds Best Matching Neuron and Accommodates Network weights.
Is the most expensive operation. As you see it goes through all the neurons in Network and calculated distance from current neuron to dataVector. Then it returns neuron with minimal distance.
bestNeuronNum, double dataVector, int iteration)
Takes BMN number, asks RadiusProvider for the radius on current iteration and then asks Topology to find all neurons connected to BMN in current radius. For each found neuron it accommodates weights in next method.
neuronNumber, double dataVector, int iteration, double
distance, double radius)
The most interesting line of code in this method is actual accommodation:
Which stands for formula:
Why did I make all methods virtual and so much protected properties? That is intent of my next post on implementation of SOM, where I’m going to talk about Parallelization of calculations in our algorithm.