KohonenAlgorithm Archives

Lets quickly remind how does Self-Organizing Map works and which are building blocks of it.

In my implementation I have processor named SomLearningProcessor which does all main algorithm steps. After you created instance of that class you are able to call void Learn() method, which goes through all iterations finds best matching neuron with method int FindBestMatchingNeuron(double[] dataVector) and then it accommodates network with using dataVector and found BMN – this is done with method void AccommodateNetworkWeights(int bestNeuronNum, double[] dataVector, int
iteration). I will return back to this processor with more details, I just wanted to give quick vision what is it.

So lets move to building blocks of our learning algorithm:

Network

First of all it has Network, which I inject into our SomLearningProcessor through the INetwork interface:

public interface INetwork

{

IList<INeuron> Neurons { get; set; }

ITopology Topology { get; set; }

}

As you see it has list of Neurons which should fill the leaning grid, which is provided through the Topology interface. The default implementation of the network interface supplies possibility to randomize neurons with specifying minWeights and maxWeights boundaries for randomizing.

public class NetworkBase : INetwork

{

public NetworkBase(bool randomize, IList<double> maxWeights, IActivationFunction activationFunction, ITopology topology)

{…

Also we are injecting IActivationFunction for our neurons, so they will calculate reaction basing on this interface. Interface provides method double GetResult(double inputValue);
There are following concrete implementations of the interface IActivationFunction:

HardLimitActivationFunction
LinearActivationFunction
SymmetricHardLimitActivationFunction
TransparentActivationFunction

For Self-Organizing Maps I’m using TransparentActivationFunction, which just returns summarized value.

Topology

Another interesting thing that we are using during learning process is Topology. Interface looks like this:

public interface ITopology

{

int
NeuronsCount { get; }

int RowCount { get;
}

int ColCount { get;
}

double WholeTopologyRadius { get; }

int
GetNeuronNumber(Location
location);

Location
GetNeuronLocation(int neuronNumber);

IList<int> GetDirectlyConnectedNeurons(int neuronNumber);

Dictionary<int, double>
GetNeuronsInRadius(int neuronNumber, double radius);

}

As you see Topology consists with Rows and Columns as usual matrix, but the main point of interest is the method: Dictionary<int, double>

GetNeuronsInRadius(int neuronNumber, double radius); What does it do?
For the found best matching neuron neuronNumber – it should find all neurons which are located in boundaries of provided radius. Please also note that I’m returning pairs <NeuronNumber, DistanceFromItToBMN> in result dictionary. This allow increase performance a bit, because no need to double calculate distances.
Currently I have two finished implementations of ITopology they are SimpleMatrixTopology and BoundMatrixTopology.

Difference between them is that BoundMatrixTopology is closed cover like on the picture below, for matrix 6×5 and winning neuron number 22 with radius 2.1 it returns following neighbor neurons.

As you see it also returned 4 as connected neuron.

Same situation for SimpleMatrixToplogy returns next:

So far we have visual explanation how it searches for the neighbor neurons. On the picture below we could see which distances are returned by the
GetNeuronsInRadius method.

For my research I’m using SimpleMatrixTopology because it is much faster than BoundMatrixTopology.

A bit about Learning Process

Learning process is build upon changing weights of the neurons in order to be more close to input data vector. But also with increasing iteration number we should shrink the radius where we search for neighbor neurons and we should decrease learning rate. Also those neurons which are close to winner should be modified more than those which are far from it. This purposes are supplied by three interfaces. They are IRadiusProvider, INeighbourhoodFunction and ILearningFactorFunction.

RadiusProvider

Answers for the question which radius we should take on the n-th iteration. Currently I have one implementation of the interface, which looks like below:

public class DefaultRadiusProvider : IRadiusProvider

{

public int MaxIterations { get; private set; }

public double
TopologyRadius { get; private set;
}

public
double TimeConstant { get; private
set; }

public
DefaultRadiusProvider(int
maxIterations, double startRadius)

{

MaxIterations = maxIterations;

TopologyRadius = startRadius;

TimeConstant = maxIterations / Math.Log(startRadius);

}

public double GetRadius(int
iteration)

{

return
TopologyRadius*Math.Exp(-iteration/TimeConstant);

}

As you see when we create instance of it we define TimeConstant and that stands for:

when we calculate radius we are using next formula:

In those two formulas δ0 is half of the initial Grid radius and I setup it in constructor of Topology like: Math.Max(ColCount, RowCount) / 2.0, but of course we could choose wider initial value.

NeighbourhoodFunction

The most commonly used is Gauss Neighbourhood function, because it is quite smooth.

public class GaussNeighbourhoodFunction : INeighbourhoodFunction

{

public
double GetDistanceFalloff(double distance, double
radius)

{

double denominator = 2 *
radius * radius;

return Math.Exp(- distance / denominator);

}

We provide two parameters one of them is distance between neuron i and neuron j, another parameter is current radius, calculated by RadiusProvider. So the formula of it is:

LearningFactorFunction

And as we said before, learning rate should decrease with time. For this purpose I have few implementations of the interface ILearningFactorFunction. I’m using ExponentionalFactorFunction
which has method:

public
override double
GetLearningRate(double iteration)

{

return StartLearningRate * Math.Exp(- iteration /
MaxIterationsCount);

}

The formula looks like:

where τ2 is equal to maximum iterations count and η0 is starting learning rate. I set it to the 0.07.

Other Interfaces

During learning process I’m using few other interfaces. They are:

ShuffleProvider – provides method using which I shuffle input space vectors on each epoch.

LearningDataProvider – gives us possibility to get i-th input vector and answers for questions how many there are vectors and what is the dimension of input space. Using this interface I can mask providing of completely random input vectors as I did in this application. Or I can provide DataPersister into it, so it will be able read data from file or whatever.

MetricFunction – is just distance measurement for our grid. I have CityBlocks and Euclidean implementations.

SomLearningProcessor

With mix of all described above we can build our Learning Processor. Below is full code snippet of SomLearningProcessor:

public class SomLearningProcessor

{

public INetwork Network { get; private
set; }

public ITopology Topology { get; private set; }

protected
IShuffleProvider
ShuffleProvider { get; set; }

protected ILearningDataProvider LearningDataProvider { get; set; }

protected
IRadiusProvider
RadiusProvider { get; private set;
}

protected
INeighbourhoodFunction
NeighbourhoodFunction { get; set; }

protected IMetricFunction MetricFunction { get; set; }

protected
ILearningFactorFunction
LearningFactorFunction { get; set; }

protected
int MaxIterationsCount { get; set; }

public SomLearningProcessor(

ILearningDataProvider
learningDataProvider,

INetwork network,

IMetricFunction
metricFunction,

ILearningFactorFunction
learningFactorFunction,

INeighbourhoodFunction
neighbourhoodFunction,

int maxIterationsCount,

IShuffleProvider
shuffleProvider)

{

LearningDataProvider =
learningDataProvider;

Network =
network;

Topology =
network.Topology;

MetricFunction =
metricFunction;

LearningFactorFunction = learningFactorFunction;

NeighbourhoodFunction = neighbourhoodFunction;

MaxIterationsCount =
maxIterationsCount;

ShuffleProvider = shuffleProvider;

RadiusProvider = new DefaultRadiusProvider(maxIterationsCount,
Topology.WholeTopologyRadius);

}

public virtual
void Learn()

{

int vectorsCount =
LearningDataProvider.LearningVectorsCount;

IList<int> suffleList = new ShuffleList(vectorsCount);

for (int
iteration = 0; iteration < MaxIterationsCount; iteration++)

{

suffleList = ShuffleProvider.Suffle(suffleList);

for (int
dataInd = 0; dataInd < vectorsCount; dataInd++)

{

double[] dataVector =
LearningDataProvider.GetLearingDataVector(suffleList[dataInd]);

int bestNeuronNum =
FindBestMatchingNeuron(dataVector);

AccommodateNetworkWeights(bestNeuronNum, dataVector, iteration);

}

protected
virtual int
FindBestMatchingNeuron(double[]
dataVector)

{

int result = -1;

Double minDistance = Double.MaxValue;

for (int i
= 0; i < Network.Neurons.Count; i++)

{

double distance =
MetricFunction.GetDistance(Network.Neurons[i].Weights, dataVector);

if
(distance < minDistance)

{

minDistance = distance;

result = i;

}

return
result;

}

protected
virtual void
AccommodateNetworkWeights(int
bestNeuronNum, double[] dataVector, int iteration)

{

var radius = RadiusProvider.GetRadius(iteration);

var
effectedNeurons = Topology.GetNeuronsInRadius(bestNeuronNum, radius);

foreach (var
effectedNeuron in
effectedNeurons.Keys)

{

var
distance = effectedNeurons[effectedNeuron];

AccommodateNeuronWeights(effectedNeuron, dataVector, iteration,
distance, radius);

}

protected
virtual void
AccommodateNeuronWeights(int
neuronNumber, double[] dataVector, int iteration, double
distance, double radius)

{

var neuronWghts =
Network.Neurons[neuronNumber].Weights;

var
learningRate = LearningFactorFunction.GetLearningRate(iteration);

var
falloffRate = NeighbourhoodFunction.GetDistanceFalloff(distance,
radius);

for (int
i = 0; i < neuronWghts.Length; i++)

{

double weight = neuronWghts[i];

neuronWghts[i] += learningRate * falloffRate *
(dataVector[i] – weight);

}

Lets talk a bit about each of the methods.

public virtual
void Learn()

It is the main public method which could be used in outside world, so ti will learn Network using all stuff that we injected in constructor. After Lean has executed usage code could use learned Network through the public property. What this method does is: it goes through all iterations up to MaxIterationsCount, then shuffles input space (or not shuffles, it depends on shuffling provider), finds Best Matching Neuron and Accommodates Network weights.

protected
virtual int
FindBestMatchingNeuron(double[]
dataVector)

Is the most expensive operation. As you see it goes through all the neurons in Network and calculated distance from current neuron to dataVector. Then it returns neuron with minimal distance.

protected
virtual void
AccommodateNetworkWeights(int
bestNeuronNum, double[] dataVector, int iteration)

Takes BMN number, asks RadiusProvider for the radius on current iteration and then asks Topology to find all neurons connected to BMN in current radius. For each found neuron it accommodates weights in next method.

protected
virtual void
AccommodateNeuronWeights(int
neuronNumber, double[] dataVector, int iteration, double
distance, double radius)

The most interesting line of code in this method is actual accommodation:

neuronWghts[i] = neuronWghts[i] + learningRate * falloffRate *
(dataVector[i] – neuronWghts[i]);

Which stands for formula:

Why did I make all methods virtual and so much protected properties? That is intent of my next post on implementation of SOM, where I’m going to talk about Parallelization of calculations in our algorithm.

Одним із хороших тестів Самоорганізаційної карти є двовимірна інтерпретація двовимірного вхідного простору. Нейрони нейромережі почитають заповняти той простір із якого приходять вхідні вектори, таким чином можна наочно спостерігати за тим чи нейромережа дає хороші результати.

Для цього я використав решітку розміру 10 на 10. Нейромережа була початково ініціалізована випадковими значеннями нейронів від 0.25 до 0.75. Таким чином якщо постійно вчити нейромережу випадковими значеннями (від 0 до 1), то гратка повинна “розвернутися” на всю площину.

Я використовую 500 кроків на кожному із яких проганяю 100 випадкових точок для того щоб згладити навчання.

Ініціалізація:

Перша ітерація:

Друга:

300:

Остання (500та):

Якщо вхідні значення подавати не із всієї множини X [0, 1], Y[0, 1], а із множини що займає певний об”єкт, то карта просто дуже легко впишеться в нього.

Щоб переконатися в цьому я використав силует дівчини і для навчання подавав тільки точки, які входили в площину, де знаходиться дівчина.

Зробити це для мене було, корисно, оскільки я таки знайшов недолугі частини моєї імплементації і виправив декілька помилок в алгоритмі. Надалі я можу спокійно працювати над паралелізацією обрахунків.

Я сподіваюся що вам сподобалося. Коментарі?

KohonenAlgorithm

My implementation of Self-Organizing Map

Self-Organizing Maps – Заженемо речі в гратку

Possible Parallel Algorithm for Self-Organizing Maps

Subscribe To This Blog

Highlights

Recent Posts

Recent Comments

Archives