Show Reference: "Essentials of the Self-Organizing Map"

Essentials of the Self-Organizing Map. Neural Networks, Vol. 37 (January 2013), pp. 52-65, doi:10.1016/j.neunet.2012.09.018, by Teuvo K. Kohonen
    abstract = {The self-organizing map ({SOM}) is an automatic data-analysis method. It is widely applied to clustering problems and data exploration in industry, finance, natural sciences, and linguistics. The most extensive applications, exemplified in this paper, can be found in the management of massive textual databases and in bioinformatics. The {SOM} is related to the classical vector quantization ({VQ}), which is used extensively in digital signal processing and transmission. Like in {VQ}, the {SOM} represents a distribution of input data items using a finite set of models. In the {SOM}, however, these models are automatically associated with the nodes of a regular (usually two-dimensional) grid in an orderly fashion such that more similar models become automatically associated with nodes that are adjacent in the grid, whereas less similar models are situated farther away from each other in the grid. This organization, a kind of similarity diagram of the models, makes it possible to obtain an insight into the topographic relationships of data, especially of high-dimensional data items. If the data items belong to certain predetermined classes, the models (and the nodes) can be calibrated according to these classes. An unknown input item is then classified according to that node, the model of which is most similar with it in some metric used in the construction of the {SOM}. A new finding introduced in this paper is that an input item can even more accurately be represented by a linear mixture of a few best-matching models. This becomes possible by a least-squares fitting procedure where the coefficients in the linear mixture of models are constrained to nonnegative values.},
    author = {Kohonen, Teuvo K.},
    doi = {10.1016/j.neunet.2012.09.018},
    issn = {08936080},
    journal = {Neural Networks},
    keywords = {som},
    month = jan,
    pages = {52--65},
    posted-at = {2012-10-09 14:40:52},
    priority = {2},
    title = {Essentials of the {Self-Organizing} Map},
    url = {},
    volume = {37},
    year = {2013}

See the CiteULike entry for more information, PDF links, BibTeX, etc.

There are border effects in SOM learning: the distribution of neurons in data space is not the same in the center and the periphery of the network/data space.

One solution to border effects is to use SOMs with cyclic, spherical, hyperspherical, or toroidal topologies.
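As a rough illustration of why a toroidal grid has no border, here is a minimal sketch of grid distance with wrap-around. The function names and the rectangular `rows` x `cols` layout are assumptions made for this example, not something taken from the paper.

```python
def flat_grid_distance(a, b):
    """Euclidean distance between grid nodes a and b on an ordinary flat grid."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


def torus_grid_distance(a, b, rows, cols):
    """Euclidean distance between grid nodes a and b when both grid axes wrap around."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    # On a torus, the shorter way around each axis counts.
    dr = min(dr, rows - dr)
    dc = min(dc, cols - dc)
    return (dr ** 2 + dc ** 2) ** 0.5
```

On a 10 x 10 grid, nodes (0, 0) and (0, 9) are far apart on the flat grid but direct neighbors on the torus, so every node has the same number of neighbors at every distance.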

Kohonen cites von der Malsburg and Amari as among the first to demonstrate input-driven self-organization in machine learning.

Kohonen implies that neighborhood interaction in SOMs is an abstraction of chemical interactions between neurons in natural brain maps, which affect those neurons' plasticity, but not their current response.

Kohonen implies that neighborhood interaction in SOMs is what separates them from earlier, more bio-inspired attempts at input-driven self-organization, and what leads to computational tractability on the one hand and proper self-organization as found in natural brain maps on the other.

Kohonen depicts SOMs as an extension of vector quantization (VQ).

Kohonen states that online learning in SOMs is less safe and slower than batch learning.
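To make the contrast concrete, here is a minimal sketch of one batch update, assuming a 1-D chain of nodes and a Gaussian neighborhood of width `sigma`. Both choices, and all names, are assumptions for illustration; the paper's own formulation may differ in detail. Each model is replaced by the neighborhood-weighted mean of all inputs, so there is no learning-rate parameter to tune.

```python
import math


def best_matching_unit(models, x):
    """Index of the model closest to x in squared Euclidean distance."""
    return min(
        range(len(models)),
        key=lambda i: sum((m - v) ** 2 for m, v in zip(models[i], x)),
    )


def batch_som_step(models, data, sigma):
    """One batch update on a 1-D chain of nodes (illustrative sketch)."""
    dim = len(models[0])
    sums = [[0.0] * dim for _ in models]
    totals = [0.0] * len(models)
    for x in data:
        c = best_matching_unit(models, x)
        for i in range(len(models)):
            # Gaussian neighborhood around the winning node c.
            h = math.exp(-((i - c) ** 2) / (2.0 * sigma ** 2))
            totals[i] += h
            for d in range(dim):
                sums[i][d] += h * x[d]
    # Keep the old model if a node received no weight at all.
    return [
        [s / totals[i] for s in sums[i]] if totals[i] > 0.0 else list(models[i])
        for i in range(len(models))
    ]
```

With a very narrow neighborhood the step reduces to a plain k-means style update, which is one way to see the relation to vector quantization.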

Kohonen names normalization of input dimensions as a remedy for differences in scaling between these dimensions. He does not cite another paper of his (with colleagues) in which he presents a SOM that learns this scaling.
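One common form of such normalization is to standardize each input dimension to zero mean and unit variance. The sketch below assumes that variant (the note does not say which normalization Kohonen has in mind), and `standardize_columns` is a name made up for this example.

```python
import statistics


def standardize_columns(data):
    """Rescale every input dimension to zero mean and unit standard deviation."""
    columns = list(zip(*data))
    means = [statistics.fmean(c) for c in columns]
    # A constant dimension has zero spread; divide by 1.0 instead of 0.0.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return [
        [(v - m) / s for v, m, s in zip(row, means, stds)]
        for row in data
    ]
```

After this step, a dimension measured in hundreds no longer dominates the Euclidean distance over one measured in single units.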

Kohonen discusses some of the challenges involved in using SOMs for text clustering:

  • words have different importance depending on their absolute frequency,
  • some words occurring very rarely or very commonly must be discarded.
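The second point can be sketched as a simple vocabulary filter. The thresholds `min_count` and `max_doc_fraction` are hypothetical parameters chosen for this example; the note does not record the cut-offs used in the paper.

```python
from collections import Counter


def filter_vocabulary(documents, min_count, max_doc_fraction):
    """Keep words that are neither very rare nor present in almost every document.

    documents: list of token lists.
    """
    total = Counter(word for doc in documents for word in doc)
    doc_freq = Counter(word for doc in documents for word in set(doc))
    n_docs = len(documents)
    return sorted(
        word
        for word in total
        if total[word] >= min_count
        and doc_freq[word] / n_docs <= max_doc_fraction
    )
```

The surviving words would then form the dimensions of the document vectors fed to the SOM, possibly with a frequency-dependent weight per word.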

Kohonen states that early SOMs were meant to model brain maps and how they come to be.

So, I've returned to the roots and found something interesting for applications!

Kohonen says the main virtue of SOMs lies in data visualization.

Kohonen groups applications of SOMs into:

  • statistical methods
    • exploratory data analysis
    • statistical analysis in organization of texts
  • industrial analyses, control, telecommunications
  • financial applications

Kohonen advises initializing large SOMs so that their initial organization already resembles the expected final one, which speeds up convergence.
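One simple way to obtain an already ordered starting state is to interpolate the models between four corner vectors. This is only an illustrative sketch and not necessarily the scheme used in the paper; the `corners` argument is hypothetical and could come, for instance, from a coarse estimate of the data's extent.

```python
def ordered_initial_grid(corners, rows, cols):
    """Bilinearly interpolate initial models between four corner vectors.

    corners = (top_left, top_right, bottom_left, bottom_right).
    Returns grid[r][c], a list of floats for the node at row r, column c.
    """
    tl, tr, bl, br = corners
    grid = []
    for r in range(rows):
        v = r / (rows - 1) if rows > 1 else 0.0
        row = []
        for c in range(cols):
            u = c / (cols - 1) if cols > 1 else 0.0
            row.append([
                (1 - u) * (1 - v) * a + u * (1 - v) * b
                + (1 - u) * v * p + u * v * q
                for a, b, p, q in zip(tl, tr, bl, br)
            ])
        grid.append(row)
    return grid
```

Neighboring nodes start with similar models, so training only has to refine the map rather than untangle a random one.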


Kohonen proposes a version of the SOM in which models (SOM units' weight vectors) are combined linearly from the n best matching units for a given input to optimally describe that input.

The weights for the linear combination are obtained by a least-squares fit of the mixture to the input, with the coefficients constrained to nonnegative values.
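The abstract describes the fit as a least-squares procedure with nonnegative coefficients. The sketch below shows one way such a fit could be computed, using projected gradient descent as a stand-in solver. The solver, the function names, and the iteration count are assumptions for this example, not the procedure from the paper.

```python
def best_matching_models(models, x, n):
    """The n models closest to x in squared Euclidean distance."""
    return sorted(
        models, key=lambda m: sum((a - b) ** 2 for a, b in zip(m, x))
    )[:n]


def nonnegative_mixture(models, x, iterations=2000):
    """Coefficients k_i >= 0 minimizing ||x - sum_i k_i * models[i]||^2.

    Uses projected gradient descent (a stand-in solver for illustration).
    """
    def dot(a, b):
        return sum(p * q for p, q in zip(a, b))

    n = len(models)
    gram = [[dot(mi, mj) for mj in models] for mi in models]
    target = [dot(m, x) for m in models]
    # The trace of the Gram matrix bounds its largest eigenvalue,
    # so this step size is small enough for the iteration to be stable.
    step = 1.0 / max(sum(gram[i][i] for i in range(n)), 1e-12)
    k = [0.0] * n
    for _ in range(iterations):
        grad = [dot(gram[i], k) - target[i] for i in range(n)]
        # Gradient step followed by projection onto k_i >= 0.
        k = [max(0.0, ki - step * g) for ki, g in zip(k, grad)]
    return k
```

A typical use would be to pick a few candidates with `best_matching_models` and pass them to `nonnegative_mixture`; models that would need a negative weight simply end up with a coefficient of zero.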