This project is intended to help students better understand Self-Organizing Maps. The main contribution is a program that allows training Self-Organizing Maps on structured datasets and visualizing both the steps taken by the algorithm and the resulting Self-Organizing Map. Chapters 3 and 4 of this project explain the theoretical aspects of SOMs and their implementation in a software project. Chapter 5 focuses on explaining different aspects of the related program.
A Self-Organizing Map (referred to as SOM from now on) is a computational method for the visualization and analysis of high-dimensional data. SOMs allow higher-dimensional data to be clustered and mapped onto a lower-dimensional object, usually a 2-dimensional grid. They are a type of neural network that uses unsupervised learning. The SOM structure and algorithm are explained in more detail in chapter 3.
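As a rough illustration of this mapping, the sketch below trains a small rectangular SOM in plain Python. It is not the program described in this project. The function names (train_som, best_matching_unit), the random initialization, the exponential decay of the learning rate and neighbourhood radius, and the Gaussian neighbourhood function are all assumptions chosen for brevity; chapter 3 describes the algorithm actually used.

```python
import math
import random


def best_matching_unit(weights, sample):
    """Return the (row, col) of the node whose weights are closest to sample."""
    best, best_dist = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            # Squared Euclidean distance in the input (data) space.
            d = sum((a - b) ** 2 for a, b in zip(w, sample))
            if d < best_dist:
                best, best_dist = (r, c), d
    return best


def train_som(data, rows, cols, iterations=1000, lr0=0.5, seed=0):
    """Train a rows x cols SOM on data and return its grid of weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per grid node, initialized at random in [0, 1).
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    radius0 = max(rows, cols) / 2.0

    for t in range(iterations):
        # Assumed schedule: learning rate and radius both decay exponentially.
        decay = math.exp(-t / iterations)
        lr = lr0 * decay
        radius = radius0 * decay

        sample = rng.choice(data)
        bmu_r, bmu_c = best_matching_unit(weights, sample)

        for r in range(rows):
            for c in range(cols):
                # Distance on the 2D grid, not in the data space.
                grid_dist_sq = (r - bmu_r) ** 2 + (c - bmu_c) ** 2
                influence = math.exp(-grid_dist_sq / (2.0 * radius ** 2))
                w = weights[r][c]
                # Pull the node towards the sample, more strongly near the BMU.
                for i in range(dim):
                    w[i] += lr * influence * (sample[i] - w[i])
    return weights
```

Calling train_som with a list of 3-dimensional vectors and a 10 by 10 grid, for example, yields 100 weight vectors arranged so that neighbouring grid nodes tend to hold similar values, which is what makes the 2-dimensional grid usable as a map of the higher-dimensional data.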
SOMs are used in many data organization tasks, such as categorizing images. While SOMs offer unsupervised learning, they often do not reach an optimal distribution and require further specification and classification by an overseer.
SOMs can be a difficult subject to comprehend and implement, and some assistance in learning about them might be required. In the opinion of the authors, visualization is a key factor in understanding how SOMs work. There are many demos and videos of SOMs in action available on the Internet. Some of them are mentioned in Chapter 2. However, few show any helpful information about how the different variables within the algorithm behave during training.
The goal of this project is to create a program that implements SOMs and allows for visualizing different aspects of the training process and the trained SOM.
Many different SOM visualization tools have been created for educational purposes. In this section we cover only those that are released under a public licence.
Christian Borgelt has created a SOM visualization tool called wsom (or xsom for the Linux version). This tool shows a grid of interconnected nodes being trained to arrange itself into a rectangular 2D grid. The SOM being trained is redrawn after every cycle of the SOM algorithm, which creates a live view of the training process. The program does not present any other data about the SOM or its training process, nor does it allow loading different datasets for training. A screen capture of wsom in action can be seen in Figure 1.
Figure 1 - A screen capture of the wsom program in action 
Another illustrative program was created by Tom Germano. It uses colors, with their three base components of red, green and blue, as training data. It is a Java applet that can be configured to run from different starting positions and for varying numbers of iterations. While some characteristics of the training process can be inferred from the visuals that the program provides while it is running, it does not explicitly provide any information about how different aspects of the training algorithm change over time, nor does it allow loading different datasets for training. A screen capture of the program in action can be seen in Figure 2.
JavaSOM is a SOM visualization tool created by Tomi Suuronen for his Bachelor’s Thesis. It allows loading training data in XML format, and the trained SOM can be saved in XML, SVG and PDF formats. The last two are presented as images. While this program accepts different datasets as input, it does not show any information about the training process itself. A screen capture of the program’s results in SVG format can be seen in Figure 3.
Figure 3 - The results of JavaSOM in SVG format
As there are already several programs that show how a SOM organizes itself, we decided to focus on a slightly different area, namely how the different elements of the SOM training algorithm behave while the algorithm is running.