Efficient Implementation of Self-Organizing Map
for Sparse Input Data
Josué Melka and Jean-Jacques Mariage
1 Introduction
SOM Description
Standard vs. Batch Versions
Motivation for Sparse SOM
2 Contributions
3 Experiments
Presentation: The SOM Network
Self-Organizing Map (Kohonen 1982):
an artificial neural network
trained by unsupervised competitive learning
produces a low-dimensional map of the input space
Many applications
Commonly used for data projection, clustering, etc.
The SOM Map
The map: units (= nodes/neurons) on a lattice.
Associated with each node:
a weight vector
a position in the map
http://www.lohninger.com
Example: MNIST Handwritten Dataset
[Figures: a random sampling of MNIST digits, and the resulting SOM map (12×16 units).]
The SOM Training Algorithm
Within the main loop:
1 compute the distance between input and weight vectors
$d_k(t) = \| x(t) - w_k(t) \|^2$  (1)
2 find the node weight closest to the input (BMU)
$d_c(t) = \min_k d_k(t)$  (2)
3 update the BMU and its neighbors to be closer to the input
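To make these steps concrete, here is a minimal C++ sketch of steps 1 and 2 for dense vectors (not taken from the authors' implementation; the names `sq_distance` and `find_bmu` are illustrative):

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Step 1: squared Euclidean distance between input x and weight vector w (eq. 1).
double sq_distance(const std::vector<double>& x, const std::vector<double>& w) {
    double d = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        double diff = x[i] - w[i];
        d += diff * diff;
    }
    return d;
}

// Step 2: best-matching unit (BMU), the node whose weights are closest to x (eq. 2).
std::size_t find_bmu(const std::vector<double>& x,
                     const std::vector<std::vector<double>>& weights) {
    std::size_t best = 0;
    double best_d = std::numeric_limits<double>::max();
    for (std::size_t k = 0; k < weights.size(); ++k) {
        double d = sq_distance(x, weights[k]);
        if (d < best_d) { best_d = d; best = k; }
    }
    return best;
}
```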
Standard Algorithm: The Learning Rule
Update weight vectors at each time step, for a random sample x:
$w_k(t+1) = w_k(t) + \alpha(t)\, h_{ck}(t)\, [x(t) - w_k(t)]$  (3)
$\alpha(t)$ is the decreasing learning rate
$h_{ck}(t)$ is the neighborhood function
For example, the Gaussian:
$h_{ck}(t) = \exp\!\left( - \dfrac{\| r_k - r_c \|^2}{2 \sigma(t)^2} \right)$
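A hedged C++ sketch of one online update step (3) with this Gaussian neighborhood, assuming a 2-D grid of node positions and dense weight vectors (the function name and signature are hypothetical):

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// One step of the standard (online) learning rule (eq. 3).
// 'positions' holds the grid coordinates r_k of each node; 'c' is the BMU index.
void update_standard(std::vector<std::vector<double>>& weights,
                     const std::vector<std::pair<double, double>>& positions,
                     const std::vector<double>& x,
                     std::size_t c, double alpha, double sigma) {
    for (std::size_t k = 0; k < weights.size(); ++k) {
        double dx = positions[k].first  - positions[c].first;
        double dy = positions[k].second - positions[c].second;
        // Gaussian neighborhood h_ck(t)
        double h = std::exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma));
        double coef = alpha * h;
        for (std::size_t i = 0; i < x.size(); ++i)
            weights[k][i] += coef * (x[i] - weights[k][i]);
    }
}
```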
The Batch Algorithm
The BMUs and their neighbors are updated once at the end of
each epoch, with the average of all the samples that trigger them:
$w_k(t_f) = \dfrac{\sum_{t'=t_0}^{t_f} h_{ck}(t')\, x(t')}{\sum_{t'=t_0}^{t_f} h_{ck}(t')}$  (4)
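A minimal sketch of how the batch rule (4) can be implemented: accumulate the weighted sums over one epoch, then apply them at its end. The `BatchAccumulator` type is illustrative, not part of the software described here.

```cpp
#include <cstddef>
#include <vector>

// Batch update (eq. 4): accumulate h_ck * x and h_ck over one epoch,
// then set each weight vector to the weighted mean of the samples.
struct BatchAccumulator {
    std::vector<std::vector<double>> numerator;   // sum of h_ck(t') * x(t')
    std::vector<double>              denominator; // sum of h_ck(t')

    BatchAccumulator(std::size_t nodes, std::size_t dim)
        : numerator(nodes, std::vector<double>(dim, 0.0)), denominator(nodes, 0.0) {}

    void accumulate(std::size_t k, double h, const std::vector<double>& x) {
        for (std::size_t i = 0; i < x.size(); ++i)
            numerator[k][i] += h * x[i];
        denominator[k] += h;
    }

    void apply(std::vector<std::vector<double>>& weights) const {
        for (std::size_t k = 0; k < weights.size(); ++k)
            if (denominator[k] > 0.0)
                for (std::size_t i = 0; i < weights[k].size(); ++i)
                    weights[k][i] = numerator[k][i] / denominator[k];
    }
};
```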
Motivation: Reduce the Computing Time
Computing time depends on:
1 the T training iterations
2 the M nodes in the network
3 the D dimensions of the vectors
The issue: large sparse datasets
Many real-world datasets are sparse and high-dimensional, but existing SOM implementations cannot exploit this sparseness to save time.
Dataset Examples: Dense vs. Sparse
MNIST dataset (dim: 780, density: 19.22%)
News-20 dataset (dim: 62061, density: 0.13%)
Overcoming the Dimensionality Problem
A popular option: reduce the dimensionality of the input space
Use space-reduction techniques that are less sensitive to dimensionality (such as SVD, random mapping, etc.).
But is this the only way?
Let f be the fraction of non-zero values, and d = D × f.
Can we reduce the SOM complexity from O(TMD) to O(TMd)?
1 Introduction
2 Contributions
Sparse Algorithms
Parallelism
3 Experiments
Compressed Sparse Row (CSR) Format
https://op2.github.io/PyOP2/linear_algebra.html
Is this sufficient to reduce the computing time?
CSR supports efficient linear algebra operations. But:
node weights must be dense for training
not all operations produce sparse output
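For illustration, a bare-bones CSR representation and a sparse dot product against a dense weight vector might look as follows (a sketch, not the actual data structures of the implementation):

```cpp
#include <cstddef>
#include <vector>

// Compressed Sparse Row: only the non-zero values of each sample are stored.
struct CSRMatrix {
    std::vector<double>      values;   // non-zero values, row by row
    std::vector<std::size_t> colidx;   // column index of each stored value
    std::vector<std::size_t> rowptr;   // rowptr[r] .. rowptr[r+1] delimit row r
};

// Sparse dot product between row r of the CSR matrix and a dense vector w.
double sparse_dot(const CSRMatrix& m, std::size_t r, const std::vector<double>& w) {
    double s = 0.0;
    for (std::size_t p = m.rowptr[r]; p < m.rowptr[r + 1]; ++p)
        s += m.values[p] * w[m.colidx[p]];
    return s;
}
```

The dot product visits only the stored non-zeros of the row, which is what makes the per-sample cost proportional to d rather than D.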
Speeding Up the BMU Search
The squared Euclidean distance (1) can be expanded as:
$d_k(t) = \| x(t) \|^2 - 2\, (w_k(t) \cdot x(t)) + \| w_k(t) \|^2$  (5)
Batch version
This change suffices to make the sparse batch SOM efficient, since $\| w_k \|^2$ can be computed once per epoch.
Standard version
By storing $\| w_k(t) \|^2$, we can compute $\| w_k(t+1) \|^2$ efficiently:
$\| w_k(t+1) \|^2 = (1 - \beta(t))^2 \| w_k(t) \|^2 + \beta(t)^2 \| x(t) \|^2 + 2 \beta(t) (1 - \beta(t)) (w_k(t) \cdot x(t))$  (6)
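A sketch of the BMU search using this decomposition, with the sparse sample given as parallel index/value arrays and $\|w_k\|^2$ cached in `w_sqnorm` (names and signature are illustrative):

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// BMU search with the expanded distance (eq. 5):
//   d_k(t) = ||x||^2 - 2 (w_k . x) + ||w_k||^2
// The sparse sample x is given as parallel arrays (x_idx, x_val); ||w_k||^2 is
// cached in w_sqnorm, so the per-node cost depends only on the non-zeros of x.
std::size_t find_bmu_sparse(const std::vector<std::size_t>& x_idx,
                            const std::vector<double>& x_val,
                            const std::vector<std::vector<double>>& weights,
                            const std::vector<double>& w_sqnorm) {
    double x_sqnorm = 0.0;
    for (double v : x_val) x_sqnorm += v * v;

    std::size_t best = 0;
    double best_d = std::numeric_limits<double>::max();
    for (std::size_t k = 0; k < weights.size(); ++k) {
        double dot = 0.0;
        for (std::size_t p = 0; p < x_idx.size(); ++p)
            dot += x_val[p] * weights[k][x_idx[p]];
        double d = x_sqnorm - 2.0 * dot + w_sqnorm[k];
        if (d < best_d) { best_d = d; best = k; }
    }
    return best;
}
```

In the batch version `w_sqnorm` only needs to be recomputed once per epoch; in the standard version it can be maintained incrementally with (6).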
Sparse-Som: Speeding Up the Update Phase
We express the learning rule (3) as (Natarajan 1997):
$w_k(t+1) = (1 - \beta(t)) \left[ w_k(t) + \dfrac{\beta(t)}{1 - \beta(t)}\, x(t) \right]$  (7)
Don’t update entire weight vectors
We keep the scalar coefficient separately, so we update only the
values affected by x(t).
Numerical stability
To avoid numerical stability issues, we use double-precision
floating-point, and rescale the weights when needed.
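A possible C++ sketch of this trick: the weight vector is stored as `scale * v`, so the multiplication by (1 − β) costs O(1) and only the non-zero components of x(t) are touched. The `ScaledVector` type and the rescaling threshold are illustrative, not the authors' code.

```cpp
#include <cstddef>
#include <vector>

// Sparse online update (eq. 7): the weight vector w_k is stored as scale * v.
// Here beta = alpha(t) * h_ck(t), as in the learning rule (3).
struct ScaledVector {
    std::vector<double> v;   // unscaled weight values
    double scale = 1.0;      // the actual weights are scale * v

    void update(const std::vector<std::size_t>& x_idx,
                const std::vector<double>& x_val, double beta) {
        scale *= (1.0 - beta);                          // shrink the whole vector in O(1)
        double coef = beta / scale;                     // beta/(1-beta), adjusted by the accumulated scale
        for (std::size_t p = 0; p < x_idx.size(); ++p)  // touch only the non-zeros of x(t)
            v[x_idx[p]] += coef * x_val[p];
        if (scale < 1e-100)                             // illustrative threshold: rescale before underflow
            rescale();
    }

    void rescale() {                                    // fold the accumulated scale back into v
        for (double& value : v) value *= scale;
        scale = 1.0;
    }
};
```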
Parallel Approaches for SOM
Specialized hardware (historical)
Massively parallel computing
Shared-memory multiprocessing
http://www.nersc.gov/users/computational-systems/cori
How to Split the Computation
Network partitioning divides the neurons
Data partitioning dispatches the input data
What We Did
Sparse-Som: hard to parallelize
not suited to data partitioning
too much latency with network partitioning
Sparse-BSom: much simpler
suited to both data and network partitioning (see the sketch below)
less synchronization needed
Another issue specific to sparseness
Memory-access latency, caused by the non-linear access pattern to the weight vectors.
Mitigation: improve processor cache locality for the weight-vector accesses in the inner loop.
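As an illustration of data partitioning for the batch version, a skeleton of one OpenMP-parallel epoch might look like this (a sketch assuming per-thread partial sums merged once per epoch; not the actual Sparse-BSom code):

```cpp
#include <cstddef>
#include <vector>
#include <omp.h>

// Data partitioning for the batch version: each thread accumulates the batch
// sums (eq. 4) over its own slice of the samples, then the per-thread partial
// sums are reduced into the shared accumulators once per epoch.
void batch_epoch(std::size_t n_samples, std::size_t n_nodes, std::size_t dim,
                 std::vector<std::vector<double>>& numerator,
                 std::vector<double>& denominator) {
    #pragma omp parallel
    {
        // Thread-private partial sums: no synchronization in the sample loop.
        std::vector<std::vector<double>> num(n_nodes, std::vector<double>(dim, 0.0));
        std::vector<double> den(n_nodes, 0.0);

        #pragma omp for schedule(static)
        for (long t = 0; t < static_cast<long>(n_samples); ++t) {
            // ... find the BMU of sample t, then accumulate h_ck * x(t) into
            //     num and h_ck into den for the BMU and its neighbors ...
        }

        // Merge the partial sums into the shared accumulators, once per thread.
        #pragma omp critical
        {
            for (std::size_t k = 0; k < n_nodes; ++k) {
                denominator[k] += den[k];
                for (std::size_t i = 0; i < dim; ++i)
                    numerator[k][i] += num[k][i];
            }
        }
    }
}
```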
1 Introduction
2 Contributions
3 Experiments
Evaluation
Speed benchmark
Quality test
The Evaluation
We trained SOM networks
on various datasets
with the same parameters
repeating each test 5 times
then measured their performance (speed and quality).
Our speed baseline
Somoclu (Wittek et al. 2017) is a massively parallel batch SOM implementation that uses the classical algorithm.
Datasets Characteristics
dataset    classes  features  samples (train / test)  density % (train / test)
rcv1            53     47236       15564 / 518571           0.14 / 0.14
news20          20     62061        15933 / 3993            0.13 / 0.13
sector         105     55197         6412 / 3207            0.29 / 0.30
mnist           10       780        60000 / 10000          19.22 / 19.37
usps            10       256         7291 / 2007          100.00 / 100.00
letter          26        16        15000 / 5000          100.00 / 100.00
protein          3       357        17766 / 6621           29.00 / 26.06
dna              3       180         2000 / 1186           25.34 / 25.14
satimage         6        36         4435 / 2000           98.99 / 98.96
Training Parameters
Each network:
30 × 40 units grid
Rectangular lattice
t_max = 10 × N_samples (i.e. K_epochs = 10)
α(t) = 1 − t/t_max
Gaussian neighborhood, with a radius decreasing linearly from 15 to 0.5
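For reference, these two decay schedules can be written directly as follows (a trivial sketch with the values from this slide; the function names are illustrative):

```cpp
#include <cstddef>

// Linear learning-rate decay: alpha(t) = 1 - t / t_max.
double learning_rate(std::size_t t, std::size_t t_max) {
    return 1.0 - static_cast<double>(t) / t_max;
}

// Neighborhood radius decreasing linearly from 15 to 0.5 over the training.
double radius(std::size_t t, std::size_t t_max) {
    const double r0 = 15.0, r1 = 0.5;
    return r0 + (r1 - r0) * static_cast<double>(t) / t_max;
}
```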
Speed Benchmark
4 datasets used (2 sparse and 2 dense), to test both:
Serial mode (Sparse-Som vs. Sparse-BSom)
Parallel mode (Sparse-BSom vs. Somoclu)
Hardware and system specifications:
Intel Xeon E5-4610
4 sockets of 6 cores each
clocked at 2.4 GHz
2 threads / core
Linux Ubuntu 16.04 (64-bit)
GCC 5.4
Results: Serial Performance
[Bar chart: elapsed time in seconds, serial mode]
             usps   mnist   sector   news20   rcv1
Sparse-Som     30     203       94      144    123
Sparse-BSom    50     302       34       44     36
Results: Parallel Performance
[Two log-scale plots: elapsed time (seconds) vs. number of CPU cores (1 to 32), for Somoclu and for Sparse-BSom, each on the sector, news20, mnist and usps datasets.]
Quality Evaluation: Methodology
Metrics used:
1 Average Quantization Error
$Q = \dfrac{\sum_{i=1}^{N} \| x_i - w_c \|}{N}$  (8), where $w_c$ is the weight vector of the BMU of $x_i$
2 Topological Error
3 Precision and Recall in classification
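A hedged sketch of the first metric, the average quantization error (8), computed here over dense vectors for simplicity (an illustrative function; the actual implementation works on sparse data):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Average quantization error (eq. 8): the mean distance between each sample
// and the weight vector of its best-matching unit.
double avg_quantization_error(const std::vector<std::vector<double>>& samples,
                              const std::vector<std::vector<double>>& weights) {
    double total = 0.0;
    for (const auto& x : samples) {
        double best = std::numeric_limits<double>::max();
        for (const auto& w : weights) {
            double d = 0.0;
            for (std::size_t i = 0; i < x.size(); ++i) {
                double diff = x[i] - w[i];
                d += diff * diff;
            }
            best = std::min(best, d);
        }
        total += std::sqrt(best);   // eq. (8) uses the non-squared norm
    }
    return total / samples.size();
}
```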
Results: Average Quantization Error
Sparse-Som Sparse-BSom
rcv1 0.825 ± 0.001 0.816 ± 0.001
news20 0.905 ± 0.000 0.901 ± 0.001
sector 0.814 ± 0.001 0.772 ± 0.003
mnist 4.400 ± 0.001 4.500 ± 0.008
usps 3.333 ± 0.002 3.086 ± 0.006
protein 2.451 ± 0.000 2.450 ± 0.001
dna 4.452 ± 0.006 3.267 ± 0.042
satimage 0.439 ± 0.001 0.377 ± 0.001
letter 0.357 ± 0.001 0.345 ± 0.002
Results: Topological Error
Sparse-Som Sparse-BSom
rcv1 0.248 ± 0.007 0.353 ± 0.010
news20 0.456 ± 0.020 0.604 ± 0.014
sector 0.212 ± 0.012 0.514 ± 0.017
mnist 0.369 ± 0.005 0.268 ± 0.003
usps 0.150 ± 0.006 0.281 ± 0.011
protein 0.505 ± 0.006 0.448 ± 0.007
dna 0.099 ± 0.004 0.278 ± 0.023
satimage 0.103 ± 0.005 0.239 ± 0.015
letter 0.160 ± 0.003 0.269 ± 0.008
Results: Prediction Evaluation
            Sparse-Som                    Sparse-BSom
            precision    recall           precision    recall
rcv1        79.2 ± 0.5   79.3 ± 0.6       81.3 ± 0.4   82.1 ± 0.3
            73.7 ± 0.4   70.6 ± 0.5       76.6 ± 0.4   72.6 ± 0.5
news20      64.2 ± 0.5   62.8 ± 0.5       50.3 ± 0.9   49.6 ± 0.8
            60.0 ± 1.7   55.4 ± 1.3       47.8 ± 1.2   43.6 ± 1.2
sector      77.2 ± 0.9   73.2 ± 0.9       58.4 ± 0.5   56.0 ± 1.0
            73.3 ± 0.8   61.3 ± 1.8       60.9 ± 1.3   44.8 ± 1.0
mnist       93.5 ± 0.2   93.5 ± 0.2       91.5 ± 0.2   91.5 ± 0.2
            93.4 ± 0.2   93.4 ± 0.2       91.7 ± 0.2   91.7 ± 0.2
usps        95.9 ± 0.2   95.9 ± 0.2       95.6 ± 0.2   95.6 ± 0.2
            91.4 ± 0.3   90.7 ± 0.3       92.4 ± 0.5   91.5 ± 0.4
protein     56.7 ± 0.2   57.5 ± 0.2       56.7 ± 0.4   57.6 ± 0.3
            49.8 ± 0.7   51.2 ± 0.6       50.7 ± 0.7   52.1 ± 0.6
dna         90.9 ± 0.6   90.8 ± 0.5       88.5 ± 0.6   88.5 ± 0.5
            77.7 ± 1.5   69.6 ± 2.1       81.9 ± 2.9   30.3 ± 1.7
satimage    92.3 ± 0.4   92.4 ± 0.3       92.5 ± 0.4   92.6 ± 0.4
            87.6 ± 0.3   85.4 ± 0.4       88.7 ± 0.5   86.3 ± 0.5
letter      83.8 ± 0.3   83.7 ± 0.3       81.9 ± 0.3   81.7 ± 0.4
            81.5 ± 0.5   81.1 ± 0.5       80.2 ± 0.3   79.8 ± 0.5
Summary: Main Benefits
Sparse-Som and Sparse-BSom run much faster than their classical “dense” counterparts on sparse data.
Advantages of each version:
Sparse-Som
maps seem to have a better organization
Sparse-BSom
highly parallelizable
more memory efficient (single-precision)
Thank you!