S. Benchaou et al Int. Journal of Engineering Research and Applications www.ijera.com 
ISSN : 2248-9622, Vol. 4, Issue 7( Version 3), July 2014, pp.26-30 
New Approach of Preprocessing For Numeral Recognition 
S. Benchaou*, M. Nasri*, O. El Melhaoui*, B. Bouali** 
*(Labo MATSI, EST, University Mohammed 1, Morocco) 
** (Labo AGA, FS, University Mohammed 1, Morocco) 
ABSTRACT 
This paper proposes a new preprocessing approach for isolated handwritten and printed numerals. The approach reduces the size of each numeral's input image by discarding redundant information, which in turn reduces the number of features in the attribute vector produced by the feature extraction method. Numeral recognition is carried out with the k nearest neighbors and multilayer perceptron techniques. The simulations achieve a good recognition rate with a shorter running time.
Keywords – features extraction, handwritten and printed numeral recognition, k nearest neighbors, multilayer perceptron, preprocessing.
I. INTRODUCTION 
Over the last decade, the recognition of handwritten characters has made considerable progress in numerous industrial applications, in particular numeral recognition, which is used in several sectors such as postal sorting, bank check reading and order form processing.
A recognition system requires three main steps to predict the class of an unknown pattern: preprocessing, features extraction and classification.
The preprocessing phase consists of removing imperfections and reducing the analyzed area.
Numeral features extraction is a delicate process that is crucial [1] for good numeral recognition. It consists of transforming the image into an attribute vector that contains a set of discriminating characteristics for recognition, while also reducing the amount of information supplied to the system.
In the literature, several feature extraction methods have been proposed, such as invariant moments [2], Zernike moments [3], Freeman coding [4] and Loci characteristics [5].
The last step is classification, which consists of partitioning a set of data entities into separate classes according to a similarity criterion. The methods proposed in this context include dynamic programming, neural networks, support vector machines, k nearest neighbors and k-means.
In order to validate our contributions, we have used in this work a database of 590 printed and handwritten numerals provided by various categories of writers.
Section 2 presents the numeral features extraction method. Section 3 discusses the k nearest neighbors and multilayer perceptron classification techniques. Section 4 presents the proposed system for numeral recognition. Section 5 reports the simulation results and comparisons. Finally, we give a conclusion.
II. NUMERAL FEATURES EXTRACTION 
Features extraction is a complex task that is of great interest in multiple domains such as image processing, segmentation and classification.
After the preprocessing step, the image is represented by a matrix of pixels which can be very large. It is therefore useful to represent objects by characteristics that contain only the necessary information. This operation is called features extraction. In the literature, feature extraction methods are classified into two categories: the statistical approach and the structural approach. In our work we use the statistical approach, based on profile projection.
2.1 Profile projection 
Profile projection is a statistical method that has been used successfully in the recognition of handwritten characters [6].
The preprocessing stage consists of binarizing the numeral input image, which is presented in grey levels. The next stage preserves only the region occupied by the numeral by cropping the image. The final stage fixes the size of the cropped numeral image.
Fig. 1: Different steps for preprocessing of numeral «5»: image in color, transformation into grey levels, binarization, cropping, normalization.
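The three stages just described (binarization, cropping, size normalization) can be sketched in plain Python. This is only an illustration under stated assumptions: the fixed threshold of 128, dark ink on a light background, and nearest-neighbour resampling are choices made here and are not specified in the paper, and all function names are hypothetical.

```python
def binarize(grey, threshold=128):
    """Map a grey-level image (rows of 0..255 values) to 1 = ink, 0 = background.

    Assumption: ink is darker than the threshold; the paper gives no threshold.
    """
    return [[1 if px < threshold else 0 for px in row] for row in grey]


def crop(binary):
    """Keep only the bounding box of the ink pixels."""
    rows = [r for r, row in enumerate(binary) if any(row)]
    cols = [c for c in range(len(binary[0])) if any(row[c] for row in binary)]
    if not rows:  # blank image: nothing to crop
        return binary
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    return [row[left:right + 1] for row in binary[top:bottom + 1]]


def normalize(binary, size=40):
    """Resize to size x size by nearest-neighbour sampling (an assumed method)."""
    h, w = len(binary), len(binary[0])
    return [[binary[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def preprocess(grey, size=40, threshold=128):
    """Chain the three stages: binarization, cropping, normalization."""
    return normalize(crop(binarize(grey, threshold)), size)
```

Calling `preprocess(grey)` on a grey-level image given as a list of rows returns a 40x40 binary image, the normalization size retained later in the experiments.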
Profile projection calculates the number of pixels (distance) between each of the left, bottom, right and top edges of the image and the first black pixel met on the corresponding row or column. The dimension of the obtained attribute
vector is twice the sum of the number of rows and columns of the numeral image; for example, a 40x40 image yields 2 x (40 + 40) = 160 elements.
Fig. 2: The four profile projections of numeral «4» 
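The four profiles can be computed as in the following sketch, which takes a binary image stored as a list of rows with 1 for black pixels. The function name, the ordering of the four profiles in the output, and the value used for a row or column with no black pixel (the full line length) are assumptions made for illustration; the paper does not specify them.

```python
def profile_projection(binary):
    """Return the left, right, top and bottom profiles as one feature vector."""
    h, w = len(binary), len(binary[0])

    def first_black(values):
        # Distance from the edge to the first black pixel.
        # Assumption: a line with no black pixel gets the full line length.
        for distance, px in enumerate(values):
            if px:
                return distance
        return len(values)

    left = [first_black(row) for row in binary]
    right = [first_black(row[::-1]) for row in binary]
    columns = [[binary[r][c] for r in range(h)] for c in range(w)]
    top = [first_black(col) for col in columns]
    bottom = [first_black(col[::-1]) for col in columns]
    return left + right + top + bottom
```

The returned vector has 2 x (rows + columns) elements, which agrees with the dimension stated in the text.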
III. CLASSIFICATION METHODS 
3.1 K nearest neighbors 
K nearest neighbors (KNN) is a widely used method for data classification. Proposed in 1967 by Cover and Hart [7], it has been widely used in handwritten numeral recognition for its simplicity and its robustness [8].
KNN is derived from the nearest neighbor rule. It computes the distance between the test sample and each of the learning samples, and then assigns the test sample to the class most represented among its k nearest neighbors.
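As a minimal sketch of this rule, the function below classifies one attribute vector by majority vote among its k nearest learning samples. The Euclidean distance, the default k = 3 and the function name are assumptions made for illustration; the paper does not state which distance or which value of k was used.

```python
import math
from collections import Counter


def knn_classify(sample, training_vectors, training_labels, k=3):
    """Return the majority label among the k training vectors nearest to sample."""
    # Euclidean distance is an assumed choice of metric.
    distances = sorted(
        (math.dist(sample, vec), label)
        for vec, label in zip(training_vectors, training_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

In the setting of this paper, `training_vectors` would hold the profile projection vectors of the learning numerals and `training_labels` their digit classes.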
3.2 Multilayer perceptron 
Neural networks are one branch of artificial intelligence. An artificial neural network is defined as a computer system inspired by the study of the human brain and of how human neurons function [9].
Artificial neural networks in general, and the multilayer perceptron (MLP) in particular, have been widely used for data classification, pattern recognition (characters, voice, signals, etc.) [10][11] and prediction [12]. The MLP is characterized by its ability to learn and gradually improve its performance through a learning process.
MLPs are feedforward networks in which every two adjacent layers are fully connected. The MLP structure contains an input layer, an output layer and a certain number of hidden layers. In this work, we limit ourselves to one hidden layer.
Learning is the phase in which the behavior of the network is changed by modifying the synaptic weights until the desired output pattern is obtained.
In this work, we have used the gradient backpropagation algorithm, whose objective is to minimize the squared error between the desired and computed outputs of the MLP. Algorithm 1 shows the different steps of the gradient backpropagation algorithm.
Algorithm 1 : Gradient backpropagation 
1. Randomly initialize the synaptic weights 
between -1 and 1. 
2. Randomly apply a realization vector of an 
object to the input layer and its corresponding 
known output to the output layer. 
3. Compute the network output and error E 
between computed and desired outputs. 
4. Adjust the weights by the gradient method:
$$W^{(r)}(t+1) = W^{(r)}(t) - \eta \, \frac{\partial E}{\partial W^{(r)}}$$
where η is the learning rate, which is in general a value between 0.1 and 0.9, and r = 1, 2.
5. Go to 2 as long as the network does not show 
satisfactory performances. 
We have a structure of three layers:
- The input layer has n1 neurons that we call $ne_k$, 1≤k≤n1,
- The hidden and output layers contain n2 and n3 neurons that we respectively call $nc_j$ and $ns_i$, where 1≤j≤n2, 1≤i≤n3.
$z_k^{(1)}$ is the realization of the k-th component of the attribute vector for a numeral image I, with k = 1, 2, …, n1.
$W^{(1)} = \left(w_{jk}^{(1)}\right)$, j = 1, …, n2, k = 1, …, n1, where the $w_{jk}^{(1)}$ are the synaptic weights connecting the neurons of the input layer to the neurons of the hidden layer.
$W^{(2)} = \left(w_{ij}^{(2)}\right)$, i = 1, …, n3, j = 1, …, n2, where the $w_{ij}^{(2)}$ are the synaptic weights connecting the neurons of the hidden layer to the neurons of the output layer.
The output of the neuron j, $nc_j$, of the hidden layer is:
$$z_j^{(2)} = f\left(y_j^{(2)}\right) \tag{1}$$
where
$$y_j^{(2)} = \sum_{k=1}^{n_1} w_{jk}^{(1)} z_k^{(1)} \tag{2}$$
and
$$f\left(y_j^{(2)}\right) = \frac{1}{1 + e^{-y_j^{(2)}}} \tag{3}$$
for j = 1, 2, …, n2. f is the activation function, which we choose to be of sigmoid type.
The output of the neuron i, $ns_i$, of the output layer is:
$$z_i^{(3)} = f\left(y_i^{(3)}\right) \tag{4}$$
where
$$y_i^{(3)} = \sum_{j=1}^{n_2} w_{ij}^{(2)} z_j^{(2)} \tag{5}$$
for i = 1, 2, …, n3.
Note that the superscripts (1), (2) and (3) denote the input, hidden and output layers respectively.
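The forward pass defined by equations (1) to (5) can be sketched as follows. The function names and the representation of the weight matrices as lists of rows are illustrative assumptions; in line with the equations, no bias term is included.

```python
import math


def sigmoid(y):
    """Sigmoid activation f of Eq. (3)."""
    return 1.0 / (1.0 + math.exp(-y))


def forward(z1, w1, w2):
    """Propagate one attribute vector through the three-layer network.

    z1: input vector of length n1
    w1: n2 x n1 weights (input to hidden), w1[j][k] = w_jk^(1)
    w2: n3 x n2 weights (hidden to output), w2[i][j] = w_ij^(2)
    Returns the hidden outputs z2 and the network outputs z3.
    """
    y2 = [sum(w * z for w, z in zip(row, z1)) for row in w1]  # Eq. (2)
    z2 = [sigmoid(y) for y in y2]                             # Eq. (1)
    y3 = [sum(w * z for w, z in zip(row, z2)) for row in w2]  # Eq. (5)
    z3 = [sigmoid(y) for y in y3]                             # Eq. (4)
    return z2, z3
```

For digit recognition, one would typically take n3 = 10 output neurons and read the predicted class as the index of the largest output; the paper does not state its output encoding, so this is only a common convention.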
- Optimization of the $w_{ij}^{(2)}$:
In order to update the synaptic weights $w_{ij}^{(2)}$ for the different indices i and j, consider the network error to be:
$$E(t) = \frac{1}{2} \sum_{i=1}^{n_3} \left(z_i^{(3)}(t) - d_i(t)\right)^2 \tag{6}$$
where $d_i(t)$ is the known desired output at neuron $ns_i$ and t represents the current iteration. Differentiating E(t) with respect to $w_{ij}^{(2)}$ gives:
$$\frac{\partial E(t)}{\partial w_{ij}^{(2)}} = \left(z_i^{(3)}(t) - d_i(t)\right) \frac{\partial z_i^{(3)}}{\partial w_{ij}^{(2)}} \tag{7}$$
where
$$\frac{\partial z_i^{(3)}}{\partial w_{ij}^{(2)}} = z_j^{(2)} \, f\left(y_i^{(3)}\right) \left(1 - f\left(y_i^{(3)}\right)\right) \tag{8}$$
The synaptic weight $w_{ij}^{(2)}$ may now be updated:
$$w_{ij}^{(2)}(t+1) = w_{ij}^{(2)}(t) - \eta \, \delta_i^{(3)}(t) \, z_j^{(2)}(t) \tag{9}$$
for i = 1, …, n3 and j = 1, …, n2, where
$$\delta_i^{(3)}(t) = \left(z_i^{(3)}(t) - d_i(t)\right) f\left(y_i^{(3)}\right) \left(1 - f\left(y_i^{(3)}\right)\right) \tag{10}$$
- Optimization of the $w_{jk}^{(1)}$:
The derivation in this case is similar to the previous one, so we go directly to the weight update:
$$w_{jk}^{(1)}(t+1) = w_{jk}^{(1)}(t) - \eta \, \delta_j^{(2)}(t) \, z_k^{(1)}(t) \tag{11}$$
for j = 1, …, n2 and k = 1, …, n1, where
$$\delta_j^{(2)}(t) = \left[\sum_{i=1}^{n_3} \delta_i^{(3)}(t) \, w_{ij}^{(2)}(t)\right] \frac{\partial f\left(y_j^{(2)}(t)\right)}{\partial y_j^{(2)}(t)} \tag{12}$$
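One iteration of Algorithm 1 for a single learning example can be sketched as below, following equations (6) and (9) to (12), with the sigmoid derivative written as f(y)(1 - f(y)). The function names, the learning rate of 0.5 and the list-of-rows weight layout are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))


def init_weights(rows, cols, rng):
    """Step 1 of Algorithm 1: random weights between -1 and 1."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def train_step(z1, d, w1, w2, eta=0.5):
    """One gradient step on one example; updates w1 and w2 in place.

    z1: input vector, d: desired output vector.
    Returns the error E of Eq. (6) measured before the update.
    """
    # Forward pass, Eqs. (1) to (5).
    z2 = [sigmoid(sum(w * z for w, z in zip(row, z1))) for row in w1]
    z3 = [sigmoid(sum(w * z for w, z in zip(row, z2))) for row in w2]
    error = 0.5 * sum((zi - di) ** 2 for zi, di in zip(z3, d))       # Eq. (6)
    # Output-layer deltas, Eq. (10).
    delta3 = [(zi - di) * zi * (1.0 - zi) for zi, di in zip(z3, d)]
    # Hidden-layer deltas, Eq. (12), computed with the weights before update.
    delta2 = [
        zj * (1.0 - zj) * sum(delta3[i] * w2[i][j] for i in range(len(w2)))
        for j, zj in enumerate(z2)
    ]
    for i, row in enumerate(w2):                                     # Eq. (9)
        for j in range(len(row)):
            row[j] -= eta * delta3[i] * z2[j]
    for j, row in enumerate(w1):                                     # Eq. (11)
        for k in range(len(row)):
            row[k] -= eta * delta2[j] * z1[k]
    return error
```

Repeating `train_step` over randomly chosen learning examples until the error is judged small enough corresponds to steps 2 to 5 of Algorithm 1; the stopping criterion itself is not given in the paper.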
IV. PROPOSED SYSTEM 
We have used a database of 590 Arabic numerals, provided by different writers. A sample of the database is shown in figure 3. The database is divided into two sets: 400 numerals are used for learning and the remaining 190 numerals are used for the test stage.
Fig. 3: A sample of handwritten and printed 
numerals 
The proposed system adds a windowing step to the preprocessing stage in order to reduce the size of the attribute vector, which decreases the running time.
The proposed system contains three main steps: preprocessing, features extraction and classification. The full system is shown in figure 4.
Fig. 4: Scheme for numeral recognition 
The preprocessing stage, in our case, requires little computing time. It consists of binarizing the numeral input image, which is presented in grey levels. Then, we preserve only the region occupied by the numeral by cropping the image. The next stage normalizes the image to a predefined size. Finally, we apply the windowing stage. This operation consists of dividing the numeral image into small windows and calculating, for every window, the average of the pixel grey levels. It greatly reduces the size of the image while preserving the information useful for recognizing each input image. As a result, the dimension of the attribute vectors applied to the system is reduced too.
The last three steps (cropping, normalization and windowing) also greatly reduce the computing time.
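The windowing step can be sketched as a block average over non-overlapping windows. The function name and the rectangular window shape given by `win_h` and `win_w` are assumptions: the paper only gives the number of pixels per window. Under this reading, a 2x2 window (four pixels) turns a 40x40 image into a 20x20 one, giving 2 x (20 + 20) = 80 primitives, and a 1x2 window (two pixels) gives a 40x20 image and 2 x (40 + 20) = 120 primitives, which matches the attribute vector sizes reported in the experiments.

```python
def windowing(image, win_h=2, win_w=2):
    """Replace each win_h x win_w window by the average of its pixel values."""
    h, w = len(image), len(image[0])
    if h % win_h or w % win_w:
        raise ValueError("image size must be a multiple of the window size")
    area = win_h * win_w
    reduced = []
    for r in range(0, h, win_h):
        row = []
        for c in range(0, w, win_w):
            total = sum(
                image[r + dr][c + dc]
                for dr in range(win_h)
                for dc in range(win_w)
            )
            row.append(total / area)
        reduced.append(row)
    return reduced
```

The output holds averaged values rather than binary pixels. How the averaged image is thresholded before profile projection is not described in the paper, so that step is left out of this sketch.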
After preprocessing, features extraction is carried out by the profile projection method. The KNN and MLP methods are used for the classification task.
V. EXPERIMENTAL RESULTS AND 
COMPARATIVE STUDY 
In the first test, the numeral recognition simulations are carried out without the windowing stage in the preprocessing phase. Several experiments were carried out to determine the recognition rate according to the size of normalization (son) and the number of neurons in the hidden layer (nc) of the multilayer perceptron.
Table 1 shows that the pair (son, nc) has an effect on the recognition rate. With (son, nc) = (40x40, 40) we have achieved the maximum recognition rate, equal to 86.8%. The attribute vector obtained contains 160 elements for each numeral.
Fig. 4 block labels: input image (image in grey level); features extraction: profile projection; classification: multilayer perceptron (MLP), K nearest neighbors (KNN).
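Table 1: Numeral recognition rate according to the size of normalization and the number of neurons in the hidden layer

| Size of normalization (son) | Number of primitives (profile projection) | Number of neurons in the hidden layer (nc) | Recognition rate (%) |
|---|---|---|---|
| 20x20 | 80 | 15 | 77.89 |
| 20x20 | 80 | 20 | 82 |
| 20x20 | 80 | 30 | 81.05 |
| 30x30 | 120 | 25 | 82.63 |
| 30x30 | 120 | 35 | 84 |
| 30x30 | 120 | 45 | 83.68 |
| 40x40 | 160 | 30 | 78.9 |
| 40x40 | 160 | 40 | 86.8 |
| 40x40 | 160 | 55 | 84 |
| 50x50 | 200 | 50 | 84.73 |
| 50x50 | 200 | 80 | 85.26 |
| 50x50 | 200 | 140 | 83.16 |
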
Figure 5 illustrates the evolution of the recognition rate according to the number of neurons in the hidden layer, using 40x40 as the size of normalization.
Fig. 5: Evolution of recognition rate according to the number of neurons in the hidden layer
A normalization size of 40x40 gives a good recognition rate for isolated numerals. This size is retained for the application of our proposed system. In the classifier, however, the number of neurons in the input layer is too high: it is equal to 160, which weakens the performance of the multilayer perceptron and increases the processing time. For this reason, and in order to reduce the number of input neurons of the numeral recognition system, we have proposed the windowing step, which consists of splitting the numeral image into equal areas of m pixels. For each area, the average of the pixel grey levels is calculated. This operation reduces the size of the input image, the attribute vector dimension, the error rate and the processing time.
In the second test, we have fixed the size of normalization at 40x40. Then, we have applied our windowing approach in the preprocessing phase. Table 2 gives the numeral recognition rate obtained with the proposed system after several simulations according to the number of hidden neurons.
Table 2: Numeral recognition rate of proposed system

| Size of windowing | Number of primitives (profile projection) | Multilayer perceptron: recognition rate (%) | Multilayer perceptron: processing time (min) | K nearest neighbors: recognition rate (%) | K nearest neighbors: processing time (s) |
|---|---|---|---|---|---|
| Without windowing | 160 | 86.8 | 13.58 | 86.8 | 31 |
| Window of 2 pixels | 120 | 86.8 | 13.5 | 86.8 | 24 |
| Window of 4 pixels | 80 | 87.9 | 4 | 87.9 | 19 |

Table 2 highlights the fact that with windowing, the size of the attribute vector is reduced to 120 and 80, while it was initially 160. The processing time using the K nearest neighbors and the multilayer perceptron is reduced too. We also notice that the choice of the window size is essential. The window of two pixels preserves the same recognition rate, equal to 86.8%, whereas the window of four pixels improves the recognition rate to 87.9%, which suggests that the redundant attributes are partially eliminated.
VI. CONCLUSION
In this work we have presented a new approach for numeral recognition. In the preprocessing stage we have added a windowing step, which consists of splitting the numeral image into equal areas of m pixels. For each area, the average of the pixels is calculated. Features extraction is carried out by profile projection. The multilayer perceptron and K nearest neighbors methods are used for classification. The simulations have obtained good results: a clear reduction of the attribute vector size and of the processing time, together with an increase of the recognition rate from 86.8% to 87.9%. This supports the performance of our proposed system for numeral recognition.
REFERENCES
[1] R.G. Benne, B.V. Dhandra and M. Hangarge, “Tri-scripts handwritten numeral recognition: a novel approach”. Advances in Computational Research, ISSN: 0975–3273, Vol. 1, Issue 2, pp. 47-51, 2009.
[2] Anuja P. Nagare, “License Plate Character Recognition System using Neural Network”. International Journal of Computer Applications (0975-8887), Vol. 25, N°. 10, July 2011.
[3] N. Abu Bakar, S. Shamsuddin and A. Ali, “An integrated formulation of Zernike representation in character images”. In Nicolás García-Pedrajas, Francisco Herrera, Colin Fyfe, José Benítez and Moonis Ali, editors, Trends in Applied Intelligent Systems, volume 6098 of Lecture Notes in Computer Science, pp. 359–368. Springer Berlin / Heidelberg, 2010.
[4] X. Dupré, “Contributions à la reconnaissance de l'écriture cursive à l'aide de modèles de Markov”. Thèse, Université René Descartes-Paris 5, 2003.
[5] R. Ebrahim, M.R. Moradian, A. Esmkhani and F.M. Jafarlou, “Recognition of Persian handwritten digits using characterization Loci and mixture of experts”. International Journal of Digital Content Technology and its Applications, Vol. 3, N°. 3, 2009.
[6] D. Gokana, “Contribution à la Reconnaissance Automatique de Caractères Manuscrits. Application à la Lecture Optique de Caractères sur Supports Mobiles”. Thèse de Doctorat, Université de Paris Sud, Centre d’Orsay, 1986.
[7] R. Gil-Pita and X. Yao, “Evolving edited k-nearest neighbor classifiers”. International Journal of Neural Systems, Vol. 18, N°. 6, pp. 459–467, 2008.
[8] I. Kuncheva, “Editing for the k-nearest neighbors rule by a genetic algorithm”. Elsevier, Pattern Recognition Letters 16, pp. 809-814, 1995.
[9] A. Adnan and S. Sameer, “Optical character recognition: Neural network analysis of hand-printed characters”. In SSPR ’98/SPR ’98: Proceedings of the Joint IAPR International Workshops on Advances in Pattern Recognition, London, UK, Springer-Verlag, 1998.
[10] J.P. Pinto, “Multilayer perceptron based hierarchical acoustic modeling for automatic speech recognition”. Thèse, Ecole Polytechnique fédérale de Lausanne, Suisse, 2010.
[11] L. Oukhellou, “Paramètrisation et classification de signaux en contrôle non destructif. Application à la reconnaissance des défauts de rails par courant de foucault”. Thèse, Université Paris XI Orsay, 1997.
[12] H. Bouziane, B. Messabih and A. Chouarfia, “Prédiction de la structure 2D des protéines par les réseaux de neurones”. Communication of IBIMA, Vol. 6, 2008.

New Approach of Preprocessing For Numeral Recognition

  • 1. S. Benchaou et al Int. Journal of Engineering Research and Applications www.ijera.com ISSN : 2248-9622, Vol. 4, Issue 7( Version 3), July 2014, pp.26-30 www.ijera.com 26 | P a g e Image in color Transformation into grey levels Binarization Cropping Normalization New Approach of Preprocessing For Numeral Recognition S. Benchaou*, M. Nasri*, O. El Melhaoui*, B. Bouali** *(Labo MATSI, EST, University Mohammed 1, Morocco) ** (Labo AGA, FS, University Mohammed 1, Morocco) ABSTRACT The present paper proposes a new approach of preprocessing for handwritten, printed and isolated numeral characters. The new approach reduces the size of the input image of each numeral by discarding the redundant information. This method reduces also the number of features of the attribute vector provided by the extraction features method. Numeral recognition is carried out in this work through k nearest neighbors and multilayer perceptron techniques. The simulations have obtained a good rate of recognition in fewer running time. Keywords – features extraction, handwritten and printed numeral recognition, k nearest neighbors, multilayer perceptron , preprocessing. I. INTRODUCTION For the last decade, the recognition of handwritten characters has known a big progress in numerous industrial applications, in particular the numeral recognition which is used in several sectors such as postal sorting, bank check reading, order form processing, etc. The recognition system requires three main steps to predict the class of membership of an unknown pattern, which are: preprocessing, features extraction and classification. The preprocessing phase consists of discarding the imperfections and reducing the analyzed area. The Numeral features extraction is a delicate process and is crucial [1] for a good numeral recognition. 
It consists of transforming the image into an attribute vector which contains a set of discriminated characteristics for recognition and also reducing the amount of information supplied to the system. In the literature, several works have been proposed for features extractions such as invariant moments [2], Zernike moments [3], freeman coding [4], Loci characteristics [5], etc. The last step is the classification which consists of partitioning a set of data entity into separate classes according to a similarity criterion, different methods are proposed in this context includes dynamic programming, neural networks, support vector machines, k nearest neighbors, k-means, etc. In order to validate our contributions, we have used in this work a database of 590 numerals, printed and handwritten, provided by various categories of writers. In section 2 numeral features extraction method is presented. The k nearest neighbors and the multilayer perceptron techniques of classification are discussed in section 3. The proposed system for numeral recognition is presented in section 4. The result of simulations and comparisons are introduced in section 5. Finally, we give a conclusion. II. NUMERAL FEATURES EXTRACTION Features extraction is a complex task. It knows a great interest in multiple domains such as image processing, segmentation, classification, etc. After the preprocessing step, the preprocessed image is represented by a matrix of pixels which can be of very large size. So, it will be useful to represent objects by characteristics containing the necessary information. This operation is called features extraction. In the literature, the feature extraction methods are classified according to two categories: statistical approach and structural approach. In our work we are going to use the statistical approach based on profile projection. 2.1 Profile projection Profile projection is a statistical method successfully used in recognition of handwritten characters [6], etc. 
The preprocessing stage consists of binarizing the numeral input image, which is presented in grey level. The next stage preserves only the numeral's position in the image by cropping it. The final stage fixes the size of the cropped numeral image.
Fig. 1: Different steps for preprocessing of numeral « 5 » (image in color, transformation into grey levels, binarization, cropping, normalization).
The profile projection method calculates the number of pixels (distance) between the left, bottom, right or top edge of the image and the first black pixel met on the corresponding row or column. The dimension of the obtained attribute vector is twice the sum of the number of rows and columns of the numeral image.
Fig. 2: The four profile projections of numeral « 4 »

III. CLASSIFICATION METHODS
3.1 K nearest neighbors
K nearest neighbors (KNN) is a widely used method for data classification. Proposed in 1967 by Cover [7], it has been widely used in handwritten numeral recognition for its simplicity and robustness [8].
KNN is inspired by the nearest neighbor rule. It computes the distance between the test sample and the learning samples, and then assigns the test sample to the class most represented among its k nearest neighbors.

3.2 Multilayer perceptron
Neural networks are one branch of artificial intelligence. An artificial neural network is defined as a computer system inspired by a long study of the human brain and of how human neurons function [9].
Artificial neural networks in general, and the multilayer perceptron (MLP) in particular, have been widely used for data classification, pattern recognition (characters, voice, signal, etc.) [10][11], prediction [12], etc.
The MLP is characterized by its ability to learn and gradually improve its performance through a learning process. MLPs are feedforward networks in which adjacent layers are fully connected. The MLP structure contains an input layer, an output layer and a certain number of hidden layers. In this work, we restrict ourselves to one hidden layer.
Learning is a phase in which the behavior of the network is modified by adjusting the synaptic weights until a desired output pattern is obtained. In this work, we have used the gradient backpropagation algorithm, whose objective is to minimize the squared error between the desired and computed outputs of the MLP.
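As a rough illustration of gradient backpropagation for an MLP with one hidden layer, the following Python sketch performs stochastic gradient steps on the squared error with sigmoid activations. The layer sizes, learning rate, toy samples and function names (`forward`, `train_step`, `init_weights`) are assumptions made for this example and are not taken from the paper's experiments; no bias terms are used.

```python
import math
import random


def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))


def forward(w1, w2, z1):
    """Return hidden outputs z2 and network outputs z3 for input vector z1."""
    z2 = [sigmoid(sum(w * z for w, z in zip(row, z1))) for row in w1]
    z3 = [sigmoid(sum(w * z for w, z in zip(row, z2))) for row in w2]
    return z2, z3


def train_step(w1, w2, z1, d, eta):
    """One gradient step on E = 1/2 * sum((z3 - d)^2); returns E before the update."""
    z2, z3 = forward(w1, w2, z1)
    # Output-layer error terms: (z3 - d) * f'(y3), with f' = f * (1 - f).
    delta3 = [(z - t) * z * (1.0 - z) for z, t in zip(z3, d)]
    # Hidden-layer error terms, back-propagated through the not-yet-updated w2.
    delta2 = [
        z2[j] * (1.0 - z2[j]) * sum(delta3[i] * w2[i][j] for i in range(len(w2)))
        for j in range(len(z2))
    ]
    for i, row in enumerate(w2):
        for j in range(len(row)):
            row[j] -= eta * delta3[i] * z2[j]
    for j, row in enumerate(w1):
        for k in range(len(row)):
            row[k] -= eta * delta2[j] * z1[k]
    return 0.5 * sum((z - t) ** 2 for z, t in zip(z3, d))


def init_weights(n_out, n_in, rng):
    """Random weights in [-1, 1], one row per neuron of the receiving layer."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


rng = random.Random(0)
n1, n2, n3 = 4, 3, 2  # toy layer sizes (assumption, not the paper's)
w1 = init_weights(n2, n1, rng)
w2 = init_weights(n3, n2, rng)
# Two hypothetical training pairs: (attribute vector, desired output).
samples = [([1, 0, 0, 1], [1, 0]), ([0, 1, 1, 0], [0, 1])]

errors = []
for _ in range(2000):
    z1, d = rng.choice(samples)  # present a randomly chosen sample
    errors.append(train_step(w1, w2, z1, d, eta=0.5))
```

The list `errors` records the squared error before each update, so comparing its first and last entries gives a quick check that training reduces the error on this toy problem.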
Algorithm 1 shows the different steps of the gradient backpropagation algorithm.

Algorithm 1: Gradient backpropagation
1. Randomly initialize the synaptic weights between -1 and 1.
2. Randomly apply a realization vector of an object to the input layer and its corresponding known output to the output layer.
3. Compute the network output and the error E between the computed and desired outputs.
4. Adjust the weights by the gradient method:
$$W^{(r)}(t+1) = W^{(r)}(t) - \eta \frac{\partial E}{\partial W^{(r)}}$$
where $\eta$ is the learning rate, in general a value between 0.1 and 0.9, and $r = 1, 2$.
5. Go to 2 as long as the network does not show satisfactory performance.

We have a structure of three layers:
- The input layer has $n_1$ neurons that we call $ne_k$, $1 \le k \le n_1$,
- The hidden and output layers contain $n_2$ and $n_3$ neurons that we respectively call $nc_j$ and $ns_i$, where $1 \le j \le n_2$, $1 \le i \le n_3$.

$z_k^{(1)}$ is the realization of the k-th component of the attribute vector for a numeral image I, with $k = 1, 2, \dots, n_1$.
$W^{(1)} = (w_{jk}^{(1)})$, $j = 1, \dots, n_2$, $k = 1, \dots, n_1$, where $w_{jk}^{(1)}$ are the synaptic weights connecting the neurons of the input layer to the neurons of the hidden layer.
$W^{(2)} = (w_{ij}^{(2)})$, $i = 1, \dots, n_3$, $j = 1, \dots, n_2$, where $w_{ij}^{(2)}$ are the synaptic weights connecting the neurons of the hidden layer to the neurons of the output layer.

The output of the neuron $nc_j$ of the hidden layer is:
$$z_j^{(2)} = f\left(y_j^{(2)}\right) \tag{1}$$
where
$$y_j^{(2)} = \sum_{k=1}^{n_1} w_{jk}^{(1)} z_k^{(1)} \tag{2}$$
and
$$f\left(y_j^{(2)}\right) = \frac{1}{1 + e^{-y_j^{(2)}}} \tag{3}$$
for $j = 1, 2, \dots, n_2$. Here f is the activation function, which we choose to be of sigmoid type.

The output of the neuron $ns_i$ of the output layer is:
$$z_i^{(3)} = f\left(y_i^{(3)}\right) \tag{4}$$
where
$$y_i^{(3)} = \sum_{j=1}^{n_2} w_{ij}^{(2)} z_j^{(2)} \tag{5}$$
for $i = 1, 2, \dots, n_3$.
Notice that the superscripts (1), (2) and (3) represent respectively the input, hidden and output layers.

- Optimization of the $w_{ij}^{(2)}$:
In order to update the synaptic weights $w_{ij}^{(2)}$ for the different indices i and j, consider the network error to be:
$$E(t) = \frac{1}{2} \sum_{i=1}^{n_3} \left(z_i^{(3)}(t) - d_i(t)\right)^2 \tag{6}$$
where $d_i(t)$ is the known desired output at neuron $ns_i$ and t represents the current iteration.
Differentiating $E(t)$ with respect to $w_{ij}^{(2)}$ gives:
$$\frac{\partial E(t)}{\partial w_{ij}^{(2)}} = \left(z_i^{(3)}(t) - d_i(t)\right) \frac{\partial z_i^{(3)}}{\partial w_{ij}^{(2)}} \tag{7}$$
where
$$\frac{\partial z_i^{(3)}}{\partial w_{ij}^{(2)}} = f\left(y_i^{(3)}\right)\left(1 - f\left(y_i^{(3)}\right)\right) z_j^{(2)} \tag{8}$$
The synaptic weight $w_{ij}^{(2)}$ may now be updated:
$$w_{ij}^{(2)}(t+1) = w_{ij}^{(2)}(t) - \eta\, \delta_i^{(3)}(t)\, z_j^{(2)}(t) \tag{9}$$
for $i = 1, \dots, n_3$ and $j = 1, \dots, n_2$, where
$$\delta_i^{(3)}(t) = \left(z_i^{(3)}(t) - d_i(t)\right) f\left(y_i^{(3)}\right)\left(1 - f\left(y_i^{(3)}\right)\right) \tag{10}$$

- Optimization of the $w_{jk}^{(1)}$:
The derivation in this case is similar to that presented previously, so we go directly to the update of the weights:
$$w_{jk}^{(1)}(t+1) = w_{jk}^{(1)}(t) - \eta\, \delta_j^{(2)}(t)\, z_k^{(1)}(t) \tag{11}$$
for $j = 1, \dots, n_2$ and $k = 1, \dots, n_1$, where
$$\delta_j^{(2)}(t) = f\left(y_j^{(2)}(t)\right)\left(1 - f\left(y_j^{(2)}(t)\right)\right) \sum_{i=1}^{n_3} \delta_i^{(3)}(t)\, w_{ij}^{(2)}(t) \tag{12}$$

IV. PROPOSED SYSTEM
We have used a database of 590 Arabic numerals, provided by different writers. A sample of the database is shown in figure 3. The database is divided into two sets: one set of 400 numerals is used for learning and the remaining 190 numerals are used for the test stage.
Fig. 3: A sample of handwritten and printed numerals
The proposed system adds a windowing step to the preprocessing stage in order to reduce the size of the attribute vector, which decreases the running time.
The proposed system contains three main steps: preprocessing, features extraction and classification. The full system is shown in figure 4.
Fig. 4: Scheme for numeral recognition (input image, image in grey level, binarization, cropping, normalization, windowing, features extraction by profile projection, classification by multilayer perceptron (MLP) or k nearest neighbors (KNN))
The preprocessing stage, in our case, is very fast in computing time. It consists of binarizing the numeral input image, which is presented in grey level. Then, we preserve only the numeral's position in the image by cropping it. The next stage normalizes the image to a predefined size. Finally, we apply the windowing stage. This operation consists of dividing the numeral image into small windows. For every window we calculate the average pixel grey level. This operation greatly reduces the size of the image while preserving the information useful for recognition of every input image. The dimension of the attribute vectors applied to the system is therefore reduced too. The last three steps (cropping, normalization and windowing) also greatly reduce the computing time.
After preprocessing, features extraction is carried out by the profile projection method. The KNN and MLP methods are used for the classification task.

V. EXPERIMENTAL RESULTS AND COMPARATIVE STUDY
In the first test, the simulations of numeral recognition are carried out without the windowing stage in the preprocessing phase.
Several experiments were carried out to determine the recognition rate according to the size of normalization and the number of neurons in the hidden layer of the multilayer perceptron.
Table 1 shows that the pair (son, nc), that is, the size of normalization and the number of neurons in the hidden layer, has an effect on the recognition rate. With (son, nc) = (40x40, 40) we have achieved the maximum recognition rate, equal to 86.8%. The attribute vector obtained contains 160 elements for each numeral.
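The features extraction and KNN classification steps described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: black pixels are encoded as 1, a row or column with no black pixel is given the full width or height as its distance, the distance between attribute vectors is squared Euclidean, and the class is chosen by majority vote among the k nearest samples. The function names and the toy data are hypothetical.

```python
from collections import Counter


def profile_projection(img):
    """Left, right, top and bottom profiles of a binary image, concatenated.

    img is a list of rows with 1 = black pixel and 0 = background. Each profile
    value is the number of pixels between the image edge and the first black
    pixel on that row or column (assumption: the full length if there is none).
    """
    rows, cols = len(img), len(img[0])

    def first_black(line):
        for dist, pixel in enumerate(line):
            if pixel:
                return dist
        return len(line)

    columns = [[img[r][c] for r in range(rows)] for c in range(cols)]
    left = [first_black(row) for row in img]
    right = [first_black(row[::-1]) for row in img]
    top = [first_black(col) for col in columns]
    bottom = [first_black(col[::-1]) for col in columns]
    return left + right + top + bottom


def knn_classify(train, sample, k=3):
    """Majority label among the k training vectors closest to sample.

    train is a list of (attribute_vector, label) pairs.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(train, key=lambda item: sq_dist(item[0], sample))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# A 40x40 image gives 2 * (40 + 40) = 160 profile values.
blank = [[0] * 40 for _ in range(40)]
features = profile_projection(blank)

# Toy KNN example with hypothetical two-dimensional attribute vectors.
train = [([0, 0], "a"), ([0, 1], "a"), ([5, 5], "b"), ([6, 5], "b"), ([5, 6], "b")]
label = knn_classify(train, [1, 0], k=3)
```

For a 40x40 normalized image the sketch produces 160 values, which agrees with the attribute vector size reported for the first test.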
Table 1: Numeral recognition rate according to the size of normalization and the number of neurons in the hidden layer
Size of normalization (son) | Profile projection: number of primitives | Number of neurons in the hidden layer (nc) | Recognition rate (%)
20x20 | 80 | 15 | 77.89
20x20 | 80 | 20 | 82
20x20 | 80 | 30 | 81.05
30x30 | 120 | 25 | 82.63
30x30 | 120 | 35 | 84
30x30 | 120 | 45 | 83.68
40x40 | 160 | 30 | 78.9
40x40 | 160 | 40 | 86.8
40x40 | 160 | 55 | 84
50x50 | 200 | 50 | 84.73
50x50 | 200 | 80 | 85.26
50x50 | 200 | 140 | 83.16

Figure 5 illustrates the evolution of the recognition rate according to the number of neurons in the hidden layer, using 40x40 as the size of normalization.
Fig. 5: Evolution of recognition rate according to the number of neurons in the hidden layer
A normalization size of 40x40 gives a good recognition rate for isolated numerals. This size is kept for the application of our proposed system.
In the classifier, the number of neurons in the input layer is too high. It is equal to 160, which weakens the performance of the multilayer perceptron and increases the processing time. For this reason, and in order to reduce the number of input neurons of the numeral recognition system, we have proposed the windowing step, which consists of splitting the numeral image into equal areas of m pixels. For each area, the average pixel grey level is calculated. This operation reduces the size of the input image, the attribute vector dimension, the error rate and the processing time.
In the second test, we have fixed the size of normalization at 40x40. Then, we have applied our windowing approach in the preprocessing phase. Table 2 illustrates the numeral recognition rate obtained with the proposed system after several simulations according to the number of hidden neurons.
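The windowing step can be sketched in Python as a block average over the normalized image. The paper does not state the shape of the windows, so the shapes used here are an assumption: a window of 2 pixels is read as a 1x2 block and a window of 4 pixels as a 2x2 block, a reading chosen because it gives profile projection vectors of 120 and 80 elements for a 40x40 image. The function names are hypothetical.

```python
def window_average(img, win_rows, win_cols):
    """Replace each win_rows x win_cols block of a grey-level image by its mean.

    Assumes the image height and width are multiples of the window size.
    """
    rows, cols = len(img), len(img[0])
    if rows % win_rows or cols % win_cols:
        raise ValueError("image size must be a multiple of the window size")
    out = []
    for r in range(0, rows, win_rows):
        out_row = []
        for c in range(0, cols, win_cols):
            block = [img[r + i][c + j]
                     for i in range(win_rows) for j in range(win_cols)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def profile_vector_length(img):
    """Profile projection gives one value per row and per column, from two sides each."""
    return 2 * (len(img) + len(img[0]))


normalized = [[0] * 40 for _ in range(40)]  # placeholder 40x40 image
two_px = window_average(normalized, 1, 2)   # assumed 1x2 windows: 40x20 image
four_px = window_average(normalized, 2, 2)  # assumed 2x2 windows: 20x20 image
sizes = [profile_vector_length(im) for im in (normalized, two_px, four_px)]
```

Under this reading, `sizes` holds the attribute vector dimensions without windowing, with windows of 2 pixels and with windows of 4 pixels, in that order.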
Table 2: Numeral recognition rate of proposed system
Size of windowing | Number of primitives | Profile projection: recognition rate (%) | Profile projection: processing time (min) | K nearest neighbors: recognition rate (%) | K nearest neighbors: processing time (s)
Without windowing | 160 | 86.8 | 13.58 | 86.8 | 31
Window of 2 pixels | 120 | 86.8 | 13.5 | 86.8 | 24
Window of 4 pixels | 80 | 87.9 | 4 | 87.9 | 19

Table 2 highlights the fact that, with the windowing operation, the size of the attribute vector is reduced to 120 and 80, while it was initially 160. The processing time using the K nearest neighbors and the multilayer perceptron is reduced too. We also notice that the choice of the window size is essential. The window of two pixels preserves the same recognition rate, equal to 86.8%, whereas the window of four pixels improves the recognition rate to 87.9%, which indicates that the redundant attributes are partially eliminated.

VI. CONCLUSION
In this work we have presented a new approach for numeral recognition. In the preprocessing stage we have added a windowing step, which consists of splitting the numeral image into equal areas of m pixels. For each area, the average of the pixels is calculated. Features extraction is carried out by profile projection. The multilayer perceptron and K nearest neighbors methods are used for classification.
The simulations have given good results: a clear reduction of the attribute vector size and of the processing time, together with an increase of the recognition rate. This confirms the performance of our proposed system for numeral recognition.

REFERENCES
[1] R.G. Benne, B.V. Dhandra and M. Hangarge, “Tri-scripts handwritten numeral recognition: a novel approach”. Advances in
Computational Research, ISSN: 0975–3273, Vol. 1, Issue 2, pp. 47-51, 2009.
[2] Anuja P. Nagare, “License Plate Character Recognition System using Neural Network”. International Journal of Computer Applications (0975-8887), Vol. 25, N°. 10, July 2011.
[3] N. Abu Bakar, S. Shamsuddin and A. Ali, “An integrated formulation of zernike representation in character images”. In Nicolás García-Pedrajas, Francisco Herrera, Colin Fyfe, José Benítez and Moonis Ali, editors, Trends in Applied Intelligent Systems, volume 6098 of Lecture Notes in Computer Science, pages 359–368. Springer Berlin / Heidelberg, 2010.
[4] X. Dupré, “Contributions à la reconnaissance de l'écriture cursive à l'aide de modèles de Markov”. Thèse, Université René Descartes-Paris 5, 2003.
[5] R. Ebrahim, M.R. Moradian, A. Esmkhani, F.M. Jafarlou, “Recognition of Persian handwritten digits using characterization Loci and mixture of experts”. International Journal of Digital Content Technology and its Applications, Vol. 3, N°. 3, 2009.
[6] D. Gokana, “Contribution à la Reconnaissance Automatique de Caractères Manuscrits. Application à la Lecture Optique de Caractères sur Supports Mobiles”. Thèse de Doctorat, Université de Paris Sud, Centre d’Orsay, 1986.
[7] R. Gil-Pita, X. Yao, “Evolving edited k-nearest neighbor classifiers”. International Journal of Neural Systems, Vol. 18, N°. 6, pp. 459–467, 2008.
[8] I. Kuncheva, “Editing for the k-nearest neighbors rule by a genetic algorithm”. Elsevier, Pattern Recognition Letters 16, pp. 809-814, 1995.
[9] A. Adnan and S. Sameer, “Optical character recognition: Neural network analysis of hand-printed characters”. In SSPR ’98/SPR ’98: Proceedings of the Joint IAPR International Workshops on Advances in Pattern Recognition, London, UK, Springer-Verlag, 1998.
[10] J.P. Pinto, “Multilayer perceptron based hierarchical acoustic modeling for automatic speech recognition”. Thèse, Ecole Polytechnique fédérale de Lausanne, Suisse, 2010.
[11] L. Oukhellou, “Paramètrisation et classification de signaux en contrôle non destructif. Application à la reconnaissance des défauts de rails par courant de foucault”. Thèse, Université Paris XI Orsay, 1997.
[12] H. Bouziane, B. Messabih, A. Chouarfia, “Prédiction de la structure 2D des protéines par les réseaux de neurones”. Communication of IBIMA, Vol. 6, 2008.