SlideShare a Scribd company logo
Ojo, J.A. & Adeniran, S.A.
International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 58
One-sample Face Recognition Using HMM Model of Fiducial
Areas
OJO, John Adedapo jaojo@lautech.edu.ng
Department of Electronic & Electrical Engineering,
Ladoke Akintola University of Technology (LAUTECH),
Ogbomoso, P.M.B. 4000, Nigeria.
Adeniran, Solomon A. sadenira@oauife.edu.ng
Department of Electronic & Electrical Engineering,
Obafemi Awolowo University (OAU),
Ile Ife, Nigeria.
Abstract
In most real world applications, multiple image samples of individuals are not easy to collate for
recognition or verification. Therefore, there is a need to perform these tasks even if only one
training sample per person is available. This paper describes an effective algorithm for
recognition and verification with one sample image per class. It uses two dimensional discrete
wavelet transform (2D DWT) to extract features from images; and hidden Markov model (HMM)
was used for training, recognition and classification. It was tested with a subset of the AT&T
database and up to 90% correct classification (Hit) and false acceptance rate (FAR) of 0.02%
was achieved.
Keywords: Hidden Markov Model (HMM); Recognition Rate (RR); False Acceptance Rate
(FAR); Face Recognition (FR)
1. INTRODUCTION
Face recognition has attracted attention from the research and industrial communities with a view
to achieve a “hands-free” computer controlled systems for access control, security and
surveillance. Many algorithms have been proposed to solve this problem starting from the
geometric feature-based [1], holistic [2,3] to appearance-based approaches [4]. The performance
of these algorithms however depends heavily on the largeness of the number of training set with
the attendant problem of sample collection. Algorithms that have performed excellently well with
multiple sample problem (MSP) may completely fail to work if only one training sample is used
[5]. However, one sample problems are more real in everyday life than the MSP. National ID
cards, smart cards, student ID cards and international passports should contain enough biometric
information of individuals for recognition purposes. These cases fall under the one training
sample per class problem or simply one sample problem (OSP). Many algorithms have been
developed and comprehensive surveys are available [5,6].
In one sample problem, the idea is to get as much information as possible from the sample. One
approach to this is to increase the size of the training set by projecting the image into more than
one dimension space [7], using noise model to synthesise new faces [8] or generating virtual
samples or geometric views of the sample image [9]. But the one sample problem has been
changed to the multiple sample problem in these cases with increase in computational and
storage costs. In addition, virtual samples generated may be highly correlated and can not be
considered as independent training samples [10].
In appearance-based approaches, certain features of the image samples are extracted and
presented to a classifier or classifying system, which uses a similarity measure (probabilistic
Ojo, J.A. & Adeniran, S.A.
International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 59
measure, majority voting or linearly weighted summing) to ascertain the identity of the image
[11,12]. The accuracy of the performance of these methods depends largely on the features
extracted from the image [5]. Gray-value features are credited with the ability to retain texture
information, while Gabor and other derived features are more robust against illumination and
geometrical changes [13,14]. Since there are many combining classifiers with established high
level of accuracy, good performance is expected with combination of appropriate feature
selection technique.
In this paper, we present a one sample face recognition and verification system, which uses two
dimensional discrete Wavelets transform (2D DWT) for feature extraction and one dimensional
discrete hidden Markov models (1D DHMM) for classification.
2. PRELIMINARIES
Hidden Markov model (HMM)
A signal that obeys the Markov process,
, (1)
can be represented by a HMM, which consists of two interrelated processes; the observable
symbol sequence and the underlying, unobservable Markov chain. A brief description of HMM is
presented below, while the reader is referred to an extensive description in [15]. HMM is
characterized by some elements; a specific number N of states {S}, while transition from one
state to another state emits observation vectors and the observation sequence is
denoted as . Observable symbols in each state can take any value in the
vector , where M is the number of the observable symbols. The probabilities of
transition from a state to is expressed as,
, (2)
and ,
and
The likelihood of emitting a certain observation vector at any state is , while the probability
distribution is expressed as,
(3)
The initial state (prior) distribution , where
, (4)
are the probabilities of being the first state of sequence. Therefore a short notation for
representing a model is,
(5)
Given a model λ, and observation sequence , the probability of the sequence given the model is
. This is calculated using the froward-backward algorithm,
, (6)
Ojo, J.A. & Adeniran, S.A.
International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 60
, (7)
where is the forward variable and it is the probability of the partial observation sequence,
and state at time t, given the model λ;
, (8)
is the backward variable and it is is the probability of the partial observation sequence from
to the end, given state at time t and λ.
. (9)
The problem to solve for recognition purpose is to find the best state sequence that gives the
maximum likelihood with respect to a given model. The viterbi algorithm is used for solving this
problem. It finds the most probable path for each intermediate and finally for the terminating state.
The algorithm uses two variables and .
, (10)
where is the best score or highest probability along a single path, at time t, which accounts
for the first t observations and ends in state .
(11)
helps to keep tract of the “best path” ending in state at time t.
Wavelets
Wavelet transform uses multi resolution techniques to provide a time-frequency representation of
the signal. It can be described as breaking up of a signal into shifted and scaled versions of the
“mother” wavelet. Wavelet analysis is done by convolving the signal with wavelet kernels to
obtain wavelet coefficients representing the contributions of wavelets in the signal at different
scales and orientations [16,17].
Discrete wave transform (DWT) was developed to reduce the computation time and for easy
implementation of the wavelet transform. It produces a time-scaled representation of the signal by
using digital filtering techniques, the wavelet families. Unlike discrete Fourier transform that can
be represented by a convolution equation, DWT comprises transformation kernels or equations
that differ in its expansion functions, the nature of the functions (orthogonal or bi-orthogonal
basis) and how many resolutions of the functions that are computed. A signal, which passes
through the filter bank shown in Figure 2 is decomposed into four lower resolution components:
the approximation , horizontal , vertical and diagonal coefficients .
Ojo, J.A. & Adeniran, S.A.
International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 61
)(
1
h
jcD +
)(
1
v
jcD +
)(
1
d
jcD +
)(
1
h
jcA +
jcA
FIGURE 1: One-dimensional discrete top-to-bottom HMM (Source: MATLAB R2007a Help File)
3. METHOD
Modeling a Face and Feature Extraction
A One dimensional (1D) discrete top-to-bottom (Bakis) HMM was used to segment each face into
states as shown in Figure 2. The algorithm for feature extraction shown in Figure 3 was used to
generate the observation vector. Two dimensional Discrete Wavelet Transform (2D DWT) was
used to decompose the image into its approximation coefficients, horizontal details, vertical
details and the diagonal details. ‘db1’, one of the Daubechies family of wavelets was used for
decomposition. The approximation coefficient was coded using 256 gray levels thereby producing
a coded (and reduced) form of the original or input image. The “coded” image was divided into
sub-images and the overlap between successive sub-images was allowed to be up to 5 pixels
less than the total height of the sub-image.
FIGURE 2: One-dimensional discrete top-to-bottom HMM
To generate the observation vector from each sub-image, the two dimensional sub-images were
converted into a vector by extracting the coefficients column-wise. The number of features (NF)
selected was varied to see its effect on the recognition ability of the system. The coefficients of
the sub-images were stacked to form a vector, therefore a face image was represented by a
vector ( ) in length, where Q is the number of states. Figure 4(a) shows the original image
from the AT&T database while Figure 4(b) shows the gray-scale of the approximation coefficient
of the same image with the sampling strip for segmenting the image into allowable states.
Ojo, J.A. & Adeniran, S.A.
International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 62
FIGURE 3: Algorithm for feature extraction
Ojo, J.A. & Adeniran, S.A.
International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 63
FIGURE 4: Grey-scale version of the approximation coefficients of an image and its segmentation
into states
Training
Features extracted from faces of individuals were used to train a model for each face using the
algorithm shown in Figure 5. The initial parameter were generated randomly and improved using
Baum-Welch re-estimation procedure [15] to get the parameters that optimised the likelihood of
the training set observation vectors for the each face.
State transition probability (A) is defined as,
= 0, (12)
= 0, (13)
where i.e. the model is not allowed to jump more than a state at a time. Since each face
was divided into five sub-images, the resulting matrix is
(14)
while and the initial state probability (π) is defined
as



=
≠
=
1,1
1,0
j
j
iπ (15)
. (16)
The maximum number of iteration for the re-estimation is set to 5 or if the error between the initial
and present value is less than , then the model is taken to have converged and the model
parameters are stored with appropriate class name or number .
Ojo, J.A. & Adeniran, S.A.
International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 64
Algorithm for Model Re-estimation
(n is the maximum number of iteration allowed)
k = 1
initialise
compute
while do
estimate
if
quit
else
end
end
FIGURE 5: Algorithm for HMM training
Recognition and Verification
Given a face to be tested or recognised, feature (observation vector) extraction is first performed
as described in section 3.1. Model likelihoods (log-likelihood) for all the models in the training set
(given the observation vectors) is calculated and the model with the highest log-likelihood is
identified to be the model representing the face. Euclidean measure is used to test if a face is in
the training set or database. If the log-likelihood is within the stated distance, the model (face) is
recognised to be in the training set or in the database. However, in areas of applications such as
access control, it is desired to know the exact identity of an individual, therefore the need to verify
the correctness of the faces recognised.
For classification or verification, the Viterbi recogniser was used as shown in Figure 6. The test
(face) image was converted to an observation sequence and then model likelihoods
Ojo, J.A. & Adeniran, S.A.
International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 65
are computed for each . The model with highest likelihood reveals the identity of the
unknown face.
(12)
4. RESULTS AND DISCUSSION
The algorithm was implemented in Matlab 7.4a on a HP AMD Turion 64 X2 TL-58, 1.90GHz, 2GB
RAM on a Windows operating system. It was tested with a subset of the AT&T (formerly ORL)
database [18]. A face image per person was used for training while five other images per person
were used for testing, some of which are shown in Figure 7. The recognized images were verified
for correctness, 80% correct classification (Hit) occurred while 20% were misclassified. The rest
of the images that were not in the training set were used to test the false acceptance rate (FAR)
i.e. the ratio of the numbers of images falsely accepted to the total number of images tested and
0.02 FAR occurred. The number of test images per class was reduced to two and 90% Hit, 0.025
FAR occurred as shown in Table 1. Furthermore, the algorithm was tested with ten subjects in the
AT&T database. The general observation was that the percentage Hit and FAR were independent
of number of subjects in class. For instance, 90% Hit, 0.05 FAR and 90% Hit, 0.05 FAR occurred
when five and two test images per class were used respectively.
FIGURE 6: Viterbi recogniser
Figure 8 shows the effect that the number of features selected per each state (subimage) has on
the number of correct classifications (Hit). The results show that 30 features per subimage were
sufficient to give the best performance. In addition, the average time for testing a face was
approximately 0.15s, which is near real-time. Going by these results, the algorithm is expected to
be adequate for implementation in applications where small size database is required [19].
The performance of the algorithm when Two Dimensional Discrete Cosine Transform (2D-DCT)
was used for feature extraction is compared with that of the Discrete Wavelet Transform (2D-
DWT) and the results are shown in Table 2. The results show that there is a significant
improvement in the recognition or classification accuracy when DWT was used for feature
extraction.
Ojo, J.A. & Adeniran, S.A.
International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 66
Test Images Number of class Hit Miss FAR
5 20 80% 20% 0.02
2 20 90% 10% 0.025
5 10 90% 10% 0.04
2 10 90% 10% 0.05
TABLE 1: Classification accuracies achieved for a subset of AT&T database
Test Images Number of class
Hit
DCT DWT
5 20 39% 80%
2 20 50% 90%
5 10 46% 90%
2 10 45% 90%
TABLE 2: Classification accuracies for DCT and DWT
FIGURE 7: Some of the faces used for testing.
Ojo, J.A. & Adeniran, S.A.
International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 67
FIGURE 8: Graph of correct classifications against number of features per state
5. CONSLUSION
The paper presented a one sample face recognition system. Feature extraction was performed
using 2D DWT and 1D top-to-bottom HMM was used for classification. When tested with a subset
of the AT&T database, up to 90% correct classification (Hit) and as low as 0.02 FAR were
achieved. The high recognition rate and the low FAR achieved shows that the new algorithm is
suitable for face recognition problems with small-size database such as access control for
personal computers (PCs) and personal digital assistants (PDAs).
6. REFERENCES
1. R. Brunelli and T. Poggio. “Face recognition: Features versus templates”. IEEE
Transaction on Pattern Analysis and Machine Intelligence, 15(10):1042-1062, 1993
2. L. Sirovich and M. Kirby. “Low-Dimensional procedure for the characterization of human
face”. Journal of the Optical Society of America A, 4(3):519–524, 1987
3. M. Turk and A. Pentland. “Eigenfaces for Recognition”. Journal of Cognitive
Neuroscience, 3(1):71-86, 1991
4. S. Lawrence, C.L. Giles, A. Tsoi and A. Back. “Face recognition: A convolutional neural-
network approach”. IEEE Transaction on Neural Networks, 8(1):98-113, 1997
5. W. Zhao, R. Chellappa, P.J. Philips and A. Rosenfeld. “Face recognition: A literature
survey”. ACM Computing Surveys 35(4):399-458, 2003
6. X. Tan, S. Chen, Z-H Zhou, and F. Zhang. “Face recognition from a single image per
person: a survey”. Pattern Recognition 39:1725-1745, 2006
7. J. Wu and Z.-H Zhou. “Face recognition with one training image per person”. Pattern
Recognition Letters, 23(2):1711-1719, 2001
8. H.C. Jung, B.W. Hwang and S.W. Lee. “Authenticating corrupted face image based on
noise model”. Proceedings of the 6th IEEE International Conference on Automatic Face
and Gesture Recognition, 272, 2004
9. F. Frade, De la Torre, R. Gross, S. Baker, and V. Kumar. “Representational oriented
component analysis (ROCA) for face recognition with one sample image per training
class”. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition
2:266-273, June 2005
Ojo, J.A. & Adeniran, S.A.
International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 68
10. A.M. Martinez. “Recognizing imprecisely localised, partially occluded, and expression
variant faces from a single sample per class”. IEEE Transaction on Pattern Analysis and
Machine Intelligence 25(6):748-763, 2002
11. B.S. Manjunath, R. Chellappa and C.V.D. Malsburg. “A feature based approach to face
recognition”. In Proceedings of IEEE Conference on Computer Vision and Pattern
Recognition 1:373-378, 1992
12. M. Lades, J.Vorbruggen, J. Buhmann, J. Lange, von der Malsburg and R. Wurtz.
“Distortion invariant object recognition in the dynamic link architecture”. IEEE Transaction
on Computers 42(3):300-311, 1993
13. X. Tan, S.C. Chen, Z-H Zhou, and F. Zhang. “Recognising partially occluded, expression
variant faces from single training image per person with SOM and soft kNN ensemble”.
IEEE Transactions on Neural Networks, 16(4):875-886, 2005
14. H.-S. Le and H. Li. “Recognising frontal face images using Hidden Markov models with
one training image per person”. Proceedings of the 17
th
International Conference on
Pattern Recognition (ICPR04), 1:318-321, 2004
15. L.R. Rabiner. “A tutorial on Hidden Markov models and selected application in speech
recognition”. Proceedings of the IEEE, 77(2):257-286, 1989
16. I. Daubechies, “Orthonormal bases of compactly supported wavelets”. Communication on
Pure & Applied Mathematics XLI, 41:909-996, 1988
17. I. Daubechies. “Ten lectures on wavelets”. CBMS-NSF Conference series in Applied
Mathematics, No-61, SIAM, Philadelphia Pennsylvania, 1992
18. F. Samaria and A. Harter. “Parameterization of a stochastic model for human face
identification”. 2nd IEEE Workshop on Applications of Computer Vision, Saratosa FL.
pp.138-142, December, 1994
19. J. Roure, and M. Faundez-Zanuy. “Face recognition with small and large size
databases”. Proceedings of 39th Annual International Carnahan Conference on Security
Technology, pp 153-156, October 2005

More Related Content

What's hot (20)

PDF
LOCAL REGION PSEUDO-ZERNIKE MOMENT- BASED FEATURE EXTRACTION FOR FACIAL RECOG...
aciijournal
 
PDF
Extended Fuzzy Hyperline Segment Neural Network for Fingerprint Recognition
CSCJournals
 
PDF
Segmentation and Classification of MRI Brain Tumor
IRJET Journal
 
PDF
Ji3416861690
IJERA Editor
 
PDF
Unimodal Multi-Feature Fusion and one-dimensional Hidden Markov Models for Lo...
IJECEIAES
 
PDF
MAGNETIC RESONANCE BRAIN IMAGE SEGMENTATION
VLSICS Design
 
PDF
Detection of hard exudates using simulated annealing based thresholding mecha...
csandit
 
PDF
Multimodal Biometrics Recognition by Dimensionality Diminution Method
IJERA Editor
 
PDF
Medical Image segmentation using Image Mining concepts
Editor IJMTER
 
PDF
Implementation of miml framework using annotated
Editor IJMTER
 
PDF
Fuzzy Logic based Edge Detection Method for Image Processing
IJECEIAES
 
PDF
IRJET - Effective Workflow for High-Performance Recognition of Fruits using M...
IRJET Journal
 
PDF
Error entropy minimization for brain image registration using hilbert huang t...
eSAT Publishing House
 
PDF
Face Recognition using Feature Descriptors and Classifiers
Journal For Research
 
PDF
International Journal of Engineering Research and Development (IJERD)
IJERD Editor
 
PDF
Feature selection in multimodal
ijcsa
 
PDF
Segmentation of Tumor Region in MRI Images of Brain using Mathematical Morpho...
CSCJournals
 
PDF
Ijctt v7 p104
ssrgjournals
 
PDF
Face Recognition System Using Local Ternary Pattern and Signed Number Multipl...
inventionjournals
 
PDF
C04461620
IOSR-JEN
 
LOCAL REGION PSEUDO-ZERNIKE MOMENT- BASED FEATURE EXTRACTION FOR FACIAL RECOG...
aciijournal
 
Extended Fuzzy Hyperline Segment Neural Network for Fingerprint Recognition
CSCJournals
 
Segmentation and Classification of MRI Brain Tumor
IRJET Journal
 
Ji3416861690
IJERA Editor
 
Unimodal Multi-Feature Fusion and one-dimensional Hidden Markov Models for Lo...
IJECEIAES
 
MAGNETIC RESONANCE BRAIN IMAGE SEGMENTATION
VLSICS Design
 
Detection of hard exudates using simulated annealing based thresholding mecha...
csandit
 
Multimodal Biometrics Recognition by Dimensionality Diminution Method
IJERA Editor
 
Medical Image segmentation using Image Mining concepts
Editor IJMTER
 
Implementation of miml framework using annotated
Editor IJMTER
 
Fuzzy Logic based Edge Detection Method for Image Processing
IJECEIAES
 
IRJET - Effective Workflow for High-Performance Recognition of Fruits using M...
IRJET Journal
 
Error entropy minimization for brain image registration using hilbert huang t...
eSAT Publishing House
 
Face Recognition using Feature Descriptors and Classifiers
Journal For Research
 
International Journal of Engineering Research and Development (IJERD)
IJERD Editor
 
Feature selection in multimodal
ijcsa
 
Segmentation of Tumor Region in MRI Images of Brain using Mathematical Morpho...
CSCJournals
 
Ijctt v7 p104
ssrgjournals
 
Face Recognition System Using Local Ternary Pattern and Signed Number Multipl...
inventionjournals
 
C04461620
IOSR-JEN
 

Viewers also liked (14)

PPT
Principais diferenças entre ead e eol fórum 4
Rosa Serra
 
PDF
BeeRound Design Studio | Website Design & Development | Graphic Design | Port...
BeeRound Design
 
PPTX
10 Practical Tips for Effective Social Media Marketing
Kobus Louwrens
 
PDF
BeeRound Design Studio | Advertising & Marketing Collateral | Graphic Design ...
BeeRound Design
 
PDF
Cuáles son las herramientas de la web
Maria Martha Escobar
 
PPTX
goigi-website-design-and-development-portfolio
Priya Singh
 
PDF
BeeRound Design Studio | Branding & Corporate Identity | Graphic Design | Por...
BeeRound Design
 
PPT
Entrevista Ingeniero Alvaro Cajigas UPB
villafrade
 
PDF
Face Hallucination using Eigen Transformation in Transform Domain
CSCJournals
 
PPTX
Estudo de caso: Fedex
Mateus Costa
 
PPTX
Fluidización
thalia gutierrez
 
DOCX
Modelo de Contrato de Trabalho PJ - Terceirização e outsourcing
Contrato PJ
 
PDF
Questor 1 menage a troy 01
LUIS NARBONA
 
Principais diferenças entre ead e eol fórum 4
Rosa Serra
 
BeeRound Design Studio | Website Design & Development | Graphic Design | Port...
BeeRound Design
 
10 Practical Tips for Effective Social Media Marketing
Kobus Louwrens
 
BeeRound Design Studio | Advertising & Marketing Collateral | Graphic Design ...
BeeRound Design
 
Cuáles son las herramientas de la web
Maria Martha Escobar
 
goigi-website-design-and-development-portfolio
Priya Singh
 
BeeRound Design Studio | Branding & Corporate Identity | Graphic Design | Por...
BeeRound Design
 
Entrevista Ingeniero Alvaro Cajigas UPB
villafrade
 
Face Hallucination using Eigen Transformation in Transform Domain
CSCJournals
 
Estudo de caso: Fedex
Mateus Costa
 
Fluidización
thalia gutierrez
 
Modelo de Contrato de Trabalho PJ - Terceirização e outsourcing
Contrato PJ
 
Questor 1 menage a troy 01
LUIS NARBONA
 
Ad

Similar to One-Sample Face Recognition Using HMM Model of Fiducial Areas (20)

PDF
A comparative study of dimension reduction methods combined with wavelet tran...
ijcsit
 
PDF
COMPRESSION BASED FACE RECOGNITION USING DWT AND SVM
sipij
 
PDF
MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...
ijma
 
PDF
Mr image compression based on selection of mother wavelet and lifting based w...
ijma
 
PDF
Neural Network based Supervised Self Organizing Maps for Face Recognition
ijsc
 
PDF
A Pattern Classification Based approach for Blur Classification
ijeei-iaes
 
PDF
Iaetsd multi-view and multi band face recognition
Iaetsd Iaetsd
 
PDF
Efficient 3D stereo vision stabilization for multi-camera viewpoints
journalBEEI
 
PDF
A Spectral Domain Dominant Feature Extraction Algorithm for Palm-print Recogn...
CSCJournals
 
PDF
Gesture Recognition using Principle Component Analysis & Viola-Jones Algorithm
IJMER
 
PDF
MEDICAL DIAGNOSIS CLASSIFICATION USING MIGRATION BASED DIFFERENTIAL EVOLUTION...
cscpconf
 
PDF
Medical diagnosis classification
csandit
 
PDF
Multilinear Kernel Mapping for Feature Dimension Reduction in Content Based M...
ijma
 
PDF
50120130406014
IAEME Publication
 
PDF
Design and implementation of video tracking system based on camera field of view
sipij
 
PDF
WCTFR : W RAPPING C URVELET T RANSFORM B ASED F ACE R ECOGNITION
csandit
 
PDF
Smriti's research paper
Smriti Tikoo
 
PDF
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...
CSCJournals
 
PDF
Classification and Comparison of License Plates Localization Algorithms
sipij
 
PDF
Classification and Comparison of License Plates Localization Algorithms
sipij
 
A comparative study of dimension reduction methods combined with wavelet tran...
ijcsit
 
COMPRESSION BASED FACE RECOGNITION USING DWT AND SVM
sipij
 
MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...
ijma
 
Mr image compression based on selection of mother wavelet and lifting based w...
ijma
 
Neural Network based Supervised Self Organizing Maps for Face Recognition
ijsc
 
A Pattern Classification Based approach for Blur Classification
ijeei-iaes
 
Iaetsd multi-view and multi band face recognition
Iaetsd Iaetsd
 
Efficient 3D stereo vision stabilization for multi-camera viewpoints
journalBEEI
 
A Spectral Domain Dominant Feature Extraction Algorithm for Palm-print Recogn...
CSCJournals
 
Gesture Recognition using Principle Component Analysis & Viola-Jones Algorithm
IJMER
 
MEDICAL DIAGNOSIS CLASSIFICATION USING MIGRATION BASED DIFFERENTIAL EVOLUTION...
cscpconf
 
Medical diagnosis classification
csandit
 
Multilinear Kernel Mapping for Feature Dimension Reduction in Content Based M...
ijma
 
50120130406014
IAEME Publication
 
Design and implementation of video tracking system based on camera field of view
sipij
 
WCTFR : W RAPPING C URVELET T RANSFORM B ASED F ACE R ECOGNITION
csandit
 
Smriti's research paper
Smriti Tikoo
 
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training...
CSCJournals
 
Classification and Comparison of License Plates Localization Algorithms
sipij
 
Classification and Comparison of License Plates Localization Algorithms
sipij
 
Ad

Recently uploaded (20)

PDF
TOP 10 AI TOOLS YOU MUST LEARN TO SURVIVE IN 2025 AND ABOVE
digilearnings.com
 
PPTX
Digital Professionalism and Interpersonal Competence
rutvikgediya1
 
PPTX
Dakar Framework Education For All- 2000(Act)
santoshmohalik1
 
PPTX
INTESTINALPARASITES OR WORM INFESTATIONS.pptx
PRADEEP ABOTHU
 
PPTX
Python-Application-in-Drug-Design by R D Jawarkar.pptx
Rahul Jawarkar
 
PPTX
20250924 Navigating the Future: How to tell the difference between an emergen...
McGuinness Institute
 
PPTX
Electrophysiology_of_Heart. Electrophysiology studies in Cardiovascular syste...
Rajshri Ghogare
 
PPTX
Cleaning Validation Ppt Pharmaceutical validation
Ms. Ashatai Patil
 
PDF
EXCRETION-STRUCTURE OF NEPHRON,URINE FORMATION
raviralanaresh2
 
PPTX
Rules and Regulations of Madhya Pradesh Library Part-I
SantoshKumarKori2
 
PPTX
Introduction to pediatric nursing in 5th Sem..pptx
AneetaSharma15
 
PPTX
Applied-Statistics-1.pptx hardiba zalaaa
hardizala899
 
PDF
Antianginal agents, Definition, Classification, MOA.pdf
Prerana Jadhav
 
PPTX
Basics and rules of probability with real-life uses
ravatkaran694
 
DOCX
Modul Ajar Deep Learning Bahasa Inggris Kelas 11 Terbaru 2025
wahyurestu63
 
PPTX
Applications of matrices In Real Life_20250724_091307_0000.pptx
gehlotkrish03
 
PPTX
PROTIEN ENERGY MALNUTRITION: NURSING MANAGEMENT.pptx
PRADEEP ABOTHU
 
DOCX
pgdei-UNIT -V Neurological Disorders & developmental disabilities
JELLA VISHNU DURGA PRASAD
 
PPTX
Top 10 AI Tools, Like ChatGPT. You Must Learn In 2025
Digilearnings
 
DOCX
Unit 5: Speech-language and swallowing disorders
JELLA VISHNU DURGA PRASAD
 
TOP 10 AI TOOLS YOU MUST LEARN TO SURVIVE IN 2025 AND ABOVE
digilearnings.com
 
Digital Professionalism and Interpersonal Competence
rutvikgediya1
 
Dakar Framework Education For All- 2000(Act)
santoshmohalik1
 
INTESTINALPARASITES OR WORM INFESTATIONS.pptx
PRADEEP ABOTHU
 
Python-Application-in-Drug-Design by R D Jawarkar.pptx
Rahul Jawarkar
 
20250924 Navigating the Future: How to tell the difference between an emergen...
McGuinness Institute
 
Electrophysiology_of_Heart. Electrophysiology studies in Cardiovascular syste...
Rajshri Ghogare
 
Cleaning Validation Ppt Pharmaceutical validation
Ms. Ashatai Patil
 
EXCRETION-STRUCTURE OF NEPHRON,URINE FORMATION
raviralanaresh2
 
Rules and Regulations of Madhya Pradesh Library Part-I
SantoshKumarKori2
 
Introduction to pediatric nursing in 5th Sem..pptx
AneetaSharma15
 
Applied-Statistics-1.pptx hardiba zalaaa
hardizala899
 
Antianginal agents, Definition, Classification, MOA.pdf
Prerana Jadhav
 
Basics and rules of probability with real-life uses
ravatkaran694
 
Modul Ajar Deep Learning Bahasa Inggris Kelas 11 Terbaru 2025
wahyurestu63
 
Applications of matrices In Real Life_20250724_091307_0000.pptx
gehlotkrish03
 
PROTIEN ENERGY MALNUTRITION: NURSING MANAGEMENT.pptx
PRADEEP ABOTHU
 
pgdei-UNIT -V Neurological Disorders & developmental disabilities
JELLA VISHNU DURGA PRASAD
 
Top 10 AI Tools, Like ChatGPT. You Must Learn In 2025
Digilearnings
 
Unit 5: Speech-language and swallowing disorders
JELLA VISHNU DURGA PRASAD
 

One-Sample Face Recognition Using HMM Model of Fiducial Areas

  • 1. Ojo, J.A. & Adeniran, S.A. International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 58 One-sample Face Recognition Using HMM Model of Fiducial Areas OJO, John Adedapo [email protected] Department of Electronic & Electrical Engineering, Ladoke Akintola University of Technology (LAUTECH), Ogbomoso, P.M.B. 4000, Nigeria. Adeniran, Solomon A. [email protected] Department of Electronic & Electrical Engineering, Obafemi Awolowo University (OAU), Ile Ife, Nigeria. Abstract In most real world applications, multiple image samples of individuals are not easy to collate for recognition or verification. Therefore, there is a need to perform these tasks even if only one training sample per person is available. This paper describes an effective algorithm for recognition and verification with one sample image per class. It uses two dimensional discrete wavelet transform (2D DWT) to extract features from images; and hidden Markov model (HMM) was used for training, recognition and classification. It was tested with a subset of the AT&T database and up to 90% correct classification (Hit) and false acceptance rate (FAR) of 0.02% was achieved. Keywords: Hidden Markov Model (HMM); Recognition Rate (RR); False Acceptance Rate (FAR); Face Recognition (FR) 1. INTRODUCTION Face recognition has attracted attention from the research and industrial communities with a view to achieve a “hands-free” computer controlled systems for access control, security and surveillance. Many algorithms have been proposed to solve this problem starting from the geometric feature-based [1], holistic [2,3] to appearance-based approaches [4]. The performance of these algorithms however depends heavily on the largeness of the number of training set with the attendant problem of sample collection. Algorithms that have performed excellently well with multiple sample problem (MSP) may completely fail to work if only one training sample is used [5]. However, one sample problems are more real in everyday life than the MSP. National ID cards, smart cards, student ID cards and international passports should contain enough biometric information of individuals for recognition purposes. These cases fall under the one training sample per class problem or simply one sample problem (OSP). Many algorithms have been developed and comprehensive surveys are available [5,6]. In one sample problem, the idea is to get as much information as possible from the sample. One approach to this is to increase the size of the training set by projecting the image into more than one dimension space [7], using noise model to synthesise new faces [8] or generating virtual samples or geometric views of the sample image [9]. But the one sample problem has been changed to the multiple sample problem in these cases with increase in computational and storage costs. In addition, virtual samples generated may be highly correlated and can not be considered as independent training samples [10]. In appearance-based approaches, certain features of the image samples are extracted and presented to a classifier or classifying system, which uses a similarity measure (probabilistic
  • 2. Ojo, J.A. & Adeniran, S.A. International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 59 measure, majority voting or linearly weighted summing) to ascertain the identity of the image [11,12]. The accuracy of the performance of these methods depends largely on the features extracted from the image [5]. Gray-value features are credited with the ability to retain texture information, while Gabor and other derived features are more robust against illumination and geometrical changes [13,14]. Since there are many combining classifiers with established high level of accuracy, good performance is expected with combination of appropriate feature selection technique. In this paper, we present a one sample face recognition and verification system, which uses two dimensional discrete Wavelets transform (2D DWT) for feature extraction and one dimensional discrete hidden Markov models (1D DHMM) for classification. 2. PRELIMINARIES Hidden Markov model (HMM) A signal that obeys the Markov process, , (1) can be represented by a HMM, which consists of two interrelated processes; the observable symbol sequence and the underlying, unobservable Markov chain. A brief description of HMM is presented below, while the reader is referred to an extensive description in [15]. HMM is characterized by some elements; a specific number N of states {S}, while transition from one state to another state emits observation vectors and the observation sequence is denoted as . Observable symbols in each state can take any value in the vector , where M is the number of the observable symbols. The probabilities of transition from a state to is expressed as, , (2) and , and The likelihood of emitting a certain observation vector at any state is , while the probability distribution is expressed as, (3) The initial state (prior) distribution , where , (4) are the probabilities of being the first state of sequence. Therefore a short notation for representing a model is, (5) Given a model λ, and observation sequence , the probability of the sequence given the model is . This is calculated using the froward-backward algorithm, , (6)
  • 3. Ojo, J.A. & Adeniran, S.A. International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 60 , (7) where is the forward variable and it is the probability of the partial observation sequence, and state at time t, given the model λ; , (8) is the backward variable and it is is the probability of the partial observation sequence from to the end, given state at time t and λ. . (9) The problem to solve for recognition purpose is to find the best state sequence that gives the maximum likelihood with respect to a given model. The viterbi algorithm is used for solving this problem. It finds the most probable path for each intermediate and finally for the terminating state. The algorithm uses two variables and . , (10) where is the best score or highest probability along a single path, at time t, which accounts for the first t observations and ends in state . (11) helps to keep tract of the “best path” ending in state at time t. Wavelets Wavelet transform uses multi resolution techniques to provide a time-frequency representation of the signal. It can be described as breaking up of a signal into shifted and scaled versions of the “mother” wavelet. Wavelet analysis is done by convolving the signal with wavelet kernels to obtain wavelet coefficients representing the contributions of wavelets in the signal at different scales and orientations [16,17]. Discrete wave transform (DWT) was developed to reduce the computation time and for easy implementation of the wavelet transform. It produces a time-scaled representation of the signal by using digital filtering techniques, the wavelet families. Unlike discrete Fourier transform that can be represented by a convolution equation, DWT comprises transformation kernels or equations that differ in its expansion functions, the nature of the functions (orthogonal or bi-orthogonal basis) and how many resolutions of the functions that are computed. A signal, which passes through the filter bank shown in Figure 2 is decomposed into four lower resolution components: the approximation , horizontal , vertical and diagonal coefficients .
  • 4. Ojo, J.A. & Adeniran, S.A. International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 61 )( 1 h jcD + )( 1 v jcD + )( 1 d jcD + )( 1 h jcA + jcA FIGURE 1: One-dimensional discrete top-to-bottom HMM (Source: MATLAB R2007a Help File) 3. METHOD Modeling a Face and Feature Extraction A One dimensional (1D) discrete top-to-bottom (Bakis) HMM was used to segment each face into states as shown in Figure 2. The algorithm for feature extraction shown in Figure 3 was used to generate the observation vector. Two dimensional Discrete Wavelet Transform (2D DWT) was used to decompose the image into its approximation coefficients, horizontal details, vertical details and the diagonal details. ‘db1’, one of the Daubechies family of wavelets was used for decomposition. The approximation coefficient was coded using 256 gray levels thereby producing a coded (and reduced) form of the original or input image. The “coded” image was divided into sub-images and the overlap between successive sub-images was allowed to be up to 5 pixels less than the total height of the sub-image. FIGURE 2: One-dimensional discrete top-to-bottom HMM To generate the observation vector from each sub-image, the two dimensional sub-images were converted into a vector by extracting the coefficients column-wise. The number of features (NF) selected was varied to see its effect on the recognition ability of the system. The coefficients of the sub-images were stacked to form a vector, therefore a face image was represented by a vector ( ) in length, where Q is the number of states. Figure 4(a) shows the original image from the AT&T database while Figure 4(b) shows the gray-scale of the approximation coefficient of the same image with the sampling strip for segmenting the image into allowable states.
  • 5. Ojo, J.A. & Adeniran, S.A. International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 62 FIGURE 3: Algorithm for feature extraction
  • 6. Ojo, J.A. & Adeniran, S.A. International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 63 FIGURE 4: Grey-scale version of the approximation coefficients of an image and its segmentation into states Training Features extracted from faces of individuals were used to train a model for each face using the algorithm shown in Figure 5. The initial parameter were generated randomly and improved using Baum-Welch re-estimation procedure [15] to get the parameters that optimised the likelihood of the training set observation vectors for the each face. State transition probability (A) is defined as, = 0, (12) = 0, (13) where i.e. the model is not allowed to jump more than a state at a time. Since each face was divided into five sub-images, the resulting matrix is (14) while and the initial state probability (π) is defined as    = ≠ = 1,1 1,0 j j iπ (15) . (16) The maximum number of iteration for the re-estimation is set to 5 or if the error between the initial and present value is less than , then the model is taken to have converged and the model parameters are stored with appropriate class name or number .
  • 7. Ojo, J.A. & Adeniran, S.A. International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 64 Algorithm for Model Re-estimation (n is the maximum number of iteration allowed) k = 1 initialise compute while do estimate if quit else end end FIGURE 5: Algorithm for HMM training Recognition and Verification Given a face to be tested or recognised, feature (observation vector) extraction is first performed as described in section 3.1. Model likelihoods (log-likelihood) for all the models in the training set (given the observation vectors) is calculated and the model with the highest log-likelihood is identified to be the model representing the face. Euclidean measure is used to test if a face is in the training set or database. If the log-likelihood is within the stated distance, the model (face) is recognised to be in the training set or in the database. However, in areas of applications such as access control, it is desired to know the exact identity of an individual, therefore the need to verify the correctness of the faces recognised. For classification or verification, the Viterbi recogniser was used as shown in Figure 6. The test (face) image was converted to an observation sequence and then model likelihoods
  • 8. Ojo, J.A. & Adeniran, S.A. International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 65 are computed for each . The model with highest likelihood reveals the identity of the unknown face. (12) 4. RESULTS AND DISCUSSION The algorithm was implemented in Matlab 7.4a on a HP AMD Turion 64 X2 TL-58, 1.90GHz, 2GB RAM on a Windows operating system. It was tested with a subset of the AT&T (formerly ORL) database [18]. A face image per person was used for training while five other images per person were used for testing, some of which are shown in Figure 7. The recognized images were verified for correctness, 80% correct classification (Hit) occurred while 20% were misclassified. The rest of the images that were not in the training set were used to test the false acceptance rate (FAR) i.e. the ratio of the numbers of images falsely accepted to the total number of images tested and 0.02 FAR occurred. The number of test images per class was reduced to two and 90% Hit, 0.025 FAR occurred as shown in Table 1. Furthermore, the algorithm was tested with ten subjects in the AT&T database. The general observation was that the percentage Hit and FAR were independent of number of subjects in class. For instance, 90% Hit, 0.05 FAR and 90% Hit, 0.05 FAR occurred when five and two test images per class were used respectively. FIGURE 6: Viterbi recogniser Figure 8 shows the effect that the number of features selected per each state (subimage) has on the number of correct classifications (Hit). The results show that 30 features per subimage were sufficient to give the best performance. In addition, the average time for testing a face was approximately 0.15s, which is near real-time. Going by these results, the algorithm is expected to be adequate for implementation in applications where small size database is required [19]. The performance of the algorithm when Two Dimensional Discrete Cosine Transform (2D-DCT) was used for feature extraction is compared with that of the Discrete Wavelet Transform (2D- DWT) and the results are shown in Table 2. The results show that there is a significant improvement in the recognition or classification accuracy when DWT was used for feature extraction.
  • 9. Ojo, J.A. & Adeniran, S.A. International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 66 Test Images Number of class Hit Miss FAR 5 20 80% 20% 0.02 2 20 90% 10% 0.025 5 10 90% 10% 0.04 2 10 90% 10% 0.05 TABLE 1: Classification accuracies achieved for a subset of AT&T database Test Images Number of class Hit DCT DWT 5 20 39% 80% 2 20 50% 90% 5 10 46% 90% 2 10 45% 90% TABLE 2: Classification accuracies for DCT and DWT FIGURE 7: Some of the faces used for testing.
  • 10. Ojo, J.A. & Adeniran, S.A. International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 67 FIGURE 8: Graph of correct classifications against number of features per state 5. CONSLUSION The paper presented a one sample face recognition system. Feature extraction was performed using 2D DWT and 1D top-to-bottom HMM was used for classification. When tested with a subset of the AT&T database, up to 90% correct classification (Hit) and as low as 0.02 FAR were achieved. The high recognition rate and the low FAR achieved shows that the new algorithm is suitable for face recognition problems with small-size database such as access control for personal computers (PCs) and personal digital assistants (PDAs). 6. REFERENCES 1. R. Brunelli and T. Poggio. “Face recognition: Features versus templates”. IEEE Transaction on Pattern Analysis and Machine Intelligence, 15(10):1042-1062, 1993 2. L. Sirovich and M. Kirby. “Low-Dimensional procedure for the characterization of human face”. Journal of the Optical Society of America A, 4(3):519–524, 1987 3. M. Turk and A. Pentland. “Eigenfaces for Recognition”. Journal of Cognitive Neuroscience, 3(1):71-86, 1991 4. S. Lawrence, C.L. Giles, A. Tsoi and A. Back. “Face recognition: A convolutional neural- network approach”. IEEE Transaction on Neural Networks, 8(1):98-113, 1997 5. W. Zhao, R. Chellappa, P.J. Philips and A. Rosenfeld. “Face recognition: A literature survey”. ACM Computing Surveys 35(4):399-458, 2003 6. X. Tan, S. Chen, Z-H Zhou, and F. Zhang. “Face recognition from a single image per person: a survey”. Pattern Recognition 39:1725-1745, 2006 7. J. Wu and Z.-H Zhou. “Face recognition with one training image per person”. Pattern Recognition Letters, 23(2):1711-1719, 2001 8. H.C. Jung, B.W. Hwang and S.W. Lee. “Authenticating corrupted face image based on noise model”. Proceedings of the 6th IEEE International Conference on Automatic Face and Gesture Recognition, 272, 2004 9. F. Frade, De la Torre, R. Gross, S. Baker, and V. Kumar. “Representational oriented component analysis (ROCA) for face recognition with one sample image per training class”. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition 2:266-273, June 2005
  • 11. Ojo, J.A. & Adeniran, S.A. International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 68 10. A.M. Martinez. “Recognizing imprecisely localised, partially occluded, and expression variant faces from a single sample per class”. IEEE Transaction on Pattern Analysis and Machine Intelligence 25(6):748-763, 2002 11. B.S. Manjunath, R. Chellappa and C.V.D. Malsburg. “A feature based approach to face recognition”. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition 1:373-378, 1992 12. M. Lades, J.Vorbruggen, J. Buhmann, J. Lange, von der Malsburg and R. Wurtz. “Distortion invariant object recognition in the dynamic link architecture”. IEEE Transaction on Computers 42(3):300-311, 1993 13. X. Tan, S.C. Chen, Z-H Zhou, and F. Zhang. “Recognising partially occluded, expression variant faces from single training image per person with SOM and soft kNN ensemble”. IEEE Transactions on Neural Networks, 16(4):875-886, 2005 14. H.-S. Le and H. Li. “Recognising frontal face images using Hidden Markov models with one training image per person”. Proceedings of the 17 th International Conference on Pattern Recognition (ICPR04), 1:318-321, 2004 15. L.R. Rabiner. “A tutorial on Hidden Markov models and selected application in speech recognition”. Proceedings of the IEEE, 77(2):257-286, 1989 16. I. Daubechies, “Orthonormal bases of compactly supported wavelets”. Communication on Pure & Applied Mathematics XLI, 41:909-996, 1988 17. I. Daubechies. “Ten lectures on wavelets”. CBMS-NSF Conference series in Applied Mathematics, No-61, SIAM, Philadelphia Pennsylvania, 1992 18. F. Samaria and A. Harter. “Parameterization of a stochastic model for human face identification”. 2nd IEEE Workshop on Applications of Computer Vision, Saratosa FL. pp.138-142, December, 1994 19. J. Roure, and M. Faundez-Zanuy. “Face recognition with small and large size databases”. Proceedings of 39th Annual International Carnahan Conference on Security Technology, pp 153-156, October 2005