Journal for Research | Volume 02 | Issue 06 | August 2016
ISSN: 2395-7549
All rights reserved by www.journalforresearch.org 28
Object Detection and Recognition: A survey
Anjana Mittal, M.E. Scholar, Department of Computer Science & Engineering, SSTC Bhilai (C.G.)
Dr. Sidhhartha Choubey, Associate Professor, Department of Computer Science & Engineering, SSTC Bhilai (C.G.)
Abstract
The efficiency and quality of a feature descriptor are critical to the user experience of many computer vision applications.
However, the existing descriptors are either too computationally expensive to achieve real-time performance, or not sufficiently
distinctive to identify correct matches from a large database with various transformations. In this paper, we propose a highly
efficient and distinctive binary descriptor, called local difference binary (LDB). LDB directly computes a binary string for an
image patch using simple intensity and gradient difference tests on pairwise grid cells within the patch. A multiple-gridding
strategy and a salient bit-selection method are applied to capture the distinct patterns of the patch at different spatial granularities.
Experimental results demonstrate that compared to the existing state-of-the-art binary descriptors, primarily designed for speed,
LDB has similar construction efficiency, while achieving a greater accuracy and faster speed for mobile object recognition and
tracking tasks.
Keywords: Object recognition, binary descriptor, spatial granularity, local difference binary, multiple-gridding
_______________________________________________________________________________________________________
I. INTRODUCTION
Object recognition is a challenging task in the real world [1]. The difficulty is that each object presents a different appearance
to the observer depending on its position, pose and background. The human brain recognizes objects effortlessly, drawing on its
own hypotheses and biological mechanisms. One way to approach the problem is to build a large database of similar images that
covers these variations. Different methods have been developed over the years for recognizing an object from a given set of images.
Some of the basic components for the object recognition are:
1) Base Model: The model provides functional descriptions of the feature information, such as the size, shape and color of the
features. Certain schemes are applied to propose candidate objects for consideration.
2) Feature detector: A feature detector detects object locations, which form the hypotheses. The recognition depends on the
organization of the database and the category of object to be recognized.
3) Hypothesis formation and verification: A verifier or classifier can be used to find the object with the maximum
relevance, refining the likelihood of the correct object.
Object recognition is challenging due to the following factors:
1) Scene Constancy: Conditions such as illumination, background, camera parameters, etc. affect the image of the
object dramatically. As a result, recognition may vary under different conditions for the same object.
2) Dimensionality: An image or video represents three-dimensional objects in two dimensions. Features that are originally
three-dimensional may therefore appear differently when detected in the two-dimensional image space, which complicates
their representation.
3) Number of objects: Feature detection for the objects in an image or video sequence is easier when the number of
objects is small. The effort of selecting appropriate features for object recognition grows with the number of objects in a
scene.
4) Occlusion: Occlusion is a serious problem in an image processing task, as it results in the absence of expected features and
the generation of unexpected features. As the number of objects in an image increases, the probability of occlusion also
increases.
The field of image processing and analysis is very broad, encompassing a wide variety of research issues, from efficient
encoding of images and video sequences, through image enhancement and restoration, to image segmentation, recovering spatial
shape from shading and pattern distortions, learning about 3D from multiple unregistered 2D images or image sequences, and
high-level image understanding topics.
II. LITERATURE SURVEY
In 2012, Reza Oji [2] proposed a method for object recognition and boundary detection based on ASIFT (Affine Scale
Invariant Feature Transform) and region merging. Initially, segmentation is done to partition the image into smaller portions.
The images are simulated with two camera orientations. Assigning the key point descriptor relative to the orientation gives
invariance to rotation. A histogram of local gradients is created to select the scale. The highest peak shows the direction of the local gradient.
The regions in the image that correspond to the object are identified with the Euclidean distance measure: the lower the distance,
the higher the similarity. The merging process is performed after ASIFT. The basic aim of merging is to merge the
non-object regions so that the object is highlighted.
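The Euclidean matching step described above can be illustrated with a minimal sketch. It is not the implementation from [2]; the function name `match_descriptors` and the toy descriptor values are assumptions made only for illustration.

```python
import math


def match_descriptors(query, database):
    """For each query descriptor, return (index of nearest database descriptor, distance)."""
    matches = []
    for q in query:
        # math.dist computes the Euclidean distance between two equal-length vectors.
        distances = [math.dist(q, d) for d in database]
        best = min(range(len(database)), key=distances.__getitem__)
        matches.append((best, distances[best]))
    return matches


# Toy 3-dimensional descriptors (hypothetical values, not real key point descriptors).
database = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (5.0, 5.0, 5.0)]
query = [(0.9, 1.1, 1.0), (4.0, 5.0, 5.0)]
matches = match_descriptors(query, database)
```

A lower distance means a more similar descriptor, so each query is assigned to the database entry with the smallest distance.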
In 2012, V. Subbaroyan and Dr. Selvakumar Raja [3] proposed an approach for recognizing multiple objects by
considering features from the spatial as well as the frequency domain. The input image is converted into a gray scale image. The Sobel
operator is used to find the gradient of the image intensity. The color features of the image are obtained with the help of the
global color histogram. A two-level wavelet transform is applied to obtain the higher-frequency features. The spatial features
are obtained from three color moments for the basic colors. Both feature sets are fused to recognize the object.
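As a rough illustration of two of the features mentioned above, the sketch below computes a Sobel gradient magnitude on a gray scale image and three color moments for one channel. It is not the authors' code from [3]: the function names are hypothetical, images are plain nested lists, and the moments are taken as mean, standard deviation and the cube root of the third central moment, which is one common definition and an assumption here.

```python
import math


def sobel_magnitude(gray):
    """Gradient magnitude of a 2D list of intensities; border pixels are left at 0."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal and vertical Sobel responses on the 3x3 neighborhood.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            out[y][x] = math.hypot(gx, gy)
    return out


def color_moments(channel):
    """Return (mean, standard deviation, skewness) for a flat list of channel values."""
    n = len(channel)
    mean = sum(channel) / n
    variance = sum((v - mean) ** 2 for v in channel) / n
    third = sum((v - mean) ** 3 for v in channel) / n
    # Signed cube root of the third central moment (assumed definition of skewness).
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, math.sqrt(variance), skew


# Toy image with a vertical step edge (hypothetical data).
gray = [[0, 0, 10, 10] for _ in range(4)]
magnitude = sobel_magnitude(gray)
moments = color_moments([0, 0, 3])
```

In a fused descriptor, the moments of each color channel would be concatenated with the frequency-domain features; the fusion rule itself is not specified in the text above and is left out of the sketch.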
Michael Calonder et al. [5] proposed a method in which image patches are classified on the basis of pairwise intensity
comparisons. A Naive Bayesian classifier recognizes the patches from different viewpoints. A clear advantage of binary
descriptors is that the Hamming distance can be used to compare them, which is very fast to compute. The individual bits are
obtained by comparing the intensities of pairs of points. The positions of the pixels are pre-selected randomly according to a
Gaussian distribution around the patch center. Smoothing kernels are applied to the patches before the intensity differences
are computed.
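The binary tests and Hamming comparison described above can be sketched as follows. This is a simplified illustration in the spirit of BRIEF [5], not the reference implementation: the names `sample_pairs`, `brief_descriptor` and `hamming` are hypothetical, the value of `sigma` is an arbitrary assumption, and the patch is assumed to have been smoothed beforehand.

```python
import random


def sample_pairs(n_bits, patch_size, sigma, seed=0):
    """Pre-select n_bits pairs of (row, col) positions, Gaussian around the patch center."""
    rng = random.Random(seed)
    center = (patch_size - 1) / 2.0

    def draw():
        # Draw one coordinate and clamp it so it stays inside the patch.
        value = int(round(rng.gauss(center, sigma)))
        return max(0, min(patch_size - 1, value))

    return [((draw(), draw()), (draw(), draw())) for _ in range(n_bits)]


def brief_descriptor(patch, pairs):
    """Pack the results of the pairwise intensity tests into one integer bit string."""
    bits = 0
    for i, ((y1, x1), (y2, x2)) in enumerate(pairs):
        if patch[y1][x1] < patch[y2][x2]:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of differing bits between two descriptors."""
    return bin(a ^ b).count("1")


# Toy 9x9 patch (hypothetical data) described with 32 binary tests.
pairs = sample_pairs(n_bits=32, patch_size=9, sigma=2.0)
patch = [[(3 * y + 5 * x) % 17 for x in range(9)] for y in range(9)]
descriptor = brief_descriptor(patch, pairs)
```

Because each bit depends only on which of two pixels is brighter, adding a constant to every pixel leaves the descriptor unchanged, and two descriptors are compared with a single XOR followed by a bit count.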
Chiverton et al. [6] proposed a new fully automatic object tracking and segmentation framework. The framework consists of a
motion-based bootstrapping algorithm concurrent to a shape-based active contour. They propose two approaches of increasing
computational intensity and accuracy that statistically estimate the foreground from the potential mixture of foreground and
background enveloped by an alpha hull.
Zhenjun et al. [7] used a combined feature set built from color histogram (HC) bins and gradient orientation histogram
(HOG) bins, covering the color and contour representation of an object, for object detection. The combined feature set is an
evolution of color histograms, edge orientation histograms and SIFT descriptors.
Saravanakumar et al. [8] represented the objects using the properties of the HSV color space. An adaptive k-means clustering
algorithm was applied to cluster the objects, and the centroids' color values and coordinates were passed to the next frame for
clustering. A three-dimensional representation of the HSV color space is a hexacone, with the central vertical axis representing intensity.
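To make the idea of clustering in HSV space concrete, the sketch below converts RGB pixels to HSV with the standard-library `colorsys` module and runs a plain k-means loop. It is only an illustration under stated assumptions: it uses ordinary k-means rather than the adaptive variant of [8], the function names are hypothetical, and hue is treated as a linear value even though it is circular.

```python
import colorsys
import math


def to_hsv(pixels):
    """Convert (R, G, B) tuples in 0..255 to (H, S, V) tuples in 0..1."""
    return [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0) for r, g, b in pixels]


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Two reddish and two bluish pixels (hypothetical data).
hsv = to_hsv([(255, 0, 0), (250, 5, 5), (0, 0, 255), (5, 5, 250)])
centroids, clusters = kmeans(hsv, centroids=[hsv[0], hsv[2]])
```

The centroids returned for one frame could serve as the initial centroids for the next frame, which mirrors the hand-over between frames described above.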
Xiaobai Liu et al. [9] proposed hybrid online templates for object detection, which use different features such as flatness,
texture, or edge/corner. The template consists of multiple types of features, including sketches/edges, texture regions, and
flatness regions. Sketch/edge regions usually consist of various links and ridges, such as corners and junctions. Texture regions
contain a large number of objects that are either too small or too distant from the camera. In contrast, flatness regions are always
filled with homogeneous color or intensity. The limitation of this method is that, as the discriminative power of the features
changes along with the object movements, the hybrid template should be adaptively updated by either adjusting the feature
confidences or substituting the old features with newly discovered ones from the currently observed frames.
Liu et al. [10] presented a novel semiautomatic segmentation method for single video object extraction. The proposed method
formulates the separation of the video objects from the background as a classification problem. Each frame is divided into
small blocks of uniform size, which are called object blocks if their center pixels belong to the object, or background blocks
otherwise. After a manual segmentation of the first frame, the blocks of this frame are used as the training samples for the
object-background classifier.
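The block labeling rule described above can be sketched as follows. This covers only the generation of training labels from a manually segmented first frame, not the classifier of [10]; the function name `label_blocks` and the toy mask are assumptions.

```python
def label_blocks(mask, block):
    """Label each block x block tile True (object) if its center pixel is in the mask."""
    h, w = len(mask), len(mask[0])
    labels = []
    for top in range(0, h - block + 1, block):
        row = []
        for left in range(0, w - block + 1, block):
            # The block takes the label of its center pixel.
            cy, cx = top + block // 2, left + block // 2
            row.append(bool(mask[cy][cx]))
        labels.append(row)
    return labels


# 6x6 binary mask with the object in the top-right corner (hypothetical data).
mask = [[1 if (y < 3 and x >= 3) else 0 for x in range(6)] for y in range(6)]
labels = label_blocks(mask, block=3)
```

Each labeled block, together with its pixel values, would then form one training sample for the object-background classifier.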
Johnsen et al. [11] used an approximated median filter to perform background modeling. In their implementation, better results
were obtained by scaling the increment and decrement by a step factor if the absolute difference between the current pixel and
the median-modeled background pixel is larger than a threshold.
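A minimal sketch of such an approximated median update is given below. It is not the code of [11]: the function name and the default values of `threshold` and `step_factor` are assumptions chosen only to make the example concrete.

```python
def update_background(background, frame, threshold=20, step_factor=4):
    """One approximated-median update of a background model stored as a 2D list."""
    updated = []
    for bg_row, frame_row in zip(background, frame):
        new_row = []
        for bg, pixel in zip(bg_row, frame_row):
            diff = pixel - bg
            # Take a larger step when the pixel is far from the background estimate.
            step = step_factor if abs(diff) > threshold else 1
            step = min(step, abs(diff))  # never overshoot the current pixel value
            if diff > 0:
                bg += step
            elif diff < 0:
                bg -= step
            new_row.append(bg)
        updated.append(new_row)
    return updated


# Toy single-row background and frame (hypothetical data).
background = [[100, 100, 100, 100]]
frame = [[100, 105, 200, 50]]
background = update_background(background, frame)
```

Repeating the update over many frames moves each background pixel toward the median of the values it has seen, while the scaled step shortens the time needed to adapt to large changes.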
Victoria Yanulevskaya et al. [12] proposed an object-based visual attention theory for the task of salient object detection. They
assume the proto-object to be the unit of attention and argue that the notion of an object should be taken into account when
assessing object saliency.
Carlos Cuevas and Narciso García [13] proposed a background modeling algorithm to detect moving objects in real time. The
proposed method combines a background model and a foreground model to achieve high-quality detection of objects in complex
scenes taken by non-completely static cameras. It finds the bandwidth matrices for the kernels that are used in background
modeling, and it updates the background model to reduce misdetections.
Bahadir Karasulu and Serdar Korukoglu [13] proposed moving object detection and tracking in videos using an annealed
background subtraction method. The current frame is subtracted from the background image, and each pixel is classified as either
foreground or background by comparing the difference with a threshold. The simulated annealing (SA) technique is used to solve
the p-median problem. The total weighted distance between demand points (nodes) and the facilities closest to the demand points
is used to find the number p. An SA-based hybrid method is developed for performance optimization of background subtraction,
which is used to detect and track object(s) in videos.
Ling Cai et al. [14] proposed a stereo vision-based model for multi-object detection and tracking in surveillance.
Illumination variation, shadow interference, and object occlusion problems are overcome by using the stereo model. Feature points
are identified and then projected onto the 2D ground plane. A kernel-based clustering algorithm is used to group the projected
points according to their height values and locations on the plane.
Bangjun Lei and Li-Qun Xu [15] proposed a real-time video analysis system for detecting and tracking objects in a wide range of
outdoor surveillance and monitoring scenarios. An adaptive background modeling technique is used to extract the
foreground regions. Blob analysis is used for object tracking. It gives better results for non-crowded, static scenes.
Feature Descriptor:
Feature point descriptors are widely used in many computer vision tasks such as markerless augmented reality and simultaneous
localization and mapping. Some available descriptors are:
The SIFT descriptor, proposed over a decade ago, has been widely adopted as one of the highest quality options. However, it
imposes a heavy computational burden. This drawback has prompted extensive efforts to optimize its speed without compromising
too much quality.
Among these efforts, SURF is arguably the most notable. But recent experiments have shown that the SURF descriptor is still too
computationally heavy. Thus only a limited number of points can be handled for real-time applications such as AR, especially
on handheld devices such as smart phones and tablets.
BRISK is a method for key point detection, description and matching. Its speed comes from a novel scale-space FAST-based
detector combined with a bit-string descriptor that is assembled from intensity comparisons obtained by dedicated sampling of
each key point neighborhood.
The BRIEF descriptor is proposed as an efficient feature point descriptor that uses binary strings. BRIEF uses simple intensity
difference tests for computation and is highly discriminative even when using relatively few bits. The descriptor is much faster
than other state-of-the-art ones in both construction and matching, and it also yields higher recognition rates.
The ORB descriptor is built on the very fast binary BRIEF descriptor and is resistant to noise and invariant to
rotation. ORB is about two orders of magnitude faster than SIFT, which makes it an efficient alternative to SIFT or SURF.
Its efficiency has been tested on a smart phone in several real-world applications, including patch tracking and object
detection.
FREAK is a retina-inspired, coarse-to-fine descriptor designed to improve the performance of current image descriptors. It
performs better than other state-of-the-art key point descriptors while remaining faster to compute with a lower memory load, and
it is also more robust than SIFT, SURF or BRISK.
Lightweight binary descriptors such as BRISK, FREAK, BRIEF and its variant rBRIEF (or ORB descriptor) have become
increasingly popular as they are very efficient to store and to match. However, they use only the raw intensities of a subset of
pixels within an image patch for binary tests, and thus have low discriminative ability. Lack of distinctiveness incurs an
enormous number of false matches when matching against a large database. The basic problems with existing descriptors are
their high computational complexity and the cost of matching and storing the feature point descriptors.
III. PROPOSED METHODOLOGY
The method is a two-step process:
1) Smoothing: The image is smoothed and its boundaries are obtained. This step yields the homogeneous points
and corner points of the image, from which the feature vectors are selected.
2) Object Recognition: Object recognition is performed on the set of features obtained after smoothing the
image, using the Local Difference Binary (LDB) descriptor. The image is divided into small blocks and information is
extracted from the pixels of each block. This information is the average of the pixel intensity values, which is inexpensive
to compute and keeps the process efficient. The first-order gradient is then obtained from the intensities to capture the
variation in the image. The block division needs to be fine, as finer granularity leads to more distinctive recognition. The
flow of the proposed methodology is shown in Fig. 1.
Fig. 1: Flow Chart of proposed methodology.
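The block-based description step can be sketched as follows. This is a simplified illustration of the LDB idea from [1] under stated assumptions, not the authors' implementation: the function names are hypothetical, the patch is a square nested list whose side is at least twice the grid size, the gradients are approximated by differences between the means of the two halves of a cell, and smoothing, multiple gridding and salient bit selection are left out.

```python
from itertools import combinations


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def cell_features(patch, grid):
    """Return (average intensity, dx, dy) for every cell of a grid x grid division."""
    step = len(patch) // grid
    half = step // 2
    features = []
    for gy in range(grid):
        for gx in range(grid):
            y0, x0 = gy * step, gx * step
            cell = [row[x0:x0 + step] for row in patch[y0:y0 + step]]
            average = mean(v for row in cell for v in row)
            # First-order gradients: right half minus left half, bottom half minus top half.
            dx = (mean(v for row in cell for v in row[step - half:])
                  - mean(v for row in cell for v in row[:half]))
            dy = (mean(v for row in cell[step - half:] for v in row)
                  - mean(v for row in cell[:half] for v in row))
            features.append((average, dx, dy))
    return features


def ldb_bits(patch, grid=3):
    """Three binary tests (intensity, dx, dy) for every pair of grid cells."""
    bits = []
    for a, b in combinations(cell_features(patch, grid), 2):
        bits.extend(1 if fa > fb else 0 for fa, fb in zip(a, b))
    return bits


# 12x12 horizontal intensity ramp (hypothetical data) on a 3x3 grid.
patch = [[x for x in range(12)] for _ in range(12)]
bits = ldb_bits(patch, grid=3)
```

With a 3x3 grid there are 9 cells and 36 cell pairs, so the sketch produces 108 bits per patch; a finer grid yields more bits and captures finer spatial detail.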
IV. CONCLUSION
In this paper, we tested only some images taken from the standard PASCAL VOC dataset. We first recognized the object with
the existing bounding box method. The bounding box method recognizes the whole object, although the whole object is not always
needed. The proposed work does not use the bounding box method: we first apply a gridding method with a 3×3 grid
and then use an edge detection method to find the edges of the particular object only, which is our
main recognition target. The proposed method thus performs recognition on the extracted boundary of the object, with
gridding for better recognition. In future work, we will test our algorithm on more images and compare our work with
existing work.
REFERENCES
[1] Xin Yang, Kwang-Ting (Tim) Cheng, "Local Difference Binary for Ultrafast and Distinctive Feature Description", IEEE Transactions on Pattern Analysis
and Machine Intelligence, Vol. 36, No. 1, January 2014.
[2] Reza Oji, "An Automatic Algorithm for Object Recognition and Detection Based on ASIFT Key points", Signal & Image Processing: An International
Journal (SIPIJ), Vol. 3, No. 5, pp. 29-39, 2012.
[3] V. Subbaroyan, Dr. Selvakumar Raja, "Novel Feature Fusion Method of Object Recognition Using Wavelet Transform", International Journal of Scientific &
Engineering Research, Vol. 3, Issue 9, 2012.10.
[4] Shulin Yang, Liefeng Bo, Jue Wang and Linda Shapiro, "Unsupervised Template Learning for Fine-Grained Object Recognition", Advances in Neural
Information Processing Systems, pp. 3122-3130, 2012.
[5] Michael Calonder, Vincent Lepetit, Christoph Strecha, and Pascal Fua, "BRIEF: Binary Robust Independent Elementary Features", ECCV'10 Proceedings of
the 11th European Conference on Computer Vision: Part IV, pp. 778-792, 2010.
[6] John Chiverton, Xianghua Xie and Majid Mirmehdi, "Automatic Bootstrapping and Tracking of Object Contours", IEEE Transactions on Image
Processing, Vol. 21, No. 3, March 2012.
[7] Zhenjun Han, Qixiang Ye, Jianbin Jiao, "Online feature evaluation for object tracking using Kalman filter", 978-1-4244-2175-6/08/$25.00 ©2008 IEEE.
[Fig. 1 flow chart steps: Input the image; Perform the smoothing and obtain the boundaries; Divide the image into n blocks; Find the average intensity of the pixel values; Perform first-order gradient; Perform recognition through matching; Output the recognized object]
[8] S. Saravanakumar, A. Vadivel and C.G. Saneem Ahmed, "Human object tracking in video sequences", ICTACT Journal on Image and Video Processing,
Vol. 2, Issue 1, August 2011.
[9] Xiaobai Liu, Liang Lin, Shuicheng Yan, Hai Jin, and Wenbin Jiang, "Adaptive Object Tracking by Learning Hybrid Template Online", IEEE Transactions
on Circuits and Systems for Video Technology, Vol. 21, No. 11, November 2011.
[10] Yi Liu, Yuan F. Zheng, "Video Object Segmentation and Tracking Using ψ-Learning Classification", IEEE Transactions on Circuits and Systems for Video
Technology, Vol. 15, No. 7, July 2005.
[11] Swantje Johnsen and Ashley Tews, "Real-Time Object Tracking and Classification Using a Static Camera", Proceedings of the IEEE ICRA 2009 Workshop
on People Detection and Tracking, Kobe, Japan, May 2009.
[12] Victoria Yanulevskaya, Jasper Uijlings, Jan-Mark Geusebroek, "Salient object detection: From pixels to segments", Image and Vision Computing, Vol. 31,
pp. 31-42, 2013.
[13] Carlos Cuevas and Narciso García, "Improved background modeling for real-time spatio-temporal nonparametric moving object detection strategies",
Image and Vision Computing, Vol. 31, pp. 616-630, 2013.
[14] Ling Cai et al., "Multi-object detection and tracking by stereo vision", Pattern Recognition, Vol. 43, pp. 4028-4041, 2010.
[15] Bangjun Lei and Li-Qun Xu, "Real-time outdoor video surveillance with robust foreground extraction and object tracking via multi-state transition
management", Pattern Recognition Letters, Vol. 27, pp. 1816-1825, 2006.

  • 1. Journal for Research | Volume 02 | Issue 06 | August 2016 ISSN: 2395-7549 All rights reserved by www.journalforresearch.org 28 Object Detection and Recognition: A survey Anjana Mittal Dr. Sidhhartha Choubey M.E. Scholar Associate Professor Department of Computer Science & Engineering Department of Computer Science & Engineering SSTC Bhilai (C.G.) SSTC Bhilai (C.G.) Abstract The efficiency and quality of a feature descriptor are critical to the user experience of many computer vision applications. However, the existing descriptors are either too computationally expensive to achieve real-time performance, or not sufficiently distinctive to identify correct matches from a large database with various transformations. In this paper, we propose a highly efficient and distinctive binary descriptor, called local difference binary (LDB). LDB directly computes a binary string for an image patch using simple intensity and gradient difference tests on pair wise grid cells within the patch. A multiple-gridding strategy and a salient bit-selection method are applied to capture the distinct patterns of the patch at different spatial granularities. Experimental results demonstrate that compared to the existing state-of-the-art binary descriptors, primarily designed for speed, LDB has similar construction efficiency, while achieving a greater accuracy and faster speed for mobile object recognition and tracking tasks. Keywords: Object recognition, binary descriptor, spatial granularity, local difference binary and multiple - gridding _______________________________________________________________________________________________________ I. INTRODUCTION The object recognition is a challenging task in the real world [1]. The problem is that each object makes a different perspective to the human mind with the position of the object, pose and background. A human brain can effortlessly by its own hypothesis and biological inspirations. 
One way to address the problem is to build a large database of similar images covering these variations. Different methods have been developed over the years for recognizing an object from a given set of images. Some of the basic components of object recognition are:
1) Base model: The model provides functional descriptions of the feature information, such as the size, shape and color of the features. Certain schemes are applied to narrow down the candidate objects for consideration.
2) Feature detector: A feature detector locates the features that help form the object hypotheses. Recognition depends on the organization of the database and the category of object to be recognized.
3) Hypothesis formation and verification: A verifier or classifier is used to find the object with the maximum relevance, refining the likelihood of the correct object.
Object recognition is a challenging task due to the following factors:
1) Scene constancy: Conditions such as illumination, background and camera parameters affect the image of the object dramatically. As a result, the recognition outcome may vary across conditions for the same object.
2) Dimensionality: An image or video represents three-dimensional objects in two dimensions. Features that are originally three-dimensional may therefore appear differently in the two-dimensional image space, which complicates their representation.
3) Number of objects: Feature detection in an image or video sequence is easier when the number of objects is small. The effort of selecting appropriate features for recognition grows with the number of objects in the scene.
4) Occlusion: Occlusion is a serious problem in image processing, as it results in the absence of expected features and the generation of unexpected features. As the number of objects in an image increases, the probability of occlusion also increases.
The field of image processing and analysis is very broad. It encompasses a wide variety of research issues, from efficient encoding of images and video sequences, through image enhancement and restoration, to image segmentation, recovering spatial shape from shading and pattern distortions, learning about 3D from multiple unregistered 2D images or image sequences, and high-level image understanding.

II. LITERATURE SURVEY
In 2012, Reza Oji [2] proposed a method for object recognition with boundary detection based on ASIFT (Affine Scale Invariant Feature Transform) and region merging. Initially, segmentation is performed to partition the image into smaller portions. Views of the image are simulated by varying two camera orientation parameters. Expressing the keypoint descriptor relative to its orientation gives invariance to rotation. A histogram of local gradient directions is built at the selected scale, and its highest peak gives the direction of the local gradient.
  • 2. Object Detection and Recognition: A survey (J4R/ Volume 02 / Issue 06 / 006)
The regions in the image that correspond to the object are identified with the Euclidean distance measure: the lower the distance, the higher the similarity. The merging process is carried out after ASIFT is performed. The basic aim of merging is to merge the non-object regions so that the object is highlighted.
In 2012, V. Subbaroyan and Dr. Selvakumar Raja [3] proposed an approach for recognizing multiple objects by considering features from the spatial as well as the frequency domain. The input image is converted into a gray-scale image. The Sobel operator is used to find the gradient of the image intensity. The color features of the image are obtained with the help of the global color histogram. A two-level wavelet transform is applied to obtain the high-frequency features. The spatial features are obtained from three color moments of the basic colors. Both sets of features are fused to obtain the recognized object.
Michael Calonder et al. [5] proposed a method in which image patches are classified on the basis of pairwise intensity comparisons. A Naive Bayesian classifier recognizes the patches from different viewpoints. A clear advantage of binary descriptors is that they can be compared using the Hamming distance, which is very fast to compute. Each individual bit is obtained by comparing the intensities of a pair of points. The positions of the pixels are pre-selected randomly according to a Gaussian distribution around the patch center. Smoothing kernels are applied to the patches before the intensity differences of the spatial pairs are computed.
Chiverton et al. [6] proposed a new fully automatic object tracking and segmentation framework. The framework consists of a motion-based bootstrapping algorithm running concurrently with a shape-based active contour.
They propose two approaches of increasing computational intensity and accuracy that statistically estimate the foreground from the potential mixture of foreground and background enveloped by an alpha hull.
Zhenjun et al. [7] used a combined feature set built from color histogram (HC) bins and gradient orientation histogram (HOG) bins, which captures the color and contour representation of an object for object detection. The combined feature set is an evolution of color histograms, edge orientation histograms and SIFT descriptors.
S. Saravanakumar et al. [8] represented the objects using the properties of the HSV color space. An adaptive k-means clustering algorithm was applied to cluster the objects, and the color values and coordinates of the centroids were passed to the next frame for clustering. A three-dimensional representation of the HSV color space is a hexcone, with the central vertical axis representing intensity.
Xiaobai Liu et al. [9] proposed hybrid online templates for object detection, which use different features such as flatness, texture, or edge/corner. The template consists of multiple types of features, including sketches/edges, texture regions, and flatness regions. Sketch/edge regions usually consist of various links and ridges, such as corners and junctions. Texture regions contain a large number of objects that are either too small or too distant from the camera. In contrast, flatness regions are always filled with homogeneous color or intensity. The limitation of this method is that, as the discriminative power of the features changes with the object's movements, the hybrid template has to be adaptively updated, either by adjusting the feature confidences or by substituting the old features with newly discovered ones from the currently observed frames.
Liu et al. [10] presented a novel semiautomatic segmentation method for single video object extraction. The proposed method formulates the separation of the video objects from the background as a classification problem.
Each frame was divided into small blocks of uniform size, which are called object blocks if their center pixels belong to the object, and background blocks otherwise. After a manual segmentation of the first frame, the blocks of this frame were used as the training samples for the object-background classifier.
Johnsen et al. [11] used an approximated median filter to perform background modeling. In their implementation, better results were obtained by scaling the increment and decrement by a step factor when the absolute difference between the current pixel and the median-modeled background pixel is larger than a threshold.
Victoria Yanulevskaya et al. [12] proposed an object-based visual attention theory for the task of salient object detection. They assume that a proto-object is the unit of attention and argue that the notion of an object should be taken into account when assessing object saliency.
Carlos Cuevas and Narciso García [13] proposed a background modeling algorithm to detect moving objects in real time. The method combines a background model and a foreground model to detect objects with high quality in complex scenes taken by non-completely static cameras. It finds the bandwidth matrices for the kernels used in background modeling, and it updates the background model to reduce misdetections.
Bahadir Karasulu and Serdar Korukoglu [13] proposed moving object detection and tracking in videos using an annealed background subtraction method. The current frame is subtracted from the background image, and each pixel is classified as foreground or background by comparing the difference with a threshold. The simulated annealing (SA) technique is used to solve the p-median problem, in which the total weighted distance between the demand points (nodes) and their closest facilities is used to find the p facilities. An SA-based hybrid method is developed to optimize the performance of the background subtraction that is used to detect and track objects in videos.
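The approximated median background model and the threshold test used in background subtraction, both summarized above, can be illustrated with a short sketch. This is a minimal illustration and not the implementation of [11] or of the annealed method: the function names, the list-of-lists image representation, and the default values of `base_step`, `step_factor` and `threshold` are all assumptions made for the example.

```python
def update_background(background, frame, base_step=1, step_factor=4, threshold=25):
    """One approximated-median update of the background model.

    Each background pixel is nudged toward the current frame. The step is
    scaled by step_factor when the absolute difference exceeds threshold
    (all three parameter values are illustrative assumptions).
    """
    updated = []
    for bg_row, fr_row in zip(background, frame):
        new_row = []
        for bg, px in zip(bg_row, fr_row):
            diff = px - bg
            step = base_step * step_factor if abs(diff) > threshold else base_step
            if diff > 0:
                new_row.append(bg + step)   # frame is brighter: increment
            elif diff < 0:
                new_row.append(bg - step)   # frame is darker: decrement
            else:
                new_row.append(bg)          # already equal: leave unchanged
        updated.append(new_row)
    return updated


def foreground_mask(background, frame, threshold=25):
    """Label a pixel 1 (foreground) if it differs from the background by more than threshold."""
    return [
        [1 if abs(px - bg) > threshold else 0 for bg, px in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


# Tiny 2x2 gray-scale example (hypothetical values).
background = [[100, 100], [100, 100]]
frame = [[100, 180], [98, 100]]
mask = foreground_mask(background, frame)
background = update_background(background, frame)
```

Running the update once per frame lets the model drift toward the per-pixel median of the sequence without storing past frames, which is the usual reason for preferring this approximation in real-time settings.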
Ling Cai et al. [14] proposed a stereo vision-based model for multi-object detection and tracking in surveillance. The stereo model overcomes the problems of illumination variation, shadow interference, and object occlusion. The feature points are identified and projected onto a 2D ground plane. A kernel-based clustering algorithm is then used to group the projected points according to their height values and locations on the plane.
Bangjun Lei and Li-Qun Xu [15] proposed a real-time video analysis system for detecting and tracking objects in a wide range of outdoor surveillance and monitoring scenarios. An adaptive background modeling technique is used to extract the foreground regions, and blob analysis is used for object tracking. The system gives better results for non-crowded, static scenes.
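To make the blob analysis step mentioned above more concrete, the following sketch extracts blobs from a binary foreground mask with a generic connected-component search. It is only an assumed, simplified stand-in and not the method of [15]: the function name `find_blobs`, the 4-connectivity choice, and the reported properties (area, bounding box, centroid) are illustrative.

```python
from collections import deque


def find_blobs(mask):
    """Group foreground pixels (value 1) into 4-connected blobs.

    Returns a list of dicts with the area, bounding box
    (min_row, min_col, max_row, max_col) and centroid (row, col) of each blob.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search over the connected foreground region.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            blobs.append({
                "area": len(pixels),
                "bbox": (min(ys), min(xs), max(ys), max(xs)),
                "centroid": (sum(ys) / len(ys), sum(xs) / len(xs)),
            })
    return blobs


# Hypothetical 3x4 foreground mask containing two separate blobs.
example_mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
]
example_blobs = find_blobs(example_mask)
```

A tracker could then associate blobs across frames, for example by matching centroids, although the association strategy used in [15] is not described in this survey.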
  • 3. Feature Descriptor:
Feature point descriptors are widely used in many computer vision tasks, such as markerless augmented reality and simultaneous localization and mapping. Some of the available descriptors are:
The SIFT descriptor, proposed over a decade ago, has been widely adopted as one of the highest-quality options. However, it imposes a heavy computational burden. This drawback has prompted extensive efforts to optimize its speed without compromising too much quality.
SURF is arguably the most notable of these efforts. However, recent experiments have shown that the SURF descriptor is still too computationally heavy, so only a limited number of points can be handled in real-time applications such as AR, especially on handheld devices such as smartphones and tablets.
The BRISK descriptor is a method for keypoint detection, description and matching. Its speed comes from a novel scale-space FAST-based detector combined with a bit-string descriptor assembled from intensity comparisons obtained by dedicated sampling of each keypoint neighborhood.
The BRIEF descriptor is proposed as an efficient feature point descriptor that uses binary strings. BRIEF is computed with simple intensity difference tests and is highly discriminative even when using relatively few bits. It is much faster than other state-of-the-art descriptors in both construction and matching, and it also yields higher recognition rates.
The ORB descriptor builds on the very fast binary BRIEF descriptor and is resistant to noise and invariant to rotation. ORB is two orders of magnitude faster than SIFT, which makes it an efficient alternative to SIFT or SURF. Its efficiency has been tested on a smartphone for patch tracking and object detection in several real-world applications.
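The binary intensity tests and Hamming-distance matching that BRIEF-style descriptors rely on can be sketched as follows. This is a simplified illustration under stated assumptions, not the reference BRIEF or ORB implementation: the helper names, the fixed seed, and the Gaussian spread of one fifth of the patch size are choices made for the example, and the smoothing step applied before the tests is omitted.

```python
import random


def sample_pairs(patch_size, n_bits, seed=0):
    """Pre-select n_bits pixel pairs from a Gaussian around the patch center.

    The spread (patch_size / 5) and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    center = (patch_size - 1) / 2.0
    sigma = patch_size / 5.0

    def draw():
        # Sample one coordinate and clamp it to the patch bounds.
        value = int(round(rng.gauss(center, sigma)))
        return min(max(value, 0), patch_size - 1)

    return [((draw(), draw()), (draw(), draw())) for _ in range(n_bits)]


def brief_descriptor(patch, pairs):
    """Build the bit string: one bit per pair, set when the first pixel is darker."""
    bits = 0
    for (r1, c1), (r2, c2) in pairs:
        bits = (bits << 1) | (1 if patch[r1][c1] < patch[r2][c2] else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two descriptors stored as integers."""
    return bin(a ^ b).count("1")


# Hypothetical 16x16 patch with random intensities, described with 64 bits.
rng = random.Random(1)
patch = [[rng.randint(0, 200) for _ in range(16)] for _ in range(16)]
pairs = sample_pairs(16, 64)
descriptor = brief_descriptor(patch, pairs)
```

Because each bit depends only on which of two pixels is brighter, adding a constant offset to the whole patch leaves the descriptor unchanged, and matching reduces to an XOR followed by a bit count.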
FREAK is a retina-inspired, coarse-to-fine descriptor designed to improve the performance of current image descriptors. It performs better than other state-of-the-art keypoint descriptors while remaining faster to compute with a lower memory load, and it is also more robust than SIFT, SURF or BRISK.
Lightweight binary descriptors such as BRISK, FREAK, BRIEF and its variant rBRIEF (the ORB descriptor) have become increasingly popular because they are very efficient to store and to match. However, they use only the raw intensities of a subset of pixels within an image patch for their binary tests, and thus have low discriminative ability. This lack of distinctiveness incurs an enormous number of false matches when matching against a large database. The basic problem with existing descriptors is the high cost of computing, matching and storing the feature point descriptors.

III. PROPOSED METHODOLOGY
The method is a two-step process:
1) Smoothing: The image is smoothed at the corners to obtain the boundary values of the image. This step obtains the homogeneous points and corner points of the image and selects the feature vectors.
2) Object Recognition: Object recognition is performed on the set of features obtained after smoothing the image.
Local Difference Binary (LDB) is used for the object recognition. The image is divided into small blocks and the information from each pixel is extracted. This information, the average of the pixel intensity values, determines the efficiency of the process. The first-order gradient is obtained from the average intensities and captures the variation in the image. The block division needs to be fine, since finer granularity leads to more distinctive recognition. For the flow of the proposed methodology, refer to Fig. 1.
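The block-wise average intensity and first-order gradient tests described above can be sketched as follows. This is a minimal, assumed illustration in the spirit of LDB [1], not the authors' implementation: the function names are hypothetical, a single grid is used instead of multiple gridding, no salient bit selection is performed, and the gradients are approximated by differences between the two halves of each block.

```python
from itertools import combinations


def cell_features(patch, top, left, size):
    """Return (average intensity, x-gradient, y-gradient) of one square block.

    Gradients are approximated as right-half minus left-half and
    bottom-half minus top-half intensity sums (an illustrative choice;
    size is assumed to be even).
    """
    half = size // 2
    total = left_sum = top_sum = 0
    for r in range(size):
        for c in range(size):
            value = patch[top + r][left + c]
            total += value
            if c < half:
                left_sum += value
            if r < half:
                top_sum += value
    average = total / (size * size)
    grad_x = (total - left_sum) - left_sum
    grad_y = (total - top_sum) - top_sum
    return average, grad_x, grad_y


def ldb_bits(patch, grid=3):
    """Binary string from pairwise block tests on intensity and gradients."""
    cell = len(patch) // grid
    features = [
        cell_features(patch, gr * cell, gc * cell, cell)
        for gr in range(grid)
        for gc in range(grid)
    ]
    bits = []
    for fi, fj in combinations(features, 2):
        # Three tests per block pair: average intensity, x-gradient, y-gradient.
        bits.extend(1 if a > b else 0 for a, b in zip(fi, fj))
    return bits


# Hypothetical 6x6 patch with a horizontal intensity ramp, 3x3 gridding.
ramp = [[c * 10 for c in range(6)] for _ in range(6)]
ramp_bits = ldb_bits(ramp)
```

With a 3x3 grid there are 36 block pairs and three tests per pair, so the sketch yields 108 bits. A finer grid gives more bits and finer spatial granularity at a higher cost.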
  • 4. Fig. 1: Flow Chart of proposed methodology.
[Flowchart steps: 1. Input the image; 2. Perform the smoothing and obtain the boundaries; 3. Divide the image into n blocks; 4. Find the average intensity of the pixel values; 5. Perform first-order gradient; 6. Perform recognition through matching; 7. Output the recognized object.]

IV. CONCLUSION
In this paper, we tested only a few images taken from the standard PASCAL VOC dataset. We first recognized the objects with the existing bounding-box method, which recognizes the whole object even when the whole object is not needed. The proposed work does not rely on the bounding-box method. Instead, the image is first divided into a 3×3 grid, and edge detection is then applied to find the edges of the particular object only, which is the main target of recognition. The proposed method thus performs recognition on the extracted boundary of the object, using gridding for better recognition. In future work, we will test our algorithm on more images and compare it with the existing work.

REFERENCES
[1] Xin Yang, Kwang-Ting (Tim) Cheng, "Local Difference Binary for Ultrafast and Distinctive Feature Description", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 36, No. 1, January 2014.
[2] Reza Oji, "An Automatic Algorithm for Object Recognition and Detection Based on ASIFT Keypoints", Signal & Image Processing: An International Journal (SIPIJ), Vol. 3, No. 5, pp. 29-39, 2012.
[3] V. Subbaroyan, Dr. Selvakumar Raja, "Novel Feature Fusion Method of Object Recognition Using Wavelet Transform", International Journal of Scientific & Engineering Research, Vol. 3, Issue 9, 2012.10.
[4] Shulin Yang, Liefeng Bo, Jue Wang and Linda Shapiro, "Unsupervised Template Learning for Fine-Grained Object Recognition", Advances in Neural Information Processing Systems, pp. 3122-3130, 2012.
[5] Michael Calonder, Vincent Lepetit, Christoph Strecha, and Pascal Fua, "BRIEF: Binary Robust Independent Elementary Features", ECCV'10 Proceedings of the 11th European Conference on Computer Vision: Part IV, pp. 778-792, 2010.
[6] John Chiverton, Xianghua Xie and Majid Mirmehdi, "Automatic Bootstrapping and Tracking of Object Contours", IEEE Transactions on Image Processing, Vol. 21, No. 3, March 2012.
[7] Zhenjun Han, Qixiang Ye, Jianbin Jiao, "Online feature evaluation for object tracking using Kalman filter", 978-1-4244-2175-6/08/$25.00 ©2008 IEEE.
  • 5. [8] S. Saravanakumar, A. Vadivel and C.G. Saneem Ahmed, "Human object tracking in video sequences", ICTACT Journal on Image and Video Processing, Vol. 02, Issue 01, August 2011.
[9] Xiaobai Liu, Liang Lin, Shuicheng Yan, Hai Jin, and Wenbin Jiang, "Adaptive Object Tracking by Learning Hybrid Template Online", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 21, No. 11, November 2011.
[10] Yi Liu, Yuan F. Zheng, "Video Object Segmentation and Tracking Using ψ-Learning Classification", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 15, No. 7, July 2005.
[11] Swantje Johnsen and Ashley Tews, "Real-Time Object Tracking and Classification Using a Static Camera", Proceedings of the IEEE ICRA 2009 Workshop on People Detection and Tracking, Kobe, Japan, May 2009.
[12] Victoria Yanulevskaya, Jasper Uijlings, Jan-Mark Geusebroek, "Salient object detection: From pixels to segments", Image and Vision Computing, Vol. 31, pp. 31–42, 2013.
[13] Carlos Cuevas and Narciso García, "Improved background modeling for real-time spatio-temporal nonparametric moving object detection strategies", Image and Vision Computing, Vol. 31, pp. 616–630, 2013.
[14] Ling Cai et al., "Multi-object detection and tracking by stereo vision", Pattern Recognition, Vol. 43, pp. 4028–4041, 2010.
[15] Bangjun Lei and Li-Qun Xu, "Real-time outdoor video surveillance with robust foreground extraction and object tracking via multi-state transition management", Pattern Recognition Letters, Vol. 27, pp. 1816–1825, 2006.