--- Technical Paper on ‘Visual Search’ by Group C6 of B.Tech. (CSE) for Minor Project, November 2008 ---




                                   VISUAL SEARCH
                 Lov Loothra, Ashish Goel, Prateek and Shikha Vashistha
           Department of Information Technology and Computer Science Engineering
                   Amity School of Engineering and Technology, Bijwasan

Abstract – This paper describes the implementation of an application which accepts an image as input from the user and finds images that are similar to it in a specified directory. Similar images may be defined as images that bear an exact (pixel-to-pixel) resemblance to the query image, or images that depict some likeness to the query image in terms of their intensities (color), overall shape (texture), or a combination of these two factors. The application also aims to index, or sort, the images of the database in order of their similarity to the query image, i.e., from the most similar to the least similar image.

Index Terms – edge detection, Hausdorff distance, image codification, image comparison, image indexing, image similarity

1. INTRODUCTION
As of now, almost all popular search engines are text or tag based, i.e., they search for a web page, an image, a video, etc. on the basis of keywords used to describe or store them. This provides extremely accurate and practical results when we want to search for a particular topic or for information contained in a web page. But the same method usually leads to somewhat inaccurate results when we are specifically searching for images, videos or related media, for the simple reason that one person's description may not be accurate enough to cover all keywords.

Instead, if we use an image itself as the search 'keyword' and check for images that are similar to it, we are bound to get more accurate results. This is especially useful when the user knows what he wants to obtain as a result of the search: it could be an image similar to the one he inputs, an image of higher quality (better resolution), or an image that 'contains' the image he has input.

2. IMAGE & IMAGE SIMILARITY
A digital image is a function f(x, y) which has been discretized in spatial coordinates and brightness. It can also be represented as a matrix, in which the row and column indices identify a point in the image, and the corresponding matrix value identifies the level of gray (or color) at that point (pixel).

The volume of data required for the storage (and processing) of an image makes it convenient to work on a codification of the image, i.e., on a minimal set of data which respects (and allows us to reconstruct) the most important characteristics of the image. Besides, codification usually allows the deletion of redundant information, and it is easy to perform enhancement and analysis of the image directly on its codified representation.

Obviously, the level of reduction of the original image data can be associated with a relative loss of information. It is always convenient that the codification admits inversion (i.e., recovering the original image, or an approximation of it, with the slightest error). Also, despite modifications made to the image, such as color, scale or texture changes, it would be important to maintain codification invariability. But this, at the same time, requires the codified representation to store some extra information to make such an inversion possible.

Traditionally, the problem of image similarity analysis – i.e., the problem of finding the subset of an image bank with characteristics similar to a given image – has been solved by computing a "signature" (codification) of each image to be compared, so that correspondence between the signatures can be analyzed by means of a distance function that measures the degree of approximation between the two given signatures.

Traditional methods to compute signatures are based on some attributes of the image (for example, the color histogram, recognition of a fixed pattern, the number of components of a given type, etc.). This "linearity" of the signature makes it really difficult to obtain data about attributes which were not considered in the signature (and which could be relevant to the similarity or difference between two images). For instance, if we only take color histograms into account, we would not capture image texture, nor would we be able to recognize similar objects painted in different colors.

There are several well-researched methods in the domain of image processing that can be used to formulate a working visual-query based database search application. The techniques used in our project are briefly described below. Furthermore, this paper elucidates the nuances of the actual implementation of the visual search application.
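As a toy illustration of the signature-and-distance idea described above (this is not the codification used in this project; the histogram signature, bin count and all names below are purely illustrative), a coarse intensity-histogram signature compared with an L1 distance might look like:

```python
# Toy illustration of image "signatures" compared by a distance function.
# A signature here is a normalized 4-bin intensity histogram.

def signature(pixels, bins=4):
    """Build a normalized intensity histogram (the 'codification')."""
    hist = [0] * bins
    for p in pixels:
        hist[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels)
    return [h / total for h in hist]

def distance(sig_a, sig_b):
    """L1 distance between signatures: 0 means identical histograms."""
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b))

img_a = [10, 20, 200, 210]    # mostly dark + bright pixels
img_b = [12, 25, 198, 205]    # similar intensity distribution
img_c = [128, 130, 126, 129]  # all mid-gray

sa, sb, sc = signature(img_a), signature(img_b), signature(img_c)
print(distance(sa, sb))  # 0.0: histograms identical though pixels differ
print(distance(sa, sc))  # 2.0: very different distributions
```

Note how the first pair gets a distance of 0 despite differing pixel by pixel: this is exactly the "linearity" limitation discussed above, where attributes not captured by the signature become invisible to the comparison.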


3. HASHING
A cryptographic hash function is a transformation that takes an input (or 'message') and returns a fixed-size string, which is called the hash value. The ideal hash function has three main properties: it is extremely easy to calculate a hash for any given data; it is extremely difficult, or almost impossible in a practical sense, to calculate an input that has a given hash; and it is extremely unlikely that two different messages, however close, will have the same hash.

By computing and then comparing the hash of each image, it can be quickly ascertained whether the images are identical or not.

4. COLOR MAP
A pixel-by-pixel comparison of two images can also determine whether two images are alike. This, however, becomes highly inefficient for large images and at the same time does not take into account regional or spatial similarity or dissimilarity. Hence we use Color Maps. In our implementation, a Color Map represents an image divided into blocks. These blocks (of a predetermined size) are made of a group of pixels and are used to represent the average pixel intensity of a particular area of the image.

Corresponding blocks of two image maps can then be compared to determine similarity or dissimilarity.

5. EDGE DETECTION
Edges characterize boundaries and are, therefore, a problem of fundamental importance in image processing. Edges in images are areas with strong intensity contrasts – a jump in intensity from one pixel to the next. Detecting the edges of an image significantly reduces the amount of data and filters out useless information, while preserving the important structural properties of the image.

6. HAUSDORFF DISTANCE
The Hausdorff distance [1] measures the extent to which each point of a 'model' set lies near some point of an 'image' set and vice versa. Thus, this distance can be used to determine the degree of resemblance between two objects that are superimposed on one another. Computing the Hausdorff distance between all possible relative positions of the query image and the database image can solve the problem of detecting image containment. The Hausdorff distance computation differs from many other shape comparison methods in that no correspondence between the query image and the database image(s) is derived [1]. The method is quite tolerant of the small position errors that occur with edge detectors and other feature extraction methods. Moreover, the method extends naturally to the problem of comparing a portion of a model against an image.

7. DETAILS OF IMPLEMENTATION
The application, while searching, considers:
     Exact match(es) (of the Source Image)
     Color
     Texture (Shape)

The first point involves searching the target directory for an image or images that are exact replicas of the query image. This is accomplished using the hashing technique (explained below). The second and third points involve searching for non-exact images that bear some degree of resemblance to the query image. For this, the images (query and database) are first subjected to the edge-detection filter and, subsequently, the Hausdorff metric of the filtered database images with respect to the query image is computed. Also, the generated Color Maps of the images are compared trivially to generate a difference metric. These are used to determine the degree of similarity. The nuances of the implementation of the above techniques are detailed below.

7.1 HASHING TECHNIQUE
The SHA hash functions are a set of cryptographic hash functions designed by the National Security Agency (NSA) and published by the NIST as a U.S. Federal Information Processing Standard. SHA stands for Secure Hash Algorithm. The five algorithms are denoted SHA-1, SHA-224, SHA-256, SHA-384, and SHA-512. The latter four variants are sometimes collectively referred to as SHA-2. SHA-1 produces a message digest that is 160 bits long; the numbers in the other four algorithm names denote the bit lengths of the digests they produce. The classes used for computing these hashes are predefined in System.Security.Cryptography [6], which can be freely used in any .NET or Visual Studio implementation.

Hashing is a faster way to compare the images, allowing the tests to complete in a timely manner, than comparing the individual pixels of each image using GetPixel(x, y) [5][6]. Hashes of two images should match if and only if the corresponding images also match. Small changes to the image result in large, unpredictable changes in the hash. This property of the generated hashes can be used to find exact matches (duplicates) of the query image.

The ComputeHash [6] method of this class takes a byte array of data as an input parameter and produces a 256-bit hash of that data. By computing and then comparing the hash of each image, it can quickly be determined whether the images are identical or not. The problem was hence to devise a way to convert the image data stored in the Bitmap [5][6] objects to a form suitable for passing to the ComputeHash method, namely, a byte array. The ImageConverter [6] class was thus used to convert the Image (or Bitmap) objects to the hash-able byte array.
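A minimal Python analogue of the duplicate-detection step (hashlib standing in for System.Security.Cryptography, and plain byte strings standing in for the byte arrays produced by ImageConverter) might look like:

```python
import hashlib

def image_hash(image_bytes: bytes) -> str:
    """SHA-256 digest of raw image data (analogue of ComputeHash)."""
    return hashlib.sha256(image_bytes).hexdigest()

# Stand-ins for byte arrays obtained from an image file / Bitmap object.
query = bytes([10, 20, 30, 40] * 4)
exact_copy = bytes([10, 20, 30, 40] * 4)
near_copy = bytes([10, 20, 30, 41] * 4)   # one component differs

print(image_hash(query) == image_hash(exact_copy))  # True: exact duplicate
print(image_hash(query) == image_hash(near_copy))   # False: any change scrambles the hash
```

As the section notes, even a one-byte change produces a completely different digest, which is why hashing can only find exact replicas, never near-matches.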




Examples: [7.1.1], [7.1.2].

7.2 COLOR MAPS
Color Maps can be easily and efficiently generated for small images by taking the respective Red, Green and Blue averages of a block (16x16 in our implementation) at a time, dynamically, using the running average:

IntnstyAvg = (IntnstyAvg * (p - 1) + CIntnsty)/p

where p represents the number of pixels considered so far in the block, and CIntnsty represents the intensity value of the current pixel.

However, this method deteriorates quickly as image size increases and the number of pixels goes up to a few million. The most practical and efficient solution is to scale the image down to a fixed size. For this we need to know the scale factor, sf, based on the image dimensions and the fixed size itself:

MAX_DIM = Max(Img_Width, Img_Height)
      sf = FIXED_SIZE / MAX_DIM

Therefore, we have:

 New_Width = sf * Img_Width
New_Height = sf * Img_Height

Once an image is scaled, the intensity average for each block is computed and stored. The intensity of a particular pixel is obtained by the trivial GetPixel(x, y) method. The stored values of corresponding regional blocks (say A1, B1 for two images A, B) can then be compared by a simple absolute difference scaled over the 8 bits used to represent each color component (RGB):

Difference = 1 - |Blk_A1_Avg - Blk_B1_Avg| / 255

Examples: [7.2.1], [7.2.2].

7.3 SOBEL EDGE DETECTION
There are many ways to perform edge detection. However, most of the different methods may be grouped into two categories: gradient and Laplacian. The gradient method detects edges by looking for the maximum and minimum in the first derivative of the image. The Laplacian method searches for zero crossings in the second derivative of the image to find edges.

Suppose we have a signal with an edge shown by a jump in intensity, as in [FIG 7.3.1]. If we take the gradient of this signal (which, in one dimension, is just the first derivative with respect to t), we get a signal as shown in [FIG 7.3.2].

Clearly, the derivative shows a maximum located at the center of the edge in the original signal. This method of locating an edge is characteristic of the 'gradient filter' family of edge detection filters and includes the Sobel method [3]. A pixel location is declared an edge location if the value of the gradient exceeds some threshold. As mentioned before, edges will have higher pixel intensity values than those surrounding them.

Based on this one-dimensional analysis, the theory can be carried over to two dimensions as long as there is an accurate approximation to calculate the derivative of a two-dimensional image. The Sobel operator performs a 2-D spatial gradient measurement on an image. Typically it is used to find the approximate absolute gradient magnitude at each point in an input grayscale image.

The Sobel edge detector uses a pair of 3x3 convolution masks [3], one estimating the gradient in the x-direction (columns, Gx) [FIG 7.3.3] and the other estimating the gradient in the y-direction (rows, Gy) [FIG 7.3.3]. A convolution mask is usually much smaller than the actual image. As a result, the mask is slid over the image, manipulating a square of pixels at a time. An approximate magnitude can then be calculated using: |G| = |Gx| + |Gy| [3].

The actual algorithm involves the computation of the grayscale of the image (if required) followed by the application of the gradient masks.

In our implementation, we used the Bitmap class to represent the image. The GetPixel(x, y) method was used to obtain the Color [5][6] value of the pixel located at (x, y). The working loop traversed the entire dimensions of the image and obtained the Color value (a 24-bit value for modern images). By taking the average of the RGB components of the Color value, we converted it to an 8-bit grayscale. The computed value was then stored in a matrix as a simple integer between 0 and 255 for easy recall.

The active pixel region, consisting of the current pixel location (say x, y), was then subjected to the gradient computation. The region included the 8 pixels adjacent to the active pixel, for a total of 9 pixels, which could be directly correlated (using the Hadamard product) with the 3x3 gradient matrices and summed to produce the gradient values in the x and y directions. The computed gradient was then compared against the limits of the 8-bit Bitmap, i.e., 0 and 255, and an appropriate intensity value was assigned.

Examples: [7.3.4], [7.3.5].
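The running block average and the per-block comparison of section 7.2 can be sketched in pure Python on a grayscale matrix as follows (a 2x2 block keeps the example small; the project uses 16x16 blocks on the scaled-down image, and all names here are illustrative):

```python
def block_averages(img, block=2):
    """Compute the Color Map: average intensity of each block x block tile,
    maintained as a running mean, as in
    IntnstyAvg = (IntnstyAvg * (p - 1) + CIntnsty) / p."""
    h, w = len(img), len(img[0])
    avgs = []
    for by in range(0, h, block):
        for bx in range(0, w, block):
            avg, p = 0.0, 0
            for y in range(by, min(by + block, h)):
                for x in range(bx, min(bx + block, w)):
                    p += 1
                    avg = (avg * (p - 1) + img[y][x]) / p
            avgs.append(avg)
    return avgs

def block_similarity(map_a, map_b):
    """Per-block 1 - |A - B| / 255, averaged over all blocks."""
    sims = [1 - abs(a - b) / 255 for a, b in zip(map_a, map_b)]
    return sum(sims) / len(sims)

a = [[0, 0, 255, 255],
     [0, 0, 255, 255]]       # dark left half, bright right half
b = [[10, 10, 245, 245],
     [10, 10, 245, 245]]     # slightly shifted intensities

print(block_averages(a))     # [0.0, 255.0]
print(block_similarity(block_averages(a), block_averages(b)))
```

Because only block averages are compared, two images with the same regional intensity layout score as highly similar even when individual pixels differ.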

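The grayscale-and-gradient pass of section 7.3 can be sketched like this (the masks below are the standard Sobel pair, which the paper's [FIG 7.3.3] is assumed to show; the threshold value is an illustrative assumption):

```python
# Sketch of the Sobel step from section 7.3: each 3x3 mask is correlated
# with the neighborhood of every interior pixel, and the magnitude
# |G| = |Gx| + |Gy| is thresholded to a 0/255 edge map.

GX = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]
GY = [[ 1,  2,  1],
      [ 0,  0,  0],
      [-1, -2, -1]]

def sobel(gray, threshold=128):
    """gray: 2-D list of 0-255 intensities; returns a 0/255 edge map."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]        # borders left at 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(GX[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(GY[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = 255 if abs(gx) + abs(gy) >= threshold else 0
    return out

# A vertical step edge: dark left half, bright right half.
img = [[0, 0, 255, 255] for _ in range(4)]
edges = sobel(img)
print(edges[1])  # the columns around the step light up; flat regions stay 0
```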



7.4 CANNY EDGE DETECTION
The Canny edge detection algorithm [2] is known to many as the optimal edge detector. It enhances the many edge detectors already available. It is important that edges occurring in images should not be missed and that there be NO responses to non-edges. Likewise, it is also important that the edge points be well localized. In other words, the distance between the edge pixels as found by the detector and the actual edge is to be at a minimum.

The detector draws upon the implementation of the Sobel filter discussed previously. But before applying the Sobel filter to the image, there is a need to eliminate noise from it. This noise removal is done with the help of a Gaussian filter, which basically blurs the image. This is done by applying a Gaussian mask over the image. For the purpose of implementation, we used a 3x3 mask [FIG 7.4.1] and slid it over the image, manipulating a square of pixels at a time by simple convolution.

After the application of the Gaussian and Sobel filters, we obtain an image (over an 8-bit grayscale) that approximates the intensity-change areas of the image. The problem now is to suppress any gray value that is not a maximum when viewed w.r.t. its neighbors along the edge. This is known as non-maximum suppression and is done by determining the edge direction and then following it to remove the regional non-maximums. This step was clubbed with the implementation of the Sobel filter, as the direction could be trivially deduced as θ = tan⁻¹(Gy/Gx), with the appropriate exception being made when Gx computes to 0, as: orientation = (Gy == 0) ? 0 : 90.

Once the edge direction is known, the next step is to relate it to a direction that can be traced in an image. So if the pixels of a 5x5 image are aligned as in [FIG 7.4.2], then, looking at the centre pixel, a, it can be seen that there are only four possible directions when describing the surrounding pixels:

     0 degrees (in the horizontal direction),
     45 degrees (along the positive diagonal),
     90 degrees (in the vertical direction), or
     135 degrees (along the negative diagonal).

Hence the obtained direction is resolved into whichever of these four directions it is closest to. As an example, if the orientation angle is found to be 3 degrees, it is made zero degrees. The resolved angle is stored in an array for further reference and recall.

Following the computation of the edge directions, we are now in a position to perform non-maximum suppression [2]. We need to trace along the edge in the edge direction and suppress any pixel value (set it equal to 0) that is not considered to be an edge (i.e., has a value less than its neighbors). This gives a thin line in the output image. It is accomplished by simply comparing the current pixel value under consideration with its two nearest neighbors in the one (of the four possible) direction determined previously. The lower values can be ignored.

Finally, hysteresis is used as a means of eliminating streaking [2]. Streaking is the breaking up of an edge contour caused by the operator output fluctuating above and below a particular threshold. If a single threshold, T1, is applied to an image, and an edge has an average strength equal to T1, then, due to noise, there will be instances where the edge dips below the threshold. Equally, it will also extend above the threshold, making the edge look like a dashed line.

To avoid this, hysteresis uses two thresholds: a high one, T1, and a low one, T2. Any pixel in the image that has a value greater than T1 is presumed to be an edge pixel and is marked as such immediately. Then, any pixels that are connected to this edge pixel and that have a value greater than T2 are also selected as edge pixels. In other words, to follow an edge you need a gradient above T1 to start, and you do not stop until the gradient drops below T2. This step is very similar to the following of edges and suppression of non-maximums, and hence the two can be clubbed together in the final implementation.

Example: [7.4.3].

7.5 HAUSDORFF DISTANCE COMPUTATION
Given two finite point sets A = {a1,...,ap} and B = {b1,...,bq}, the Hausdorff distance between them is defined as:

  H(A, B) = max(h(A, B), h(B, A)) [1]

where h(A, B) = max a є A min b є B ||a - b||, and ||·|| is some underlying norm on the points of A and B (for a visual representation of the Hausdorff distance refer to [7.5.1]).

The function h(A, B) is called the directed Hausdorff distance [1] from A to B. It identifies the point a є A that is farthest from any point of B, and measures the distance from a to its nearest neighbor in B (using the given norm ||·||, Euclidean in this case). That is, h(A, B) in effect ranks each point of A based on its distance to the nearest point of B, and then uses the largest-ranked such point as the distance (the most mismatched point of A). Intuitively, if h(A, B) = d, then each point of A must be within distance d of some point of B, and there also is some point of A that is exactly distance d from the nearest point of B (the most mismatched point).
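The Gaussian smoothing pass of section 7.4 can be sketched as follows (the 1/16 weight matrix below is a common 3x3 Gaussian-style mask and is an assumption; the exact mask of [FIG 7.4.1] may differ):

```python
# Sketch of the noise-removal step in section 7.4: a 3x3 mask slid over
# the image by simple convolution, damping isolated noisy pixels.

MASK = [[1, 2, 1],
        [2, 4, 2],
        [1, 2, 1]]   # weights sum to 16

def gaussian_blur(gray):
    h, w = len(gray), len(gray[0])
    out = [row[:] for row in gray]          # borders left unfiltered
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = sum(MASK[j][i] * gray[y + j - 1][x + i - 1]
                      for j in range(3) for i in range(3))
            out[y][x] = acc // 16
    return out

# A single noisy spike in a flat region is spread out and damped.
img = [[0] * 5 for _ in range(5)]
img[2][2] = 160
blurred = gaussian_blur(img)
print(blurred[2][2])  # 40: the spike keeps only 4/16 of its value
print(blurred[2][1])  # 20: a neighbor receives 2/16 of it
```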

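The θ = tan⁻¹(Gy/Gx) orientation of section 7.4, resolved to one of the four traceable directions, might be sketched as (the wrap-around handling is an assumption of this sketch):

```python
import math

def edge_direction(gx, gy):
    """Resolve the gradient orientation to 0, 45, 90 or 135 degrees,
    with the Gx == 0 special case handled as in section 7.4."""
    if gx == 0:
        return 0 if gy == 0 else 90
    theta = math.degrees(math.atan(gy / gx)) % 180
    # Snap to the nearest of the four traceable directions.
    for d in (0, 45, 90, 135):
        if abs(theta - d) <= 22.5:
            return d
    return 0   # angles in (157.5, 180) wrap back to horizontal

print(edge_direction(100, 5))    # a 3-degree orientation resolves to 0
print(edge_direction(100, 100))  # 45
print(edge_direction(0, 50))     # 90
print(edge_direction(100, -95))  # 135
```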

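The hysteresis step of section 7.4 can be sketched on a single scan line for brevity (a real implementation tracks 2-D connectivity; T_HIGH and T_LOW play the roles of T1 and T2, and their values are illustrative):

```python
# Sketch of hysteresis thresholding (section 7.4) on one scan line:
# pixels above T_HIGH seed edges; connected pixels above T_LOW are kept.

T_HIGH, T_LOW = 100, 40

def hysteresis_1d(grads):
    edge = [g >= T_HIGH for g in grads]            # strong seeds
    changed = True
    while changed:                                 # grow along connections
        changed = False
        for i, g in enumerate(grads):
            if not edge[i] and g >= T_LOW and (
                    (i > 0 and edge[i - 1]) or
                    (i + 1 < len(grads) and edge[i + 1])):
                edge[i] = True
                changed = True
    return [255 if e else 0 for e in edge]

# A contour whose strength dips below the high threshold mid-run: a single
# threshold at 100 would break it into a dashed line; hysteresis keeps it.
row = [10, 120, 60, 50, 110, 10]
print(hysteresis_1d(row))  # [0, 255, 255, 255, 255, 0]
```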



The Hausdorff distance, H(A, B), is the maximum of h(A, B) and h(B, A). Thus it measures the degree of mismatch between two sets by measuring the distance of the point of A that is farthest from any point of B, and vice versa. Intuitively, if the Hausdorff distance is d, then every point of A must be within a distance d of some point of B, and vice versa. Thus the notion of resemblance encoded by this distance is that each member of A be near some member of B, and vice versa. Unlike most methods of comparing shapes, there is no explicit pairing of points of A with points of B (for example, many points of A may be close to the same point of B) [1].

The extraction of the point sets from the images is based on the result of the Canny edge detector. The implementation uses those points of the Canny-filtered image that actually constitute an edge. These points can be trivially determined by checking for only the non-zero intensity pixels.

The function h(A, B) can be trivially computed in time O(pq) for two point sets of size p and q respectively, using the following brute-force algorithm:

1. h = 0
2. for every point ai of A,
      2.1 shortest = INF
      2.2 for every point bj of B
            dij = d(ai, bj)
            if dij < shortest then
                   shortest = dij
      2.3 if shortest > h then
                   h = shortest

Our implementation used a slightly modified version of the above algorithm, which makes certain assumptions and eliminations in the computation of the Hausdorff metric. The steps to improve computation time are summarized below.

7.5.1 Termination at Zero Distance
This builds on the fact that the result of the distance norm (the Euclidean norm was used in our implementation, i.e., d = √((x1 - x2)² + (y1 - y2)²)) can never be less than 0. Hence, once the inner loop of the above algorithm (Loop 2.2) computes the shortest distance to be 0, we can safely stop considering any further points from B for the particular point ai є A. This considerably speeds up the computation by skipping a significant chunk of points that no longer need to be considered.

7.5.2 Threshold Distance Window
We can eliminate the need to consider a point if it lies outside a particular threshold distance window, or block. This can be understood with the help of an example. Given a threshold distance τ and the point (Bx, By), we need only consider it for distance computation from the point (Ax, Ay) iff: (Ax - τ) ≤ Bx ≤ (Ax + τ) AND (Ay - τ) ≤ By ≤ (Ay + τ). This speeds up computations for smaller values of τ and limits the maximum possible Hausdorff distance. Visual inaccuracies may occur when seemingly similar but translated images are compared under this assumption.

7.5.3 Termination at Infinite Distance
It can be noted that the outer loop of the algorithm (Loop 2) retains the maximum distance found so far. This assumption builds on the previous one in the sense that, given the boundaries of the threshold distance window, there may be a few points of A which are not in the vicinity of any point of B. For such a point, the computed distance retains the initial value of infinity. Further consideration of any point thereafter is trivially meaningless, as the maximum value of infinity has already been retained, and the computation can terminate.

7.5.4 Scaling
Even after the application of the above techniques, the computational efficiency rapidly deteriorates as image size increases and the number of pixels goes up to a few million. Hence, as was discussed in section 7.2, the image is scaled down to a fixed size on the basis of a scale factor to effectively reduce the number of pixels.

The above assumptions do affect the overall accuracy of the Hausdorff metric but are useful nonetheless for a much required speed-up.

7.6 CONCLUSION AND OBSERVATIONS
Hence, given any two images under consideration, we can easily compute their hash values and their mutual Hausdorff metric (after Canny filter application). While, on the one hand, the hash value comparison can trivially determine whether or not the given images are exact in all respects, the Hausdorff metric signifies the 'closeness' of the two images. A Hausdorff metric of 0 indicates exactness as far as features are concerned, whereas larger values reveal increasing dissimilarity between the images.

This implementation can be extended intuitively to consider a database of images.

Examples:
[7.6.1] Source Database
[7.6.2] Filtered images
[7.6.3] Hausdorff distances computed w.r.t. Firefox_Logo_Normal (Source Image)
Results sorted in order of decreasing similarity.


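The two speed-ups described above can be sketched as follows. This is a minimal Python illustration (the paper's implementation is in C#); the function names and sample point sets are our own, not the paper's. The inner loop breaks as soon as an exact hit yields a distance of 0 (7.5.1), and points of B outside the τ-window around the current point of A are skipped (7.5.2). Note that a τ chosen too small can leave a point of A with no candidates at all, so its shortest distance stays infinite.

```python
import math

def directed_hausdorff(A, B, tau=None):
    """Directed Hausdorff distance h(A, B) over the Euclidean norm,
    with the zero-distance early exit and the threshold-window skip."""
    h = 0.0
    for (ax, ay) in A:
        shortest = math.inf
        for (bx, by) in B:
            # 7.5.2: consider (bx, by) only if it lies inside the
            # threshold distance window around (ax, ay).
            if tau is not None and not (ax - tau <= bx <= ax + tau
                                        and ay - tau <= by <= ay + tau):
                continue
            d = math.hypot(ax - bx, ay - by)
            if d < shortest:
                shortest = d
            # 7.5.1: the Euclidean norm can never be less than 0, so an
            # exact hit ends the inner loop for this point of A.
            if shortest == 0.0:
                break
        if shortest > h:
            h = shortest
    return h

def hausdorff(A, B, tau=None):
    # H(A, B) = max(h(A, B), h(B, A)), as defined in section 7.5.
    return max(directed_hausdorff(A, B, tau), directed_hausdorff(B, A, tau))
```

For example, hausdorff([(0, 0), (1, 0)], [(0, 0), (1, 1)]) evaluates to 1.0: the most mismatched point in either direction lies at distance 1 from its nearest neighbour in the other set.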


8. SUMMARY OF IMPLEMENTATION

A summary of the implementation is presented below in the form of pseudo-code.

8.1 Input Source Image, SI
8.2 Input Target Directory, TD
-- Preprocessing Phase
8.3 For each image in the TD:
          8.3.1 Compute & store the hash value (HV)
          8.3.2 Compute & store the Color Details (CD)
          8.3.3 Apply the Canny (Sobel-based) filter
          8.3.4 Compute the locations of the non-zero pixels and store them in a matrix
-- Preparation Phase
8.4 Compute the HV for SI
8.5 Compute & store the Color Details of SI
8.6 Apply the Canny filter to SI
8.7 Compute & store the locations of the non-zero pixels
-- Comparison Phase
8.8 For each image in the TD:
          8.8.1 Compare the HV of SI with the stored HV of the image
          8.8.2 Compare the CD of SI with the stored CD of the image
          8.8.3 Compute the Hausdorff metric between SI and the image using the stored locations of the non-zero pixels
          8.8.4 Assign a rank to the image based on the HV comparison, the computed Hausdorff metric and the Color Details
-- Sorting Phase
8.9 Sort the images of TD based on rank
8.10 Display the images in sort order

9. REFERENCES

[1] Daniel P. Huttenlocher, Gregory A. Klanderman, and William J. Rucklidge. Comparing Images Using the Hausdorff Distance. IEEE Trans. Pattern Analysis and Machine Intelligence, September 1993.

[2] J. Canny. A Computational Approach to Edge Detection. IEEE Trans. Pattern Analysis and Machine Intelligence, November 1986.

[3] I. Sobel and G. Feldman. 'A 3x3 Isotropic Gradient Operator for Image Processing'. Presented at a talk at the Stanford Artificial Intelligence Project in 1968; Pattern Classification and Scene Analysis, 1973.

[4] H. Alt, B. Behrends and J. Blomer. Measuring the Resemblance of Polygon Shapes. Proc. Seventh ACM Symposium on Computational Geometry, 1991.

[5] Herbert Schildt. C# 2.0: The Complete Reference, Second Edition. Tata McGraw-Hill, 2006.

[6] MSDN Library. msdn.microsoft.com/en-us/library/default.aspx
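As a rough illustration of the exact-match portion of this pipeline (steps 8.3.1, 8.4, 8.8.1 and 8.9), the sketch below hashes each image's raw bytes and ranks exact duplicates of the source image first. It is written in Python with SHA-256 standing in for the ComputeHash call of section 7.1; the directory mapping and the byte strings are hypothetical stand-ins for the Bitmap data of the C# implementation.

```python
import hashlib

def image_hash(data: bytes) -> str:
    # Stand-in for ComputeHash over the Bitmap's byte array (section 7.1):
    # a 256-bit digest of the raw image bytes.
    return hashlib.sha256(data).hexdigest()

def rank_by_exact_match(source: bytes, directory: dict) -> list:
    """directory maps image name -> raw bytes (a stand-in for a Bitmap).
    Returns the image names with exact matches (equal hashes) ranked
    ahead of the rest; ties are broken by name for determinism."""
    src_hv = image_hash(source)                                   # step 8.4
    stored = {name: image_hash(d) for name, d in directory.items()}  # step 8.3.1
    # Steps 8.8.1 / 8.9: non-matches sort after matches (False < True).
    return sorted(stored, key=lambda name: (stored[name] != src_hv, name))
```

In the full pipeline, this hash-based rank would be combined with the Color Detail difference and the Hausdorff metric (steps 8.8.2 to 8.8.4) before the final sort.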






FIGURES

[7.1.1]
[7.1.2]
[7.2.1]
[7.2.2]
[7.3.1]
[7.3.2]
[7.3.3]
[7.3.4]
[7.3.5]
[7.4.1]
[7.4.2]
[7.4.3]
[7.5.1]
[7.6.1]
[7.6.2]
[7.6.3]


Visual Search

  • 1. --- Technical Paper on ‘Visual Search’ by Group C6 of B.Tech. (CSE) for Minor Project, November 2008 --- VISUAL SEARCH Lov Loothra, Ashish Goel, Prateek and Shikha Vashistha Department of Information Technology and Computer Science Engineering Amity School of Engineering and Technology, Bijwasan Abstract – This paper describes the on a codification of the image, trying to work on a implementation of an application which accepts an minimal set of data which respects (and allows to image as input from the user and finds images that reconstruct) the most important characteristics of the are similar to it from a specified directory. Similar image. Besides, codification usually allows the images may be defined as images that bear an deletion of redundant information and it is easy to exact (pixel to pixel) resemblance to the query work on the improvement and analysis of the image image or images that depict some likeness to the directly on the codified representation of the same. query image in terms of their intensities (color), overall shape (texture) or a combination of these Obviously, the reduction level of the image original two factors. The application also aims to index or data can be associated to a relative loss of sort the images of the database in order of their information. It is always convenient that the similarity to the query image, i.e., from the most codification admits inversion (i.e., recovering the similar to the least similar image. original image or an approximation of that original image with the slightest error). Also, despite Index Terms – edge detection, hausdorff distance, modifications made to the image, such as color, scale image codification, image comparison, image or texture changes, it would be important to maintain indexing, image similarity codification invariability. But this, at the same time, requires the codified representation to store some 1. INTRODUCTION extra information to make such an inversion possible. 
As of now, almost all popular search engines are text Traditionally, the problem of image similarity or tag based, i.e., they search for a web page, an analysis – i.e., the problem of finding the subset of an image, a video etc. on the basis of keywords used to image bank with similar characteristics to a given describe/store them. This provides for extremely image – has been solved by computing a "signature" accurate and practical results when we want to search (codification) of each image to be compared, so then, for a particular topic or information contained in a correspondence between the signatures could be web page. But the same method usually leads to analyzed by means of a distance function that somewhat inaccurate results when we’re specifically measures the degree of approximation between the searching for images, videos or related media for the two given signatures. simple reason that one person’s description may not be accurate enough to cover all keywords. Traditional methods to compute signatures are based on some attributes of the image (for example, color Instead, if we use an image itself as the search histogram, recognition of a fixed pattern, number of ‘keyword’ and check for images that are similar to it, components of a given type, etc). This "linearity" of we’re bound to get more accurate results. This is the signature makes it really difficult to obtain data especially useful when the user knows what he wants about attributes which were not considered in the to obtain as a result of the search: it could be an signature (and which could be relevant to the image similar to the one he inputs, an image of higher similarity or difference between two images). For quality (better resolution) or an image that ‘contains’ instance, if we only take into account color the image he’s input. histograms, we would not take into account image texture, nor we would be able to recognize similar 2. IMAGE & IMAGE SIMILARITY objects painted in different colors. 
A digital image is a function f (x, y) which has been discretized in spatial coordinates and brightness. There are several well-researched methods in the It can also be represented as a matrix, in which the domain of image processing that can be used to rates of line and column identify a point in the image, formulate a working visual-query based database and the content value in the matrix identifies the level search application. The techniques used in our project of gray (or color) in that point (pixel). are briefly described below. Furthermore, this paper elucidates the nuances of the actual implementation The volume of the required data for the storage (and of the visual search application. processing) of an image, makes it convenient to work
  • 2. --- Technical Paper on ‘Visual Search’ by Group C6 of B.Tech. (CSE) for Minor Project, November 2008 --- 3. HASHING 7. DETAILS OF IMPLEMENTATION A cryptographic hash function is a transformation The application, while searching, considers: that takes an input (or 'message') and returns a fixed-  Exact match(es) (of the Source Image) size string, which is called the hash value. The ideal  Color hash function has three main properties - it is  Texture (Shape) extremely easy to calculate a hash for any given data, it is extremely difficult or almost impossible in a The first point involves searching the target directory practical sense to calculate a text that has a given for an image or for images that are exact replicas of hash, and it is extremely unlikely that two different the query image. This is accomplished using the messages, however close, will have the same hash. hashing technique (explained below). The second and third points involve searching for non-exact images By computing and then comparing the hash of each that bear some degree of resemblance to the query image, it can be quickly ascertained whether the image. For this, the images (query and database) are images were identical or not. first subjected to the edge-detection filter and, subsequently, the Hausdorff metric of the filtered 4. COLOR MAP database images with respect to the query image is A pixel by pixel image comparison of two images can computed. Also, the generated Color Maps of the also determine whether two images are alike. This, images are compared trivially to generate difference however, becomes highly inefficient for large images metric. These are used to determine the degree of and at the same time doesn’t take into account the similarity. The nuances of the implementation of the regional or spatial similarity or dissimilarity. Hence above techniques are detailed below. we use Color Maps. In our implementation, a Color Map represents an image divided into blocks. 
These 7.1 HASHING TECHNIQUE blocks (of a predetermined size) are made of a group The SHA hash functions are a set of cryptographic of pixels and are used to represent the average pixel hash functions designed by the National Security intensity of a particular area of the image. Agency (NSA) and published by the NIST as a U.S. Federal Information Processing Standard. SHA stands Corresponding blocks of two image maps can then be for Secure Hash Algorithm. The five algorithms are compared to determine similarity or dissimilarity. denoted SHA-1, SHA-224, SHA-256, SHA-384, and SHA-512. The latter four variants are sometimes 5. EDGE DETECTION collectively referred to as SHA-2. SHA-1 produces a Edges characterize boundaries and are, therefore, a message digest that is 160 bits long; the number in problem of fundamental importance in image the other four algorithm names denote the bit length processing. Edges in images are areas with strong of the digest they produce. The classes used for intensity contrasts – a jump in intensity from one computing these hashes are predefined in pixel to the next. Detecting the edges of an image System.Security.Cryptography [6] which significantly reduces the amount of data and filters can be freely used in any .NET or Visual Studio out useless information, while preserving the implementation. important structural properties in an image. Hashing is a faster method to compare the images to 6. HAUSDORFF DISTANCE allow the tests to complete in a timely manner, rather than comparing the individual pixels in each image The Hausdorff distance [1] measures the extent to using GetPixel (x, y) [5][6]. Hashes of two which each point of a ‘model’ set lies near some point images should match if and only if the corresponding of an ‘image’ set and vice versa. Thus, this distance images also match. Small changes to the image result can be used to determine the degree of resemblance in large unpredictable changes in the hash. 
This between two objects that are superimposed on one property of the generated hashes can be used to find another. Computing the Hausdorff distance between exact matches (duplicates) of the query image. all possible relative positions of the query image and the database image can solve the problem of detecting The ComputeHash [6] method of this class takes a image containment. The Hausdorff distance byte array of data as an input parameter and produces computation differs from many other shape a 256 bit hash of that data. By computing and then comparison methods in that no correspondence comparing the hash of each image, it would be between the query image and database image(s) is quickly able to tell if the images were identical or not. derived [1]. The method is quite tolerant of small The problem was hence to device a way to convert position errors as occur with edge detectors and other the image data stored in the Bitmap [5][6] objects to feature extraction methods. Moreover, the method a suitable form for passing to the ComputeHash extends naturally to the problem of comparing a method, namely, a byte array. The portion of a model against an image. ImageConvertor [6] class was thus used to allow -2-
  • 3. --- Technical Paper on ‘Visual Search’ by Group C6 of B.Tech. (CSE) for Minor Project, November 2008 --- us to convert the Image (or Bitmap) objects to the the gradient of this signal (which, in one dimension, hash-able byte array. is just the first derivative with respect to t) we get a signal as shown by [FIG 7.3.2]. Examples: [7.1.1], [7.1.2]. Clearly, the derivative shows a maximum located at 7.2 COLOR MAPS the center of the edge in the original signal. This method of locating an edge is characteristic of the Color Maps can be easily and efficiently generated ‘gradient filter’ family of edge detection filters and for small images by taking the respective Red, Green includes the Sobel method [3]. A pixel location is and Blue averages of a Block (16x16 in our declared an edge location if the value of the gradient implementation) at a time dynamically using: exceeds some threshold. As mentioned before, edges will have higher pixel intensity values than those IntnstyAvg = surrounding it. (IntnstyAvg * (p – 1) + CIntnsty)/p Based on this one-dimensional analysis, the theory where p represent the current pixel location, and can be carried over to two-dimensions as long as CIntensity represents the present calculated intensity there is an accurate approximation to calculate the value. derivative of a two-dimensional image. The Sobel operator performs a 2-D spatial gradient measurement However this method fast deteriorates as image size on an image. Typically it is used to find the increases and the number of pixels go up to a few approximate absolute gradient magnitude at each million. The most practical and efficient solution is to point in an input grayscale image. Scale the image down to a fixed size. 
For this we need to know the scale factor, sf, based on the image The Sobel edge detector uses a pair of 3x3 dimensions and the size itself: convolution masks [3], one estimating the gradient in MAX_DIM = Max(Img_Width, Img_Height) the x-direction (columns, Gx) [FIG 7.3.3] and the sf = FIXED_SIZE / MAX_DIM other estimating the gradient in the y-direction (rows, Gy) [FIG 7.3.3]. A convolution mask is usually much So therefore, we have: smaller than the actual image. As a result, the mask is slid over the image, manipulating a square of pixels at New_Width = sf * Img_Width a time. An approximate magnitude can then be New_Height = sf * Img_Height calculated using: |G| = |G x| + |Gy| [3] Once an image is scaled the Intensity Average for a The actual algorithm involves the computation of the block is computed and stored. The intensity of a grayscale of the image (if required) followed by the particular pixel is obtained by the trivial application of the gradient masks. GetPixel(x, y) method. These stored values of the regional blocks (say A1, B1 for two images A, B) In our implementation, we used the Bitmap class to can then be compared by a simple absolute difference represent the image. The GetPixel(x,y) method scaled over the 8-bits used to represent the color was used to obtain the value of the Color[5][6] of component (RGB): the pixel located at x, y. The working loop traversed the entire dimensions of the image and obtained the Difference = Color value (24 bit value for modern images). By 1 - |Blk_A1_Avg - Blk_B1_Avg| / 255 taking the average of the RGB component of the Color value, we converted it to an 8-bit grayscale. The computed value was then stored in a matrix as a Examples: [7.2.1], [7.2.2]. simple integer between 0 – 255 for easy recall. 7.3 SOBEL EDGE DETECTION The active pixel region, consisting of the current There are many ways to perform edge detection. 
pixel location (say x, y) was then subjected to a However, most of the different methods may be gradient. The region included 8 pixels adjacent to the grouped into two categories: gradient and Laplacian. active pixel for a total of 9 pixels which could be The gradient method detects the edges by looking for directly correlated (using Hadamard product) with the the maximum and minimum in the first derivative of 3x3 gradient matrices and summed to produce the the image. The Laplacian method searches for zero gradient values in x and y directions. The computed crossings in the second derivative of the image to find gradient was then compared to the threshold of the 8- edges. bit Bitmap, i.e., 0 & 255 and an appropriate intensity value was assigned. Suppose we have a signal, with an edge shown by the jump in intensity as shown in [FIG 7.3.1]. If we take Examples: [7.3.4], [7.3.5]. -3-
  • 4. --- Technical Paper on ‘Visual Search’ by Group C6 of B.Tech. (CSE) for Minor Project, November 2008 --- 7.4 CANNY EDGE DETECTION along the edge in the edge direction and suppress any [2] pixel value (set it equal to 0) that is not considered to The Canny edge detection algorithm is known to be an edge (i.e., has a value less than its neighbor). many as the optimal edge detector. It enhances the This will give a thin line in the output image. This is many edge detectors already available. It is important accomplished by simply comparing the current pixel that edges occurring in images should not be missed value under consideration with its two nearest and that there be NO responses to non-edges. neighbors in one (of the four possible) direction that Likewise, it is also important that the edge points be has been determined previously. The lower values well localized. In other words, the distance between can be ignored. the edge pixels as found by the detector and the actual edge is to be at a minimum. Finally, hysteresis is used as a means of eliminating streaking [2]. Streaking is the breaking up of an edge The detector draws upon the implementation of the contour caused by the operator output fluctuating Sobel filter discussed previously. But before applying above and below a particular threshold. If a single the Sobel filter to the image, there is a need to threshold, T1 is applied to an image, and an edge has eliminate noise from the image. This noise removal is an average strength equal to T1, then, due to noise, done with the help of a Gaussian filter which there will be instances where the edge dips below the basically blurs the image. This is done by applying a threshold. Equally it will also extend above the Gaussian mask over the image. For the purpose of threshold making an edge look like a dashed line. 
implementation, we used a 3x3 mask [FIG 7.4.1] and slid it over the image; manipulating a square of pixels To avoid this, hysteresis uses 2 thresholds: high and at a time by simple convolution. low. Any pixel in the image that has a value greater than T1 is presumed to be an edge pixel, and is After the application of the Gaussian and Sobel marked as such immediately. Then, any pixels that filters, we obtain an image (over an 8-bit grayscale) are connected to this edge pixel and that have a value that approximates the intensity change areas of the greater than T2 are also selected as edge pixels. To image. The problem statement now is to remove the follow an edge, start with a gradient of T2 and stop gray factor which is a local maximum but a non- when you get a gradient below T1. This step is very maximum when viewed w.r.t. its neighbors. This is similar to the following of edges and suppression of known as non-maximum suppression and is done by non-maximums and hence can be clubbed together in determining the edge direction and then following it the final implementation. to remove the regional non-maximums. This step was clubbed with the implementation of the Sobel filter as Example: [7.4.3]. the direction could be trivially deduced as: θ = tan-1 Gy/Gx, with appropriate exceptions being 7.5 HAUSDORFF DISTANCE COMPUTATION made when Gx and/or Gy compute to 0, as: orientation = (Gy == 0) ? 0 : 90. Given two finite point sets A = {a1,...ap} and B = {b1,...bq}, the hausdorff distance between Once the edge direction is known, the next step is to them is defined as: relate the edge direction to a direction that can be traced in an image. 
So if the pixels of a 5x5 image are H(A, B) = max(h(A, B), h(B, A)) [1] aligned as in [FIG 7.4.2], then, it can be seen by looking at the centre pixel, a, there are only four where h(A, B) = max a є A min b є B || a - possible directions when describing the surrounding b || and || - || is some underlying norm on the pixels: points of A and B (for a visual representation of hausdorff distance refer [7.5.1]).  0 degrees (in the horizontal direction),  45 degrees (along the positive diagonal), The function h(A, B) is called the directed  90 degrees (in the vertical direction), or hausdorff distance [1] from A to B. It identifies the  135 degrees (along the negative diagonal) point a є A that is farthest from any point of B, and Hence the obtained direction is now resolved into one measures the distance from a to its nearest neighbor of these four directions depending on which direction in B (using the given norm || - ||, Euclidean in this it is closest to. As an example, if the orientation angle case). That is, h(A, B) in effect ranks each point of is found to be 3 degrees, make it zero degrees. The A based on its distance to the nearest point of B, and resolved angle is stored in an array for further then uses the largest ranked such point as the distance reference and recall. (the most mismatched point of A). Intuitively, if h(A, B) = d, then each point of A must be within Following the computation of the edge directions, we distance d of some point of B, and there also is some are now in a position to perform non-maximum point of A that is exactly distance d from the nearest suppression [2]. Therefore, we now need to trace point of B (the most mismatched point). -4-
The Hausdorff distance, H(A, B), is the maximum of h(A, B) and h(B, A). It thus measures the degree of mismatch between two sets, by measuring the distance of the point of A that is farthest from any point of B and vice versa. Intuitively, if the Hausdorff distance is d, then every point of A must be within a distance d of some point of B and vice versa. The notion of resemblance encoded by this distance is therefore that each member of A be near some member of B and vice versa. Unlike most methods of comparing shapes, there is no explicit pairing of points of A with points of B (for example, many points of A may be close to the same point of B) [1].

The extraction of the point sets from the images is based on the result of the Canny edge detector. The implementation uses those points of the Canny-filtered image that actually constitute an edge. These points can be trivially determined by checking for only the non-zero intensity pixels.

The function h(A, B) can be computed in time O(pq) for two point sets of size p and q respectively using the following brute-force algorithm:

1. h = 0
2. for every point ai of A:
   2.1 shortest = INF
   2.2 for every point bj of B:
         dij = d(ai, bj)
         if dij < shortest then shortest = dij
   2.3 if shortest > h then h = shortest

Our implementation used a slightly modified version of the above algorithm which makes certain assumptions and eliminations in the computation of the Hausdorff metric. The steps taken to improve computation time are summarized below.

7.5.1 Termination at Zero Distance

This builds on the fact that the result of the distance norm (the Euclidean norm was used in our implementation, i.e., d = √((x1 − x2)² + (y1 − y2)²)) can never be less than 0. Hence, once the inner loop of the above algorithm (Loop 2.2) computes the shortest distance to be 0, we can safely stop considering any further points from B when computing the distance from the particular point ai ∈ A. This considerably speeds up the computation by skipping a significant chunk of points that no longer need to be considered.

7.5.2 Threshold Distance Window

We can eliminate the need to consider a point if it lies outside a particular threshold distance window or block. This can be understood with the help of an example. Given a threshold distance τ and the point (Bx, By), we need only consider it for distance computation from the point (Ax, Ay) iff: (Ax − τ) ≤ Bx ≤ (Ax + τ) AND (Ay − τ) ≤ By ≤ (Ay + τ). This speeds up computation for smaller values of τ and limits the maximum possible Hausdorff distance. Visual inaccuracies may occur when seemingly similar but translated images are compared under this assumption.

7.5.3 Termination at Infinite Distance

It can be noted that the outer loop of the algorithm (Loop 2) retains the maximum distance seen so far. This assumption builds on the previous one in the sense that, given the boundaries of the threshold distance window, there may be a few points from A which are not in the vicinity of any point from B. Hence the computed distance will retain the initial value of infinity. Further consideration of any point hereafter is meaningless, as the maximum value of infinity has already been retained.

7.5.4 Scaling

Even after the application of the above techniques, the computation efficiency rapidly deteriorates as image size increases and the number of pixels goes up to a few million. Hence, as was discussed in section 7.2, the image is scaled down to a fixed size on the basis of a scale factor to reduce the number of pixels significantly.

The above assumptions do affect the overall accuracy of the Hausdorff metric but are useful nonetheless for a much-required speed-up.

7.6 CONCLUSION AND OBSERVATIONS

Hence, given any two images under consideration, we can easily compute their hash values and their mutual Hausdorff metric (after Canny filter application). While on the one hand the hash value comparison can trivially determine whether or not the given images are exact in all respects, the Hausdorff metric signifies the 'closeness' of the two images. A Hausdorff metric of 0 indicates exactness as far as features are concerned, whereas larger values reveal increasing dissimilarity between images.

This implementation can be extended intuitively to consider a database of images.

Examples:
[7.6.1] Source Database
[7.6.2] Filtered images
[7.6.3] Hausdorff distances computed w.r.t. Firefox_Logo_Normal (Source Image)

Results sorted in order of decreasing similarity.
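The speed-ups of sections 7.5.1–7.5.3 can be folded into the brute-force loop as follows. This is an illustrative Python sketch, not the project's C# code; `tau` stands for the threshold half-width of section 7.5.2:

```python
import math

INF = float("inf")

def h_optimized(A, B, tau):
    """Directed Hausdorff h(A, B) with the speed-ups of sections 7.5.1-7.5.3:
    early exit at zero distance, a threshold window of half-width tau,
    and termination once the running maximum is stuck at infinity."""
    h = 0.0
    for (ax, ay) in A:
        shortest = INF
        for (bx, by) in B:
            # 7.5.2: skip points of B outside the threshold window around (ax, ay)
            if not (ax - tau <= bx <= ax + tau and ay - tau <= by <= ay + tau):
                continue
            d = math.hypot(ax - bx, ay - by)
            if d < shortest:
                shortest = d
            if shortest == 0.0:  # 7.5.1: the norm can never go below zero
                break
        if shortest > h:
            h = shortest
        if h == INF:             # 7.5.3: no point of B near this point of A,
            break                # so the maximum can only stay at infinity
    return h
```

With a large `tau` this reduces to the plain brute-force algorithm; with a small `tau`, a point of A that has no neighbour of B in its window drives the result to infinity, as section 7.5.3 describes.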
8. SUMMARY OF IMPLEMENTATION

A summary of the implementation is presented below in the form of pseudo-code.

8.1 Input Source Image, SI
8.2 Input Target Directory, TD
-- Preprocessing Phase
8.3 For each image in the TD:
    8.3.1 Compute & store the hash value (HV)
    8.3.2 Compute & store Color Details (CD)
    8.3.3 Apply the Canny (Sobel-based) filter
    8.3.4 Compute the location of non-zero pixels and store in a matrix
-- Preparation Phase
8.4 Compute HV for SI
8.5 Compute & store Color Details of SI
8.6 Apply Canny filter to SI
8.7 Compute & store location of non-zero pixels
-- Comparison Phase
8.8 For each image in the TD:
    8.8.1 Compare HV of SI with the stored HV of the image
    8.8.2 Compare CD of SI with the stored CD of the image
    8.8.3 Compute Hausdorff metric b/w SI and the image using the stored location of non-zero pixels
    8.8.4 Assign rank to image based on HV comparison, computed Hausdorff metric and the Color Details
-- Sorting Phase
8.9 Sort images of TD based on rank
8.10 Display images in sort order

9. REFERENCES

[1] Daniel P. Huttenlocher, Gregory A. Klanderman, and William J. Rucklidge. Comparing Images Using the Hausdorff Distance. IEEE Trans. Pattern Analysis and Machine Intelligence, September 1993.

[2] J. Canny. A Computational Approach to Edge Detection. IEEE Trans. Pattern Analysis and Machine Intelligence, November 1986.

[3] I. Sobel, G. Feldman. 'A 3x3 Isotropic Gradient Operator for Image Processing'. Presented at a talk at the Stanford Artificial Project in 1968; Pattern Classification and Scene Analysis, 1973.

[4] H. Alt, B. Behrends and J. Blomer. Measuring the Resemblance of Polygon Shapes. Proc. Seventh ACM Symposium on Computational Geometry, 1991.

[5] Herbert Schildt. C# 2.0: The Complete Reference, Second Edition. Tata McGraw-Hill, 2006.

[6] MSDN Library. msdn.microsoft.com/en-us/library/default.aspx
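A minimal sketch of the comparison and sorting phases of section 8 follows, in Python (the actual implementation was in C#). Here `hausdorff_metric` and `color_distance` are hypothetical callables standing in for the computations described earlier, and a byte-level MD5 stands in for the hash value (HV):

```python
import hashlib
from pathlib import Path

def file_hash(path):
    """Byte-level hash; identical files (exact copies) collide.
    Stands in for the hash value (HV) of the summary above."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()

def rank_images(source, target_dir, hausdorff_metric, color_distance):
    """Comparison and sorting phases (steps 8.8-8.10): an exact hash match
    ranks first; otherwise images are ordered by the Hausdorff metric,
    with the colour distance as a tie-breaker."""
    src_hash = file_hash(source)
    ranked = []
    for img in sorted(Path(target_dir).iterdir()):
        if file_hash(img) == src_hash:
            key = (0, 0.0, 0.0)  # exact duplicate of the source image
        else:
            key = (1, hausdorff_metric(source, img), color_distance(source, img))
        ranked.append((key, img))
    ranked.sort(key=lambda pair: pair[0])
    return [img for _, img in ranked]
```

The tuple key makes the ordering explicit: exact matches first, then increasing Hausdorff metric, then increasing colour distance, mirroring the rank assignment of step 8.8.4.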
FIGURES

[7.1.1] [7.1.2]
[7.2.1] [7.2.2]
[7.3.1] [7.3.2] [7.3.3]
[7.3.4] [7.3.5]
[7.4.1] [7.4.2] [7.4.3]
[7.5.1]
[7.6.1] [7.6.2] [7.6.3]