Similar Image Search with a Tiny
Bag-of-Delegates Representation

    Weiwen Tu, Rong Pan, Jingdong Wang
Similar Image Search
• text-based image search
• content-based image search
Tiny Bag-of-Delegates
Example for Tiny Bag-of-Delegates
Spatial Partition Tree
• A spatial partition tree recursively subdivides
  a space into subsets by hyperplanes.

• Popular data structures: Kd-tree, PCA-tree,
  VP-tree.
Example for Spatial Partition Tree

The figure shows a spatial partitioning induced by an RP tree.
The cross is a query point and the lines mark the partition
hyperplanes.
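
A minimal sketch of such a partitioning, assuming a random-projection style tree that splits at the median projection and stops when a leaf holds at most `bucket_size` points. The function names, the median split rule, and the dictionary node layout are illustrative assumptions, not the construction used in the slides.

```python
import random

def build_rp_tree(points, ids, bucket_size, rng):
    """Recursively split `ids` by random hyperplanes (illustrative sketch)."""
    if len(ids) <= bucket_size:
        return {"leaf": True, "ids": list(ids)}
    dim = len(points[ids[0]])
    # Random direction: the normal vector of the splitting hyperplane.
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    proj = {i: sum(d * x for d, x in zip(direction, points[i])) for i in ids}
    # Assumed split rule: threshold at the median projection.
    values = sorted(proj.values())
    threshold = values[len(values) // 2]
    left = [i for i in ids if proj[i] < threshold]
    right = [i for i in ids if proj[i] >= threshold]
    if not left:
        # Degenerate case (e.g. duplicate points): stop splitting here.
        return {"leaf": True, "ids": list(ids)}
    return {
        "leaf": False,
        "direction": direction,
        "threshold": threshold,
        "left": build_rp_tree(points, left, bucket_size, rng),
        "right": build_rp_tree(points, right, bucket_size, rng),
    }

def find_bucket(tree, q):
    """Route a query point down the tree and return the ids in its leaf."""
    node = tree
    while not node["leaf"]:
        p = sum(d * x for d, x in zip(node["direction"], q))
        node = node["left"] if p < node["threshold"] else node["right"]
    return node["ids"]
```

A query is answered by calling `find_bucket(tree, q)`; the points sharing the query's leaf are the candidate neighbors from that tree.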
Tree Construction
Problem
A larger number of trees yields better performance,
but requires more storage for the inverted indices
and, accordingly, more query time.



Problem:
Can we use a small number of trees while still guaranteeing
search performance?
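
The storage trade-off can be illustrated with a small sketch. It assumes one inverted index per tree (bucket id to list of image ids), so the number of stored entries grows linearly with the number of trees. `make_partition` is a hypothetical stand-in for a real tree (a flat set of random hyperplanes), and all names here are assumptions for illustration only.

```python
import random
from collections import defaultdict

def make_partition(rng, dim, n_splits):
    """Hypothetical stand-in for one tree: vector -> bucket id."""
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_splits)]

    def bucket_of(x):
        # The bucket id is the tuple of sides the point falls on.
        return tuple(sum(w * v for w, v in zip(p, x)) >= 0.0 for p in planes)

    return bucket_of

def build_inverted_indices(points, partitions):
    """One inverted index per partition: bucket id -> list of point ids."""
    indices = []
    for bucket_of in partitions:
        index = defaultdict(list)
        for pid, x in enumerate(points):
            index[bucket_of(x)].append(pid)
        indices.append(index)
    return indices

def candidates(q, partitions, indices):
    """Union of the buckets q falls into, one bucket per partition."""
    found = set()
    for bucket_of, index in zip(partitions, indices):
        found.update(index.get(bucket_of(q), []))
    return found

def stored_postings(indices):
    """Total number of (bucket, point id) entries kept across all indices."""
    return sum(len(ids) for index in indices for ids in index.values())
```

With `n` points and `k` partitions, `stored_postings` returns `n * k`, and each query touches `k` buckets, which is the cost the problem statement refers to.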
Vocabulary Construction
Criterion
Forward selection scheme
Random selection vs. forward selection



[Figure: two panels, random selection and forward selection]
Adaptive scheme
Adaptive forward selection
1. Initialization:
    𝑄 ← 𝐿, 𝑡 ← 0, 𝑒 ← |𝑄|, 𝑅 ← ∅
2. Repeat
3. Candidate proposal
   Randomly generate a set 𝒯 of spatial partition trees for 𝑄
4. Candidate selection
   Choose the tree 𝑇 from 𝒯 that keeps the largest number of pairs in 𝑄 in the same bucket
5. Update
   Discard from 𝑄 all pairs of points lying in the same bucket of 𝑇
   𝑡 ← 𝑡 + 1, 𝑒 ← |𝑄|, 𝑅 ← 𝑅 ∪ {𝑇}
6. Until 𝑒 ≤ 𝜀 && 𝑡 ≥ 𝜏
7. Return 𝑅
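
The loop above can be sketched as follows. This is a sketch under stated assumptions, not the authors' implementation: a tree is represented only by its bucket assignment (point id to bucket id), `random_assignment` is a hypothetical stand-in for random tree generation, the stopping test follows the slide as written, and `max_rounds` is an added safety cap that is not in the slides.

```python
import random

def random_assignment(points, rng, n_splits):
    """Hypothetical stand-in for a random partition tree: point id -> bucket id."""
    dim = len(points[0])
    planes = []
    for _ in range(n_splits):
        w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        anchor = rng.choice(points)  # hyperplane passes through a random data point
        b = sum(wi * ai for wi, ai in zip(w, anchor))
        planes.append((w, b))
    return {
        pid: tuple(sum(wi * xi for wi, xi in zip(w, x)) >= b for w, b in planes)
        for pid, x in enumerate(points)
    }

def covered_pairs(assignment, pairs):
    """Pairs whose two points land in the same bucket."""
    return {(i, j) for i, j in pairs if assignment[i] == assignment[j]}

def adaptive_forward_selection(pairs, propose, n_candidates, eps, tau, max_rounds=1000):
    remaining = set(pairs)  # Q <- L
    t = 0                   # t <- 0
    selected = []           # R <- empty set
    while True:
        # Candidate proposal: several random trees, given the pairs still in Q.
        cands = [propose(remaining) for _ in range(n_candidates)]
        # Candidate selection: the tree keeping most remaining pairs together.
        best = max(cands, key=lambda a: len(covered_pairs(a, remaining)))
        # Update: drop the pairs this tree already handles.
        remaining -= covered_pairs(best, remaining)
        t += 1
        selected.append(best)
        e = len(remaining)  # e <- |Q|
        # Stopping test as written on the slide, plus the assumed safety cap.
        if (e <= eps and t >= tau) or t >= max_rounds:
            break
    return selected, remaining
```

Here `pairs` plays the role of 𝐿 (pairs of points that should share a bucket), and `propose` is any callable that returns a candidate assignment given the pairs still in 𝑄.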
EXPERIMENTS
• Data set
  – 32 × 32 color images
  – 1M tiny images sampled from a collection of around 80M images
  – global GIST descriptor (384-D vector)


• Evaluation criterion (average accuracy score)
Evaluation
Accuracy vs indices




Bucket size: the maximum number of points in a leaf node;
g: the number of target NNs.
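
The exact formula for the average accuracy score did not survive in this transcript. A minimal sketch, assuming the score is the fraction of the true g nearest neighbors that appear among the retrieved images, averaged over queries; this definition and all function names are assumptions, and the deck's own criterion may differ.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def true_neighbors(points, q, g):
    """Ids of the g exact nearest neighbors of q, found by brute force."""
    order = sorted(range(len(points)), key=lambda i: squared_distance(points[i], q))
    return order[:g]

def accuracy(retrieved_ids, true_ids):
    """Assumed score: share of the true neighbors that were retrieved."""
    return len(set(retrieved_ids) & set(true_ids)) / len(true_ids)

def average_accuracy(points, queries, retrieved_per_query, g):
    """Mean of the assumed per-query score over all queries."""
    scores = [
        accuracy(retrieved, true_neighbors(points, q, g))
        for q, retrieved in zip(queries, retrieved_per_query)
    ]
    return sum(scores) / len(scores)
```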
Accuracy vs indices
[Figure: two panels, bucket size = 300 and bucket size = 100]
Accuracy vs accessed images
Out-of-sample Test
Visual search result




13 : 9               22 : 9
Thank you!
