BY
NANTHINI R O
II – MLIS
PONDICHERRY UNIVERSITY








Theory-based approach to designing various
aspects of information retrieval systems
Based on a set of principles and assumptions

Theory drives experiment by suggesting new
ways and means of doing tests
Experiment drives theory by justifying or
helping to improve the model


Cognitive or user-centered
◦ Human information behaviour models
◦ E.g. Wilson’s model, Dervin’s model, Ellis’s model,
Bates’s model, Kuhlthau’s model, etc.



Structural or system-centered
◦ Classical models based on logical and mathematical
principles
◦ E.g. Boolean search model, vector space model,
probabilistic model, etc.








Also called the ‘term vector model’ or ‘vector
processing model’
Represents both documents and queries by term
sets and compares global similarities between
queries and documents
Used in information filtering, information
retrieval, indexing and relevancy ranking

First used in the SMART Information Retrieval
System


Term vectors are assigned for the keywords of the
documents, and weights are provided according to
relevance



Used to compare different texts and retrieve relevant
records similar to the queries



Terms are single words, keywords, or longer phrases



If words are chosen to be the terms, the
dimensionality of the vector is the number of words
in the vocabulary (the number of distinct words occurring in the corpus)
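
A minimal sketch of the idea above, assuming words are the terms and raw term counts are the weights (the slides do not fix a weighting scheme). The corpus, the query, and the helper names `build_vocabulary` and `term_vector` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def build_vocabulary(corpus):
    """Return the sorted list of distinct words occurring in the corpus."""
    return sorted({word for text in corpus for word in text.lower().split()})


def term_vector(text, vocabulary):
    """Map a text to a vector of raw term counts, one component per vocabulary word."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


# Hypothetical three-document corpus, for illustration only.
corpus = [
    "information retrieval systems",
    "vector space model of information retrieval",
    "boolean search model",
]

vocabulary = build_vocabulary(corpus)
doc_vectors = [term_vector(text, vocabulary) for text in corpus]

# A query is represented in the same space as the documents.
query_vector = term_vector("vector model", vocabulary)

# The dimensionality of every vector equals the vocabulary size.
print(len(vocabulary), len(query_vector))
```

Every document and query vector has one component per distinct word in the corpus, so adding a document with new words increases the dimensionality of all vectors.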


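BASICS: (i and j are 2 documents, k – term, t – last term; $w_{ik}$ stands here for the weight of term k in document i)

◦ $\sum_{k=1}^{t} w_{ik}$ denotes the sum of the weights of all properties of
a vector

◦ $\sum_{k=1}^{t} w_{ik}\, w_{jk}$ denotes the sum of products of corresponding term
weights for two vectors
◦ $\sum_{k=1}^{t} \min(w_{ik}, w_{jk})$ denotes the sum of minimum component weights
of the corresponding two vectors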
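
The three quantities described above can be sketched in a few lines of Python. The function names are hypothetical, and the two vectors are the Doci and Docj weights used in this deck's worked example.

```python
def weight_sum(vec):
    """Sum of the weights of all components of one vector."""
    return sum(vec)


def product_sum(vec_i, vec_j):
    """Sum of products of corresponding term weights of two vectors."""
    return sum(a * b for a, b in zip(vec_i, vec_j))


def min_sum(vec_i, vec_j):
    """Sum of the minimum of each pair of corresponding weights."""
    return sum(min(a, b) for a, b in zip(vec_i, vec_j))


doc_i = [3, 2, 1, 0, 0, 0, 1, 1]
doc_j = [1, 1, 1, 0, 0, 1, 0, 0]

print(weight_sum(doc_i))          # 8
print(weight_sum(doc_j))          # 4
print(product_sum(doc_i, doc_j))  # 6
print(min_sum(doc_i, doc_j))      # 3
```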


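Similarity coefficients (according to Salton and McGill; $w_{ik}$ stands here for the weight of term k in document i)
◦ The Dice Coefficient: $\dfrac{2\sum_{k=1}^{t} w_{ik}\, w_{jk}}{\sum_{k=1}^{t} w_{ik} + \sum_{k=1}^{t} w_{jk}}$

◦ The Jaccard Coefficient: $\dfrac{\sum_{k=1}^{t} w_{ik}\, w_{jk}}{\sum_{k=1}^{t} w_{ik} + \sum_{k=1}^{t} w_{jk} - \sum_{k=1}^{t} w_{ik}\, w_{jk}}$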
Let the weights for the index terms assigned to two
documents i and j be as follows:

Doci = 3,2,1,0,0,0,1,1
Docj = 1,1,1,0,0,1,0,0

Dice coefficient
= 2 [(3*1)+(2*1)+(1*1)+(0*0)+(0*0)+(0*1)+(1*0)+(1*0)] / [(3+2+1+0+0+0+1+1)+(1+1+1+0+0+1+0+0)]
= 12/12 = 1

Jaccard coefficient
= 6/(12-6)
= 1
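
The worked example above can be checked with a short Python sketch. It assumes the sum-based forms of the two coefficients implied by the arithmetic in the example; the function names `dice` and `jaccard` are hypothetical, and returning 0.0 for all-zero vectors is a convention chosen here, not something the slides specify.

```python
def dice(vec_i, vec_j):
    """Dice coefficient: 2 * sum of products / (sum of weights of i + sum of weights of j)."""
    products = sum(a * b for a, b in zip(vec_i, vec_j))
    total = sum(vec_i) + sum(vec_j)
    return 2 * products / total if total else 0.0


def jaccard(vec_i, vec_j):
    """Jaccard coefficient: sum of products / (sum of weights of i and j - sum of products)."""
    products = sum(a * b for a, b in zip(vec_i, vec_j))
    denominator = sum(vec_i) + sum(vec_j) - products
    return products / denominator if denominator else 0.0


doc_i = [3, 2, 1, 0, 0, 0, 1, 1]
doc_j = [1, 1, 1, 0, 0, 1, 0, 0]

print(dice(doc_i, doc_j))     # 1.0  (12 / 12)
print(jaccard(doc_i, doc_j))  # 1.0  (6 / (12 - 6))
```

With binary (0/1) weights, the sum of products counts the shared terms and each weight sum counts a document's terms, so these functions reduce to the set-based Dice and Jaccard coefficients.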
Vector space model of information retrieval
