Ms. T. Primya
Assistant Professor
Department of Computer Science and Engineering
Dr. N. G. P. Institute of Technology
Coimbatore
 facts provided or learned about something or someone.
 what is conveyed or represented by a particular arrangement
or sequence of things.
 informing, telling, thing told, knowledge, items of knowledge,
news
 knowledge communicated or received concerning a particular
fact or circumstance
 knowing familiarity gained by experience
 person’s range of information
 a theoretical or practical understanding of the sum of what is
known
Information retrieval (introduction)
 Data
The raw material of information
 Information
Data organized and presented in a particular manner
 Knowledge
“Justified true belief”
Information that can be acted upon
 Wisdom
Distilled and integrated knowledge
Demonstrative of high-level “understanding”
 Data
98.6º F, 99.5º F, 100.3º F, 101º F, …
 Information
Hourly body temperature: 98.6º F, 99.5º F, 100.3º F, 101º F,..
 Knowledge
If you have a temperature above 100º F, you most likely have
a fever
 Wisdom
If you don’t feel well, go see a doctor
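The progression from data to knowledge in the temperature example can be sketched in a few lines of Python. This is only an illustration: the names `readings_f`, `hourly`, and `likely_fever` are made up here, and the 100º F threshold is the one quoted above.

```python
# Data: bare numbers with no context.
readings_f = [98.6, 99.5, 100.3, 101.0]

# Information: the same numbers organized as hourly body temperatures.
hourly = {f"hour {i}": temp for i, temp in enumerate(readings_f)}

# Knowledge: a rule that can be acted upon (threshold from the example above).
def likely_fever(temp_f, threshold_f=100.0):
    return temp_f > threshold_f

feverish_hours = [hour for hour, temp in hourly.items() if likely_fever(temp)]
print(feverish_hours)
```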
 Information as process
 Information as communication
 Information as message transmission and reception
 Information = characteristics of the output of a process
◦ Tells us something about the process and the input
 Information-generating processes do not occur in isolation
(separation)
 Communication = transmission of information
 Communication = producing the same message at the
destination that was sent at the source
The message must be encoded for transmission across a
medium (called channel)
But the channel is noisy and can distort the message
 Semantics (meaning) is irrelevant to this model; only faithful
reproduction of the message matters
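The encode, transmit, decode view of communication can be sketched as follows. This is a minimal illustration, not part of the original slides: it assumes a binary message, a channel that flips each bit independently, and a simple repetition code with majority voting. The function names are hypothetical.

```python
import random

def encode(bits, repeat=3):
    """Repetition code: send each bit `repeat` times."""
    return [b for b in bits for _ in range(repeat)]

def noisy_channel(signal, flip_prob, rng):
    """Flip each transmitted bit independently with probability flip_prob."""
    return [b ^ 1 if rng.random() < flip_prob else b for b in signal]

def decode(signal, repeat=3):
    """Majority vote over each group of `repeat` received bits."""
    groups = [signal[i:i + repeat] for i in range(0, len(signal), repeat)]
    return [1 if sum(g) * 2 > len(g) else 0 for g in groups]

message = [1, 0, 1, 1, 0, 0, 1, 0]
rng = random.Random(0)  # fixed seed so the run is repeatable
received = decode(noisy_channel(encode(message), flip_prob=0.05, rng=rng))
print("sent:    ", message)
print("received:", received)
```

Note that the code never inspects what the bits mean; it only tries to reproduce them at the destination.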
 Fetch something that’s been stored
 Recover a stored state of knowledge
 Search through stored messages to find some messages
relevant to the task at hand
 The tracing and recovery of specific information from stored
data.
 It is the activity of obtaining information system resources
relevant to an information need from a collection of
information resources. Searches can be based on full-text or
other content-based indexing.
 Information retrieval is the science of searching for
information in a document, searching for documents
themselves, and also searching for metadata that describe data,
and for databases of texts, images or sounds.
 An information retrieval process begins when a user enters a
query into the system.
 Queries are formal statements of information needs, for
example search strings in web search engines.
 In information retrieval a query does not uniquely identify a
single object in the collection.
 Instead, several objects may match the query, perhaps with
different degrees of relevancy.
 An object is an entity that is represented by information in a
content collection or database. User queries are matched
against the database information.
 In information retrieval the results returned may or may not
match the query, so results are typically ranked.
 This ranking of results is a key difference between information
retrieval searching and database searching.
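The contrast between exact matching and ranked retrieval can be sketched as below. The three toy documents and the scoring rule (count of query-term occurrences) are assumptions made for illustration; real systems use more refined weighting.

```python
from collections import Counter

docs = {
    "d1": "information retrieval ranks documents by relevance",
    "d2": "a database returns rows that match a query exactly",
    "d3": "search engines rank web documents for a user query",
}

def exact_match(query, docs):
    """Database-style lookup: a document either equals the query or it does not."""
    return [doc_id for doc_id, text in docs.items() if text == query]

def ranked_search(query, docs):
    """IR-style lookup: score documents by query-term occurrences, best first."""
    terms = query.lower().split()
    scores = {}
    for doc_id, text in docs.items():
        counts = Counter(text.lower().split())
        score = sum(counts[t] for t in terms)
        if score > 0:
            scores[doc_id] = score
    # Sort by descending score; break ties by document id for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

print(exact_match("documents query", docs))
print(ranked_search("documents query", docs))
```

Here the exact lookup finds nothing, while the ranked search returns all three documents with the one containing both query terms first.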
 Retrospective
“Searching the past”
Different queries posed against a static collection
Time invariant
 Prospective
“Searching the future”
Static query posed against a dynamic collection
Time dependent
Ad hoc retrieval: find documents “about this”
 Compile a list of mammals that are considered to be
endangered, identify their habitat and, if possible, specify what
threatens them.
Known item search
 Find Jimmy Lin’s homepage.
 What’s the ISBN number of “Introduction to Information
Retrieval”?
Directed exploration
 Who makes the best chocolates?
Question answering
“Factoid”
 Who discovered America?
 When did Tamil Nadu become a state?
 What team won the World Series in 1998?
“List”
 What countries export oil?
 Name Indian cities that have tourist spots.
“Definition”
 Who is Information?
 What is Retrieval?
 Filtering:
Make a binary decision about each incoming document
Ex: Spam or not
 Routing:
Sort incoming documents into different bins
Ex: Categorize news headlines:
World, Nation, Metro, Sports
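Filtering and routing both apply a fixed rule to a stream of incoming documents. A minimal sketch, assuming simple keyword rules: the word lists in `SPAM_WORDS` and `BINS` are hypothetical and chosen only to make the example run, and "Nation" is used as an arbitrary default bin.

```python
SPAM_WORDS = {"lottery", "prize", "winner"}  # hypothetical keyword list
BINS = {                                     # hypothetical routing rules
    "Sports": {"match", "cricket", "tournament"},
    "World": {"summit", "treaty", "united"},
    "Metro": {"traffic", "city", "council"},
}

def is_spam(document):
    """Filtering: one yes/no decision per incoming document."""
    words = set(document.lower().split())
    return bool(words & SPAM_WORDS)

def route(headline, default="Nation"):
    """Routing: place the headline in the bin whose keywords it overlaps most."""
    words = set(headline.lower().split())
    best_bin, best_overlap = default, 0
    for name, keywords in BINS.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best_bin, best_overlap = name, overlap
    return best_bin

print(is_spam("You are a lottery winner"))
print(route("City council debates traffic plan"))
```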
Database (definition):
A structured set of data held in a computer, especially one
that is accessible in various ways.
Example:
Banks storing account information
Retailers storing inventories
Universities storing student grades
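The student-grades example can be sketched with Python's built-in `sqlite3` module. The table name, column names, and rows are invented for illustration. The point is that the query is formally defined and every returned row satisfies it exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE grades (student TEXT, course TEXT, grade INTEGER)")
conn.executemany(
    "INSERT INTO grades VALUES (?, ?, ?)",
    [("Asha", "IR", 91), ("Ravi", "IR", 78), ("Meena", "DBMS", 85)],
)

# A structured query: students in the IR course with a grade of at least 80.
rows = conn.execute(
    "SELECT student, grade FROM grades "
    "WHERE course = ? AND grade >= ? ORDER BY student",
    ("IR", 80),
).fetchall()
print(rows)
conn.close()
```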
Aspect | Database | IR
What we’re retrieving | Structured data. Clear semantics based on a formal model. | Mostly unstructured. Free text with some metadata.
Queries we’re posing | Formally defined queries. Unambiguous. | Vague, imprecise information needs.
Results we get | Exact. Always correct in a formal sense. | Sometimes relevant, often not.
Interaction with system | One-shot queries. | Interaction is important.
Other issues | Concurrency, recovery, atomicity are all critical. | Issues downplayed.
 Precision: What fraction of the returned results is relevant
to the information need?
 Recall: What fraction of the relevant documents in the
collection was returned by the system?
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)

              | Relevant             | Non-relevant
Retrieved     | True positives (TP)  | False positives (FP)
Not retrieved | False negatives (FN) | True negatives (TN)
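The two formulas can be checked on a small worked example. The document ids and relevance judgments below are made up; the function name `precision_recall` is likewise an assumption of this sketch.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall from collections of document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)   # relevant documents that were returned
    fp = len(retrieved - relevant)   # returned but not relevant
    fn = len(relevant - retrieved)   # relevant but missed
    # Guard against division by zero when either set is empty.
    precision = tp / (tp + fp) if retrieved else 0.0
    recall = tp / (tp + fn) if relevant else 0.0
    return precision, recall

retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d1", "d3", "d5"}
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

With TP = 2, FP = 2, and FN = 1, precision is 2/4 and recall is 2/3.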
Crawling:
 The system browses the document collection and fetches
documents
Indexing:
 The system builds an index of the documents fetched during
crawling
Ranking:
 The system retrieves documents that are relevant to the query
from the index, ranks them, and displays them to the user
Relevance feedback:
 The initial results returned from a given query may be used to
refine the query itself
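The four stages above can be sketched end to end on a toy collection. This is a simplified illustration under stated assumptions: crawling is simulated by a hard-coded dictionary, ranking uses raw term counts, and relevance feedback naively appends the first few unseen terms of a document the user marked relevant. All names are hypothetical.

```python
from collections import Counter, defaultdict

# Crawling is simulated: the "fetched" documents are given directly.
collection = {
    "d1": "jaguar is a big cat found in south america",
    "d2": "the jaguar car company builds luxury cars",
    "d3": "big cat species include the lion tiger and jaguar",
}

def build_index(docs):
    """Indexing: map each term to the documents containing it, with counts."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term][doc_id] += 1
    return index

def rank(query_terms, index):
    """Ranking: sum term counts per document and sort by score, best first."""
    scores = Counter()
    for term in query_terms:
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    return sorted(scores, key=lambda d: (-scores[d], d))

def expand_query(query_terms, feedback_doc, docs, extra=2):
    """Relevance feedback: append the first `extra` unseen terms of a relevant document."""
    seen = set(query_terms)
    new_terms = [t for t in dict.fromkeys(docs[feedback_doc].split()) if t not in seen]
    return list(query_terms) + new_terms[:extra]

index = build_index(collection)
first = rank(["jaguar"], index)
refined = rank(expand_query(["jaguar"], "d3", collection), index)
print(first)
print(refined)
```

After the user marks `d3` (an animal document) as relevant, the expanded query moves the car document `d2` to the bottom of the ranking.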
