By
SATHISHKUMAR G
(sathishsak111@gmail.com)
• Homework assignments and programming exercises: ~40%
• Mid-term exam: ~25%
• Term project: ~35%
  • Including proposal, presentation, and final report
• About 3 programming exercises
  • Team-based (at most 2 persons per team)
  • You can either write your own code or reuse existing open source code
• The term project
  • Either team-based system development (the same as programming exercises)
  • Or academic paper presentation
    • Only one person per team allowed
  • A proposal is *required* before midterm (Apr. 11, 2014)
• The score you get depends on the functions, difficulty, and quality of your project
• For system development:
  • System functions and correctness
• For academic paper presentation:
  • Quality of the paper and of your presentation of it
  • Major methods/experimental results *must* be presented
  • Papers from top conferences are strongly suggested
    • E.g. SIGIR, WWW, CIKM, WSDM, JCDL, ICMR, …
• Proposals are *required* for each team, and will be counted in the score
• Submission instructions
  • Programs, project proposals, and project reports must be submitted to the TA as electronic files online at:
    • Submissions website: (TBD)
  • Before submission:
    • User name: your student ID
    • Please change your default password at your first login
• This course will NOT tell you
  • The tips and tricks of using search engines, although power users might have better ideas on how to improve them
    • There are plenty of books and websites on that…
  • How to find books in libraries, although it is somewhat related to the basic IR concepts
  • How to make money on the Web, although the currently largest search engine did it
• Things that you have been doing all day!
  • Searching for something interesting: Web, news, e-mail, image, video, …
  • Asking for advice
  • …
• User interests are changing all the time…
  • 2011: New Zealand Earthquake
  • 2012: Jeremy Lin
  • 2013: Meteor Russia
  • 2014: ? (next slide)
 An Introduction to Information Retrieval and Applications
• Blast
• Explosion
• Chelyabinsk
• Asteroid 2012 DA14
• …
• 流星 (meteor)
• 彗星 (comet)
• 隕石 (meteorite)
• 俄羅斯 (Russia)
• 地球 (Earth)
• …
• And other languages…
• And other search engines…
• And social websites…
• “Information retrieval is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.” (Salton, 1968)
• Information retrieval (IR): a research field that aims to search information in text and multimedia documents effectively and efficiently
• In this course, we will introduce the basic text and query models in IR, retrieval evaluation, indexing and searching, and applications of IR
[Figure: block diagram of a retrieval system. Components: User Interface, Text Operations, Query Expansion, Indexing, Retrieval, Ranking, Inverted Index, Document Collection. Labeled flows: user need, text, query, user feedback, retrieved docs, ranked docs, doc representation (logical view), inverted file.]
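
The indexing and retrieval components named in the diagram can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the toy documents, the whitespace tokenizer, and the function names `tokenize`, `build_inverted_index`, and `boolean_and` are hypothetical and do not come from the course material.

```python
from collections import defaultdict


def tokenize(text):
    """A very simple text operation: lowercase and split on whitespace."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the sorted list of IDs of documents that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def boolean_and(index, query):
    """Return the IDs of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical toy collection, used only for illustration.
docs = {
    1: "meteor explosion over Chelyabinsk",
    2: "asteroid 2012 DA14 passes Earth",
    3: "Chelyabinsk meteor blast injures hundreds",
}
index = build_inverted_index(docs)
print(boolean_and(index, "meteor chelyabinsk"))  # [1, 3]
```

A real system would add further text operations (stemming, stop-word handling) and would intersect sorted postings lists directly rather than building Python sets.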
• Text IR
  • Indexing and searching
  • Query languages and operations
  • Retrieval evaluation
• Modeling
  • Boolean model
  • Vector space model
  • Probabilistic model
• Applications for IR
  • Multimedia IR
  • Web search
  • Digital libraries
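
As a rough illustration of the vector space model listed above, the sketch below ranks documents by cosine similarity between TF-IDF vectors. The weighting used here (raw term frequency times log(N/df)) is one common variant chosen for brevity, not necessarily the one taught in the course, and the toy documents and function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build a sparse TF-IDF vector (term -> weight) for each document."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))  # count each term once per document
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        vectors[doc_id] = {term: tf[term] * idf[term] for term in tf}
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, vectors, idf):
    """Return document IDs ordered by decreasing similarity to the query."""
    q_tf = Counter(query.lower().split())
    q_vec = {term: q_tf[term] * idf.get(term, 0.0) for term in q_tf}
    scores = {doc_id: cosine(q_vec, vec) for doc_id, vec in vectors.items()}
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical toy collection, used only for illustration.
docs = {
    1: "meteor explosion over Chelyabinsk",
    2: "asteroid 2012 DA14 passes Earth",
    3: "Chelyabinsk meteor blast injures hundreds",
}
vectors, idf = tfidf_vectors(docs)
print(rank("meteor blast", vectors, idf))  # [3, 1, 2]
```

Unlike a Boolean query, this returns every document in a ranked order: document 3 matches both query terms, document 1 matches one, and document 2 matches none.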
• Basics in IR (focus)
  • Inverted indexes for Boolean queries (Ch. 1-5)
  • Term weighting and vector space model (Ch. 6-7)
  • Evaluation in IR (Ch. 8)
• Advanced Topics
  • Relevance feedback (Ch. 9)
  • XML retrieval (Ch. 10)
  • Probabilistic IR (Ch. 11)
  • Language models (Ch. 12)
• Machine learning in IR (useful)
  • Text classification (Ch. 13-15)
  • Document clustering (Ch. 16-18)
• Web Search
  • Web crawling and indexes (Ch. 19-20)
  • Link analysis (Ch. 21)
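
For the evaluation topic in the list above, two standard set-based measures are precision (the fraction of retrieved documents that are relevant) and recall (the fraction of relevant documents that are retrieved). The following is a minimal sketch; the document IDs and the relevance judgments are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Compute set-based precision and recall for a single query."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical system output and relevance judgments.
retrieved = [3, 1, 2]
relevant = [1, 3, 7, 9]
p, r = precision_recall(retrieved, relevant)
print(round(p, 2), r)  # 0.67 0.5
```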
• Text mining
• Machine Learning
• Natural Language Processing
• Social Network Analysis
• …
• Cross-language IR
• Image, video, and multimedia IR
• Speech retrieval
• Music retrieval
• User interfaces
• Parallel, distributed, and P2P IR
• Digital libraries
• Information science perspective
• Logic-based approaches to IR
• Natural language processing techniques
• …
• Before midterm
  • Boolean retrieval (1 wk)
  • Indexing (2 wks)
  • Vector space model and evaluation (2 wks)
  • Relevance feedback (1 wk)
  • Probabilistic IR (2 wks)
• After midterm
  • Text classification (1-2 wks)
  • Document clustering (1-2 wks)
  • Web search (2 wks)
  • Advanced topics: CLIR, IE, … (2 wks)
  • Term Project Presentation (3 wks)
• Wikipedia page on Information Retrieval: http://en.wikipedia.org/wiki/Information_retrieval
• Information Retrieval Resources: http://www-csli.stanford.edu/~hinrich/information-retrieval.html
• Journals
  • ACM TOIS: Transactions on Information Systems
  • JASIST: Journal of the American Society for Information Science and Technology
  • IP&M: Information Processing and Management
  • IEEE TKDE: Transactions on Knowledge and Data Engineering
• Conferences
  • ACM SIGIR: International Conference on Research and Development in Information Retrieval
  • WWW: World Wide Web Conference
  • ACM CIKM: Conference on Information and Knowledge Management
  • JCDL: ACM/IEEE Joint Conference on Digital Libraries
  • ACM WSDM: International Conference on Web Search and Data Mining
  • TREC: Text REtrieval Conference
• Slides and lectures will be offered mainly in English
• To help domestic students follow along, important concepts will be briefly summarized in Chinese
