Presentation: Information Retrieval Techniques
Topic: Normalization
Submitted by: Iqra Tamseela
Roll no: 20 21
Our Content
• IR tool
Why use information retrieval tools?
Retrieval tools are systems created for the retrieval of information. They are
essential building blocks for any system that organizes recorded information
collected by libraries, archives, museums, etc.
What is Normalization?
Normalization is the process of organizing data to minimize redundancy.
It usually involves dividing a database into two or more tables and
defining relationships between those tables. For example, an employee
table would contain only one birthdate field. Organizing data into
related tables in this way eliminates redundancy and increases data
integrity, which can also improve query performance.
What Normalization Avoids
• Duplication of data - the same data is stored in multiple rows of the database.
• Insert anomaly - a record about one entity cannot be inserted into the table without first
inserting information about another entity. Example: a customer cannot be entered without a sales order.
• Delete anomaly - a record cannot be deleted without also deleting a record about a related entity.
Example: a sales order cannot be deleted without also deleting the customer's information stored with it.
• Update anomaly - information cannot be updated without changing it in many places.
Example: to update customer information, every sales order the customer has placed must be updated.
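The anomalies listed above can be illustrated with a small sketch. It uses Python's standard-library sqlite3 module. The table names, column names, and sample rows (customers, sales_orders, 'Customer A', and so on) are hypothetical and not taken from the slides. Customer data is kept apart from sales orders, so a customer can be inserted without an order, updated in one place, and kept when an order is deleted.

```python
import sqlite3

def build_normalized_db():
    """Create an in-memory database with customers and orders in separate tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            city        TEXT NOT NULL
        );
        CREATE TABLE sales_orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
            amount      REAL NOT NULL
        );
    """)
    return conn

conn = build_normalized_db()

# No insert anomaly: a customer can exist before placing any order.
conn.execute("INSERT INTO customers VALUES (1, 'Customer A', 'City X')")
conn.execute("INSERT INTO customers VALUES (2, 'Customer B', 'City Y')")

conn.execute("INSERT INTO sales_orders VALUES (10, 1, 250.0)")
conn.execute("INSERT INTO sales_orders VALUES (11, 1, 99.5)")

# No update anomaly: the city is stored once, so a single UPDATE is enough.
conn.execute("UPDATE customers SET city = 'City Z' WHERE customer_id = 1")

# No delete anomaly: removing an order leaves the customer row intact.
conn.execute("DELETE FROM sales_orders WHERE order_id = 10")

customer_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM sales_orders").fetchone()[0]
city = conn.execute(
    "SELECT city FROM customers WHERE customer_id = 1"
).fetchone()[0]
print(customer_count, order_count, city)
```

In a single unnormalized table, the city would be repeated on every order row, and deleting the only order of a customer would also remove that customer's details.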
• Normalization aims to ensure that the database is well structured.
• To achieve control over data redundancy: there should be no unnecessary
duplication of data in different tables.
• To ensure that the table design is flexible.
Advantages of Normalization
• Searching, sorting, and creating indexes is faster, since tables are narrower and
more rows fit on a data page.
• You usually have more tables.
• Index searching is often faster.
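• The term "frequency" is commonly misunderstood. To some, it seems to
be the raw count of occurrences. But usually, frequency is a relative value.
• TF/IDF is usually a two-fold normalization. First, each document is normalized to
length 1, so there is no bias for longer or shorter documents. Second, IDF
normalizes across documents by giving less weight to terms that occur in many documents.
• Formula (normalizing by the maximum term frequency, one common variant):
ntf_i = tf_i / tf_max
where tf_i is the raw count of term i in the document and tf_max is the highest
raw count of any term in that document.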
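As a rough illustration of the formula above, the following Python sketch divides each raw term count in a document by the largest count in that document. The function name max_tf_normalize and the sample sentence are made up for this example, and splitting on whitespace is a simplifying assumption about tokenization.

```python
from collections import Counter

def max_tf_normalize(tokens):
    """Return {term: tf / tf_max} for one document, given its list of tokens."""
    counts = Counter(tokens)
    if not counts:
        return {}
    tf_max = max(counts.values())  # highest raw count in this document
    return {term: tf / tf_max for term, tf in counts.items()}

doc = "to be or not to be".split()
weights = max_tf_normalize(doc)
print(weights)
```

Here "to" and "be" occur twice and receive weight 1.0, while "or" and "not" occur once and receive 0.5, so every weight falls between 0 and 1 regardless of document size.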
Disadvantages of Normalization
• More complicated SQL is required for multi-table subqueries and
joins.
• Extra work for the DBMS can mean slower applications.
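The extra join work mentioned above can be seen in a small sketch, again using Python's sqlite3 module with hypothetical table names and sample rows. Because customer names and order amounts live in different tables, even a simple "total per customer" report needs a JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales_orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        amount      REAL
    );
    INSERT INTO customers VALUES (1, 'Customer A'), (2, 'Customer B');
    INSERT INTO sales_orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 40.0);
""")

# Names and amounts are stored in different tables, so the report joins them.
rows = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN sales_orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()
print(rows)
```

With a single wide table the same report would need no join, which is the trade-off the slide describes: less redundancy in exchange for more work per query.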
Normal Forms
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Fourth Normal Form (4NF)
• Fifth Normal Form (5NF)
Information Retrieval Techniques and Normalization
Types of Normalization
• Document length normalization adjusts the term frequency or the
relevance score in order to normalize the effect of document length on the
document ranking.
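One way to picture document length normalization is cosine (Euclidean) normalization, sketched below in Python. This is only one of several schemes and the slides do not say which one is meant. The function name and the sample documents are invented for illustration. Each raw term count is divided by the Euclidean length of the document's count vector, so a document that repeats the same text many times ends up with the same weights as the short version.

```python
import math
from collections import Counter

def cosine_normalize(tokens):
    """Scale a document's raw term counts so the vector has Euclidean length 1."""
    counts = Counter(tokens)
    length = math.sqrt(sum(tf * tf for tf in counts.values()))
    if length == 0:
        return {}
    return {term: tf / length for term, tf in counts.items()}

short_doc = "cat sat".split()
long_doc = ("cat sat " * 50).split()  # same content repeated 50 times

short_vec = cosine_normalize(short_doc)
long_vec = cosine_normalize(long_doc)
print(short_vec["cat"], long_vec["cat"])
```

Without this step the long document would score 50 times higher for the query term "cat" on raw counts alone, even though it carries no additional information.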
• We may need to “normalize” words in the indexed text, as well as query words, into the same
form.
• Example: we want to match U.S.A. and USA.
• Tokens are transformed into terms, which are then entered into the index.
• A term is a (normalized) word type, which is an entry in the IR system's dictionary.
• We most commonly define equivalence classes of terms implicitly, e.g. by:
  • deleting periods to form a term:
    U.S.A., USA → USA
  • deleting hyphens to form a term:
    anti-discriminatory, antidiscriminatory → antidiscriminatory
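The two rules above (deleting periods and deleting hyphens) can be sketched in Python as follows. The function names, the tiny inverted index, and the sample documents are hypothetical, and whitespace splitting stands in for a real tokenizer. The key point is that indexed text and query words pass through the same normalization function.

```python
def normalize_token(token: str) -> str:
    """Map a token to its term by deleting periods and hyphens."""
    return token.replace(".", "").replace("-", "")

def build_index(docs):
    """Build a tiny inverted index: term -> sorted list of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            term = normalize_token(token)
            if term:  # skip tokens that were only punctuation
                index.setdefault(term, set()).add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

docs = {
    1: "made in U.S.A",
    2: "USA exports",
    3: "anti-discriminatory policy",
}
index = build_index(docs)

# Query words go through the same normalization as the indexed text.
usa_hits = index[normalize_token("U.S.A.")]
anti_hits = index[normalize_token("antidiscriminatory")]
print(usa_hits, anti_hits)
```

Both spellings of USA land in the same equivalence class, so a query in either form retrieves documents 1 and 2.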
• Accents: e.g., French résumé vs. resume.
• A simple remedy is to remove accents, but this does not work well when the
accented and unaccented forms are different words, as with résumé and resume.
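A minimal sketch of the "remove accents" remedy, using Python's standard unicodedata module: the text is decomposed into base letters plus combining marks (NFD), and the combining marks are dropped. The function name strip_accents is chosen for this example only.

```python
import unicodedata

def strip_accents(text: str) -> str:
    """Remove combining accent marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

accented = "r\u00e9sum\u00e9"   # the French-derived noun, with accents
plain = "resume"                # the English verb, no accents

print(strip_accents(accented), strip_accents(plain))
```

Both inputs map to the same term, which shows the drawback noted above: after accent removal the index can no longer tell the two words apart.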
Thanks for paying attention
