Cost Estimation in Query Optimization
• The main aim of query optimization is to
choose the most efficient way of implementing
the relational algebra operations, that is, the
one with the lowest possible cost.
• The query optimizer should not depend solely
on heuristic rules; it should also estimate
the cost of executing the different strategies
and choose the strategy with the minimum
cost estimate.
• The cost functions used in query optimization are
estimates, not exact cost functions.
• The cost of an operation depends heavily on
its selectivity, that is, the proportion of the
input tuples that forms the output.
• In general, different algorithms are suitable
for low- and high-selectivity queries.
• For the query optimizer to choose a suitable
algorithm for an operation, an estimate of the
cost of executing that algorithm must be
provided.
• The cost of an algorithm depends on the
cardinality of its input.
• To estimate the cost of different query execution
strategies, the query tree is viewed as containing
a series of basic operations that are linked in
order to perform the query.
• It is also important to know the expected
cardinality of an operation’s output, because this
forms the input to the next operation.
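The way one operation's output cardinality becomes the next operation's input can be sketched as follows. This is a minimal illustration only: the function name propagate_cardinality and the example selectivities are assumptions made for this sketch and do not come from the slides.

```python
import math


def propagate_cardinality(input_tuples, selectivities):
    """Estimate the output cardinality after each operation in a chain.

    Each selectivity is the fraction of input tuples that an operation
    is expected to pass on; the output of one operation is the input
    of the next. (Hypothetical helper, for illustration only.)
    """
    estimates = []
    current = input_tuples
    for selectivity in selectivities:
        # Round up so that a non-empty estimate never drops to zero tuples.
        current = math.ceil(current * selectivity)
        estimates.append(current)
    return estimates


# A relation of 10,000 tuples passing through two selections.
print(propagate_cardinality(10_000, [0.25, 0.5]))  # [2500, 1250]
```

An error in an early estimate carries through every later operation, which is why the expected output cardinality matters as much as the cost of each individual step.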
Cost Components of Query Execution
• The cost of executing the query includes the
following components:
– Access cost to secondary storage.
– Storage cost.
– Computation cost.
– Memory usage cost.
– Communication cost.
Importance of Access cost
• Out of the above five cost components, the most
important is the secondary storage access cost.
• The emphasis of the cost minimization depends
on the size and type of the database application.
• For example, in smaller databases the emphasis is
on minimizing the computation cost, because
most of the data in the files involved in the query
can be stored completely in main memory.
• For large databases, the main emphasis is on
minimizing the access cost to secondary storage.
• For distributed databases, the communication cost
must also be minimized, because many sites are
involved in the data transfer.
• To estimate the cost of various execution
strategies, we must keep track of any information
that is needed for the cost functions.
This information may be stored in the database
catalog, where it is accessed by the query
optimizer.
Information in the System Catalog
• The number of tuples in relation R, as [nTuples(R)].
• The average record size in relation R.
• The number of blocks required to store relation R, as
[nBlocks(R)].
• The blocking factor of relation R (that is, the number of
tuples of R that fit into one block), as [bFactor(R)].
• The primary access method for each file.
• The primary access attributes for each file.
• The number of levels of each multilevel index I (primary,
secondary or clustering) on attribute A, as [nLevelsA(I)].
• The number of first-level index blocks, as [nBlocksA(I)].
• The number of distinct values that appear for
attribute A in relation R, as [nDistinctA(R)].
• The minimum and maximum possible values for
attribute A in relation R, as [minA(R), maxA(R)].
• The selectivity of an attribute, which is the fraction of
records satisfying an equality condition on the
attribute.
• The selection cardinality of a given attribute A in relation
R, as [SCA(R)].
• The selection cardinality is the average number of
tuples that satisfy an equality condition on attribute
A.
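Several of these statistics can be derived from one another. The sketch below assumes that tuples are packed densely into blocks and that the values of attribute A are uniformly distributed, so that the selection cardinality is nTuples(R) / nDistinctA(R). The class RelationStats and its field names are hypothetical and used only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class RelationStats:
    """Catalog statistics for a relation R and one attribute A (hypothetical)."""
    n_tuples: int    # nTuples(R)
    b_factor: int    # bFactor(R): tuples of R that fit into one block
    n_distinct: int  # nDistinctA(R): distinct values of attribute A

    @property
    def n_blocks(self):
        # nBlocks(R), assuming tuples are packed densely into blocks.
        return math.ceil(self.n_tuples / self.b_factor)

    @property
    def selectivity(self):
        # Fraction of records matching an equality condition on A,
        # assuming a uniform distribution of values.
        return 1 / self.n_distinct

    @property
    def selection_cardinality(self):
        # SCA(R): average number of tuples matching an equality condition on A,
        # again under the uniform-distribution assumption.
        return self.n_tuples / self.n_distinct


stats = RelationStats(n_tuples=3000, b_factor=30, n_distinct=500)
print(stats.n_blocks)               # 100
print(stats.selection_cardinality)  # 6.0
```

If A is a key attribute, nDistinctA(R) equals nTuples(R) and the selection cardinality is 1.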
Cost functions for SELECT Operation
• Linear Search:
– [nBlocks(R)/2], if the record is found.
– [nBlocks(R)], if no record satisfies the condition.
• Binary Search:
– [log2(nBlocks(R))], if the equality condition is on a key attribute,
because SCA(R) = 1 in this case.
– [log2(nBlocks(R))] + [SCA(R)/bFactor(R)] - 1, otherwise.
• Equality condition on Primary key:
– [nLevelsA(I) + 1]
• Equality condition on Non-Primary key:
– [nLevelsA(I) + 1] + [nBlocks(R)/2]
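The cost functions above translate directly into code. The sketch below assumes that the square brackets denote rounding up to a whole number of block accesses; the function and parameter names are hypothetical, and the non-primary-key formula is reproduced as the slide states it.

```python
import math


def linear_search_cost(n_blocks, found=True):
    """Block accesses for a linear search over nBlocks(R) blocks."""
    # On average half the file is read when the record exists;
    # the whole file is read when nothing matches.
    return math.ceil(n_blocks / 2) if found else n_blocks


def binary_search_cost(n_blocks, sc=1, b_factor=1, on_key=True):
    """Block accesses for a binary search on an ordered file."""
    base = math.ceil(math.log2(n_blocks))
    if on_key:
        return base  # SCA(R) = 1, so one block holds the answer
    # Otherwise the SCA(R) matching tuples may span further blocks.
    return base + math.ceil(sc / b_factor) - 1


def primary_key_index_cost(n_levels):
    """Equality on a primary key: traverse the index, then read one block."""
    return n_levels + 1


def non_primary_key_index_cost(n_levels, n_blocks):
    """Equality on a non-primary key, following the formula on the slide."""
    return (n_levels + 1) + math.ceil(n_blocks / 2)


# Example: a file of 100 blocks, bFactor = 30, SCA(R) = 6, 2 index levels.
print(linear_search_cost(100))                                    # 50
print(linear_search_cost(100, found=False))                       # 100
print(binary_search_cost(100))                                    # 7
print(binary_search_cost(100, sc=6, b_factor=30, on_key=False))   # 7
print(primary_key_index_cost(2))                                  # 3
print(non_primary_key_index_cost(2, 100))                         # 53
```

With these example figures the index on the primary key is by far the cheapest access path, which is the kind of comparison the optimizer makes when choosing among algorithms.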
Cost functions for JOIN Operation
• The JOIN operation is usually the most time-consuming
operation to process.
• An estimate of the size (number of tuples) of the
file that results from the JOIN operation is
required to develop reasonably accurate cost
functions for JOIN operations.
• The JOIN operation defines a relation
containing the tuples that satisfy a specific predicate
F from the Cartesian product of two relations R
and S.
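Because a join keeps only those tuples of the Cartesian product that satisfy F, its result can never be larger than nTuples(R) * nTuples(S). The sketch below illustrates this with a deliberately naive join; the helper names theta_join and join_size_upper_bound and the sample relations are assumptions made for the example, not an actual join strategy from the slides.

```python
from itertools import product


def theta_join(r, s, predicate):
    """Naive join: keep the pairs of the Cartesian product that satisfy F."""
    return [(x, y) for x, y in product(r, s) if predicate(x, y)]


def join_size_upper_bound(n_tuples_r, n_tuples_s):
    """The join result is at most as large as the Cartesian product."""
    return n_tuples_r * n_tuples_s


# Two small relations; the first field of each tuple is the join attribute.
R = [(1, "a"), (2, "b"), (3, "c")]
S = [(1, "x"), (1, "y"), (4, "z")]

result = theta_join(R, S, lambda x, y: x[0] == y[0])
print(len(result))                            # 2
print(join_size_upper_bound(len(R), len(S)))  # 9
```

The gap between the actual size (2) and the upper bound (9) shows why the optimizer needs a sharper size estimate than the Cartesian product alone.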
Different strategies for JOIN operations

