Name- Anshika Das
Roll-14400121029
Subcode- PCC-CS601
Sub-DBMS
Department-CSE
Agenda
1. Introduction
• Defining the Role of Databases
• Overview of Key Components
2. Query Processing and Optimization
• Understanding the Query Lifecycle
• Techniques for Optimizing Queries
3. Database Security
• Importance of Secure Database Management
• Key Security Measures
4. Advanced Topics
• In-Memory Databases, NoSQL, Big Data, ML, Blockchain
Importance of Database Management
• Central role in storing, retrieving, and managing data.
• Foundation for various applications and systems.
Key Components:
Query Processing: Transformation of user queries into executable actions.
Optimization: Enhancing query performance through efficient algorithms.
Security: Safeguarding data integrity, confidentiality, and availability.
Introduction to Query Processing and Optimization
◼ Query processing:
◼ The set of activities performed to obtain the required tuples that satisfy a given query.
◼ Query optimization:
◼ The process of choosing a suitable execution strategy for processing a query.
Two internal representations of a query:
• Query Tree
• Query Graph
Query Processing
[Figure: steps in query processing. query → parser and translator → relational algebra expression → optimizer (using statistics about the data) → evaluation plan → evaluation engine (reading the data) → query output]
Process of query processing
Query processing is done in the following steps:
Step-1:
Parser: During the parse call, the database converts the query into relational algebra and performs three checks: a syntax check, a semantic check, and a shared pool check (refer to the detailed diagram).
1. Syntax check – determines whether the SQL is syntactically valid. Example:
SELECT * FORM employee
Here the misspelling of FROM is reported by this check.
2. Semantic check – determines whether the statement is meaningful. Example: a query that references a table which does not exist is rejected by this check.
3. Shared pool check – Every query is assigned a hash code during its execution. This check determines whether that hash code already exists in the shared pool; if it does, the database does not repeat the additional steps for optimization before execution.
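The syntax and semantic checks can be observed with any SQL engine. The sketch below uses Python's built-in `sqlite3` module as a stand-in, which is an assumption: the slides describe an Oracle-style database, and SQLite has no shared pool. The helper name `classify_parse_error` and the `employee` schema are hypothetical.

```python
import sqlite3


def classify_parse_error(conn, sql):
    """Return which parser check rejects the statement, or 'ok' if it runs."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.OperationalError as exc:
        message = str(exc)
        if "syntax error" in message:
            return "syntax check failed"
        if "no such table" in message:
            return "semantic check failed"
        return "other error"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT)")

# FROM is misspelled, so the statement is not valid SQL.
syntax_result = classify_parse_error(conn, "SELECT * FORM employee")
# The SQL is well formed, but the table name does not exist.
semantic_result = classify_parse_error(conn, "SELECT * FROM employe")
# A valid statement passes both checks.
valid_result = classify_parse_error(conn, "SELECT * FROM employee")

print(syntax_result, semantic_result, valid_result, sep="\n")
```

SQLite reports both problems through the same exception type, so the sketch tells them apart by the error message only.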
Hard Parse and Soft Parse –
If a query is new and its hash code does not exist in the shared pool, the query has to pass through additional steps; this is known as hard parsing. If the hash code exists, the query skips those steps and passes directly to the execution engine (refer to the detailed diagram); this is known as soft parsing.
A hard parse includes the following steps – Optimizer and Row Source Generation.
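The hard parse / soft parse distinction can be sketched as a cache lookup. This is a toy model under stated assumptions, not how any particular database implements its shared pool: the class `SharedPool`, its counters, and the placeholder `_optimize` method are all hypothetical names.

```python
import hashlib


class SharedPool:
    """Toy plan cache keyed by a hash of the SQL text (hypothetical design)."""

    def __init__(self):
        self._plans = {}
        self.hard_parses = 0
        self.soft_parses = 0

    def parse(self, sql):
        key = hashlib.sha256(sql.encode("utf-8")).hexdigest()
        plan = self._plans.get(key)
        if plan is not None:
            # Soft parse: the hash code is already in the pool, reuse the plan.
            self.soft_parses += 1
            return plan
        # Hard parse: run the extra steps and remember the result.
        self.hard_parses += 1
        plan = self._optimize(sql)
        self._plans[key] = plan
        return plan

    def _optimize(self, sql):
        # Placeholder for the optimizer and row source generation steps.
        return f"plan for: {sql}"


pool = SharedPool()
pool.parse("SELECT * FROM employee")    # new statement: hard parse
pool.parse("SELECT * FROM employee")    # same text again: soft parse
pool.parse("SELECT * FROM department")  # different text: hard parse
print(pool.hard_parses, pool.soft_parses)
```

Because the key is a hash of the exact text, two statements that differ only in spacing or letter case count as different statements in this sketch.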
Step-2:
Optimizer: During the optimization stage, the database must perform a hard parse at least once for every unique DML statement, and it performs the optimization during this parse. The database never optimizes DDL unless it includes a DML component, such as a subquery, that requires optimization.
Optimization is a process in which multiple execution plans for a query are examined and the most efficient plan is selected for execution.
The database catalog stores the execution plans, and the optimizer passes the lowest-cost plan on for execution.
Row Source Generation –
The row source generator is software that receives the optimal execution plan from the optimizer and produces an iterative execution plan that is usable by the rest of the database. The iterative plan is a binary program that, when executed by the SQL engine, produces the result set.
Step-3:
Execution Engine: Finally, it runs the query and displays the required result.
[Figure: Query in a high-level language → scanning, parsing, and validating → intermediate form of query → query optimizer → execution plan → query code generator → code to execute the query]
Example of query process
SELECT * FROM student_details WHERE name = 'Paul'
• Parse query and translate
  • check syntax, verify names, etc.
  • translate into relational algebra (RDBMS)
  • create evaluation plans
• Find best plan (optimization)
• Execute plan
student_details
cid       name
00112233  Paul
00112238  Rob
00112235  Matt

takes
cid       courseid
00112233  312
00112233  395
00112235  312

course
courseid  coursename
312       Advanced DBs
395       Machine Learning
Relational Algebra
Query to retrieve the information of a particular student, name = Paul:
SELECT * FROM student_details WHERE name = 'Paul'
σ name='Paul' (student_details)
π name (σ cid=001122359 (student_details))
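The two relational algebra operators used here, selection (σ) and projection (π), can be sketched over in-memory rows. This is a minimal illustration, assuming the student_details rows from the example tables; the function names `select` and `project` are hypothetical, and the cid value 00112235 is taken from the table for illustration.

```python
# Rows of the student_details example table.
student_details = [
    {"cid": "00112233", "name": "Paul"},
    {"cid": "00112238", "name": "Rob"},
    {"cid": "00112235", "name": "Matt"},
]


def select(rows, predicate):
    """Selection (sigma): keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    """Projection (pi): keep only the named columns.

    Note: this sketch does not remove duplicate rows, which a strict
    relational projection would do.
    """
    return [{col: row[col] for col in columns} for row in rows]


# sigma name='Paul' (student_details)
paul = select(student_details, lambda row: row["name"] == "Paul")

# pi name (sigma cid='00112235' (student_details))
names = project(
    select(student_details, lambda row: row["cid"] == "00112235"),
    ["name"],
)

print(paul)
print(names)
```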
Evaluation plans
An evaluation plan specifies:
• which access path to follow
• which algorithm to use to evaluate each operator
• how operators interleave
Optimization:
• estimate the cost of each candidate plan (not every possible plan is considered)
• select the plan with the lowest estimated cost
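[Figure: two example evaluation plans. Plan 1: σ name=Paul applied to student_details. Plan 2: π name over σ coursename=Advanced DBs, applied to student_details joined with takes on cid (hash join) and then with course on courseid (index-nested loop).]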
Estimating Cost
What needs to be considered
• Disk I/Os
  • sequential
  • random
• CPU time
• Network communication
What are we going to consider
• Disk I/Os
  • page reads/writes
• Ignoring cost of writing final output
Query Optimization
Need for Query Optimization
1. Improved Performance:
Reduced Execution Time: Optimized queries often result in reduced execution times, leading to faster response times for users and applications.
Resource Utilization: Optimized queries use system resources more efficiently, preventing unnecessary strain on the database server.
2. Resource Conservation:
CPU and Memory Usage: Efficiently optimized queries consume fewer CPU and memory resources, allowing the system to handle a larger number of concurrent queries without degradation in performance.
Disk I/O: Optimized queries minimize disk I/O operations, reducing the load on storage devices and improving overall system throughput.
3. Scalability:
Support for Growth: As the volume of data and user queries increases, well-optimized queries ensure that the database system can scale effectively without a proportional decrease in performance.
Adaptability: Query optimization allows the DBMS to adapt to changes in data distribution, size, and query patterns.
4. Cost Reduction:
Hardware Costs: By optimizing queries and utilizing system resources efficiently, organizations can avoid the need for constant hardware upgrades to meet growing demands.
Operational Costs: Faster query execution and reduced resource consumption contribute to lower operational costs, especially in large-scale enterprise environments.
Rule-based Optimization
Overview: Rule-based optimization involves static optimization using predefined rules set by the database administrator or system designer.
How It Works:
• The DBMS relies on a set of predefined rules and heuristics to choose an execution plan for a given query.
• These rules are typically based on the structure of the query, the available indexes, and historical performance data.
Advantages:
• Simplicity: Rule-based optimization is straightforward to implement and understand.
• Predictability: The query optimizer follows a set of fixed rules, providing predictable results.
Limitations:
• Lack of Adaptability: Rule-based optimization may struggle to adapt to changing data distributions or dynamic workload patterns.
• Limited Complexity: In complex scenarios, rule-based systems may not handle optimization challenges.
Cost-based Optimization
Overview: Cost-based optimization is a dynamic optimization approach based on estimating the cost of different execution plans and choosing the most efficient one.
How It Works:
• The DBMS analyzes multiple execution plans for a query and estimates the cost associated with each plan.
• The cost includes factors such as disk I/O, CPU usage, and memory consumption.
• The optimizer selects the execution plan with the lowest estimated cost.
Advantages:
• Adaptability: Cost-based optimization adapts to changes in data distribution, statistics, and system resources.
Limitations:
• Complexity: Implementing a cost-based optimizer requires sophisticated algorithms and statistical modeling.
Query optimization is used to access and modify the database in the most efficient way possible. It is the art of obtaining necessary information in a predictable, reliable, and timely manner. Query optimization is formally described as the process of transforming a query into an equivalent form that may be evaluated more efficiently. The goal of query optimization is to find an execution plan that reduces the time required to process a query. We must complete two major tasks to attain this optimization target.
The first is to determine the optimal plan to access the database, and the second is to reduce the time required to execute the query plan.
Process of Query Optimization
Query parsing determines, for a given query, the different ways in which the query can run. The parsed query is then delivered to the query optimizer, which generates various execution plans for it, analyzes them, and selects the plan with the lowest estimated cost. The catalog manager assists the optimizer in selecting the optimum plan by helping to compute the cost of each plan.
Purpose of the Query Optimizer in DBMS
The optimizer tries to come up with the best execution plan possible for a SQL statement.
Among all the candidate plans reviewed, the optimizer chooses the plan with the lowest cost. The optimizer computes costs based on available facts. The cost computation takes into account query execution factors such as I/O, CPU, and communication for a certain query in a given context.
Sr No.  Class  Name    Role
01      10     Shreya  CR
02      10     Ritik
For example, consider a query that requests information about students who are in leadership roles, such as being a class representative (CR). If the optimizer statistics show that 50% of students are in positions of leadership, the optimizer may decide that a full table scan is the most efficient. However, if the statistics show that just a small number of students are in positions of leadership, reading an index followed by table access by row id may be more efficient than a full table scan.
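The trade-off in this example can be made concrete with a deliberately simplified cost model. Everything in the sketch is an assumption for illustration: the function name `choose_access_path`, the default index height of 3, and the rule that every matching row costs one page read through the index. A real optimizer uses a far more detailed model.

```python
def choose_access_path(total_rows, rows_per_page, selectivity, index_height=3):
    """Compare a full table scan with index access using page reads as cost.

    Assumed model: a full scan reads every table page once; index access
    walks the index (index_height reads) and then reads one page per
    matching row.
    """
    table_pages = -(-total_rows // rows_per_page)  # ceiling division
    full_scan_cost = table_pages
    matching_rows = round(total_rows * selectivity)
    index_cost = index_height + matching_rows
    path = "index" if index_cost < full_scan_cost else "full table scan"
    return path, full_scan_cost, index_cost


# 10,000 students stored 100 per page, so a full scan reads 100 pages.
half_are_leaders = choose_access_path(10_000, 100, selectivity=0.5)
few_are_leaders = choose_access_path(10_000, 100, selectivity=0.001)

print(half_are_leaders)
print(few_are_leaders)
```

With half of the rows matching, following the index touches far more pages than simply scanning the table; with only a handful of matches, the index wins.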
Because the database has so many internal statistics and tools at its disposal, the optimizer is frequently in a better position than the user to decide the best way to execute a statement. As a result, the optimizer is used by all SQL statements.
Optimizer Components
The optimizer is made up of three parts: the transformer, the estimator, and the plan generator. The figure below depicts those components.
Query Transformer: For some statements, the query transformer determines whether it is advantageous to rewrite the original SQL statement into a semantically equivalent SQL statement with a lower cost.
When a plausible alternative exists, the database compares the costs of each alternative and chooses the one with the lowest cost. One example of such a transformation is rewriting an OR-based input query into a UNION ALL-based output query.
[Figure: The given query is transformed using the query transformer]
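Since the slide's figure is not available, the following sketch shows what an OR to UNION ALL rewrite can look like and checks that both forms return the same rows. It uses Python's built-in `sqlite3` module; the `emp` table, its rows, and both queries are hypothetical and are not the query from the original slide. The second branch excludes rows already returned by the first branch, including the NULL case, so that UNION ALL does not produce duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, dept TEXT, job TEXT)")
conn.executemany(
    "INSERT INTO emp (id, dept, job) VALUES (?, ?, ?)",
    [
        (1, "IT", "SDE"),
        (2, "IT", "QA"),
        (3, "HR", "SDE"),
        (4, "HR", "QA"),
        (5, None, "SDE"),
    ],
)

# Original form: a single query with an OR predicate.
or_query = "SELECT id FROM emp WHERE dept = 'IT' OR job = 'SDE'"

# Rewritten form: one branch per predicate, combined with UNION ALL.
union_all_query = (
    "SELECT id FROM emp WHERE dept = 'IT' "
    "UNION ALL "
    "SELECT id FROM emp WHERE job = 'SDE' AND (dept <> 'IT' OR dept IS NULL)"
)

or_ids = sorted(row[0] for row in conn.execute(or_query))
union_ids = sorted(row[0] for row in conn.execute(union_all_query))
print(or_ids, union_ids)
```

The appeal of the rewritten form is that each branch has a single simple predicate, which may let the database use a separate index for each branch.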
Estimator:
The estimator is the optimizer component that calculates the total cost of a given execution plan.
To determine the cost, the estimator uses three different measures:
Selectivity: The fraction of rows in the row set that the query selects, with 0 indicating no rows and 1 indicating all rows. The row set can be a base table, a view, or the result of a join. Selectivity is tied to a query predicate, such as WHERE last_name LIKE 'X%' or last_name = 'Prakash', or to a combination of predicates, such as last_name = 'Prakash' AND job_id = 'SDE'. As the selectivity value approaches zero, a predicate becomes more selective, and as the value nears one, it becomes less selective (or more unselective).
Cost: This measure represents the units of work or resources used. The query optimizer uses disk I/O, CPU usage, and memory usage as units of work. Because the cost is an estimate, a lower cost does not guarantee a faster run: if the plan for query A has a lower cost than the plan for query B, A may still execute faster than B, slower than B, or in the same amount of time as B.
Cardinality: The cardinality is the number of rows returned by each operation in an execution plan. This input is shared by all cost functions and is essential for determining the best plan. The optimizer can derive cardinality from table statistics collected by DBMS_STATS, or estimate it after taking into account the impact of predicates (filter, join, and so on), DISTINCT or GROUP BY operations, and so on. In an execution plan, the Rows column displays the estimated cardinality.
For example, if the optimizer estimates that a full table scan will yield 100 rows, then the cardinality estimate for this operation is 100.
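Selectivity and cardinality are related: an estimated cardinality is the selectivity multiplied by the number of input rows. The sketch below computes both by counting rows directly, which is an assumption made for clarity; a real optimizer estimates them from stored statistics instead of scanning the data. The `employees` rows and the function name `selectivity` are hypothetical.

```python
employees = [
    {"last_name": "Prakash", "job_id": "SDE"},
    {"last_name": "Prakash", "job_id": "QA"},
    {"last_name": "Sharma", "job_id": "SDE"},
    {"last_name": "Rao", "job_id": "SDE"},
]


def selectivity(rows, predicate):
    """Fraction of rows that satisfy the predicate (0 = none, 1 = all)."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if predicate(row)) / len(rows)


# Single predicate: last_name = 'Prakash'
single = selectivity(employees, lambda r: r["last_name"] == "Prakash")

# Combined predicates: last_name = 'Prakash' AND job_id = 'SDE'
combined = selectivity(
    employees,
    lambda r: r["last_name"] == "Prakash" and r["job_id"] == "SDE",
)

# Estimated cardinality = selectivity x number of rows in the row set.
estimated_rows = combined * len(employees)

print(single, combined, estimated_rows)
```

The combined predicate has the smaller value, so it is the more selective of the two.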
The plan generator investigates multiple plans for a query block by experimenting with various access paths, join methods, and join orders.
Because of the different combinations that the database can utilize to generate the same outcome, many plans are available. The plan with the lowest cost is chosen by the optimizer.
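Trying different join orders and keeping the cheapest can be sketched as a small exhaustive search. All numbers here are assumptions: the row counts, the join selectivities, and the toy cost model (the sum of estimated intermediate result sizes for a left-deep plan) are hypothetical and much simpler than a production optimizer.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical row counts for the three example tables.
rows = {"student_details": 1000, "takes": 5000, "course": 50}

# Hypothetical selectivity of each join predicate. A pair of tables with
# no entry has no join predicate, so joining them is a cross product.
join_selectivity = {
    frozenset(["student_details", "takes"]): Fraction(1, 1000),
    frozenset(["takes", "course"]): Fraction(1, 50),
}


def plan_cost(order):
    """Toy cost: sum of the estimated sizes of all intermediate results."""
    joined = [order[0]]
    size = Fraction(rows[order[0]])
    cost = Fraction(0)
    for table in order[1:]:
        sel = Fraction(1)
        for other in joined:
            sel *= join_selectivity.get(frozenset([other, table]), Fraction(1))
        size = size * rows[table] * sel
        cost += size
        joined.append(table)
    return cost


# Enumerate every join order and keep the one with the lowest cost.
best_order = min(permutations(rows), key=plan_cost)
cross_product_first = ("student_details", "course", "takes")

print(best_order, plan_cost(best_order))
print(cross_product_first, plan_cost(cross_product_first))
```

Orders that start by joining student_details with course build a large cross product first, so they cost more under this model. Several orders tie for the lowest cost here, and `min` simply returns the first one it encounters.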
Depending on how it is invoked, the optimizer performs different actions.
The database offers the following optimization types:
Normal Optimization: The optimizer parses the SQL and produces an execution plan. For most SQL statements, the normal mode gives a reasonable plan. When operating in normal mode, the optimizer has stringent time limits, usually a fraction of a second, during which it must identify an optimal plan.
SQL Tuning Advisor optimization: The optimizer is known as the Automatic Tuning Optimizer when SQL Tuning Advisor invokes it, taking one or more SQL statements as input. In this situation, the optimizer conducts further analysis to improve the plan generated in normal mode. The optimizer produces a set of actions, along with their rationale and expected benefit, for producing a considerably better plan.
Automatic Tuning Optimizer
Plan Generator
Query Optimization Techniques with Examples
Indexing:
Objective: Accelerate data retrieval by creating indexes on columns used in WHERE clauses.
Example
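The slide's original example is not available, so here is a minimal sketch of the idea using Python's built-in `sqlite3` module (an assumption; the slides do not name a specific database). It asks SQLite for its query plan before and after creating an index on the column used in the WHERE clause. The index name `idx_student_name` and the helper `plan_details` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student_details (cid TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO student_details (cid, name) VALUES (?, ?)",
    [("00112233", "Paul"), ("00112238", "Rob"), ("00112235", "Matt")],
)


def plan_details(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    plan_rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in plan_rows)


query = "SELECT * FROM student_details WHERE name = 'Paul'"

# Without an index, the only option is to scan the whole table.
plan_before = plan_details(query)

conn.execute("CREATE INDEX idx_student_name ON student_details (name)")

# With the index in place, the plan can search it instead.
plan_after = plan_details(query)

print(plan_before)
print(plan_after)
```

The exact wording of the plan text differs between SQLite versions, so treat the printed lines as illustrative.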
Query Rewriting:
Objective: Restructure queries to provide the same result with improved efficiency.
Example
Join Optimization:
Objective: Optimize JOIN operations to minimize computational overhead.
Example:
Subquery Optimization:
Objective: Optimize subqueries to improve their efficiency.
Example:
Query Caching:
Objective: Store and reuse the results of frequently executed queries.
Example:
Parallel Processing:
Objective: Execute multiple parts of a query simultaneously to improve overall performance.
Example:
Outline of Database Security
❑ Overview of Database Security
❑ What is Database Security
❑ Why Database Security is Needed
❑ Concepts of Database Security
❑ Security Problems
❑ Security Controls
Overview
In today’s world, we need everything secured, whether it is a mobile phone, a computer, a vehicle, or almost anything else.
What is database security?
Database: It is a collection of information stored in a computer.
Security: It is being free from danger.
Database Security: It is the mechanisms that protect the database against intentional or accidental threats.
Definition of Database Security
Database Security is defined as the process by which “Confidentiality, Integrity and Availability” of the database can be protected.
[Figure: Database Security comprises Authentication; Authorization and Access Control; Data Encryption; Data Privacy Protection; Data Integrity Verification; Auditing and Logging]
Concepts of Database Security
There are 3 main aspects:
Secrecy or Confidentiality
Integrity
Availability
SECRECY
▪ It is protecting the database from unauthorized users.
▪ Ensures that users are allowed to do the things they are trying to do.
▪ Encryption is a technique or process by which data is encoded in such a way that only authorized users are able to read it.
INTEGRITY
▪ Protecting the database from authorized users.
▪ Ensures that what users are trying to do is correct.
▪ For example,
▪ An employee should be able to modify his or her own information.
AVAILABILITY
The database must not have unplanned downtime.
To ensure this, the following steps should be taken:
Restrict the amount of storage space given to each user in the database.
Limit the number of concurrent sessions made available to each database user.
Back up the data at periodic intervals to ensure data recovery in case of application issues.
SECURITY PROBLEMS
A threat is any circumstance or event with the potential to adversely impact an information system (IS) through unauthorized access, destruction, disclosure, modification of data, and/or denial of service.
There are two kinds of threat.
• Non-fraudulent Threat
• Fraudulent Threat
Non-fraudulent Threat
▪ Natural or accidental disasters.
▪ Errors or bugs in hardware or software.
▪ Human errors.
Fraudulent Threat
Authorized users
Those who abuse their privileges and authority.
Hostile agents
Those improper users (outsiders or insiders) who attack the software and/or hardware system, or read or write data in a database.
Database Protection Requirements
1. Protection from Improper Access
2. Protection from Inference
3. Integrity of the Database
4. User Authentication
5. Multilevel Protection
6. Confinement
7. Management and Protection of Sensitive Data
Security Controls
Authorization – privileges, views.
Encryption – public key / private key, secure sockets.
Authentication – passwords.
Logical – firewalls, net proxies.
A FIREWALL is dedicated software on another computer that inspects network traffic passing through it and denies or permits passage based on a set of rules. Basically, it is a piece of software that monitors all traffic that goes from your system to another via the Internet or a network, and vice versa.
Diagram Representation
Encryption and Decryption
Advantages and Disadvantages of Encryption
Advantages
Privacy Protection: Encryption safeguards sensitive information, such as personal details, financial data, and communications, protecting user privacy.
Secure Communication: Encryption secures data during transmission, preventing unauthorized interception and eavesdropping.
Authentication and Authorization: Encryption can be used in conjunction with authentication and authorization mechanisms to enhance overall security.
Disadvantages
Performance Overhead: The process of encrypting and decrypting data introduces computational overhead, potentially impacting system performance.
Potential for Key Exposure: If encryption keys are not adequately protected, they may be vulnerable to theft, leading to unauthorized access.
Resource Consumption: Encryption can consume additional resources, such as CPU and memory, especially in resource-constrained environments.
Advantages and Disadvantages of Decryption
Advantages
Access to Encrypted Data: The primary purpose of decryption is to enable authorized users to access and read the originally encrypted data.
Data Utilization: Decryption allows for the utilization of data for various purposes such as analysis, reporting, and decision-making.
Data Recovery: In the case of data loss or system failures, having access to decryption keys allows for the recovery of encrypted data.
Disadvantages
Security Risks: Decryption, if not carefully managed, can pose security risks, especially if unauthorized parties gain access to decryption keys.
Potential for Misuse: If decryption keys fall into the wrong hands, there is a risk of data misuse, unauthorized access, and potential security breaches.
Data Exposure: Decrypting data exposes it to potential threats during the time it is in its readable form, especially if not adequately protected.
Authorization
Read authorization – allows reading, but not modification of data.
Insert authorization – allows insertion of new data, but not modification of existing data.
Update authorization – allows modification, but not deletion of data.
Delete authorization – allows deletion of data.
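The four authorization types can be modelled as independent privileges that are granted per user and checked before each operation. This is a minimal sketch, not a real DBMS mechanism: the user names, the `grants` table, and the function `is_allowed` are hypothetical.

```python
from enum import Flag, auto


class Privilege(Flag):
    """The four forms of authorization, each granted independently."""
    READ = auto()
    INSERT = auto()
    UPDATE = auto()
    DELETE = auto()


# Hypothetical users and the privileges granted to each of them.
grants = {
    "analyst": Privilege.READ,
    "clerk": Privilege.READ | Privilege.INSERT,
    "manager": (
        Privilege.READ | Privilege.INSERT | Privilege.UPDATE | Privilege.DELETE
    ),
}


def is_allowed(user, needed):
    """Return True only if the user has been granted the needed privilege."""
    granted = grants.get(user, Privilege(0))  # unknown users get nothing
    return needed in granted


print(is_allowed("analyst", Privilege.READ))    # reading is permitted
print(is_allowed("analyst", Privilege.UPDATE))  # modification is not
print(is_allowed("clerk", Privilege.INSERT))    # inserting is permitted
print(is_allowed("clerk", Privilege.DELETE))    # deletion is not
```

Keeping the privileges separate mirrors the list above: holding insert authorization, for example, does not imply update or delete authorization.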
Advanced Topics
Distributed Databases:
Overview: Distributed databases involve the storage and management of data across multiple locations or servers.
Key Aspects: Data distribution, replication, consistency, and fault tolerance.
Challenges: Network latency, data synchronization, and ensuring consistency across distributed nodes.
Data Warehousing:
Overview: Data warehousing involves the collection, integration, and storage of data from different sources for analysis and reporting.
Key Aspects: ETL (Extract, Transform, Load) processes, data marts, and multidimensional data models.
Challenges: Data integration, data quality, and designing effective data models for analysis.
Data Mining and Machine Learning:
Overview: Data mining involves discovering patterns and trends in large datasets, while machine learning uses algorithms to make predictions.
Key Aspects: Classification, clustering, regression, and predictive modeling.
Challenges: Feature selection, model interpretation, and ensuring the quality of input data.
Blockchain and Databases:
Overview: Blockchain is a distributed ledger technology that enables secure and transparent transactions.
Key Aspects: Decentralization, consensus mechanisms, and smart contracts.
Challenges: Scalability, privacy concerns, and integration with traditional databases.
Conclusion
In today's dynamic technological landscape, the understanding and implementation of these concepts are crucial for organizations aiming to harness the full potential of their data. Balancing performance, security, and innovation is key to building robust and adaptive database systems that meet the demands of modern applications and business requirements. As technology continues to evolve, staying informed about emerging trends and advanced database management techniques becomes increasingly important for professionals in the field.
Thank you

More Related Content

PPTX
Query processing and optimization on dbms
ar1289589
 
PPTX
Processes in Query Optimization in (ABMS) Advanced Database Management Systems
gamemaker762
 
PDF
unit 3 DBMS.docx.pdf geometric transformer in query processing
FallenAngel35
 
PDF
unit 3 DBMS.docx.pdf geometry in query p
FallenAngel35
 
PPTX
LECTURE_06_DATABASE PROCESSING & OPTIMAZATION.pptx
AthosBeatus
 
PPTX
DB LECTURE 5 QUERY PROCESSING.pptx
grahamoyigo19
 
PPT
Overview of query evaluation
avniS
 
PPT
Database performance tuning and query optimization
Usman Tariq
 
Query processing and optimization on dbms
ar1289589
 
Processes in Query Optimization in (ABMS) Advanced Database Management Systems
gamemaker762
 
unit 3 DBMS.docx.pdf geometric transformer in query processing
FallenAngel35
 
unit 3 DBMS.docx.pdf geometry in query p
FallenAngel35
 
LECTURE_06_DATABASE PROCESSING & OPTIMAZATION.pptx
AthosBeatus
 
DB LECTURE 5 QUERY PROCESSING.pptx
grahamoyigo19
 
Overview of query evaluation
avniS
 
Database performance tuning and query optimization
Usman Tariq
 

Similar to dd presentation.pdf (20)

PPTX
Query optimization
Pooja Dixit
 
PPTX
Oracle performance tuning for java developers
Saeed Shahsavan
 
DOCX
Debate Initial Post and Response Rubric Student Name .docx
simonithomas47935
 
PDF
Issues in Query Processing and Optimization
Editor IJMTER
 
PPTX
Query processing and optimization (updated)
Ravinder Kamboj
 
PPTX
Cost-Based-Query-Optimization-in-DBMS.pptx
maddishiva1989
 
PDF
Hailey_Database_Performance_Made_Easy_through_Graphics.pdf
cookie1969
 
PDF
Managing Statistics for Optimal Query Performance
Karen Morton
 
PPTX
Concepts of Query Processing in ADBMS.pptx
AaradhyaDixit6
 
PPT
Coronel_PPT_Ch11.ppt
abdulbasetalselwi
 
PPTX
Query Execution Time and Query Optimization.
Radhe Krishna Rajan
 
PPTX
Adbms 37 query optimisation
Vaibhav Khanna
 
PPTX
Optimising Queries - Series 1 Query Optimiser Architecture
DAGEOP LTD
 
PPT
PASS Summit 2010 Keynote David DeWitt
GraySystemsLab
 
PPTX
Sql and PL/SQL Best Practices I
Carlos Oliveira
 
PPTX
PresentaciÃŗn Oracle Database MigraciÃŗn consideraciones 10g/11g/12c
Ronald Francisco Vargas Quesada
 
PPTX
Mc seminar
Ankit Anand
 
PPTX
Ch-2-Query-Process.pptx advanced database
tasheebedane
 
PPTX
700442110-advanced database Ch-2-Query-Process.pptx
tasheebedane
 
PDF
Chapter 2.pdf WND FWKJFW KSD;KFLWHFB ASNK
alemunuruhak9
 
Query optimization
Pooja Dixit
 
Oracle performance tuning for java developers
Saeed Shahsavan
 
Debate Initial Post and Response Rubric Student Name .docx
simonithomas47935
 
Issues in Query Processing and Optimization
Editor IJMTER
 
Query processing and optimization (updated)
Ravinder Kamboj
 
Cost-Based-Query-Optimization-in-DBMS.pptx
maddishiva1989
 
Hailey_Database_Performance_Made_Easy_through_Graphics.pdf
cookie1969
 
Managing Statistics for Optimal Query Performance
Karen Morton
 
Concepts of Query Processing in ADBMS.pptx
AaradhyaDixit6
 
Coronel_PPT_Ch11.ppt
abdulbasetalselwi
 
Query Execution Time and Query Optimization.
Radhe Krishna Rajan
 
Adbms 37 query optimisation
Vaibhav Khanna
 
Optimising Queries - Series 1 Query Optimiser Architecture
DAGEOP LTD
 
PASS Summit 2010 Keynote David DeWitt
GraySystemsLab
 
Sql and PL/SQL Best Practices I
Carlos Oliveira
 
PresentaciÃŗn Oracle Database MigraciÃŗn consideraciones 10g/11g/12c
Ronald Francisco Vargas Quesada
 
Mc seminar
Ankit Anand
 
Ch-2-Query-Process.pptx advanced database
tasheebedane
 
700442110-advanced database Ch-2-Query-Process.pptx
tasheebedane
 
Chapter 2.pdf WND FWKJFW KSD;KFLWHFB ASNK
alemunuruhak9
 
Ad

More from AnSHiKa187943 (17)

PDF
databasesecurit-phpapp01.pdf
AnSHiKa187943
 
PDF
wepik-enhancing-visual-data-exploring-arithmetic-and-logic-operations-in-imag...
AnSHiKa187943
 
PDF
11100121024_Asif Ikbal_CA1_OS...pdf
AnSHiKa187943
 
PDF
CA-1_14400121029_HSMC_301.pdf
AnSHiKa187943
 
PDF
14400121029_Anshika Das_Software Engineering.pdf
AnSHiKa187943
 
PPTX
time-value-of-money.pptx
AnSHiKa187943
 
PPTX
divisionoflabour-170216153446-2.pptx
AnSHiKa187943
 
PPTX
pumping-lemma-181011153118.pptx
AnSHiKa187943
 
PPT
LECT9.ppt
AnSHiKa187943
 
PPTX
_86c448dfa47cdab170075f16cd25c650_PeerReviewforUpload.pptx
AnSHiKa187943
 
PPTX
sorting-160810203705.pptx
AnSHiKa187943
 
PPTX
ChodonKumar.pptx
AnSHiKa187943
 
PPTX
SubhamDas.pptx
AnSHiKa187943
 
PPTX
26-170918023441 (1).pptx
AnSHiKa187943
 
PPTX
BhamD.pptx
AnSHiKa187943
 
PPTX
Anshika 1111.pptx
AnSHiKa187943
 
PPTX
9pL7F2E8XI0alXYT280.pptx
AnSHiKa187943
 
databasesecurit-phpapp01.pdf
AnSHiKa187943
 
wepik-enhancing-visual-data-exploring-arithmetic-and-logic-operations-in-imag...
AnSHiKa187943
 
11100121024_Asif Ikbal_CA1_OS...pdf
AnSHiKa187943
 
CA-1_14400121029_HSMC_301.pdf
AnSHiKa187943
 
14400121029_Anshika Das_Software Engineering.pdf
AnSHiKa187943
 
time-value-of-money.pptx
AnSHiKa187943
 
divisionoflabour-170216153446-2.pptx
AnSHiKa187943
 
pumping-lemma-181011153118.pptx
AnSHiKa187943
 
LECT9.ppt
AnSHiKa187943
 
_86c448dfa47cdab170075f16cd25c650_PeerReviewforUpload.pptx
AnSHiKa187943
 
sorting-160810203705.pptx
AnSHiKa187943
 
ChodonKumar.pptx
AnSHiKa187943
 
SubhamDas.pptx
AnSHiKa187943
 
26-170918023441 (1).pptx
AnSHiKa187943
 
BhamD.pptx
AnSHiKa187943
 
Anshika 1111.pptx
AnSHiKa187943
 
9pL7F2E8XI0alXYT280.pptx
AnSHiKa187943
 
Ad

Recently uploaded (20)

PPTX
Victory Precisions_Supplier Profile.pptx
victoryprecisions199
 
PPTX
Information Retrieval and Extraction - Module 7
premSankar19
 
PPTX
Civil Engineering Practices_BY Sh.JP Mishra 23.09.pptx
bineetmishra1990
 
PPTX
sunil mishra pptmmmmmmmmmmmmmmmmmmmmmmmmm
singhamit111
 
PPTX
Inventory management chapter in automation and robotics.
atisht0104
 
PDF
Biodegradable Plastics: Innovations and Market Potential (www.kiu.ac.ug)
publication11
 
PDF
Zero Carbon Building Performance standard
BassemOsman1
 
PDF
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
PPTX
quantum computing transition from classical mechanics.pptx
gvlbcy
 
PPT
1. SYSTEMS, ROLES, AND DEVELOPMENT METHODOLOGIES.ppt
zilow058
 
PDF
Advanced LangChain & RAG: Building a Financial AI Assistant with Real-Time Data
Soufiane Sejjari
 
PDF
Unit I Part II.pdf : Security Fundamentals
Dr. Madhuri Jawale
 
PDF
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
PDF
Natural_Language_processing_Unit_I_notes.pdf
sanguleumeshit
 
PDF
Chad Ayach - A Versatile Aerospace Professional
Chad Ayach
 
PDF
AI-Driven IoT-Enabled UAV Inspection Framework for Predictive Maintenance and...
ijcncjournal019
 
PDF
STUDY OF NOVEL CHANNEL MATERIALS USING III-V COMPOUNDS WITH VARIOUS GATE DIEL...
ijoejnl
 
PDF
The Effect of Artifact Removal from EEG Signals on the Detection of Epileptic...
Partho Prosad
 
PPTX
22PCOAM21 Session 2 Understanding Data Source.pptx
Guru Nanak Technical Institutions
 
PDF
Construction of a Thermal Vacuum Chamber for Environment Test of Triple CubeS...
2208441
 
Victory Precisions_Supplier Profile.pptx
victoryprecisions199
 
Information Retrieval and Extraction - Module 7
premSankar19
 
Civil Engineering Practices_BY Sh.JP Mishra 23.09.pptx
bineetmishra1990
 
sunil mishra pptmmmmmmmmmmmmmmmmmmmmmmmmm
singhamit111
 
Inventory management chapter in automation and robotics.
atisht0104
 
Biodegradable Plastics: Innovations and Market Potential (www.kiu.ac.ug)
publication11
 
Zero Carbon Building Performance standard
BassemOsman1
 
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
quantum computing transition from classical mechanics.pptx
gvlbcy
 
1. SYSTEMS, ROLES, AND DEVELOPMENT METHODOLOGIES.ppt
zilow058
 
Advanced LangChain & RAG: Building a Financial AI Assistant with Real-Time Data
Soufiane Sejjari
 
Unit I Part II.pdf : Security Fundamentals
Dr. Madhuri Jawale
 
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
Natural_Language_processing_Unit_I_notes.pdf
sanguleumeshit
 
Chad Ayach - A Versatile Aerospace Professional
Chad Ayach
 
AI-Driven IoT-Enabled UAV Inspection Framework for Predictive Maintenance and...
ijcncjournal019
 
STUDY OF NOVEL CHANNEL MATERIALS USING III-V COMPOUNDS WITH VARIOUS GATE DIEL...
ijoejnl
 
The Effect of Artifact Removal from EEG Signals on the Detection of Epileptic...
Partho Prosad
 
22PCOAM21 Session 2 Understanding Data Source.pptx
Guru Nanak Technical Institutions
 
Construction of a Thermal Vacuum Chamber for Environment Test of Triple CubeS...
2208441
 

dd presentation.pdf

  • 1. Name- Anshika Das Roll-14400121029 Subcode- PCC-CS601 Sub-DBMS Department-CSE
  • 2. Agenda 1. Introduction â€ĸ Defining the Role of Databases â€ĸ Overview of Key Components 2. Query Processing and Optimization â€ĸ Understanding the Query Lifecycle â€ĸ Techniques for Optimizing Queries 2 3. Database Security â€ĸ Importance of Secure Database Management â€ĸ Key Security Measures 4. Advanced Topics â€ĸ In-Memory Databases, NoSQL, Big Data, ML, Blockchain
  • 3. Importance of Database Management â€ĸ Central role in storing, retrieving, and managing data. â€ĸ Foundation for various applications and systems. Key Components: Query Processing: Transformation of user queries into executable actions. Optimization: Enhancing query performance through efficient algorithms. Security: Safeguarding data integrity, confidentiality, and availability. 3
  • 4. 4 â—ŧ Query processing: â—ŧ Is the list of activities that are perform to obtain the required tuples that satisfy a given query. â—ŧ Query optimization: â—ŧ The process of choosing a suitable execution strategy for processing a query. Introduction to Query Processing and Optimization Two internal representations of a query: â€ĸ Query Tree â€ĸ Query Graph
  • 5. query parser and translator relational algebra expression optimizer evaluation plan evaluation engine output data data data statistics Query Processing
  • 6. Process of query processing. Query processing is done in the following steps. Step 1: Parser. During the parse call, the database performs three checks (syntax check, semantic check, and shared pool check) and translates the query into relational algebra (refer to the detailed diagram). 1. Syntax check: determines whether the SQL is syntactically valid. Example: SELECT * FORM employee. Here the misspelling of FROM is reported by this check.
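The syntax check can be tried out with a small sketch. It uses SQLite (through Python's standard `sqlite3` module) as a stand-in for the database on the slide; the `employee` table layout and the helper name `parse_error` are assumptions made only for this illustration.

```python
import sqlite3

def parse_error(sql):
    """Return None if SQLite accepts the statement, else the error message it reports."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table so that the correctly spelled query has something to read.
    conn.execute("CREATE TABLE employee (id INTEGER, name TEXT)")
    try:
        conn.execute(sql)
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        conn.close()

misspelled = parse_error("SELECT * FORM employee")  # FROM is misspelled
correct = parse_error("SELECT * FROM employee")

print("misspelled:", misspelled)
print("correct:", correct)
```

The misspelled statement is rejected before any data is read, which is the behaviour the syntax check describes.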
  • 7. 2. Semantic check: determines whether the statement is meaningful. Example: a query that names a table which does not exist is caught by this check. 3. Shared pool check: every query is assigned a hash code. This check looks for that hash code in the shared pool; if it is already there, the database skips the additional optimization steps and goes straight to execution. Hard parse and soft parse: if the query is new and its hash code is not in the shared pool, the query must go through the additional steps known as hard parsing. If the hash code exists, the query skips those steps and passes directly to the execution engine (refer to the detailed diagram); this is known as soft parsing. A hard parse includes the following steps: optimization and row source generation. Step 2: Optimizer. The database must perform a hard parse at least once for every unique DML statement, and it performs optimization during this parse. The database does not optimize DDL unless it includes a DML component, such as a subquery, that requires optimization.
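The hard parse / soft parse distinction can be sketched with a toy cache keyed by a hash of the SQL text. This is only an illustration of the idea: the class `PlanCache`, its counters, and the placeholder `_optimize` step are hypothetical and do not reflect how any particular database implements its shared pool.

```python
import hashlib

class PlanCache:
    """Toy stand-in for a shared pool: maps a hash of the SQL text to a cached plan."""

    def __init__(self):
        self._plans = {}
        self.hard_parses = 0
        self.soft_parses = 0

    def parse(self, sql):
        key = hashlib.sha256(sql.encode("utf-8")).hexdigest()
        if key in self._plans:
            # Soft parse: the hash code is already known, so reuse the stored plan.
            self.soft_parses += 1
            return self._plans[key]
        # Hard parse: run the (placeholder) optimizer and remember the result.
        self.hard_parses += 1
        plan = self._optimize(sql)
        self._plans[key] = plan
        return plan

    @staticmethod
    def _optimize(sql):
        # Placeholder for optimization and row source generation.
        return "PLAN<" + sql + ">"

cache = PlanCache()
first = cache.parse("SELECT * FROM employee WHERE id = 7")
second = cache.parse("SELECT * FROM employee WHERE id = 7")  # same text: soft parse
third = cache.parse("SELECT * FROM employee WHERE id = 8")   # new text: hard parse

print(cache.hard_parses, cache.soft_parses)
```

In this sketch the third statement differs only in its literal value, yet it hashes differently and is hard parsed again.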
  • 8. Optimization is a process in which multiple execution plans that satisfy a query are examined and the most efficient plan is selected for execution. The database catalog stores the execution plans, and the optimizer passes the lowest-cost plan on for execution. Row source generation: the row source generator is software that receives the optimal execution plan from the optimizer and produces an iterative execution plan that is usable by the rest of the database. The iterative plan is a binary program that, when executed by the SQL engine, produces the result set. Step 3: Execution engine. Finally, it runs the query and displays the required result. [Figure: query in a high-level language -> scanning, parsing, and validating -> intermediate form of query -> query optimizer -> execution plan -> query code generator -> code to execute the query]
  • 9. Example of query processing: SELECT * FROM student_details WHERE name = 'Paul'. Steps: (1) parse and translate the query: check syntax, verify names, etc., and translate it into relational algebra (RDBMS); (2) create evaluation plans; (3) find the best plan (optimization); (4) execute the plan. Example tables: student_details(cid, name): (00112233, Paul), (00112238, Rob), (00112235, Matt). takes(cid, courseid): (00112233, 312), (00112233, 395), (00112235, 312). course(courseid, coursename): (312, Advanced DBs), (395, Machine Learning).
  • 10. Relational Algebra (same student_details, takes, and course tables as on the previous slide). Query to retrieve the information of a particular student, name = Paul: SELECT * FROM student_details WHERE name = 'Paul', which corresponds to σ_{name='Paul'}(student_details). A selection can also be combined with a projection: π_{name}(σ_{=001122359}(student_details)).
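The selection (σ) and projection (π) operators can be sketched over plain Python lists of dictionaries. The helper names `select` and `project` are hypothetical, and the second expression filters on cid 00112235, a value taken from the student_details table purely for illustration.

```python
# Rows of the student_details relation from the slides.
student_details = [
    {"cid": "00112233", "name": "Paul"},
    {"cid": "00112238", "name": "Rob"},
    {"cid": "00112235", "name": "Matt"},
]

def select(relation, predicate):
    """Selection (sigma): keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Projection (pi): keep only the named attributes and drop duplicate rows."""
    result = []
    for row in relation:
        projected = {attr: row[attr] for attr in attributes}
        if projected not in result:
            result.append(projected)
    return result

# sigma_{name='Paul'}(student_details)
paul = select(student_details, lambda row: row["name"] == "Paul")

# pi_{name}(sigma_{cid='00112235'}(student_details)), with an illustrative cid
names = project(select(student_details, lambda row: row["cid"] == "00112235"), ["name"])

print(paul)
print(names)
```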
  • 11. Evaluation plans. An evaluation plan specifies which access path to follow, which algorithm to use to evaluate each operator, and how the operators interleave. Optimization: estimate the cost of each candidate plan (not all plans) and select the plan with the lowest estimated cost. [Figure: example plan trees: σ_{name='Paul'} over student_details; and π_{name} over σ_{coursename='Advanced DBs'} over student_details joined with takes on cid (hash join) and with course on courseid (index nested-loop join)]
  • 12. Estimating Cost. What needs to be considered: disk I/Os (sequential and random), CPU time, and network communication. What we are going to consider: disk I/Os (page reads/writes), ignoring the cost of writing the final output.
  • 13. Query Optimization: Need for Query Optimization. 1. Improved Performance. Reduced execution time: optimized queries often result in reduced execution times, leading to faster response times for users and applications. Resource utilization: optimized queries use system resources more efficiently, preventing unnecessary strain on the database server. 2. Resource Conservation. CPU and memory usage: efficiently optimized queries consume fewer CPU and memory resources, allowing the system to handle a larger number of concurrent queries without degradation in performance. Disk I/O: optimized queries minimize disk I/O operations, reducing the load on storage devices and improving overall system throughput. 3. Scalability. Support for growth: as the volume of data and user queries increases, well-optimized queries ensure that the database system can scale effectively without a proportional decrease in performance. Adaptability: query optimization allows the DBMS to adapt to changes in data distribution, size, and query patterns. 4. Cost Reduction. Hardware costs: by optimizing queries and utilizing system resources efficiently, organizations can avoid the need for constant hardware upgrades to meet growing demands. Operational costs: faster query execution and reduced resource consumption contribute to lower operational costs, especially in large-scale enterprise environments.
  • 14. Rule-based Optimization. Overview: rule-based optimization is static optimization using predefined rules set by the database administrator or system designer. How it works: the DBMS relies on a set of predefined rules and heuristics to choose an execution plan for a given query; these rules are typically based on the structure of the query, the available indexes, and historical performance data. Advantages: simplicity (rule-based optimization is straightforward to implement and understand) and predictability (the query optimizer follows a set of fixed rules, providing predictable results). Limitations: lack of adaptability (it may struggle to adapt to changing data distributions or dynamic workload patterns) and limited complexity (in complex scenarios, rule-based systems may not handle optimization challenges). Cost-based Optimization. Overview: cost-based optimization is a dynamic approach that estimates the cost of different execution plans and chooses the most efficient one. How it works: the DBMS analyzes multiple execution plans for a query and estimates the cost associated with each plan; the cost includes factors such as disk I/O, CPU usage, and memory consumption; the optimizer selects the execution plan with the lowest estimated cost. Advantages: adaptability (it adapts to changes in data distribution, statistics, and system resources). Limitations: complexity (implementing a cost-based optimizer requires sophisticated algorithms and statistical modeling).
  • 15. Process of Query Optimization. Query optimization is used to access and modify the database in the most efficient way possible. It is the art of obtaining the necessary information in a predictable, reliable, and timely manner. Formally, query optimization is the process of transforming a query into an equivalent form that can be evaluated more efficiently. The goal is to find an execution plan that reduces the time required to process the query. Two major tasks are needed to reach this goal: the first is to determine the best plan for accessing the database, and the second is to reduce the time required to execute that plan. After parsing, the parsed query is delivered to the query optimizer, which works out the different ways in which the query can run, generates the corresponding execution plans, and selects the plan with the lowest estimated cost. The catalog manager assists the optimizer in selecting the best plan by providing the cost of each plan.
  • 16. Purpose of the Query Optimizer in DBMS. The optimizer tries to come up with the best possible execution plan for a SQL statement. Among all the candidate plans reviewed, it chooses the plan with the lowest cost. The optimizer computes costs based on the available statistics; the cost computation takes into account query execution factors such as I/O, CPU, and communication for a given query in a given context. Example table (Sr No., Class, Name, Role): (01, 10, Shreya, CR), (02, 10, Ritik). For example, suppose a query requests information about students who are in leadership roles, such as being a class representative (CR). If the optimizer statistics show that 50% of students are in leadership positions, the optimizer may decide that a full table scan is the most efficient. However, if the statistics show that only a small number of students are in leadership positions, reading an index followed by table access by row ID may be more efficient than a full table scan. Because the database has so many internal statistics and tools at its disposal, the optimizer is frequently in a better position than the user to decide the best way to execute a statement. As a result, the optimizer is used by all SQL statements.
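The full-scan-versus-index decision can be illustrated with a deliberately simple cost model that counts page reads only. Every number and formula here is an assumption made for the sketch: a full scan reads every page once, and an index lookup costs a fixed index height plus one random page read per matching row. Real optimizers use richer models and statistics.

```python
import math

def full_scan_cost(num_rows, rows_per_page):
    """Pages read when every page of the table is scanned once."""
    return math.ceil(num_rows / rows_per_page)

def index_access_cost(num_rows, selectivity, index_height=3):
    """Assumed model: walk the index, then one random page read per matching row."""
    matching_rows = round(num_rows * selectivity)
    return index_height + matching_rows

def choose_access_path(num_rows, rows_per_page, selectivity):
    """Return the cheaper access path under the assumed model, with its cost."""
    scan = full_scan_cost(num_rows, rows_per_page)
    index = index_access_cost(num_rows, selectivity)
    if scan <= index:
        return ("full table scan", scan)
    return ("index access", index)

# Hypothetical table: 10,000 students stored 100 rows per page.
many_leaders = choose_access_path(10_000, 100, 0.5)    # 50% of rows match
few_leaders = choose_access_path(10_000, 100, 0.001)   # very few rows match

print(many_leaders)
print(few_leaders)
```

Under these assumptions the unselective predicate favors the full table scan and the highly selective one favors the index.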
  • 17. Optimizer Components. The optimizer is made up of three parts: the transformer, the estimator, and the plan generator. The figure below depicts those components. Query Transformer: for some statements, the query transformer determines whether it is advantageous to rewrite the original SQL statement into a semantically equivalent SQL statement with a lower cost. When a plausible alternative exists, the database compares the costs of the alternatives and chooses the one with the lowest cost. For example, the query transformer can rewrite an OR-based input query as a UNION ALL-based output query. [Figure: the given query is transformed using the query transformer]
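The OR to UNION ALL rewrite can be checked on a small example with SQLite. The `product` table, its columns, and its rows are hypothetical. The second branch of the rewritten query excludes rows already returned by the first branch (including the NULL case), which is what keeps the two forms equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, promo_id INTEGER, category TEXT)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, 33, "toys"), (2, 33, "books"), (3, None, "books"), (4, 7, "games"), (5, 7, "books")],
)

# Original OR-based query.
or_query = "SELECT id FROM product WHERE promo_id = 33 OR category = 'books'"

# Semantically equivalent UNION ALL form: the second branch must skip rows
# that the first branch already produced, or they would appear twice.
union_all_query = """
SELECT id FROM product WHERE promo_id = 33
UNION ALL
SELECT id FROM product
WHERE category = 'books' AND (promo_id <> 33 OR promo_id IS NULL)
"""

or_ids = sorted(row[0] for row in conn.execute(or_query))
union_ids = sorted(row[0] for row in conn.execute(union_all_query))

print(or_ids)
print(union_ids)
```

Whether the rewritten form is actually cheaper depends on the available indexes, which is why the transformer compares costs before choosing.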
  • 18. Estimator: the estimator is the optimizer component that calculates the total cost of a given execution plan. To determine the cost, the estimator uses three measures. Selectivity: the fraction of rows in the row set that the query selects, with 0 meaning no rows and 1 meaning all rows. The row set can be a base table, a view, or the result of a join. Selectivity is tied to a query predicate, such as last_name = 'Prakash' or WHERE last_name LIKE 'X%', or to a combination of predicates, such as last_name = 'Prakash' AND job_id = 'SDE'. As the selectivity value approaches zero, a predicate becomes more selective, and as the value nears one, it becomes less selective (more unselective). Cost: this metric represents the units of work or resources used. The query optimizer uses disk I/O, CPU utilization, and memory usage as units of work. Because the cost is only an estimate, a lower cost does not guarantee a faster run: if the plan for query A has a lower cost than the plan for query B, A may execute faster than B, slower than B, or in the same amount of time as B. Cardinality: the number of rows returned by each operation in an execution plan. This input is shared by all cost functions and is essential for determining the best plan. Cardinality can be derived from table statistics gathered by DBMS_STATS, or estimated after taking into account the impact of predicates (filter, join, and so on), DISTINCT or GROUP BY operations, and so on. In an execution plan, the Rows column displays the estimated cardinality. For example, if the optimizer estimates that a full table scan will yield 100 rows, then the cardinality estimate for this operation is 100.
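Selectivity and cardinality can be made concrete with a few lines of arithmetic. In this sketch selectivity is measured by counting rows in a small hypothetical sample; a real estimator works from stored statistics rather than scanning the data, and the sample rows and the table size of 1,000 are assumptions.

```python
def selectivity(rows, predicate):
    """Fraction of rows satisfying the predicate: 0 means none, 1 means all."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if predicate(row)) / len(rows)

# Hypothetical sample of an employee table.
employees = [
    {"last_name": "Prakash", "job_id": "SDE"},
    {"last_name": "Prakash", "job_id": "QA"},
    {"last_name": "Rao", "job_id": "SDE"},
    {"last_name": "Sen", "job_id": "HR"},
]

# Single predicate: last_name = 'Prakash'
sel_name = selectivity(employees, lambda r: r["last_name"] == "Prakash")

# Combined predicates: last_name = 'Prakash' AND job_id = 'SDE'
sel_both = selectivity(
    employees, lambda r: r["last_name"] == "Prakash" and r["job_id"] == "SDE"
)

# Estimated cardinality = selectivity * number of rows in the (assumed) full table.
table_rows = 1000
estimated_cardinality = sel_both * table_rows

print(sel_name, sel_both, estimated_cardinality)
```

The combined predicate has the smaller value, so it is the more selective of the two.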
  • 19. Plan Generator. The plan generator investigates multiple plans for a query block by trying out various access paths, join methods, and join orders. Many plans are possible because the database can use different combinations of these to produce the same result. The optimizer chooses the plan with the lowest cost. Automatic Tuning Optimizer. Depending on how it is invoked, the optimizer performs different actions. The database offers the following optimization types. Normal optimization: the optimizer compiles the SQL and produces an execution plan. For most SQL statements, normal mode gives a reasonable plan. In normal mode the optimizer works under stringent time limits, usually a fraction of a second, within which it must find a good plan. SQL Tuning Advisor optimization: when SQL Tuning Advisor invokes the optimizer with one or more SQL statements as input, the optimizer is known as the Automatic Tuning Optimizer. In this case, the optimizer performs additional analysis to improve the plan generated in normal mode. It produces a set of recommended actions, along with their rationale and expected benefit, for producing a considerably better plan.
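Join-order enumeration can be sketched as a brute-force search over left-deep orders. The cardinalities, join selectivities, and the cost function (the sum of intermediate result sizes) are all assumptions chosen for illustration; course is given one row on the assumption that a filter on coursename has already been applied. Real plan generators also vary access paths and join methods and prune the search space.

```python
from fractions import Fraction
from itertools import permutations

# Assumed row counts for each input.
cardinality = {"student_details": 1000, "takes": 5000, "course": 1}

# Assumed join selectivities; pairs without a join predicate form a cross product.
join_selectivity = {
    frozenset({"student_details", "takes"}): Fraction(1, 1000),
    frozenset({"takes", "course"}): Fraction(1, 50),
}

def plan_cost(order):
    """Cost of a left-deep join order: the sum of all intermediate result sizes."""
    joined = [order[0]]
    rows = Fraction(cardinality[order[0]])
    cost = Fraction(0)
    for table in order[1:]:
        sel = Fraction(1)
        for other in joined:
            sel *= join_selectivity.get(frozenset({other, table}), Fraction(1))
        rows = rows * cardinality[table] * sel
        cost += rows
        joined.append(table)
    return cost

# Examine every join order and keep the cheapest one.
best_order = min(permutations(cardinality), key=plan_cost)

for order in permutations(cardinality):
    print(order, plan_cost(order))
print("chosen:", best_order)
```

Under these assumptions the cheapest orders start by joining takes with the filtered course, which keeps the intermediate result small.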
  • 20. Query Optimization Techniques with Examples. Indexing: accelerate data retrieval by creating indexes on columns used in WHERE clauses. Query rewriting: restructure queries to produce the same result with improved efficiency. Join optimization: optimize JOIN operations to minimize computational overhead. Subquery optimization: optimize subqueries to improve their efficiency. Query caching: store and reuse the results of frequently executed queries. Parallel processing: execute multiple parts of a query simultaneously to improve overall performance.
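As an illustration of the indexing technique, the sketch below asks SQLite for its query plan before and after creating an index on the column used in the WHERE clause. SQLite is a stand-in here, the index name `idx_student_name` is hypothetical, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student_details (cid TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO student_details VALUES (?, ?)",
    [("00112233", "Paul"), ("00112238", "Rob"), ("00112235", "Matt")],
)

def plan(sql):
    """Return the human-readable detail column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

query = "SELECT * FROM student_details WHERE name = 'Paul'"

before = plan(query)  # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_student_name ON student_details (name)")
after = plan(query)   # the index on name can now be used

print("before:", before)
print("after:", after)
```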
  • 21. Outline of Database Security: overview of database security; what database security is; why database security is needed; concepts of database security; security problems; security controls.
  • 22. Overview: In today’s world, we need everything secured, whether it is a mobile phone, a computer, a vehicle, or almost anything else.
  • 23. What is database security? Database: a collection of information stored in a computer. Security: being free from danger. Database security: the mechanisms that protect the database against intentional or accidental threats.
  • 24. Definition of Database Security: database security is the process by which the confidentiality, integrity, and availability of the database are protected. [Figure: database security comprises authentication, authorization and access control, data encryption, data privacy protection, data integrity verification, and auditing and logging] Concepts of Database Security: there are three main aspects: secrecy (confidentiality), integrity, and availability.
  • 25. SECRECY: protects the database from unauthorized users; ensures that users are allowed to do the things they are trying to do; encryption is a technique by which data is encoded so that only authorized users are able to read it. INTEGRITY: protects the database from authorized users; ensures that what users are trying to do is correct; for example, an employee should be able to modify his or her own information. AVAILABILITY: the database must have no unplanned downtime. To ensure this, the following steps should be taken: restrict the amount of storage space given to each user in the database; limit the number of concurrent sessions available to each database user; back up the data at periodic intervals to ensure data recovery in case of application issues.
  • 26. SECURITY PROBLEMS: a threat is any circumstance or event with the potential to adversely impact an information system (IS) through unauthorized access, destruction, disclosure, modification of data, and/or denial of service. There are two kinds of threat: non-fraudulent threats and fraudulent threats.
  • 27. Non-fraudulent threats: natural or accidental disasters; errors or bugs in hardware or software; human errors. Fraudulent threats: authorized users (those who abuse their privileges and authority) and hostile agents (improper users, outsiders or insiders, who attack the software and/or hardware system, or read or write data in a database).
  • 28. Database Protection Requirements 1. Protection from Improper Access 2. Protection from Inference 3. Integrity of the Database 4. User Authentication 5. Multilevel Protection 6. Confinement 7. Management and Protection of Sensitive Data
  • 29. Security Controls. Authorization: privileges, views. Encryption: public key / private key, secure sockets. Authentication: passwords. Logical: firewalls, net proxies. A firewall is dedicated software, running on another computer, that inspects the network traffic passing through it and permits or denies passage based on a set of rules. Basically, it is a piece of software that monitors all traffic going from your system to another system over the Internet or a network, and vice versa. [Figure: diagram representation]
  • 30. Encryption and Decryption: Advantages and Disadvantages of Encryption. Advantages: Privacy protection: encryption safeguards sensitive information, such as personal details, financial data, and communications, protecting user privacy. Secure communication: encryption secures data during transmission, preventing unauthorized interception and eavesdropping. Authentication and authorization: encryption can be used in conjunction with authentication and authorization mechanisms to enhance overall security. Disadvantages: Performance overhead: encrypting and decrypting data introduces computational overhead, potentially impacting system performance. Potential for key exposure: if encryption keys are not adequately protected, they may be vulnerable to theft, leading to unauthorized access. Resource consumption: encryption can consume additional resources, such as CPU and memory, especially in resource-constrained environments.
  • 31. Advantages and Disadvantages of Decryption. Advantages: Access to encrypted data: the primary purpose of decryption is to enable authorized users to access and read the originally encrypted data. Data utilization: decryption allows data to be used for purposes such as analysis, reporting, and decision-making. Data recovery: in the case of data loss or system failures, having access to decryption keys allows encrypted data to be recovered. Disadvantages: Security risks: decryption, if not carefully managed, can pose security risks, especially if unauthorized parties gain access to decryption keys. Potential for misuse: if decryption keys fall into the wrong hands, there is a risk of data misuse, unauthorized access, and security breaches. Data exposure: decrypted data is exposed to potential threats while it is in readable form, especially if it is not adequately protected.
  • 32. Authorization. Read authorization: allows reading, but not modification, of data. Insert authorization: allows insertion of new data, but not modification of existing data. Update authorization: allows modification, but not deletion, of data. Delete authorization: allows deletion of data.
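The four authorization types can be sketched as independent privilege flags that are checked before an operation runs. The user names, the `grants` table, and the helper `is_allowed` are hypothetical; in a real DBMS these privileges are managed with statements such as GRANT and REVOKE.

```python
from enum import Flag, auto

class Privilege(Flag):
    READ = auto()    # read data, no modification
    INSERT = auto()  # add new data, no modification of existing data
    UPDATE = auto()  # modify data, no deletion
    DELETE = auto()  # delete data

# Hypothetical grants, for illustration only.
grants = {
    "analyst": Privilege.READ,
    "clerk": Privilege.READ | Privilege.INSERT,
    "admin": Privilege.READ | Privilege.INSERT | Privilege.UPDATE | Privilege.DELETE,
}

def is_allowed(user, needed):
    """True only if the user holds every privilege in `needed`."""
    granted = grants.get(user, Privilege(0))  # unknown users hold no privileges
    return (granted & needed) == needed

print(is_allowed("analyst", Privilege.READ))    # analyst may read
print(is_allowed("analyst", Privilege.UPDATE))  # but may not modify
print(is_allowed("clerk", Privilege.INSERT))    # clerk may insert
print(is_allowed("clerk", Privilege.DELETE))    # but may not delete
```

Keeping each privilege as a separate flag mirrors the slide: holding insert authorization, for instance, implies nothing about update or delete.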
  • 33. Advanced Topics. Distributed Databases: Overview: distributed databases involve the storage and management of data across multiple locations or servers. Key aspects: data distribution, replication, consistency, and fault tolerance. Challenges: network latency, data synchronization, and ensuring consistency across distributed nodes. Data Warehousing: Overview: data warehousing involves the collection, integration, and storage of data from different sources for analysis and reporting. Key aspects: ETL (Extract, Transform, Load) processes, data marts, and multidimensional data models. Challenges: data integration, data quality, and designing effective data models for analysis. Data Mining and Machine Learning: Overview: data mining involves discovering patterns and trends in large datasets, while machine learning uses algorithms to make predictions. Key aspects: classification, clustering, regression, and predictive modeling. Challenges: feature selection, model interpretation, and ensuring the quality of input data. Blockchain and Databases: Overview: blockchain is a distributed ledger technology that enables secure and transparent transactions. Key aspects: decentralization, consensus mechanisms, and smart contracts. Challenges: scalability, privacy concerns, and integration with traditional databases.
  • 34. Conclusion: In today's dynamic technological landscape, the understanding and implementation of these concepts are crucial for organizations aiming to harness the full potential of their data. Balancing performance, security, and innovation is key to building robust and adaptive database systems that meet the demands of modern applications and business requirements. As technology continues to evolve, staying informed about emerging trends and advanced database management techniques becomes increasingly important for professionals in the field.