SlideShare a Scribd company logo
OPTIMIZING MYSQL QUERIES
KhoaHD
OVERVIEW
• Optimization Overview
• Understanding the Query Execution
• Using explain
• Optimizing Specific Types of Queries
• Indexing
• Partitioning
• Demo partitioning
1. OPTIMIZATION OVERVIEW
1.1 Optimizing at the Database Level
• tables
• indexes
• storage engine
• locking strategy
• Are all memory areas used for caching sized correctly?
1. OPTIMIZATION OVERVIEW
1.2 Optimizing at the Hardware Level
• Disk seeks
• Disk reading and writing
• CPU cycles
• Memory bandwidth
2. UNDERSTANDING THE QUERY EXECUTION
2. UNDERSTANDING THE QUERY EXECUTION
1. The client sends the SQL statement to the server.
2. The server checks the query cache. If there’s a hit, it returns the stored result from
the cache; otherwise, it passes the SQL statement to the next step.
3. The server parses, preprocesses, and optimizes the SQL into a query execution
plan.
4. The query execution engine executes the plan by making calls to the storage
engine
API.
5. The server sends the result to the client.
3. USING EXPLAIN
3.1 Usage:
• EXPLAIN is used to obtain a query execution plan.
• EXPLAIN is useful for examining queries involving
partitioned tables.
• EXPLAIN works with SELECT, DELETE, INSERT, REPLACE,
and UPDATE statements.
3. USING EXPLAIN
3.1 Usage:
3. USING EXPLAIN
3.3 What should I care about?
3. USING EXPLAIN
3.3 What should I care about?
• Partitions
• Extra
• Type
3. USING EXPLAIN
3.3 What should I care about?
Partitions: show which partitions were used.
3. USING EXPLAIN
3.3 What should I care about?
Extra
• Using filesort
• Using temporary
• Using index
• Using index for group-by
• …
3. USING EXPLAIN
3.3 What should I care about?
Extra
• Using filesort: MySQL must do an extra pass to find out
how to retrieve the rows in sorted order. The sort is done
by going through all rows according to the join type and
storing the sort key and pointer to the row for all rows that
match the WHERE clause. The keys then are sorted and the
rows are retrieved in sorted order.
3. USING EXPLAIN
3.3 What should I care about?
Extra
• Using temporary: To resolve the query, MySQL needs to
create a temporary table to hold the result. This typically
happens if the query contains GROUP BY and ORDER BY
clauses that list columns differently.
3. USING EXPLAIN
3.3 What should I care about?
Extra
• Using index: The column information is retrieved from the
table using only information in the index tree without
having to do an additional seek to read the actual row. This
strategy can be used when the query uses only columns
that are part of a single index.
3. USING EXPLAIN
3.3 What should I care about?
Extra
• Using index for group-by: Similar to the Using index table
access method, Using index for group-by indicates that
MySQL found an index that can be used to retrieve all
columns of a GROUP BY or DISTINCT query without any
extra disk access to the actual table. Additionally, the index
is used in the most efficient way so that for each group,
only a few index entries are read.
3. USING EXPLAIN
3.3 What should I care about?
Type: describes how tables are joined. The following list
describes the join types, ordered from the best type to the
worst:
• System
• Const
• Eq_ref
• Ref
• Fulltext
3. USING EXPLAIN
3.3 What should I care about?
Type:
• Ref_or_null
• Index_merge
• Unique_subquery
• Index_subquery
• Range
• Index
• All
4. OPTIMIZING SPECIFIC TYPES OF QUERIES
4.1 Count()
• SELECT COUNT(*) FROM world.City WHERE ID > 5;
• If you examine this query with SHOW STATUS, you’ll see
that it scans 4,079 rows. If you negate the conditions and
subtract the number of cities whose IDs are less than or
equal to 5 from the total number of cities, you can reduce
that to five rows:
• SELECT (SELECT COUNT(*) FROM world.City) - COUNT(*)
-> FROM world.City WHERE ID <= 5;
4. OPTIMIZING SPECIFIC TYPES OF QUERIES
4.2 LIMIT and OFFSET
• One simple technique to improve efficiency is to do the offset on a covering index,
rather than the full rows. You can then join the result to the full row and retrieve the
additional columns you need. This can be much more efficient. Consider the following
query:
4. OPTIMIZING SPECIFIC TYPES OF QUERIES
4.3 Group by
• The most general way to satisfy a GROUP BY clause is to scan the whole table and
create a new temporary table where all rows from each group are consecutive, and
then use this temporary table to discover groups and apply aggregate functions (if
any). In some cases, MySQL is able to do much better than that and to avoid
creation of temporary tables by using index access.
4. OPTIMIZING SPECIFIC TYPES OF QUERIES
4.3 Group by
Loose Index Scan
If loose index scan is applicable to a query, the EXPLAIN output shows Using index for
group-by in the Extra column.
4. OPTIMIZING SPECIFIC TYPES OF QUERIES
4.3 Group by
Loose Index Scan
Condition:
• The query is over a single table.
• The GROUP BY names only columns that form a leftmost prefix of the index and no
other columns. (If, instead of GROUP BY, the query has a DISTINCT clause, all distinct
attributes refer to columns that form a leftmost prefix of the index.)
• The only aggregate functions used in the select list (if any) are MIN() and MAX(), and
all of them refer to the same column. The column must be in the index and must
immediately follow the columns in the GROUP BY.
4. OPTIMIZING SPECIFIC TYPES OF QUERIES
4.3 Group by
Loose Index Scan
Condition:
• Any other parts of the index than those from the GROUP BY referenced in the query
must be constants (that is, they must be referenced in equalities with constants),
except for the argument of MIN() or MAX() functions.
• For columns in the index, full column values must be indexed, not just a prefix. For
example, with c1 VARCHAR(20), INDEX (c1(10)), the index cannot be used for loose
index scan.
4. OPTIMIZING SPECIFIC TYPES OF QUERIES
4.3 Group by
Loose Index Scan
Example: index(c1, c2, c3)
4. OPTIMIZING SPECIFIC TYPES OF QUERIES
4.3 Group by
Loose Index Scan
4. OPTIMIZING SPECIFIC TYPES OF QUERIES
4.3 Group by
Tight Index Scan
4. OPTIMIZING SPECIFIC TYPES OF QUERIES
4.3 UNION
• UNION ALL is much faster than UNION
• Move the WHERE inside each subquery
Example:
• Slow:
(SELECT type, release FROM short_sleeve)
UNION
(SELECT type, release FROM long_sleeve);
WHERE release >=2013;
• Fast:
(SELECT type, release FROM short_sleeve WHERE release >=2013)
UNION
(SELECT type, release FROM long_sleeve WHERE release >=2013);
5. INDEXING
5.1 B-Tree
• When people talk about an index without mentioning a
type, they’re probably referring
to a B-Tree index
• A B-Tree index speeds up data access because the storage
engine doesn’t have to scan
the whole table to find the desired data. Instead, it starts
at the root node (not shown
in this figure).
5. INDEXING
5.1 B-Tree
5. INDEXING
5.1 B-Tree
• Consider InnoDB whose page size is 16KB and suppose we have an index on a integer
column of size 4bytes, so a node can contain at most 16 * 1024 / 4 = 4096 keys, and a
node can have at most 4097 children.
• So for a B+tree of height 1, the root node has 4096 keys and the nodes at height 1
(the leaf nodes) have 4096 * 4097 = 16781312 key values.
5. INDEXING
5.1 B-Tree
5. INDEXING
5.1 B-Tree
Benefits:
• Match the full value: find a person named Cuba Allen who was born
on 1960-01-01.
• Match a leftmost prefix: find all people with the last name Allen
• Match a column prefix: all people whose last names begin with J
• Match a range of values: find people whose last names are between Allen and
Barrymore
• Match one part exactly and match a range on another part: find everyone whose last
name is Allen and whose first name starts with the letter K (Kim, Karl, etc.)
5. INDEXING
5.1 B-Tree
Limits:
• They are not useful if the lookup does not start from the leftmost side of the
indexed columns.
• You can’t skip columns in the index.
5. INDEXING
5.2 Hash indexes
A hash index is built on a hash table and is useful only for exact lookups that use
every column in the index. For each row, the storage engine computes a hash
code of the indexed columns, which is a small value that will probably differ
from the hash codes computed for other rows with different key values. It stores
the hash codes in the index and stores a pointer to each row in a hash table.
5. INDEXING
5.2 Hash indexes
5. INDEXING
5.2 Hash indexes
Limits:
• MySQL can’t use the values in the index to avoid reading the rows
• can’t use hash indexes for sorting
• don’t support partial key matching
• support only equality comparisons that use the =, IN(), and <=>
operators
• When there are collisions (multiple values with the same hash): slow
5. INDEXING
5.3 Indexing Strategies for High Performance
Isolating the Column
• MySQL generally can’t use indexes on columns unless the columns are
isolated in the query. “Isolating” the column means it should not be part of an
expression or be inside a function in the query.
SELECT actor_id FROM sakila.actor WHERE actor_id + 1 = 5;
• Here’s another example of a common mistake:
mysql> SELECT ... WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(date_col) <= 10;
5. INDEXING
5.3 Indexing Strategies for High Performance
Prefix Indexes and Index Selectivity
• Sometimes you need to index very long character columns, which makes your
indexes large and slow. The trick is to choose a prefix that’s long enough to give good
selectivity, but short enough to save space.
5. INDEXING
5.3 Indexing Strategies for High Performance
Choosing a Good Column Order
• place the most selective columns first in the index.
• Let’s use the following query as an example:
SELECT * FROM payment WHERE staff_id = 2 AND customer_id = 584;
• index on (staff_id, customer_id) OR (customer_id , staff_id) ???
5. INDEXING
5.3 Indexing Strategies for High Performance
Choosing a Good Column Order
5. INDEXING
5.3 Indexing Strategies for High Performance
Covering Indexes
• MySQL can also use an index to retrieve a column’s data, so it doesn’t have to read
the row at all. After all, the index’s leaf nodes contain the values they index;
• MySQL can use only B-Tree indexes to cover queries.
• The general rule is to choose the columns for filtering first (WHERE clause with
equality conditions), then sorting/grouping (GROUP BY and ORDER BY clauses) and
finally the data projection (SELECT clause).
5. INDEXING
5.3 Indexing Strategies for High Performance
Covering Indexes
5. INDEXING
5.3 Indexing Strategies for High Performance
Using Index Scans for Sorts
MySQL has two ways to produce ordered results: it can use a sort operation, or it can
scan an index in order. You can tell when MySQL plans to scan an index by looking for
“index” in the type column in EXPLAIN.
5. INDEXING
5.3 Indexing Strategies for High Performance
Using Index Scans for Sorts
Index on (rental_date,
inventory_id,
Customer_id)
5. INDEXING
5.3 Indexing Strategies for High Performance
Using Index Scans for Sorts
Index on (rental_date, inventory_id, customer_id)
5. INDEXING
5.3 Indexing Strategies for High Performance
Using Index Scans for Sorts
Index on
(rental_date,
inventory_id,
Customer_id)
6. PARTITIONING
6.1 scenarios:
• When the table is much too big to fit in memory, or when you have “hot” rows at the
end of a table that has lots of historical data.
• Partitioned data is easier to maintain than nonpartitioned data. For example, it’s easier
to discard old data by dropping an entire partition, which you can do quickly. You can
also optimize, check, and repair individual partitions.
• If you really need to, you can back up and restore individual partitions, which is very
helpful with extremely large datasets.
6. PARTITIONING
6.2 How Partitioning Works:
When you query a partitioned table, the partitioning layer opens and locks all of the
underlying partitions, the query optimizer determines whether any of the partitions can
be ignored (pruned), and then the partitioning layer forwards the handler API calls to
the storage engine that manages the partitions.
6. PARTITIONING
6.3 Types of Partitioning:
• MySQL supports several types of partitioning. The most common type we’ve seen
used is range partitioning.

More Related Content

What's hot (20)

PPTX
Virtual machines and containers
Patrick Pierson
 
PDF
[215] Druid로 쉽고 빠르게 데이터 분석하기
NAVER D2
 
PPTX
NGINX: Basics and Best Practices
NGINX, Inc.
 
PDF
Service discovery with Eureka and Spring Cloud
Marcelo Serpa
 
PPTX
Rethinking Cloud Proxies
Mikey Cohen - Hiring Amazing Engineers
 
PPTX
Asp.Net Core MVC with Entity Framework
Shravan A
 
PDF
Introduction to Azure
Robert Crane
 
PDF
Kubernetes - A Comprehensive Overview
Bob Killen
 
PPTX
React + Redux Introduction
Nikolaus Graf
 
ODP
Elasticsearch for beginners
Neil Baker
 
PPTX
Entity Framework Database and Code First
James Johnson
 
PPTX
AWS Code-Deploy
Knoldus Inc.
 
PPTX
Introduction Node.js
Erik van Appeldoorn
 
PPTX
Azure key vault
Rahul Nath
 
PDF
Amazon Aurora 성능 향상 및 마이그레이션 모범 사례 - AWS Summit Seoul 2017
Amazon Web Services Korea
 
PPTX
쿠버네티스 ( Kubernetes ) 소개 자료
Opennaru, inc.
 
PDF
Azure vm introduction
Lalit Rawat
 
PPTX
Express JS Middleware Tutorial
Simplilearn
 
PPTX
Docker Ecosystem on Azure
Patrick Chanezon
 
PDF
Kubernetes Concepts And Architecture Powerpoint Presentation Slides
SlideTeam
 
Virtual machines and containers
Patrick Pierson
 
[215] Druid로 쉽고 빠르게 데이터 분석하기
NAVER D2
 
NGINX: Basics and Best Practices
NGINX, Inc.
 
Service discovery with Eureka and Spring Cloud
Marcelo Serpa
 
Rethinking Cloud Proxies
Mikey Cohen - Hiring Amazing Engineers
 
Asp.Net Core MVC with Entity Framework
Shravan A
 
Introduction to Azure
Robert Crane
 
Kubernetes - A Comprehensive Overview
Bob Killen
 
React + Redux Introduction
Nikolaus Graf
 
Elasticsearch for beginners
Neil Baker
 
Entity Framework Database and Code First
James Johnson
 
AWS Code-Deploy
Knoldus Inc.
 
Introduction Node.js
Erik van Appeldoorn
 
Azure key vault
Rahul Nath
 
Amazon Aurora 성능 향상 및 마이그레이션 모범 사례 - AWS Summit Seoul 2017
Amazon Web Services Korea
 
쿠버네티스 ( Kubernetes ) 소개 자료
Opennaru, inc.
 
Azure vm introduction
Lalit Rawat
 
Express JS Middleware Tutorial
Simplilearn
 
Docker Ecosystem on Azure
Patrick Chanezon
 
Kubernetes Concepts And Architecture Powerpoint Presentation Slides
SlideTeam
 

Similar to Optimizing MySQL queries (20)

PDF
MySQL Performance Optimization
Mindfire Solutions
 
PDF
Advanced MySQL Query Optimizations
Dave Stokes
 
PDF
High Performance Mysql - Friday Tech Talks at Squareboat
Squareboat
 
PPTX
MySQL Indexes
Anton Zhukov
 
PDF
Zurich2007 MySQL Query Optimization
Hiệp Lê Tuấn
 
PDF
Zurich2007 MySQL Query Optimization
Hiệp Lê Tuấn
 
PDF
query optimization
Dimara Hakim
 
PDF
Longhorn PHP - MySQL Indexes, Histograms, Locking Options, and Other Ways to ...
Dave Stokes
 
PDF
MySQL Indexes and Histograms - RMOUG Training Days 2022
Dave Stokes
 
PDF
Introduction to MySQL Query Tuning for Dev[Op]s
Sveta Smirnova
 
PPTX
MySQL Indexing - Best practices for MySQL 5.6
MYXPLAIN
 
PPTX
Open Source 1010 and Quest InSync presentations March 30th, 2021 on MySQL Ind...
Dave Stokes
 
PDF
MySQL Query Optimisation 101
Federico Razzoli
 
PPTX
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
Dave Stokes
 
ODP
San diegophp
Dave Stokes
 
PDF
MySQL Indexing
BADR
 
PDF
Dutch PHP Conference 2021 - MySQL Indexes and Histograms
Dave Stokes
 
PPTX
Confoo 2021 - MySQL Indexes & Histograms
Dave Stokes
 
PPTX
Работа с индексами - лучшие практики для MySQL 5.6, Петр Зайцев (Percona)
Ontico
 
PDF
Optimized cluster index generation
Rutvik Pensionwar
 
MySQL Performance Optimization
Mindfire Solutions
 
Advanced MySQL Query Optimizations
Dave Stokes
 
High Performance Mysql - Friday Tech Talks at Squareboat
Squareboat
 
MySQL Indexes
Anton Zhukov
 
Zurich2007 MySQL Query Optimization
Hiệp Lê Tuấn
 
Zurich2007 MySQL Query Optimization
Hiệp Lê Tuấn
 
query optimization
Dimara Hakim
 
Longhorn PHP - MySQL Indexes, Histograms, Locking Options, and Other Ways to ...
Dave Stokes
 
MySQL Indexes and Histograms - RMOUG Training Days 2022
Dave Stokes
 
Introduction to MySQL Query Tuning for Dev[Op]s
Sveta Smirnova
 
MySQL Indexing - Best practices for MySQL 5.6
MYXPLAIN
 
Open Source 1010 and Quest InSync presentations March 30th, 2021 on MySQL Ind...
Dave Stokes
 
MySQL Query Optimisation 101
Federico Razzoli
 
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
Dave Stokes
 
San diegophp
Dave Stokes
 
MySQL Indexing
BADR
 
Dutch PHP Conference 2021 - MySQL Indexes and Histograms
Dave Stokes
 
Confoo 2021 - MySQL Indexes & Histograms
Dave Stokes
 
Работа с индексами - лучшие практики для MySQL 5.6, Петр Зайцев (Percona)
Ontico
 
Optimized cluster index generation
Rutvik Pensionwar
 
Ad

More from GMO-Z.com Vietnam Lab Center (20)

PPTX
高負荷に耐えうるWebApplication Serverの作り方
GMO-Z.com Vietnam Lab Center
 
PPTX
Phương pháp và chiến lược đối ứng tải trong Web Application Server
GMO-Z.com Vietnam Lab Center
 
PPTX
Ứng dụng NLP vào việc xác định ý muốn người dùng (Intent Detection) và sửa lỗ...
GMO-Z.com Vietnam Lab Center
 
PPTX
Tìm hiểu và triển khai ứng dụng Web với Kubernetes
GMO-Z.com Vietnam Lab Center
 
PPTX
Xây dựng hệ thống quản lý sân bóng sử dụng Yii Framework
GMO-Z.com Vietnam Lab Center
 
PPTX
Nhận biết giao dịch lừa đảo sử dụng học máy
GMO-Z.com Vietnam Lab Center
 
PDF
Hệ thống giám sát nhận diện khuôn mặt
GMO-Z.com Vietnam Lab Center
 
PPTX
Image Style Transfer
GMO-Z.com Vietnam Lab Center
 
PPTX
Surveillance on slam technology
GMO-Z.com Vietnam Lab Center
 
PDF
Blockchain & Smart Contract - Bắt đầu như thế nào và các ứng dụng
GMO-Z.com Vietnam Lab Center
 
PDF
Giới thiệu Embulk
GMO-Z.com Vietnam Lab Center
 
PDF
Giới thiệu docker và ứng dụng trong ci-cd
GMO-Z.com Vietnam Lab Center
 
PDF
Tài liệu giới thiệu công ty GMO-Z.com Vietnam Lab Center
GMO-Z.com Vietnam Lab Center
 
PDF
Agile retrospective
GMO-Z.com Vietnam Lab Center
 
PDF
Giới thiệu Agile + Scrum
GMO-Z.com Vietnam Lab Center
 
PDF
Create android app can send SMS and Email by React Natice
GMO-Z.com Vietnam Lab Center
 
PDF
Introduce React Native
GMO-Z.com Vietnam Lab Center
 
PPTX
Git in real product
GMO-Z.com Vietnam Lab Center
 
高負荷に耐えうるWebApplication Serverの作り方
GMO-Z.com Vietnam Lab Center
 
Phương pháp và chiến lược đối ứng tải trong Web Application Server
GMO-Z.com Vietnam Lab Center
 
Ứng dụng NLP vào việc xác định ý muốn người dùng (Intent Detection) và sửa lỗ...
GMO-Z.com Vietnam Lab Center
 
Tìm hiểu và triển khai ứng dụng Web với Kubernetes
GMO-Z.com Vietnam Lab Center
 
Xây dựng hệ thống quản lý sân bóng sử dụng Yii Framework
GMO-Z.com Vietnam Lab Center
 
Nhận biết giao dịch lừa đảo sử dụng học máy
GMO-Z.com Vietnam Lab Center
 
Hệ thống giám sát nhận diện khuôn mặt
GMO-Z.com Vietnam Lab Center
 
Image Style Transfer
GMO-Z.com Vietnam Lab Center
 
Surveillance on slam technology
GMO-Z.com Vietnam Lab Center
 
Blockchain & Smart Contract - Bắt đầu như thế nào và các ứng dụng
GMO-Z.com Vietnam Lab Center
 
Giới thiệu Embulk
GMO-Z.com Vietnam Lab Center
 
Giới thiệu docker và ứng dụng trong ci-cd
GMO-Z.com Vietnam Lab Center
 
Tài liệu giới thiệu công ty GMO-Z.com Vietnam Lab Center
GMO-Z.com Vietnam Lab Center
 
Agile retrospective
GMO-Z.com Vietnam Lab Center
 
Giới thiệu Agile + Scrum
GMO-Z.com Vietnam Lab Center
 
Create android app can send SMS and Email by React Natice
GMO-Z.com Vietnam Lab Center
 
Introduce React Native
GMO-Z.com Vietnam Lab Center
 
Git in real product
GMO-Z.com Vietnam Lab Center
 
Ad

Recently uploaded (20)

PDF
Mastering Financial Management in Direct Selling
Epixel MLM Software
 
PDF
Staying Human in a Machine- Accelerated World
Catalin Jora
 
DOCX
Python coding for beginners !! Start now!#
Rajni Bhardwaj Grover
 
PPTX
The Project Compass - GDG on Campus MSIT
dscmsitkol
 
PDF
LOOPS in C Programming Language - Technology
RishabhDwivedi43
 
PDF
Empower Inclusion Through Accessible Java Applications
Ana-Maria Mihalceanu
 
PDF
[Newgen] NewgenONE Marvin Brochure 1.pdf
darshakparmar
 
PDF
"Beyond English: Navigating the Challenges of Building a Ukrainian-language R...
Fwdays
 
PPTX
Webinar: Introduction to LF Energy EVerest
DanBrown980551
 
PPTX
Designing Production-Ready AI Agents
Kunal Rai
 
PPTX
"Autonomy of LLM Agents: Current State and Future Prospects", Oles` Petriv
Fwdays
 
PDF
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
PPTX
OpenID AuthZEN - Analyst Briefing July 2025
David Brossard
 
PDF
Go Concurrency Real-World Patterns, Pitfalls, and Playground Battles.pdf
Emily Achieng
 
PPTX
Q2 FY26 Tableau User Group Leader Quarterly Call
lward7
 
PDF
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
PDF
Building Real-Time Digital Twins with IBM Maximo & ArcGIS Indoors
Safe Software
 
PDF
Transcript: New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
PDF
POV_ Why Enterprises Need to Find Value in ZERO.pdf
darshakparmar
 
PPTX
Building Search Using OpenSearch: Limitations and Workarounds
Sease
 
Mastering Financial Management in Direct Selling
Epixel MLM Software
 
Staying Human in a Machine- Accelerated World
Catalin Jora
 
Python coding for beginners !! Start now!#
Rajni Bhardwaj Grover
 
The Project Compass - GDG on Campus MSIT
dscmsitkol
 
LOOPS in C Programming Language - Technology
RishabhDwivedi43
 
Empower Inclusion Through Accessible Java Applications
Ana-Maria Mihalceanu
 
[Newgen] NewgenONE Marvin Brochure 1.pdf
darshakparmar
 
"Beyond English: Navigating the Challenges of Building a Ukrainian-language R...
Fwdays
 
Webinar: Introduction to LF Energy EVerest
DanBrown980551
 
Designing Production-Ready AI Agents
Kunal Rai
 
"Autonomy of LLM Agents: Current State and Future Prospects", Oles` Petriv
Fwdays
 
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
OpenID AuthZEN - Analyst Briefing July 2025
David Brossard
 
Go Concurrency Real-World Patterns, Pitfalls, and Playground Battles.pdf
Emily Achieng
 
Q2 FY26 Tableau User Group Leader Quarterly Call
lward7
 
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
Building Real-Time Digital Twins with IBM Maximo & ArcGIS Indoors
Safe Software
 
Transcript: New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
POV_ Why Enterprises Need to Find Value in ZERO.pdf
darshakparmar
 
Building Search Using OpenSearch: Limitations and Workarounds
Sease
 

Optimizing MySQL queries

  • 2. OVERVIEW • Optimization Overview • Understanding the Query Execution • Using explain • Optimizing Specific Types of Queries • Indexing • Partitioning • Demo partitioning
  • 3. 1. OPTIMIZATION OVERVIEW 1.1 Optimizing at the Database Level • tables • indexes • storage engine • locking strategy • Are all memory areas used for caching sized correctly?
  • 4. 1. OPTIMIZATION OVERVIEW 1.2 Optimizing at the Hardware Level • Disk seeks • Disk reading and writing • CPU cycles • Memory bandwidth
  • 5. 2. UNDERSTANDING THE QUERY EXECUTION
  • 6. 2. UNDERSTANDING THE QUERY EXECUTION 1. The client sends the SQL statement to the server. 2. The server checks the query cache. If there’s a hit, it returns the stored result from the cache; otherwise, it passes the SQL statement to the next step. 3. The server parses, preprocesses, and optimizes the SQL into a query execution plan. 4. The query execution engine executes the plan by making calls to the storage engine API. 5. The server sends the result to the client.
  • 7. 3. USING EXPLAIN 3.1 Usage: • EXPLAIN is used to obtain a query execution plan. • EXPLAIN is useful for examining queries involving partitioned tables. • EXPLAIN works with SELECT, DELETE, INSERT, REPLACE, and UPDATE statements.
  • 9. 3. USING EXPLAIN 3.3 What should I care about?
  • 10. 3. USING EXPLAIN 3.3 What should I care about? • Partitions • Extra • Type
  • 11. 3. USING EXPLAIN 3.3 What should I care about? Partitions: show which partitions were used.
  • 12. 3. USING EXPLAIN 3.3 What should I care about? Extra • Using filesort • Using temporary • Using index • Using index for group-by • …
  • 13. 3. USING EXPLAIN 3.3 What should I care about? Extra • Using filesort: MySQL must do an extra pass to find out how to retrieve the rows in sorted order. The sort is done by going through all rows according to the join type and storing the sort key and pointer to the row for all rows that match the WHERE clause. The keys then are sorted and the rows are retrieved in sorted order.
  • 14. 3. USING EXPLAIN 3.3 What should I care about? Extra • Using temporary: To resolve the query, MySQL needs to create a temporary table to hold the result. This typically happens if the query contains GROUP BY and ORDER BY clauses that list columns differently.
  • 15. 3. USING EXPLAIN 3.3 What should I care about? Extra • Using index: The column information is retrieved from the table using only information in the index tree without having to do an additional seek to read the actual row. This strategy can be used when the query uses only columns that are part of a single index.
  • 16. 3. USING EXPLAIN 3.3 What should I care about? Extra • Using index for group-by: Similar to the Using index table access method, Using index for group-by indicates that MySQL found an index that can be used to retrieve all columns of a GROUP BY or DISTINCT query without any extra disk access to the actual table. Additionally, the index is used in the most efficient way so that for each group, only a few index entries are read.
  • 17. 3. USING EXPLAIN 3.3 What should I care about? Type: describes how tables are joined. The following list describes the join types, ordered from the best type to the worst: • System • Const • Eq_ref • Ref • Fulltext
  • 18. 3. USING EXPLAIN 3.3 What should I care about? Type: • Ref_or_null • Index_merge • Unique_subquery • Index_subquery • Range • Index • All
  • 19. 4. OPTIMIZING SPECIFIC TYPES OF QUERIES 4.1 Count() • SELECT COUNT(*) FROM world.City WHERE ID > 5; • If you examine this query with SHOW STATUS, you’ll see that it scans 4,079 rows. If you negate the conditions and subtract the number of cities whose IDs are less than or equal to 5 from the total number of cities, you can reduce that to five rows: • SELECT (SELECT COUNT(*) FROM world.City) - COUNT(*) -> FROM world.City WHERE ID <= 5;
  • 20. 4. OPTIMIZING SPECIFIC TYPES OF QUERIES 4.2 LIMIT and OFFSET • One simple technique to improve efficiency is to do the offset on a covering index, rather than the full rows. You can then join the result to the full row and retrieve the additional columns you need. This can be much more efficient. Consider the following query:
  • 21. 4. OPTIMIZING SPECIFIC TYPES OF QUERIES 4.3 Group by • The most general way to satisfy a GROUP BY clause is to scan the whole table and create a new temporary table where all rows from each group are consecutive, and then use this temporary table to discover groups and apply aggregate functions (if any). In some cases, MySQL is able to do much better than that and to avoid creation of temporary tables by using index access.
  • 22. 4. OPTIMIZING SPECIFIC TYPES OF QUERIES 4.3 Group by Loose Index Scan If loose index scan is applicable to a query, the EXPLAIN output shows Using index for group-by in the Extra column.
  • 23. 4. OPTIMIZING SPECIFIC TYPES OF QUERIES 4.3 Group by Loose Index Scan Condition: • The query is over a single table. • The GROUP BY names only columns that form a leftmost prefix of the index and no other columns. (If, instead of GROUP BY, the query has a DISTINCT clause, all distinct attributes refer to columns that form a leftmost prefix of the index.) • The only aggregate functions used in the select list (if any) are MIN() and MAX(), and all of them refer to the same column. The column must be in the index and must immediately follow the columns in the GROUP BY.
  • 24. 4. OPTIMIZING SPECIFIC TYPES OF QUERIES 4.3 Group by Loose Index Scan Condition: • Any other parts of the index than those from the GROUP BY referenced in the query must be constants (that is, they must be referenced in equalities with constants), except for the argument of MIN() or MAX() functions. • For columns in the index, full column values must be indexed, not just a prefix. For example, with c1 VARCHAR(20), INDEX (c1(10)), the index cannot be used for loose index scan.
  • 25. 4. OPTIMIZING SPECIFIC TYPES OF QUERIES 4.3 Group by Loose Index Scan Example: index(c1, c2, c3)
  • 26. 4. OPTIMIZING SPECIFIC TYPES OF QUERIES 4.3 Group by Loose Index Scan
  • 27. 4. OPTIMIZING SPECIFIC TYPES OF QUERIES 4.3 Group by Tight Index Scan
  • 28. 4. OPTIMIZING SPECIFIC TYPES OF QUERIES 4.3 UNION • UNION ALL is much faster than UNION • Move the WHERE inside each subquery Example: • Slow: (SELECT type, release FROM short_sleeve) UNION (SELECT type, release FROM long_sleeve); WHERE release >=2013; • Fast: (SELECT type, release FROM short_sleeve WHERE release >=2013) UNION (SELECT type, release FROM long_sleeve WHERE release >=2013);
  • 29. 5. INDEXING 5.1 B-Tree • When people talk about an index without mentioning a type, they’re probably referring to a B-Tree index • A B-Tree index speeds up data access because the storage engine doesn’t have to scan the whole table to find the desired data. Instead, it starts at the root node (not shown in this figure).
  • 31. 5. INDEXING 5.1 B-Tree • Consider InnoDB whose page size is 16KB and suppose we have an index on a integer column of size 4bytes, so a node can contain at most 16 * 1024 / 4 = 4096 keys, and a node can have at most 4097 children. • So for a B+tree of height 1, the root node has 4096 keys and the nodes at height 1 (the leaf nodes) have 4096 * 4097 = 16781312 key values.
  • 33. 5. INDEXING 5.1 B-Tree Benefits: • Match the full value: find a person named Cuba Allen who was born on 1960-01-01. • Match a leftmost prefix: find all people with the last name Allen • Match a column prefix: all people whose last names begin with J • Match a range of values: find people whose last names are between Allen and Barrymore • Match one part exactly and match a range on another part: find everyone whose last name is Allen and whose first name starts with the letter K (Kim, Karl, etc.)
  • 34. 5. INDEXING 5.1 B-Tree Limits: • They are not useful if the lookup does not start from the leftmost side of the indexed columns. • You can’t skip columns in the index.
  • 35. 5. INDEXING 5.2 Hash indexes A hash index is built on a hash table and is useful only for exact lookups that use every column in the index. For each row, the storage engine computes a hash code of the indexed columns, which is a small value that will probably differ from the hash codes computed for other rows with different key values. It stores the hash codes in the index and stores a pointer to each row in a hash table.
  • 37. 5. INDEXING 5.2 Hash indexes Limits: • MySQL can’t use the values in the index to avoid reading the rows • can’t use hash indexes for sorting • don’t support partial key matching • support only equality comparisons that use the =, IN(), and <=> operators • When there are collisions (multiple values with the same hash): slow
  • 38. 5. INDEXING 5.3 Indexing Strategies for High Performance Isolating the Column • MySQL generally can’t use indexes on columns unless the columns are isolated in the query. “Isolating” the column means it should not be part of an expression or be inside a function in the query. SELECT actor_id FROM sakila.actor WHERE actor_id + 1 = 5; • Here’s another example of a common mistake: mysql> SELECT ... WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(date_col) <= 10;
  • 39. 5. INDEXING 5.3 Indexing Strategies for High Performance Prefix Indexes and Index Selectivity • Sometimes you need to index very long character columns, which makes your indexes large and slow. The trick is to choose a prefix that’s long enough to give good selectivity, but short enough to save space.
  • 40. 5. INDEXING 5.3 Indexing Strategies for High Performance Choosing a Good Column Order • place the most selective columns first in the index. • Let’s use the following query as an example: SELECT * FROM payment WHERE staff_id = 2 AND customer_id = 584; • index on (staff_id, customer_id) OR (customer_id , staff_id) ???
  • 41. 5. INDEXING 5.3 Indexing Strategies for High Performance Choosing a Good Column Order
  • 42. 5. INDEXING 5.3 Indexing Strategies for High Performance Covering Indexes • MySQL can also use an index to retrieve a column’s data, so it doesn’t have to read the row at all. After all, the index’s leaf nodes contain the values they index; • MySQL can use only B-Tree indexes to cover queries. • The general rule is to choose the columns for filtering first (WHERE clause with equality conditions), then sorting/grouping (GROUP BY and ORDER BY clauses) and finally the data projection (SELECT clause).
  • 43. 5. INDEXING 5.3 Indexing Strategies for High Performance Covering Indexes
  • 44. 5. INDEXING 5.3 Indexing Strategies for High Performance Using Index Scans for Sorts MySQL has two ways to produce ordered results: it can use a sort operation, or it can scan an index in order. You can tell when MySQL plans to scan an index by looking for “index” in the type column in EXPLAIN.
  • 45. 5. INDEXING 5.3 Indexing Strategies for High Performance Using Index Scans for Sorts Index on (rental_date, inventory_id, Customer_id)
  • 46. 5. INDEXING 5.3 Indexing Strategies for High Performance Using Index Scans for Sorts Index on (rental_date, inventory_id, customer_id)
  • 47. 5. INDEXING 5.3 Indexing Strategies for High Performance Using Index Scans for Sorts Index on (rental_date, inventory_id, Customer_id)
  • 48. 6. PARTITIONING 6.1 scenarios: • When the table is much too big to fit in memory, or when you have “hot” rows at the end of a table that has lots of historical data. • Partitioned data is easier to maintain than nonpartitioned data. For example, it’s easier to discard old data by dropping an entire partition, which you can do quickly. You can also optimize, check, and repair individual partitions. • If you really need to, you can back up and restore individual partitions, which is very helpful with extremely large datasets.
  • 49. 6. PARTITIONING 6.2 How Partitioning Works: When you query a partitioned table, the partitioning layer opens and locks all of the underlying partitions, the query optimizer determines whether any of the partitions can be ignored (pruned), and then the partitioning layer forwards the handler API calls to the storage engine that manages the partitions.
  • 50. 6. PARTITIONING 6.3 Types of Partitioning: • MySQL supports several types of partitioning. The most common type we’ve seen used is range partitioning.

Editor's Notes

  • #9: https://dev.mysql.com/doc/refman/5.7/en/explain-output.html
  • #13: https://dev.mysql.com/doc/refman/5.7/en/explain-output.html#explain-extra-information
  • #14: https://dev.mysql.com/doc/refman/5.7/en/explain-output.html#explain-extra-information
  • #15: https://dev.mysql.com/doc/refman/5.7/en/explain-output.html#explain-extra-information
  • #16: https://dev.mysql.com/doc/refman/5.7/en/explain-output.html#explain-extra-information
  • #17: https://dev.mysql.com/doc/refman/5.7/en/explain-output.html#explain-extra-information
  • #18: https://dev.mysql.com/doc/refman/5.7/en/explain-output.html#explain-join-types
  • #21: This “deferred join” works because it lets the server examine as little data as possible in an index without accessing rows, and then, once the desired rows are found, join them against the full table to retrieve the other columns from the row.
  • #29: https://www.iheavy.com/2013/06/13/how-to-optimize-mysql-union-for-high-speed/
  • #31: https://www.cs.usfca.edu/~galles/visualization/BPlusTree.html http://www.ovaistariq.net/733/understanding-btree-indexes-and-how-they-impact-performance/#.WsMCtIhuaUk
  • #40: http://code-epicenter.com/prefixed-index-in-mysql-database/
  • #43: https://community.toadworld.com/platforms/mysql/b/weblog/archive/2017/04/06/speed-up-your-queries-using-the-covering-index-in-mysql
  • #45: If MySQL isn’t using the index to cover the query, it will have to look up each row it finds in the index. This is basically random I/O, so reading data in index order is usually much slower than a sequential table scan, especially for I/Obound workloads.
  • #46: Ordering the results by the index works only when the index’s order is exactly the same as the ORDER BY clause and all columns are sorted in the same direction (ascending or descending).12 If the query joins multiple tables, it works only when all columns in the ORDER BY clause refer to the first table. The ORDER BY clause also has the same limitation as lookup queries: it needs to form a leftmost prefix of the index. In all other cases, MySQL uses a sort.