New features in MariaDB/MySQL query optimizer
Sergei Petrunia, MariaDB
MySQL/MariaDB optimizer development
● Some features have common heritage
● Big releases:
– MariaDB 5.3/5.5
– MySQL 5.6
– (upcoming) MariaDB 10.0
New optimizer features
● Subqueries
– FROM
– IN
– Others
● Batched Key Access (MRR)
● Index Condition Pushdown
● Extended Keys
● EXPLAIN UPDATE/DELETE
● PERFORMANCE_SCHEMA
● Engine-independent statistics
● InnoDB persistent statistics
Subqueries in MySQL
● Subqueries are practically unusable
● e.g. Facebook disabled them in the parser
● Reason - “naive execution”.
Naive subquery execution
● For IN (SELECT ...) subqueries:
select * from hotel
where
  hotel.country='USA' and
  hotel.name IN (select hotel_stays.hotel
                 from hotel_stays
                 where hotel_stays.customer='John Smith')
● Naive execution:
for (each hotel in USA) {
  if (john smith stayed here) {
    …
  }
}
● Slow!
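To make the cost of the naive loop concrete, here is a minimal Python sketch. It is not the server's implementation: the rows are made up, and the second function shows just one possible alternative (evaluating the subquery once and keeping its result in a set). The slide does not say which strategy the new optimizers pick for this query.

```python
# Toy data; table and column names follow the slide, the rows are invented.
hotels = [
    {"name": "Grand", "country": "USA"},
    {"name": "Plaza", "country": "USA"},
    {"name": "Ritz", "country": "France"},
]
hotel_stays = [
    {"hotel": "Plaza", "customer": "John Smith"},
    {"hotel": "Ritz", "customer": "John Smith"},
    {"hotel": "Grand", "customer": "Jane Doe"},
]


def naive(hotels, stays, customer):
    """Re-run the subquery for every outer row; returns (result, inner rows scanned)."""
    scanned = 0
    result = []
    for h in hotels:
        if h["country"] != "USA":
            continue
        found = False
        for s in stays:  # the inner scan is repeated for each USA hotel
            scanned += 1
            if s["customer"] == customer and s["hotel"] == h["name"]:
                found = True
                break
        if found:
            result.append(h["name"])
    return result, scanned


def materialized(hotels, stays, customer):
    """Evaluate the subquery once into a set, then probe it (assumed alternative)."""
    scanned = 0
    stayed = set()
    for s in stays:
        scanned += 1
        if s["customer"] == customer:
            stayed.add(s["hotel"])
    result = [h["name"] for h in hotels
              if h["country"] == "USA" and h["name"] in stayed]
    return result, scanned


print(naive(hotels, hotel_stays, "John Smith"))         # (['Plaza'], 4)
print(materialized(hotels, hotel_stays, "John Smith"))  # (['Plaza'], 3)
```

With three stays the difference is tiny, but the naive count grows with (USA hotels × stays), while the one-pass count grows only with the number of stays.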
Naive subquery execution (2)
● For FROM (SELECT …) subqueries:
select *
from
  (select *
   from hotel
   where hotel.rooms > 500
  ) as big_hotel
where
  big_hotel.nearest_airport='AMS';
● Naive execution:
1. Retrieve all hotels with > 500 rooms, store them in a temporary table big_hotel;
2. Search in big_hotel for hotels near AMS.
● Slow!
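The two naive steps can be compared with a single-pass evaluation in a small Python sketch. The rows are invented, and "merging" the derived table into the outer query is shown only as one way an optimizer can avoid the temporary table; the slide does not state which technique is applied.

```python
# Invented rows; column names follow the slide's query.
hotels = [
    {"name": "A", "rooms": 600, "nearest_airport": "AMS"},
    {"name": "B", "rooms": 800, "nearest_airport": "JFK"},
    {"name": "C", "rooms": 100, "nearest_airport": "AMS"},
    {"name": "D", "rooms": 900, "nearest_airport": "JFK"},
]


def materialize_then_filter(rows):
    """Naive plan: fill a temporary table first, then search it."""
    big_hotel = [r for r in rows if r["rooms"] > 500]  # temporary table
    result = [r["name"] for r in big_hotel if r["nearest_airport"] == "AMS"]
    return result, len(big_hotel)  # rows written to the temporary table


def merged(rows):
    """Assumed alternative: both conditions checked in one pass, no temp table."""
    result = [r["name"] for r in rows
              if r["rooms"] > 500 and r["nearest_airport"] == "AMS"]
    return result, 0


print(materialize_then_filter(hotels))  # (['A'], 3)
print(merged(hotels))                   # (['A'], 0)
```

Once the conditions sit in one query block, an index on the airport column (if one exists) could also be used, which the temporary table rules out.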
New subquery optimizations
● Handle IN (SELECT ...)
● Handle FROM (SELECT …)
● Handle a lot of cases
● Comparison with PostgreSQL
– ~1000x slower before
– ~same order of magnitude now
● Releases
– MySQL 6.0
– MariaDB 5.5
  ● Sheeri Kritzer @ Mozilla seems happy with this one
– MySQL 5.6
  ● Subset of MariaDB 5.5's features
Subquery optimizations - summary
● Subqueries were generally unusable before MariaDB 5.3/5.5
● “Core” subquery optimizations are in
– MariaDB 5.3/5.5
– MySQL 5.6
● MariaDB has extra additions
● Further information:
https://kb.askmonty.org/en/subquery-optimizations/
Batched Key Access - background
● Big, IO-bound joins were slow
– DBT-3 benchmark could not finish*
● Reason?
● Nested Loops join hits the second table at random locations.
Batched Key Access idea
[Figure: Nested Loops Join vs. Batched Key Access]
Speedup reasons
● Fewer disk head movements
● Cache-friendliness
● Prefetch-friendliness
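A rough Python sketch of why batching helps. This is a simplification, not the server's algorithm: row positions stand in for rowids, a "page switch" stands in for a disk seek, and `rows_per_page`, `batch_size` and the lookup sequence are made-up illustration values.

```python
def page_switches(positions, rows_per_page=4):
    """Count how often consecutive reads land on a different page."""
    switches = 0
    last_page = None
    for pos in positions:
        page = pos // rows_per_page
        if page != last_page:
            switches += 1
            last_page = page
    return switches


def bka_order(lookup_positions, batch_size):
    """Collect lookups into batches and read each batch in sorted order."""
    ordered = []
    for start in range(0, len(lookup_positions), batch_size):
        batch = lookup_positions[start:start + batch_size]
        ordered.extend(sorted(batch))
    return ordered


# Positions in the second table, in the order the first table produces them.
lookups = [13, 2, 9, 14, 1, 8, 12, 3]

print(page_switches(lookups))                # 8: nested loops, every read jumps
print(page_switches(bka_order(lookups, 4)))  # 6: small batches help a little
print(page_switches(bka_order(lookups, 8)))  # 3: one batch, one pass over the pages
```

The larger the batch, the closer the access pattern gets to a sequential pass, which is consistent with the buffer-size dependence reported in the benchmark slides.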
Batched Key Access benchmark
set join_cache_level=6; -- enable BKA
select max(l_extendedprice)
from orders, lineitem
where
l_orderkey=o_orderkey and
o_orderdate between $DATE1 and $DATE2
Run with
● Various join_buffer_size settings
● Various sizes of the $DATE1...$DATE2 range
Batched Key Access benchmark (2)
[Figure: BKA join performance depending on buffer size. X axis: buffer size, bytes. Y axis: query time, sec. Series: query_size=1, 2, 3, each regular and with BKA. Annotations: performance without BKA; performance with BKA, given sufficient buffer size.]
Batched Key Access summary
● Optimization for big, IO-bound joins
– Orders-of-magnitude speedups
● Available in
– MariaDB 5.3/5.5 (more advanced)
– MySQL 5.6
● Not fully automatic yet
– Needs to be manually enabled
– Need to set buffer sizes.
Index Condition Pushdown
● A new feature in MariaDB 5.3/ MySQL 5.6
alter table lineitem add index s_r (l_shipdate, l_receiptdate);
select count(*) from lineitem
where
  l_shipdate between '1993-01-01' and '1993-02-01' and
  datediff(l_receiptdate,l_shipdate) > 25 and
  l_quantity > 40
+----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+
| 1 | SIMPLE | lineitem | range | s_r | s_r | 4 | NULL | 158854 | Using index condition; Using where |
+----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+
1. Read index records in the range
   l_shipdate between '1993-01-01' and '1993-02-01'
2. Check the index condition (← New! Filters out records before table rows are read)
   datediff(l_receiptdate,l_shipdate) > 25
3. Read full table rows
4. Check the WHERE condition
   l_quantity > 40
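The four steps can be mimicked in a few lines of Python to show where the saving comes from. This is only a sketch with invented rows: the list `index` plays the role of the `s_r` index, and a lookup in `table` plays the role of reading a full row.

```python
from datetime import date

# Invented rows: (l_shipdate, l_receiptdate, l_quantity)
table = [
    (date(1993, 1, 5), date(1993, 1, 10), 45),
    (date(1993, 1, 10), date(1993, 2, 10), 50),
    (date(1993, 1, 20), date(1993, 2, 20), 10),
    (date(1993, 3, 1), date(1993, 4, 15), 60),
]
# Stand-in for index s_r: entries (l_shipdate, l_receiptdate, rowid), sorted.
index = sorted((ship, receipt, rowid)
               for rowid, (ship, receipt, _) in enumerate(table))


def count_rows(use_icp):
    """Return (matching rows, full table rows read)."""
    lo, hi = date(1993, 1, 1), date(1993, 2, 1)
    row_reads = 0
    matches = 0
    for ship, receipt, rowid in index:
        if not (lo <= ship <= hi):               # step 1: range on l_shipdate
            continue
        if use_icp and (receipt - ship).days <= 25:
            continue                              # step 2: index condition fails
        ship_t, receipt_t, quantity = table[rowid]  # step 3: read the full row
        row_reads += 1
        if (receipt_t - ship_t).days > 25 and quantity > 40:  # step 4: WHERE
            matches += 1
    return matches, row_reads


print(count_rows(use_icp=False))  # (1, 3)
print(count_rows(use_icp=True))   # (1, 2)
```

The result is the same either way; only the number of full-row reads drops, because the `datediff` condition can be decided from the two indexed columns alone.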
Index Condition Pushdown - conclusions
Summary
● Applicable to any index-based access (ref, range, etc)
● Checks parts of WHERE after reading the index
● Reduces number of table records to be read
● Speedup can be like in “Using index”
– Great for IO-bound load (5x, 10x)
– Some for CPU-bound workload (2x)
Conclusions
● Have a selective condition on column?
– Put the column into index, at the end.
Extended keys
● Secondary indexes in InnoDB have an invisible “tail”
CREATE TABLE tbl (
  pk1 sometype,
  pk2 sometype,
  ...
  col1 sometype,
  col2 sometype,
  ...
  KEY indexA (col1, col2),
  ...
  PRIMARY KEY (pk1, pk2)
) ENGINE=InnoDB
indexA: col1 col2 pk1 pk2
● Before: the optimizer has limited support for “tail” columns
– 'Using index' supports it
– ORDER BY col1, col2, pk1 supports it
● After MariaDB 5.5/ MySQL 5.6
– all parts of the optimizer (ref access, range access, etc) can use the “tail”
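A small Python sketch of what using the “tail” buys. The entries are invented, and a sorted list with binary search stands in for the B-tree: a lookup that can only use (col1, col2) has to walk every entry with that prefix, while a lookup that can also use pk1 lands directly on the entry it needs.

```python
from bisect import bisect_left

# Entries of indexA as stored with the primary key appended:
# (col1, col2, pk1, pk2), kept in sorted order. Values are made up.
index_a = sorted([
    (1, 1, 10, 1), (1, 1, 20, 1), (1, 1, 30, 1), (1, 1, 40, 1),
    (1, 2, 10, 1), (2, 1, 10, 1),
])


def examined(prefix):
    """Count index entries read by a lookup on the given key prefix."""
    start = bisect_left(index_a, prefix)
    n = 0
    for entry in index_a[start:]:
        if entry[:len(prefix)] != prefix:
            break
        n += 1
    return n


# WHERE col1=1 AND col2=1 AND pk1=30
print(examined((1, 1)))      # 4: only (col1, col2) usable, pk1 checked afterwards
print(examined((1, 1, 30)))  # 1: the tail column pk1 is part of the lookup key
```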
Better EXPLAIN in MySQL 5.6
● EXPLAIN for UPDATE/DELETE/INSERT … SELECT
– shows the query plan for finding the records to update/delete
mysql> explain update customer set c_acctbal = c_acctbal - 100 where c_custkey=12354;
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | customer | range | PRIMARY | PRIMARY | 4 | NULL | 1 | Using where |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
● EXPLAIN FORMAT=JSON
– Produces [big] JSON output
– Shows more information:
● Shows conditions attached to tables
● Shows whether “Using temporary; using filesort” is done to handle
GROUP BY or ORDER BY.
● Shows where subqueries are attached
– No other known additions
– Will be in MariaDB 10.0
The most useful addition!
EXPLAIN FORMAT=JSON
What are the “conditions attached to tables”?
explain
select
count(*)
from
orders, customer
where
customer.c_custkey=orders.o_custkey and
customer.c_mktsegment='BUILDING' and
orders.o_totalprice > customer.c_acctbal and
orders.o_orderpriority='1-URGENT'
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
| 1 | SIMPLE | customer | ALL | PRIMARY | NULL | NULL | NULL | 1509871 | Using where |
| 1 | SIMPLE | orders | ref | i_o_custkey | i_o_custkey | 5 | dbt3sf10.customer.c_custkey | 7 | Using where |
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
EXPLAIN FORMAT=JSON (2)
{
"query_block": {
"select_id": 1,
"nested_loop": [
{
"table": {
"table_name": "customer",
"access_type": "ALL",
"possible_keys": [
"PRIMARY"
],
"rows": 1509871,
"filtered": 100,
"attached_condition": "(`dbt3sf10`.`customer`.`c_mktsegment` = 'BUILDING')"
}
},
{
"table": {
"table_name": "orders",
"access_type": "ref",
"possible_keys": [
"i_o_custkey"
],
"key": "i_o_custkey",
"used_key_parts": [
"o_custkey"
],
"key_length": "5",
"ref": [
"dbt3sf10.customer.c_custkey"
],
"rows": 7,
"filtered": 100,
"attached_condition": "((`dbt3sf10`.`orders`.`o_orderpriority` = '1-URGENT') and (`dbt3sf10`.`orders`.`o_totalprice` > `dbt3sf10`.`customer`.`c_acctbal`))"
}
}
]
}
}
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
| 1 | SIMPLE | customer | ALL | PRIMARY | NULL | NULL | NULL | 1509871 | Using where |
| 1 | SIMPLE | orders | ref | i_o_custkey | i_o_custkey | 5 | dbt3sf10.customer.c_custkey | 7 | Using where |
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
EXPLAIN ANALYZE (kind of)
● Does EXPLAIN match the reality?
● Where is most of the time spent?
● MySQL/MariaDB don't have “EXPLAIN ANALYZE” ...
select
count(*)
from
orders, customer
where
customer.c_custkey=orders.o_custkey and
customer.c_mktsegment='BUILDING' and orders.o_orderpriority='1-URGENT'
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
| 1 | SIMPLE | customer | ALL | PRIMARY | NULL | NULL | NULL | 149415 | Using where |
| 1 | SIMPLE | orders | ref | i_o_custkey | i_o_custkey | 5 | customer.c_custkey | 7 | Using index |
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
Traditional solution: Status variables
Problems:
● Only #rows counters
● all tables are counted together
mysql> flush status;
Query OK, 0 rows affected (0.00 sec)
mysql> {run query}
mysql> show status like 'Handler%';
+----------------------------+--------+
| Variable_name | Value |
+----------------------------+--------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_icp_attempts | 0 |
| Handler_icp_match | 0 |
| Handler_mrr_init | 0 |
| Handler_mrr_key_refills | 0 |
| Handler_mrr_rowid_refills | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 0 |
| Handler_read_key | 30142 |
| Handler_read_last | 0 |
| Handler_read_next | 303959 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_deleted | 0 |
| Handler_read_rnd_next | 150001 |
| Handler_rollback | 0 |
...
Newer solution: userstat
● In Facebook patch, Percona, MariaDB:
mysql> set global userstat=1;
mysql> flush table_statistics;
mysql> flush index_statistics;
mysql> {query}
mysql> show table_statistics;
+--------------+------------+-----------+--------------+-------------------------+
| Table_schema | Table_name | Rows_read | Rows_changed | Rows_changed_x_#indexes |
+--------------+------------+-----------+--------------+-------------------------+
| dbt3sf1 | orders | 303959 | 0 | 0 |
| dbt3sf1 | customer | 150000 | 0 | 0 |
+--------------+------------+-----------+--------------+-------------------------+
mysql> show index_statistics;
+--------------+------------+-------------+-----------+
| Table_schema | Table_name | Index_name | Rows_read |
+--------------+------------+-------------+-----------+
| dbt3sf1 | orders | i_o_custkey | 303959 |
+--------------+------------+-------------+-----------+
● Counters are per-table
– Ok as long as you don't have self-joins
● Overhead is negligible
● Counters are server-wide (other queries affect them, too)
Latest addition: PERFORMANCE_SCHEMA
● Allows measuring the *time* spent reading each table
● Has some visible overhead (Facebook's tests: 7%)
● Counters are system-wide
● Still no luck with self-joins
mysql> truncate performance_schema.table_io_waits_summary_by_table;
mysql> {query}
mysql> select
object_schema,
object_name,
count_read,
sum_timer_read, -- this is picoseconds
sum_timer_read / (1000*1000*1000*1000) as read_seconds -- this is seconds
from
performance_schema.table_io_waits_summary_by_table
where
object_schema = 'dbt3sf1' and object_name in ('orders','customer');
+---------------+-------------+------------+----------------+--------------+
| object_schema | object_name | count_read | sum_timer_read | read_seconds |
+---------------+-------------+------------+----------------+--------------+
| dbt3sf1 | orders | 334101 | 5739345397323 | 5.7393 |
| dbt3sf1 | customer | 150001 | 1273653046701 | 1.2737 |
+---------------+-------------+------------+----------------+--------------+
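The picosecond arithmetic, and the answer to “where is most of the time spent?”, can be checked with a short Python sketch. The two timer values are copied from the output above; the helper names are made up for illustration.

```python
PICOSECONDS_PER_SECOND = 10**12

# sum_timer_read values from the query output, in picoseconds
sum_timer_read = {"orders": 5739345397323, "customer": 1273653046701}


def read_seconds(picoseconds):
    """Convert a PERFORMANCE_SCHEMA timer value to seconds."""
    return picoseconds / PICOSECONDS_PER_SECOND


def time_share(timers):
    """Fraction of the total read time spent on each table."""
    total = sum(timers.values())
    return {name: value / total for name, value in timers.items()}


for name, value in sum_timer_read.items():
    print(name, round(read_seconds(value), 4))
# orders 5.7393
# customer 1.2737

print(round(time_share(sum_timer_read)["orders"], 2))  # 0.82
```

So roughly four fifths of the measured read time in this run went to `orders`.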
What are table/index statistics?
select
count(*)
from
customer, orders
where
customer.c_custkey=orders.o_custkey and customer.c_mktsegment='BUILDING';
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
| 1 | SIMPLE | customer | ALL | PRIMARY | NULL | NULL | NULL | 148305 | Using where |
| 1 | SIMPLE | orders | ref | i_o_custkey | i_o_custkey | 5 | customer.c_custkey | 7 | Using index |
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
MariaDB > show table status like 'orders'\G
*************************** 1. row ***************************
Name: orders
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 1495152
.............
MariaDB > show keys from orders where key_name='i_o_custkey'\G
*************************** 1. row ***************************
Table: orders
Non_unique: 1
Key_name: i_o_custkey
Seq_in_index: 1
Column_name: o_custkey
Collation: A
Cardinality: 212941
Sub_part: NULL
.................
1495152 / 212941 ≈ 7
“There are on average 7 orders for a given c_custkey”
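The same arithmetic as a Python sketch, using the numbers shown in the outputs above. Multiplying the two `rows` estimates is only a rough picture of how many row combinations the plan expects to touch; the variable names are made up for illustration.

```python
def rows_per_key(table_rows, index_cardinality):
    """Average number of rows sharing one key value."""
    return table_rows / index_cardinality


orders_rows = 1495152          # "Rows" from SHOW TABLE STATUS
custkey_cardinality = 212941   # "Cardinality" from SHOW KEYS
avg_orders = rows_per_key(orders_rows, custkey_cardinality)

customer_rows = 148305         # "rows" for customer in the EXPLAIN output
join_estimate = customer_rows * round(avg_orders)

print(round(avg_orders, 2))  # 7.02
print(join_estimate)         # 1038135
```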
The problem with index statistics and InnoDB
MySQL 5.5, InnoDB
● Statistics are calculated on-the-fly
– When the table is opened (server restart, DDL)
– When a sufficient number of records have been updated
– ...
● Calculation uses random sampling
– @@innodb_stats_sample_pages
● Result:
– Statistics change without warning
  => Query plans change without warning
● For example, DBT-3 benchmark
– 22 analytics queries
– Plans-per-query: avg=2.8, max=7.
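Why sampling makes estimates jump around can be seen in a toy Python model. This is not InnoDB's actual estimator: the page layout, the scaling formula and the sample size of 8 pages are all assumptions chosen for illustration.

```python
import random


def make_pages(num_pages=100, rows_per_page=10, seed=0):
    """Build a skewed index: some pages hold one repeated key, others unique keys."""
    rng = random.Random(seed)
    pages = []
    next_key = 0
    for _ in range(num_pages):
        if rng.random() < 0.5:
            pages.append([next_key] * rows_per_page)
            next_key += 1
        else:
            pages.append(list(range(next_key, next_key + rows_per_page)))
            next_key += rows_per_page
    return pages


def exact_distinct(pages):
    return len({key for page in pages for key in page})


def estimate_distinct(pages, sample_pages, seed):
    """Scale the distinct count of a random page sample up to the whole index."""
    rng = random.Random(seed)
    sample = rng.sample(pages, sample_pages)
    distinct_in_sample = len({key for page in sample for key in page})
    return distinct_in_sample * len(pages) / sample_pages


pages = make_pages()
print("exact:", exact_distinct(pages))
for seed in range(5):
    # Each "recalculation" draws a different sample and may give a different estimate.
    print("estimate:", estimate_distinct(pages, sample_pages=8, seed=seed))
```

Whenever two recalculations land on different sides of a cost threshold, the optimizer can switch plans even though the data has not changed.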
Persistent table statistics
Persistent statistics v1
● Percona Server 5.5 (ported to MariaDB 5.5)
– Need to enable it: innodb_use_sys_stats_table=1
● Statistics are stored inside InnoDB
– User-visible through information_schema.innodb_sys_stats (read-only)
● Setting innodb_stats_auto_update=OFF prevents unexpected updates
Persistent statistics v2
● MySQL 5.6
– Enabled by default: innodb_stats_persistent=1
● Stored in regular InnoDB tables
– mysql.innodb_table_stats, mysql.innodb_index_stats
● Setting innodb_stats_auto_recalc=OFF prevents unexpected updates
● Can also specify persistence/auto-recalc as a table option
Persistent table statistics - summary
● Percona, then MySQL
– Made statistics persistent
– Disallowed automatic updates
● Remaining issue #1: it's still random sampling
– DBT-3 benchmark, scale=30
– Re-ran EXPLAINs for benchmark queries
– Counted different query plans
● Remaining issue #2: limited amount of statistics
– Only on index columns
– Only AVG(#different_values)
Upcoming: Engine-independent statistics
MariaDB 10.0: Engine-independent statistics
● Collected/used on SQL layer
● No auto updates, only ANALYZE TABLE
– 100% precise statistics
● More statistics
– Index statistics (like before)
– Table statistics (like before)
– Column statistics
● MIN/MAX values
● Number of NULL / not NULL values
● Histograms
● => Optimizer will be smarter and more reliable
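What a histogram adds over MIN/MAX values can be sketched in Python. The column values are invented and the bucket layout is a simplified equal-height histogram, not MariaDB's storage format; the point is only that a skewed column is estimated far better from bucket bounds than from a uniform min/max assumption.

```python
from bisect import bisect_right


def build_histogram(values, buckets):
    """Equal-height histogram: the upper bound of each bucket."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i + 1) * n // buckets - 1] for i in range(buckets)]


def selectivity_histogram(bounds, x):
    """Estimated fraction of rows with value <= x, counting whole buckets."""
    return bisect_right(bounds, x) / len(bounds)


def selectivity_minmax(lo, hi, x):
    """Same estimate assuming values are spread uniformly between MIN and MAX."""
    return (x - lo) / (hi - lo)


values = [1] * 6 + [2, 3, 50, 100]   # skewed column: most rows hold 1
bounds = build_histogram(values, buckets=5)

print(bounds)                                                # [1, 1, 1, 3, 100]
print(selectivity_histogram(bounds, 3))                      # 0.8
print(round(selectivity_minmax(min(values), max(values), 3), 3))  # 0.02
print(sum(v <= 3 for v in values) / len(values))             # 0.8 (actual)
```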
Conclusions
● Lots of new query optimizer features recently
– Subqueries now just work
– Big joins are much faster
  ● Need to turn it on
– More diagnostics
● Even more is coming
● Releases with features
– MariaDB 5.5
– MySQL 5.6
– (upcoming) MariaDB 10.0
Thanks
Q & A

More Related Content

What's hot (20)

PDF
Mysqlconf2013 mariadb-cassandra-interoperability
Sergey Petrunya
 
PDF
Playing with the CONNECT storage engine
Federico Razzoli
 
PDF
Introduction into MySQL Query Tuning for Dev[Op]s
Sveta Smirnova
 
PDF
Character Encoding - MySQL DevRoom - FOSDEM 2015
mushupl
 
PDF
Performance Schema for MySQL Troubleshooting
Sveta Smirnova
 
PDF
MySQL and MariaDB Backups
Federico Razzoli
 
PDF
Query Optimizer in MariaDB 10.4
Sergey Petrunya
 
PDF
Optimizer Trace Walkthrough
Sergey Petrunya
 
PDF
Introducing new SQL syntax and improving performance with preparse Query Rewr...
Sveta Smirnova
 
PDF
New features in Performance Schema 5.7 in action
Sveta Smirnova
 
PDF
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
Sergey Petrunya
 
PDF
Using histograms to get better performance
Sergey Petrunya
 
PDF
Optimizer features in recent releases of other databases
Sergey Petrunya
 
PDF
Performance Schema for MySQL Troubleshooting
Sveta Smirnova
 
PDF
Efficient Pagination Using MySQL
Evan Weaver
 
PDF
MySQL Query tuning 101
Sveta Smirnova
 
PDF
0888 learning-mysql
sabir18
 
PDF
Why Use EXPLAIN FORMAT=JSON?
Sveta Smirnova
 
PPTX
MySQLinsanity
Stanley Huang
 
PDF
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)
Sergey Petrunya
 
Mysqlconf2013 mariadb-cassandra-interoperability
Sergey Petrunya
 
Playing with the CONNECT storage engine
Federico Razzoli
 
Introduction into MySQL Query Tuning for Dev[Op]s
Sveta Smirnova
 
Character Encoding - MySQL DevRoom - FOSDEM 2015
mushupl
 
Performance Schema for MySQL Troubleshooting
Sveta Smirnova
 
MySQL and MariaDB Backups
Federico Razzoli
 
Query Optimizer in MariaDB 10.4
Sergey Petrunya
 
Optimizer Trace Walkthrough
Sergey Petrunya
 
Introducing new SQL syntax and improving performance with preparse Query Rewr...
Sveta Smirnova
 
New features in Performance Schema 5.7 in action
Sveta Smirnova
 
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
Sergey Petrunya
 
Using histograms to get better performance
Sergey Petrunya
 
Optimizer features in recent releases of other databases
Sergey Petrunya
 
Performance Schema for MySQL Troubleshooting
Sveta Smirnova
 
Efficient Pagination Using MySQL
Evan Weaver
 
MySQL Query tuning 101
Sveta Smirnova
 
0888 learning-mysql
sabir18
 
Why Use EXPLAIN FORMAT=JSON?
Sveta Smirnova
 
MySQLinsanity
Stanley Huang
 
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)
Sergey Petrunya
 

Viewers also liked (11)

PDF
Эволюция репликации в MySQL и MariaDB
Sergey Petrunya
 
PDF
Илья Космодемьянский (PostgreSQL-Consulting.com)
Ontico
 
PDF
Сергей Житинский, Александр Чистяков (Git in Sky)
Ontico
 
PPTX
MyRocks: табличный движок для MySQL на основе RocksDB
Sergey Petrunya
 
PDF
Павел Лузанов, Postgres Professional. «PostgreSQL для пользователей Oracle»
Mail.ru Group
 
PDF
Профилирование кода на C/C++ в *nix-системах / Александр Алексеев (Postgres P...
Ontico
 
PDF
NVMf: 5 млн IOPS по сети своими руками / Андрей Николаенко (IBS)
Ontico
 
PDF
ZSON, или прозрачное сжатие JSON
Aleksander Alekseev
 
PDF
Профилирование кода на C/C++ в *nix системах
Aleksander Alekseev
 
PDF
Функциональное программирование - Александр Алексеев
Aleksander Alekseev
 
PDF
Новые технологии репликации данных в PostgreSQL - Александр Алексеев
Aleksander Alekseev
 
Эволюция репликации в MySQL и MariaDB
Sergey Petrunya
 
Илья Космодемьянский (PostgreSQL-Consulting.com)
Ontico
 
Сергей Житинский, Александр Чистяков (Git in Sky)
Ontico
 
MyRocks: табличный движок для MySQL на основе RocksDB
Sergey Petrunya
 
Павел Лузанов, Postgres Professional. «PostgreSQL для пользователей Oracle»
Mail.ru Group
 
Профилирование кода на C/C++ в *nix-системах / Александр Алексеев (Postgres P...
Ontico
 
NVMf: 5 млн IOPS по сети своими руками / Андрей Николаенко (IBS)
Ontico
 
ZSON, или прозрачное сжатие JSON
Aleksander Alekseev
 
Профилирование кода на C/C++ в *nix системах
Aleksander Alekseev
 
Функциональное программирование - Александр Алексеев
Aleksander Alekseev
 
Новые технологии репликации данных в PostgreSQL - Александр Алексеев
Aleksander Alekseev
 
Ad

Similar to New features-in-mariadb-and-mysql-optimizers (20)

PDF
MariaDB 10 Tutorial - 13.11.11 - Percona Live London
Ivan Zoratti
 
PDF
Introduction to MySQL Query Tuning for Dev[Op]s
Sveta Smirnova
 
PDF
My sql 56_roadmap_april2012
sqlhjalp
 
PDF
New optimizer features in MariaDB releases before 10.12
Sergey Petrunya
 
PDF
Introduction into MySQL Query Tuning
Sveta Smirnova
 
PDF
Understanding query-execution806
yubao fu
 
PDF
Understanding Query Execution
webhostingguy
 
PPTX
Работа с индексами - лучшие практики для MySQL 5.6, Петр Зайцев (Percona)
Ontico
 
PDF
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
MariaDB plc
 
PDF
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
MariaDB plc
 
PDF
Advanced Query Optimizer Tuning and Analysis
MYXPLAIN
 
PDF
Query optimization: from 0 to 10 (and up to 5.7)
Jaime Crespo
 
PDF
Covering indexes
MYXPLAIN
 
PDF
Practical my sql performance optimization
Marian Marinov
 
PDF
MySQL Indexing
BADR
 
PDF
How MySQL can boost (or kill) your application
Federico Razzoli
 
PDF
Optimizer overviewoow2014
Mysql User Camp
 
PPTX
What's New In MySQL 5.6
Abdul Manaf
 
PDF
MySQL Indexes and Histograms - RMOUG Training Days 2022
Dave Stokes
 
PDF
Longhorn PHP - MySQL Indexes, Histograms, Locking Options, and Other Ways to ...
Dave Stokes
 
MariaDB 10 Tutorial - 13.11.11 - Percona Live London
Ivan Zoratti
 
Introduction to MySQL Query Tuning for Dev[Op]s
Sveta Smirnova
 
My sql 56_roadmap_april2012
sqlhjalp
 
New optimizer features in MariaDB releases before 10.12
Sergey Petrunya
 
Introduction into MySQL Query Tuning
Sveta Smirnova
 
Understanding query-execution806
yubao fu
 
Understanding Query Execution
webhostingguy
 
Работа с индексами - лучшие практики для MySQL 5.6, Петр Зайцев (Percona)
Ontico
 
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
MariaDB plc
 
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
MariaDB plc
 
Advanced Query Optimizer Tuning and Analysis
MYXPLAIN
 
Query optimization: from 0 to 10 (and up to 5.7)
Jaime Crespo
 
Covering indexes
MYXPLAIN
 
Practical my sql performance optimization
Marian Marinov
 
MySQL Indexing
BADR
 
How MySQL can boost (or kill) your application
Federico Razzoli
 
Optimizer overviewoow2014
Mysql User Camp
 
What's New In MySQL 5.6
Abdul Manaf
 
MySQL Indexes and Histograms - RMOUG Training Days 2022
Dave Stokes
 
Longhorn PHP - MySQL Indexes, Histograms, Locking Options, and Other Ways to ...
Dave Stokes
 
Ad

More from Sergey Petrunya (19)

PDF
MariaDB's join optimizer: how it works and current fixes
Sergey Petrunya
 
PDF
Improved histograms in MariaDB 10.8
Sergey Petrunya
 
PDF
Improving MariaDB’s Query Optimizer with better selectivity estimates
Sergey Petrunya
 
PDF
JSON Support in MariaDB: News, non-news and the bigger picture
Sergey Petrunya
 
PDF
ANALYZE for Statements - MariaDB's hidden gem
Sergey Petrunya
 
PDF
MariaDB 10.4 - что нового
Sergey Petrunya
 
PDF
MariaDB Optimizer - further down the rabbit hole
Sergey Petrunya
 
PDF
Lessons for the optimizer from running the TPC-DS benchmark
Sergey Petrunya
 
PDF
MariaDB 10.3 Optimizer - where does it stand
Sergey Petrunya
 
PDF
MyRocks in MariaDB | M18
Sergey Petrunya
 
PDF
New Query Optimizer features in MariaDB 10.3
Sergey Petrunya
 
PDF
MyRocks in MariaDB
Sergey Petrunya
 
PDF
Histograms in MariaDB, MySQL and PostgreSQL
Sergey Petrunya
 
PDF
Say Hello to MyRocks
Sergey Petrunya
 
PDF
Common Table Expressions in MariaDB 10.2
Sergey Petrunya
 
PDF
MyRocks in MariaDB: why and how
Sergey Petrunya
 
PDF
MariaDB 10.1 - что нового.
Sergey Petrunya
 
PDF
Window functions in MariaDB 10.2
Sergey Petrunya
 
PDF
MariaDB: ANALYZE for statements (lightning talk)
Sergey Petrunya
 
MariaDB's join optimizer: how it works and current fixes
Sergey Petrunya
 
Improved histograms in MariaDB 10.8
Sergey Petrunya
 
Improving MariaDB’s Query Optimizer with better selectivity estimates
Sergey Petrunya
 
JSON Support in MariaDB: News, non-news and the bigger picture
Sergey Petrunya
 
ANALYZE for Statements - MariaDB's hidden gem
Sergey Petrunya
 
MariaDB 10.4 - что нового
Sergey Petrunya
 
MariaDB Optimizer - further down the rabbit hole
Sergey Petrunya
 
Lessons for the optimizer from running the TPC-DS benchmark
Sergey Petrunya
 
MariaDB 10.3 Optimizer - where does it stand
Sergey Petrunya
 
MyRocks in MariaDB | M18
Sergey Petrunya
 
New Query Optimizer features in MariaDB 10.3
Sergey Petrunya
 
MyRocks in MariaDB
Sergey Petrunya
 
Histograms in MariaDB, MySQL and PostgreSQL
Sergey Petrunya
 
Say Hello to MyRocks
Sergey Petrunya
 
Common Table Expressions in MariaDB 10.2
Sergey Petrunya
 
MyRocks in MariaDB: why and how
Sergey Petrunya
 
MariaDB 10.1 - что нового.
Sergey Petrunya
 
Window functions in MariaDB 10.2
Sergey Petrunya
 
MariaDB: ANALYZE for statements (lightning talk)
Sergey Petrunya
 

Recently uploaded (20)

PDF
The Builder’s Playbook - 2025 State of AI Report.pdf
jeroen339954
 
PDF
DevBcn - Building 10x Organizations Using Modern Productivity Metrics
Justin Reock
 
PDF
Achieving Consistent and Reliable AI Code Generation - Medusa AI
medusaaico
 
PPTX
✨Unleashing Collaboration: Salesforce Channels & Community Power in Patna!✨
SanjeetMishra29
 
PDF
HubSpot Main Hub: A Unified Growth Platform
Jaswinder Singh
 
PDF
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
PDF
Log-Based Anomaly Detection: Enhancing System Reliability with Machine Learning
Mohammed BEKKOUCHE
 
PDF
Why Orbit Edge Tech is a Top Next JS Development Company in 2025
mahendraalaska08
 
PPTX
Top iOS App Development Company in the USA for Innovative Apps
SynapseIndia
 
PDF
Smart Air Quality Monitoring with Serrax AQM190 LITE
SERRAX TECHNOLOGIES LLP
 
PPTX
MSP360 Backup Scheduling and Retention Best Practices.pptx
MSP360
 
PDF
Transcript: New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
PDF
Persuasive AI: risks and opportunities in the age of digital debate
Speck&Tech
 
PDF
New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
PDF
Complete Network Protection with Real-Time Security
L4RGINDIA
 
PDF
NewMind AI Journal - Weekly Chronicles - July'25 Week II
NewMind AI
 
PDF
SWEBOK Guide and Software Services Engineering Education
Hironori Washizaki
 
PDF
Fl Studio 24.2.2 Build 4597 Crack for Windows Free Download 2025
faizk77g
 
PDF
HCIP-Data Center Facility Deployment V2.0 Training Material (Without Remarks ...
mcastillo49
 
PDF
Smart Trailers 2025 Update with History and Overview
Paul Menig
 
The Builder’s Playbook - 2025 State of AI Report.pdf
jeroen339954
 
DevBcn - Building 10x Organizations Using Modern Productivity Metrics
Justin Reock
 
Achieving Consistent and Reliable AI Code Generation - Medusa AI
medusaaico
 
✨Unleashing Collaboration: Salesforce Channels & Community Power in Patna!✨
SanjeetMishra29
 
HubSpot Main Hub: A Unified Growth Platform
Jaswinder Singh
 
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
Log-Based Anomaly Detection: Enhancing System Reliability with Machine Learning
Mohammed BEKKOUCHE
 
Why Orbit Edge Tech is a Top Next JS Development Company in 2025
mahendraalaska08
 
Top iOS App Development Company in the USA for Innovative Apps
SynapseIndia
 
Smart Air Quality Monitoring with Serrax AQM190 LITE
SERRAX TECHNOLOGIES LLP
 
MSP360 Backup Scheduling and Retention Best Practices.pptx
MSP360
 
Transcript: New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
Persuasive AI: risks and opportunities in the age of digital debate
Speck&Tech
 
New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
Complete Network Protection with Real-Time Security
L4RGINDIA
 
NewMind AI Journal - Weekly Chronicles - July'25 Week II
NewMind AI
 
SWEBOK Guide and Software Services Engineering Education
Hironori Washizaki
 
Fl Studio 24.2.2 Build 4597 Crack for Windows Free Download 2025
faizk77g
 
HCIP-Data Center Facility Deployment V2.0 Training Material (Without Remarks ...
mcastillo49
 
Smart Trailers 2025 Update with History and Overview
Paul Menig
 

New features-in-mariadb-and-mysql-optimizers

  • 1. Sergei Petrunia, MariaDB New features in MariaDB/MySQL query optimizer
  • 2. 12:49:092 MySQL/MariaDB optimizer development ● Some features have common heritage ● Big releases: – MariaDB 5.3/5.5 – MySQL 5.6 – (upcoming) MariaDB 10.0
  • 3. 12:49:093 New optimizer features Subqueries Batched Key Access (MRR) Index Condition Pushdown Extended Keys EXPLAIN UPDATE/ DELETE Subqueries FROM IN Others PERFORMANCE_SCHEMA Engine-independent statistics InnoDB persistent statistics
  • 4. 12:49:094 New optimizer features Subqueries Batched Key Access (MRR) Index Condition Pushdown Extended Keys EXPLAIN UPDATE/ DELETE Subqueries FROM IN Others Engine-independent statistics InnoDB persistent statistics PERFORMANCE_SCHEMA
  • 5. 12:49:095 Subqueries in MySQL ● Subqueries are practially unusable ● e.g. Facebook disabled them in the parser ● Reason - “naive execution”.
  • 6. 12:49:096 Naive subquery execution ● For IN (SELECT... ) subqueries: select * from hotel where hotel.country='USA' and hotel.name IN (select hotel_stays.hotel from hotel_stays where hotel_stays.customer='John Smith') for (each hotel in USA ) { if (john smith stayed here) { … } } ● Naive execution: ● Slow!
  • 7. 12:49:097 Naive subquery execution (2) ● For FROM(SELECT …) subquereis: 1. Retrieve all hotels with > 500 rooms, store in a temporary table big_hotel; 2. Search in big_hotel for hotels near AMS. ● Naive execution: ● Slow! select * from (select * from hotel where hotel.rooms > 500 ) as big_hotel where big_hotel.nearest_aiport='AMS';
  • 8. 12:49:098 New subquery optimizations ● Handle IN (SELECT ...) ● Handle FROM (SELECT …) ● Handle a lot of cases ● Comparison with PostgreSQL – ~1000x slower before – ~same order of magnitude now ● Releases – MySQL 6.0 – MariaDB 5.5 ● Sheeri Kritzer @ Mozilla seems happy with this one – MySQL 5.6 ● Subset of MariaDB 5.5's features
  • 9. 12:49:099 Subquery optimizations - summary ● Subqueries were generally unusable before MariaDB 5.3/5.5 ● “Core” subquery optimizations are in – MariaDB 5.3/5.5 – MySQL 5.6 ● MariaDB has extra additions ● Further information: https://kb.askmonty.org/en/subquery-optimizations/
  • 10. 12:49:0910 Subqueries Batched Key Access (MRR) Index Condition Pushdown Extended Keys EXPLAIN UPDATE/ DELETE Subqueries FROM IN Others Engine-independent statistics InnoDB persistent statistics PERFORMANCE_SCHEMA
  • 11. Batched Key Access - background
      ● Big, IO-bound joins were slow
        – DBT-3 benchmark could not finish*
      ● Reason?
      ● Nested Loops join hits the second table at random locations.
  • 12. Batched Key Access idea
      [Diagram: Nested Loops Join vs. Batched Key Access]
      Speedup reasons
      ● Fewer disk head movements
      ● Cache-friendliness
      ● Prefetch-friendliness
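The "fewer disk head movements" argument can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not the server's implementation: row positions are plain integers, "head movement" is the sum of distances between consecutive reads, and `buffer_size` is a hypothetical stand-in for how many lookups fit in the join buffer.

```python
def seek_distance(positions):
    """Total head movement when reading rows at these positions in order."""
    return sum(abs(b - a) for a, b in zip(positions, positions[1:]))


def nested_loop_reads(lookup_positions):
    # One lookup per outer row, in the order the outer rows arrive.
    return list(lookup_positions)


def batched_key_access_reads(lookup_positions, buffer_size):
    # Collect up to buffer_size lookups, sort each batch by position,
    # then read the batch in one sweep.
    reads = []
    for start in range(0, len(lookup_positions), buffer_size):
        batch = lookup_positions[start:start + buffer_size]
        reads.extend(sorted(batch))
    return reads


# Positions of the second table's rows, in the order the join asks for them.
lookups = [90, 10, 70, 30, 80, 20, 60, 40]
```

In this toy model the total movement shrinks as the batch grows, which matches the deck's later point that BKA only pays off given a sufficient buffer size.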
  • 13. Batched Key Access benchmark
          set join_cache_level=6;  -- enable BKA
          select max(l_extendedprice)
          from orders, lineitem
          where
            l_orderkey=o_orderkey and
            o_orderdate between $DATE1 and $DATE2
      Run with
      ● Various join_buffer_size settings
      ● Various sizes of the $DATE1...$DATE2 range
  • 14. Batched Key Access benchmark (2)
      [Chart: BKA join performance depending on buffer size. X axis: buffer size, bytes; Y axis: query time, sec. Series: query_size=1, 2, 3, each regular and BKA. Annotations: “Performance without BKA”; “Performance with BKA, given sufficient buffer size”.]
  • 15. Batched Key Access summary
      ● Optimization for big, IO-bound joins
        – Orders-of-magnitude speedups
      ● Available in
        – MariaDB 5.3/5.5 (more advanced)
        – MySQL 5.6
      ● Not fully automatic yet
        – Needs to be manually enabled
        – Need to set buffer sizes.
  • 17. Index Condition Pushdown
      ● A new feature in MariaDB 5.3/MySQL 5.6
          alter table lineitem add index s_r (l_shipdate, l_receiptdate);
          select count(*) from lineitem
          where
            l_shipdate between '1993-01-01' and '1993-02-01' and
            datediff(l_receiptdate,l_shipdate) > 25 and
            l_quantity > 40
          +----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+
          | id | select_type | table    | type  | possible_keys | key  | key_len | ref  | rows   | Extra                              |
          +----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+
          | 1  | SIMPLE      | lineitem | range | s_r           | s_r  | 4       | NULL | 158854 | Using index condition; Using where |
          +----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+
        1. Read index records in the range l_shipdate between '1993-01-01' and '1993-02-01'
        2. Check the index condition datediff(l_receiptdate,l_shipdate) > 25
           ← New! Filters out records before table rows are read
        3. Read full table rows
        4. Check the WHERE condition l_quantity > 40
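The four steps above can be simulated to see where the saving comes from. This is a sketch under assumptions: the four rows are invented, the index is a sorted Python list rather than a B-tree, and the loop walks the whole index for brevity where a real range scan would seek to the start of the range.

```python
from datetime import date

# rowid -> full table row (only the columns the query needs); invented data.
table = {
    1: {"l_shipdate": date(1993, 1, 5), "l_receiptdate": date(1993, 1, 10), "l_quantity": 45},
    2: {"l_shipdate": date(1993, 1, 7), "l_receiptdate": date(1993, 2, 20), "l_quantity": 50},
    3: {"l_shipdate": date(1993, 1, 20), "l_receiptdate": date(1993, 2, 25), "l_quantity": 10},
    4: {"l_shipdate": date(1993, 3, 1), "l_receiptdate": date(1993, 4, 15), "l_quantity": 48},
}
# Index s_r stores (l_shipdate, l_receiptdate, rowid), kept sorted.
index_s_r = sorted((r["l_shipdate"], r["l_receiptdate"], rid) for rid, r in table.items())


def count_matches(use_icp):
    """Return (matching rows, full table rows read) for the slide's query."""
    table_reads = 0
    matches = 0
    for shipdate, receiptdate, rowid in index_s_r:
        # Step 1: range condition on the first key part.
        if not (date(1993, 1, 1) <= shipdate <= date(1993, 2, 1)):
            continue
        # Step 2 (ICP only): check the index condition before touching the table.
        if use_icp and not (receiptdate - shipdate).days > 25:
            continue
        # Step 3: read the full table row.
        row = table[rowid]
        table_reads += 1
        # Step 4: check the WHERE condition on the full row.
        if (row["l_receiptdate"] - row["l_shipdate"]).days > 25 and row["l_quantity"] > 40:
            matches += 1
    return matches, table_reads
```

With and without the pushed-down check the query result is identical; only the number of full-row reads changes, which is why the gain is largest for IO-bound workloads.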
  • 18. Index Condition Pushdown - conclusions
      Summary
      ● Applicable to any index-based access (ref, range, etc.)
      ● Checks parts of WHERE after reading the index
      ● Reduces number of table records to be read
      ● Speedup can be similar to that of “Using index”
        – Great for IO-bound load (5x, 10x)
        – Some for CPU-bound workload (2x)
      Conclusions
      ● Have a selective condition on a column?
        – Put the column into the index, at the end.
  • 19. Extended keys
          CREATE TABLE tbl (
            pk1 sometype,
            pk2 sometype,
            ...
            col1 sometype,
            col2 sometype,
            ...
            KEY indexA (col1, col2)
            ...
            PRIMARY KEY (pk1, pk2)
          ) ENGINE=InnoDB
      ● Secondary indexes in InnoDB have an invisible “tail”: indexA stores col1, col2, pk1, pk2
      ● Before: optimizer has limited support for “tail” columns
        – 'Using index' supports it
        – ORDER BY col1, col2, pk1 supports it
      ● After MariaDB 5.5/MySQL 5.6
        – all parts of the optimizer (ref access, range access, etc.) can use the “tail”
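Why the “tail” is usable for lookups follows from how the entries are ordered. The sketch below models indexA as a sorted list of (col1, col2, pk1, pk2) tuples and does a binary search on a prefix; the data and the `lookup_prefix` helper are hypothetical, and integer columns are assumed so that infinity can serve as an upper sentinel.

```python
from bisect import bisect_left, bisect_right

# Each indexA entry is (col1, col2, pk1, pk2): the declared columns
# followed by the primary-key "tail". Invented data.
index_a = sorted([
    (1, 10, 100, 1),
    (1, 10, 200, 2),
    (1, 10, 200, 5),
    (1, 20, 100, 3),
    (2, 10, 100, 4),
])


def lookup_prefix(index, prefix):
    """Return entries whose leading columns equal `prefix`, via binary search."""
    lo = bisect_left(index, prefix)
    # Pad the prefix with +infinity to get an upper bound for the range.
    hi = bisect_right(index, prefix + (float("inf"),) * (4 - len(prefix)))
    return index[lo:hi]
```

A lookup on (col1, col2) alone finds three entries here; adding pk1 to the prefix narrows the same binary search to two, which is the kind of access an optimizer can plan once it knows the tail columns are part of the key.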
  • 21. Better EXPLAIN in MySQL 5.6
      ● EXPLAIN for UPDATE/DELETE/INSERT … SELECT
        – shows the query plan for finding the records to update/delete
          mysql> explain update customer set c_acctbal = c_acctbal - 100 where c_custkey=12354;
          +----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
          | id | select_type | table    | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
          +----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
          | 1  | SIMPLE      | customer | range | PRIMARY       | PRIMARY | 4       | NULL | 1    | Using where |
          +----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
      ● EXPLAIN FORMAT=JSON
        – Produces [big] JSON output
        – Shows more information:
          ● Shows conditions attached to tables
          ● Shows whether “Using temporary; using filesort” is done to handle GROUP BY or ORDER BY.
          ● Shows where subqueries are attached
        – No other known additions
        – Will be in MariaDB 10.0
      [Callout on slide: “The most useful addition!”]
  • 22. EXPLAIN FORMAT=JSON
      What are the “conditions attached to tables”?
          explain select count(*) from orders, customer
          where
            customer.c_custkey=orders.o_custkey and
            customer.c_mktsegment='BUILDING' and
            orders.o_totalprice > customer.c_acctbal and
            orders.o_orderpriority='1-URGENT'
          +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
          | id | select_type | table    | type | possible_keys | key         | key_len | ref                         | rows    | Extra       |
          +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
          | 1  | SIMPLE      | customer | ALL  | PRIMARY       | NULL        | NULL    | NULL                        | 1509871 | Using where |
          | 1  | SIMPLE      | orders   | ref  | i_o_custkey   | i_o_custkey | 5       | dbt3sf10.customer.c_custkey | 7       | Using where |
          +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
  • 23. EXPLAIN FORMAT=JSON (2)
          {
            "query_block": {
              "select_id": 1,
              "nested_loop": [
                {
                  "table": {
                    "table_name": "customer",
                    "access_type": "ALL",
                    "possible_keys": ["PRIMARY"],
                    "rows": 1509871,
                    "filtered": 100,
                    "attached_condition": "(`dbt3sf10`.`customer`.`c_mktsegment` = 'BUILDING')"
                  }
                },
                {
                  "table": {
                    "table_name": "orders",
                    "access_type": "ref",
                    "possible_keys": ["i_o_custkey"],
                    "key": "i_o_custkey",
                    "used_key_parts": ["o_custkey"],
                    "key_length": "5",
                    "ref": ["dbt3sf10.customer.c_custkey"],
                    "rows": 7,
                    "filtered": 100,
                    "attached_condition": "((`dbt3sf10`.`orders`.`o_orderpriority` = '1-URGENT') and (`dbt3sf10`.`orders`.`o_totalprice` > `dbt3sf10`.`customer`.`c_acctbal`))"
                  }
                }
              ]
            }
          }
          +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
          | id | select_type | table    | type | possible_keys | key         | key_len | ref                         | rows    | Extra       |
          +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
          | 1  | SIMPLE      | customer | ALL  | PRIMARY       | NULL        | NULL    | NULL                        | 1509871 | Using where |
          | 1  | SIMPLE      | orders   | ref  | i_o_custkey   | i_o_custkey | 5       | dbt3sf10.customer.c_custkey | 7       | Using where |
          +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
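Because the JSON plan is machine-readable, the per-table conditions that the tabular form hides behind “Using where” can be pulled out with a few lines of Python. The sketch assumes the plan has the flat `query_block` / `nested_loop` shape shown on the slide (abbreviated here); plans with subqueries or other nesting would need a recursive walk, and `attached_conditions` is a hypothetical helper name.

```python
import json

# Abbreviated copy of the slide's EXPLAIN FORMAT=JSON output.
plan_text = """
{
  "query_block": {
    "select_id": 1,
    "nested_loop": [
      {"table": {"table_name": "customer", "access_type": "ALL",
                 "attached_condition": "(`dbt3sf10`.`customer`.`c_mktsegment` = 'BUILDING')"}},
      {"table": {"table_name": "orders", "access_type": "ref",
                 "attached_condition": "((`dbt3sf10`.`orders`.`o_orderpriority` = '1-URGENT') and (`dbt3sf10`.`orders`.`o_totalprice` > `dbt3sf10`.`customer`.`c_acctbal`))"}}
    ]
  }
}
"""


def attached_conditions(plan_json):
    """Map each table in a flat nested-loop plan to its attached condition."""
    plan = json.loads(plan_json)
    return {
        step["table"]["table_name"]: step["table"].get("attached_condition")
        for step in plan["query_block"]["nested_loop"]
    }


for table_name, condition in attached_conditions(plan_text).items():
    print(table_name, "->", condition)
```

The output makes visible that the cross-table comparison o_totalprice > c_acctbal is checked on orders, the second table in the join order, where both values are available.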
  • 24. EXPLAIN ANALYZE (kind of)
      ● Does EXPLAIN match the reality?
      ● Where is most of the time spent?
      ● MySQL/MariaDB don't have “EXPLAIN ANALYZE” ...
          select count(*) from orders, customer
          where
            customer.c_custkey=orders.o_custkey and
            customer.c_mktsegment='BUILDING' and
            orders.o_orderpriority='1-URGENT'
          +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
          | id   | select_type | table    | type | possible_keys | key         | key_len | ref                | rows   | Extra       |
          +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
          | 1    | SIMPLE      | customer | ALL  | PRIMARY       | NULL        | NULL    | NULL               | 149415 | Using where |
          | 1    | SIMPLE      | orders   | ref  | i_o_custkey   | i_o_custkey | 5       | customer.c_custkey | 7      | Using index |
          +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
  • 25. Traditional solution: Status variables
          mysql> flush status;
          Query OK, 0 rows affected (0.00 sec)
          mysql> {run query}
          mysql> show status like 'Handler%';
          +----------------------------+--------+
          | Variable_name              | Value  |
          +----------------------------+--------+
          | Handler_commit             | 1      |
          | Handler_delete             | 0      |
          | Handler_discover           | 0      |
          | Handler_icp_attempts       | 0      |
          | Handler_icp_match          | 0      |
          | Handler_mrr_init           | 0      |
          | Handler_mrr_key_refills    | 0      |
          | Handler_mrr_rowid_refills  | 0      |
          | Handler_prepare            | 0      |
          | Handler_read_first         | 0      |
          | Handler_read_key           | 30142  |
          | Handler_read_last          | 0      |
          | Handler_read_next          | 303959 |
          | Handler_read_prev          | 0      |
          | Handler_read_rnd           | 0      |
          | Handler_read_rnd_deleted   | 0      |
          | Handler_read_rnd_next      | 150001 |
          | Handler_rollback           | 0      |
          ...
      Problems:
      ● Only #rows counters
      ● All tables are counted together
  • 26. Newer solution: userstat
      ● In Facebook patch, Percona, MariaDB:
          mysql> set global userstat=1;
          mysql> flush table_statistics;
          mysql> flush index_statistics;
          mysql> {query}
          mysql> show table_statistics;
          +--------------+------------+-----------+--------------+-------------------------+
          | Table_schema | Table_name | Rows_read | Rows_changed | Rows_changed_x_#indexes |
          +--------------+------------+-----------+--------------+-------------------------+
          | dbt3sf1      | orders     | 303959    | 0            | 0                       |
          | dbt3sf1      | customer   | 150000    | 0            | 0                       |
          +--------------+------------+-----------+--------------+-------------------------+
          mysql> show index_statistics;
          +--------------+------------+-------------+-----------+
          | Table_schema | Table_name | Index_name  | Rows_read |
          +--------------+------------+-------------+-----------+
          | dbt3sf1      | orders     | i_o_custkey | 303959    |
          +--------------+------------+-------------+-----------+
      ● Counters are per-table
        – Ok as long as you don't have self-joins
      ● Overhead is negligible
      ● Counters are server-wide (other queries affect them, too)
  • 27. Latest addition: PERFORMANCE_SCHEMA
      ● Allows measuring *time* spent reading each table
      ● Has some visible overhead (Facebook's tests: 7%)
      ● Counters are system-wide
      ● Still no luck with self-joins
          mysql> truncate performance_schema.table_io_waits_summary_by_table;
          mysql> {query}
          mysql> select
                   object_schema,
                   object_name,
                   count_read,
                   sum_timer_read,                                        -- this is picoseconds
                   sum_timer_read / (1000*1000*1000*1000) as read_seconds -- this is seconds
                 from
                   performance_schema.table_io_waits_summary_by_table
                 where
                   object_schema = 'dbt3sf1' and object_name in ('orders','customer');
          +---------------+-------------+------------+----------------+--------------+
          | object_schema | object_name | count_read | sum_timer_read | read_seconds |
          +---------------+-------------+------------+----------------+--------------+
          | dbt3sf1       | orders      | 334101     | 5739345397323  | 5.7393       |
          | dbt3sf1       | customer    | 150001     | 1273653046701  | 1.2737       |
          +---------------+-------------+------------+----------------+--------------+
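The unit conversion in the query above, and the answer to “where is most of the time spent?”, can be checked with a short Python calculation. The timer values are the ones shown on the slide; the helper name and the share computation are additions for illustration.

```python
PICOSECONDS_PER_SECOND = 10**12  # same as 1000*1000*1000*1000 in the query


def read_seconds(sum_timer_read):
    """Convert a PERFORMANCE_SCHEMA timer value (picoseconds) to seconds."""
    return sum_timer_read / PICOSECONDS_PER_SECOND


# sum_timer_read values from the slide's result set.
timers = {"orders": 5739345397323, "customer": 1273653046701}

total = sum(timers.values())
for table_name, picoseconds in timers.items():
    share = picoseconds / total
    print(f"{table_name}: {read_seconds(picoseconds):.4f} s ({share:.0%} of read time)")
```

On these numbers, reads from orders account for roughly four fifths of the measured table read time, even though EXPLAIN estimated only 7 rows per lookup on that table.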
  • 29. What are table/index statistics?
          select count(*) from customer, orders
          where customer.c_custkey=orders.o_custkey and customer.c_mktsegment='BUILDING';
          +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
          | id   | select_type | table    | type | possible_keys | key         | key_len | ref                | rows   | Extra       |
          +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
          | 1    | SIMPLE      | customer | ALL  | PRIMARY       | NULL        | NULL    | NULL               | 148305 | Using where |
          | 1    | SIMPLE      | orders   | ref  | i_o_custkey   | i_o_custkey | 5       | customer.c_custkey | 7      | Using index |
          +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
          MariaDB > show table status like 'orders'\G
          *************************** 1. row ***************************
                    Name: orders
                  Engine: InnoDB
                 Version: 10
              Row_format: Compact
                    Rows: 1495152
          .............
          MariaDB > show keys from orders where key_name='i_o_custkey'\G
          *************************** 1. row ***************************
                   Table: orders
              Non_unique: 1
                Key_name: i_o_custkey
            Seq_in_index: 1
             Column_name: o_custkey
               Collation: A
             Cardinality: 212941
                Sub_part: NULL
          .................
      rows=7 for orders? 1495152 / 212941 ≈ 7: “There are on average 7 orders for a given c_custkey”
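The arithmetic behind the rows=7 estimate, spelled out in Python. The inputs are the Rows and Cardinality values from the slide; the function name and the follow-on join-size figure are illustrative additions, and the latter ignores any filtering by c_mktsegment.

```python
def rows_per_key(table_rows, index_cardinality):
    """Average number of rows sharing one key value in the index."""
    return table_rows / index_cardinality


orders_rows = 1495152              # Rows from SHOW TABLE STATUS
i_o_custkey_cardinality = 212941   # Cardinality from SHOW KEYS

estimate = rows_per_key(orders_rows, i_o_custkey_cardinality)
print(f"orders rows per c_custkey: {estimate:.2f}")

# Rough join size: customer rows scanned times orders rows per lookup,
# before the c_mktsegment filter is taken into account.
customer_rows = 148305
estimated_join_rows = customer_rows * round(estimate)
print(f"estimated rows produced by the join: {estimated_join_rows}")
```

Since both inputs are themselves estimates, any change in the sampled Cardinality shifts this ratio and, with it, the optimizer's choice of plan.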
  • 30. The problem with index statistics and InnoDB
      MySQL 5.5, InnoDB
      ● Statistics are calculated on the fly
        – When the table is opened (server restart, DDL)
        – When a sufficient number of records have been updated
        – ...
      ● Calculation uses random sampling
        – @@innodb_stats_sample_pages
      ● Result:
        – Statistics change without warning => query plans change without warning
      ● For example, DBT-3 benchmark
        – 22 analytics queries
        – Plans-per-query: avg=2.8, max=7.
  • 31. Persistent table statistics
      Persistent statistics v1
      ● Percona Server 5.5 (ported to MariaDB 5.5)
        – Need to enable it: innodb_use_sys_stats_table=1
      ● Statistics are stored inside InnoDB
        – User-visible through information_schema.innodb_sys_stats (read-only)
      ● Setting innodb_stats_auto_update=OFF prevents unexpected updates
      Persistent statistics v2
      ● MySQL 5.6
        – Enabled by default: innodb_stats_persistent=1
      ● Stored in regular InnoDB tables
        – mysql.innodb_table_stats, mysql.innodb_index_stats
      ● Setting innodb_stats_auto_recalc=OFF prevents unexpected updates
      ● Can also specify persistence/auto-recalc as a table option
  • 32. Persistent table statistics - summary
      ● Percona, then MySQL
        – Made statistics persistent
        – Disallowed automatic updates
      ● Remaining issue #1: it's still random sampling
        – DBT-3 benchmark
        – scale=30
        – Re-ran EXPLAINs for benchmark queries
        – Counted different query plans
      ● Remaining issue #2: limited amount of statistics
        – Only on index columns
        – Only AVG(#different_values)
  • 33. Upcoming: Engine-independent statistics
      MariaDB 10.0: Engine-independent statistics
      ● Collected/used on SQL layer
      ● No auto updates, only ANALYZE TABLE
        – 100% precise statistics
      ● More statistics
        – Index statistics (like before)
        – Table statistics (like before)
        – Column statistics
          ● MIN/MAX values
          ● Number of NULL / not NULL values
          ● Histograms
      ● => Optimizer will be smarter and more reliable
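To make the column statistics listed above concrete, here is a small Python sketch that computes them from a full scan of one column, in the spirit of "no sampling, only ANALYZE TABLE". Everything in it is an assumption for illustration: the sample values, the function name, and in particular the equal-height bucketing, since the slide does not say which histogram type MariaDB 10.0 uses.

```python
def column_statistics(values, buckets=4):
    """Compute MIN/MAX, NULL counts and an equal-height histogram for one column."""
    non_null = sorted(v for v in values if v is not None)
    n = len(non_null)
    # Equal-height histogram: bound i is the value below or at which roughly
    # (i + 1) / buckets of the non-NULL values fall.
    bounds = [non_null[max(0, (i + 1) * n // buckets - 1)]
              for i in range(buckets)] if n else []
    return {
        "min": non_null[0] if n else None,
        "max": non_null[-1] if n else None,
        "nulls": len(values) - n,
        "not_nulls": n,
        "histogram_bounds": bounds,
    }


# Invented column contents; None stands for SQL NULL.
sample_column = [1, 2, 2, 3, 5, 8, 13, 21, None, None]
stats = column_statistics(sample_column)
print(stats)
```

With bounds like these an optimizer could estimate, for example, that a condition such as `col <= 3` matches about half of the non-NULL rows, which index cardinality alone cannot express.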
  • 34. Conclusions
      ● Lots of new query optimizer features recently
        – Subqueries now just work
        – Big joins are much faster
          ● Need to turn it on
        – More diagnostics
      ● Even more is coming
      ● Releases with features
        – MariaDB 5.5
        – MySQL 5.6
        – (upcoming) MariaDB 10.0