Recursive Query Throwdown
in MySQL 8
BILL KARWIN
PERCONA LIVE OPEN SOURCE DATABASE CONFERENCE 2017
Bill Karwin
Software developer, consultant, trainer
Using MySQL since 2000
Senior Database Architect at SchoolMessenger
Author of SQL Antipatterns: Avoiding the Pitfalls of Database Programming
Oracle ACE Director
How to Query a Tree?
Hierarchical data
§ Organization charts
§ Categories and sub-categories
§ Parts explosion
§ Threaded discussions
https://commons.wikimedia.org/wiki/File:Staff_Organisation_Diagram,_1896.jpg
Example: Threaded Comments
Adjacency List Example Data
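+------------+-----------+--------+-------------------------------+
| comment_id | parent_id | author | comment                       |
+------------+-----------+--------+-------------------------------+
| 1          | NULL      | Fran   | What’s the cause of this bug? |
| 2          | 1         | Ollie  | I think it’s a null pointer.  |
| 3          | 2         | Fran   | No, I checked for that.       |
| 4          | 1         | Kukla  | We need to check valid input. |
| 5          | 4         | Ollie  | Yes, that’s a bug.            |
| 6          | 4         | Fran   | Yes, please add a check       |
| 7          | 6         | Kukla  | That fixed it.                |
+------------+-----------+--------+-------------------------------+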
Can’t Easily Query Deep Trees
SELECT * FROM Comments c1
LEFT JOIN Comments c2 ON (c2.parent_id = c1.comment_id)
LEFT JOIN Comments c3 ON (c3.parent_id = c2.comment_id)
LEFT JOIN Comments c4 ON (c4.parent_id = c3.comment_id)
LEFT JOIN Comments c5 ON (c5.parent_id = c4.comment_id)
LEFT JOIN Comments c6 ON (c6.parent_id = c5.comment_id)
LEFT JOIN Comments c7 ON (c7.parent_id = c6.comment_id)
LEFT JOIN Comments c8 ON (c8.parent_id = c7.comment_id)
LEFT JOIN Comments c9 ON (c9.parent_id = c8.comment_id)
LEFT JOIN Comments c10 ON (c10.parent_id = c9.comment_id)
...
MySQL Workarounds
MySQL lacked support for recursive queries, so workarounds were needed.
These are all denormalized designs, and most lack referential integrity:
§ Path enumeration
§ Nested sets
§ Closure table
Path Enumeration Example Data
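+------------+----------+--------+-------------------------------+
| comment_id | path     | author | comment                       |
+------------+----------+--------+-------------------------------+
| 1          | 1/       | Fran   | What’s the cause of this bug? |
| 2          | 1/2/     | Ollie  | I think it’s a null pointer.  |
| 3          | 1/2/3/   | Fran   | No, I checked for that.       |
| 4          | 1/4/     | Kukla  | We need to check valid input. |
| 5          | 1/4/5/   | Ollie  | Yes, that’s a bug.            |
| 6          | 1/4/6/   | Fran   | Yes, please add a check       |
| 7          | 1/4/6/7/ | Kukla  | That fixed it.                |
+------------+----------+--------+-------------------------------+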
Path Enumeration Example Queries
Query ancestors of comment #7:
SELECT * FROM Comments
WHERE '1/4/6/7/' LIKE CONCAT(path, '%');
Query descendants of comment #4:
SELECT * FROM Comments
WHERE path LIKE '1/4/%';
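The two path-enumeration queries above can be tried without a MySQL server. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL; SQLite spells CONCAT(path, '%') as path || '%', and the comment text column is omitted for brevity.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, path TEXT, author TEXT)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?)",
    [(1, "1/", "Fran"), (2, "1/2/", "Ollie"), (3, "1/2/3/", "Fran"),
     (4, "1/4/", "Kukla"), (5, "1/4/5/", "Ollie"), (6, "1/4/6/", "Fran"),
     (7, "1/4/6/7/", "Kukla")],
)

# Ancestors of comment #7: every stored path that is a prefix of 7's path.
ancestors = [row[0] for row in conn.execute(
    "SELECT comment_id FROM Comments "
    "WHERE '1/4/6/7/' LIKE path || '%' ORDER BY comment_id")]

# Descendants of comment #4: every path that starts with 4's path.
descendants = [row[0] for row in conn.execute(
    "SELECT comment_id FROM Comments "
    "WHERE path LIKE '1/4/%' ORDER BY comment_id")]

print(ancestors)    # [1, 4, 6, 7]
print(descendants)  # [4, 5, 6, 7]
```

Both result sets include the starting comment itself, because a path is a prefix of itself.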
Path Enumeration Pros and Cons
Pros:
§Single non-recursive query to get a tree or a subtree
Cons:
§Complex updates to add or remove a node
§Numbers are stored in a string—no referential integrity
Nested Sets
Each comment encodes its descendants using two numbers:
§ A comment’s left number is less than all numbers used by the comment’s descendants.
§ A comment’s right number is greater than all numbers used by the comment’s descendants.
§ A comment’s numbers fall between the left and right numbers of each of its ancestors.
References:
§ “Recursive Hierarchies: The Relational Taboo!” Michael J. Kamfonas, Relational Journal, Oct/Nov 1992
§ “Trees and Hierarchies in SQL For Smarties,” Joe Celko, 2004
§ “Managing Hierarchical Data in MySQL,” Mike Hillyer, 2005
Nested Sets Example
Nested Sets Example Data
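+------------+--------+---------+--------+-------------------------------+
| comment_id | nsleft | nsright | author | comment                       |
+------------+--------+---------+--------+-------------------------------+
| 1          | 1      | 14      | Fran   | What’s the cause of this bug? |
| 2          | 2      | 5       | Ollie  | I think it’s a null pointer.  |
| 3          | 3      | 4       | Fran   | No, I checked for that.       |
| 4          | 6      | 13      | Kukla  | We need to check valid input. |
| 5          | 7      | 8       | Ollie  | Yes, that’s a bug.            |
| 6          | 9      | 12      | Fran   | Yes, please add a check       |
| 7          | 10     | 11      | Kukla  | That fixed it.                |
+------------+--------+---------+--------+-------------------------------+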
Nested Sets Example Queries
Query ancestors of comment #7:
SELECT ancestor.* FROM Comments child
JOIN Comments ancestor
ON child.nsleft BETWEEN ancestor.nsleft AND ancestor.nsright
WHERE child.comment_id = 7;
Query subtree under comment #4:
SELECT descendant.* FROM Comments parent
JOIN Comments descendant
ON descendant.nsleft BETWEEN parent.nsleft AND parent.nsright
WHERE parent.comment_id = 4;
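The nested-sets queries above can be checked against the sample numbers. This is a rough sketch, again assuming SQLite (via Python's sqlite3) in place of MySQL. The ORDER BY on nsleft is an addition not on the slide; it returns rows in depth-first tree order.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; comment text is omitted.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, "
    "nsleft INTEGER, nsright INTEGER, author TEXT)"
)
conn.executemany("INSERT INTO Comments VALUES (?, ?, ?, ?)", [
    (1, 1, 14, "Fran"), (2, 2, 5, "Ollie"), (3, 3, 4, "Fran"),
    (4, 6, 13, "Kukla"), (5, 7, 8, "Ollie"), (6, 9, 12, "Fran"),
    (7, 10, 11, "Kukla"),
])

# Ancestors of #7: rows whose (nsleft, nsright) interval contains 7's nsleft.
ancestors = [row[0] for row in conn.execute("""
    SELECT ancestor.comment_id FROM Comments child
    JOIN Comments ancestor
      ON child.nsleft BETWEEN ancestor.nsleft AND ancestor.nsright
    WHERE child.comment_id = 7
    ORDER BY ancestor.nsleft""")]

# Subtree under #4: rows whose nsleft falls inside 4's interval.
subtree = [row[0] for row in conn.execute("""
    SELECT descendant.comment_id FROM Comments parent
    JOIN Comments descendant
      ON descendant.nsleft BETWEEN parent.nsleft AND parent.nsright
    WHERE parent.comment_id = 4
    ORDER BY descendant.nsleft""")]

print(ancestors)  # [1, 4, 6, 7]
print(subtree)    # [4, 5, 6, 7]
```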
Nested Sets Pros and Cons
Pros:
§Single non-recursive query to get a tree or a subtree
Cons:
§Complex updates to add or remove a node
§Numbers are not foreign keys—no referential integrity
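To make the "complex updates" point concrete, here is a sketch of adding one reply under comment #5. The renumbering scheme (shift every number at or past the parent's right value by 2) is a common textbook approach, not something taken from the slides, and comment #8 is a hypothetical new row. SQLite via Python's sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, "
    "nsleft INTEGER, nsright INTEGER)"
)
conn.executemany("INSERT INTO Comments VALUES (?, ?, ?)", [
    (1, 1, 14), (2, 2, 5), (3, 3, 4), (4, 6, 13),
    (5, 7, 8), (6, 9, 12), (7, 10, 11),
])

def snapshot():
    """Return {comment_id: (nsleft, nsright)} for the whole table."""
    return {cid: (left, right) for cid, left, right in conn.execute(
        "SELECT comment_id, nsleft, nsright FROM Comments")}

before = snapshot()

# Open a gap of two numbers at the parent's right edge, then fill it.
parent_right = conn.execute(
    "SELECT nsright FROM Comments WHERE comment_id = 5").fetchone()[0]
conn.execute(
    "UPDATE Comments SET nsright = nsright + 2 WHERE nsright >= ?",
    (parent_right,))
conn.execute(
    "UPDATE Comments SET nsleft = nsleft + 2 WHERE nsleft > ?",
    (parent_right,))
conn.execute(
    "INSERT INTO Comments VALUES (8, ?, ?)",  # hypothetical new comment #8
    (parent_right, parent_right + 1))

after = snapshot()
renumbered = sorted(cid for cid in before if before[cid] != after[cid])
print(renumbered)  # existing rows whose numbers had to change
print(after[8])    # numbers assigned to the new leaf
```

In this small tree, inserting a single leaf rewrites five of the seven existing rows, which illustrates why inserts are listed as a drawback.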
Closure Table
Many-to-many table
Stores every path from each node to each of its descendants
A node even connects to itself
CREATE TABLE Closure (
  ancestor INT NOT NULL,
  descendant INT NOT NULL,
  length INT NOT NULL,
  PRIMARY KEY (ancestor, descendant),
  FOREIGN KEY (ancestor) REFERENCES Comments(comment_id),
  FOREIGN KEY (descendant) REFERENCES Comments(comment_id)
);
Closure Table Example
Closure Table Example Data
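+------------+--------+-------------------------------+
| comment_id | author | comment                       |
+------------+--------+-------------------------------+
| 1          | Fran   | What’s the cause of this bug? |
| 2          | Ollie  | I think it’s a null pointer.  |
| 3          | Fran   | No, I checked for that.       |
| 4          | Kukla  | We need to check valid input. |
| 5          | Ollie  | Yes, that’s a bug.            |
| 6          | Fran   | Yes, please add a check       |
| 7          | Kukla  | That fixed it.                |
+------------+--------+-------------------------------+
+----------+------------+--------+
| ancestor | descendant | length |
+----------+------------+--------+
| 1        | 1          | 0      |
| 1        | 2          | 1      |
| 1        | 3          | 2      |
| 1        | 4          | 1      |
| 1        | 5          | 2      |
| 1        | 6          | 2      |
| 1        | 7          | 3      |
| 2        | 2          | 0      |
| 2        | 3          | 1      |
| 3        | 3          | 0      |
| 4        | 4          | 0      |
| 4        | 5          | 1      |
| 4        | 6          | 1      |
| 4        | 7          | 2      |
| 5        | 5          | 0      |
| 6        | 6          | 0      |
| 6        | 7          | 1      |
| 7        | 7          | 0      |
+----------+------------+--------+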
Closure Table Example Queries
Query ancestors of comment #7:
SELECT c.* FROM Comments c
JOIN Closure t
ON (c.comment_id = t.ancestor)
WHERE t.descendant = 7;
Query subtree under comment #4:
SELECT c.* FROM Comments c
JOIN Closure t
ON (c.comment_id = t.descendant)
WHERE t.ancestor = 4;
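The closure-table queries above, plus one maintenance step, can be sketched as follows. SQLite (Python's sqlite3) stands in for MySQL. The ORDER BY clauses and the insert of a hypothetical comment #8 under comment #5 are illustrative additions, not from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.execute("CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, author TEXT)")
conn.execute("""
    CREATE TABLE Closure (
      ancestor   INT NOT NULL REFERENCES Comments(comment_id),
      descendant INT NOT NULL REFERENCES Comments(comment_id),
      length     INT NOT NULL,
      PRIMARY KEY (ancestor, descendant)
    )""")
conn.executemany("INSERT INTO Comments VALUES (?, ?)", [
    (1, "Fran"), (2, "Ollie"), (3, "Fran"), (4, "Kukla"),
    (5, "Ollie"), (6, "Fran"), (7, "Kukla")])
conn.executemany("INSERT INTO Closure VALUES (?, ?, ?)", [
    (1, 1, 0), (1, 2, 1), (1, 3, 2), (1, 4, 1), (1, 5, 2), (1, 6, 2), (1, 7, 3),
    (2, 2, 0), (2, 3, 1), (3, 3, 0),
    (4, 4, 0), (4, 5, 1), (4, 6, 1), (4, 7, 2),
    (5, 5, 0), (6, 6, 0), (6, 7, 1), (7, 7, 0)])

def ancestors_of(comment_id):
    """Ancestors (root first) of a comment, including the comment itself."""
    return [row[0] for row in conn.execute("""
        SELECT c.comment_id FROM Comments c
        JOIN Closure t ON (c.comment_id = t.ancestor)
        WHERE t.descendant = ? ORDER BY t.length DESC""", (comment_id,))]

def subtree_of(comment_id):
    """All descendants of a comment, including the comment itself."""
    return [row[0] for row in conn.execute("""
        SELECT c.comment_id FROM Comments c
        JOIN Closure t ON (c.comment_id = t.descendant)
        WHERE t.ancestor = ? ORDER BY c.comment_id""", (comment_id,))]

ancestors_of_7 = ancestors_of(7)
subtree_of_4 = subtree_of(4)

# Adding a leaf: copy every path that ends at the parent, extended by one
# step, and add the new node's path to itself.
conn.execute("INSERT INTO Comments VALUES (8, 'Ollie')")  # hypothetical row
conn.execute("""
    INSERT INTO Closure (ancestor, descendant, length)
    SELECT ancestor, 8, length + 1 FROM Closure WHERE descendant = 5
    UNION ALL
    SELECT 8, 8, 0""")
ancestors_of_8 = ancestors_of(8)

print(ancestors_of_7, subtree_of_4, ancestors_of_8)
```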
Closure Table Pros and Cons
Pros:
§Single non-recursive query to get a tree or a subtree
§Referential integrity!
Cons:
§Extra table is required
§Hierarchy is stored redundantly, too easy to mess up
§Lots of joins to do most kinds of queries
ANSI SQL Recursive CTE
WITHer Recursive Queries in MySQL?
SQL vendors gradually implemented SQL-99 WITH syntax:
§ IBM DB2 UDB 8 (Dec. 2002)
§ Microsoft SQL Server 2005 (Oct. 2005)
§ Sybase SQL Anywhere 11 (Aug. 2008)
§ Firebird 2.1 (Sep. 2008)
§ PostgreSQL 8.4 (Jul. 2009)
§ Oracle 11g release 2 (Sep. 2009)
§ Teradata (date and version of support unknown, at least 2009)
§ HSQLDB 2.3 (Jul. 2013)
§ SQLite 3.8.3.1 (Feb. 2014)
§ H2 (date and version unknown)
https://www.percona.com/blog/2014/02/11/wither-recursive-queries/
ANSI SQL Recursive Common Table Expression
WITH RECURSIVE cte_name (col_name, col_name, col_name) AS
(
subquery base case
UNION ALL
subquery referencing cte_name
)
SELECT ... FROM cte_name ...
https://dev.mysql.com/doc/refman/8.0/en/with.html
Generating a Series of Numbers
WITH RECURSIVE MySeries (n) AS
(
SELECT 1 AS n
UNION ALL
SELECT 1+n FROM MySeries WHERE n < 10
)
SELECT * FROM MySeries;
+------+
| n |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
+------+
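One way to picture how the series query is evaluated: the base case seeds the result, then the recursive part is applied repeatedly to the rows produced by the previous round until a round produces nothing. The sketch below is a simplified model of that general behavior, not a description of MySQL's internals. The helper name recursive_cte is made up for illustration, and the comparison query runs on SQLite rather than MySQL.

```python
import sqlite3

def recursive_cte(seed_rows, step):
    """Simplified model of base-case UNION ALL recursive-case evaluation."""
    result = list(seed_rows)
    working = list(seed_rows)          # rows produced by the latest round
    while working:                     # stop when a round yields no rows
        working = [new for row in working for new in step(row)]
        result.extend(working)
    return result

# Base case: SELECT 1.  Recursive case: SELECT 1+n ... WHERE n < 10.
series = recursive_cte([1], lambda n: [n + 1] if n < 10 else [])

# The same query executed by a real engine (SQLite as a stand-in for MySQL).
sql_series = [row[0] for row in sqlite3.connect(":memory:").execute("""
    WITH RECURSIVE MySeries (n) AS
    (
      SELECT 1 AS n
      UNION ALL
      SELECT 1+n FROM MySeries WHERE n < 10
    )
    SELECT n FROM MySeries ORDER BY n""")]

print(series)
print(series == sql_series)
```

The WHERE n < 10 condition is what ends the recursion: the row n = 10 produces no successor, so the next round is empty.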
Generating a Series of Dates
WITH RECURSIVE MyDates (d) AS
(
SELECT CURRENT_DATE() AS d
UNION ALL
SELECT d + INTERVAL 1 DAY FROM MyDates
WHERE d < CURRENT_DATE() + INTERVAL 7 DAY
)
SELECT * FROM MyDates;
+------------+
| d |
+------------+
| 2017-04-24 |
| 2017-04-25 |
| 2017-04-26 |
| 2017-04-27 |
| 2017-04-28 |
| 2017-04-29 |
| 2017-04-30 |
| 2017-05-01 |
+------------+
Query ancestors of comment #7
WITH RECURSIVE CommentTree (comment_id, parent_id, author, comment, depth) AS
(
SELECT comment_id, parent_id, author, comment, 0 AS depth
FROM Comments
WHERE comment_id = 7
UNION ALL
SELECT c.comment_id, c.parent_id, c.author, c.comment, ct.depth+1
FROM CommentTree ct
JOIN Comments c ON (ct.parent_id = c.comment_id)
)
SELECT * FROM CommentTree;
Query subtree under comment #4
WITH RECURSIVE CommentTree (comment_id, parent_id, author, comment, depth) AS
(
SELECT comment_id, parent_id, author, comment, 0 AS depth
FROM Comments
WHERE comment_id = 4
UNION ALL
SELECT c.comment_id, c.parent_id, c.author, c.comment, ct.depth+1
FROM CommentTree ct
JOIN Comments c ON (ct.comment_id = c.parent_id)
)
SELECT * FROM CommentTree;
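The ancestor and subtree queries differ only in the direction of the join condition. The sketch below runs both against the adjacency-list sample data, assuming SQLite (Python's sqlite3) as a stand-in for MySQL 8. The helper walk, the trimmed column list, and the ORDER BY are illustrative choices rather than part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, "
    "parent_id INTEGER REFERENCES Comments(comment_id), author TEXT)"
)
conn.executemany("INSERT INTO Comments VALUES (?, ?, ?)", [
    (1, None, "Fran"), (2, 1, "Ollie"), (3, 2, "Fran"), (4, 1, "Kukla"),
    (5, 4, "Ollie"), (6, 4, "Fran"), (7, 6, "Kukla")])

TEMPLATE = """
WITH RECURSIVE CommentTree (comment_id, parent_id, depth) AS
(
  SELECT comment_id, parent_id, 0 AS depth
  FROM Comments
  WHERE comment_id = ?
  UNION ALL
  SELECT c.comment_id, c.parent_id, ct.depth+1
  FROM CommentTree ct
  JOIN Comments c ON ({join_condition})
)
SELECT comment_id, depth FROM CommentTree ORDER BY depth, comment_id"""

def walk(start_id, join_condition):
    """Run the recursive query from start_id with a fixed join condition."""
    sql = TEMPLATE.format(join_condition=join_condition)
    return conn.execute(sql, (start_id,)).fetchall()

# Walk up: the next row is the parent of the current row.
ancestors = walk(7, "ct.parent_id = c.comment_id")
# Walk down: the next rows are the children of the current row.
subtree = walk(4, "ct.comment_id = c.parent_id")

print(ancestors)  # [(7, 0), (6, 1), (4, 2), (1, 3)]
print(subtree)    # [(4, 0), (5, 1), (6, 1), (7, 2)]
```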
Recursive CTE Pros and Cons
Pros:
§ ANSI SQL-99 Standard
§ Compatible with other SQL implementations
§ Works with Adjacency List (single source of authority)
§ Referential integrity!
Cons:
§ Not compatible with earlier MySQL versions
§ Use of materialized temporary tables may cause performance problems
MySQL CTE Implementation: 💯
Thanks to @MarkusWinand for his preview analysis based on 8.0.1-dmr
http://modern-sql.com/feature/with
Big Hierarchies
ITIS: Sample Hierarchical Data
Integrated Taxonomic Information System (https://www.itis.gov/)
§Biological database of species of animals, plants, fungi
§One big tree of 544,954 nodes
§Data comes in adjacency list & path enumeration format
§I converted to closure table for query tests
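The slides do not show how the closure table was derived. One possible approach is a recursive CTE over the adjacency list, sketched below on the small Comments sample rather than the ITIS data. SQLite (Python's sqlite3) stands in for MySQL, and the CTE name paths is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.executemany("INSERT INTO Comments VALUES (?, ?)", [
    (1, None), (2, 1), (3, 2), (4, 1), (5, 4), (6, 4), (7, 6)])

closure_rows = conn.execute("""
    WITH RECURSIVE paths (ancestor, descendant, length) AS
    (
      -- Base case: every node is connected to itself with length 0.
      SELECT comment_id, comment_id, 0 FROM Comments
      UNION ALL
      -- Recursive case: extend each known path by one child.
      SELECT p.ancestor, c.comment_id, p.length + 1
      FROM paths p
      JOIN Comments c ON c.parent_id = p.descendant
    )
    SELECT ancestor, descendant, length FROM paths
    ORDER BY ancestor, descendant""").fetchall()

for row in closure_rows:
    print(row)
```

On the sample data this yields the same 18 rows as the Closure Table Example Data slide. On a tree the size of ITIS the row count would be much larger, roughly one row per node per ancestor level.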
ITIS Data Model
mysql> select * from longnames
where completename = 'Eschscholzia californica';
+-------+--------------------------+
| tsn   | completename             |
+-------+--------------------------+
| 18956 | Eschscholzia californica |
+-------+--------------------------+
mysql> select * from hierarchy where TSN = '18956'\G
TSN: 18956
Parent_TSN: 18954
level: 11
ChildrenCount: 8
hierarchy_string: 202422-954898-846494-954900-846496-846504-18063-846547-18409-18880-18954-18956
Indexes
mysql> ALTER TABLE hierarchy ADD KEY (tsn, parent_tsn);
Query OK, 0 rows affected (1.30 sec)
Breadcrumbs Query
WITH RECURSIVE taxonomy AS
(
SELECT base.tsn, base.parent_tsn, 0 as depth
FROM hierarchy base
WHERE tsn = '18956'
UNION ALL
SELECT next.tsn, next.parent_tsn, t.depth+1
FROM hierarchy next JOIN taxonomy t
WHERE t.parent_tsn = next.tsn
)
SELECT * FROM taxonomy JOIN longnames USING (tsn)
ORDER BY depth DESC;
Breadcrumbs Query Result
+--------+------------+-------+--------------------------+
| tsn | parent_tsn | depth | completename |
+--------+------------+-------+--------------------------+
| 202422 | 0 | 11 | Plantae |
| 954898 | 202422 | 10 | Viridiplantae |
| 846494 | 954898 | 9 | Streptophyta |
| 954900 | 846494 | 8 | Embryophyta |
| 846496 | 954900 | 7 | Tracheophyta |
| 846504 | 846496 | 6 | Spermatophytina |
| 18063 | 846504 | 5 | Magnoliopsida |
| 846547 | 18063 | 4 | Ranunculanae |
| 18409 | 846547 | 3 | Ranunculales |
| 18880 | 18409 | 2 | Papaveraceae |
| 18954 | 18880 | 1 | Eschscholzia |
| 18956 | 18954 | 0 | Eschscholzia californica |
+--------+------------+-------+--------------------------+
12 rows in set (0.00 sec)
Breadcrumbs Query EXPLAIN Plan
§New note in Extra: "Recursive"
§Using index (covering index) for both base case and recursive case
§I can eliminate the filesort if I allow natural order (base case first)
§No "Using Temporary"? Not so fast…
+----+-------------+------------+--------+---------------+---------+---------+--------------+------+----------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+--------+---------------+---------+---------+--------------+------+----------+-----------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 4 | 100.00 | Using where; Using filesort |
| 1 | PRIMARY | longnames | eq_ref | PRIMARY,tsn | PRIMARY | 4 | taxonomy.tsn | 1 | 100.00 | NULL |
| 2 | DERIVED | base | ref | TSN | TSN | 4 | const | 1 | 100.00 | Using index |
| 3 | UNION | t | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | Recursive; Using where |
| 3 | UNION | next | ref | TSN | TSN | 4 | t.parent_tsn | 1 | 100.00 | Using index |
+----+-------------+------------+--------+---------------+---------+---------+--------------+------+----------+-----------------------------+
Breadcrumbs Query Performance
mysql> SELECT * FROM SYS.STATEMENTS_WITH_TEMP_TABLES\G
query: WITH RECURSIVE `taxonomy` AS ( ...
`tsn` ) ORDER BY `depth` DESC
db: itis
exec_count: 1
total_latency: 10.05 ms
memory_tmp_tables: 1
disk_tmp_tables: 0
avg_tmp_tables_per_query: 1
tmp_tables_to_disk_pct: 0
first_seen: 2017-04-24 22:07:56
last_seen: 2017-04-24 22:07:56
digest: 8438633360bedce178823bb868589fd0
Breadcrumbs Query Stages
mysql> SELECT * FROM SYS.USER_SUMMARY_BY_STAGES;
+------+--------------------------------+-------+---------------+-------------+
| user | event_name | total | total_latency | avg_latency |
+------+--------------------------------+-------+---------------+-------------+
| root | stage/sql/System lock | 40 | 6.62 ms | 165.60 us |
| root | stage/sql/Opening tables | 191 | 3.16 ms | 16.52 us |
| root | stage/sql/checking permissions | 45 | 1.50 ms | 33.44 us |
| root | stage/sql/Creating sort index | 1 | 239.63 us | 239.63 us |
| root | stage/sql/closing tables | 191 | 191.03 us | 1.00 us |
| root | stage/sql/starting | 2 | 188.44 us | 94.22 us |
| root | stage/sql/Sending data | 6 | 138.96 us | 23.16 us |
| root | stage/sql/statistics | 4 | 122.42 us | 30.60 us |
| root | stage/sql/query end | 191 | 56.67 us | 296.00 ns |
| root | stage/sql/preparing | 4 | 33.57 us | 8.39 us |
| root | stage/sql/freeing items | 2 | 27.93 us | 13.96 us |
| root | stage/sql/optimizing | 5 | 20.03 us | 4.01 us |
| root | stage/sql/executing | 7 | 15.39 us | 2.20 us |
| root | stage/sql/removing tmp table | 4 | 9.35 us | 2.34 us |
| root | stage/sql/init | 3 | 8.76 us | 2.92 us |
| root | stage/sql/Sorting result | 2 | 4.16 us | 2.08 us |
| root | stage/sql/end | 3 | 1.93 us | 644.00 ns |
| root | stage/sql/cleaning up | 2 | 1.43 us | 715.00 ns |
+------+--------------------------------+-------+---------------+-------------+
Tree Expansion Query Result
See Demo
Tree Expansion Query
WITH RECURSIVE ancestors (tsn, parent_tsn) AS (
SELECT h.tsn, h.parent_tsn FROM hierarchy AS h WHERE h.tsn = %s
UNION ALL
SELECT h.tsn, h.parent_tsn FROM hierarchy AS h JOIN ancestors AS base ON h.tsn = base.parent_tsn
),
breadcrumbs (tsn, parent_tsn, depth, breadcrumbs) AS (
SELECT h.tsn, h.parent_tsn, 0 AS depth, CAST(LPAD(h.tsn, 8, '0') AS CHAR(255)) AS breadcrumbs
FROM hierarchy AS h WHERE h.parent_tsn = 0
UNION ALL
SELECT h.tsn, h.parent_tsn, base.depth+1 AS depth, CONCAT(base.breadcrumbs, ',', LPAD(h.tsn, 8, '0'))
FROM hierarchy AS h
JOIN ancestors AS a ON h.tsn = a.tsn
JOIN breadcrumbs AS base ON h.parent_tsn = base.tsn
)
SELECT l.tsn, l.completename, b.depth, b.breadcrumbs
FROM breadcrumbs AS b JOIN longnames AS l ON b.tsn = l.tsn
UNION
SELECT l.tsn, l.completename, b.depth+1, CONCAT(b.breadcrumbs, ',', LPAD(h.tsn, 8, '0'))
FROM breadcrumbs AS b
JOIN hierarchy AS h ON b.tsn = h.parent_tsn
JOIN longnames AS l ON l.tsn = h.tsn
ORDER BY breadcrumbs
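The final ORDER BY breadcrumbs works because each id is zero-padded to a fixed width, so plain string order matches depth-first tree order. The sketch below isolates just that idea on the Comments sample and is much simpler than the query above. Assumptions: SQLite (Python's sqlite3) stands in for MySQL, printf('%04d', ...) replaces LPAD, and comment #10 is a hypothetical extra reply added so that padding matters. In MySQL, the CAST(... AS CHAR(255)) in the base case presumably keeps the column wide enough for the growing string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, "
    "parent_id INTEGER, author TEXT)"
)
conn.executemany("INSERT INTO Comments VALUES (?, ?, ?)", [
    (1, None, "Fran"), (2, 1, "Ollie"), (3, 2, "Fran"), (4, 1, "Kukla"),
    (5, 4, "Ollie"), (6, 4, "Fran"), (7, 6, "Kukla"),
    (10, 2, "Kukla"),  # hypothetical extra reply, not in the slides
])

rows = conn.execute("""
    WITH RECURSIVE tree (comment_id, author, depth, breadcrumbs) AS
    (
      SELECT comment_id, author, 0, printf('%04d', comment_id)
      FROM Comments WHERE parent_id IS NULL
      UNION ALL
      SELECT c.comment_id, c.author, t.depth + 1,
             t.breadcrumbs || ',' || printf('%04d', c.comment_id)
      FROM tree t
      JOIN Comments c ON c.parent_id = t.comment_id
    )
    SELECT comment_id, author, depth, breadcrumbs FROM tree
    ORDER BY breadcrumbs""").fetchall()

order = [row[0] for row in rows]

# Print the thread indented by depth, in depth-first order.
for comment_id, author, depth, _ in rows:
    print("  " * depth + f"#{comment_id} {author}")
```

Here comment #10 sorts after #3 and before #4, directly under its parent #2, because its breadcrumb string is 0001,0002,0010.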
Tree Expansion Query EXPLAIN
+--------------+------------+--------+-------------+---------+-------------------+--------+----------+---------------------------------+
| select_type  | table      | type   | key         | key_len | ref               | rows   | filtered | Extra                           |
+--------------+------------+--------+-------------+---------+-------------------+--------+----------+---------------------------------+
| PRIMARY      | <derived2> | ALL    | NULL        | NULL    | NULL              | 250230 | 100.00   | Using where                     |
| PRIMARY      | l          | eq_ref | PRIMARY     | 4       | b.tsn             | 1      | 100.00   | NULL                            |
| DERIVED      | h          | index  | TSN         | 9       | NULL              | 500466 | 10.00    | Using where; Using index        |
| UNION        | base       | ALL    | NULL        | NULL    | NULL              | 50046  | 100.00   | Recursive; Using where          |
| UNION        | <derived4> | ALL    | NULL        | NULL    | NULL              | 4      | 100.00   | Using where; Using join buffer  |
| UNION        | h          | ref    | TSN         | 9       | a.tsn,base.tsn    | 1      | 100.00   | Using index                     |
| DERIVED      | h          | ref    | TSN         | 4       | const             | 1      | 100.00   | Using index                     |
| UNION        | base       | ALL    | NULL        | NULL    | NULL              | 2      | 100.00   | Recursive; Using where          |
| UNION        | h          | ref    | TSN         | 4       | base.parent_tsn   | 1      | 100.00   | Using index                     |
| UNION        | h          | index  | TSN         | 9       | NULL              | 500466 | 100.00   | Using where; Using index        |
| UNION        | l          | eq_ref | PRIMARY     | 4       | itis.h.TSN        | 1      | 100.00   | NULL                            |
| UNION        | <derived2> | ref    | <auto_key0> | 5       | itis.h.Parent_TSN | 10     | 100.00   | NULL                            |
| UNION RESULT | <union1,8> | ALL    | NULL        | NULL    | NULL              | NULL   | NULL     | Using temporary; Using filesort |
+--------------+------------+--------+-------------+---------+-------------------+--------+----------+---------------------------------+
Maybe I need more indexes?
Unfortunately I ran out of time to analyze.
Tree Expansion Query Performance
mysql> SELECT * FROM SYS.STATEMENTS_WITH_TEMP_TABLES\G
query: WITH RECURSIVE `ancestors` ( ` ... `l`
. `completename` , `b` .
db: itis
exec_count: 1
total_latency: 1.24 s
memory_tmp_tables: 3
disk_tmp_tables: 0
avg_tmp_tables_per_query: 3
tmp_tables_to_disk_pct: 0
first_seen: 2017-04-27 01:33:14
last_seen: 2017-04-27 01:33:14
digest: 86c1417d2ff3679863db754eff425e94
Tree Expansion Query Stages
mysql> SELECT * FROM SYS.USER_SUMMARY_BY_STAGES;
+------+--------------------------------+-------+---------------+-------------+
| user | event_name | total | total_latency | avg_latency |
+------+--------------------------------+-------+---------------+-------------+
| root | stage/sql/Sending data | 12 | 979.42 ms | 81.62 ms |
| root | stage/sql/System lock | 40 | 6.34 ms | 158.52 us |
| root | stage/sql/Opening tables | 191 | 3.34 ms | 17.51 us |
| root | stage/sql/checking permissions | 53 | 1.35 ms | 25.45 us |
| root | stage/sql/starting | 2 | 356.31 us | 178.16 us |
| root | stage/sql/statistics | 12 | 271.01 us | 22.58 us |
| root | stage/sql/closing tables | 191 | 179.15 us | 937.00 ns |
| root | stage/sql/preparing | 12 | 98.18 us | 8.18 us |
| root | stage/sql/query end | 191 | 57.60 us | 301.00 ns |
| root | stage/sql/freeing items | 2 | 47.93 us | 23.96 us |
| root | stage/sql/Creating sort index | 1 | 37.38 us | 37.38 us |
| root | stage/sql/optimizing | 13 | 30.60 us | 2.35 us |
| root | stage/sql/executing | 13 | 30.27 us | 2.33 us |
| root | stage/sql/removing tmp table | 14 | 24.44 us | 1.74 us |
| root | stage/sql/init | 3 | 14.78 us | 4.93 us |
| root | stage/sql/cleaning up | 2 | 11.66 us | 5.83 us |
| root | stage/sql/Sorting result | 2 | 3.67 us | 1.84 us |
| root | stage/sql/end | 3 | 3.04 us | 1.01 us |
+------+--------------------------------+-------+---------------+-------------+
Conclusions
§ Overall, MySQL 8 support for recursive CTE queries is worth the wait.
§ Exotic cases exist that are beyond any optimizer.
§ I'm excited to upgrade to MySQL 8.0.x ASAP!
§ Now that virtually all major SQL brands support recursive CTEs, we need developer tools and popular apps to use them!
License and Copyright
Copyright 2017 Bill Karwin
http://www.slideshare.net/billkarwin
Released under a Creative Commons 3.0 License:
http://creativecommons.org/licenses/by-nc-nd/3.0/
You are free to share—to copy, distribute,
and transmit this work, under the following conditions:
Attribution. You must attribute this work to Bill Karwin.
Noncommercial. You may not use this work for commercial purposes.
No Derivative Works. You may not alter, transform, or build upon this work.

More Related Content

What's hot (20)

KEY
Trees In The Database - Advanced data structures
Lorenzo Alberton
 
PDF
Load Data Fast!
Karwin Software Solutions LLC
 
PDF
The MySQL Query Optimizer Explained Through Optimizer Trace
oysteing
 
PDF
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
mumrah
 
PPTX
Real-time Stream Processing with Apache Flink
DataWorks Summit
 
PDF
Solving PostgreSQL wicked problems
Alexander Korotkov
 
PDF
MongoDB Fundamentals
MongoDB
 
PPTX
PostgreSQL and CockroachDB SQL
CockroachDB
 
PDF
Spring Data JPA
Cheng Ta Yeh
 
PPTX
Introduction to SQL Antipatterns
Krishnakumar S
 
PPTX
PostGreSQL Performance Tuning
Maven Logix
 
PPTX
Capabilities for Resources and Effects
Martin Odersky
 
PDF
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
Guido Schmutz
 
PDF
Building Event Streaming Architectures on Scylla and Kafka
ScyllaDB
 
PDF
Clean code
Arturo Herrero
 
PDF
Introduction to Apache Calcite
Jordan Halterman
 
PDF
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Altinity Ltd
 
PDF
PostgreSQL Tutorial For Beginners | Edureka
Edureka!
 
PPTX
Introduction to Apache ZooKeeper
Saurav Haloi
 
PDF
Open Source SQL - beyond parsers: ZetaSQL and Apache Calcite
Julian Hyde
 
Trees In The Database - Advanced data structures
Lorenzo Alberton
 
The MySQL Query Optimizer Explained Through Optimizer Trace
oysteing
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
mumrah
 
Real-time Stream Processing with Apache Flink
DataWorks Summit
 
Solving PostgreSQL wicked problems
Alexander Korotkov
 
MongoDB Fundamentals
MongoDB
 
PostgreSQL and CockroachDB SQL
CockroachDB
 
Spring Data JPA
Cheng Ta Yeh
 
Introduction to SQL Antipatterns
Krishnakumar S
 
PostGreSQL Performance Tuning
Maven Logix
 
Capabilities for Resources and Effects
Martin Odersky
 
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
Guido Schmutz
 
Building Event Streaming Architectures on Scylla and Kafka
ScyllaDB
 
Clean code
Arturo Herrero
 
Introduction to Apache Calcite
Jordan Halterman
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Altinity Ltd
 
PostgreSQL Tutorial For Beginners | Edureka
Edureka!
 
Introduction to Apache ZooKeeper
Saurav Haloi
 
Open Source SQL - beyond parsers: ZetaSQL and Apache Calcite
Julian Hyde
 

Viewers also liked (20)

PDF
MySQL InnoDB Cluster - A complete High Availability solution for MySQL
Olivier DASINI
 
PDF
Online MySQL Backups with Percona XtraBackup
Kenny Gryp
 
PPTX
Mysql参数-GDB
zhaolinjnu
 
PDF
MySQL Best Practices - OTN LAD Tour
Ronald Bradford
 
PDF
MySQL Backup and Recovery Essentials
Ronald Bradford
 
PDF
MySQL InnoDB Cluster and Group Replication - OSI 2017 Bangalore
Sujatha Sivakumar
 
PDF
Group Replication: A Journey to the Group Communication Core
Alfranio Júnior
 
PPT
Mysql high availability and scalability
yin gong
 
PDF
What you wanted to know about MySQL, but could not find using inernal instrum...
Sveta Smirnova
 
PDF
Capturing, Analyzing and Optimizing MySQL
Ronald Bradford
 
PDF
Galera cluster for high availability
Mydbops
 
PDF
MySQL Group Replication - HandsOn Tutorial
Kenny Gryp
 
ODP
Mastering InnoDB Diagnostics
guest8212a5
 
PDF
MySQL Replication Performance Tuning for Fun and Profit!
Vitor Oliveira
 
PDF
MySQL High Availability with Group Replication
Nuno Carvalho
 
PDF
MySQL Group Replication
Kenny Gryp
 
PDF
MySQL innodb cluster and Group Replication in a nutshell - hands-on tutorial ...
Frederic Descamps
 
PPT
淘宝数据库架构演进历程
zhaolinjnu
 
PDF
Inno db internals innodb file formats and source code structure
zhaolinjnu
 
PDF
Reducing Risk When Upgrading MySQL
Kenny Gryp
 
MySQL InnoDB Cluster - A complete High Availability solution for MySQL
Olivier DASINI
 
Online MySQL Backups with Percona XtraBackup
Kenny Gryp
 
Mysql参数-GDB
zhaolinjnu
 
MySQL Best Practices - OTN LAD Tour
Ronald Bradford
 
MySQL Backup and Recovery Essentials
Ronald Bradford
 
MySQL InnoDB Cluster and Group Replication - OSI 2017 Bangalore
Sujatha Sivakumar
 
Group Replication: A Journey to the Group Communication Core
Alfranio Júnior
 
Mysql high availability and scalability
yin gong
 
What you wanted to know about MySQL, but could not find using inernal instrum...
Sveta Smirnova
 
Capturing, Analyzing and Optimizing MySQL
Ronald Bradford
 
Galera cluster for high availability
Mydbops
 
MySQL Group Replication - HandsOn Tutorial
Kenny Gryp
 
Mastering InnoDB Diagnostics
guest8212a5
 
MySQL Replication Performance Tuning for Fun and Profit!
Vitor Oliveira
 
MySQL High Availability with Group Replication
Nuno Carvalho
 
MySQL Group Replication
Kenny Gryp
 
MySQL innodb cluster and Group Replication in a nutshell - hands-on tutorial ...
Frederic Descamps
 
淘宝数据库架构演进历程
zhaolinjnu
 
Inno db internals innodb file formats and source code structure
zhaolinjnu
 
Reducing Risk When Upgrading MySQL
Kenny Gryp
 
Ad

Similar to Recursive Query Throwdown (20)

PDF
Writing Recursive Queries
Ben Lis
 
PDF
[APJ] Common Table Expressions (CTEs) in SQL
EDB
 
PDF
Programming the SQL Way with Common Table Expressions
EDB
 
PDF
M|18 Taking Advantage of Common Table Expressions
MariaDB plc
 
PDF
Common Table Expressions (CTE) & Window Functions in MySQL 8.0
oysteing
 
PPT
Handling tree structures — recursive SPs, nested sets, recursive CTEs
Mind The Firebird
 
PDF
More SQL in MySQL 8.0
Norvald Ryeng
 
PPTX
MySQL Optimizer: What's New in 8.0
Manyi Lu
 
PDF
MySQL Optimizer: What’s New in 8.0
oysteing
 
PDF
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)
Sergey Petrunya
 
PDF
Database Design most common pitfalls
Federico Razzoli
 
PDF
Common Table Expressions in MariaDB 10.2
Sergey Petrunya
 
PDF
Indexing in Cassandra
Ed Anuff
 
PDF
MySQL Kitchen : spice up your everyday SQL queries
Damien Seguy
 
PPTX
DATASTORAGE.pptx
Neheurevathy
 
PPTX
Modern sql
Elizabeth Smith
 
PDF
MySQL 8.0: not only good, it’s GREAT! - PHP UK 2019
Gabriela Ferrara
 
PDF
DATASTORAGE.pdf
Neheurevathy
 
PDF
DATASTORAGE
Neheurevathy
 
PPTX
Query hierarchical data the easy way, with CTEs
MariaDB plc
 
Writing Recursive Queries
Ben Lis
 
[APJ] Common Table Expressions (CTEs) in SQL
EDB
 
Programming the SQL Way with Common Table Expressions
EDB
 
M|18 Taking Advantage of Common Table Expressions
MariaDB plc
 
Common Table Expressions (CTE) & Window Functions in MySQL 8.0
oysteing
 
Handling tree structures — recursive SPs, nested sets, recursive CTEs
Mind The Firebird
 
More SQL in MySQL 8.0
Norvald Ryeng
 
MySQL Optimizer: What's New in 8.0
Manyi Lu
 
MySQL Optimizer: What’s New in 8.0
oysteing
 
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)
Sergey Petrunya
 
Database Design most common pitfalls
Federico Razzoli
 
Common Table Expressions in MariaDB 10.2
Sergey Petrunya
 
Indexing in Cassandra
Ed Anuff
 
MySQL Kitchen : spice up your everyday SQL queries
Damien Seguy
 
DATASTORAGE.pptx
Neheurevathy
 
Modern sql
Elizabeth Smith
 
MySQL 8.0: not only good, it’s GREAT! - PHP UK 2019
Gabriela Ferrara
 
DATASTORAGE.pdf
Neheurevathy
 
DATASTORAGE
Neheurevathy
 
Query hierarchical data the easy way, with CTEs
MariaDB plc
 
Ad

More from Karwin Software Solutions LLC (11)

PDF
InnoDB Locking Explained with Stick Figures
Karwin Software Solutions LLC
 
PDF
SQL Outer Joins for Fun and Profit
Karwin Software Solutions LLC
 
PDF
Survey of Percona Toolkit
Karwin Software Solutions LLC
 
PDF
How to Design Indexes, Really
Karwin Software Solutions LLC
 
PDF
Percona toolkit
Karwin Software Solutions LLC
 
PDF
MySQL 5.5 Guide to InnoDB Status
Karwin Software Solutions LLC
 
PDF
Requirements the Last Bottleneck
Karwin Software Solutions LLC
 
PDF
Mentor Your Indexes
Karwin Software Solutions LLC
 
PDF
Sql Injection Myths and Fallacies
Karwin Software Solutions LLC
 
PDF
Full Text Search In PostgreSQL
Karwin Software Solutions LLC
 
InnoDB Locking Explained with Stick Figures
Karwin Software Solutions LLC
 
SQL Outer Joins for Fun and Profit
Karwin Software Solutions LLC
 
Survey of Percona Toolkit
Karwin Software Solutions LLC
 
How to Design Indexes, Really
Karwin Software Solutions LLC
 
MySQL 5.5 Guide to InnoDB Status
Karwin Software Solutions LLC
 
Requirements the Last Bottleneck
Karwin Software Solutions LLC
 
Mentor Your Indexes
Karwin Software Solutions LLC
 
Sql Injection Myths and Fallacies
Karwin Software Solutions LLC
 
Full Text Search In PostgreSQL
Karwin Software Solutions LLC
 

Recently uploaded (20)

PDF
Top Agile Project Management Tools for Teams in 2025
Orangescrum
 
PPTX
AEM User Group: India Chapter Kickoff Meeting
jennaf3
 
PDF
4K Video Downloader Plus Pro Crack for MacOS New Download 2025
bashirkhan333g
 
PPTX
Comprehensive Risk Assessment Module for Smarter Risk Management
EHA Soft Solutions
 
PDF
IDM Crack with Internet Download Manager 6.42 Build 43 with Patch Latest 2025
bashirkhan333g
 
PDF
Open Chain Q2 Steering Committee Meeting - 2025-06-25
Shane Coughlan
 
PDF
Driver Easy Pro 6.1.1 Crack Licensce key 2025 FREE
utfefguu
 
PPTX
In From the Cold: Open Source as Part of Mainstream Software Asset Management
Shane Coughlan
 
PPTX
Foundations of Marketo Engage - Powering Campaigns with Marketo Personalization
bbedford2
 
PPTX
Agentic Automation Journey Series Day 2 – Prompt Engineering for UiPath Agents
klpathrudu
 
PDF
Add Background Images to Charts in IBM SPSS Statistics Version 31.pdf
Version 1 Analytics
 
PPTX
OpenChain @ OSS NA - In From the Cold: Open Source as Part of Mainstream Soft...
Shane Coughlan
 
PPTX
Milwaukee Marketo User Group - Summer Road Trip: Mapping and Personalizing Yo...
bbedford2
 
PDF
Download Canva Pro 2025 PC Crack Full Latest Version
bashirkhan333g
 
PDF
Odoo CRM vs Zoho CRM: Honest Comparison 2025
Odiware Technologies Private Limited
 
PDF
SAP Firmaya İade ABAB Kodları - ABAB ile yazılmıl hazır kod örneği
Salih Küçük
 
PPTX
Customise Your Correlation Table in IBM SPSS Statistics.pptx
Version 1 Analytics
 
PDF
Digger Solo: Semantic search and maps for your local files
seanpedersen96
 
PPTX
ChiSquare Procedure in IBM SPSS Statistics Version 31.pptx
Version 1 Analytics
 
PDF
AI + DevOps = Smart Automation with devseccops.ai.pdf
Devseccops.ai
 
Top Agile Project Management Tools for Teams in 2025
Orangescrum
 
AEM User Group: India Chapter Kickoff Meeting
jennaf3
 
4K Video Downloader Plus Pro Crack for MacOS New Download 2025
bashirkhan333g
 
Comprehensive Risk Assessment Module for Smarter Risk Management
EHA Soft Solutions
 
IDM Crack with Internet Download Manager 6.42 Build 43 with Patch Latest 2025
bashirkhan333g
 
Open Chain Q2 Steering Committee Meeting - 2025-06-25
Shane Coughlan
 
Driver Easy Pro 6.1.1 Crack Licensce key 2025 FREE
utfefguu
 
In From the Cold: Open Source as Part of Mainstream Software Asset Management
Shane Coughlan
 
Foundations of Marketo Engage - Powering Campaigns with Marketo Personalization
bbedford2
 
Agentic Automation Journey Series Day 2 – Prompt Engineering for UiPath Agents
klpathrudu
 
Add Background Images to Charts in IBM SPSS Statistics Version 31.pdf
Version 1 Analytics
 
OpenChain @ OSS NA - In From the Cold: Open Source as Part of Mainstream Soft...
Shane Coughlan
 
Milwaukee Marketo User Group - Summer Road Trip: Mapping and Personalizing Yo...
bbedford2
 
Download Canva Pro 2025 PC Crack Full Latest Version
bashirkhan333g
 
Odoo CRM vs Zoho CRM: Honest Comparison 2025
Odiware Technologies Private Limited
 
SAP Firmaya İade ABAB Kodları - ABAB ile yazılmıl hazır kod örneği
Salih Küçük
 
Customise Your Correlation Table in IBM SPSS Statistics.pptx
Version 1 Analytics
 
Digger Solo: Semantic search and maps for your local files
seanpedersen96
 
ChiSquare Procedure in IBM SPSS Statistics Version 31.pptx
Version 1 Analytics
 
AI + DevOps = Smart Automation with devseccops.ai.pdf
Devseccops.ai
 

Recursive Query Throwdown

  • 1. Recursive Query Throwdown in MySQL 8 BILL KARWIN PERCONA LIVE OPEN SOURCE DATABASE CONFERENCE 2017
  • 2. Bill Karwin Software developer, consultant, trainer Using MySQL since 2000 Senior Database Architect at SchoolMessenger Author of SQL Antipatterns: Avoiding the Pitfalls of Database Programming Oracle ACE Director
  • 3. How to Query a Tree? Hierarchical data § Organization charts § Categories and sub-categories § Parts explosion § Threaded discussions https://commons.wikimedia.org/wiki/File:Staff_Organisation_Diagram,_1896.jpg
  • 5. Adjacency List Example Data comment_id parent_id author comment 1 NULL Fran What’s the cause of this bug? 2 1 Ollie I think it’s a null pointer. 3 2 Fran No, I checked for that. 4 1 Kukla We need to check valid input. 5 4 Ollie Yes, that’s a bug. 6 4 Fran Yes, please add a check 7 6 Kukla That fixed it.
  • 6. Can’t Easily Query Deep Trees SELECT * FROM Comments c1 LEFT JOIN Comments c2 ON (c2.parent_id = c1.comment_id) LEFT JOIN Comments c3 ON (c3.parent_id = c2.comment_id) LEFT JOIN Comments c4 ON (c4.parent_id = c3.comment_id) LEFT JOIN Comments c5 ON (c5.parent_id = c4.comment_id) LEFT JOIN Comments c6 ON (c6.parent_id = c5.comment_id) LEFT JOIN Comments c7 ON (c7.parent_id = c6.comment_id) LEFT JOIN Comments c8 ON (c8.parent_id = c7.comment_id) LEFT JOIN Comments c9 ON (c9.parent_id = c8.comment_id) LEFT JOIN Comments c10 ON (c10.parent_id = c9.comment_id) ...
  • 8. MySQL Workarounds MySQL lacked support for recursive queries, so workarounds were needed These are all denormalized designs, most don’t have referential integrity §Path enumeration §Nested sets §Closure table
  • 9. Path Enumeration Example Data comment_id path author comment 1 1/ Fran What’s the cause of this bug? 2 1/2/ Ollie I think it’s a null pointer. 3 1/2/3/ Fran No, I checked for that. 4 1/4/ Kukla We need to check valid input. 5 1/4/5/ Ollie Yes, that’s a bug. 6 1/4/6/ Fran Yes, please add a check 7 1/4/6/7/ Kukla That fixed it.
  • 10. Path Enumeration Example Queries
      Query ancestors of comment #7:
        SELECT * FROM Comments
        WHERE '1/4/6/7/' LIKE CONCAT(path, '%');
      Query descendants of comment #4:
        SELECT * FROM Comments
        WHERE path LIKE '1/4/%';
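
The two path enumeration queries can be tried without a MySQL server. The following is a minimal sketch, not part of the original slides: it assumes an in-memory SQLite database driven from Python as a stand-in for MySQL, uses SQLite's `||` operator where the slide uses `CONCAT()`, and omits the `comment` column for brevity.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, path TEXT NOT NULL, author TEXT)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?)",
    [
        (1, "1/", "Fran"),
        (2, "1/2/", "Ollie"),
        (3, "1/2/3/", "Fran"),
        (4, "1/4/", "Kukla"),
        (5, "1/4/5/", "Ollie"),
        (6, "1/4/6/", "Fran"),
        (7, "1/4/6/7/", "Kukla"),
    ],
)

# Ancestors of comment #7: every row whose path is a prefix of '1/4/6/7/'.
ancestors = [
    row[0]
    for row in conn.execute(
        "SELECT comment_id FROM Comments "
        "WHERE '1/4/6/7/' LIKE path || '%' ORDER BY comment_id"
    )
]

# Descendants of comment #4: every row whose path starts with '1/4/'.
descendants = [
    row[0]
    for row in conn.execute(
        "SELECT comment_id FROM Comments WHERE path LIKE '1/4/%' ORDER BY comment_id"
    )
]

print(ancestors)    # [1, 4, 6, 7]
print(descendants)  # [4, 5, 6, 7]
```

Both queries include the starting comment itself, because a path is a prefix of itself and `%` also matches the empty string.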
  • 11. Path Enumeration Pros and Cons
      Pros:
      § Single non-recursive query to get a tree or a subtree
      Cons:
      § Complex updates to add or remove a node
      § Numbers are stored in a string, so there is no referential integrity
  • 12. Nested Sets
      Each comment encodes its descendants using two numbers:
      § A comment’s left number is less than all numbers used by the comment’s descendants.
      § A comment’s right number is greater than all numbers used by the comment’s descendants.
      § A comment’s numbers are between all numbers used by the comment’s ancestors.
      References:
      § “Recursive Hierarchies: The Relational Taboo!” Michael J. Kamfonas, Relational Journal, Oct/Nov 1992
      § “Trees and Hierarchies in SQL For Smarties,” Joe Celko, 2004
      § “Managing Hierarchical Data in MySQL,” Mike Hillyer, 2005
  • 14. Nested Sets Example Data
      comment_id | nsleft | nsright | author | comment
      1          | 1      | 14      | Fran   | What’s the cause of this bug?
      2          | 2      | 5       | Ollie  | I think it’s a null pointer.
      3          | 3      | 4       | Fran   | No, I checked for that.
      4          | 6      | 13      | Kukla  | We need to check valid input.
      5          | 7      | 8       | Ollie  | Yes, that’s a bug.
      6          | 9      | 12      | Fran   | Yes, please add a check
      7          | 10     | 11      | Kukla  | That fixed it.
  • 15. Nested Sets Example Queries
      Query ancestors of comment #7:
        SELECT ancestor.*
        FROM Comments child
        JOIN Comments ancestor
          ON child.nsleft BETWEEN ancestor.nsleft AND ancestor.nsright
        WHERE child.comment_id = 7;
      Query subtree under comment #4:
        SELECT descendant.*
        FROM Comments parent
        JOIN Comments descendant
          ON descendant.nsleft BETWEEN parent.nsleft AND parent.nsright
        WHERE parent.comment_id = 4;
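
To see how the `nsleft`/`nsright` ranges select ancestors and descendants, the example data can be loaded into a scratch database. This is an illustrative sketch only: it assumes SQLite (via Python's `sqlite3` module) in place of MySQL and drops the `comment` column.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments ("
    " comment_id INTEGER PRIMARY KEY,"
    " nsleft INTEGER NOT NULL,"
    " nsright INTEGER NOT NULL,"
    " author TEXT)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?, ?)",
    [
        (1, 1, 14, "Fran"),
        (2, 2, 5, "Ollie"),
        (3, 3, 4, "Fran"),
        (4, 6, 13, "Kukla"),
        (5, 7, 8, "Ollie"),
        (6, 9, 12, "Fran"),
        (7, 10, 11, "Kukla"),
    ],
)

# Ancestors of comment #7: rows whose [nsleft, nsright] range encloses 7's nsleft.
ancestors = [
    row[0]
    for row in conn.execute(
        "SELECT ancestor.comment_id FROM Comments child "
        "JOIN Comments ancestor "
        "  ON child.nsleft BETWEEN ancestor.nsleft AND ancestor.nsright "
        "WHERE child.comment_id = 7 ORDER BY ancestor.nsleft"
    )
]

# Subtree under comment #4: rows whose nsleft falls inside 4's range.
subtree = [
    row[0]
    for row in conn.execute(
        "SELECT descendant.comment_id FROM Comments parent "
        "JOIN Comments descendant "
        "  ON descendant.nsleft BETWEEN parent.nsleft AND parent.nsright "
        "WHERE parent.comment_id = 4 ORDER BY descendant.nsleft"
    )
]

print(ancestors)  # [1, 4, 6, 7]
print(subtree)    # [4, 5, 6, 7]
```

Ordering by `nsleft` lists the ancestors from the root downward and the subtree in depth-first order.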
  • 16. Nested Sets Pros and Cons
      Pros:
      § Single non-recursive query to get a tree or a subtree
      Cons:
      § Complex updates to add or remove a node
      § Numbers are not foreign keys, so there is no referential integrity
  • 17. Closure Table
      § Many-to-many table
      § Stores every path from each node to each of its descendants
      § A node even connects to itself
      CREATE TABLE Closure (
        ancestor INT NOT NULL,
        descendant INT NOT NULL,
        length INT NOT NULL,
        PRIMARY KEY (ancestor, descendant),
        FOREIGN KEY(ancestor) REFERENCES Comments(comment_id),
        FOREIGN KEY(descendant) REFERENCES Comments(comment_id)
      );
  • 19. Closure Table Example Data
      comment_id | author | comment
      1          | Fran   | What’s the cause of this bug?
      2          | Ollie  | I think it’s a null pointer.
      3          | Fran   | No, I checked for that.
      4          | Kukla  | We need to check valid input.
      5          | Ollie  | Yes, that’s a bug.
      6          | Fran   | Yes, please add a check
      7          | Kukla  | That fixed it.

      ancestor | descendant | length
      1        | 1          | 0
      1        | 2          | 1
      1        | 3          | 2
      1        | 4          | 1
      1        | 5          | 2
      1        | 6          | 2
      1        | 7          | 3
      2        | 2          | 0
      2        | 3          | 1
      3        | 3          | 0
      4        | 4          | 0
      4        | 5          | 1
      4        | 6          | 1
      4        | 7          | 2
      5        | 5          | 0
      6        | 6          | 0
      6        | 7          | 1
      7        | 7          | 0
  • 20. Closure Table Example Queries
      Query ancestors of comment #7:
        SELECT c.* FROM Comments c
        JOIN Closure t ON (c.comment_id = t.ancestor)
        WHERE t.descendant = 7;
      Query subtree under comment #4:
        SELECT c.* FROM Comments c
        JOIN Closure t ON (c.comment_id = t.descendant)
        WHERE t.ancestor = 4;
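
The closure table queries can be exercised on the example data with the sketch below. Two things here are assumptions rather than content from the slides: SQLite (through Python's `sqlite3`) stands in for MySQL, and the hypothetical helper `add_comment` populates `Closure` by copying the parent's ancestor rows with `length + 1`, which is one common way to maintain such a table when a leaf is added.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, author TEXT);
    CREATE TABLE Closure (
        ancestor   INTEGER NOT NULL REFERENCES Comments(comment_id),
        descendant INTEGER NOT NULL REFERENCES Comments(comment_id),
        length     INTEGER NOT NULL,
        PRIMARY KEY (ancestor, descendant)
    );
    """
)


def add_comment(comment_id, parent_id, author):
    """Hypothetical helper: insert a leaf comment and its closure rows."""
    conn.execute("INSERT INTO Comments VALUES (?, ?)", (comment_id, author))
    # Every node connects to itself with length 0.
    conn.execute("INSERT INTO Closure VALUES (?, ?, 0)", (comment_id, comment_id))
    if parent_id is not None:
        # Each ancestor of the parent (including the parent) is also an
        # ancestor of the new node, one step further away.
        conn.execute(
            "INSERT INTO Closure (ancestor, descendant, length) "
            "SELECT ancestor, ?, length + 1 FROM Closure WHERE descendant = ?",
            (comment_id, parent_id),
        )


for cid, parent, author in [
    (1, None, "Fran"), (2, 1, "Ollie"), (3, 2, "Fran"), (4, 1, "Kukla"),
    (5, 4, "Ollie"), (6, 4, "Fran"), (7, 6, "Kukla"),
]:
    add_comment(cid, parent, author)

closure_rows = conn.execute("SELECT COUNT(*) FROM Closure").fetchone()[0]

# Ancestors of comment #7, root first (longest path first).
ancestors = [
    row[0]
    for row in conn.execute(
        "SELECT c.comment_id FROM Comments c "
        "JOIN Closure t ON (c.comment_id = t.ancestor) "
        "WHERE t.descendant = 7 ORDER BY t.length DESC"
    )
]

# Subtree under comment #4.
subtree = [
    row[0]
    for row in conn.execute(
        "SELECT c.comment_id FROM Comments c "
        "JOIN Closure t ON (c.comment_id = t.descendant) "
        "WHERE t.ancestor = 4 ORDER BY c.comment_id"
    )
]

print(closure_rows)  # 18, matching the example data
print(ancestors)     # [1, 4, 6, 7]
print(subtree)       # [4, 5, 6, 7]
```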
  • 21. Closure Table Pros and Cons
      Pros:
      § Single non-recursive query to get a tree or a subtree
      § Referential integrity!
      Cons:
      § Extra table is required
      § Hierarchy is stored redundantly, too easy to mess up
      § Lots of joins to do most kinds of queries
  • 23. WITHer Recursive Queries in MySQL?
      SQL vendors gradually implemented SQL-99 WITH syntax:
      § IBM DB2 UDB 8 (Dec. 2002)
      § Microsoft SQL Server 2005 (Oct. 2005)
      § Sybase SQL Anywhere 11 (Aug. 2008)
      § Firebird 2.1 (Sep. 2008)
      § PostgreSQL 8.4 (Jul. 2009)
      § Oracle 11g release 2 (Sep. 2009)
      § Teradata (date and version of support unknown, at least 2009)
      § HSQLDB 2.3 (Jul. 2013)
      § SQLite 3.8.3.1 (Feb. 2014)
      § H2 (date and version unknown)
      https://www.percona.com/blog/2014/02/11/wither-recursive-queries/
  • 24. ANSI SQL Recursive Common Table Expression
      WITH RECURSIVE cte_name (col_name, col_name, col_name) AS
      (
        subquery base case
        UNION ALL
        subquery referencing cte_name
      )
      SELECT ... FROM cte_name ...
      https://dev.mysql.com/doc/refman/8.0/en/with.html
  • 25. Generating a Series of Numbers
      WITH RECURSIVE MySeries (n) AS
      (
        SELECT 1 AS n
        UNION ALL
        SELECT 1+n FROM MySeries WHERE n < 10
      )
      SELECT * FROM MySeries;
      +------+
      | n    |
      +------+
      |    1 |
      |    2 |
      |    3 |
      |    4 |
      |    5 |
      |    6 |
      |    7 |
      |    8 |
      |    9 |
      |   10 |
      +------+
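
The series query shows the two parts of a recursive CTE: the base case produces the first row, and the recursive case is re-run against the rows produced so far until its `WHERE` clause yields nothing. As a small illustration (assuming SQLite via Python's `sqlite3` rather than MySQL 8), the same statement can be run like this:

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8 (an assumption of this sketch).
conn = sqlite3.connect(":memory:")

series = [
    row[0]
    for row in conn.execute(
        """
        WITH RECURSIVE MySeries (n) AS
        (
          SELECT 1 AS n                           -- base case: the first row
          UNION ALL
          SELECT 1+n FROM MySeries WHERE n < 10   -- recursive case: stops once n reaches 10
        )
        SELECT n FROM MySeries
        """
    )
]

print(series)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

The row with `n = 10` is still emitted, because the condition `n < 10` is checked on the previous row (9) before `1+n` is computed; the recursion ends when that row itself fails the test.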
  • 26. Generating a Series of Dates
      WITH RECURSIVE MyDates (d) AS
      (
        SELECT CURRENT_DATE() AS d
        UNION ALL
        SELECT d + INTERVAL 1 DAY FROM MyDates
        WHERE d < CURRENT_DATE() + INTERVAL 7 DAY
      )
      SELECT * FROM MyDates;
      +------------+
      | d          |
      +------------+
      | 2017-04-24 |
      | 2017-04-25 |
      | 2017-04-26 |
      | 2017-04-27 |
      | 2017-04-28 |
      | 2017-04-29 |
      | 2017-04-30 |
      | 2017-05-01 |
      +------------+
  • 27. Query ancestors of comment #7
      WITH RECURSIVE CommentTree
        (comment_id, parent_id, author, comment, depth) AS
      (
        SELECT comment_id, parent_id, author, comment, 0 AS depth
        FROM Comments
        WHERE comment_id = 7
        UNION ALL
        SELECT c.comment_id, c.parent_id, c.author, c.comment, ct.depth+1
        FROM CommentTree ct
        JOIN Comments c ON (ct.parent_id = c.comment_id)
      )
      SELECT * FROM CommentTree;
  • 29. Query subtree under comment #4
      WITH RECURSIVE CommentTree
        (comment_id, parent_id, author, comment, depth) AS
      (
        SELECT comment_id, parent_id, author, comment, 0 AS depth
        FROM Comments
        WHERE comment_id = 4
        UNION ALL
        SELECT c.comment_id, c.parent_id, c.author, c.comment, ct.depth+1
        FROM CommentTree ct
        JOIN Comments c ON (ct.comment_id = c.parent_id)
      )
      SELECT * FROM CommentTree;
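
The ancestor and subtree queries differ only in the direction of the join between `CommentTree` and `Comments`. The sketch below runs both against the adjacency list example data. It is an illustration under stated assumptions: SQLite (via Python's `sqlite3`) replaces MySQL 8, the `comment` column is omitted, and an `ORDER BY` is added so the printed order is deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8 (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments ("
    " comment_id INTEGER PRIMARY KEY,"
    " parent_id INTEGER REFERENCES Comments(comment_id),"
    " author TEXT)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?)",
    [
        (1, None, "Fran"), (2, 1, "Ollie"), (3, 2, "Fran"), (4, 1, "Kukla"),
        (5, 4, "Ollie"), (6, 4, "Fran"), (7, 6, "Kukla"),
    ],
)

# Walk upward: join each found row's parent_id to the parent's comment_id.
ancestors = conn.execute(
    """
    WITH RECURSIVE CommentTree (comment_id, parent_id, author, depth) AS
    (
      SELECT comment_id, parent_id, author, 0 FROM Comments WHERE comment_id = ?
      UNION ALL
      SELECT c.comment_id, c.parent_id, c.author, ct.depth + 1
      FROM CommentTree AS ct
      JOIN Comments AS c ON (ct.parent_id = c.comment_id)
    )
    SELECT comment_id, depth FROM CommentTree ORDER BY depth
    """,
    (7,),
).fetchall()

# Walk downward: join each found row's comment_id to its children's parent_id.
subtree = conn.execute(
    """
    WITH RECURSIVE CommentTree (comment_id, parent_id, author, depth) AS
    (
      SELECT comment_id, parent_id, author, 0 FROM Comments WHERE comment_id = ?
      UNION ALL
      SELECT c.comment_id, c.parent_id, c.author, ct.depth + 1
      FROM CommentTree AS ct
      JOIN Comments AS c ON (ct.comment_id = c.parent_id)
    )
    SELECT comment_id, depth FROM CommentTree ORDER BY depth, comment_id
    """,
    (4,),
).fetchall()

print(ancestors)  # [(7, 0), (6, 1), (4, 2), (1, 3)]
print(subtree)    # [(4, 0), (5, 1), (6, 1), (7, 2)]
```

In both results `depth` counts steps away from the starting comment, so it grows toward the root in the first query and toward the leaves in the second.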
  • 30. Recursive CTE Pros and Cons
      Pros:
      § ANSI SQL-99 Standard
      § Compatible with other SQL implementations
      § Works with Adjacency List (single source of authority)
      § Referential integrity!
      Cons:
      § Not compatible with earlier MySQL versions
      § Use of materialized temporary tables may cause performance problems
  • 31. MySQL CTE Implementation: 💯
      Thanks to @MarkusWinand for his preview analysis based on 8.0.1-dmr
      http://modern-sql.com/feature/with
  • 33. ITIS: Sample Hierarchical Data
      Integrated Taxonomic Information System (https://www.itis.gov/)
      § Biological database of species of animals, plants, fungi
      § One big tree of 544,954 nodes
      § Data comes in adjacency list & path enumeration format
      § I converted to closure table for query tests
  • 34. ITIS Data Model
      mysql> select * from longnames where completename = 'Eschscholzia californica';
      +--------+---------------------------+
      | tsn    | completename              |
      +--------+---------------------------+
      |  18956 | Eschscholzia californica  |
      +--------+---------------------------+

      mysql> select * from hierarchy where TSN = '18956'\G
                   TSN: 18956
            Parent_TSN: 18954
                 level: 11
         ChildrenCount: 8
      hierarchy_string: 202422-954898-846494-954900-846496-846504-18063-846547-18409-18880-18954-18956
  • 35. Indexes mysql> ALTER TABLE hierarchy ADD KEY (tsn, parent_tsn); Query OK, 0 rows affected (1.30 sec)
  • 36. Breadcrumbs Query
      WITH RECURSIVE taxonomy AS
      (
        SELECT base.tsn, base.parent_tsn, 0 as depth
        FROM hierarchy base
        WHERE tsn = '18956'
        UNION ALL
        SELECT next.tsn, next.parent_tsn, t.depth+1
        FROM hierarchy next
        JOIN taxonomy t
        WHERE t.parent_tsn = next.tsn
      )
      SELECT *
      FROM taxonomy
      JOIN longnames USING (tsn)
      ORDER BY depth DESC;
  • 37. Breadcrumbs Query Result
      +--------+------------+-------+--------------------------+
      | tsn    | parent_tsn | depth | completename             |
      +--------+------------+-------+--------------------------+
      | 202422 |          0 |    11 | Plantae                  |
      | 954898 |     202422 |    10 | Viridiplantae            |
      | 846494 |     954898 |     9 | Streptophyta             |
      | 954900 |     846494 |     8 | Embryophyta              |
      | 846496 |     954900 |     7 | Tracheophyta             |
      | 846504 |     846496 |     6 | Spermatophytina          |
      |  18063 |     846504 |     5 | Magnoliopsida            |
      | 846547 |      18063 |     4 | Ranunculanae             |
      |  18409 |     846547 |     3 | Ranunculales             |
      |  18880 |      18409 |     2 | Papaveraceae             |
      |  18954 |      18880 |     1 | Eschscholzia             |
      |  18956 |      18954 |     0 | Eschscholzia californica |
      +--------+------------+-------+--------------------------+
      12 rows in set (0.00 sec)
  • 38. Breadcrumbs Query EXPLAIN Plan
      § New note in Extra: "Recursive"
      § Using index (covering index) for both base case and recursive case
      § I can eliminate the filesort if I allow natural order (base case first)
      § No "Using Temporary"? Not so fast…
      +----+-------------+------------+--------+---------------+---------+---------+--------------+------+----------+-----------------------------+
      | id | select_type | table      | type   | possible_keys | key     | key_len | ref          | rows | filtered | Extra                       |
      +----+-------------+------------+--------+---------------+---------+---------+--------------+------+----------+-----------------------------+
      |  1 | PRIMARY     | <derived2> | ALL    | NULL          | NULL    | NULL    | NULL         |    4 |   100.00 | Using where; Using filesort |
      |  1 | PRIMARY     | longnames  | eq_ref | PRIMARY,tsn   | PRIMARY | 4       | taxonomy.tsn |    1 |   100.00 | NULL                        |
      |  2 | DERIVED     | base       | ref    | TSN           | TSN     | 4       | const        |    1 |   100.00 | Using index                 |
      |  3 | UNION       | t          | ALL    | NULL          | NULL    | NULL    | NULL         |    2 |   100.00 | Recursive; Using where      |
      |  3 | UNION       | next       | ref    | TSN           | TSN     | 4       | t.parent_tsn |    1 |   100.00 | Using index                 |
      +----+-------------+------------+--------+---------------+---------+---------+--------------+------+----------+-----------------------------+
  • 39. Breadcrumbs Query Performance
      mysql> SELECT * FROM SYS.STATEMENTS_WITH_TEMP_TABLES\G
                         query: WITH RECURSIVE `taxonomy` AS ( ... `tsn` ) ORDER BY `depth` DESC
                            db: itis
                    exec_count: 1
                 total_latency: 10.05 ms
             memory_tmp_tables: 1
               disk_tmp_tables: 0
      avg_tmp_tables_per_query: 1
        tmp_tables_to_disk_pct: 0
                    first_seen: 2017-04-24 22:07:56
                     last_seen: 2017-04-24 22:07:56
                        digest: 8438633360bedce178823bb868589fd0
  • 40. Breadcrumbs Query Stages
      mysql> SELECT * FROM SYS.USER_SUMMARY_BY_STAGES;
      +------+--------------------------------+-------+---------------+-------------+
      | user | event_name                     | total | total_latency | avg_latency |
      +------+--------------------------------+-------+---------------+-------------+
      | root | stage/sql/System lock          |    40 | 6.62 ms       | 165.60 us   |
      | root | stage/sql/Opening tables       |   191 | 3.16 ms       | 16.52 us    |
      | root | stage/sql/checking permissions |    45 | 1.50 ms       | 33.44 us    |
      | root | stage/sql/Creating sort index  |     1 | 239.63 us     | 239.63 us   |
      | root | stage/sql/closing tables       |   191 | 191.03 us     | 1.00 us     |
      | root | stage/sql/starting             |     2 | 188.44 us     | 94.22 us    |
      | root | stage/sql/Sending data         |     6 | 138.96 us     | 23.16 us    |
      | root | stage/sql/statistics           |     4 | 122.42 us     | 30.60 us    |
      | root | stage/sql/query end            |   191 | 56.67 us      | 296.00 ns   |
      | root | stage/sql/preparing            |     4 | 33.57 us      | 8.39 us     |
      | root | stage/sql/freeing items        |     2 | 27.93 us      | 13.96 us    |
      | root | stage/sql/optimizing           |     5 | 20.03 us      | 4.01 us     |
      | root | stage/sql/executing            |     7 | 15.39 us      | 2.20 us     |
      | root | stage/sql/removing tmp table   |     4 | 9.35 us       | 2.34 us     |
      | root | stage/sql/init                 |     3 | 8.76 us       | 2.92 us     |
      | root | stage/sql/Sorting result       |     2 | 4.16 us       | 2.08 us     |
      | root | stage/sql/end                  |     3 | 1.93 us       | 644.00 ns   |
      | root | stage/sql/cleaning up          |     2 | 1.43 us       | 715.00 ns   |
      +------+--------------------------------+-------+---------------+-------------+
  • 41. Tree Expansion Query Result See Demo
  • 42. Tree Expansion Query
      WITH RECURSIVE ancestors (tsn, parent_tsn) AS (
        SELECT h.tsn, h.parent_tsn
        FROM hierarchy AS h
        WHERE h.tsn = %s
        UNION ALL
        SELECT h.tsn, h.parent_tsn
        FROM hierarchy AS h
        JOIN ancestors AS base ON h.tsn = base.parent_tsn
      ),
      breadcrumbs (tsn, parent_tsn, depth, breadcrumbs) AS (
        SELECT h.tsn, h.parent_tsn, 0 AS depth,
          CAST(LPAD(h.tsn, 8, '0') AS CHAR(255)) AS breadcrumbs
        FROM hierarchy AS h
        WHERE h.parent_tsn = 0
        UNION ALL
        SELECT h.tsn, h.parent_tsn, base.depth+1 AS depth,
          CONCAT(base.breadcrumbs, ',', LPAD(h.tsn, 8, '0'))
        FROM hierarchy AS h
        JOIN ancestors AS a ON h.tsn = a.tsn
        JOIN breadcrumbs AS base ON h.parent_tsn = base.tsn
      )
      SELECT l.tsn, l.completename, b.depth, b.breadcrumbs
      FROM breadcrumbs AS b
      JOIN longnames AS l ON b.tsn = l.tsn
      UNION
      SELECT l.tsn, l.completename, b.depth+1,
        CONCAT(b.breadcrumbs, ',', LPAD(h.tsn, 8, '0'))
      FROM breadcrumbs AS b
      JOIN hierarchy AS h ON b.tsn = h.parent_tsn
      JOIN longnames AS l ON l.tsn = h.tsn
      ORDER BY breadcrumbs
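
The `breadcrumbs` column in the query above is a comma-separated path of zero-padded IDs, built up one level per recursion step, and `ORDER BY breadcrumbs` then sorts rows into depth-first tree order. The sketch below shows just that technique on the small Comments example instead of the ITIS data. It is an illustration under assumptions: SQLite (via Python's `sqlite3`) stands in for MySQL, so `printf('%08d', ...)` replaces `LPAD(..., 8, '0')` and `||` replaces `CONCAT()`, and the column is named `crumbs` here.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8 (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, parent_id INTEGER, author TEXT)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?)",
    [
        (1, None, "Fran"), (2, 1, "Ollie"), (3, 2, "Fran"), (4, 1, "Kukla"),
        (5, 4, "Ollie"), (6, 4, "Fran"), (7, 6, "Kukla"),
    ],
)

rows = conn.execute(
    """
    WITH RECURSIVE breadcrumbs (comment_id, depth, crumbs) AS
    (
      -- base case: the root starts the path with its own zero-padded ID
      SELECT comment_id, 0, printf('%08d', comment_id)
      FROM Comments
      WHERE parent_id IS NULL
      UNION ALL
      -- recursive case: append the child's padded ID to the parent's path
      SELECT c.comment_id, b.depth + 1,
             b.crumbs || ',' || printf('%08d', c.comment_id)
      FROM Comments AS c
      JOIN breadcrumbs AS b ON c.parent_id = b.comment_id
    )
    SELECT comment_id, depth, crumbs FROM breadcrumbs ORDER BY crumbs
    """
).fetchall()

for comment_id, depth, crumbs in rows:
    # Indent by depth to display the thread as a tree.
    print("  " * depth + str(comment_id), crumbs)
```

Zero-padding matters because the sort is a string comparison: without it, an ID such as 10 would sort before 9. The slide's `CAST(... AS CHAR(255))` in the base case appears intended to make the column wide enough for the longer strings the recursive part produces, since MySQL takes a recursive CTE's column types from the non-recursive part.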
  • 43. Tree Expansion Query EXPLAIN
      -------------+------------+--------+-------------+---------+-------------------+--------+----------+--------------------------------
      select_type  | table      | type   | key         | key_len | ref               | rows   | filtered | Extra
      -------------+------------+--------+-------------+---------+-------------------+--------+----------+--------------------------------
      PRIMARY      | <derived2> | ALL    | NULL        | NULL    | NULL              | 250230 |   100.00 | Using where
      PRIMARY      | l          | eq_ref | PRIMARY     | 4       | b.tsn             |      1 |   100.00 | NULL
      DERIVED      | h          | index  | TSN         | 9       | NULL              | 500466 |    10.00 | Using where; Using index
      UNION        | base       | ALL    | NULL        | NULL    | NULL              |  50046 |   100.00 | Recursive; Using where
      UNION        | <derived4> | ALL    | NULL        | NULL    | NULL              |      4 |   100.00 | Using where; Using join buffer
      UNION        | h          | ref    | TSN         | 9       | a.tsn,base.tsn    |      1 |   100.00 | Using index
      DERIVED      | h          | ref    | TSN         | 4       | const             |      1 |   100.00 | Using index
      UNION        | base       | ALL    | NULL        | NULL    | NULL              |      2 |   100.00 | Recursive; Using where
      UNION        | h          | ref    | TSN         | 4       | base.parent_tsn   |      1 |   100.00 | Using index
      UNION        | h          | index  | TSN         | 9       | NULL              | 500466 |   100.00 | Using where; Using index
      UNION        | l          | eq_ref | PRIMARY     | 4       | itis.h.TSN        |      1 |   100.00 | NULL
      UNION        | <derived2> | ref    | <auto_key0> | 5       | itis.h.Parent_TSN |     10 |   100.00 | NULL
      UNION RESULT | <union1,8> | ALL    | NULL        | NULL    | NULL              |   NULL |     NULL | Using temporary; Using filesort
      -------------+------------+--------+-------------+---------+-------------------+--------+----------+--------------------------------
      Maybe I need more indexes? Unfortunately I ran out of time to analyze.
  • 44. Tree Expansion Query Performance
      mysql> SELECT * FROM SYS.STATEMENTS_WITH_TEMP_TABLES\G
                         query: WITH RECURSIVE `ancestors` ( ` ... `l` . `completename` , `b` .
                            db: itis
                    exec_count: 1
                 total_latency: 1.24 s
             memory_tmp_tables: 3
               disk_tmp_tables: 0
      avg_tmp_tables_per_query: 3
        tmp_tables_to_disk_pct: 0
                    first_seen: 2017-04-27 01:33:14
                     last_seen: 2017-04-27 01:33:14
                        digest: 86c1417d2ff3679863db754eff425e94
  • 45. Tree Expansion Query Stages
      mysql> SELECT * FROM SYS.USER_SUMMARY_BY_STAGES;
      +------+--------------------------------+-------+---------------+-------------+
      | user | event_name                     | total | total_latency | avg_latency |
      +------+--------------------------------+-------+---------------+-------------+
      | root | stage/sql/Sending data         |    12 | 979.42 ms     | 81.62 ms    |
      | root | stage/sql/System lock          |    40 | 6.34 ms       | 158.52 us   |
      | root | stage/sql/Opening tables       |   191 | 3.34 ms       | 17.51 us    |
      | root | stage/sql/checking permissions |    53 | 1.35 ms       | 25.45 us    |
      | root | stage/sql/starting             |     2 | 356.31 us     | 178.16 us   |
      | root | stage/sql/statistics           |    12 | 271.01 us     | 22.58 us    |
      | root | stage/sql/closing tables       |   191 | 179.15 us     | 937.00 ns   |
      | root | stage/sql/preparing            |    12 | 98.18 us      | 8.18 us     |
      | root | stage/sql/query end            |   191 | 57.60 us      | 301.00 ns   |
      | root | stage/sql/freeing items        |     2 | 47.93 us      | 23.96 us    |
      | root | stage/sql/Creating sort index  |     1 | 37.38 us      | 37.38 us    |
      | root | stage/sql/optimizing           |    13 | 30.60 us      | 2.35 us     |
      | root | stage/sql/executing            |    13 | 30.27 us      | 2.33 us     |
      | root | stage/sql/removing tmp table   |    14 | 24.44 us      | 1.74 us     |
      | root | stage/sql/init                 |     3 | 14.78 us      | 4.93 us     |
      | root | stage/sql/cleaning up          |     2 | 11.66 us      | 5.83 us     |
      | root | stage/sql/Sorting result       |     2 | 3.67 us       | 1.84 us     |
      | root | stage/sql/end                  |     3 | 3.04 us       | 1.01 us     |
      +------+--------------------------------+-------+---------------+-------------+
  • 47. Conclusions
      § Overall, MySQL 8 support for recursive CTE queries is worth the wait.
      § Exotic cases exist that are beyond any optimizer.
      § I'm excited to upgrade to MySQL 8.0.x ASAP!
      § Now that virtually all major SQL brands support recursive CTEs, we need developer tools and popular apps to use them!
  • 48. License and Copyright
      Copyright 2017 Bill Karwin
      http://www.slideshare.net/billkarwin
      Released under a Creative Commons 3.0 License:
      http://creativecommons.org/licenses/by-nc-nd/3.0/
      You are free to share (to copy, distribute, and transmit this work) under the following conditions:
      § Attribution. You must attribute this work to Bill Karwin.
      § Noncommercial. You may not use this work for commercial purposes.
      § No Derivative Works. You may not alter, transform, or build upon this work.