Cost-based Query Optimization in Apache Phoenix using Apache Calcite
Maryann Xue (Intel)
Julian Hyde (Hortonworks)
Hadoop Summit, San Jose
June 2016
• @maryannxue: Apache Phoenix PMC member, Intel
• @julianhyde: Apache Calcite VP, Hortonworks
What is Apache Phoenix?
• A relational database layer for Apache HBase
– Query engine
• Transforms SQL queries into native HBase API calls
• Pushes as much work as possible onto the cluster for parallel execution
– Metadata repository
• Typed access to data stored in HBase tables
– Transaction support
– Table Statistics
– A JDBC driver
Advanced Features
• Secondary indexes
• Strong SQL standard compliance
• Windowed aggregates
• Connectivity (e.g. remote JDBC driver, ODBC driver)
Adding these features created architectural pain… so we decided to do it right!
Example 1: Optimizing Secondary Indexes
CREATE TABLE Emps(empId INT PRIMARY KEY, name VARCHAR(100));
CREATE INDEX I_Emps_Name ON Emps(name);
How we match secondary indexes in Phoenix 4.8:
• Q1: SELECT * FROM Emps ORDER BY name → I_Emps_Name
• Q2: SELECT * FROM Emps WHERE empId > 100 → Emps
What about both?
• Q3: SELECT * FROM Emps WHERE empId > 100 ORDER BY name → ?
We need to make a cost-based decision! Statistics can help.
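To make the Q3 trade-off concrete, here is a minimal sketch of a cost comparison between the two access paths. It is a toy model, not Phoenix or Calcite code: the function names, the row counts, the selectivity values, and the n·log2(n) sort cost are all assumptions chosen for illustration.

```python
import math


def sort_cost(rows: float) -> float:
    """Rough comparison-sort cost: n * log2(n), zero for fewer than two rows."""
    return rows * math.log2(rows) if rows > 1 else 0.0


def cost_via_data_table(total_rows: int, selectivity: float) -> float:
    # Range scan on the primary key reads only the matching rows,
    # which must then be sorted by name.
    matching = total_rows * selectivity
    return matching + sort_cost(matching)


def cost_via_index(total_rows: int, selectivity: float) -> float:
    # Full scan of the index returns rows already ordered by name;
    # the empId filter is checked on every row and no sort is needed.
    return float(total_rows)


def choose_plan(total_rows: int, selectivity: float) -> str:
    """Pick the cheaper access path for Q3 under this toy model."""
    data = cost_via_data_table(total_rows, selectivity)
    index = cost_via_index(total_rows, selectivity)
    return "Emps" if data <= index else "I_Emps_Name"
```

Under these assumptions a highly selective filter favours the data table (`choose_plan(1_000_000, 0.001)`), while a filter that keeps most rows favours the index (`choose_plan(1_000_000, 0.9)`). The selectivity is exactly the kind of number that table statistics would supply.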
Phoenix + Calcite
• Both are Apache projects
• Involves changes to both projects
• Work is being done on a branch of Phoenix, with changes to Calcite as needed
• Goals:
– Remove code! (Use Calcite’s SQL parser, validator)
– Improve planning (Faster planning, faster queries)
– Improve SQL compliance
– Some “free” SQL features (e.g. WITH, scalar subquery, FILTER)
– Close to full compatibility with current Phoenix SQL and APIs
• Status: beta, expected GA: late 2016
Current Phoenix Architecture
[Diagram: Parser, Algebra, and Runtime layers, with the Phoenix Schema and HBase Data alongside; the output of compilation is a Query Plan]
• Stage 1: ParseNode tree
• Stage 2: Normalization, secondary index rewrite
• Stage 3: Expression tree
Calcite Architecture
[Diagram: Parser and Algebra on top of three separate Data + Engine back ends]
• Schema SPI
• Operators, Rules, Statistics, Cost model
Phoenix + Calcite Architecture
[Diagram: Parser and Algebra produce a Query Plan for the Phoenix Runtime over HBase Data; JDBC data and other data sources are optional]
• Phoenix Schema
• Logical + Phoenix Operators, Builtin + Phoenix Rules, Phoenix Statistics, Phoenix Cost model
Cost-based Query Optimizer with Apache Calcite
• Base all query optimization decisions on cost
– Filter push down; range scan vs. skip scan
– Hash aggregate vs. stream aggregate vs. partial stream aggregate
– Sort optimized out; sort/limit push through; fwd/rev/unordered scan
– Hash join vs. merge join; join ordering
– Use of data table vs. index table
– All of the above (and many others) COMBINED
• Query optimizations are modeled as pluggable rules
Calcite Algebra
SELECT products.name, COUNT(*)
FROM sales
JOIN products USING (productId)
WHERE sales.discount IS NOT NULL
GROUP BY products.name
ORDER BY COUNT(*) DESC
Translate SQL to relational algebra:
sort
  aggregate
    filter
      join
        scan [sales]
        scan [products]
Example 2: FilterIntoJoinRule
SELECT products.name, COUNT(*)
FROM sales
JOIN products USING (productId)
WHERE sales.discount IS NOT NULL
GROUP BY products.name
ORDER BY COUNT(*) DESC
Translate SQL to relational algebra:
sort
  aggregate
    filter
      join
        scan [sales]
        scan [products]
After FilterIntoJoinRule:
sort
  aggregate
    join’
      filter’
        scan [sales]
      scan [products]
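The rewrite above can be sketched as a small tree transformation. This is an illustrative toy, not Calcite's `FilterIntoJoinRule`: the `Scan`, `Filter`, and `Join` classes are hypothetical, the predicate is an opaque string tagged with the one table it refers to, and an inner join is assumed (pushing a filter below an outer join is not always valid).

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    input: "Node"
    table: str        # the single table the predicate refers to
    predicate: str


@dataclass(frozen=True)
class Join:
    left: "Node"
    right: "Node"
    key: str


Node = Union[Scan, Filter, Join]


def tables(node: Node) -> set:
    """Return the set of table names scanned beneath a node."""
    if isinstance(node, Scan):
        return {node.table}
    if isinstance(node, Filter):
        return tables(node.input)
    return tables(node.left) | tables(node.right)


def filter_into_join(node: Node) -> Node:
    """Match Filter(Join(l, r)); push the filter to the side it refers to."""
    if isinstance(node, Filter) and isinstance(node.input, Join):
        join = node.input
        if node.table in tables(join.left):
            pushed = Filter(join.left, node.table, node.predicate)
            return Join(pushed, join.right, join.key)
        if node.table in tables(join.right):
            pushed = Filter(join.right, node.table, node.predicate)
            return Join(join.left, pushed, join.key)
    return node  # the rule does not match; the plan is returned unchanged


before = Filter(Join(Scan("sales"), Scan("products"), "productId"),
                "sales", "discount IS NOT NULL")
after = filter_into_join(before)
```

Here `after` is a join whose left input is the filtered `sales` scan, so fewer rows reach the join. In a cost-based planner both `before` and `after` would stay available as alternatives rather than one replacing the other.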
Example 3: Phoenix Joins
• Hash join vs. Sort merge join
– Hash join good for: either input is small
– Sort merge join good for: both inputs are big
– Hash join downside: potential OOM
– Sort merge join downside: extra sorting required sometimes
• Better to exploit the sortedness of join input
• Better to exploit the sortedness of join output
Example 3: Calcite Algebra
SELECT empid, e.name, d.name, location
FROM emps AS e
JOIN depts AS d USING (deptno)
ORDER BY d.deptno
Translate SQL to relational algebra:
project
  sort
    join
      scan [emps]
      scan [depts]
Example 3: Plan Candidates
Candidate 1: hash-join (*also what the standalone Phoenix compiler would generate)
project
  sort
    hash-join
      scan [emps]
      scan [depts]
Candidate 2: merge-join (Win)
project
  merge-join
    sort
      scan [emps]
    scan [depts]
• SortRemoveRule: scan [depts] is already sorted on [deptno], so that input needs no sort
• SortRemoveRule: the merge-join output is sorted on [e.deptno], so the final sort is removed
1. Very little difference in all other operators: project, scan, hash-join or merge-join
2. Candidate 1 would sort “emps join depts”, while candidate 2 would only sort “emps”
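A rough way to see why Candidate 2 wins is to compare only the sort work, since the other operators differ very little. The sketch below is a toy model under stated assumptions: every emps row joins to exactly one depts row, sort cost grows with rows × log2(rows) × row width, and the byte widths are invented. The risk of running out of memory in a hash join is not modelled.

```python
import math


def sort_cost(rows: int, row_bytes: int) -> float:
    """Toy sort cost: n * log2(n) comparisons, each moving row_bytes bytes."""
    return rows * math.log2(rows) * row_bytes if rows > 1 else 0.0


def scan_cost(emps_rows: int, emp_bytes: int, depts_rows: int, dept_bytes: int) -> float:
    """Bytes read by scanning both tables; identical for both candidates."""
    return emps_rows * emp_bytes + depts_rows * dept_bytes


def candidate_hash_join(emps_rows, emp_bytes, depts_rows, dept_bytes) -> float:
    # Assumption: the join emits one row per emps row, and each joined row
    # is as wide as an emps row plus a depts row. The whole result is sorted.
    joined_bytes = emp_bytes + dept_bytes
    return (scan_cost(emps_rows, emp_bytes, depts_rows, dept_bytes)
            + sort_cost(emps_rows, joined_bytes))


def candidate_merge_join(emps_rows, emp_bytes, depts_rows, dept_bytes) -> float:
    # depts arrives sorted on deptno, so only the narrower emps rows are sorted;
    # the merge-join output is already ordered, so no final sort is needed.
    return (scan_cost(emps_rows, emp_bytes, depts_rows, dept_bytes)
            + sort_cost(emps_rows, emp_bytes))


hash_cost = candidate_hash_join(1_000_000, 50, 100, 30)
merge_cost = candidate_merge_join(1_000_000, 50, 100, 30)
```

With these made-up numbers `merge_cost` is lower than `hash_cost`, because the sort handles narrower rows. A real cost model would also weigh memory, network transfer of the hash cache, and I/O.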
Example 3: Improved Plan
Old vs. New
Old: scan ‘depts’ → send ‘depts’ over to RS & build hash-cache → scan ‘emps’ hash-join ‘depts’ → sort joined table on ‘e.deptno’
New: scan ‘emps’ → sort by ‘deptno’ → merge-join ‘emps’ and ‘depts’ (other input: scan ‘depts’)
1. Exploited the sortedness of join input
2. Exploited the sortedness of join output
(and now, a brief look at Calcite)
Apache Calcite
• Apache top-level project since October, 2015
• Query planning framework
– Relational algebra, rewrite rules
– Cost model & statistics
– Federation via adapters
– Extensible
• Packaging
– Library
– Optional SQL parser, JDBC server
– Community-authored rules, adapters
Embedded: Apache Drill, Apache Hive, Apache Kylin, Apache Phoenix*, Cascading Lingual
Adapters: Apache Cassandra*, Apache Spark, CSV, In-memory, JDBC, JSON, MongoDB, Splunk, Web tables
Streaming: Apache Flink*, Apache Samza, Apache Storm
Apache Calcite Avatica
• Database connectivity stack
• Self-contained sub-project of Calcite
• Fast, open, stable
• Powers Phoenix Query Server
Calcite – APIs and SPIs
Cost, statistics
RelOptCost
RelOptCostFactory
RelMetadataProvider
• RelMdColumnUniqueness
• RelMdDistinctRowCount
• RelMdSelectivity
SQL parser
SqlNode
SqlParser
SqlValidator
Transformation rules
RelOptRule
• MergeFilterRule
• PushAggregateThroughUnionRule
• 100+ more
Global transformations
• Unification (materialized view)
• Column trimming
• De-correlation
Relational algebra
RelNode (operator)
• TableScan
• Filter
• Project
• Union
• Aggregate
• …
RelDataType (type)
RexNode (expression)
RelTrait (physical property)
• RelConvention (calling-convention)
• RelCollation (sortedness)
• TBD (bucketedness/distribution)
JDBC driver (Avatica)
Metadata
Schema
Table
Function
• TableFunction
• TableMacro
Lattice
Calcite Planning Process
SQL → parse tree (SqlNode) → Sql-to-Rel Converter (SqlNode → RelNode + RexNode) → Logical Plan → Planner (RelNode Graph, Rule Match Queue) → Best RelNode Graph → Translate to runtime
1. Plan Graph
• Node for each node in the input plan
• Each node is a Set of alternate sub-plans
• Set further divided into Subsets, based on traits like sortedness
2. Rules
• Rule: specifies an operator sub-graph to match and logic to generate an equivalent ‘better’ sub-graph
• New and original sub-graph both remain in contention
3. Cost Model
• RelNodes have Cost & Cumulative Cost
4. Metadata Providers
• Used to plug in schema, cost formulas
• Filter selectivity
• Join selectivity
• NDV calculations
Planner loop:
• Add rule matches to queue
• Apply rule match transformations to plan graph
• Iterate for fixed iterations or until cost doesn’t change
• Match importance based on cost of RelNode and height
Based on “Volcano” & “Cascades” papers [G. Graefe]
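The loop described above can be illustrated with a deliberately tiny planner. This is a sketch of the general idea only and not Calcite's planner: plans are nested tuples, the two rules and the cost numbers are hypothetical, there are no trait subsets or match importance, and the search simply runs until no rule produces a new plan.

```python
from collections import deque

# Plans are nested tuples: ("scan", table, is_sorted) or (operator, child).
ROWS = 1000


def cost(plan) -> int:
    """Cumulative cost: the node's own cost plus the cost of its input."""
    op = plan[0]
    if op == "scan":
        return ROWS
    if op == "sort":
        return 10 * ROWS + cost(plan[1])
    return ROWS + cost(plan[1])  # any other single-input operator


def use_index(plan):
    # Hypothetical rule: an unsorted table scan may become a sorted index scan.
    if plan == ("scan", "Emps", False):
        yield ("scan", "I_Emps_Name", True)


def remove_sort(plan):
    # Hypothetical rule: a sort directly over an already-sorted scan is redundant.
    if plan[0] == "sort" and plan[1][0] == "scan" and plan[1][2]:
        yield plan[1]


def apply_anywhere(rule, plan):
    """Yield every plan obtained by applying `rule` at exactly one node."""
    yield from rule(plan)
    if plan[0] != "scan":
        for child in apply_anywhere(rule, plan[1]):
            yield (plan[0], child)


def optimize(root, rules):
    alternatives = {root}  # original and rewritten plans stay in contention
    queue = deque((rule, root) for rule in rules)  # the rule match queue
    while queue:
        rule, plan = queue.popleft()
        for new_plan in apply_anywhere(rule, plan):
            if new_plan not in alternatives:
                alternatives.add(new_plan)
                queue.extend((r, new_plan) for r in rules)
    return min(alternatives, key=cost)


best = optimize(("project", ("sort", ("scan", "Emps", False))),
                [use_index, remove_sort])
```

Neither rule alone improves the plan: `use_index` by itself leaves the expensive sort in place. Only because the intermediate plan is kept as an alternative can `remove_sort` fire afterwards, which is the reason the original and rewritten sub-graphs both remain in contention.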
Views and materialized views
• A view is a named relational expression, stored in the catalog, that is expanded while planning a query.
• A materialized view is an equivalence, stored in the catalog, between a table and a relational expression. The planner substitutes the table into queries where it will help, even if the queries do not reference the materialized view.
Query using a view
SELECT deptno, min(salary)
FROM Managers
WHERE age >= 50
GROUP BY deptno

CREATE VIEW Managers AS
SELECT *
FROM Emps
WHERE EXISTS (
  SELECT *
  FROM Emps AS underling
  WHERE underling.manager = Emps.id)

Query plan:
Aggregate [deptno, min(salary)]
  Filter [age >= 50]
    Scan [Managers]  (view scan to be expanded)
View definition:
Project [$0, $1, $2, $3]
  Join [$0, $5]
    Scan [Emps]
    Aggregate [manager]
      Scan [Emps]
After view expansion
(Same query and view definition as on the previous slide.)
Aggregate [deptno, min(salary)]
  Filter [age >= 50]  (can be pushed down)
    Project [$0, $1, $2, $3]
      Join [$0, $5]
        Scan [Emps]
        Aggregate [manager]
          Scan [Emps]
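View expansion can be checked by hand: a query against the view should return the same rows as the same query with the view body inlined. The sketch below uses Python's built-in `sqlite3` module as a stand-in engine; the column list of `Emps` and the sample rows are assumptions, and the outer table is given the explicit alias `e` so the correlated reference is unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emps (id INTEGER PRIMARY KEY, manager INTEGER, age INTEGER,
                       deptno INTEGER, salary INTEGER);
    INSERT INTO Emps VALUES
        (1, NULL, 55, 10, 300), (2, 1, 52, 10, 200), (3, 2, 30, 10, 100),
        (4, 1, 60, 20, 250), (5, 4, 25, 20, 80);
    -- A manager is any employee who has at least one underling.
    CREATE VIEW Managers AS
        SELECT * FROM Emps AS e
        WHERE EXISTS (SELECT * FROM Emps AS underling
                      WHERE underling.manager = e.id);
""")

# The query as the user writes it, against the view.
via_view = conn.execute("""
    SELECT deptno, MIN(salary) FROM Managers
    WHERE age >= 50
    GROUP BY deptno ORDER BY deptno
""").fetchall()

# The same query after expanding the view by hand into its definition.
expanded = conn.execute("""
    SELECT deptno, MIN(salary)
    FROM (SELECT * FROM Emps AS e
          WHERE EXISTS (SELECT * FROM Emps AS underling
                        WHERE underling.manager = e.id)) AS m
    WHERE age >= 50
    GROUP BY deptno ORDER BY deptno
""").fetchall()
```

Once the view is expanded, the `age >= 50` filter sits directly above the view body, which is what lets a planner consider pushing it further down.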
Materialized view
CREATE MATERIALIZED VIEW EmpSummary AS
SELECT deptno,
  gender,
  COUNT(*) AS c,
  SUM(sal) AS s
FROM Emps
GROUP BY deptno, gender

Materialized view:
Scan [EmpSummary]
=
Aggregate [deptno, gender, COUNT(*), SUM(sal)]
  Scan [Emps]

SELECT COUNT(*)
FROM Emps
WHERE deptno = 10
AND gender = 'M'

Query plan:
Aggregate [COUNT(*)]
  Filter [deptno = 10 AND gender = 'M']
    Scan [Emps]
Materialized view, step 2: Rewrite query to match
(Same materialized view and query as on the previous slide.)
Materialized view:
Scan [EmpSummary]
=
Aggregate [deptno, gender, COUNT(*), SUM(sal)]
  Scan [Emps]
Rewritten query plan:
Project [c]
  Filter [deptno = 10 AND gender = 'M']
    Aggregate [deptno, gender, COUNT(*) AS c, SUM(sal) AS s]
      Scan [Emps]
Materialized view, step 3: Substitute table
(Same materialized view and query as on the previous slides.)
Materialized view:
Scan [EmpSummary]
=
Aggregate [deptno, gender, COUNT(*), SUM(sal)]
  Scan [Emps]
Query plan after substitution:
Project [c]
  Filter [deptno = 10 AND gender = 'M']
    Scan [EmpSummary]
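The substitution can be tried end to end with Python's built-in `sqlite3` module. This is only a sketch: SQLite has no materialized views, so an ordinary table created from the summary query stands in for `EmpSummary`, the `empno` column and the sample rows are invented, and the rewrite is done by hand rather than by a planner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emps (empno INTEGER PRIMARY KEY, deptno INTEGER,
                       gender TEXT, sal INTEGER);
    INSERT INTO Emps VALUES
        (1, 10, 'M', 100), (2, 10, 'M', 150), (3, 10, 'F', 120),
        (4, 20, 'M', 90),  (5, 20, 'F', 200);
    -- Stand-in for the materialized view: a table holding the summary rows.
    CREATE TABLE EmpSummary AS
        SELECT deptno, gender, COUNT(*) AS c, SUM(sal) AS s
        FROM Emps
        GROUP BY deptno, gender;
""")

# Original query: scans and filters the detail table.
original = conn.execute(
    "SELECT COUNT(*) FROM Emps WHERE deptno = 10 AND gender = 'M'"
).fetchone()[0]

# Rewritten query: Project [c] over Filter over Scan [EmpSummary].
rewritten = conn.execute(
    "SELECT c FROM EmpSummary WHERE deptno = 10 AND gender = 'M'"
).fetchone()[0]
```

Both queries return the same count, but the rewritten one reads a single summary row. One caveat of this hand-written version: when no rows match, the original query returns 0 while the rewritten query returns no row at all, so the two forms are only interchangeable here because the group exists.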
(and now, back to Phoenix)
Example 1, Revisited: Secondary Index
Optimizer internally creates a mapping (query, table) equivalent to:
CREATE MATERIALIZED VIEW I_Emp_Deptno AS
SELECT deptno, empno, name
FROM Emps
ORDER BY deptno

Scan [I_Emp_Deptno]
=
Sort [deptno, empno, name]
  Project [deptno, empno, name]
    Scan [Emps]
Query plan on the data table:
Sort [deptno]
  Project [deptno, name]
    Filter [deptno BETWEEN 100 and 150]
      Scan [Emps]
Query plan on the index:
Project [deptno, name]
  Filter [deptno BETWEEN 100 and 150]
    Scan [I_Emp_Deptno]
[Figure annotations next to the plan nodes: 1,000; 1,000; 200; 1600; 1,000; 1,000; 200. Very simple cost based on row-count.]
Beyond Phoenix 4.8 with Apache Calcite
• Get the missing SQL support
– WITH, UNNEST, Scalar subquery, etc.
• Materialized views
– To allow other forms of indices (maybe defined as external), e.g., a filter view, a join view, or an aggregate view.
• Interop with other Calcite adapters
– Already used by Drill, Hive, Kylin, Samza, etc.
– Supports any JDBC source
– Initial version of Drill-Phoenix integration already working
Drillix: Interoperability with Apache Drill
SELECT deptno, sum(salary) FROM emps GROUP BY deptno
Drill Aggregate [deptno, sum(salary)]  (Stage 3: Final aggregation)
  Drill Shuffle [deptno]  (Stage 2: Shuffle partial results)
    Phoenix Aggregate [deptno, sum(salary)]  (Stage 1: Local partial aggregation)
      Phoenix TableScan [emps]  (Phoenix Tables on HBase)
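The three stages can be mimicked in a few lines of plain Python. This is a sketch of two-phase aggregation in general, not of Drill or Phoenix internals: the function names, the sample rows, and the modulo-based routing of `deptno` to workers are assumptions made for the example.

```python
from collections import defaultdict


def partial_aggregate(rows):
    """Stage 1: sum salaries per deptno over one local batch of rows."""
    sums = defaultdict(int)
    for deptno, salary in rows:
        sums[deptno] += salary
    return dict(sums)


def shuffle(partials, num_workers):
    """Stage 2: route every (deptno, partial sum) to a worker chosen by deptno."""
    buckets = [[] for _ in range(num_workers)]
    for partial in partials:
        for deptno, subtotal in partial.items():
            buckets[deptno % num_workers].append((deptno, subtotal))
    return buckets


def final_aggregate(buckets):
    """Stage 3: each worker adds up the partial sums it received."""
    result = {}
    for bucket in buckets:
        # A deptno lands in exactly one bucket, so updates never collide.
        result.update(partial_aggregate(bucket))
    return result


regions = [
    [(10, 100), (15, 50), (10, 25)],  # rows held by one region
    [(10, 75), (30, 40)],             # rows held by another region
]
partials = [partial_aggregate(rows) for rows in regions]
totals = final_aggregate(shuffle(partials, num_workers=2))
```

The point of the split is that only one partial sum per `deptno` per region crosses the network in stage 2, instead of every row. This works because SUM can be computed from partial sums.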
Thank you! Questions?
@maryannxue
@julianhyde
http://phoenix.apache.org
http://calcite.apache.org
