http://kylin.io
Apache Kylin
Extreme OLAP Engine
Seshu Adunuthula
Director, Analytics Platform, eBay | sadunuthula@ebay.com
Agenda
 What’s Apache Kylin?
 Features
 Performance
 Roadmap
 Q & A
Extreme OLAP Engine for Big Data
Kylin is an open source distributed analytics engine from eBay
that provides a SQL interface and multi-dimensional analysis
(OLAP) on Hadoop, supporting extremely large datasets
What’s Kylin
kylin / ˈkiːˈlɪn / 麒麟
--n. (in Chinese art) a mythical animal of composite form
• Open Sourced on Oct 1st, 2014
• Accepted as an Apache Incubator project on Nov 25th, 2014
Business Needs for Big Data Analysis
 Sub-second query latency on billions of rows
 ANSI SQL for both analysts and engineers
 Full OLAP capability to offer advanced functionality
 Seamless Integration with BI Tools
 Support of high cardinality and high dimensions
 High concurrency – thousands of end users
 Distributed, scale-out architecture for large data volumes
Technical Challenges
 Huge volume data
 Table scan
 Big table joins
 Data shuffling
 Analysis on different granularity
 Runtime aggregation expensive
 Map Reduce job
 Batch processing
OLAP Cube – Balance between Space and Time
0-D (apex) cuboid: no dimensions (grand total)
1-D cuboids: (time); (item); (location); (supplier)
2-D cuboids: (time, item); (time, location); (time, supplier); (item, location); (item, supplier); (location, supplier)
3-D cuboids: (time, item, location); (time, item, supplier); (time, location, supplier); (item, location, supplier)
4-D (base) cuboid: (time, item, location, supplier)
• Base vs. aggregate cells; ancestor vs. descendant cells; parent vs. child cells
1. (9/15, milk, Urbana, Dairy_land) - <time, item, location, supplier>
2. (9/15, milk, Urbana, *) - <time, item, location>
3. (*, milk, Urbana, *) - <item, location>
4. (*, milk, Chicago, *) - <item, location>
5. (*, milk, *, *) - <item>
• Cuboid = one combination of dimensions
• Cube = all combinations of dimensions (all cuboids)
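The cuboid lattice above can be enumerated mechanically. The following is a minimal sketch, not Kylin code: the helper name `enumerate_cuboids` is hypothetical, and the dimension names are taken from the slide. It shows why the number of cuboids doubles with every added dimension (2^n for n dimensions).

```python
from itertools import combinations


def enumerate_cuboids(dimensions):
    """Group every subset of the dimensions (every cuboid) by its size."""
    lattice = {}
    for k in range(len(dimensions) + 1):
        # All k-dimensional cuboids, each as a tuple of dimension names.
        lattice[k] = list(combinations(dimensions, k))
    return lattice


dims = ["time", "item", "location", "supplier"]
lattice = enumerate_cuboids(dims)
total = sum(len(cuboids) for cuboids in lattice.values())

for k, cuboids in lattice.items():
    print(f"{k}-D: {len(cuboids)} cuboid(s)")
print("total cuboids:", total)  # 2 ** 4 = 16
```

With four dimensions this gives 1 + 4 + 6 + 4 + 1 = 16 cuboids, which is the space side of the space/time balance in the slide title.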
Kylin Architecture Overview
[Architecture diagram]
Clients: 3rd Party App (Web App, Mobile…); SQL-Based Tool (BI Tools: Tableau…)
Interfaces: REST API, JDBC/ODBC
Kylin components: REST Server, Query Engine, Routing, Metadata, Cube Build Engine (MapReduce…)
Data: Hadoop Hive (Star Schema Data); OLAP Cube in HBase (Key Value Data)
Query paths: Low Latency - Seconds; Mid Latency - Minutes
Legend: Online Analysis Data Flow; Offline Data Flow
 Clients/Users interact with Kylin via SQL
 OLAP Cube is transparent to users
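Since clients reach Kylin through SQL over the REST API or JDBC/ODBC, a client call can be sketched as follows. This is only an illustration under stated assumptions: the `/kylin/api/query` path, port 7070, the JSON field names, and HTTP Basic authentication are assumptions about a typical Kylin deployment, and the project, table, and credentials are hypothetical. The request is built but not sent.

```python
import base64
import json
import urllib.request


def build_query_request(base_url, project, sql, user, password, limit=100):
    """Build (but do not send) an HTTP POST carrying a SQL query.

    Endpoint path and payload fields are assumptions, not taken from the slides.
    """
    payload = json.dumps(
        {"sql": sql, "project": project, "offset": 0, "limit": limit}
    ).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/kylin/api/query",
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Hypothetical host, project, table, and credentials.
req = build_query_request(
    "http://localhost:7070",
    "demo_project",
    "SELECT item, SUM(price) FROM sales GROUP BY item",
    "user",
    "secret",
)
print(req.get_method(), req.full_url)
```

The caller writes ordinary SQL against the star schema tables; whether the answer comes from a pre-built cube is decided on the server side, which is what "transparent to users" means here.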
How Does Kylin Utilize Hadoop Components?
 Hive
   Input source
   Pre-joins the star schema during cube building
 MapReduce
   Pre-aggregates metrics during cube building
 HDFS
   Stores intermediate files during cube building
 HBase
   Stores the data cube
   Serves queries on the data cube
   Coprocessor is used for query processing
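The division of labour above (pre-aggregate during the build, then serve lookups from a key-value store) can be illustrated with a toy in-memory model. This is a sketch only: the functions `build_cube` and `query`, the sample rows, and the key layout are hypothetical and do not reflect Kylin's actual MapReduce jobs or HBase row-key encoding.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("time", "item", "location")


def build_cube(rows, measure="price"):
    """Pre-aggregate SUM(measure) for every cuboid, stored as key -> value.

    Key = (cuboid dimensions, dimension values), loosely mimicking a
    key-value store. A real build would run as distributed batch jobs.
    """
    cube = defaultdict(float)
    for k in range(len(DIMENSIONS) + 1):
        for cuboid in combinations(DIMENSIONS, k):
            for row in rows:
                key = (cuboid, tuple(row[d] for d in cuboid))
                cube[key] += row[measure]
    return dict(cube)


def query(cube, **filters):
    """Answer SUM(measure) with equality filters via a single key lookup."""
    cuboid = tuple(d for d in DIMENSIONS if d in filters)
    return cube.get((cuboid, tuple(filters[d] for d in cuboid)), 0.0)


rows = [
    {"time": "9/15", "item": "milk", "location": "Urbana", "price": 2.0},
    {"time": "9/15", "item": "milk", "location": "Chicago", "price": 3.0},
    {"time": "9/16", "item": "bread", "location": "Urbana", "price": 1.5},
]
cube = build_cube(rows)

print(query(cube, item="milk"))                     # 5.0
print(query(cube, item="milk", location="Urbana"))  # 2.0
print(query(cube))                                  # 6.5 (apex cuboid)
```

Query time no longer depends on the number of raw rows, only on a key lookup; the cost has moved to build time and storage.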
Features Highlights
 Extremely Fast OLAP Engine at Scale
Kylin is designed to reduce query latency on Hadoop for 10+ billion rows of data
 ANSI SQL Interface on Hadoop
Kylin offers ANSI SQL on Hadoop and supports most ANSI SQL query functions
 Seamless Integration with BI Tools
Kylin currently offers integration with BI tools like Tableau.
 Interactive Query Capability
Users can interact with Hadoop data via Kylin at sub-second latency, better than Hive
queries for the same dataset
 MOLAP Cube
Users can define a data model and pre-build it in Kylin with more than 10 billion raw
data records
Features Highlights…
 Compression and Encoding Support
 Incremental Refresh of Cubes
 Approximate Query Capability for Distinct Count (HyperLogLog)
 Leverage HBase Coprocessor to reduce query latency
 Job Management and Monitoring
 Easy Web interface to manage, build, monitor and query cubes
 Security capability to set ACL at Cube/Project Level
 Support LDAP Integration
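To make the approximate distinct count item concrete, here is a simplified HyperLogLog sketch. It is an illustration of the general technique, not Kylin's implementation: the class, its parameter `p`, and the choice of SHA-1 as the hash are assumptions made for this example.

```python
import hashlib
import math


class HyperLogLog:
    """Simplified HyperLogLog with 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        # Standard bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, value):
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")          # 64-bit hash
        index = h >> (64 - self.p)                     # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)          # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def merge(self, other):
        """Union of two sketches: element-wise maximum of the registers."""
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def count(self):
        harmonic = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / harmonic
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            estimate = self.m * math.log(self.m / zeros)
        return int(round(estimate))


hll = HyperLogLog()
for i in range(50000):
    hll.add(f"user-{i}")
    hll.add(f"user-{i}")  # duplicates do not change the sketch
print("estimated distinct users:", hll.count())
```

Because two sketches merge by taking the register-wise maximum, a distinct count can be stored per cube cell and combined across cells or segments, which is presumably why this kind of estimator fits pre-aggregated cubes where exact distinct counts cannot simply be added together.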
Cube Designer
Job Management
Query and Visualization
Tableau Integration
Kylin vs. Hive
#  | Query Type             | Return Dataset | Query On Kylin (s) | Query On Hive (s) | Comments
1  | High Level Aggregation | 4              | 0.129              | 157.437           | 1,217 times
2  | Analysis Query         | 22,669         | 1.615              | 109.206           | 68 times
3  | Drill Down to Detail   | 325,029        | 12.058             | 113.123           | 9 times
4  | Drill Down to Detail   | 524,780        | 22.42              | 6383.21           | 278 times
5  | Data Dump              | 972,002        | 49.054             | N/A               |
[Chart: Hive vs. Kylin for SQL #1, SQL #2, SQL #3; axis range 0 to 200]
Query types: High Level Aggregation, Analysis Query, Drill Down to Detail, Low Level Aggregation, Transaction Level
Based on 12+B records case
Performance - Query Latency
90th percentile of queries < 5s
Green Line: 90th percentile of queries
Gray Line: 95th percentile of queries
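For readers unfamiliar with percentile latency figures, the sketch below shows one common way to compute them (the nearest-rank method). The helper `percentile` and the sample latencies are made up for illustration; the slides do not say which method or data the chart used.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical query latencies in seconds.
latencies = [0.2, 0.4, 0.5, 0.8, 1.1, 1.3, 2.0, 3.5, 4.8, 12.0]
p90 = percentile(latencies, 90)
p95 = percentile(latencies, 95)
print("p90:", p90)  # 4.8
print("p95:", p95)  # 12.0
```

In this made-up sample, 90% of queries finish within 4.8 seconds even though the slowest query takes 12 seconds, which is why the 90th and 95th percentile lines are tracked separately.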
Kylin Evolution Roadmap
Years: 2013, 2014, 2015
Timeline markers: Sep, 2013; Jan, 2014; Sep, 2014; Q1, 2015
Initial Prototype for MOLAP
• Basic end to end POC
MOLAP
• Incremental Refresh
• ANSI SQL
• ODBC Driver
• Web GUI
• ACL
• Open Source
HOLAP
• Streaming OLAP
• JDBC Driver
• New UI
• Excel Support
• … more
Next Gen
• Automation
• Capacity Management
• In-Memory Analysis (TBD)
• Spark (TBD)
• … more
Future…
• TBD
Open Source
 Kylin Site: http://kylin.io
 Twitter: @ApacheKylin
 GitHub: apache/incubator-kylin
 WeChat (微信): ApacheKylin

HBaseCon 2015: Apache Kylin - Extreme OLAP Engine for Hadoop