BIGGER DATA ON GPUS:
APPROACHES, CHALLENGES, SUCCESSES
Jake Wheat
Arnon Shimoni
INTRODUCING SQREAM DB
GPU-ACCELERATED DATA WAREHOUSE
• Queries: 100x faster
• Cost: 10% of resources
• Analyze: 20x more data
ECOSYSTEM
FAST TO GET LOTS OF DATA IN
• Use GPU for loading
• 900 GB/s Memory Bandwidth
• Compress all the data
• Collect metadata
FAST TO GET LOTS OF DATA OUT
• Access with easy-to-use SQL
• Support standards like ODBC and JDBC
• 900 GB/s Memory Bandwidth for SQL operations
• Access raw data directly, without cubes, indexes
• SQream DB reads less data from disk, with compression
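One way collected metadata can reduce the data read is by skipping chunks that cannot match a filter. The sketch below is a minimal illustration in Python, assuming the metadata is a per-chunk minimum and maximum; the slides do not say what metadata SQream DB collects, and `build_metadata` and `scan_greater_than` are hypothetical names.

```python
def build_metadata(chunks):
    """Collect (min, max) for each chunk at load time (assumed metadata)."""
    return [(min(chunk), max(chunk)) for chunk in chunks]


def scan_greater_than(chunks, metadata, threshold):
    """Evaluate `value > threshold`, reading only chunks that may match.

    Returns the matching values and the number of chunks actually read.
    """
    matches, chunks_read = [], 0
    for chunk, (_lowest, highest) in zip(chunks, metadata):
        if highest <= threshold:
            continue  # no value in this chunk can satisfy the predicate
        chunks_read += 1
        matches.extend(v for v in chunk if v > threshold)
    return matches, chunks_read


chunks = [[1, 4, 2], [10, 12, 11], [5, 7, 6]]
metadata = build_metadata(chunks)
matches, chunks_read = scan_greater_than(chunks, metadata, 6)
print(matches, chunks_read)
```

Here the first chunk is never read, because its maximum already rules it out.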
GPUS FOR DATA PROCESSING
ARE GPUS INTERESTING FOR RUNNING SQL?
• Can they run SQL?
• Can they run SQL faster?
– If qualified yes, in what situations?
• Are there other issues to consider?
CAN GPUS RUN SQL?
Example SQL | Physical Operator | Implementation
select a+b, c * 5 from t | select (a.k.a. project/extend/rename) | thrust::transform
select a, count(*), sum(b), avg(b) from t group by a | stream aggregate | thrust::reduce_by_key
select a, b from t where a > 0.5 | filter | thrust::remove_if
select distinct a from t | stream distinct | thrust::unique
select a, b, c, d from t order by a,b | sort | thrust::sort
select * from t union all select * from u | union all | -
select * from t inner join u using (a) | sort merge join (smj) | simple implementation: thrust::upper_bound, thrust::lower_bound, unnest, gather
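To make the stream aggregate mapping concrete, here is a minimal CPU-side sketch in Python. It assumes the input is already sorted on the group key, which is what a stream aggregate requires. The `reduce_by_key` function is a hypothetical stand-in that mimics the shape of thrust::reduce_by_key (reducing consecutive runs of equal keys); it is not the Thrust API.

```python
from itertools import groupby
from operator import itemgetter


def reduce_by_key(keys, values):
    """Reduce each consecutive run of equal keys to (key, count, sum, avg)."""
    out = []
    for key, run in groupby(zip(keys, values), key=itemgetter(0)):
        vals = [v for _, v in run]
        out.append((key, len(vals), sum(vals), sum(vals) / len(vals)))
    return out


# select a, count(*), sum(b), avg(b) from t group by a
t = [(2, 10.0), (1, 4.0), (2, 30.0), (1, 6.0), (3, 5.0)]
t_sorted = sorted(t, key=itemgetter(0))  # sort first, then aggregate runs
a = [row[0] for row in t_sorted]
b = [row[1] for row in t_sorted]
result = reduce_by_key(a, b)
print(result)
```

Because only adjacent rows are compared, the aggregate needs no hash table, which is part of why it pairs naturally with a sort.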
MARKETING HURDLES
• PCI-bottleneck means it will never work
• Columnar databases can't do joins
• GPUs can't accelerate SQL operations
• No-one will put a GPU in a server
• GPUs are not actually faster than CPUs
• A startup cannot make a production ready SQL DBMS
OTHER ISSUES
• Can you make a convincing demo?
• Can you turn it into a real product?
• Can you put GPUs in a data centre?
• Are GPUs a safe bet in the medium/long term?
INSPIRATION
EARLY RESEARCH
• MonetDB/X100 talk
youtu.be/yrLd-3lnZ58
• Relational Joins on Graphics Processors
www.cse.ust.hk/catalac/papers/gpujoin_sigmod08.pdf
• Relational Query Co-Processing on Graphics Processors
dl.acm.org/citation.cfm?id=1620588
• Several Daniel Abadi papers
www.cs.umd.edu/~abadi/
THE EARLY SQREAM DB PROTOTYPES
• Original brief: OpenCL + Erlang + Haskell streaming IoT = World Domination!
• Generate thrust at query time
• SQL server plugin
• A real (but simple) DBMS with storage
OUR FIRST DBMS
• Run on data on disk
• Create and drop table
• Insert, insert select (and truncate)
• A wide range of queries:
e.g. select lists, joins, where, aggregates, order by, distinct
• Lots of external algorithms
WHY NOT POSTGRES?
Some downsides to Postgres
• No columnar - engine and storage
• No threads, not distributed
• A big complex system
Some non-benefits:
• Parsing, syntax, and similar - Haskell makes this easy
• The storage and execution engine – very row based
Some things we miss:
• Wide range of features, data types, operations
• Extensibility
• Cost based optimiser
• Protocol/client compatibility
STEPS TOWARDS TODAY'S PRODUCT
[Diagram components]
• Haskell Compiler: Parse SQL, Desugar to Relational Algebra, Optimize, Desugar to Statement Plan
• Network Server
• Runtime
• Metadata Database
• Columnar Storage
• Tree Interpreter
• Building Blocks
• I/O Task Runner
SQREAM DB ARCHITECTURE
[Diagram components]
• Statement Compiler: SQL Parser, Desugar & Optimize, Relational Algebra, Desugar & Optimize, Low-level stages
• Execution Engine: Statement Tree Interpreter, Task Runners (I/O, CPU, GPU)
• Storage Layer: Metadata Database + Low-level transactions (server or in-process), Bulk Data Layer (Extent, Extent, Extent, …), Storage Reorganizer
• Other components shown: Tasks Queue & Thread Manager, Profiling Support, Memory Managers, Building blocks (shown three times), Connection & Session Manager, Concurrency & Admission Control, Desugar & Optimize, Small Memory Managers, Chunk Memory Managers, Spool Memory Managers, Linux FS Cache Prodder
SOME ARCHITECTURE DETAILS
• Haskell has the intelligence
• C++/CUDA does the heavy lifting
• Message passing, worker pools
• Bulk data memory centric
• Storage is append-only with background reorganization
STORAGE AND TRANSACTIONS
• Metadata database with relatively conventional transactions
• Append-only storage layer with background reorganization
Transactions
• Serializable, with any kind of statement
• Run multiple queries concurrently with anything
• Run multiple inserts to the same table at the same time
• Cannot run multiple statements in a single transaction
• Other operations such as delete, truncate, and DDL use coarse-grained exclusive locking
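As a rough illustration of append-only storage with background reorganization, the following Python sketch keeps a table as a list of immutable extents. Inserts only ever add a new extent, and a reorganization pass merges small extents without changing what a scan returns. The `AppendOnlyTable` class, its methods, and the `min_rows` parameter are hypothetical; the slides do not describe SQream DB's actual extent format or reorganization policy.

```python
class AppendOnlyTable:
    """Toy append-only table: a list of immutable extents (tuples of rows)."""

    def __init__(self):
        self.extents = []

    def insert(self, rows):
        # An insert never modifies existing extents; it appends a new one.
        self.extents.append(tuple(rows))

    def scan(self):
        return [row for extent in self.extents for row in extent]

    def reorganize(self, min_rows):
        """Merge consecutive small extents; the table contents are unchanged."""
        merged, pending = [], []
        for extent in self.extents:
            pending.extend(extent)
            if len(pending) >= min_rows:
                merged.append(tuple(pending))
                pending = []
        if pending:
            merged.append(tuple(pending))
        self.extents = merged


table = AppendOnlyTable()
for batch in ([1], [2, 3], [4], [5]):
    table.insert(batch)
before = table.scan()
table.reorganize(min_rows=3)
after = table.scan()
print(table.extents)
```

Many small inserts leave many small extents, so a background merge of this kind keeps later scans working on larger units.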
USING GPUS EFFECTIVELY
• Good kernels
• Optimise around GPU memory
• Use large chunks, rechunk where necessary
• Avoid PCI transfers where possible
• Profiling
• Partitioning
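The "use large chunks, rechunk where necessary" point can be sketched as follows: small, unevenly sized chunks (for example the output of a selective filter) are regrouped into chunks of a fixed target size before the next stage. This is a minimal Python illustration; `rechunk` and `target_rows` are hypothetical names, and real chunk sizes would be chosen around GPU memory rather than a handful of rows.

```python
def rechunk(chunks, target_rows):
    """Regroup row lists into chunks of target_rows rows each.

    Only the final chunk may be shorter than target_rows.
    """
    buffer = []
    for chunk in chunks:
        buffer.extend(chunk)
        while len(buffer) >= target_rows:
            yield buffer[:target_rows]
            buffer = buffer[target_rows:]
    if buffer:
        yield buffer


small_chunks = [[1, 2], [3], [4, 5, 6], [7]]
large_chunks = list(rechunk(small_chunks, target_rows=4))
print(large_chunks)
```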
VECTORED BINARY SEARCH
[Diagram: values from Table A looked up in Table B using vectored binary search]
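A minimal CPU-side sketch of the idea in Python, assuming Table B's join column is sorted: every key from Table A gets a lower-bound and an upper-bound binary search into Table B, the resulting ranges are unnested into index pairs, and the pairs are used to gather the joined rows. On a GPU the searches would run as one data-parallel operation; here they are plain list comprehensions over the standard `bisect` module, and `sort_merge_join` is a hypothetical name.

```python
from bisect import bisect_left, bisect_right


def sort_merge_join(left_keys, right_keys):
    """Return (left_index, right_index) pairs with equal keys.

    right_keys must be sorted in ascending order.
    """
    # One lower-bound and one upper-bound search per left key.
    lower = [bisect_left(right_keys, k) for k in left_keys]
    upper = [bisect_right(right_keys, k) for k in left_keys]
    # Unnest: expand each [lower, upper) range into individual index pairs.
    pairs = []
    for i, (lo, hi) in enumerate(zip(lower, upper)):
        pairs.extend((i, j) for j in range(lo, hi))
    return pairs


a_keys = [0, 2, 3, 5]
b_keys = [0, 0, 1, 2, 2, 3]            # sorted join column of Table B
b_payload = ["p", "q", "r", "s", "t", "u"]

pairs = sort_merge_join(a_keys, b_keys)
# Gather: fetch the joined rows through the index pairs.
joined = [(a_keys[i], b_payload[j]) for i, j in pairs]
print(joined)
```

Keys with no match (5 in this example) produce an empty range and so contribute no rows, which gives inner-join semantics.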
HASH JOINS
• Can hashing run fast on the GPU?
• Answer from NVIDIA experts:
– in principle probably yes
– in practice, difficult to compete with sort-based algorithms
COMPRESSION
• GPU compression for typical columnar data
– e.g. Dictionary, RLE, Delta, PFOR + combinations
– Helps speed up IO and PCI transfer times
– In-house code
• CPU compression for general data
– Helps speed up IO, but not PCI transfer times
– We use things like Snappy and LZ4
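For readers unfamiliar with the columnar schemes named above, here is a minimal Python sketch of dictionary, RLE, and delta encoding, plus one combination (delta followed by RLE). These are textbook versions written for illustration, not SQream DB's in-house code, and all function names are hypothetical.

```python
from itertools import accumulate


def dictionary_encode(values):
    """Replace each value with a small integer code into a dictionary."""
    dictionary, codes, index = [], [], {}
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index[v])
    return dictionary, codes


def rle_encode(values):
    """Run-length encode into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def rle_decode(runs):
    return [v for v, n in runs for _ in range(n)]


def delta_encode(values):
    """Store the first value, then the difference to the previous value."""
    return [v - p for v, p in zip(values, [0] + values[:-1])]


def delta_decode(deltas):
    return list(accumulate(deltas))


countries = ["IL", "US", "IL", "IL"]
dictionary, codes = dictionary_encode(countries)

timestamps = [10, 20, 30, 40, 50]
combo = rle_encode(delta_encode(timestamps))  # delta, then RLE
restored = delta_decode(rle_decode(combo))
print(dictionary, codes, combo)
```

The combination shows why these schemes suit typical columnar data: a regularly increasing column collapses to a single run after delta encoding.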
SOME FINAL THOUGHTS
• SQL analytics and GPUs are a natural fit
• GPUs can be very effective for big data/external algorithms
• Lots of exciting work being done in non-SQL analytics (not just on GPUs)
• Haskell is a big positive
• Building a commercial SQL DBMS is very difficult
• Building a SQL DBMS is a really satisfying thing to do
HIGH THROUGHPUT, CONVERGED
• SQream DB is designed for high-throughput devices
• IBM Power Systems is the only NVLink CPU-to-GPU enabled architecture,
unlocking the potential of high-throughput accelerated computing
• The IBM AC922, with POWER9 and NVLink, can transfer data at up to 300GB/s, almost 9.5x faster than the PCIe 3.0 found in x86-based architectures, reducing classic I/O bottlenecks
[Diagram: two IBM Power 9 CPUs, each with 2x NVIDIA Tesla V100]
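The roughly 9.5x figure can be checked with some arithmetic. The sketch below assumes the comparison is against a PCIe 3.0 x16 link counted in both directions, since the 300GB/s NVLink figure is bidirectional; the slides do not state the lane count. The PCIe 3.0 signalling rate (8 GT/s per lane) and its 128b/130b encoding are standard figures.

```python
# PCIe 3.0 signalling: 8 GT/s per lane with 128b/130b line encoding.
GT_PER_S_PER_LANE = 8.0
ENCODING_EFFICIENCY = 128 / 130
LANES = 16  # assumption: a x16 slot, as typically used for GPUs

# Divide by 8 to convert gigabits per second to gigabytes per second.
pcie3_one_way_gb_s = GT_PER_S_PER_LANE * LANES * ENCODING_EFFICIENCY / 8
pcie3_both_ways_gb_s = 2 * pcie3_one_way_gb_s

nvlink_gb_s = 300.0  # bidirectional figure quoted on the slide

ratio = nvlink_gb_s / pcie3_both_ways_gb_s
print(f"PCIe 3.0 x16: {pcie3_both_ways_gb_s:.1f} GB/s both ways, "
      f"NVLink advantage: {ratio:.1f}x")
```

Under these assumptions the ratio comes out at about 9.5, in line with the slide.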
HIGH THROUGHPUT ARCHITECTURE
IT’S NOT JUST CORES
[Diagram: two Power9 CPUs joined by the IBM SMP bus, each with its own RAM and two Tesla V100 GPUs with VRAM]
• CPU to RAM: 170GB/s per CPU
• CPU to GPU: NVLink, 300GB/s BiDi
• GPU to VRAM: 900GB/s
UP TO 3.7X FASTER QUERIES
SQream DB performance: IBM Power9 vs Intel Xeon (Skylake)
Query time in seconds (lower is better)

Query | Dell PowerEdge R740 | IBM Power9 AC922
TPC-H Query 8 | 52.83 | 14.06
TPC-H Query 6 | 10.35 | 2.8
TPC-H Query 19 | 84.5 | 30.29
TPC-H Query 17 | 78.57 | 29.01

IBM Power9 AC922:
2x POWER9 16C @ 3.8GHz | 256 GB DDR4 2666 MHz | SSD storage | 4x NVIDIA Tesla V100 (SXM2 NVLINK - 16GB)
Dell PowerEdge R740:
2x Intel Xeon Silver 4112 CPU @ 2.60GHz | 256GB DDR4 2666MHz | SSD storage | 4x NVIDIA Tesla V100 (PCIe - 16GB)
• In our testing, SQream DB on Power9 is between 150% and 370% faster than comparable x86 architectures, especially on large data sets. For example, in the TPC-H (SF 10,000) dataset, Query 8 ran in a quarter of the time on the IBM Power 9, compared to the x86 competitor.
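The headline speedup can be recomputed from the chart values. The short Python sketch below assumes the larger time for each query belongs to the Dell PowerEdge R740 and the smaller to the IBM Power9 AC922, which is inferred from the surrounding claims rather than stated next to each number.

```python
# (x86 seconds, Power9 seconds) per query, as read from the chart.
# The assignment of each series to a machine is an assumption.
query_times = {
    "TPC-H Query 8": (52.83, 14.06),
    "TPC-H Query 6": (10.35, 2.8),
    "TPC-H Query 19": (84.5, 30.29),
    "TPC-H Query 17": (78.57, 29.01),
}

speedups = {
    query: x86_seconds / power9_seconds
    for query, (x86_seconds, power9_seconds) in query_times.items()
}

for query, speedup in speedups.items():
    print(f"{query}: {speedup:.2f}x")

best = max(speedups.values())
```

Under that reading, Queries 8 and 6 are each about 3.7x faster and Queries 19 and 17 about 2.7x to 2.8x faster, which is consistent with the "up to 3.7x" title and with Query 8 finishing in about a quarter of the time.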
UNDERSTAND 40 MILLION CUSTOMERS
TELECOM
• HP DL380g9 with NVIDIA Tesla GPU, 96 GB RAM + 6 TB storage: $200K
• 80 nodes, 5 full racks, 7600 CPU cores: $10,000,000
[Chart: ingest time, reporting time, and ownership cost; values shown: 20M, 10M, 300M, 120M]
ARCHITECTURE BEFORE SQREAM DB
[Diagram components]
• Platform: Greenplum (GP)
• Sources: 3G, 4G, CDRs, Others
• ETL: 1-2 hours
• GP daily aggregation, profiles, billing, pre-aggregations
• GP daily report: 3 hours (max)
• Daily reports: #1, #2, #3, #4, … #31
• Monthly reports: Monthly #1, Monthly #2, … Monthly #N (7 days)
• Timings shown: 5hr, 3hr, 0.5hr
SIMPLIFIED WITH SQREAM DB
[Diagram components]
• Sources: 3G, 4G, CDRs, Others
• Daily reports: #1, #2, #3, #4, … #31
• Monthly reports: Monthly #1, Monthly #2, … Monthly #N (1 day)
• Timings shown: 10m, 4m, 2m
