Webinar: Using Control Theory to Keep Compactions Under Control
Glauber Costa - VP of Field Engineering, ScyllaDB
Glauber Costa
Glauber Costa is the VP of Field Engineering at ScyllaDB. He splits his time between the engineering department, working on upcoming Scylla features, and helping customers succeed.
Before ScyllaDB, Glauber worked on virtualization in the Linux kernel for 10 years, with contributions ranging from the Xen hypervisor to all sorts of guest functionality and containers.
About ScyllaDB
+ Next-generation NoSQL database
+ Drop-in replacement for Cassandra
+ 10X the performance & low tail latency
+ Open source and enterprise editions
+ Founded by the creators of KVM hypervisor
+ HQs: Palo Alto, CA; Herzelia, Israel
+ Scylla Summit 2018 November 6-7, SF Bay
Join real-time big-data database developers and users from start-ups
and leading enterprises from around the globe for two days of sharing
ideas, hearing innovative use cases, and getting practical tips and tricks
from your peers and NoSQL gurus.
What are compactions?
[Figure: Scylla’s write path, showing writes, the commit log, and compaction]
Compaction Strategy
+ Which SSTables to compact, and when?
+ This is called the compaction strategy
+ The goal of the strategy is low amplification:
  + Avoid read requests needing many SSTables: read amplification
  + Avoid overwritten/deleted/expired data staying on disk.
  + Avoid excessive temporary disk space needs: space amplification
  + Avoid compacting the same data again and again: write amplification
The main compaction strategies
+ Size Tiered Compaction Strategy
  + compacts SSTables of roughly the same size together
+ Leveled Compaction Strategy
  + compacts SSTables while keeping them in different levels that are exponentially bigger
+ Time Window Compaction Strategy
  + each user-defined time window has a single SSTable
+ Major, or manual compaction
  + compacts everything into a single* SSTable
* In Scylla, a single SSTable per shard; see "Compactions in Scylla".
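As an illustration of "roughly the same size together", here is a minimal Python sketch of size-tiered bucketing. It is not Scylla's implementation: the function names are hypothetical, and the defaults (a size within 0.5x to 1.5x of the bucket average, at least 4 SSTables per compaction) are assumptions borrowed from the usual size-tiered defaults.

```python
def size_tiered_buckets(sizes, bucket_low=0.5, bucket_high=1.5):
    """Group SSTable sizes into buckets of roughly similar size.

    bucket_low/bucket_high are assumed defaults, not values from the talk.
    """
    buckets = []  # each bucket is a list of SSTable sizes
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if avg * bucket_low <= size <= avg * bucket_high:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size: start a new tier.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes, min_threshold=4):
    """Return the buckets holding enough SSTables to be worth compacting."""
    return [b for b in size_tiered_buckets(sizes) if len(b) >= min_threshold]


sstables_gb = [1, 1, 1, 1, 4, 4, 4, 4, 16]
print(size_tiered_buckets(sstables_gb))
print(compaction_candidates(sstables_gb))
```

With these sizes the four 1GB and four 4GB SSTables form two buckets that qualify for compaction, while the lone 16GB SSTable waits for peers of similar size.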
Compactions in Scylla
+ Because all data is sharded, so are SSTables
  + and as a result, so are compactions
  + in a system with 64 vCPUs, expect 64 SSTables after a major compaction
  + the same logic applies to LeveledCompactionStrategy for the number of SSTables in each level.
Impact of compactions
+ Compaction too slow: reads will touch many SSTables and be slower.
+ Compaction too fast: the foreground workload will be disrupted.
+ A common solution is to use limits. Ex: Apache Cassandra
  + “Don’t allow compactions to run at more than 300 MB/s”
  + But how to find that number?
  + But what if the workload changes?
  + But what if there is idle time now?
+ Another solution is to determine ratios. Ex: ScyllaDB until 2.2
  + “Don’t allow compactions to use more than 20% of storage bandwidth/CPU”
  + Much better: adapts automatically to resource capacity and uses idle time efficiently
  + But no temporal knowledge.
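The difference between the two approaches can be sketched in a few lines of Python. This is a toy model under stated assumptions, not code from Cassandra or Scylla: the function names are hypothetical, and the ratio-based variant is assumed to be work-conserving (compaction keeps its 20% slice under contention and may also use whatever the foreground leaves idle).

```python
def fixed_limit_bandwidth(disk_mb_s, limit_mb_s=300.0):
    """Static cap: the same number no matter how busy or idle the disk is."""
    return min(limit_mb_s, disk_mb_s)


def ratio_bandwidth(disk_mb_s, foreground_demand_mb_s, ratio=0.2):
    """Ratio-based: a slice of capacity under contention, plus idle capacity.

    Assumes a work-conserving scheduler; this is a modelling assumption.
    """
    guaranteed = disk_mb_s * ratio
    idle = max(disk_mb_s - foreground_demand_mb_s, 0.0)
    return max(guaranteed, idle)


# (disk capacity, foreground demand), both in MB/s
for disk, demand in [(1000.0, 1000.0), (1000.0, 200.0), (4000.0, 4000.0)]:
    print(disk, demand, fixed_limit_bandwidth(disk), ratio_bandwidth(disk, demand))
```

The fixed limit stays at 300 MB/s in all three cases, while the ratio follows the hardware and the idle capacity. Neither looks at how much compaction work is actually pending, which is the "no temporal knowledge" gap.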
Compactions over time
[Figure: compactions run over time. Limited impact, but still impact.]
[Figure: the same timeline, annotated "All shards are compacting here" and "Almost no shards are compacting here"]
What is Control Theory?
+ Open-loop control system
  + there is some input, a function is applied, there is an output.
  + ex: toaster
+ Closed-loop control system
  + We want the world to be in a particular state.
  + The current state of the world is fed back to the control system
  + The control system acts to bring the system back to the goal
Feedback Control Systems
1. Measure the state of the world
2. Transfer function
3. Actuator
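The three steps can be sketched as a generic loop in Python. Everything here is illustrative: `run_feedback_loop`, the `Room` class, the goal of 20 degrees, and the gain of 0.5 are hypothetical choices made only to show measure, transfer function, and actuator working together.

```python
def run_feedback_loop(measure, transfer, actuate, steps):
    """Closed loop: measure the world, compute an output, act, repeat."""
    history = []
    for _ in range(steps):
        state = measure()         # 1. measure the state of the world
        output = transfer(state)  # 2. transfer function
        actuate(output)           # 3. actuator changes the world
        history.append(state)
    return history


class Room:
    """Toy 'world' with a single measurable quantity (hypothetical)."""

    def __init__(self, temperature):
        self.temperature = temperature

    def heat(self, amount):
        self.temperature += amount


GOAL = 20.0
room = Room(15.0)
history = run_feedback_loop(
    measure=lambda: room.temperature,
    transfer=lambda t: 0.5 * (GOAL - t),  # proportional to the error
    actuate=room.heat,
    steps=10,
)
print(history[:3], room.temperature)
```

Each iteration halves the distance to the goal, so the measured state converges on it without anyone choosing a fixed heating rate up front.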
Measuring - current state of all SSTables
[Figure: SSTable sizes being measured. Labels: partial new SSTable size, static SSTable size, SSTable uncompacted size, partially compacted SSTable size]
Actuators - Schedulers
[Figure: Query, Commitlog, and Compaction queues feeding the userspace I/O scheduler, which sits in front of storage. Labels: max useful disk concurrency; I/O queued in FS/device; no queues]
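To make the idea of a share-based actuator concrete, here is a minimal Python sketch of proportional-share dispatching. It is an assumption-laden toy and not a description of Scylla's actual scheduler: the `dispatch` function and the share values are hypothetical. Each class is charged 1/shares per request served, and the class that has been charged the least goes next.

```python
from fractions import Fraction


def dispatch(shares, requests):
    """Serve `requests` requests, picking classes in proportion to shares."""
    consumed = {name: Fraction(0) for name in shares}
    order = []
    for _ in range(requests):
        # The class with the lowest accumulated cost goes next;
        # ties are broken by dictionary order.
        name = min(consumed, key=lambda n: consumed[n])
        consumed[name] += Fraction(1, shares[name])
        order.append(name)
    return order


shares = {"query": 1000, "commitlog": 1000, "compaction": 500}
order = dispatch(shares, 10)
print({name: order.count(name) for name in shares})
```

Changing the number of shares for the compaction class is the knob a controller can turn: more shares means compaction is picked more often relative to the other queues.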
Transfer Function - Backlog
+ Each compaction strategy does a different amount of work
+ For each compaction strategy we determine when there is no more work to be done.
+ Examples:
  + SizeTiered: there is only one SSTable in the system.
  + TimeWindow: there is only one SSTable per time window.
+ The backlog is: how many bytes do I expect to write to reach the state of zero backlog?
+ Controller output: f(B)
  + a proportional function
+ This is a self-regulating system:
  + more compaction shares = fewer new writes = less compaction backlog
  + fewer compaction shares = more new writes = more compaction backlog
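The self-regulating behaviour can be seen in a small Python simulation. This is a toy model, not Scylla's controller: the gain, the foreground share count, the per-tick capacity, and the amount of backlog created per written byte are all made-up parameters, and the resource split by shares is an assumption.

```python
def compaction_shares(backlog_gb, gain=10.0):
    """Proportional controller: f(B) = gain * B (gain is an assumed value)."""
    return gain * backlog_gb


def simulate(ticks, capacity_gb=1.0, foreground_shares=1000.0,
             backlog_per_write=0.5):
    """Return the backlog (GB) after `ticks` steps, starting from zero."""
    backlog = 0.0
    for _ in range(ticks):
        shares = compaction_shares(backlog)
        # Compaction's fraction of the machine, decided by shares.
        fraction = shares / (shares + foreground_shares)
        # Whatever compaction does not use goes to new writes.
        new_writes = (1.0 - fraction) * capacity_gb
        compacted = min(backlog, fraction * capacity_gb)
        backlog += backlog_per_write * new_writes - compacted
    return backlog


print(round(simulate(2000), 2))
```

In this model the backlog stops growing once compaction removes bytes as fast as new writes create them, which happens at about 50 GB with the defaults above. Raising `backlog_per_write` (a heavier workload) makes the loop settle at a higher backlog with more compaction shares, without any parameter being retuned by hand.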
SizeTiered Backlog
+ Each byte that is written now is rewritten T times, where T is the number of tiers
+ In SizeTiered, tiers are proportional to SSTable sizes.
+ The number of tiers is roughly proportional to the log of this SSTable's contribution to the total size
  + Ex: 4 SSTables with 1GB, 4 SSTables with 4GB. Total size = 20GB
    + log4(20 / 1) ~ 2
    + log4(20 / 4) ~ 1
+ Backlog for one SSTable is its size, times the backlog per byte:
  + B = SSTableSize * log4(TableSize / SSTableSize)
+ Backlog for the entire table is the sum of the backlogs of all its SSTables.
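The formula can be checked against the slide's example with a short Python sketch. The function names are hypothetical; only the formula B = SSTableSize * log4(TableSize / SSTableSize) and the example sizes come from the slides.

```python
import math


def sstable_backlog(sstable_size, table_size):
    """Backlog of one SSTable: size * log4(table size / SSTable size)."""
    return sstable_size * math.log(table_size / sstable_size, 4)


def table_backlog(sstable_sizes):
    """Backlog of the whole table: the sum over all of its SSTables."""
    table_size = sum(sstable_sizes)
    return sum(sstable_backlog(size, table_size) for size in sstable_sizes)


sizes_gb = [1, 1, 1, 1, 4, 4, 4, 4]  # total size = 20GB
print(round(sstable_backlog(1, 20), 2))  # per-1GB SSTable, log4(20) is about 2.16
print(round(sstable_backlog(4, 20), 2))  # per-4GB SSTable, 4 * log4(5)
print(round(table_backlog(sizes_gb), 2))
```

For this example the table backlog comes out at roughly 27.2 GB. A table with a single SSTable has a backlog of zero, which matches the "no more work to be done" state defined for SizeTiered.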
Results: before vs after
Results: throughput vs CPU
[Figure: % CPU time used by compactions, and throughput]
Results, changing workload
Workload changes:
- automatic adjustment
- new equilibrium
Results - impact on latency
2ms: 99.9% latencies at 100% load
< 2ms: 99% latencies
1ms: 95% latencies
Q&A
Stay in touch
Join us at Scylla Summit 2018
Pullman San Francisco Bay Hotel | November 6-7
scylladb.com/scylla-summit-2018
glauber@scylladb.com
@ScyllaDB
@glcst
United States
1900 Embarcadero Road
Palo Alto, CA 94303
Israel
11 Galgalei Haplada
Herzelia, Israel
www.scylladb.com
@scylladb
Thank You!
