A ScyllaDB Community
How Agoda Scaled 50x Throughput with ScyllaDB
Worakarn Isaratham
Lead Software Engineer
Worakarn Isaratham (he/him)
■ Lead Software Engineer, Agoda
■ Based in Bangkok, Thailand
■ Experience in distributed computing, software testing
■ Interested in dependable software systems
Presentation Agenda
■ ScyllaDB in Agoda Feature Store
■ Capacity Problem
■ Potential Solutions
Agoda Feature Store
Online Feature Serving
Client SDK
Cache
ScyllaDB
App Servers
3.5M EPS 1.7M EPS
200k EPS
P99 Latency: 5 ms
P99 Latency: 8 ms
Average: 5 features per entity
Growth
Since the start of 2023
■ Servers traffic: 50x
Peak servers traffic, on the busiest DC
■ ScyllaDB traffic: 10x
10K EPS
Peak ScyllaDB traffic, on the busiest DC
A Capacity Problem
■ A new use case wanted to onboard
■ Problematic usage pattern:
■ Bursty traffic from cold cache, hitting ScyllaDB at 120K EPS.
■ Many duplicated requests in very quick succession
■ Keep retrying any failed requests
120K EPS was 12x the load then, and is still 2x the load now!
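The two multipliers on this slide imply the regular ScyllaDB load at each point in time. A minimal arithmetic sketch, using only the 120K EPS burst figure and the 12x and 2x factors from the slide (the variable names are illustrative):

```python
# Back-of-the-envelope check of the burst figures on this slide.
BURST_EPS = 120_000          # burst hitting ScyllaDB, from the slide

load_then = BURST_EPS / 12   # burst was 12x the load at the time
load_now = BURST_EPS / 2     # burst is 2x the load now

print(f"implied load then: {load_then:,.0f} EPS")
print(f"implied load now:  {load_now:,.0f} EPS")
```

The implied load at the time matches the 10K EPS peak shown on the Growth slide.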
A Capacity Problem
■ One DC was able to survive this load without errors.
■ The other DC had many problems
■ Very high error rate
■ Took 40 minutes to finish all the retries
■ Metrics pointed to slow reads on the ScyllaDB nodes
Slow Disks
                 Bad DC             Good DC            Advantage
Disks            SATA SSD, RAID 0   NVMe SSD, RAID 0
Read IOPS        6868               79566              11.6x
Read bandwidth   1.5G               10.1G              6.7x
Write IOPS       6615               41104              6.2x
Write bandwidth  1.9G               6.3G               3.3x
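The Advantage column is simply the Good DC figure divided by the Bad DC figure. A short sketch that recomputes it from the numbers on this slide (the dictionary keys are illustrative names, and the bandwidth values are used as given, without assuming a unit beyond "G"):

```python
# Recompute the "Advantage" column from the Bad DC / Good DC figures.
disks = {
    # metric: (bad_dc_sata, good_dc_nvme)
    "read_iops": (6868, 79566),
    "read_bandwidth": (1.5, 10.1),
    "write_iops": (6615, 41104),
    "write_bandwidth": (1.9, 6.3),
}


def advantage(bad: float, good: float) -> float:
    """Ratio of the NVMe figure to the SATA figure, to one decimal place."""
    return round(good / bad, 1)


ratios = {metric: advantage(bad, good) for metric, (bad, good) in disks.items()}

for metric, ratio in ratios.items():
    print(f"{metric}: {ratio}x")
```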
Just Buy New Disks?
● New disks were ordered
● User-side caching was improved, reducing this load to 7K EPS.
● How long could we survive?
Capacity
Cache-Avoiding Load Test
■ Use artificial, one-time-used load to avoid ScyllaDB caching.
25K 5K
Normal load
ScyllaDB cache
one-time-used entities
BYPASS CACHE
Flush, Restart ScyllaDB
Baseline EPS for SATA
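One way to picture the cache-avoiding load is sketched below. This is not the team's actual test harness: the table name, column names, and helper functions are hypothetical. The only ScyllaDB-specific piece is the `BYPASS CACHE` clause, a ScyllaDB CQL extension that can be appended to a `SELECT`. Note that the generated entities would have to be written to the table first, and then each read exactly once.

```python
import uuid


def one_time_entity_ids(n: int) -> list[str]:
    """Return n fresh entity IDs, each intended to be read only once."""
    return [str(uuid.uuid4()) for _ in range(n)]


def read_query(table: str, bypass_cache: bool = False) -> str:
    """Build a parameterized read; table and columns are hypothetical."""
    query = f"SELECT feature_name, value FROM {table} WHERE entity_id = ?"
    if bypass_cache:
        # ScyllaDB extension: do not populate the row cache with this read.
        query += " BYPASS CACHE"
    return query


ids = one_time_entity_ids(1000)
query = read_query("feature_store.features", bypass_cache=True)
print(len(set(ids)), "unique entities;", query)
```

Because every key is requested at most once, a repeated run cannot be served from cache, so the measured EPS reflects disk reads.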
Idea 1: Different Data Modeling
Current: one tall table
Alternative: one table per feature set
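The slides do not show the actual schema, so the following is only a sketch of what the two layouts could look like. Every keyspace, table, and column name here is an assumption made for illustration, as is the choice of primary keys.

```python
def tall_table_ddl(keyspace: str = "feature_store") -> str:
    """Hypothetical 'one tall table': all feature sets share one table."""
    return (
        f"CREATE TABLE {keyspace}.features ("
        "entity_id text, feature_set text, feature_name text, value blob, "
        "PRIMARY KEY ((entity_id, feature_set), feature_name))"
    )


def per_feature_set_ddl(feature_set: str, keyspace: str = "feature_store") -> str:
    """Hypothetical alternative: a dedicated table for each feature set."""
    return (
        f"CREATE TABLE {keyspace}.{feature_set} ("
        "entity_id text, feature_name text, value blob, "
        "PRIMARY KEY (entity_id, feature_name))"
    )


print(tall_table_ddl())
for name in ("hotel_features", "user_features"):  # made-up feature sets
    print(per_feature_set_ddl(name))
```

In the second layout the feature set moves from a key column into the table name, so each table holds fewer, more uniform rows.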
Idea 1: Different Data Modeling
Idea 2: Change Compaction Strategy
■ Our workload is “Read-mostly, many updates”. Size-tiered strategy is recommended.
Prioritized read latency
Slow disk read
Large SSTable files
Size-tiered Compaction → Leveled Compaction
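In CQL, the compaction strategy is a table option. A minimal sketch of setting it when creating a new table follows; the `compaction = {'class': ...}` option and the two strategy class names are standard, while the table definition and helper function are hypothetical.

```python
ALLOWED_STRATEGIES = {
    "SizeTieredCompactionStrategy",
    "LeveledCompactionStrategy",
}


def compaction_clause(strategy: str) -> str:
    """Return the CQL table option that selects a compaction strategy."""
    if strategy not in ALLOWED_STRATEGIES:
        raise ValueError(f"unsupported strategy: {strategy}")
    return f"WITH compaction = {{'class': '{strategy}'}}"


# Hypothetical new table created with leveled compaction.
ddl = (
    "CREATE TABLE feature_store.features_lcs ("
    "entity_id text, feature_name text, value blob, "
    "PRIMARY KEY (entity_id, feature_name)) "
    + compaction_clause("LeveledCompactionStrategy")
)
print(ddl)
```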
Idea 2: Change Compaction Strategy
1.5x
Idea 3: Increase Summary File Size
■ ScyllaDB uses summary files to help navigate to index files
summary file size ≈ data file size × summary ratio
High ratio → larger summary → more efficient index → less disk I/O
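The approximation on this slide can be tried out with a few numbers. The data file size and the ratios below are illustrative only; the slides do not state the values Agoda used. To the best of our knowledge the corresponding `scylla.yaml` setting is `sstable_summary_ratio`, with a default of 0.0005, but treat that as an assumption to verify against the ScyllaDB documentation.

```python
def summary_size_bytes(data_file_bytes: float, summary_ratio: float) -> float:
    """summary file size ≈ data file size × summary ratio."""
    return data_file_bytes * summary_ratio


GIB = 1024 ** 3
MIB = 1024 ** 2
data_file = 100 * GIB  # hypothetical 100 GiB of SSTable data

for ratio in (0.0005, 0.005):  # illustrative ratios, not Agoda's settings
    size_mib = summary_size_bytes(data_file, ratio) / MIB
    print(f"ratio {ratio}: summary of about {size_mib:.1f} MiB")
```

A larger summary costs memory, since summaries are kept in RAM, in exchange for fewer index reads from disk.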
Idea 3: Increase Summary File Size
4x
NVMe
60x
Rollout
Jul 2023: New summary ratio applied
Oct 2023: Migrated to NVMe disks
Leveled Compaction: only applied to new tables; needs data migration
Focus shifted to other components.
Still trying out some new ideas on ScyllaDB.
Recent Experiments
● Partitioned By Feature Set, clustered by Entity
○ Disastrous! 400x worse
● All features as a blob in a single row
○ +35% throughput
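The "blob in a single row" idea means one entity's features are serialized together and read back with a single row lookup. The slides do not say which encoding was used; the sketch below uses JSON purely for illustration, and the function and feature names are hypothetical.

```python
import json


def pack_features(features: dict) -> bytes:
    """Serialize all features of one entity into a single blob value."""
    return json.dumps(features, sort_keys=True, separators=(",", ":")).encode("utf-8")


def unpack_features(blob: bytes) -> dict:
    """Inverse of pack_features."""
    return json.loads(blob.decode("utf-8"))


features = {"avg_price": 82.5, "num_bookings": 12, "is_member": True}
blob = pack_features(features)
print(len(blob), "bytes:", blob)
print(unpack_features(blob))
```

The trade-off is that a single feature can no longer be read or updated without rewriting the whole blob.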
Lessons
● Fast disks are essential!
● Benchmark your load
● Tailor your data model to fit the needs
Stay in Touch
Worakarn Isaratham
worakarn.isaratham@agoda.com
github.com/arkorwan
www.linkedin.com/in/worakarn
