Dense Node C* / S* / Spark on Dockers
Walmart GEC-Database Engineering & Data Services
Ani Mondal
NoSQL / Big Data Footprint
1500 Nodes (150 Clusters)
Prod & Non-Prod
Elastic
Kafka
Proposed
[Diagram: Container Cluster with nodes 1-8, next to BM Cluster with nodes 1-4]
• Transform new C* hardware into containers, thereby utilizing unused compute
• Build new clusters in containers & migrate old C* clusters into those
• Double the capacity on existing & newly ordered H/W for C* clusters, thereby saving cost
Reclaiming and reusing existing HW
Containers vs. VMs
[Diagram: VM stack next to container stack]
VM: App A, App A’ and App B, each with its own Bins/Libs and Guest OS, on Hypervisor / Host OS / Server
Container: AppA, AppA’, AppB, AppB’, AppC and AppC’ with shared Bins/Libs, on Docker / Host OS / Server
Containers are isolated, but share OS and, optionally, binaries/libraries
• Original App (No OS to take up space, resources, or require restart)
• Copy of App: No OS. Can share bins/libs
Configuration
Server (HP SL4540)
Host OS
Dockers
• Docker 1: Bins/Libs, OPSC Agent, Persistent DATA & LOGS
• Docker 2: Bins/Libs, OPSC Agent, Persistent DATA & LOGS
Custom Orchestration
Host Network – Dedicated NICs
Docker Registry
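
The layout above (two containers per server, each with persistent data and logs, on the host network) could be sketched roughly as follows. This is only an illustration and not the custom orchestration used in the deck: the image name, host paths, IP addresses, container mount points, and the `CASSANDRA_LISTEN_ADDRESS` variable (assumed to be read by the image's entrypoint) are all hypothetical. The script only builds and prints `docker run` command lines; it does not execute them.

```python
import shlex

def docker_run_command(name, image, host_ip, data_dir, log_dir):
    """Build (but do not run) a `docker run` argument list for one node."""
    return [
        "docker", "run", "--detach",
        "--name", name,
        # Host networking: the container shares the server's network stack.
        "--network", "host",
        # Persistent data and logs live on the host, outside the container.
        # The container-side mount points are assumptions.
        "--volume", f"{data_dir}:/var/lib/cassandra",
        "--volume", f"{log_dir}:/var/log/cassandra",
        # Assumption: the image binds the node to this address, which would
        # belong to the NIC dedicated to this container.
        "--env", f"CASSANDRA_LISTEN_ADDRESS={host_ip}",
        image,
    ]

# Hypothetical registry image and per-container NIC addresses.
IMAGE = "registry.example.internal/cassandra:latest"
NODES = [("docker1", "10.0.0.11"), ("docker2", "10.0.0.12")]

commands = [
    docker_run_command(name, IMAGE, ip, f"/data/{name}/data", f"/data/{name}/logs")
    for name, ip in NODES
]

for cmd in commands:
    print(shlex.join(cmd))
```

Keeping data and logs on host directories means a container can be replaced without losing the node's state, which is presumably why the slide marks them as persistent.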
Benchmark #s: in-house application
Comparison (+/-3% Margin of Error)

                         Bare Metal   Dockers     OpenStack   AZURE VM
                         (Prod)       (Prod HW)   (VM)
Reads
  Total                  276943       813428      999164      975179
  Per Sec                4615.72      6778.57     651.77      5079.06
  Avg. Latency (in ms)   0.68         0.92        9.77        1.40
Writes
  Total                  499921       1000000     1000000     1000000
  Per Sec                8332.02      8333.33     652.32      5208.3
  Avg. Latency (in ms)   0.72         0.71        9.21        1.04
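
To make the benchmark figures above easier to compare, the short sketch below expresses each platform's throughput relative to bare metal and derives an implied run length. The numbers are copied from the slide; the function names are illustrative, and the duration calculation assumes that "Per Sec" equals "Total" divided by elapsed seconds, which the slide does not state.

```python
PLATFORMS = ["Bare Metal (Prod)", "Dockers (Prod HW)", "OpenStack (VM)", "Azure VM"]

# Figures copied from the benchmark comparison.
READ_TOTAL = [276943, 813428, 999164, 975179]
READ_PER_SEC = [4615.72, 6778.57, 651.77, 5079.06]
WRITE_TOTAL = [499921, 1000000, 1000000, 1000000]
WRITE_PER_SEC = [8332.02, 8333.33, 652.32, 5208.3]

def relative_to_baseline(values, baseline_index=0):
    """Return each value as a ratio of the baseline (bare metal by default)."""
    base = values[baseline_index]
    return [round(v / base, 2) for v in values]

def implied_seconds(total, per_sec):
    """Run length implied by total / rate (an assumption, see lead-in)."""
    return total / per_sec

read_ratios = relative_to_baseline(READ_PER_SEC)
write_ratios = relative_to_baseline(WRITE_PER_SEC)
read_durations = [implied_seconds(t, r) for t, r in zip(READ_TOTAL, READ_PER_SEC)]
write_durations = [implied_seconds(t, r) for t, r in zip(WRITE_TOTAL, WRITE_PER_SEC)]

for name, rr, wr, rd, wd in zip(
    PLATFORMS, read_ratios, write_ratios, read_durations, write_durations
):
    print(f"{name:18s} reads x{rr:<5} writes x{wr:<5} "
          f"implied run: {rd:7.1f}s reads, {wd:7.1f}s writes")
```

Under that assumption the runs differ in length between platforms (about 60 s on bare metal and about 120 s on Docker), so per-second rates and latencies are the fairer basis for comparison than the totals.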
Patent
• Kavindra Yerolkar
• Jayakrishnan Parappalliyalil
• Preetish Tripathi
• Aniruddha Mondal
Cassandra on Docker @ Walmart Labs
