Accelerating Ceph Performance with
High Speed Networks and Protocols
Qingchun Song
Sr. Director of Market Development-APJ & China
Mellanox Overview
• Ticker: MLNX
• Company headquarters: Yokneam, Israel and Sunnyvale, California
• Worldwide offices
• ~2,900 employees worldwide
Leadership in Storage Platforms
Delivering the Highest Data Center Return on Investment
SMB Direct
Storage & Connectivity Evolution
• 1990: Internal Storage (Distributed Storage, Shared Network, File & Block Data)
• 2000: External Storage (Storage Consolidation, Dedicated FC SAN, Block/Structured Data)
• 2010: Flash/Convergence (Media Migration, Dedicated Eth/IB/FC, All Data Types)
• 2015: Virtualization/Cloud (Data Specific Storage, Lossless IB/Eth, Unstructured Data)
• 2020: NVMe/Big Data (Many Platforms, Dedicated Eth/IB, Right Data for Platform)
Where to Draw the Line?
• Traditional Storage: Legacy DC – FC SAN
• Scale-out Storage: Modern DC – Ethernet Storage Fabric
Ceph Workflow
Storage or Data Bottleneck: Bandwidth
OSD read:
• Client (App <-> RBD <-> RADOS) <-> NIC <-> Leaf <-> Spine <-> Leaf <-> NIC <-> OSD <-> NVMe
OSD write:
• Client (App <-> RBD <-> RADOS) <-> NIC <-> Leaf <-> Spine <-> Leaf <-> NIC <-> OSD <-> NVMe <-> OSD <-> NIC <-> Leaf <-> Spine <-> Leaf <-> NIC <-> OSD <-> NVMe
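To see where the bandwidth budget goes on this path, a quick client-side probe can help. The sketch below is a minimal, hypothetical example using the python3-rados bindings; the pool name ("rbd"), object size, and conffile path are assumptions and should be adapted to the cluster. It is not the benchmark used for the results that follow.

```python
# Minimal client-side throughput probe over the Ceph data path shown above.
# Assumptions: python3-rados installed, a reachable cluster defined in
# /etc/ceph/ceph.conf, and a pool named "rbd" (adjust to your environment).
import time
import rados

OBJ_SIZE = 4 * 1024 * 1024   # 4 MB objects, similar to the default RBD object size
NUM_OBJS = 64                # 256 MB total per direction
PAYLOAD = b"\0" * OBJ_SIZE

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("rbd")   # assumed pool name

try:
    # Write path: client -> NIC -> leaf/spine -> OSD -> NVMe (plus replication)
    start = time.time()
    for i in range(NUM_OBJS):
        ioctx.write_full("bw-probe-%d" % i, PAYLOAD)
    write_mbs = NUM_OBJS * OBJ_SIZE / (time.time() - start) / 1e6

    # Read path: OSD/NVMe -> NIC -> leaf/spine -> client
    start = time.time()
    for i in range(NUM_OBJS):
        ioctx.read("bw-probe-%d" % i, OBJ_SIZE)
    read_mbs = NUM_OBJS * OBJ_SIZE / (time.time() - start) / 1e6

    print("write: %.1f MB/s, read: %.1f MB/s" % (write_mbs, read_mbs))
finally:
    for i in range(NUM_OBJS):
        ioctx.remove_object("bw-probe-%d" % i)
    ioctx.close()
    cluster.shutdown()
```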
Ceph Bandwidth Performance Improvement
• Aggregate performance of 4 Ceph servers
• 25GbE has 92% more bandwidth than 10GbE
• 25GbE has 86% more IOPS than 10GbE
• Internet search results seem to recommend one 10GbE NIC for each ~15 HDDs in an OSD node
  • Mirantis, Red Hat, Supermicro, etc.
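A rough back-of-the-envelope check of that rule of thumb is below; the per-HDD streaming rate is an assumption (typical 7.2K SATA drives), so treat it as illustrative arithmetic only.

```python
# Back-of-the-envelope check of the "one 10GbE NIC per ~15 HDDs" rule.
hdd_mb_s = 120                 # assumed sequential throughput per HDD, MB/s
hdds_per_node = 15
nic_10gbe_mb_s = 10_000 / 8    # 10 Gb/s ~= 1250 MB/s line rate

aggregate_hdd = hdd_mb_s * hdds_per_node   # ~1800 MB/s of raw disk bandwidth
print(f"disks: {aggregate_hdd} MB/s vs 10GbE: {nic_10gbe_mb_s} MB/s")
# ~15 HDDs can already saturate a single 10GbE link, and replication
# traffic adds to this, which is why 25GbE per node pays off.
```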
Storage or Data Bottleneck: Latency
• Servers
  • Higher processing capability
  • High-density virtualization
• Storage
  • Move to all-flash
  • Faster protocols – NVMe-oF
• Cost, risk, complexity
• Data center modernization requires future-proof, faster, lossless Ethernet Storage Fabrics:
  • High bandwidth, low latency, zero packet loss
  • Predictable performance, deterministic & secure fabrics
  • Simplified security and management
  • Faster, more predictable performance
  • Block, file, and object storage
RDMA Is the Key for Storage Latency
• Adapter-based transport: the network transport runs in the NIC instead of the host CPU
RDMA Enables Efficient Data Movement
• Without RDMA (100GbE with CPU onload):
  • 5.7 GB/s throughput
  • 20-26% CPU utilization
  • 4 cores 100% consumed by moving data
• With hardware RDMA (100GbE with network offload):
  • 11.1 GB/s throughput at half the latency
  • 13-14% CPU utilization
  • More CPU power for applications, better ROI
• CPU onload penalties: half the throughput, twice the latency, higher CPU consumption
• Network offload benefits: 2X better bandwidth, half the latency, 33% lower CPU
• See the demo: https://www.youtube.com/watch?v=u8ZYhUjSUoI
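Those two data points can be folded into a single efficiency figure (throughput per point of CPU utilization); the sketch below simply reproduces that arithmetic, using the midpoints of the CPU ranges quoted above.

```python
# Throughput per point of CPU utilization, from the numbers on this slide.
# Midpoints of the quoted CPU ranges are used; illustrative arithmetic only.
onload_gbps, onload_cpu = 5.7, (20 + 26) / 2      # 100GbE, CPU onload
offload_gbps, offload_cpu = 11.1, (13 + 14) / 2   # 100GbE, RDMA offload

onload_eff = onload_gbps / onload_cpu       # ~0.25 GB/s per CPU %
offload_eff = offload_gbps / offload_cpu    # ~0.82 GB/s per CPU %
print(f"efficiency gain: {offload_eff / onload_eff:.1f}x")  # ~3.3x more data moved per unit of CPU
```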
Ceph RDMA Performance Improvement
• Best results: 3x higher IOPS
  • RDMA's biggest benefit for Ceph block storage: high-IOPS workloads and small block sizes (<32KB)
  • Enables >10 GB/s from a single node
  • Enables <10 usec latency under load
• Conservative results: 44%~60% more IOPS
  • RDMA offers significant benefits to Ceph performance for small block size (4KB) IOPS
  • With 2 OSDs and 4 clients, RDMA allowed 44% more IOPS
  • With 4 OSDs and 4 clients, RDMA allowed 60% more IOPS
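To reproduce a small-block comparison on your own cluster, a per-operation latency probe along these lines can be used. This is again a hypothetical sketch with python3-rados; the pool name and conffile path are assumptions, and a real benchmark would use fio or rados bench with many parallel clients rather than a single synchronous loop.

```python
# Rough 4KB synchronous write-latency / IOPS probe for a single client thread.
# Assumptions: python3-rados, /etc/ceph/ceph.conf, and a pool named "rbd".
# Run it once over TCP and once with the RDMA messenger enabled to compare.
import time
import rados

N = 2000
PAYLOAD = b"\0" * 4096   # 4KB, the block size used in the results above

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("rbd")   # assumed pool name

try:
    latencies = []
    for i in range(N):
        t0 = time.time()
        ioctx.write_full("iops-probe-%d" % (i % 16), PAYLOAD)
        latencies.append(time.time() - t0)
    latencies.sort()
    total = sum(latencies)
    print("avg %.0f us, p99 %.0f us, %.0f IOPS (queue depth 1)"
          % (total / N * 1e6, latencies[int(N * 0.99)] * 1e6, N / total))
finally:
    for i in range(16):
        ioctx.remove_object("iops-probe-%d" % i)
    ioctx.close()
    cluster.shutdown()
```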
RDMA: Mitigates Meltdown Mess, Stops Spectre Security Slowdown
• Chart compares performance before and after applying the Meltdown & Spectre software patches
• Performance impact of the patches: 0% with RDMA vs. -47% without RDMA
Ceph RDMA Status
• Ceph RDMA working group: Mellanox, Xsky, Samsung, SanDisk, Red Hat
• The latest stable Ceph RDMA version: https://github.com/Mellanox/ceph/tree/luminous-12.1.0-rdma
• Bring Up Ceph RDMA - Developer's Guide: https://community.mellanox.com/docs/DOC-2721
• RDMA/RoCE Configuration Guide: https://community.mellanox.com/docs/DOC-2283
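For reference, the async+rdma messenger is enabled through a handful of ceph.conf options. The snippet below emits the relevant [global] section with Python's configparser; the option names (ms_type, ms_async_rdma_device_name) and the device name mlx5_0 follow the linked Mellanox guides and should be treated as assumptions to verify against those guides and your Ceph release.

```python
# Minimal sketch: emit the ceph.conf [global] options used to switch the Ceph
# messenger from TCP to RDMA (RoCE). Option names and the RDMA device name are
# taken from the Mellanox guides linked above; verify them for your release.
import configparser

conf = configparser.ConfigParser()
conf["global"] = {
    "ms_type": "async+rdma",                 # use the AsyncMessenger RDMA backend
    "ms_async_rdma_device_name": "mlx5_0",   # assumed RDMA device; check with `ibv_devices`
}

with open("ceph-rdma-snippet.conf", "w") as f:
    conf.write(f)

with open("ceph-rdma-snippet.conf") as f:
    print(f.read())
```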
Storage or Data Bottleneck: Storage Fabric
• References: www.zeropacketloss.com, www.Mellanox.com/tolly
• Microburst absorption capability (max burst size absorbed, in MB, by packet size):
  • 64B: Spectrum 5.2 vs. Tomahawk 0.3
  • 512B: Spectrum 8.4 vs. Tomahawk 0.9
  • 1.5KB: Spectrum 9.6 vs. Tomahawk 1.0
  • 9KB: Spectrum 9.7 vs. Tomahawk 1.1
• Congestion management, fairness, avoidable packet loss, and latency test results: Spectrum rated good, Tomahawk rated bad
• Ethernet Storage Fabric
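To put the burst-absorption numbers in context, the data a switch must buffer during an incast is roughly (total ingress rate minus egress rate) times the burst duration. The fan-in and burst length below are picked purely for illustration and are not from the Tolly test.

```python
# How much buffering does a short incast need? Illustrative numbers only:
# 4 senders at 25GbE converge on a single 25GbE egress port for 200 microseconds.
senders = 4
link_gbps = 25
burst_s = 200e-6

ingress_bytes_s = senders * link_gbps * 1e9 / 8
egress_bytes_s = link_gbps * 1e9 / 8
excess_mb = (ingress_bytes_s - egress_bytes_s) * burst_s / 1e6
print(f"buffer needed: {excess_mb:.2f} MB")
# ~1.9 MB: comfortably inside Spectrum's measured 5.2-9.7 MB absorption,
# but beyond the ~0.3-1.1 MB measured for Tomahawk in the report above.
```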
Summary
• Ceph benefits from a faster network
• 10GbE is not enough!
• RDMA further optimizes Ceph performance
• RDMA reduces the impact of the Meltdown/Spectre fixes
• ESF (Ethernet Storage Fabric) is the trend
Thank You