Reference Architectures: Architecting Ceph Storage Solutions

Brent Compton
Director Storage Solutions, Red Hat

Kyle Bader
Senior Solution Architect, Red Hat
RefArch Building Blocks

•  Servers and media (HDD, SSD, PCIe)
•  Network
•  OS/Virt platform: bare metal, OpenStack virt, container virt, or other virt
•  Ceph
•  Defined workloads
RefArch Flavors

•  How-to integration guides (Ceph+OS/Virt, or Ceph+OS/Virt+Workloads)
   http://www.dell.com/learn/us/en/04/shared-content~data-sheets~en/documents~dell-red-hat-cloud-solutions.pdf

•  Performance and sizing guides (Network+Server+Ceph+[OS/Virt])
   http://www.redhat.com/en/resources/red-hat-ceph-storage-clusters-supermicro-storage-servers
   https://www.redhat.com/en/resources/cisco-ucs-c3160-rack-server-red-hat-ceph-storage
   https://www.scalableinformatics.com/assets/documents/Unison-Ceph-Performance.pdf
Design Considerations

1.  Qualify need for scale-out storage
2.  Design for target workload IO profile(s)
3.  Choose storage access method(s)
4.  Identify capacity
5.  Determine fault-domain risk tolerance
6.  Select data protection method

•  Target server and network hardware architecture (performance and sizing)
1.  Qualify Need for Scale-out

•  Elastic provisioning across a storage server cluster
•  Data HA across 'islands' of scale-up storage servers
•  Standardized servers and networking
•  Performance and capacity scaled independently
•  Incremental v. forklift upgrades
Designed for Agility

•  PAST: SCALE UP
•  FUTURE: SCALE OUT
2.  Design for Workload IO

•  Performance v. 'cheap-and-deep'?
•  Performance: throughput- v. IOPS-intensive? (see the back-of-the-envelope sketch below)
•  Sequential v. random?
•  Small block v. large block?
•  Read v. write mix?
•  Latency: absolute v. consistent targets?
•  Sync v. async?
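For intuition on the throughput- v. IOPS-intensive distinction, here is a rough back-of-the-envelope calculation. The IOPS and block-size numbers are illustrative only, not figures from this deck:

```python
# Throughput is roughly IOPS x block size, so the same MB/s target implies
# very different IOPS demands for small-block vs. large-block workloads.
def throughput_mb_s(iops: float, block_size_kb: float) -> float:
    return iops * block_size_kb / 1024.0

print(throughput_mb_s(10_000, 4))    # ~39 MB/s  -> IOPS-intensive (small block)
print(throughput_mb_s(500, 4096))    # 2000 MB/s -> throughput-intensive (large block)
```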
  
Generalized Workload IO Categories

•  IOPS optimized
•  Throughput optimized
•  Cost/capacity optimized
Performance-Optimized

•  Highest performance (MB/sec or IOPS)
•  CapEx: lowest $/performance-unit
•  OpEx: highest performance/BTU
•  OpEx: highest performance/watt
•  Meets the minimum server fault-domain recommendation (1 server <= 10% of cluster)
Cost/Capacity-Optimized

•  CapEx: lowest $/TB
•  OpEx: lowest BTU/TB
•  OpEx: lowest watt/TB
•  OpEx: highest TB/rack-unit
•  Meets the minimum server fault-domain recommendation (1 server <= 15% of cluster)
3.  Storage Access Methods

•  Distributed file*
•  Object
•  Block**

All three are served from a single software-defined storage cluster (a short librados/RBD sketch follows below).

*  CephFS not yet supported by RHCS
**  RBD supported with replicated data protection
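To make the access methods concrete, here is a minimal sketch using the Python bindings that ship with Ceph (librados for object access, librbd for block images). The pool and object/image names are hypothetical, and the snippet assumes a reachable cluster with a standard /etc/ceph/ceph.conf and client keyring:

```python
# Minimal sketch: object access via librados and block-image creation via librbd.
# Assumes the python-rados / python-rbd bindings are installed and that the
# pool named below already exists; all names here are illustrative only.
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')            # hypothetical pool name

# Object access (the layer RGW and librados applications build on).
ioctx.write_full('hello-object', b'hello ceph')
print(ioctx.read('hello-object'))

# Block access: create a 1 GiB RBD image (replicated pool, per the note above).
rbd.RBD().create(ioctx, 'demo-image', 1024 ** 3)

ioctx.close()
cluster.shutdown()
```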
  
4.  Identify Capacity

•  OpenStack Starter: 100TB
•  S: 500TB
•  M: 1PB
•  L: 2PB+
5.  Fault-Domain Risk Tolerance

•  What % of cluster capacity do you want on a single node?

When a server fails:
•  Workload performance is impaired more during backfill/recovery when the cluster has fewer nodes (each node devotes a greater % of its compute/IO to recovery).
•  A larger % of the cluster's reserve storage capacity is consumed during backfill/recovery when the cluster has fewer nodes (you must reserve a larger % of capacity for recovery).

•  Guidelines (a small sizing sketch follows this list):
   –  Minimum supported (RHCS): 3 OSD nodes per Ceph cluster.
   –  Minimum recommended (performance cluster): 10 OSD nodes per cluster
      (1 node represents <10% of total cluster capacity).
   –  Minimum recommended (cost/capacity cluster): 7 OSD nodes per cluster
      (1 node represents <15% of total cluster capacity).
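A minimal sketch of the arithmetic behind these guidelines, assuming equally sized nodes (the function names are ours, not from the RefArch):

```python
# Capacity share of one node, and whether a node count fits the fault-domain
# guideline. Roughly one node's share must also be kept free as reserve so the
# cluster can re-protect that node's data after a failure.
def node_share(num_nodes: int) -> float:
    return 1.0 / num_nodes

def meets_guideline(num_nodes: int, max_share: float) -> bool:
    # "1 server <= 10% (or 15%) of cluster", per the earlier fault-domain slides.
    return node_share(num_nodes) <= max_share

print(f"{node_share(10):.0%} of capacity per node")   # 10%
print(meets_guideline(10, 0.10))                      # True  -> performance guideline
print(meets_guideline(7, 0.15))                       # True  -> cost/capacity guideline
print(meets_guideline(3, 0.10))                       # False -> 3 nodes hold ~33% each
```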
  
6.  Data Protection Schemes

•  Replication
•  Erasure coding (analogous to network RAID)

One of the biggest choices affecting TCO in the entire solution!
Data Protection Schemes

•  Replication
   –  3x rep over JBOD = 33% usable:raw capacity ratio

•  Erasure coding (analogous to network RAID)
   –  8+3 over JBOD = 73% usable:raw

(The arithmetic behind these ratios is sketched below.)
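The usable:raw figures above follow directly from the protection overhead; a small sketch of that arithmetic:

```python
# Usable:raw capacity ratio for the two schemes on this slide.
def replication_ratio(replicas: int) -> float:
    return 1.0 / replicas                    # 3x replication -> 1/3

def erasure_ratio(k: int, m: int) -> float:
    return k / (k + m)                       # k data + m parity chunks -> 8/11

print(f"3x replication: {replication_ratio(3):.0%} usable:raw")   # 33%
print(f"EC 8+3:         {erasure_ratio(8, 3):.0%} usable:raw")    # 73%
```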
  
Data Protection Schemes

•  Replication
   –  Ceph block storage default: 3x rep over JBOD disks.
   –  Gluster file storage default: 2x rep over RAID6 bricks.

•  Erasure coding (analogous to network RAID)
   –  Data is encoded into k data chunks plus m parity chunks, which are spread across different disks (frequently on different servers). The cluster can tolerate m disk failures without data loss; 8+3 is a popular profile (see the toy encoding sketch below).
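To make the k/m bookkeeping concrete, here is a toy sketch with k data chunks protected by a single XOR parity chunk (i.e. m = 1). This only illustrates the idea; Ceph's erasure-code plugins implement Reed-Solomon-style codes that support m > 1, such as the 8+3 profile above:

```python
# Toy erasure coding: k data chunks protected by one XOR parity chunk (m = 1).
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data_chunks: list) -> bytes:
    """Compute the parity chunk over k equally sized data chunks."""
    return reduce(xor_bytes, data_chunks)

def recover(surviving_chunks: list, parity: bytes) -> bytes:
    """Rebuild the single missing data chunk from the survivors plus parity."""
    return reduce(xor_bytes, surviving_chunks, parity)

chunks = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]   # k = 4 data chunks
parity = encode(chunks)                          # m = 1 parity chunk
lost = chunks.pop(2)                             # simulate one failed disk
assert recover(chunks, parity) == lost           # data survives one failure
```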
  
Target Cluster Hardware

•  Cluster sizes: OSP Starter (100TB), S (500TB), M (1PB), L (2PB)
•  Workload categories: IOPS optimized, throughput optimized, cost/capacity optimized
RefArch Examples

•  The following charts are extracts from the recently published Ceph on Supermicro RefArch
•  Based on lab benchmarking results from many different configurations (one way to reproduce this style of measurement is sketched below)
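One common way to reproduce this style of sequential-throughput measurement is the rados bench tool that ships with Ceph. The RefArch does not necessarily use this exact tool, so treat the following as a hedged sketch; the pool name is hypothetical and must already exist:

```python
# Drive 'rados bench' from Python: a write phase (kept with --no-cleanup)
# followed by a sequential-read phase against the same objects.
import subprocess

POOL = "benchpool"          # hypothetical pre-created test pool
SECONDS = "60"

# 4 MB sequential writes, leaving the objects in place for the read test.
subprocess.run(["rados", "bench", "-p", POOL, SECONDS, "write",
                "-b", "4194304", "--no-cleanup"], check=True)

# Sequential reads of the objects written above.
subprocess.run(["rados", "bench", "-p", POOL, SECONDS, "seq"], check=True)

# Remove the benchmark objects afterwards.
subprocess.run(["rados", "-p", POOL, "cleanup"], check=True)
```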
  
Sequential Throughput (R) (per server)

[Benchmark chart: sequential read throughput (MB/sec) per server across the tested cluster configurations; chart data is not recoverable from this extraction.]
Sequential Throughput (R) (per OSD/HDD)

[Benchmark chart: sequential read throughput (MB/sec) per OSD/HDD across the tested configurations; chart data is not recoverable from this extraction.]
Sequential Throughput (W) (per OSD/HDD, 3xRep)

[Benchmark chart: sequential write throughput (MB/sec) per OSD/HDD with 3x replication; chart data is not recoverable from this extraction.]
!"
#"
$!"
$#"
%!"
%#"
&!"
&#"
'!"
'#"
'!()" (!!" )'" '"
!"#$%&'
()*%&+',-.%'/0"1'
234',%56%789:';<-+%'=><?6@>A6+'BCD'(,E'
CF'GH3'/I6:8A:J')J'/2H/K#/KHI11'+?'@%+'A>J$-&9:'L<-+%'+><?6@>A6+1M':-)<9N?$'
$%*$"
$!+*$!+"
$,*$"
$!+*$!+"
$,*!"
$!+*$!+"
&)*%"
$!+*$!+"
&)*!"
$!+*$!+"
&)*%"
'!+"-./01234"
)!*$%"
50.26782"
$%*!"
$!+*$!+"
)!*$%"
'!+"-./01234"
9%*!"
'!+"-./01234"
Sequen/al	
  Throughput	
  (W)	
  
(per	
  OSD/HDD,	
  EC)	
  
!"#$%"&'()$*"+,&")$-"%$./)0"
12%3452#46"7#89:;.83<=">$./"
?%:*$"#$%"4<:6"3@"A3%B"
+C3A$)6"#%:*$"D"E$)60"
!"#$%&'&())*&
$(#+%&'&,"#))*&
-$#+%&'&"#./01&
-(#+%&'&$#))*&
,"#+%&'&,#./01&
Throughput-­‐Op/mized	
  (R)	
  
Throughput-­‐Op/mized	
  (W)	
  
!"#$%"&'()$*"+,&")$-"%$./)0"
12%3452#46"7#89:;.83<=">%:6$"
?%:*$"#$%"4<:6"3@"A3%B"
+C3A$)6"#%:*$"D"E$)60"
!"#$%&'&())*&
$(#+%&'&,"#))*&
-$#+%&'&"#./01&
-(#+%&'&$#))*&
,"#+%&'&,#./01&
Capacity-­‐Op/mized	
  
!"#$%"&'"
()#)*+,-".#/0+1)/23"
4%+*$"#$%"&'"
5627$8,"#%+*$"9":$8,;"
!"#$%&'&())*&
+$#$%&'&())*&
,"#$%&'&())*&
!"#$%&'()*)+,-.
!"#$%&'()*
%&'+&#+
,-./ %*0*,--./ 10*23/ 4*0*53/
6!3%*!"&7879#:
/0'12*3,+)-24'4)-)
.;+<=>;"=&*!"&7879#:
/0'12*3,+)-24'4)-)
5'6742
89:;).'<8=9!>
5=?'@)+A:!6,-B
890'5C$'DEE
80'?FFG'""EHI(J2
/9'6742
89:;).'<8=9!>
/9=K5'@)+A:!6,-B
890'5C$'DEE
80'?FFG'""EHI(J2
K/'6742
89:;).'<8=9!>
K/=89K'@)+A:!6,-B
890'5C$'DEE
80'?FFG'""EHI(J2
89L'6742
89:;).'<8=9!>
89L=9LF'@)+A:!6,-B
890'5C$'DEE
80'?FFG'""EHI(J2
?'"'(7&@*!"&7879#:
KM9'2+'*17-2+-24'4)-)
8F'6742
89:;).'<8=9!>
8F=9F'@)+A:!6,-B
890'KC$'DEE
F0'""E
N'6742
/K:;).
9?'@)+A:!6,-B
/K0'KC$'DEE
F0'""E
N'6742
N9:;).
9?'@)+A:!6,-B
N90'KC$'DEE
F0'""E
I3)6624'O71'"PQQ21'9F8L
Ceph	
  Op/mized	
  Configs	
  
Add'l Subsystem Guidelines

•  Server chassis size
•  CPU
•  Memory
•  Disk
•  SSD write journals (Ceph only)
•  Network (a rough sizing sketch follows)
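As one example of a subsystem guideline, per-server network bandwidth can be estimated from per-OSD write throughput, since each replicated client write is also re-sent to the other replicas over the cluster network. A rough sketch with illustrative numbers, not figures from this RefArch:

```python
# Back-of-the-envelope per-server network estimate for replicated writes.
def per_server_network_gbps(osds_per_server: int,
                            per_osd_write_mb_s: float,
                            replicas: int = 3) -> tuple:
    client_mb_s = osds_per_server * per_osd_write_mb_s
    # Each client write is forwarded to (replicas - 1) peer OSDs on the
    # cluster (back-end) network.
    cluster_mb_s = client_mb_s * (replicas - 1)
    to_gbps = lambda mb_s: mb_s * 8 / 1000.0
    return to_gbps(client_mb_s), to_gbps(cluster_mb_s)

# e.g. 12 OSDs writing 50 MB/s each -> ~4.8 Gb/s client + ~9.6 Gb/s cluster traffic
print(per_server_network_gbps(12, 50))
```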
  
RefArchs &amp; Whitepapers

•  See the Ceph on Supermicro Portfolio RefArch
   http://www.redhat.com/en/resources/red-hat-ceph-storage-clusters-supermicro-storage-servers

•  See the Ceph on Cisco UCS C3160 whitepaper
   http://www.cisco.com/c/en/us/products/collateral/servers-unified-computing/ucs-c-series-rack-servers/whitepaper-C11-735004.html

•  See the Ceph on Scalable Informatics whitepaper
   https://www.scalableinformatics.com/assets/documents/Unison-Ceph-Performance.pdf

THANK YOU
