CONFIDENTIAL - FOR GARTNER USE ONLY © Scality 2016
Node.js @Scality
Experiences and Lessons Learned
Giorgio Regni, CTO
Lauren Spiegel, Software Engineer
Disrupting storage – unlimited & everywhere
When to use object storage?
1. Need for capacities beyond 100 TB and growing fast
2. Very large number of clients accessing isolated data
3. Objects must be larger than 100 KB; otherwise use a database
Bucket 1: Object A, Object B, Object C
Bucket 2: Object A, Object B, Object …, Object Z
Our first Node.js project
Copyright Scality 2014
Our first Node.js project - Building a TiVo in the Cloud
• 25 million users -> designed for a high degree of parallelism
• TB/sec -> needs very efficient network transfer
• Scales out by adding nodes and drives
• Proved 30 GB/sec of ingest with 10 servers and 360 drives
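To put the ingest figure in perspective, the arithmetic below breaks it down per server and per drive. It is only a rough sketch: it assumes the load is spread evenly, uses decimal units (1 GB = 1000 MB), and ignores any extra bytes written for erasure coding.

```javascript
// Figures quoted on the slide.
const ingestGBps = 30;
const servers = 10;
const drives = 360;

// Assumptions: even load distribution and 1 GB = 1000 MB.
const perServerGBps = ingestGBps / servers;
const drivesPerServer = drives / servers;
const perDriveMBps = (ingestGBps * 1000) / drives;

console.log(`${perServerGBps} GB/s per server`);
console.log(`${drivesPerServer} drives per server`);
console.log(`${perDriveMBps.toFixed(1)} MB/s per drive`);
```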
[Figure: an application server sends each A/V fragment through Scality FanOut (Comcast Live Recorder + chunking). One fragment is sent with X fanout. Each fragment is erasure coded (7,3) into 7 data slices and 3 code slices, plus a metadata chunk, and stored on the HDDs of storage servers SS1, SS2, … SS10 (1 to 10 servers).]
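The cost of the (7,3) layout can be worked out with a few lines of arithmetic. This sketch assumes 7 data slices plus 3 code slices and a code in which any 7 of the 10 slices are enough to rebuild the fragment (as with Reed-Solomon); the slides do not say which code is used, and the function name is hypothetical.

```javascript
// Hypothetical helper: summarizes a (data, coding) erasure-coding layout.
// Assumption: the code can rebuild a fragment from any `dataSlices` slices.
function erasureCodingProfile(dataSlices, codingSlices) {
  const totalSlices = dataSlices + codingSlices;
  return {
    totalSlices,
    // Raw bytes stored per byte of user data.
    storageOverhead: totalSlices / dataSlices,
    // Number of slices that can be lost without losing the fragment.
    toleratedLosses: codingSlices,
  };
}

const profile = erasureCodingProfile(7, 3);
console.log(profile.totalSlices);                // 10
console.log(profile.storageOverhead.toFixed(2)); // 1.43
console.log(profile.toleratedLosses);            // 3
```

For comparison, keeping three full replicas stores 3 bytes per byte of user data and survives the loss of two copies.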
Test case
• Duration: 2 hours
• Recordings: 20,000
• Batch size: 2500
• Sockets: 1000
• RPM per client: 270
• Threads per client: 63
Latency (seconds)
• Average: 0.159
• At 95%: 0.319
• At 99%: 0.426
The team
SF & Paris
What we built: Three Key Components
S3-Server
• AWS S3 compatible server
• Open source: https://github.com/scality/s3
• Can use local storage
S3-MetaData
• A distributed metadata database service
• Supports fast bucket and object listing
• Stores ACLs and users/groups
S3-Vault
• Security, identity and authentication service
• Provides accounts/keys
• Supports AWS IAM users and groups
• Interoperable with user directory services (via SAML)
What have we learned?
Logging is hard
• Challenges
  • Logging is expensive because it taxes the Node.js process
  • Sending logs as UDP datagrams triggers expensive DNS lookups
  • bunyan and bunyan-logstash apply redundant transformations
• Solution: Werelogs
  • Produces raw JSON logs along the path of least resistance
  • Forwards logs to ELK using Filebeat for indexing
  • Avoids expensive and redundant transformations
  • Tracks requests across the components with UIDs
  • Dumps log history on errors
Open source -> http://github.com/scality/werelogs
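The ideas on this slide (raw JSON lines, a UID that follows the request, and a history dump on error) can be illustrated with a small logger. This is a minimal sketch and not the werelogs API: the class name, method names and field names are all hypothetical.

```javascript
const crypto = require('crypto');

// Hypothetical request-scoped logger; it only illustrates the approach.
class RequestLogger {
  constructor(write, uid = crypto.randomBytes(10).toString('hex')) {
    this.write = write;  // sink for finished lines, e.g. (line) => process.stdout.write(line)
    this.uid = uid;      // request UID, to be forwarded to the other components
    this.history = [];   // debug entries held back until an error occurs
  }

  _entry(level, message, fields) {
    return Object.assign({ time: Date.now(), level, req_id: this.uid, message }, fields);
  }

  // One JSON object per line, with no further transformation.
  _emit(entry) {
    this.write(JSON.stringify(entry) + '\n');
  }

  // Debug entries are buffered instead of written.
  debug(message, fields) {
    this.history.push(this._entry('debug', message, fields));
  }

  info(message, fields) {
    this._emit(this._entry('info', message, fields));
  }

  // On error, dump the buffered history first, then the error itself.
  error(message, fields) {
    for (const entry of this.history) this._emit(entry);
    this.history = [];
    this._emit(this._entry('error', message, fields));
  }
}
```

A file shipper such as Filebeat can then forward the resulting lines as they are, and filtering on `req_id` brings together every entry for one request.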
Our first Node.js project
Performance, performance & performance
The performance cycle
Code, Benchmark, … Repeat
• Sockets have Nagle's algorithm on by default -> very high latencies
• The event loop can get backed up quickly -> hunt for all CPU-intensive tasks in the main loop
• Buffers are much more efficient when writing the server response
• Micro-optimizations: Date.now() is faster than new Date()
• Beware of libraries doing way too many things for you
• ES6 support: Babel 5 was killing performance -> moved to Babel 6
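Two of these points, Nagle's algorithm and Buffer responses, can be shown on a bare TCP server. This is a minimal sketch, not code from the S3 server: the handler name, the fixed reply and the port are assumptions. `socket.setNoDelay(true)` is the standard Node.js call that disables Nagle's algorithm on a socket.

```javascript
const net = require('net');

// Encode the static response once, so each connection writes a ready-made Buffer
// instead of re-encoding strings. The payload is a placeholder.
const BODY = Buffer.from('OK\n');
const RESPONSE = Buffer.concat([
  Buffer.from(
    'HTTP/1.1 200 OK\r\n' +
    'Content-Type: text/plain\r\n' +
    `Content-Length: ${BODY.length}\r\n` +
    'Connection: close\r\n\r\n'
  ),
  BODY,
]);

// Hypothetical connection handler.
function handleConnection(socket) {
  // Disable Nagle's algorithm so small writes are sent immediately.
  socket.setNoDelay(true);
  // Reply once the first request bytes arrive, then close the connection.
  socket.once('data', () => socket.end(RESPONSE));
}

// Not called here; the port is an arbitrary choice made by the caller.
function startServer(port) {
  return net.createServer(handleConnection).listen(port);
}
```

Calling `startServer(8000)` and timing requests before and after removing the `setNoDelay` line is one way to run the benchmark step of the cycle.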
Nifty Node Tools
Nifty Node Tools
Getting going
• Airbnb JavaScript Style Guide + ESLint
• Babel: moved from Babel 5 to Babel 6 with just imports, destructuring and default parameters
• Commander: cool CLI tools in minutes
• Async
Nifty Node Tools
Getting serious
• Level — LevelDB wrapper for node
• Memcached — client library for node
• xml — <parse>yes</parse>
• Profiler — Go fast or go home
Nifty Node Tools
Might as well test
• Mocha
• Istanbul
• lolex
• aws-node-sdk
Nifty Node Tools
Docs and Open Source Code
• Docs are good, but
• Code is even better
• Read the readable stream code and take a nap.
• Then read the transform stream code and create new universes.
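As a taste of what a transform stream allows, here is a minimal sketch built on Node's `stream.Transform`. It cuts an incoming byte stream into fixed-size chunks. The class name and chunk size are hypothetical, and this is not the chunking code used in the projects above.

```javascript
const { Transform } = require('stream');

// Hypothetical example: splits incoming bytes into fixed-size chunks.
class FixedSizeChunker extends Transform {
  constructor(chunkSize) {
    // Emit each chunk as a discrete object so chunk boundaries survive buffering.
    super({ readableObjectMode: true });
    this.chunkSize = chunkSize;
    this.pending = Buffer.alloc(0);
  }

  _transform(data, encoding, callback) {
    // Accumulate input, then push every complete chunk.
    this.pending = Buffer.concat([this.pending, data]);
    while (this.pending.length >= this.chunkSize) {
      this.push(this.pending.subarray(0, this.chunkSize));
      this.pending = this.pending.subarray(this.chunkSize);
    }
    callback();
  }

  _flush(callback) {
    // Emit whatever is left when the input ends.
    if (this.pending.length > 0) this.push(this.pending);
    callback();
  }
}
```

Piping a file stream through `new FixedSizeChunker(1024 * 1024)` would produce 1 MiB pieces followed by a shorter final piece.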
What can you do with this thing?
Download the code!
http://s3.scality.com/
https://github.com/scality/s3
Lauren:
github: laurenspiegel
twitter: @notfollowingyet
Giorgio:
github: @giorgioregni
twitter: @giorgioregni
