Hadoop Networking at DataSift
"How I learned to stop worrying and love Arista Switches"
About Me
Gareth Llewellyn
Senior Operations Engineer at DataSift

@NetworkString
about.me/GarethLlewellyn
blog.NetworksAreMadeOfString.co.uk
Overview
●   What is DataSift
●   The DataSift platform in numbers
●   Our initial network design
●   Issues with the initial design
●   Considered designs
●   Implementation
●   Questions
What is DataSift
● Real-time and historical curation and filtering
  of many sources, e.g. Facebook, Twitter,
  YouTube
● Augmentation of data, e.g. demographics, link
  resolution
● Real-time streams via WebSockets, HTTP
  POST / PUT, SFTP etc.
● Historical queries against data from as long
  ago as 2010
Platform in Numbers; Servers
● ~7k cores at 2.13 - 2.8 GHz

● ~8 TB RAM

● ~2 PB storage

● ~380 amps peak draw

● Heterogeneous mix of chassis: Intel
  SR2600URLXR, Dell R710s & HP DL380 Gen7
  / Gen8s
Platform in Numbers; Traffic
● Writes
  ○ ~300Mb/s inbound streams

● Replication
  ○ Peaks of 24Gb/s

● Map Reduce
  ○ Peaks of 70Gb/s

● Exports
Initial Network Design
Buffers & Discards
Moving through the Cisco portfolio
● 2960
  ○ 2.7 Mpps
  ○ 32 Gb/s
● 3560
  ○ 13.1 Mpps
  ○ 32 Gb/s
● 3750
  ○ 38.7 Mpps
  ○ 32 Gb/s
● 4948
  ○ 72 Mpps
  ○ 96 Gb/s
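To put the forwarding figures above in context, the following minimal sketch computes the packet rate needed to forward minimum-size Ethernet frames at line rate and compares it with each quoted figure. The 48 x 1 Gb/s port count is an assumption for illustration (actual port counts vary by model and SKU), and the function name is hypothetical.

```python
# Bits on the wire for a minimum-size Ethernet frame:
# 64-byte frame + 8-byte preamble/SFD + 12-byte inter-frame gap.
MIN_FRAME_WIRE_BITS = (64 + 8 + 12) * 8  # 672 bits


def line_rate_mpps(link_gbps: float, ports: int = 1) -> float:
    """Mpps needed to forward minimum-size frames at line rate on `ports` links."""
    return link_gbps * 1e9 * ports / MIN_FRAME_WIRE_BITS / 1e6


# Forwarding rates quoted on the slide, in Mpps.
quoted = {"2960": 2.7, "3560": 13.1, "3750": 38.7, "4948": 72.0}

# Assumption: 48 x 1 Gb/s access ports, uplinks ignored.
needed = line_rate_mpps(1.0, ports=48)

for model, mpps in quoted.items():
    share = mpps / needed
    print(f"{model}: {mpps:5.1f} Mpps quoted, {share:.0%} of the {needed:.1f} Mpps needed")
```

Real traffic rarely consists only of 64-byte frames, so this is a worst-case bound rather than a measure of typical load.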
Redesigning the Network
● Uplink oversubscription
   ○ Servers per cab
   ○ Gbit uplinks per server
   ○ Extensibility / Redundancy of uplinks
● Redundancy of TOR / Core / Distribution
   ○ Power
   ○ Chassis
   ○ Management Controllers
● Performance
   ○ Buffers
   ○ Head of line blocking
● Extensibility / Scalability
   ○ Number of Hosts / Cabs supported
   ○ Backplane
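The uplink oversubscription factor in the list above comes down to simple arithmetic: worst-case server demand in a cab divided by that cab's uplink capacity. A minimal sketch, using the 48 x 1Gb access ports and 2 to 4 10Gb uplinks that the TOR slides later in this deck describe; the function name and the assumption that every port hosts a server sending at full rate are illustrative only.

```python
def oversubscription_ratio(servers: int, gbit_per_server: float,
                           uplinks: int, gbit_per_uplink: float) -> float:
    """Worst-case server demand divided by uplink capacity for one cab."""
    downstream = servers * gbit_per_server   # total Gb/s the servers could offer
    upstream = uplinks * gbit_per_uplink     # total Gb/s leaving the cab
    return downstream / upstream


# Assumption: a fully populated cab, one 1 Gb/s link per server.
for uplinks in (2, 4):
    ratio = oversubscription_ratio(48, 1, uplinks, 10)
    print(f"{uplinks} x 10Gb uplinks: {ratio:.1f}:1 oversubscribed")
```

A ratio of 1:1 or lower means the uplinks can carry everything the servers in the cab can send; anything higher relies on not all servers bursting at once.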
Redundancy: Mesh
Pros:
● Inter cab transit is truly cab to cab
● OSPF reduces admin overhead
● Cisco IOS

Cons:
● OSPF licence adds cost and
  increases complexity
● Uplink oversubscription
● More ports for routing than
  hosts
Uplink Oversubscription: Chassis
Pros:
● 720,000,000 pps
● 80 Gb/s of inter-blade transit
● Cisco IOS
● Dual Supervisors / PSU
Cons:
● Still suffers Head of Line blocking
● Only 2 PSUs
● Overpopulated line cards increase
   failure impact
● Chassis failure (unlikely) = disaster
● And....
Cables!
Where next?
If I have seen further it is by standing on the
shoulders of Giants
                                    - Isaac Newton

Benoit Sigoure’s presentation at a Hadoop user
group in 2011
Leaf and Spine
Arista 7050s & 7048s




●   2x 52-port 10Gbit 7050 core switches
●   12x TOR switches, each with 48x 1Gb and 4x 10Gb ports
●   /27 public subnet per rack
●   ECMP routes to all racks
●   Dual PSU with disparate PDU / Dist Board / UPS /
    Generator
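The per-rack /27 scheme can be sketched with Python's standard `ipaddress` module. The supernet `10.0.0.0/23` is a placeholder (the deck uses public addresses that are not given), and treating the first usable address as the switch's SVI is an assumption, not something the slides state.

```python
import ipaddress

# Placeholder supernet; the real public range is not given in the deck.
supernet = ipaddress.ip_network("10.0.0.0/23")

# Carve the supernet into /27s, one per rack.
rack_subnets = list(supernet.subnets(new_prefix=27))


def rack_plan(subnet: ipaddress.IPv4Network) -> dict:
    """Split a rack's /27 into the SVI address and the addresses left for servers."""
    hosts = list(subnet.hosts())  # excludes network and broadcast addresses
    # Assumption: the SVI takes the first usable address.
    return {"subnet": subnet, "svi": hosts[0], "servers": hosts[1:]}


# 12 racks, matching the 12 TOR switches on the slide.
plans = [rack_plan(s) for s in rack_subnets[:12]]

for rack, plan in enumerate(plans, start=1):
    print(f"rack {rack:2d}: {plan['subnet']}  SVI {plan['svi']}  "
          f"{len(plan['servers'])} server addresses")
```

Each /27 holds 32 addresses, 30 of them usable, so one address for the SVI leaves 29 for servers in each rack.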
The Core
Arista 7050
●   Each 7050 is a separate layer 2 network
●   SVIs on VLAN for Internet routing
●   Static routes
●   1.2 Tb/s throughput / 960 Mpps forwarding
Top of Rack
Arista 7048
●   VLAN number = Cab Number
●   SVI consumes 1 IP from /27
●   Static ECMP routes to all other cabs
●   Minimum of 2x 10Gbit uplinks
●   176 Gb/s throughput
●   132 Mpps forwarding
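Static ECMP routes to every other cab are tedious to write by hand, so a small generator is one way to produce them. The sketch below is illustrative only: the supernet, the two next-hop addresses (one per core switch) and the function name are hypothetical, and the output uses the general `ip route <prefix> <next-hop>` form, which should be checked against the switch OS documentation before use.

```python
import ipaddress


def static_ecmp_routes(local_cab: int, cab_subnets: dict, next_hops: list) -> list:
    """Return one route line per (remote cab, next hop) pair.

    Several routes to the same prefix via different next hops are what
    give the switch equal-cost paths to balance across.
    """
    lines = []
    for cab, subnet in sorted(cab_subnets.items()):
        if cab == local_cab:
            continue  # the local /27 is directly connected
        for hop in next_hops:
            lines.append(f"ip route {subnet} {hop}")
    return lines


# Hypothetical addressing: one /27 per cab from a placeholder supernet.
supernet = ipaddress.ip_network("10.0.0.0/23")
cab_subnets = dict(zip(range(1, 13), supernet.subnets(new_prefix=27)))

# Hypothetical next hops, one on each of the two core switches.
next_hops = ["10.255.1.1", "10.255.2.1"]

routes = static_ecmp_routes(1, cab_subnets, next_hops)
print("\n".join(routes))
```

With 12 cabs and two uplinks, each TOR switch carries 22 such routes, which is small enough to manage statically without a routing protocol.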
Questions




Yes, we're hiring ;)
