Switches, Penguins and
One Bad Cable
Alex Balk
Back on AUGUST 13th 2015
ADI NAVEH
ALEX BALK
ALEX KARASIK
CHEN SHABI
DAFNA FRANK
DORI SHMUEL
GAI RADZI
GERARDO LARACUENTE
GUY MAZUZ
RYAN MCQUILLAN
SHAHAF SAGES
YAFIT MELES
We Are
Core
OUR PURPOSE
OUTBRAIN helps people discover
content that they find interesting.
250 Billion
Content Recommendations
Every Month
½ Billion
People Worldwide
OUTBRAIN
BY THE NUMBERS
THE BEGINNING
Building a Layer 3 network with Cumulus Linux
Two Main Networking Challenges
AVAILABILITY
MANAGEMENT
4 racks
80 nodes
320 nodes
1 switch
[Diagram: three Nodes, Stack A and Stack B, Backbone]
SCALE IS ABOUT
DOING MORE WITH LESS
6 Million
Metrics generated every minute
150 Releases
To production every day
OUTBRAIN
BY THE NUMBERS
SCALE IS ABOUT
TURNING THE LIGHTS ON
Network (gasp!) was 100% Manual!
• Every change = risk
• Switching stack proprietary
• Debugging = fight or just a hit-n-miss
• Lead time to set-up new stack measured in weeks!
• No way to scale to the next 10X
June 2017
OUTBRAIN OFFICES
New Data Center = Clos Fabric — running BGP end-to-end
[Diagram: three Nodes, Leaf A and Leaf B, three Spines]
No bonding.
No backbone.
Everything is just a router!
All possible paths to all possible destinations are constructed hop by hop.
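To make "all possible paths" concrete, here is a minimal Python sketch that enumerates the equal-cost shortest paths between two nodes. The fabric (two racks, two leaves per rack, three spines) and all device names are hypothetical, chosen only to resemble the diagram. The breadth-first search is a stand-in for illustration; it is not how BGP computes routes, it only shows the set of paths a fabric like this ends up with.

```python
from collections import deque


def equal_cost_paths(links, src, dst):
    """Return every shortest path from src to dst in an undirected graph."""
    # Build an adjacency map from the list of cables.
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Breadth-first search records each device's hop distance from src.
    distance = {src: 0}
    queue = deque([src])
    while queue:
        device = queue.popleft()
        for peer in neighbors.get(device, ()):
            if peer not in distance:
                distance[peer] = distance[device] + 1
                queue.append(peer)

    if dst not in distance:
        return []

    # Walk back from dst, stepping only to peers that are one hop closer to src.
    def walk(device):
        if device == src:
            return [[src]]
        paths = []
        for peer in sorted(neighbors[device]):
            if distance.get(peer) == distance[device] - 1:
                paths.extend(p + [device] for p in walk(peer))
        return paths

    return walk(dst)


# Hypothetical fabric: two racks, two leaves per rack, three spines.
SPINES = ["spine1", "spine2", "spine3"]
LEAVES = {"rack1": ["leaf1a", "leaf1b"], "rack2": ["leaf2a", "leaf2b"]}
LINKS = [(leaf, spine) for rack in LEAVES.values() for leaf in rack for spine in SPINES]
LINKS += [("node1", leaf) for leaf in LEAVES["rack1"]]
LINKS += [("node2", leaf) for leaf in LEAVES["rack2"]]

paths = equal_cost_paths(LINKS, "node1", "node2")
print(len(paths))  # 2 leaves x 3 spines x 2 leaves = 12 equal-cost paths
```

In this toy fabric every path between the two nodes has the same number of hops, which is what lets the network treat them as interchangeable.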
ECMP = “Send it down any available path, they’re all the same”.
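As a rough illustration of "any available path", the Python sketch below picks one of several equal-cost next hops by hashing a flow's 5-tuple. This is an assumption about the general technique, not a description of what the switches in this deck do: real hardware uses its own hash functions, and `hashlib.sha256`, the function name `pick_next_hop`, and the spine names are stand-ins chosen for the example.

```python
import hashlib


def pick_next_hop(flow, next_hops):
    """Map a flow 5-tuple onto one of several equal-cost next hops."""
    # Hash the whole 5-tuple so every packet of a flow takes the same path.
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical next hops and flows, for illustration only.
SPINES = ["spine1", "spine2", "spine3"]
flows = [("10.0.1.5", "10.0.2.9", "tcp", 40000 + i, 443) for i in range(1000)]

usage = {spine: 0 for spine in SPINES}
for flow in flows:
    usage[pick_next_hop(flow, SPINES)] += 1
print(usage)
```

Hashing per flow rather than per packet keeps a flow's packets in order while still spreading different flows across all the spines.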
A Network That is Now
SIMPLE
PREDICTABLE
SCALABLE
DEVICE MANAGEMENT
CABLE MANAGEMENT
SETUP TIME
MONITORING
TESTING
SCALE IS ABOUT
BUILDING THE RIGHT CULTURE
SCALE IS ABOUT
CHOOSING THE RIGHT TOOLS
5 DAYS to bootstrap the new datacenter
99% Of code worked as expected
1 Bad cable...out of 3,000
Thank
You
Hardcore Tech Stuff
Slides shamelessly “borrowed”
from Adi Naveh’s internal tech talk
Infranet Team
Gai
Adi Yafit
Chen
Traditional Network Topology
[Diagram: Core, Aggregation and Access layers, with ISP, Load Balancers, Clients and Services; North-South Traffic]
Traditional Network Topology in Data Center
[Diagram: Core, Aggregation and Access layers with Services; North-South Traffic and East-West Traffic]


Editor's Notes

  • #19: Publishers from all around the world
  • #31: In fact, when we designed the new datacenter, we wrote Chef cookbooks to automate provisioning and config. We wrote unit and integration tests using Chef’s toolchain, and set up a CI pipeline for the code. We even simulated the entire datacenter, switches, servers and all, using Vagrant. It worked so well that bootstrapping the new datacenter took us just 5 days. Think about it: the first time we ever saw a real Dell switch running Linux was when we arrived onsite for the buildout. And yet, 99% of our code worked as expected. In 5 days, we were able to set up a LAN, VPN, server provisioning, DNS and LDAP, and deal with some quirky BIOS configs. On the servers, mind you, not the switches. We even hooked Cumulus’ built-in cabling validation into our Prometheus-based monitoring system, so that right after we turned monitoring on, we got an alert. On one bad cable. Out of 3000.
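
The cabling check described in this note can be sketched in Python as a comparison between a cabling plan and what each port actually sees, rendered in the Prometheus text exposition format. The notes do not show the actual integration, so everything here is an assumption for illustration: the metric name `cabling_link_ok`, the function names, the device and port names, and both data sets are hypothetical and are not output from Cumulus Linux.

```python
def check_cabling(expected, observed):
    """Compare the cabling plan with what each port actually sees."""
    results = []
    for (device, port), planned_peer in sorted(expected.items()):
        actual_peer = observed.get((device, port))  # None if nothing is seen
        results.append((device, port, actual_peer == planned_peer))
    return results


def to_prometheus(results, metric="cabling_link_ok"):
    """Render results in the Prometheus text exposition format."""
    lines = [f"# TYPE {metric} gauge"]
    for device, port, ok in results:
        lines.append(f'{metric}{{device="{device}",port="{port}"}} {int(ok)}')
    return "\n".join(lines) + "\n"


# Hypothetical plan: (device, port) -> (peer device, peer port).
EXPECTED = {
    ("leaf1", "swp1"): ("spine1", "swp1"),
    ("leaf1", "swp2"): ("spine2", "swp1"),
    ("leaf2", "swp1"): ("spine1", "swp2"),
}
# Hypothetical observation: leaf1 swp2 is plugged into the wrong spine.
OBSERVED = {
    ("leaf1", "swp1"): ("spine1", "swp1"),
    ("leaf1", "swp2"): ("spine3", "swp1"),
    ("leaf2", "swp1"): ("spine1", "swp2"),
}

results = check_cabling(EXPECTED, OBSERVED)
bad_links = [(device, port) for device, port, ok in results if not ok]
print(to_prometheus(results), end="")
```

With a gauge like this exported per port, an alerting rule that fires whenever the value is 0 would be enough to surface a single miscabled link.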