www.dataloop.io | @dataloopio | info@dataloop.io
Monitoring for Online Services
Obligatory about me slide..
•  Born a SysAdmin
•  Always worked in software companies
•  Head of Ops a bunch of times
•  Helped launch quite a few SaaS products
•  Founded Dataloop.IO end of 2013
•  Worked solely on monitoring ever since
Context of talk
Companies running online services at scale who are on public /
private cloud, moving to micro-services and ‘doing’ devops
The World As We Know It
Application
MySQL Database
OpsView (Nagios)
Logstash ElasticSearch Kibana
AppDynamics Pingdom
Graphite CollectD
PagerDuty
Amazon AWS
Alfresco JVM
SOLR
Transformations
Browser
Google Analytics
Custom Scripts
Reporting
System
SQL DB’s
Mixpanel GoSquared
Geckoboard
Netflix’s “Monitoring System”
“We aggregate the call volume from all of our offices and when that changes
significantly we investigate. It then might be a case of letting Microsoft know
that Xbox is down”
Expedia’s “Monitoring System”
“We have a very consistent shape to our bookings graph. If that changes by
more than a few % throughout the day we know something is up even if
nothing else is alerting”
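The "consistent shape" idea in the quote above can be sketched as a comparison between an expected baseline and what was actually observed. This is only an illustration, not Expedia's implementation: the function name, the hourly buckets and the 5% default tolerance are all assumptions made for the example.

```python
from typing import List, Sequence


def deviation_alerts(
    baseline: Sequence[float],
    observed: Sequence[float],
    tolerance: float = 0.05,  # assumed stand-in for "more than a few %"
) -> List[int]:
    """Return the indices (e.g. hours of the day) where the observed value
    differs from the baseline by more than `tolerance` as a fraction."""
    if len(baseline) != len(observed):
        raise ValueError("baseline and observed must cover the same buckets")

    alerts = []
    for i, (expected, actual) in enumerate(zip(baseline, observed)):
        if expected == 0:
            # No meaningful ratio against zero: flag any activity at all.
            if actual != 0:
                alerts.append(i)
            continue
        if abs(actual - expected) / expected > tolerance:
            alerts.append(i)
    return alerts


# Hypothetical bookings per bucket: the second bucket is 25% below normal.
suspicious = deviation_alerts([100, 200, 300], [102, 150, 300])
```

A business-level signal like this can flag a problem even when every individual service check is green, which is the point both quotes make.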
Our Sample
The Results
http://blog.dataloop.io/2014/01/30/what-we-learnt-talking-to-60-companies-about-monitoring/
How Your Tools Change with No. of Servers
Your Typical Monitoring Stack
Is my site up or down?
(External)
What happened?
(Logs)
How is my application performing?
(APM)
What’s my app actually doing?
(Custom Metrics)
Is everything working as expected?
(Service)
Dashing
(Custom Dashboards)
Your Automation Today
•  Tied to releases
•  Often a bit of a mess
•  Ops centric
•  High friction
•  Bus factor
DevOps Cardiff - Monitoring Automation for DevOps
Attempts to improve things
•  Educate, educate, educate! (and then encourage devs to learn configuration management)
•  Automation and ‘just drop the script here’ in the repo
•  AppDynamics and New Relic
•  Sensu, Check_MK and custom portals / wiki pages etc
Why?
Making Software Is Comparable to Manufacturing
Books!
How Software Is Developed
The Solution
Technology only gets you so far. To be really successful
you need to devolve monitoring down to the teams that
own their part of the service.
The key to good monitoring
Why You Need An Operational Historian
http://en.wikipedia.org/wiki/Operational_historian
Developers Must Own Your Monitoring
•  Devs outnumber Ops 7 to 1 in micro-services
•  The best person to write the check script
•  Responsible for the service not just code
•  Production data makes everyone cleverer
•  Bridges the gap between Dev + Ops
•  Data driven decisions!
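To make "the check script" concrete, here is a minimal sketch of the kind of script a developer might write for their own service. It follows the widely used Nagios plugin convention (exit code 0 = OK, 1 = WARNING, 2 = CRITICAL, with optional performance data after a "|"). The queue-depth metric, the thresholds and the function names are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def check_queue_depth(depth: int, warn: int = 100, crit: int = 500) -> Tuple[int, str]:
    """Map a (hypothetical) queue depth to a Nagios-style status and message."""
    if depth >= crit:
        code, label = CRITICAL, "CRITICAL"
    elif depth >= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    # Text after "|" is performance data: label=value;warn;crit
    message = f"{label} - queue depth is {depth} | queue_depth={depth};{warn};{crit}"
    return code, message


def run(argv: List[str]) -> int:
    """Print the status line and return the exit code for the caller to use."""
    # In a real check the depth would be read from the service itself;
    # here it is taken from the command line to keep the sketch runnable.
    depth = int(argv[1]) if len(argv) > 1 else 0
    code, message = check_queue_depth(depth)
    print(message)
    return code
```

Run as a script, the last step would be `sys.exit(run(sys.argv))` so the monitoring agent can read the status from the exit code. The developer who owns the service is best placed to know which number matters and where the thresholds should sit.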
Our focus at Dataloop
•  Simple setup by Ops team (agent install and tags)
•  Self-service metric collection via a ‘create in the browser’ script editor
•  Script deployment decoupled from configuration management via drag and drop
•  Real-time dashboards that anyone can create
•  Simple alerting rules that teams can manage themselves
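As a rough illustration of how tags and simple alerting rules could fit together, the sketch below scopes a threshold rule to every agent carrying a given tag. It is not Dataloop's actual data model or API: `AlertRule`, `firing_agents` and the sample agents and metrics are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class AlertRule:
    tag: str          # rule applies to every agent carrying this tag
    metric: str       # name of the metric to inspect
    threshold: float  # fire when the latest value is above this


def firing_agents(
    rule: AlertRule,
    agent_tags: Dict[str, Set[str]],
    latest: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the agents (sorted by name) for which the rule currently fires."""
    hits = []
    for agent, tags in sorted(agent_tags.items()):
        if rule.tag not in tags:
            continue  # rule is scoped by tag, not by hostname
        value = latest.get(agent, {}).get(rule.metric)
        if value is not None and value > rule.threshold:
            hits.append(agent)
    return hits


# Hypothetical fleet: Ops installs agents and tags them once.
agent_tags = {"web-1": {"web", "prod"}, "web-2": {"web"}, "db-1": {"db"}}
latest = {"web-1": {"cpu": 95.0}, "web-2": {"cpu": 40.0}, "db-1": {"cpu": 99.0}}

# A team-owned rule: alert when CPU on any "web" agent goes above 90.
web_cpu = AlertRule(tag="web", metric="cpu", threshold=90.0)
```

Scoping by tag rather than by hostname means that, in this sketch, a newly tagged server is covered by a team's existing rules without anyone editing them.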
The infinite loop
Operational Intelligence != Analytics
www.dataloop.io
@dataloopio
info@dataloop.io
Red Stops The floor
Public Status
Dataloop Crown Jewels
Deployment Radiator
Visually Defeating GHOST
Q&A
