{ 
title: ‘ELK Wrestling’, 
author: ‘Steve Elliott’, 
company: ‘LateRooms.com’, 
type: ‘DevOpsLeeds’, 
@timestamp: ‘2014-10-13T18:30Z’ 
}
Featuring Live Demo! 
Please tweet! 
Include: “leedsdevops”
Home growing a metrics culture 
Needed visibility of live issues 
Had trialled off the shelf before (Splunk) 
Hadn’t gained traction 
Wanted the data still
Options... 
Tried Splunk 
...Bit pricey, pay for HW and volume of data indexed 
Looked at cloud-based options, were also expensive
It started with Badger...
Logging and Monitoring Project 
Locate and implement the tools we needed 
Started with Cube for metrics (wouldn’t recommend) 
Moved onto Logging
Current tooling... 
...Lacking
“But it works”
What can we log? 
Pretty much anything with a timestamp 
Error log 
Web logs 
Proxy logs 
Releases? 
Tweets?
Logstash 
ELK
High level architectural design 
[Diagram; labels: Web servers, Queue, Dashboards, Elasticsearch, Rest of Badger]
Real time search and analytics database
Who’s using it? 
Certain other hotel website... 
...Clever people
Working with Elasticsearch 
● RESTful API 
● JSON 
● Many libraries to deal with it (new one: ElasticLinq for C#)
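The RESTful API and JSON points can be illustrated with a minimal Python sketch. It only builds a search request and never sends it. The host `localhost:9200`, the index name `logstash-2014.10.13` and the field `type` are assumptions for illustration, not details from the talk.

```python
import json
from urllib.request import Request


def build_search_request(base_url, index, field, value, size=10):
    """Build (but do not send) a request for Elasticsearch's _search endpoint."""
    # The query itself is plain JSON: a "match" query on a single field.
    body = {
        "query": {"match": {field: value}},
        "size": size,
    }
    return Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, index and field names.
req = build_search_request("http://localhost:9200", "logstash-2014.10.13", "type", "error")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it to a running cluster; the response body is JSON as well.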
Sense Chrome Extension
Clustering 
Excellent distributed features 
Easy to use 
Node Self discovery 
Different Node Types 
(Data, Master, Search, Client) 
“Live”: SSD 
“Archive”: HDD
More in depth architecture 
[Diagram; labels: IIS, Logs, Errors, WMI, Collector (e.g. Live Server), Queue, Forwarder, Cube (/TSDB), Search, Analytics, RabbitMQ, Filter & Forward]
Logstash 
Inputs (e.g. HTTP logs, UDP, error logs, tweets) 
Filters (e.g. filter, grok, lookup IP, magic…) 
Outputs (e.g. UDP, elasticsearch, graphite, IRC)
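The inputs, filters and outputs stages can be sketched in a few lines of Python. This is not Logstash configuration, only an illustration of the same three-stage idea. The log line format, the regular expression (standing in for a grok pattern) and the `parse_failure` tag are all assumptions made for the example.

```python
import json
import re

# Hypothetical log format for illustration: "<ISO timestamp> <LEVEL> <message>"
LINE_PATTERN = re.compile(r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$")


def read_input(lines):
    """Input stage: yield raw lines, skipping blanks."""
    for line in lines:
        line = line.strip()
        if line:
            yield line


def parse_filter(raw_lines):
    """Filter stage: turn each raw line into a structured event."""
    for raw in raw_lines:
        match = LINE_PATTERN.match(raw)
        if match is None:
            # Keep unparseable lines, tagged so they can be found later.
            yield {"message": raw, "tags": ["parse_failure"]}
        else:
            event = match.groupdict()
            event["@timestamp"] = event.pop("timestamp")
            yield event


def json_output(events):
    """Output stage: serialise each event as one JSON document."""
    return [json.dumps(event, sort_keys=True) for event in events]


sample = [
    "2014-10-13T18:30:00Z ERROR Booking failed",
    "",
    "not a structured line",
]
documents = json_output(parse_filter(read_input(sample)))
for doc in documents:
    print(doc)
```

Each stage only consumes the previous stage's output, which is what makes it easy to swap one input or output for another.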
Why the Queue? 
● Resilience 
● Single source of data for everyone 
● Logstash used to recommend RabbitMQ, 
now they recommend Redis 
● We still use RabbitMQ, works for us
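The resilience and single-source points above can be shown with a small in-memory sketch. The `FanoutBroker` class is hypothetical and only stands in for a real broker such as RabbitMQ; the subscriber names and event fields are also made up for the example.

```python
from collections import deque


class FanoutBroker:
    """In-memory stand-in for a message broker with one queue per subscriber."""

    def __init__(self):
        self._queues = {}

    def subscribe(self, name):
        self._queues[name] = deque()

    def publish(self, event):
        # Each subscriber has its own queue, so a slow or offline consumer
        # does not affect the others.
        for pending in self._queues.values():
            pending.append(event)

    def drain(self, name):
        """Return and remove everything waiting for the given subscriber."""
        pending = self._queues[name]
        events = list(pending)
        pending.clear()
        return events


broker = FanoutBroker()
broker.subscribe("elasticsearch")
broker.subscribe("metrics")

broker.publish({"message": "request served", "status": 200})
broker.publish({"message": "request failed", "status": 500})

# The metrics consumer reads immediately; the Elasticsearch consumer is "down".
metrics_events = broker.drain("metrics")

broker.publish({"message": "request served", "status": 200})

# When the Elasticsearch consumer comes back, its backlog is still waiting.
backlog = broker.drain("elasticsearch")
print(len(metrics_events), len(backlog))
```

The producer never needs to know which consumers exist or whether they are currently running.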
Kibana 
● Easy to build dashboards 
● Gateway drug to Elasticsearch queries 
● Examples!
ELK Wrestling (Leeds DevOps)
But...
Demo
Mistake: Dashboard Fatigue 
Too many dashboards to watch! 
Need to do more on alerting
Mistake: Using elasticsearch as a TSDB 
Lots of graphs just cared about top level values, should use a TSDB (such as graphite) instead 
Elasticsearch is better used for more in-depth data analysis
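For top level values, sending a number straight to a TSDB is cheap. A minimal sketch, assuming a Graphite Carbon listener that speaks the plaintext protocol (one `path value timestamp` line per metric); the metric path `web.live.errors.count` is hypothetical, and the send function is defined but not called here.

```python
import socket


def graphite_line(path, value, timestamp):
    """Format one metric as a Graphite plaintext line: path, value, Unix timestamp."""
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(lines, host, port=2003, timeout=2.0):
    """Send pre-formatted lines to a Carbon plaintext listener (2003 is the usual port)."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


# Only formatting happens here; send_metrics is not called, so no server is needed.
# 1413225000 is 2014-10-13T18:30:00Z as a Unix timestamp.
line = graphite_line("web.live.errors.count", 3, 1413225000)
print(repr(line))
```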
Mistake: Trying to keep too much data 
● Nodes going out of memory or disk space is bad 
● Long GC can cause nodes to drop 
● Can lead to split brain 
● More shards = more memory usage, watch your scaling
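One way to stop keeping too much data is a retention window over daily indices. The sketch below only selects which index names have expired; it does not talk to a cluster or delete anything. It assumes Logstash's default daily naming (`logstash-YYYY.MM.dd`); the 30 day window and the example index list are made up.

```python
from datetime import date, datetime, timedelta

INDEX_PREFIX = "logstash-"  # assumes Logstash's default daily index naming


def index_date(index_name):
    """Return the date encoded in a daily index name, or None if it does not match."""
    if not index_name.startswith(INDEX_PREFIX):
        return None
    try:
        return datetime.strptime(index_name[len(INDEX_PREFIX):], "%Y.%m.%d").date()
    except ValueError:
        return None


def indices_to_drop(index_names, today, keep_days):
    """List daily indices older than the retention window, oldest first."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in index_names:
        day = index_date(name)
        # Indices that do not follow the daily pattern are never selected.
        if day is not None and day < cutoff:
            expired.append((day, name))
    return [name for _, name in sorted(expired)]


names = ["logstash-2014.10.13", "logstash-2014.09.01", "kibana-int", "logstash-2014.08.15"]
print(indices_to_drop(names, today=date(2014, 10, 13), keep_days=30))
```

Fewer retained daily indices also means fewer shards, which ties back to the memory point above.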
Scaling 
Hit two bottlenecks 
- Ingestion (solved with SSDs) 
- Search (solved by scaling horizontally) 
1.4.0 brings stability improvements, should handle OOM better
Other Mistakes 
Should have automated sooner (good Chef/Puppet support) 
Should have used “normal” logstash more 
More node, more awesome??
What went right? 
● Free and easy access to Data 
● Doesn’t need to be on elasticsearch, but the tooling makes it easy 
● Give people access and they’ll seek out the data to drive decisions - start the feedback loop 
● Dev/Test instance
ELK in the wild 
Data Driven QA 
Data Driven...Managering
But wait, there’s more! 
Curator, Kibana 4 (Woo - aggregations), alerting, linking logs together… 
Too much to cover here!
Thanks for Listening! 
More: elasticsearch.org, logstash.net 
Blog: www.tegud.net 
Twitter: @tegud 
Github: www.github.com/tegud 
Come say hi!
