Monitoring MySQL with OpenTSDB
Percona Live 2013, Geoffrey Anderson, Box Inc.
@geodbz
Who
Geoffrey Anderson
• Database Operations Engineer @ Box, Inc.
• a.k.a. DBA
• Tooling for MySQL and HBase
• #DBHangOps
The Situation
Then You Get More Servers
Enter OpenTSDB
OpenTSDB is...
• Distributed
• Scalable
• Time Series Database
• Runs on HBase
• Created by Benoit Sigoure
[Architecture diagram. Components: mydb.example.com (pushes metrics), HAProxy, TSD for storing, HBase, TSD for querying, fe1.example.com (queries via API)]
Why OpenTSDB?
• FAST
• EASY to Scale
• EASY to Populate
• EASY to collect data
• EASY to Query
Collecting Data
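#!/usr/bin/env bash
# Print every MySQL global status counter as an OpenTSDB data point.
timestamp=$(date +%s)
# -ss: terse output, one "name value" row per counter
mysql -ss -e "SHOW GLOBAL STATUS" | while read -r var val
do
  echo "mysql.$var $timestamp $val host=$HOSTNAME"
done
ganderson@mydb.example.com:~$ ./mysql_collector.sh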
mysql.Aborted_connects 1366399993 0 host=mydb.example.com
mysql.Binlog_cache_disk_use 1366399993 0 host=mydb.example.com
mysql.Binlog_cache_use 1366399993 0 host=mydb.example.com
mysql.Binlog_stmt_cache_disk_use 1366399993 0 host=mydb.example.com
mysql.Binlog_stmt_cache_use 1366399993 0 host=mydb.example.com
mysql.Bytes_received 1366399993 19453687 host=mydb.example.com
mysql.Bytes_sent 1366399993 1238166682 host=mydb.example.com
mysql.Com_admin_commands 1366399993 1 host=mydb.example.com
mysql.Com_assign_to_keycache 1366399993 0 host=mydb.example.com
...
Example: mysql_collector.sh
Line format: metric name, timestamp, value, “tags” (key=val)
* * * * * mysql_collector.sh | sed 's/^/put /' | nc opentsdb.example.com 4242
Example: adding a cron for OpenTSDB
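OpenTSDB's telnet-style interface takes each data point as a `put` command and only stores numeric values, while SHOW GLOBAL STATUS also returns a few string values such as ON/OFF. A minimal sketch of a filter that could sit between the collector and nc follows; the name `to_put_lines` is made up for illustration and is not part of OpenTSDB.

```bash
#!/usr/bin/env bash
# to_put_lines (hypothetical helper): read "metric timestamp value tags..."
# lines on stdin, drop lines whose value is not numeric, and prefix the
# rest with "put " for the OpenTSDB telnet-style interface.
to_put_lines() {
  awk '$3 ~ /^-?[0-9]+(\.[0-9]+)?$/ { print "put " $0 }'
}

# Demo with two sample lines; the second has a non-numeric value and is dropped.
printf '%s\n' \
  'mysql.Bytes_sent 1366399993 1238166682 host=mydb.example.com' \
  'mysql.Slave_running 1366399993 OFF host=mydb.example.com' |
  to_put_lines
```

Saved as its own script, this filter would go in the cron pipeline just before nc.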
ganderson@mydb.example.com:tcollector$ tree
.
|-- collectors
|   |-- 0                        <- run forever
|   |   |-- ifstat.py
|   |   |-- iostat.py
|   |   |-- procnettcp.py
|   |   |-- procstats.py
|   |-- 15                       <- run every 15 seconds
|   |   `-- dfstat.py
|   |-- 30                       <- run every 30 seconds
|   |   |-- mysql_collector.sh
|   |-- 300                      <- run every 5 minutes
|   |   `-- ptTcpModel.sh
|   `-- etc
|       |-- config.py
|-- config
|-- startstop
`-- tcollector.py
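Collectors under `0` are started once and expected to keep printing data points to stdout, which tcollector reads. A rough sketch of such a long-running collector follows. The function names, the metric name `proc.loadavg.1min`, and the `LOADAVG_FILE` and `INTERVAL` variables are all assumptions for illustration, and reading `/proc/loadavg` is Linux-only.

```bash
#!/usr/bin/env bash
# Hypothetical long-running collector for collectors/0/.

# Print one data point: the 1-minute load average.
emit_load() {
  # The first field of /proc/loadavg is the 1-minute load average (Linux).
  read -r load1 rest < "${LOADAVG_FILE:-/proc/loadavg}"
  echo "proc.loadavg.1min $(date +%s) $load1"
}

# $1 = number of samples to emit; 0 means run forever.
run_collector() {
  n=0
  while true; do
    emit_load
    n=$((n + 1))
    if [ "$1" -ne 0 ] && [ "$n" -ge "$1" ]; then
      break
    fi
    sleep "${INTERVAL:-15}"
  done
}
```

Ending the script with `run_collector 0` would keep it alive as a `0`-directory collector; `run_collector 1` emits a single sample and exits, which is closer to what the interval directories expect.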
Querying Data
http://opentsdb.example.com
/#start=2013/04/10-07:32:29
&end=2013/04/10-07:57:57
&m=sum:proc.stat.cpu.percentage_idle{host=db22}
&o=axis x1y1
&m=sum:db.threads_running{host=db22}
&o=axis x1y2
&ylabel=CPU idle
&y2label=Threads Running
&yrange=[0:]
&wxh=1475x600
&png
http://opentsdb.example.com
/q?start=2013/04/10-07:32:29
&end=2013/04/10-07:57:57
&m=sum:proc.stat.cpu.percentage_idle{host=db22}
&o=axis x1y1
&m=sum:db.threads_running{host=db22}
&o=axis x1y2
&ylabel=CPU idle
&y2label=Threads Running
&yrange=[0:]
&ascii
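The text form of the query is easy to script. A small sketch that assembles such a URL follows; `build_ascii_url` is a hypothetical helper name, and the host and metric are the ones used on the slide.

```bash
#!/usr/bin/env bash
# build_ascii_url (hypothetical helper): assemble a /q URL that returns
# plain text instead of a PNG.
# Arguments: base URL, start time, end time, metric query.
build_ascii_url() {
  printf '%s/q?start=%s&end=%s&m=%s&ascii\n' "$1" "$2" "$3" "$4"
}

url=$(build_ascii_url 'http://opentsdb.example.com' \
  '2013/04/10-07:32:29' '2013/04/10-07:57:57' \
  'sum:db.threads_running{host=db22}')
echo "$url"

# To fetch it (not run here): -g stops curl from treating {} as a glob.
#   curl -g -s "$url"
```

Each line of the response has the same shape as the collector output (metric, timestamp, value, tags), so it can be piped straight into awk or similar tools.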
Leveraging OpenTSDB For MySQL
user_statistics monitoring
table_statistics monitoring
Table Info from I_S
SELECT *, DATA_LENGTH+INDEX_LENGTH AS TOTAL_LENGTH
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA NOT IN
('PERFORMANCE_SCHEMA','INFORMATION_SCHEMA')
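One way rows from a query like this could become data points is sketched below. It assumes the query is narrowed to three tab-separated columns (schema, table, total length), for example via `mysql -N -B`. The helper name `tables_to_metrics` and the metric name `mysql.table.total_length` are illustrative choices, not taken from the talk.

```bash
#!/usr/bin/env bash
# tables_to_metrics (hypothetical helper): read tab-separated
# "schema<TAB>table<TAB>total_length" rows on stdin and print one
# OpenTSDB-style line per table. $1 = value for the host tag.
tables_to_metrics() {
  host=$1
  ts=$(date +%s)
  while IFS="$(printf '\t')" read -r schema table total; do
    # Skip rows without a numeric size (e.g. NULL for views).
    case $total in
      '' | *[!0-9]*) continue ;;
    esac
    # Replace characters that OpenTSDB does not accept in tag values.
    schema=$(printf '%s' "$schema" | tr -c 'A-Za-z0-9._/-' '_')
    table=$(printf '%s' "$table" | tr -c 'A-Za-z0-9._/-' '_')
    echo "mysql.table.total_length $ts $total schema=$schema table=$table host=$host"
  done
}

# Demo with made-up rows; the view row with NULL is skipped.
printf 'shop\torders\t1048576\nshop\tv_orders\tNULL\n' |
  tables_to_metrics mydb.example.com
```

Tagging by schema and table keeps a single metric name while still allowing per-table or aggregate graphs.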
Query Throughput
And other “common” metrics
• Various MySQL status counters
• QPS (questions)
• Threads connected
• Temporary tables on disk
• Etc.
• Various server statistics
• %CPU Idle
• Free disk space
• I/O utilization
• Network traffic
• Etc.
Future collectors
• pt-query-digest / MySQL slow query statistics
• Data from “SHOW ENGINE INNODB STATUS” (that is missing from counters)
• PERFORMANCE_SCHEMA (MySQL 5.6+)
• Query statistics
• Processlist information
• Background thread information
How does this change things?
In all seriousness, though...
• Easily see aggregate graphs
• Easily build graphs on-the-fly
• Full granularity forever
• API request for raw data
• Cluster-wide nagios checks with check_tsd
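To illustrate the idea behind a threshold check on queried data, here is a rough stand-in written against the text output format (metric, timestamp, value, tags). This is not check_tsd itself: `check_max` is a hypothetical name, the sample data is made up, and only the Nagios exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) is standard.

```bash
#!/usr/bin/env bash
# check_max (hypothetical stand-in, not check_tsd): read
# "metric timestamp value tags..." lines on stdin and compare the largest
# value against the thresholds. $1 = warning, $2 = critical.
check_max() {
  awk -v warn="$1" -v crit="$2" '
    { v = $3 + 0; if (NR == 1 || v > max) max = v }
    END {
      if (NR == 0)         { print "UNKNOWN: no data";     exit 3 }
      if (max >= crit + 0) { print "CRITICAL: max=" max;   exit 2 }
      if (max >= warn + 0) { print "WARNING: max=" max;    exit 1 }
      print "OK: max=" max
    }'
}

# Demo with made-up data points, thresholds chosen so the check passes.
printf '%s\n' \
  'db.threads_running 1365579149 4 host=db22' \
  'db.threads_running 1365579164 42 host=db22' |
  check_max 50 100
```

Because the query can aggregate across hosts, the same check can watch a whole cluster with one metric query instead of one check per server.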
Challenges Switching
• Aggregates are the default
• Mouse-zooming (patched!)
• Auto-suggest for metrics
• “The graphs aren’t pretty”
• Migrating from proof of concept
• Plan for 3+ machines
• Data pruning may be required
Some Quick Numbers
OpenTSDB @ Box
• 21,294 metrics
• 72 tag keys
• 5,145,745 tag values
• 90% of interactive graphs return in <300 ms
Next Steps
Enjoy #PerconaLive 2013
We’re hiring!
https://www.box.com/about-us/careers/
geoff@box.com


Editor's Notes

  • #2: Will be talking about OpenTSDB, how OpenTSDB changed monitoring at Box, and how we leverage its abilities for day-to-day management of MySQL DBs.
  • #5: You probably have the Percona Cacti graphs and monitoring plugins.
  • #6: You add some other Nagios checks for fun edge cases.
  • #7: And you use different tools from the Percona Toolkit like: Stalk, Poor Man's Profiler (PMP), Query Digest.
  • #8: Suddenly, finding problems and correlating issues is difficult. Maybe you don't have a NOC yet. Maybe you do, and they need better graphs.
  • #11: IT'S BIGGER ON THE INSIDE (just kidding). Fast! Easy to build graphs on the fly. Hella easy to scale: just add nodes (HBase or TSDs). Very easy to put data into it; the next slides talk about this.
  • #18: Running threads follow the CPU spikes PERFECTLY. Box has a “long query” killer that gets more aggressive as more threads stack up. Should get a look at queries on the server.
  • #19: Zoom in to get the exact time interval
  • #20: Know the exact time of a high stack-up. Go check Box Anemometer to see what query is there.
  • #21: This is the URL for that. Can easily paste this to anyone to see the same interactive graph.
  • #22: If you prefer text, that's also an option via the API. You can build cool tools using the API: week-over-week graphs, simplified anomaly detection. The URL is pretty simple: effectively just use “q?” and add “&ascii”.
  • #24: Get audit log: logins, types of statements issued, etc.
  • #25: Get performance information about: row and index change activity, row read activity.
  • #26: Generate daily reports of: are auto-increment columns nearing a boundary on a table? Number of records in a table. Size of a datafile for a table.
  • #27: Using pt-tcp-model. Allows us to identify when a server stops doing work. 5-minute interval.
  • #31: Aggregate graphs are the default. Drill down only when there are problems in the aggregate.
  • #32: Aggregates are the default: a shift in thinking from looking at specific important servers.
    Zooming in on a time slice was painfully manual. I wrote up a patch to add mouse-zooming and upstreamed it. This cemented OpenTSDB as a powerful monitoring tool for Box, overnight.
    Auto-suggest for metrics is spotty. We wrote a quick cron job that dumps the full metric list into JSON.
    “Graphs aren't pretty”: a few changes to the base Gnuplot options solved this. There's also a “Smooth” option in the interface now.
    Migrating from POC: we had a single-node setup for the longest time until that fell over... a lot.
    Plan for 3+ machines: it's enough to run all the needed bits for a lightweight distributed HBase and TSD setup.
    Data pruning: ~4 bytes per metric data point before HDFS replication adds up quickly. mysql_tcollector has 370 metrics, so ~1.5 KB per server per sample; at a 30 s interval that is ~4.2 MB/day. Either have a plan to prune old data, or build out extra capacity and predict storage needs per server/metric added.
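The storage estimate in the last note can be reproduced with plain shell arithmetic. This is only a back-of-the-envelope sketch using the figures quoted there (370 metrics, about 4 bytes per data point, 30-second interval); `daily_bytes` is a hypothetical helper name, and the result is before HDFS replication.

```bash
#!/usr/bin/env bash
# daily_bytes (hypothetical helper): estimate bytes stored per server per day.
# $1 = metrics per server, $2 = bytes per data point, $3 = interval in seconds.
daily_bytes() {
  samples_per_day=$((86400 / $3))
  echo $(($1 * $2 * samples_per_day))
}

per_sample=$((370 * 4))            # bytes written per server per sample
per_day=$(daily_bytes 370 4 30)    # bytes written per server per day
echo "per sample: $per_sample bytes"
echo "per day:    $per_day bytes"
```

With these inputs the per-sample figure is 1,480 bytes (the "~1.5k" in the note) and the daily figure is 4,262,400 bytes, in line with the ~4.2 MB/day quoted. Multiplying by server count and retention period gives a first capacity estimate.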