Cassandra Tools and Distributed Administration
Dr. Jeffrey Berger
Lead Database Engineer
Knewton
1 Introduction
2 Why command-line tools?
3 cassandra-stat
4 cassandra-tracing
5 Ansible ad-hoc commands
© DataStax, All Rights Reserved.
Knewton
Leader in adaptive learning
● Partners with publishers and institutions in Europe, US,
and Asia
● Provides unique recommendations to students based on
previous behavior
● Advanced content ingestion, curation, and calibration
● Runs in AWS with many different storage backends
● Check us out: www.knewton.com/about/careers/
Cassandra at Knewton
Cassandra is the main datastore at Knewton
EU Production, Development, US Production, User Acceptance, QA
Clusters: 5
Nodes: 15
Clusters: 6
Nodes: 69
Clusters: 6
Nodes: 18
Clusters: 6
Nodes: 24
Clusters: 2
Nodes: 6
Total clusters: 25, total nodes: 132
Cassandra Challenges
• Monitoring
– Historical measures are important
• Triage
– Immediate answers in a distributed system
• Provisioning
– Keep configurations consistent
• Scaling
– Elastically scale Cassandra 'out' or 'in'
Solutions as Software
If you magnify your surface area,
magnify your tools
● Easy to use
● Fast and responsive
● Distributed
Why command line tools?
Always consider the operator!
Systems people like the command line!
● Few moving parts
● Local
● Immediate
Why not graphs?
Graphs are great, I love graphs
● Not immediate
● Can be overloaded
● Remote
● Fixed metrics
● Averages rather than values
Why not nodetool?
Nodetool is great...
Until it is time to cook dinner...
Jolokia ( jolokia.org )
Exposes JMX endpoints by HTTP
• Open source (Apache2)
• Lets you script with full access to JMX endpoints
• Agent runs with cassandra
• Lightweight, fast, easy to install
Installing Jolokia is painless
1) Download the Jolokia JVM agent from their site / Maven
2) Add this line to cassandra-env.sh
# added to activate the jolokia agent
JVM_OPTS="$JVM_OPTS -javaagent:/opt/cassandra/jolokia-jvm-agent.jar"
(Or whatever the path is to your Jolokia JVM jar!)
What to do with Jolokia?
Build some monitoring tools!
• Use jconsole to find metrics you are interested in
• Make some programs with your favorite language
• Get the metrics from Jolokia to feed it
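As a rough illustration of the last bullet, the sketch below reads one attribute through Jolokia's HTTP interface using a JSON "read" request. It assumes the JVM agent is reachable on localhost at its default port 8778; the helper names are hypothetical and this is not the cassandra-toolbox implementation.

```python
import json
import urllib.request

# Assumption: the Jolokia JVM agent listens on its default port on this host.
JOLOKIA_URL = "http://localhost:8778/jolokia/"


def build_read_request(mbean, attribute):
    """Build the JSON body for a Jolokia 'read' operation."""
    return {"type": "read", "mbean": mbean, "attribute": attribute}


def extract_value(response):
    """Return the 'value' field of a Jolokia response, or raise on error."""
    if response.get("status") != 200:
        raise RuntimeError("Jolokia error: %s" % response.get("error"))
    return response["value"]


def read_attribute(mbean, attribute, url=JOLOKIA_URL, timeout=5):
    """POST a read request to the agent and return the attribute value."""
    body = json.dumps(build_read_request(mbean, attribute)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return extract_value(json.loads(reply.read().decode("utf-8")))
```

With a running agent, a call such as read_attribute("org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency", "Count") would return the client read counter; that MBean is only an illustrative choice, and jconsole is the place to find the ones you care about.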
Check out the tools we have already made!
cassandra-toolbox
Python package of cassandra tools developed at Knewton
• Pip installable
– pip install cassandra-toolbox
• Open source (Apache2)
• Interacts with C* via Jolokia
• github.com/Knewton/cassandra-toolbox
• 2 scripts right now, more soon
cassandra-stat
A real-time feed of Cassandra operations
Like iostat for Cassandra
• Interacts with Jolokia agent
• Diffs metrics on a configurable time scale
• Overall / Keyspace / CF granularity
• Easy to use, easy to read
cassandra-stat
$ cassandra-stat
Reads  Writes  Reads (99%) ms  Writes (99%) ms  Compactions  Time      ns
1      111     91.462          17.4             0            20:15:36  total
2      113     91.4            17.98            0            20:15:37  total
0      117     91.4            17.17            0            20:15:38  total
0      72      91.4            17.34            0            20:15:39  total
0      69      91.4            17.3             0            20:15:40  total
*Not all fields shown
● Some metrics are summed across CFs and the difference from the last iteration reported
● Some report the maximum value from all CFs
● Some metrics are summed across CFs
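The three aggregation styles described in the annotations can be sketched as follows. This is a minimal illustration rather than cassandra-stat's actual code; the function names and the per-CF sample dictionaries are hypothetical.

```python
def summed(per_cf):
    """Sum a metric across all column families."""
    return sum(per_cf.values())


def maximum(per_cf):
    """Report the largest value of a metric across all column families."""
    return max(per_cf.values())


def summed_diff(per_cf, previous_total):
    """Sum across column families and diff against the previous iteration.

    Returns (total, delta); delta is None on the first iteration, when
    there is no previous total to compare with.
    """
    total = summed(per_cf)
    delta = None if previous_total is None else total - previous_total
    return total, delta


# Hypothetical cumulative read counts per CF at two consecutive polls.
first_poll = {"cf_a": 1000, "cf_b": 250}
second_poll = {"cf_a": 1100, "cf_b": 261}

total, delta = summed_diff(first_poll, None)
total, delta = summed_diff(second_poll, total)
print(delta)  # reads across all CFs during the interval
```

Counters that only ever grow suit the summed-and-diffed style, while percentile latencies are better served by the maximum, since adding percentiles together has no useful meaning.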
cassandra-stat
metrics = [
    {
        "metric_name": "ReadLatency",
        "metric_key": "Count",
        "display_name": "Reads",
        "sum": True,
        "diff": True,
        "nonzero": True
    },
    ...
]
● Metrics are not hardcoded
● Easy to add/remove
● Flexible
○ sum
○ diff
○ nonzero
● Configuration is moving to
a YAML file
cassandra-stat
Benefits:
• Traffic monitoring
– Real time load can be read off easily
• Performance debugging
– All vital metrics are on a single line at each time
• High granularity
– Metrics every second
• Diverse metrics
– Metrics can be configured and read out immediately
cassandra-tracing
Sampling a percentage of all queries is a great tool*
$ nodetool settraceprobability 0.001
But if you have ever queried the CFs in system_traces, you
might be bewildered...
* Don't set this probability too high!
cassandra-tracing
cqlsh:system_traces> SELECT request,parameters FROM sessions LIMIT 4;
request | parameters
--------------------+---------------------------------------
Execute CQL3 query |
{'consistency_level': 'LOCAL_ONE', 'page_size': '5000', 'query': 'SELECT * FROM test2 WHERE
key=''XXXXXXXXXXXXXXXXX''', 'serial_consistency_level': 'SERIAL'}
Execute CQL3 query |
{'consistency_level': 'ONE', 'query': 'select cluster_name from system.local',
'serial_consistency_level': 'SERIAL'}
Execute CQL3 query |
{'consistency_level': 'ONE', 'query': 'select cluster_name from system.local',
'serial_consistency_level': 'SERIAL'}
Execute CQL3 query |
{'consistency_level': 'ONE', 'query': 'SELECT * FROM system.schema_columnfamilies',
'serial_consistency_level': 'SERIAL'}
cassandra-tracing
$ cassandra-tracing `hostname -I `
100% Complete: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX|100
Total skipped due to null duration: 0
Total skipped due to error: 0
175 sessions satisfying criteria.
Showing 100 longest running results.
Session Id Duration(us) Query
UUID 19696 SELECT * FROM system.schema_columnfamilies
UUID 20569 Executing single-partition query on ColumnFamilyA
UUID 20905 SELECT * FROM system.schema_columnfamilies
UUID 21056 Executing single-partition query on ColumnFamilyB
UUID 21397 Executing single-partition query on ColumnFamilyB
UUID 21992 Executing single-partition query on ColumnFamilyC
...
● Longest duration queries shown last
● Session id allows introspection into individual operations in system_traces
*Not all fields shown
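The selection step behind output like this can be sketched in a few lines: skip sessions without a duration, sort by duration, and keep the longest few with the slowest at the bottom. This is a hedged illustration; the tuple layout and function name are assumptions, not the actual cassandra-tracing code.

```python
def longest_sessions(sessions, limit=100):
    """Pick the `limit` longest-running sessions, longest last.

    `sessions` is a list of (session_id, duration_us, request) tuples,
    where duration_us may be None for sessions that never recorded one.
    Returns (selected_sessions, number_skipped_for_null_duration).
    """
    with_duration = [s for s in sessions if s[1] is not None]
    skipped = len(sessions) - len(with_duration)
    ordered = sorted(with_duration, key=lambda s: s[1])
    # Keep the tail of the ascending list so the slowest query prints last.
    start = max(0, len(ordered) - limit)
    return ordered[start:], skipped


# Hypothetical rows standing in for system_traces.sessions.
rows = [
    ("id-1", 21992, "query on ColumnFamilyC"),
    ("id-2", None, "query with no duration"),
    ("id-3", 19696, "SELECT * FROM system.schema_columnfamilies"),
    ("id-4", 20569, "query on ColumnFamilyA"),
]
selected, skipped = longest_sessions(rows, limit=2)
for session_id, duration, request in selected:
    print(session_id, duration, request)
```

Printing the slowest query last keeps it next to the prompt in a terminal, which is where an operator looks first.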
cassandra-tracing
cqlsh:system_traces> select activity,source_elapsed from events WHERE session_id=UUID;
activity | source_elapsed
---------------------------------------------------------------+---------------
Parsing SELECT * FROM system.schema_columnfamilies | 21
Preparing statement | 31
Computing ranges to query | 73
Submitting range requests on 1 ranges with a concurrency of 1 | 88
Submitted 1 concurrent range requests covering 1 ranges | 96
Executing seq scan across 3 sstables for [min(-1), min(-1)] | 382
Read 7 live and 0 tombstone cells | 2057
Read 2 live and 0 tombstone cells | 2495
Read 1 live and 0 tombstone cells | 3066
Read 17 live and 32 tombstone cells | 16892
Read 7 live and 0 tombstone cells | 18757
Scanned 5 rows and matched 5 | 19172
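Programs read this kind of output faster than people do. As a small sketch, the function below totals live and tombstone cells from the activity strings of one session. The regular expression matches the message wording shown on this slide, which is an assumption about the format; other Cassandra versions may phrase the message differently.

```python
import re

# Matches activity text such as "Read 17 live and 32 tombstone cells".
CELLS_RE = re.compile(r"Read (\d+) live and (\d+) tombstone cells")


def count_cells(activities):
    """Return (live, tombstones) summed over all matching activities."""
    live = 0
    tombstones = 0
    for activity in activities:
        match = CELLS_RE.search(activity)
        if match:
            live += int(match.group(1))
            tombstones += int(match.group(2))
    return live, tombstones


# Activities copied from the trace above.
activities = [
    "Executing seq scan across 3 sstables for [min(-1), min(-1)]",
    "Read 7 live and 0 tombstone cells",
    "Read 2 live and 0 tombstone cells",
    "Read 1 live and 0 tombstone cells",
    "Read 17 live and 32 tombstone cells",
    "Read 7 live and 0 tombstone cells",
    "Scanned 5 rows and matched 5",
]
print(count_cells(activities))  # (34, 32)
```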
cassandra-tracing
Benefits:
• High level view of traffic passing through the node
– Does a single query type take a long time?
– Are you hitting a lot of tombstones with a query type?
– Index usage? Timeouts?
• Meaningful introspection
– Isolate the sessions that are interesting cases and
spend your time on the queries driving up your 99.9th percentile.
Ansible (www.ansible.com)
An agentless, open source, ssh-based, configuration
management tool.
We use it for backups / provisioning / distributed commands.
Go check out: Cassandra backups and restorations using Ansible
Joshua Wickman
4:10 PM – 4:45 PM Room 210B
Ad Hoc commands
Ad hoc commands are one-off command line processes
ansible cassandra -i ips.txt -m shell -a "hostname"
● cassandra: name of the IP group to execute on
● -i ips.txt: YAML file of groups of IPs
● -m shell: using the shell module
● -a "hostname": command to execute on the remote host
IP List can be a script that returns the IPs, so it can tie
into any inventory management
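A script-based inventory can be sketched as below. Ansible calls an executable inventory with --list and expects JSON that maps group names to hosts; the group name and addresses here are placeholders, and a real script would query whatever inventory system you use instead of a hard-coded dictionary.

```python
#!/usr/bin/env python
import json
import sys

# Hypothetical hard-coded groups; replace with a lookup against your
# own inventory source (cloud API, CMDB, and so on).
GROUPS = {
    "cassandra": ["192.0.2.1", "192.0.2.2", "192.0.2.3"],
}


def build_inventory(groups):
    """Build the JSON structure Ansible expects from `--list`."""
    inventory = {name: {"hosts": list(hosts)} for name, hosts in groups.items()}
    # An empty hostvars map tells Ansible it need not call --host per node.
    inventory["_meta"] = {"hostvars": {}}
    return inventory


def main(argv):
    """Return the JSON text for the requested inventory mode."""
    if len(argv) == 2 and argv[1] == "--list":
        return json.dumps(build_inventory(GROUPS), indent=2)
    # --host <name>: this sketch defines no per-host variables.
    return json.dumps({})


if __name__ == "__main__":
    print(main(sys.argv))
```

Saved as an executable file, a script like this could be passed to ansible with -i in place of ips.txt.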
Ad Hoc commands
Output looks like:
172.ip.ip.ip | success | rc=0 >>
cassandra-i-962LMNOP
172.ip.ip.ip | success | rc=0 >>
cassandra-i-dbfLMNOP
172.ip.ip.ip | success | rc=0 >>
cassandra-i-450LMNOP
Success or failure of command
Return code of command
Able to be piped through grep or other
processes on your local machine
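Beyond grep, the output is regular enough to parse. The sketch below splits ad hoc output into one record per host, assuming the "host | status | rc=N >>" header format shown above; the function name is hypothetical, and hosts reported in other formats (for example unreachable ones) are not handled.

```python
import re

# Header line such as "172.16.0.1 | success | rc=0 >>".
HEADER_RE = re.compile(r"^([^\s|]+)\s*\|\s*(\w+)\s*\|\s*rc=(\d+)\s*>>\s*$")


def parse_adhoc_output(text):
    """Map each host to its status, return code, and output lines."""
    results = {}
    current = None
    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            current = match.group(1)
            results[current] = {
                "status": match.group(2),
                "rc": int(match.group(3)),
                "output": [],
            }
        elif current is not None:
            # Any non-header line belongs to the most recent host.
            results[current]["output"].append(line)
    return results


# Hypothetical sample in the same shape as the slide's output.
sample = (
    "192.0.2.1| success | rc=0 >>\n"
    "cassandra-node-a\n"
    "192.0.2.2 | success | rc=0 >>\n"
    "cassandra-node-b\n"
)
for host, record in parse_adhoc_output(sample).items():
    print(host, record["rc"], record["output"])
```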
Distributed Arbitrary Commands
function dcmd() {
    # -lt compares numerically; inside [[ ]], < is a string comparison.
    if [[ $# -lt 2 ]]; then
        echo "USAGE: dcmd <GROUP> <SHELL COMMAND>
Ex: dcmd qa-cass 'tail /var/log/cassandra/system.log'"
    else
        # Newer Ansible releases replace --sudo with --become.
        ansible "${1}" -i ips.txt -m shell -a "${2}" --sudo
    fi
}
Make a wrapper function - make it easy on your team!
dcmd = distributed command
Distributed Commands
Benefits:
• Get immediate status on distributed systems
– Output reflects the current state
• Execute operations on all nodes
– If you need to bounce a whole cluster, this is great
• Easy to see differences between node output
– Cassandra is distributed so all nodes might not
agree on the state of the cluster. It can be hard to
find the dissenting node(s).
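Finding the dissenting node(s) can also be handed to a program. The sketch below groups hosts by the output they returned and reports every host outside the largest group. The input dictionary and function names are assumptions for illustration; in practice the mapping would come from the output of a distributed command.

```python
from collections import defaultdict


def group_by_output(outputs):
    """Group hosts by the (whitespace-trimmed) output they returned."""
    groups = defaultdict(list)
    for host, output in outputs.items():
        groups[output.strip()].append(host)
    return dict(groups)


def dissenting_hosts(outputs):
    """Return hosts whose output differs from the most common output.

    If two outputs are equally common, the first one encountered is
    treated as the majority.
    """
    groups = group_by_output(outputs)
    if len(groups) <= 1:
        return []
    majority = max(groups.values(), key=len)
    return sorted(
        host
        for hosts in groups.values()
        if hosts is not majority
        for host in hosts
    )


# Hypothetical per-node results, e.g. a schema version reported by each node.
outputs = {
    "192.0.2.1": "schema-v1\n",
    "192.0.2.2": "schema-v1",
    "192.0.2.3": "schema-v2",
}
print(dissenting_hosts(outputs))  # ['192.0.2.3']
```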
Distributed Nodetool Commands
$ dcmd qa-cass 'nodetool tpstats | egrep "AntiEntropy|Name"'
172.ip.ip.ip | success | rc=0 >>
Pool Name            Active  Pending  Completed  Blocked  All time blocked
AntiEntropyStage     0       0        0          0        0
172.ip.ip.ip | success | rc=0 >>
Pool Name            Active  Pending  Completed  Blocked  All time blocked
AntiEntropyStage     0       0        0          0        0
172.ip.ip.ip | success | rc=0 >>
Pool Name            Active  Pending  Completed  Blocked  All time blocked
AntiEntropySessions  0       0        1536       0        0
AntiEntropyStage     0       0        126720     0        0
Conclusions
● Cassandra exposes a lot of metrics if you know where
to find them - don't be afraid to dig them out!
● Programs can analyze bulk output a lot faster and better
than people - save your time for the things programs
aren't good at.
● Have distributed commands in your arsenal and don't
be afraid to use them.
Thank You

Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C* Summit 2016

  • 1. Cassandra Tools and Distributed Administration Dr. Jeffrey Berger Lead Database Engineer Knewton
  • 2. 1 Introduction 2 Why command-line tools? 3 cassandra-stat 4 cassandra-tracing 5 Ansible ad-hoc commands 2© DataStax, All Rights Reserved.
  • 3. Knewton © DataStax, All Rights Reserved. 3 Leader in adaptive learning ● Partners with publishers and institutions in Europe, US, and Asia ● Provides unique recommendations to students based on previous behavior ● Advanced content ingestion, curation, and calibration ● Runs in AWS with many different storage backends ● Check us out: www.knewton.com/about/careers/
  • 4. Cassandra at Knewton © DataStax, All Rights Reserved. 4 Cassandra is the main datastore at Knewton EU ProductionDevelopment US ProductionUser AcceptanceQA Clusters: 5 Nodes: 15 Clusters: 6 Nodes: 69 Clusters: 6 Nodes: 18 Clusters: 6 Nodes: 24 Clusters: 2 Nodes: 6 Clusters: 25 Nodes: 132
  • 5. Cassandra Challenges © DataStax, All Rights Reserved. 5 • Monitoring – Historical measures are important • Triage – Immediate answers in a distributed system • Provisioning – Keep configurations consistent • Scaling – Elastically scale Cassandra 'out' or 'in'
  • 6. Cassandra Challenges © DataStax, All Rights Reserved. 6 • Monitoring – Historical measures are important • Triage – Immediate answers in a distributed system • Provisioning – Keep configurations consistent • Scaling – Elastically scale Cassandra 'out' or 'in'
  • 7. Solutions as Software © DataStax, All Rights Reserved. 7 If you magnify your surface area, magnify your tools ● Easy to use ● Fast and responsive ● Distributed
  • 8. 1 Introduction 2 Why command-line tools? 3 cassandra-stat 4 cassandra-tracing 5 Ansible ad-hoc commands 8© DataStax, All Rights Reserved.
  • 9. Why command line tools? © DataStax, All Rights Reserved. 9 Always consider the operator! Systems people like the command line! ● Few moving parts ● Local ● Immediate
  • 10. Why not graphs? © DataStax, All Rights Reserved. 10 Graphs are great, I love graphs ● Not immediate ● Can be overloaded ● Remote ● Fixed metrics ● Averages rather than values
  • 11. Why not nodetool? © DataStax, All Rights Reserved. 11 Nodetool is great..
  • 12. Why not nodetool? © DataStax, All Rights Reserved. 12 Until it is time to cook dinner...
  • 13. Jolokia ( jolokia.org ) © DataStax, All Rights Reserved. 13 Exposes JMX endpoints by HTTP • Open source (Apache2) • Lets you script with full access to JMX endpoints • Agent runs with cassandra • Lightweight, fast, easy to install
  • 14. Installing Jolokia is painless © DataStax, All Rights Reserved. 14 2) Add this line to cassandra-env.sh # added to activate the jolokia agent JVM_OPTS="$JVM_OPTS -javaagent:/opt/cassandra/jolokia-jvm-agent.jar" (Or whatever the path is to your Jolokia JVM jar!) 1) Download the Jolokia JVM agent from their site / maven
  • 15. What to do with Jolokia? © DataStax, All Rights Reserved. 15 Build some monitoring tools! • Use jconsole to find metrics you are interested in • Make some programs with your favorite language • Get the metrics from Jolokia to feed it Check out the tools we have already made!
  • 16. cassandra-toolbox © DataStax, All Rights Reserved. 16 Python package of cassandra tools developed at Knewton • Pip installable – pip install cassandra-toolbox • Open source (Apache2) • Interacts with C* via Jolokia • github.com/Knewton/cassandra-toolbox • 2 scripts right now, more soon
  • 17. 1 Introduction 2 Why command-line tools? 3 cassandra-stat 4 cassandra-tracing 5 Ansible ad-hoc commands 17© DataStax, All Rights Reserved.
  • 18. cassandra-stat © DataStax, All Rights Reserved. 18 A real-time feed of Cassandra operations Like iostat for Cassandra • Interacts with Jolokia agent • Diffs metrics on a configurable time scale • Overall / Keyspace / CF granularity • Easy to use, easy to read
  • 19. cassandra-stat © DataStax, All Rights Reserved. 19 $cassandra-stat Reads Writes Reads (99%) ms Writes (99%) ms Compactions Time ns 1 111 91.462 17.4 0 20:15:36 total 2 113 91.4 17.98 0 20:15:37 total 0 117 91.4 17.17 0 20:15:38 total 0 72 91.4 17.34 0 20:15:39 total 0 69 91.4 17.3 0 20:15:40 total *Not all fields shown Some metrics are summed across CFs and the difference from the last iteration reported Some report the maximum value from all CFs Some metrics are summed across CFs
  • 20. cassandra-stat 20 metrics = [ { "metric_name": "ReadLatency", "metric_key": "Count", "display_name": "Reads", "sum": True, "diff": True, "nonzero": True }, ... ● Metrics are not hardcoded ● Easy to add/remove ● Flexible ○ sum ○ diff ○ nonzero ● Configuration is moving to a YAML file
  • 21. cassandra-stat © DataStax, All Rights Reserved. 21 Benefits: • Traffic monitoring – Real time load can be read off easily • Performance debugging – All vital metrics are on a single line at each time • High granularity – Metrics every second • Diverse metrics – Metrics can be configured and read out immediately
  • 22. 1 Introduction 2 Why command-line tools? 3 cassandra-stat 4 cassandra-tracing 5 Ansible ad-hoc commands 22© DataStax, All Rights Reserved.
  • 23. cassandra-tracing © DataStax, All Rights Reserved. 23 Sampling a percent of all queries is a great tool* $nodetool settraceprobability 0.001 But if you ever queried the CFs in system_traces you might be bewildered.. * Don't set this percent too high!
  • 24. cassandra-tracing © DataStax, All Rights Reserved. 24 cqlsh:system_traces> SELECT request,parameters FROM sessions LIMIT 4; request | parameters --------------------+--------------------------------------- Execute CQL3 query | {'consistency_level': 'LOCAL_ONE', 'page_size': '5000', 'query': 'SELECT * FROM test2 WHERE key=''XXXXXXXXXXXXXXXXX''', 'serial_consistency_level': 'SERIAL'} Execute CQL3 query | {'consistency_level': 'ONE', 'query': 'select cluster_name from system.local', 'serial_consistency_level': 'SERIAL'} Execute CQL3 query | {'consistency_level': 'ONE', 'query': 'select cluster_name from system.local', 'serial_consistency_level': 'SERIAL'} Execute CQL3 query | {'consistency_level': 'ONE', 'query': 'SELECT * FROM system.schema_columnfamilies', 'serial_consistency_level': 'SERIAL'}
  • 25. cassandra-tracing © DataStax, All Rights Reserved. 25 cqlsh:system_traces> SELECT request,parameters FROM sessions LIMIT 4; request | parameters --------------------+--------------------------------------- Execute CQL3 query | {'consistency_level': 'LOCAL_ONE', 'page_size': '5000', 'query': 'SELECT * FROM test2 WHERE key=''XXXXXXXXXXXXXXXXX''', 'serial_consistency_level': 'SERIAL'} Execute CQL3 query | {'consistency_level': 'ONE', 'query': 'select cluster_name from system.local', 'serial_consistency_level': 'SERIAL'} Execute CQL3 query | {'consistency_level': 'ONE', 'query': 'select cluster_name from system.local', 'serial_consistency_level': 'SERIAL'} Execute CQL3 query | {'consistency_level': 'ONE', 'query': 'SELECT * FROM system.schema_columnfamilies', 'serial_consistency_level': 'SERIAL'}
  • 26. cassandra-tracing © DataStax, All Rights Reserved. 26 $ cassandra-tracing `hostname -I ` 100% Complete: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX|100 Total skipped due to null duration: 0 Total skipped due to error: 0 175 sessions satisfying criteria. Showing 100 longest running results. Session Id Duration(us) Query UUID 19696 SELECT * FROM system.schema_columnfamilies UUID 20569 Executing single-partition query on ColumnFamilyA UUID 20905 SELECT * FROM system.schema_columnfamilies UUID 21056 Executing single-partition query on ColumnFamilyB UUID 21397 Executing single-partition query on ColumnFamilyB UUID 21992 Executing single-partition query on ColumnFamilyC ... Longest duration queries shown lastSession id allows introspection into individual operations in system_traces *Not all fields shown
  • 27. cassandra-tracing
      cqlsh:system_traces> select activity,source_elapsed from events WHERE session_id=UUID;

       activity                                                      | source_elapsed
      ---------------------------------------------------------------+---------------
       Parsing SELECT * FROM system.schema_columnfamilies            |             21
       Preparing statement                                           |             31
       Computing ranges to query                                     |             73
       Submitting range requests on 1 ranges with a concurrency of 1 |             88
       Submitted 1 concurrent range requests covering 1 ranges       |             96
       Executing seq scan across 3 sstables for [min(-1), min(-1)]   |            382
       Read 7 live and 0 tombstone cells                             |           2057
       Read 2 live and 0 tombstone cells                             |           2495
       Read 1 live and 0 tombstone cells                             |           3066
       Read 17 live and 32 tombstone cells                           |          16892
       Read 7 live and 0 tombstone cells                             |          18757
       Scanned 5 rows and matched 5                                  |          19172
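The trace events above report live and tombstone cell counts per read step, and totalling them is a quick way to see whether a session is tombstone-heavy. The following is a minimal sketch, not part of the cassandra-tracing tool: the function name `tombstone_summary` is hypothetical, and it assumes the activity text arrives on stdin in the "Read N live and M tombstone cells" wording shown on the slide.

```shell
# Sum live and tombstone cell counts from trace-event activity lines on stdin.
# Lines that do not contain "Read N live and M tombstone cells" are ignored.
# tombstone_summary is a hypothetical helper name, not part of any tool.
tombstone_summary() {
  awk '
    /Read [0-9]+ live and [0-9]+ tombstone cells/ {
      for (i = 1; i <= NF; i++) {
        # Locate the "Read <N> live and <M> tombstone" run of fields.
        if ($i == "Read" && $(i + 2) == "live") {
          live += $(i + 1)
          dead += $(i + 4)
        }
      }
    }
    END { printf "live=%d tombstones=%d\n", live, dead }
  '
}
```

Fed the twelve events on the slide, this should report 34 live cells and 32 tombstones, nearly all of the latter from a single read step. One possible way to use it is to pipe the result of a `cqlsh -e` query against `system_traces.events` into the function.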
  • 28. cassandra-tracing
      Benefits:
      • High-level view of traffic passing through the node
        – Does a single query type take a long time?
        – Are you hitting a lot of tombstones with a query type?
        – Index usage? Timeouts?
      • Meaningful introspection
        – Isolate the sessions that are interesting cases and spend your time on the queries driving up your 99.9th-percentile latency.
  • 29. 1 Introduction
      2 Why command-line tools?
      3 cassandra-stat
      4 cassandra-tracing
      5 Ansible ad-hoc commands
  • 30. Ansible (www.ansible.com)
      An agentless, open-source, SSH-based configuration management tool.
      We use it for backups, provisioning, and distributed commands.
      Go check out: "Cassandra backups and restorations using Ansible", Joshua Wickman, 4:10 PM – 4:45 PM, Room 210B
  • 31. Ad Hoc commands
      Ad hoc commands are one-off command-line processes:

      ansible cassandra -i ips.txt -m shell -a "hostname"

      ● cassandra: name of the IP group to execute on
      ● -i ips.txt: YAML file of groups of IPs
      ● -m shell: use the shell module
      ● -a "hostname": command to execute on the remote host
      The IP list can be a script that returns the IPs, so it can tie into any inventory management system.
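Since the IP list can be a script, here is a minimal sketch of what such a script might look like. It follows Ansible's dynamic inventory convention, where the script prints JSON describing its groups when called with `--list`. The function names and the three addresses are placeholders invented for illustration; a real script would query whatever inventory system you use.

```shell
# Hypothetical dynamic inventory exposing one group named "cassandra".
# The addresses below are placeholders, not real hosts.
list_inventory() {
  cat <<'EOF'
{
  "cassandra": { "hosts": ["10.0.0.11", "10.0.0.12", "10.0.0.13"] },
  "_meta": { "hostvars": {} }
}
EOF
}

# Dispatch on the arguments Ansible passes to an inventory script.
inventory_main() {
  case "${1:-}" in
    --list) list_inventory ;;   # all groups and their hosts
    --host) echo '{}' ;;        # no per-host variables in this sketch
    *) echo "usage: inventory --list | --host <name>" >&2; return 1 ;;
  esac
}

# When saved as an executable file, end the file with:
#   inventory_main "$@"
```

Saved as an executable file (say `inventory.sh`, an assumed name), it can stand in for `ips.txt`: `ansible cassandra -i ./inventory.sh -m shell -a "hostname"`.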
  • 32. Ad Hoc commands
      Output looks like:

      172.ip.ip.ip | success | rc=0 >>
      cassandra-i-962LMNOP
      172.ip.ip.ip | success | rc=0 >>
      cassandra-i-dbfLMNOP
      172.ip.ip.ip | success | rc=0 >>
      cassandra-i-450LMNOP

      ● success: success or failure of the command
      ● rc=0: return code of the command
      ● The output can be piped through grep or other processes on your local machine
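Because every host's output starts with a header line carrying the return code, a small local filter can pull out just the hosts where the command failed. This is a sketch under the assumption that headers look like `host | status | rc=N >>`, as on the slide; `failed_hosts` is a hypothetical name, and output lines that happen to mimic the header format would confuse it.

```shell
# Print "host rc=N" for every host header whose return code is non-zero.
# Assumes ad hoc output on stdin with "host | status | rc=N >>" header lines.
failed_hosts() {
  awk -F' *[|] *' '
    / rc=[0-9]+ >>/ {
      rc = $3
      sub(/^rc=/, "", rc)      # drop the "rc=" prefix
      sub(/ *>>.*$/, "", rc)   # drop the trailing ">>" marker
      if (rc + 0 != 0) print $1 " rc=" rc
    }
  '
}
```

For example, `ansible cassandra -i ips.txt -m shell -a "hostname" | failed_hosts` would print nothing when every node succeeds.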
  • 33. Distributed Arbitrary Commands
      Make a wrapper function - make it easy on your team! dcmd = distributed command

      function dcmd(){
        # Require both a group name and a shell command.
        if [[ $# -lt 2 ]]; then
          echo "USAGE: dcmd <GROUP> <SHELL COMMAND>"
          echo "Ex: dcmd qa-cass 'tail /var/log/cassandra/system.log'"
          return 1
        else
          # Run the command on every host in the group with privilege escalation.
          # --become replaces the deprecated --sudo flag.
          ansible "${1}" -i ips.txt -m shell -a "${2}" --become
        fi
      }
  • 34. Distributed Commands
      Benefits:
      • Get immediate status on distributed systems
        – Output reflects the current state
      • Execute operations on all nodes
        – If you need to bounce a whole cluster, this is great
      • Easy to see differences between node output
        – Cassandra is distributed, so the nodes might not all agree on the state of the cluster. It can be hard to find the dissenting node(s).
  • 35. Distributed Nodetool Commands
      $ dcmd qa-cass 'nodetool tpstats | egrep "AntiEntropy|Name"'
      172.ip.ip.ip | success | rc=0 >>
      Pool Name             Active  Pending  Completed  Blocked  All time blocked
      AntiEntropyStage           0        0          0        0                 0
      172.ip.ip.ip | success | rc=0 >>
      Pool Name             Active  Pending  Completed  Blocked  All time blocked
      AntiEntropyStage           0        0          0        0                 0
      172.ip.ip.ip | success | rc=0 >>
      Pool Name             Active  Pending  Completed  Blocked  All time blocked
      AntiEntropySessions        0        0       1536        0                 0
      AntiEntropyStage           0        0     126720        0                 0
  • 36. Conclusions
      ● Cassandra exposes a lot of metrics if you know where to find them - don't be afraid to dig them out!
      ● Programs can analyze bulk output a lot faster and better than people - save your time for the things programs aren't good at.
      ● Have distributed commands in your arsenal and don't be afraid to use them.