Bulk exporting data
from Cassandra
Carlo Cabanilla
@clofresh
Why export?
snapshot
sstable2json
Killing IO on live cluster
sstable2json → sstable2csv, with filters
ionice -c 3
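A minimal sketch of how the export could be wrapped in `ionice -c 3` (the idle IO class) from a script. The helper name `with_idle_io` and the snapshot path are hypothetical, and `sstable2csv` is taken from the slide as-is; its real arguments are not shown in the deck.

```python
import shlex


def with_idle_io(command):
    """Prefix a command so it runs in the idle IO scheduling class."""
    # ionice -c 3 selects the "idle" class: the process only gets disk
    # time when no other process is asking for it.
    return ["ionice", "-c", "3"] + list(command)


# Hypothetical export command and path, for illustration only.
export_cmd = ["sstable2csv", "/path/to/snapshot/Data.db"]
full_cmd = with_idle_io(export_cmd)

# Show the shell-equivalent command line without running anything.
print(shlex.join(full_cmd))
```

The resulting list can be handed to `subprocess.run(full_cmd, check=True)` on a node where both tools are installed.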
Need a place to put it
EBS to the rescue
gzipped
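As a rough illustration of the compression step, here is a sketch that gzips an exported file in streaming fashion using only the Python standard library. The function name `gzip_file` is an assumption; the deck does not say which tool did the compressing.

```python
import gzip
import shutil


def gzip_file(src_path, dst_path=None):
    """Compress src_path to dst_path (default: src_path + '.gz')."""
    dst_path = dst_path or src_path + ".gz"
    # copyfileobj streams in chunks, so the whole export never has to
    # fit in memory at once.
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path
```

Calling `gzip_file("export.csv")` would write `export.csv.gz` next to the original and return that path.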
S3cmd
Need to dedupe
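A small sketch of what deduping could look like, assuming the same row can show up more than once across exported files and that each CSV row is `key,timestamp,value`. That row layout and the function name `dedupe_rows` are assumptions made for illustration; the slides do not show the real schema.

```python
import csv
import io


def dedupe_rows(rows):
    """Keep one row per key, preferring the highest timestamp."""
    latest = {}
    for key, timestamp, value in rows:
        ts = int(timestamp)
        # Replace the stored row only if this one is strictly newer.
        if key not in latest or ts > latest[key][0]:
            latest[key] = (ts, value)
    return [(key, ts, value) for key, (ts, value) in sorted(latest.items())]


# Tiny in-memory sample standing in for exported CSV files.
sample = io.StringIO("a,1,x\na,2,y\nb,1,z\na,2,y\n")
print(dedupe_rows(csv.reader(sample)))
```

This version holds every key in memory, which is fine for a sample but would not scale to a full export.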
Hadoop
numpy pickles
Haderp Mortar Data
numpy pickles → msgpack lz4
gzipped → lzo'd
Haderp file naming!
2010-07-27~org-1018~m-48778.csv-1,316.gz
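To make the structure of that name concrete, here is a sketch that splits it into its visible parts. The group names (`date`, `org`, `m`, `suffix`) are labels chosen here; the slide does not explain what `m` or the trailing `1,316` stand for, so the code does not interpret them.

```python
import re

# Pattern derived only from the single file name shown on the slide.
NAME_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})"
    r"~org-(?P<org>\d+)"
    r"~m-(?P<m>\d+)"
    r"\.csv-(?P<suffix>[^.]+)"
    r"\.gz$"
)


def parse_export_name(name):
    """Return the name's parts as a dict, or None if it does not match."""
    match = NAME_RE.match(name)
    return match.groupdict() if match else None


parts = parse_export_name("2010-07-27~org-1018~m-48778.csv-1,316.gz")
print(parts)
```

Note the comma inside the suffix. Hadoop input paths are commonly passed as comma-separated lists, so a comma in a file name is one plausible source of trouble, though the slide does not say which character caused the problem.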
S3 copy

Bulk Exporting from Cassandra - Carlo Cabanilla