2013-08-20
Dave Latham
 History
 Stats
 How We Store Data
 Challenges
 Mistakes We Made
 Tips / Patterns
 Future
 Moral of the Story
 2008 – Flurry Analytics for Mobile Apps
 Sharded MySQL, or
 HBase!
 Launched on 0.18.1 with a 3 node cluster
 Great community
 Now running 0.94.5 (+ patches)
 2 data centers with 2 clusters each
 Bidirectional replication
 1000 slave nodes per cluster
 32 GB RAM, 4 drives (1 or 2 TB), 1 GigE, dual quad-core * 2 HT = 16 procs
 DataNode, TaskTracker, RegionServer (11 GB), 5 Mappers, 2 Reducers
 ~30 tables, 250k regions, 430TB (after LZO)
 2 big tables are about 90% of that
▪ 1 wide table: 3 CF, 4 billion rows, up to 1MM cells per row
▪ 1 tall table: 1 CF, 1 trillion rows, most 1 cell per row
 12 physical nodes
 5 region servers with 20GB heaps on each
 1 table - 8 billion small rows - 500GB (LZO)
 All in block cache (after 20 minute warmup)
 100k-1MM QPS - 99.9% Reads
 2ms mean, 99% <10ms
 25 ms GC pause every 40 seconds
 slow after compaction
 DAO for Java apps
 Requires:
▪ writeRowIndex / readRowIndex
▪ readKeyValue / writeRowContents
 Provides:
▪ save / delete
▪ streamEntities / pagination
▪ MR input formats on entities (rather than Result)
 Uses HTable or asynchbase
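
A minimal sketch of how such a DAO base class could be shaped. The four required method names and save / delete come from the slide; every signature, the `App` entity, the `page` method, and the in-memory sorted map standing in for an HBase table are assumptions made for illustration, not Flurry's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical entity used only for this illustration. */
class App {
    final String id;   // becomes the row key
    String name;       // stored as a single cell
    App(String id) { this.id = id; }
}

/**
 * Sketch of a DAO base class. A sorted in-memory map (row key -> qualifier -> value)
 * stands in for the table; a real version would sit on HTable or asynchbase
 * and use byte[] keys and values.
 */
abstract class EntityDao<E> {
    private final TreeMap<String, Map<String, String>> table = new TreeMap<>();

    // What a subclass must supply (names from the slide, signatures assumed).
    protected abstract String writeRowIndex(E entity);                  // entity -> row key
    protected abstract E readRowIndex(String rowKey);                   // row key -> entity with key fields set
    protected abstract Map<String, String> writeRowContents(E entity);  // entity -> cells
    protected abstract void readKeyValue(E entity, String qualifier, String value); // apply one cell

    // What the base class provides.
    public void save(E entity) {
        table.put(writeRowIndex(entity), new TreeMap<>(writeRowContents(entity)));
    }

    public void delete(E entity) {
        table.remove(writeRowIndex(entity));
    }

    /** Up to limit entities whose row keys sort strictly after afterRowKey (null = from the start). */
    public List<E> page(String afterRowKey, int limit) {
        Map<String, Map<String, String>> rows =
                afterRowKey == null ? table : table.tailMap(afterRowKey, false);
        List<E> out = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : rows.entrySet()) {
            if (out.size() >= limit) break;
            E entity = readRowIndex(row.getKey());
            for (Map.Entry<String, String> cell : row.getValue().entrySet()) {
                readKeyValue(entity, cell.getKey(), cell.getValue());
            }
            out.add(entity);
        }
        return out;
    }
}

class AppDao extends EntityDao<App> {
    @Override protected String writeRowIndex(App app) { return app.id; }
    @Override protected App readRowIndex(String rowKey) { return new App(rowKey); }
    @Override protected Map<String, String> writeRowContents(App app) {
        Map<String, String> cells = new TreeMap<>();
        cells.put("name", app.name);
        return cells;
    }
    @Override protected void readKeyValue(App app, String qualifier, String value) {
        if (qualifier.equals("name")) app.name = value;
    }

    public static void main(String[] args) {
        AppDao dao = new AppDao();
        App app = new App("a1");
        app.name = "Alpha";
        dao.save(app);
        System.out.println(dao.page(null, 10).get(0).name);
    }
}
```

Pagination here resumes from the last row key of the previous page, which maps naturally onto a scan with a start row.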
 Change row key format
 DAO supports both formats
1. Create new table
2. Writes to both
3. Migrate existing
4. Validate
5. Reads to new table
6. Write to (only) new table
7. Drop old table
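
The seven steps above can be sketched as a small state machine. This is an illustration under stated assumptions: two in-memory maps stand in for the old and new tables, and the key formats ("user:app" becoming "app:user"), the phase names, and all method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

class DualWriteMigration {
    enum Phase { OLD_ONLY, DUAL_WRITE, READ_NEW, NEW_ONLY }

    final Map<String, String> oldTable = new TreeMap<>();
    final Map<String, String> newTable = new TreeMap<>();   // step 1: create new table
    Phase phase = Phase.OLD_ONLY;

    // Hypothetical row key formats; ids are assumed not to contain ':'.
    static String oldKey(String app, String user) { return user + ":" + app; }
    static String newKey(String app, String user) { return app + ":" + user; }

    /** Rewrites an old-format row key into the new format. */
    static String convert(String oldRowKey) {
        String[] parts = oldRowKey.split(":", 2);   // [user, app]
        return newKey(parts[1], parts[0]);
    }

    /** Steps 2 and 6: which tables receive a write depends on the phase. */
    void put(String app, String user, String value) {
        if (phase != Phase.NEW_ONLY) oldTable.put(oldKey(app, user), value);
        if (phase != Phase.OLD_ONLY) newTable.put(newKey(app, user), value);
    }

    /** Step 5: reads move to the new table once it has been validated. */
    String get(String app, String user) {
        boolean readNew = phase == Phase.READ_NEW || phase == Phase.NEW_ONLY;
        return readNew ? newTable.get(newKey(app, user)) : oldTable.get(oldKey(app, user));
    }

    /** Step 3: copy rows written before dual writes began, without clobbering dual-written rows. */
    void migrateExisting() {
        for (Map.Entry<String, String> row : oldTable.entrySet()) {
            newTable.putIfAbsent(convert(row.getKey()), row.getValue());
        }
    }

    /** Step 4: every old row must exist under its new key with the same value. */
    boolean validate() {
        if (oldTable.size() != newTable.size()) return false;
        for (Map.Entry<String, String> row : oldTable.entrySet()) {
            if (!row.getValue().equals(newTable.get(convert(row.getKey())))) return false;
        }
        return true;
    }

    /** Step 7. */
    void dropOldTable() { oldTable.clear(); }

    public static void main(String[] args) {
        DualWriteMigration m = new DualWriteMigration();
        m.put("app1", "u1", "v1");
        m.phase = Phase.DUAL_WRITE;
        m.migrateExisting();
        System.out.println(m.validate());
    }
}
```

Keeping writes flowing to the old table until reads have moved leaves a rollback path open at every step before the drop.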
 Bottlenecks (not horizontally scalable)
 HMaster (e.g. HLog cleaning falls behind creation [HBASE-9208])
 NameNode
▪ Disable table / shutdown => many HDFS files at once
▪ Scan table directory => slow region assignments
 ZooKeeper (HBase replication)
 JobTracker (heap)
 META region
 Too many regions (250k)
 Max size 256M -> 1 GB -> 5 GB
 Slow reassignments on failure
 Slow hbck recovery
 Lots of META queries / big client cache
▪ Soft refs can exacerbate
 Slow rolling restarts
 More failures (Common and otherwise)
 Zombie RS
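
A rough back-of-the-envelope sketch of why the maximum region size matters, using the 430 TB and 250k-region figures from the stats slide. It assumes binary units (1 TB = 2^40 bytes) and that every region is filled exactly to the maximum, which real clusters do not achieve, so the results are lower bounds rather than measurements. The class and method names are hypothetical.

```java
class RegionCountEstimate {
    static final long MIB = 1024L * 1024L;
    static final long GIB = 1024L * MIB;
    static final long TIB = 1024L * GIB;

    /** Fewest regions that can hold totalBytes if each region is filled to maxRegionBytes. */
    static long minRegions(long totalBytes, long maxRegionBytes) {
        return (totalBytes + maxRegionBytes - 1) / maxRegionBytes;   // ceiling division
    }

    public static void main(String[] args) {
        long total = 430 * TIB;   // assumed binary interpretation of "430TB"
        System.out.println("256 MB max: " + minRegions(total, 256 * MIB) + " regions");
        System.out.println("1 GB max:   " + minRegions(total, GIB) + " regions");
        System.out.println("5 GB max:   " + minRegions(total, 5 * GIB) + " regions");
        // Average region size implied by 250k regions, in GiB.
        System.out.println("avg at 250k regions: " + (double) total / 250_000 / GIB + " GiB");
    }
}
```

Under these assumptions the floor drops from about 1.76 million regions at 256 MB to about 88 thousand at 5 GB, and 250k regions implies an average of roughly 1.76 GiB per region.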
 Latency long tail
 HTable Flush write buffer
 GC pauses
 RegionServer failure
 (See The Tail at Scale – Jeff Dean, Luiz André Barroso)
 Shared cluster for MapReduce and live queries
 IO bound requests hog handler threads
 Even cached reads get slow
 RegionServer falls behind, stays behind
 If the cluster goes down, it takes a while to come back
 HDFS-5042 Completed files lost after power failure
 ZOOKEEPER-1277 servers stop serving when lower 32 bits of zxid roll over
 ZOOKEEPER-1731 Unsynchronized access to ServerCnxnFactory.connectionBeans results in deadlock
 Small region size -> many regions
 Nagle’s algorithm
 Trying to solve a crisis you don’t understand (hbck fixSplitParents)
 Setting up replication
 Custom backup / restore
 CopyTable OOM
 Verification
 Compact data matters (even with compression)
 Block cache, network not compressed
 Avoid random reads on non-cached tables (duh!)
 Write cell fragments, combine at read time to avoid doing random reads
 compact later - coprocessor?
 can lead to large rows
▪ probabilistic counter
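
A minimal sketch of the cell-fragment pattern applied to a simple counter, assuming one new cell per update and a sum at read time. An in-memory map stands in for the table, and the qualifier naming, the `compact` method, and the class itself are hypothetical; the slide does not say how fragments are named or when they are merged.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class FragmentCounter {
    // row key -> (qualifier -> delta); stands in for one column family of a table
    private final Map<String, TreeMap<String, Long>> rows = new HashMap<>();
    private long nextFragmentId = 0;

    /** Blind write: adds a new cell and never reads the current value first. */
    void increment(String rowKey, long delta) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .put("frag-" + nextFragmentId++, delta);
    }

    /** Combine at read time: the counter value is the sum of all fragments in the row. */
    long read(String rowKey) {
        long total = 0;
        for (long delta : rows.getOrDefault(rowKey, new TreeMap<>()).values()) {
            total += delta;
        }
        return total;
    }

    /** Later compaction: collapse all fragments into one cell to keep the row small. */
    void compact(String rowKey) {
        TreeMap<String, Long> cells = rows.get(rowKey);
        if (cells == null) return;
        long total = read(rowKey);
        cells.clear();
        cells.put("total", total);
    }

    int cellCount(String rowKey) {
        TreeMap<String, Long> cells = rows.get(rowKey);
        return cells == null ? 0 : cells.size();
    }

    public static void main(String[] args) {
        FragmentCounter counter = new FragmentCounter();
        counter.increment("app1", 5);
        counter.increment("app1", 7);
        System.out.println(counter.read("app1") + " from " + counter.cellCount("app1") + " cells");
        counter.compact("app1");
        System.out.println(counter.read("app1") + " from " + counter.cellCount("app1") + " cells");
    }
}
```

The trade-off is the one the slide names: writes need no prior read, but rows grow until something merges the fragments.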
 HDFS HA
 Snapshots (see how it works with 100k regions on 1000 servers)
 2000 node clusters
 test those bottlenecks
 larger regions, larger HDFS blocks, larger HLogs
 More (independent) clusters
 Load aware balancing?
 Separate RPC priorities for workloads
 0.96
 Scaled 1000x and more on the same DB
 If you’re on the edge you need to understand your system
 Monitor
 Open Source
 Load test
 Know your load
 Disk or Cache (or SSDs?)
 And maybe some answers

HBase at Flurry
