past and present...
gord[at]live.ca
@gord_chung
v4 features
□ simplified scheduling
□ less pandas, more numpy
□ Redis incoming driver
□ In-memory incoming Ceph driver
□ Other general features:
■ http://gnocchi.xyz/releasenotes/4.0.html
■ http://gnocchi.xyz/releasenotes/unreleased.html
scheduling
incoming data is sharded into sacks to allow simple division of work across metricd workers
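A minimal sketch of the sack idea, not Gnocchi's actual scheduler: it assumes a metric maps to a sack by taking its UUID modulo a fixed sack count, and that sacks are split across workers round-robin. `NUM_SACKS`, `sack_for_metric`, `sacks_for_worker` and `group_by_sack` are hypothetical names chosen for illustration.

```python
import uuid
from collections import defaultdict

NUM_SACKS = 128  # assumed sack count, for illustration only


def sack_for_metric(metric_id, num_sacks=NUM_SACKS):
    """Map a metric UUID to a sack number in [0, num_sacks)."""
    return metric_id.int % num_sacks


def sacks_for_worker(worker_index, num_workers, num_sacks=NUM_SACKS):
    """Give a worker every num_workers-th sack (simple round-robin split)."""
    return list(range(worker_index, num_sacks, num_workers))


def group_by_sack(metric_ids, num_sacks=NUM_SACKS):
    """Bucket incoming metric IDs by the sack they fall into."""
    sacks = defaultdict(list)
    for metric_id in metric_ids:
        sacks[sack_for_metric(metric_id, num_sacks)].append(metric_id)
    return sacks
```

With this split each worker only needs to look at its own sacks, so no two workers compete for the same metric.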
numpy
old: Pandas - a monolithic, all-in-one data analysis toolkit
new: NumPy - a lightweight, high-performance N-dimensional array (and a bit more) library
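To illustrate what "less pandas, more numpy" can look like, here is a hedged sketch of a per-bucket mean written with NumPy alone. It is not Gnocchi's code; `resample_mean` is a hypothetical helper, and timestamps are assumed to be integer seconds.

```python
import numpy as np


def resample_mean(timestamps, values, granularity):
    """Mean of values per granularity-second bucket, using only NumPy."""
    timestamps = np.asarray(timestamps, dtype=np.int64)
    values = np.asarray(values, dtype=np.float64)
    # Sort by time so that each bucket is one contiguous run.
    order = np.argsort(timestamps, kind="stable")
    timestamps, values = timestamps[order], values[order]
    # Floor every timestamp to the start of its bucket.
    buckets = timestamps - (timestamps % granularity)
    # starts[i] is the index of the first point in bucket keys[i].
    keys, starts, counts = np.unique(
        buckets, return_index=True, return_counts=True)
    # Sum each contiguous run, then divide by its length.
    sums = np.add.reduceat(values, starts)
    return keys, sums / counts


keys, means = resample_mean([0, 10, 59, 60, 125], [1, 2, 3, 4, 5], 60)
print(keys.tolist(), means.tolist())
```

The grouping is done with one sort, one `np.unique` and one `np.add.reduceat`, with no DataFrame or index objects involved.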
in-memory
the memory is mightier.
leverage the Redis driver or LevelDB/RocksDB internals for Ceph
benchmarks
back with another one of those block rockin’ beats
environment
v2 & v3
node1
- OpenStack controller node
- Ceph Monitor Service
- Redis (coordination)
node2
- OpenStack Compute Node
- Ceph OSD node (10 OSDs + SSD Journal)
- 18 metricd (24 in v2)
node3
- Gnocchi API (32 workers)
- Ceph OSD node (10 OSDs + SSD Journal)
- 18 metricd (24 in v2)
node4
- OpenStack Compute Node
- Ceph OSD node (10 OSDs + SSD Journal)
- PostgreSQL (
- 18 metricd (24 in v2)
v4.x
node1
- OpenStack controller node
- Ceph Monitor Service
- Redis
- MySQL
node2
- OpenStack Compute Node
- Ceph OSD node (10 OSDs + SSD Journal)
node3
- OpenStack Compute Node
- Ceph OSD node (10 OSDs + SSD Journal)
- Gnocchi API (32 workers)
- 18 metricd
all nodes are physical servers:
- 24CPU (48 hyperthreaded)
- 256GB memory
- 10K disks
- 1GB network
- CentOS 7.1
fewer services and less hardware when running v4. all gnocchi services on a single node.
all tests use Ceph as a storage driver for aggregates.
data generated using the benchmark tool in the client (modified to use threads). 4 clients w/ 12 threads running simultaneously.
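A rough sketch of a threaded load generator in the spirit of the setup described above. It is not the benchmark tool itself: `run_client` and `fake_post` are hypothetical, no HTTP request is made, and reading "4 clients w/ 12 threads" as 12 threads per client is an assumption.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

THREADS_PER_CLIENT = 12  # assumed: 12 threads in each of the 4 clients


def run_client(post_measure, metric_ids, points_per_metric,
               threads=THREADS_PER_CLIENT):
    """Call post_measure(metric_id, point_index) once per point, in parallel.

    post_measure is a hypothetical callable standing in for one API request.
    Returns the number of calls made.
    """
    jobs = [(m, i) for m in metric_ids for i in range(points_per_metric)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # list() waits for completion and re-raises any worker exception.
        list(pool.map(lambda job: post_measure(*job), jobs))
    return len(jobs)


sent = []
sent_lock = threading.Lock()


def fake_post(metric_id, point_index):
    # Stand-in for an HTTP POST; it only records the call.
    with sent_lock:
        sent.append((metric_id, point_index))


calls = run_client(fake_post, ["metric-a", "metric-b"], points_per_metric=60)
print(calls)  # 2 metrics x 60 points = 120 calls
```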
write throughput
chart: total datapoints written per second (higher is better)
write throughput
chart: number of requests made per second (higher is better)
test case 1
1K resources, 20 metrics each. flood Gnocchi with 60 individual points per metric. 1.2M calls/run. run it a few times.
post time
chart: time to POST 1.2M individual measures for 20K metrics to Gnocchi
v3.1 had an anomaly that caused degradation over time.
processing time
chart: time to aggregate all measures according to policy (lower is better)
v4 tests use 18 metricd; v3 test uses 54 metricd.
v4 only comparison
processing time
processing time
chart: number of recorded, unprocessed measures over a single run
poor scheduling logic resulted in inefficient handling of many tiny objects in v3.
processing time
chart: number of recorded, unprocessed measures over a single run
backlog size is dependent on both the API’s ability to write data and metricd’s ability to process it.
test case 2
1K resources, 20 metrics each. flood Gnocchi with 60 batched points per metric. 20K calls/run. run it a few times.
processing time
chart: time to aggregate all measures according to policy (lower is better)
v4 tests use 18 metricd for 3x8 aggregates/metric; v2 and v3 tests use 72 and 54 metricd respectively.
aggregation time
chart: time to aggregate 60 measures of a metric into 3x8 aggregates (lower is better)
average time reflects a combination of scheduling efficiency, computation efficiency and IO performance.
test case 3
500 resources, 20 metrics each. flood Gnocchi with 720 batched points per metric. 10K calls/run. run it a few times.
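The call counts quoted for the three test cases follow from simple arithmetic, sketched here. The `workload` helper is hypothetical, and it assumes a batched call carries all points for one metric; the total point counts are derived from the slide figures rather than stated in the deck.

```python
def workload(resources, metrics_per_resource, points_per_metric, batched):
    """Return metric, point and API call counts for one benchmark run."""
    metrics = resources * metrics_per_resource
    points = metrics * points_per_metric
    # Unbatched: one call per point. Batched (assumed): one call per metric.
    calls = metrics if batched else points
    return {"metrics": metrics, "points": points, "calls": calls}


cases = {
    "test case 1": workload(1000, 20, 60, batched=False),
    "test case 2": workload(1000, 20, 60, batched=True),
    "test case 3": workload(500, 20, 720, batched=True),
}

for name, numbers in cases.items():
    print(name, numbers)
```

Test cases 1 and 2 push the same number of points, so the difference between them isolates the cost of per-call overhead.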
processing time
chart: time to aggregate all measures according to policy (lower is better)
v4 tests use 18 metricd for 3x8 aggregates/metric. v2 and v3 tests use 72 metricd.
aggregation time
chart: time to aggregate 720 measures of a metric into 3x8 aggregates (lower is better)
computation efficiency improved for larger series. ~3x improvement for 60 points and ~6x improvement for 720 points.
some more numbers
peep this...
processing time
chart: time to aggregate metric with varying unbatched measure sizes (lower is better)
numbers represent optimal performance. benchmark was taken under zero load.
query time
chart: time to retrieve a single time series using curl and client (lower is better)
client overhead is attributable to, but not limited to, formatting.
no significant performance difference vs v3.
default configurations
chart: time to aggregate all measures according to default ‘medium’ policy (lower is better)
v3 tests use 54 metricd.
v4 tests use 18 metricd.
- v3 medium policy:
  - minute/hourly/daily rollups
  - 8 aggregates each
- v4 medium policy:
  - minute/hourly rollups
  - 6 aggregates each
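As a quick check on how much work each default policy implies, the sketch below multiplies rollups by aggregates. It assumes every aggregate at every rollup is a separately computed series; `MEDIUM_POLICY` and `series_per_metric` are hypothetical names, and the numbers are taken from the list above.

```python
MEDIUM_POLICY = {
    "v3": {"rollups": ("minute", "hourly", "daily"), "aggregates_per_rollup": 8},
    "v4": {"rollups": ("minute", "hourly"), "aggregates_per_rollup": 6},
}


def series_per_metric(policy):
    """Number of aggregated series kept per metric under a policy."""
    return len(policy["rollups"]) * policy["aggregates_per_rollup"]


for version, policy in MEDIUM_POLICY.items():
    print(version, series_per_metric(policy))
```

Under that assumption the v4 default computes half as many series per metric as the v3 default, so part of the gap in this comparison comes from the policy change rather than from the engine.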
thanks!
Any questions?
You can find me at
@gord_chung
gord[at]live.ca
Credits
Special thanks to all the people who made and released these awesome resources for free:
□ Presentation template by SlidesCarnival


Gnocchi v4 - past and present
