Sam Dillard, Sales Engineer
Optimizing InfluxDB Performance
© 2017 InfluxData. All rights reserved.
Agenda
❖ Optimizing
  ➢ Hardware/Architecture
  ➢ Write Method
  ➢ Schema
❖ Q&A
Resource Utilization
• No specific OS tuning required
• IOPS, IOPS, IOPS
• Aim for about 70% CPU/memory utilization; you need headroom for:
  – Peak periods
  – Compactions
  – Backfilling data
Telegraf
• Lightweight; written in Go
• Plug-in driven
• Optimized for writing to InfluxDB
  – Formatting
  – Retries
  – Modifiable batch sizes and jitter
  – Tag sorting
• Preprocessing
  – Converting tags to fields, fields to tags
  – Regex transformations
  – Renaming measurements, tags
  – Aggregations (mean, min, max, count, variance, stddev, etc.)
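To make "modifiable batch sizes and jitter" concrete, here is a minimal Python sketch of the idea. It is not Telegraf code: in Telegraf the corresponding agent settings are `metric_batch_size`, `flush_interval` and `flush_jitter`, while the helper names and the sample points below are assumptions made up for illustration.

```python
import random

def make_batches(points, batch_size):
    """Split a list of points into lists of at most batch_size items."""
    return [points[i:i + batch_size] for i in range(0, len(points), batch_size)]

def jittered_delay(interval_s, jitter_s, rng=random):
    """Base flush interval plus a random offset in [0, jitter_s]."""
    return interval_s + rng.uniform(0.0, jitter_s)

# Hypothetical workload: 2,500 points, written in batches of 1,000.
points = [f"cpu,host=server-{i} usage_user={i}.0" for i in range(2500)]
batches = make_batches(points, batch_size=1000)

# Each flush waits 10s plus up to 5s of jitter, so many agents
# do not all hit the database at the same instant.
delays = [jittered_delay(10.0, 5.0) for _ in batches]
print([len(b) for b in batches])
```

Larger batches mean fewer HTTP requests per point, and jitter spreads the writes of many agents over time instead of stacking them on the same second.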
Popular Plugins
Out-of-the-box:
• Kubernetes (kubelet)
• Kube_inventory (apiserver)
• Kafka (consumer)
• RabbitMQ (consumer)
• AMQP (mq metadata)
• Redis
• Nginx
• HAproxy
• Jolokia2
Custom:
• HTTP/socket listener
• HTTP (formatted endpoints)
• Prometheus (/metrics)
• Exec
• StatsD
Telegraf
[Diagram: Telegraf pipeline. Inputs (CPU, Mem, Disk, Docker, Kubernetes, /metrics, Kafka, MySQL, CloudWatch) → Process (transform, decorate, filter) → Aggregate (mean, min/max, count, variance, stddev) → Outputs (File, InfluxDB, Kafka, CloudWatch)]
Parsing
● JSON
● CSV
● Graphite
● CollectD
● Dropwizard
● Form URL-encoded
● Grok
[Diagram: many Telegraf agents, a message queue (Kafka, Rabbit, Active, NSQ, AWS Kinesis, Google PubSub, MQTT), a further Telegraf instance, and InfluxDB]
Balanced ingestion helps....
[Figures: two slides, each contrasting a "Good..." example with a "Not so good..." example]
Schema Design Goals
• By reducing...
– Measurement/tag cardinality
– Information-encoding
– Key lengths
• You increase…
– Write performance
– Query performance
– Readability
“It’s a feature, not a bug...but features require thinking”
- Richard Laskey, Wayfair
Line Protocol && Schema Insight
<measurement>,<tag set> <field set> <timestamp>
● A Measurement is a namespace for like metrics (similar to a SQL table)
● What to make a Measurement?
  ○ Logically alike metrics; categorization
  ○ E.g., CPU has many metrics associated with it
  ○ E.g., Transactions
    ■ “usd”, “response_time”, “duration_ms”, “timeout”, whatever else…
● What to make a Tag?
  ○ Metadata; “things you need to `GROUP BY`”
● What to make a Field?
  ○ Actual metrics
    ■ Metrics you will visualize or operate on
  ○ Things that have high value variance...that you don’t need to group
Line Protocol Goals
1) Don’t encode data into Measurements or Tags; a telltale sign is a valueless key name (value, counter, gauge)
2) Write as many Fields per Line as you can; #1 allows for #2
3) Separate information into primitives; this reduces regex grouping
4) Order Tags lexicographically
(Telegraf does all this for you, for the most part)
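If you write your own client instead of using Telegraf, goals 2 and 4 can be sketched in a few lines of Python. This is a simplified illustration, not a complete line-protocol encoder: the helper `to_line` is hypothetical, it assumes float fields only, and it skips the escaping that real keys and values may need. The `usage_system` value is invented for the example.

```python
def to_line(measurement, tags, fields, timestamp_ns):
    """Build one line-protocol line: tags sorted by key, all fields on one line."""
    # Goal 4: order tags lexicographically by key.
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    # Goal 2: put every field for this series and timestamp on the same line.
    field_part = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"

line = to_line(
    "cpu",
    {"region": "us-west", "host": "server-5"},      # deliberately unsorted
    {"usage_user": 20.0, "usage_system": 5.0},       # illustrative values
    1444234982000000000,
)
print(line)
# cpu,host=server-5,region=us-west usage_user=20.0,usage_system=5.0 1444234982000000000
```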
DON'T ENCODE DATA INTO THE MEASUREMENT NAME
Measurement names like:
cpu.server-5.us-west.usage_user value=20.0 1444234982000000000
cpu.server-6.us-west.usage_user value=40.0 1444234982000000000
mem.server-6.us-west.free value=25.0 1444234982000000000
Encode that information as tags:
cpu,host=server-5,region=us-west usage_user=20.0 1444234982000000000
cpu,host=server-6,region=us-west usage_user=40.0 1444234982000000000
mem,host=server-6,region=us-west mem_free=25.0 1444234982000000000
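The rewrite on this slide is mechanical enough to script. The Python sketch below assumes every measurement name has exactly four dot-separated parts (measurement, host, region, metric) and a single `value` field; the function name `dotted_to_tagged` is made up for illustration.

```python
def dotted_to_tagged(line):
    """Rewrite 'meas.host.region.metric value=V TS' into tagged line protocol."""
    name, field, timestamp = line.split(" ")
    # Assumption: the name always has these four positions.
    measurement, host, region, metric = name.split(".")
    value = field.split("=", 1)[1]
    # The metric name becomes the field key, replacing the generic "value".
    return f"{measurement},host={host},region={region} {metric}={value} {timestamp}"

print(dotted_to_tagged("cpu.server-5.us-west.usage_user value=20.0 1444234982000000000"))
# cpu,host=server-5,region=us-west usage_user=20.0 1444234982000000000
```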
DON’T OVERLOAD TAGS (separate into primitives)
BAD:
cpu,server=localhost.us-west.usage_user value=2.0 1444234982000000000
cpu,server=localhost.us-east.usage_system value=3.0 1444234982000000000
GOOD: Separate out into different tags:
cpu,host=localhost,region=us-west usage_user=2.0 1444234982000000000
cpu,host=localhost,region=us-east usage_system=3.0 1444234982000000000
Use Telegraf as a Graphite parser
Graphite like:
cpu.usage.eu-west.idle.percentage 100
With a Telegraf configuration like:
[[inputs.http_listener_v2]]
  data_format = "graphite"
  separator = "_"
  templates = [
    "measurement.measurement.region.field*"
  ]
Results in the following transformation:
cpu_usage,region=eu-west idle_percentage=100
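To see how the template maps positions in the Graphite name onto the measurement, tags and field, here is a small Python imitation. It is a sketch of the idea only, not Telegraf's parser: `apply_template` is a hypothetical helper that handles just the `measurement`, `field`, `field*` and tag keywords and ignores filters, default tags and timestamps. The region in the output is `eu-west`, taken directly from the input.

```python
def apply_template(template, metric, separator="_"):
    """Apply a Graphite-style template to a 'name value' metric string."""
    name, value = metric.split(" ")
    keys = template.split(".")
    parts = name.split(".")
    measurement, field, tags = [], [], {}
    for i, key in enumerate(keys):
        if key == "field*":
            field.extend(parts[i:])      # greedy: the rest of the name is the field key
            break
        if key == "measurement":
            measurement.append(parts[i])
        elif key == "field":
            field.append(parts[i])
        else:
            tags[key] = parts[i]         # any other keyword is treated as a tag key
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    return f"{separator.join(measurement)}{tag_part} {separator.join(field)}={value}"

print(apply_template("measurement.measurement.region.field*",
                     "cpu.usage.eu-west.idle.percentage 100"))
# cpu_usage,region=eu-west idle_percentage=100
```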
stock_prices,symbol=BP price=25.0 1
stock_prices,symbol=CVX price=35.0 1
stock_prices,symbol=XOM price=45.0 1
stock_prices,symbol=XOM open=25.0,high=45.0,low=20.0,close=35.0 1
stock_prices,symbol=XOM open=20.0,high=40.0,low=20.0,close=40.0 2
Also smaller payloads:
From:
cpu,region=us-west-1,host=hostA,container=containerA usage_user=35.0 <timestamp>
cpu,region=us-west-1,host=hostA,container=containerA usage_system=15.0 <timestamp>
cpu,region=us-west-1,host=hostA,container=containerA usage_guest=0.0 <timestamp>
cpu,region=us-west-1,host=hostA,container=containerA usage_guest_nice=0.0 <timestamp>
cpu,region=us-west-1,host=hostA,container=containerA usage_idle=35.0 <timestamp>
cpu,region=us-west-1,host=hostA,container=containerA usage_iowait=0.2 <timestamp>
cpu,region=us-west-1,host=hostA,container=containerA usage_irq=0.0 <timestamp>
cpu,region=us-west-1,host=hostA,container=containerA usage_nice=1.0 <timestamp>
cpu,region=us-west-1,host=hostA,container=containerA usage_steal=2.0 <timestamp>
cpu,region=us-west-1,host=hostA,container=containerA usage_softirq=2.5 <timestamp>
To:
cpu,region=us-west-1,host=hostA,container=containerA usage_user=35.0,usage_system=15.0,usage_guest=0.0,usage_guest_nice=0.0,usage_idle=35.0,usage_iowait=0.2,usage_irq=0.0,usage_nice=1.0,usage_steal=2.0,usage_softirq=2.5 <timestamp>
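The size difference is easy to measure. The Python sketch below builds both payloads from the field values on this slide and compares their byte counts; the timestamp is a hypothetical stand-in for `<timestamp>`.

```python
tags = "cpu,region=us-west-1,host=hostA,container=containerA"
fields = {
    "usage_user": 35.0, "usage_system": 15.0, "usage_guest": 0.0,
    "usage_guest_nice": 0.0, "usage_idle": 35.0, "usage_iowait": 0.2,
    "usage_irq": 0.0, "usage_nice": 1.0, "usage_steal": 2.0,
    "usage_softirq": 2.5,
}
ts = 1444234982000000000  # hypothetical timestamp, not from the slide

# "From": one line per field, repeating the measurement, tag set and timestamp.
separate = "\n".join(f"{tags} {k}={v} {ts}" for k, v in fields.items())

# "To": one line carrying every field.
combined = f"{tags} " + ",".join(f"{k}={v}" for k, v in fields.items()) + f" {ts}"

separate_bytes = len(separate.encode("utf-8"))
combined_bytes = len(combined.encode("utf-8"))
print(separate_bytes, combined_bytes)
```

The single-line form sends the measurement, tag set and timestamp once instead of ten times, which is where the savings come from.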
sam@influxdata.com @SDillard12
THANKS!
Agenda
❖ Optimizing
  ➢ Hardware/Architecture
  ➢ Write Method
  ➢ Schema
  ➢ Queries
  ➢ Configuration
❖ Q&A
Queries and Shards
• Shard durations should be longer than your longest typical query
• When thinking about balancing writes/reads:
                            High Query Load      Low Query Load
  High Write Load           Balanced duration    Shorter duration
  Bursty or Low Write Load  Longer duration      Balanced duration
Query Performance
• Streaming functions > batch functions
  • Batch funcs
    • percentile(), spread(), stddev(), median(), mode(), holt_winters()
  • Stream funcs
    • mean(), bottom(), first(), last(), max(), top(), count(), etc.
• Distributed functions (clusters only) > local functions
  • Distributed
    • first(), last(), max(), min(), count(), mean(), sum()
  • Local
    • percentile(), derivative(), spread(), top(), bottom(), elapsed(), etc.
Query Performance
• Boundaries!
  • Time-bounding and series-bounding with `WHERE` clause
  • `SELECT *` generally not a best practice
• Agg functions instead of raw queries
  • `SELECT mean(<field>)` > `SELECT <field>`
• Reduce the number of `GROUP BY time()` intervals (use larger intervals)
• Subqueries
  • When appropriate, process data from an already processed subset of data
  • SELECT min("max") FROM (SELECT max("usage_user") FROM cpu WHERE time > now() - 5d GROUP BY time(5m))
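What the subquery computes can be mimicked in plain Python: the inner query reduces raw points to one maximum per time bucket, and the outer query takes the minimum of those maxima. This only illustrates the result, not how InfluxDB executes the query; the function name and the sample data are invented.

```python
from collections import defaultdict

def min_of_bucket_maxes(samples, bucket_s):
    """samples: (unix_seconds, value) pairs. Returns the min over per-bucket max values."""
    bucket_max = defaultdict(lambda: float("-inf"))
    for ts, value in samples:
        bucket = ts - ts % bucket_s                  # start of the bucket this point falls in
        bucket_max[bucket] = max(bucket_max[bucket], value)
    # Outer aggregation runs over the already reduced per-bucket values.
    return min(bucket_max.values())

# Hypothetical usage_user samples, grouped into 5-minute (300s) buckets.
samples = [(0, 10.0), (120, 40.0), (300, 25.0), (420, 30.0), (600, 55.0)]
print(min_of_bucket_maxes(samples, bucket_s=300))
# 30.0
```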
Exercise
Query (1):
```
SELECT mean(usage_user), mean(usage_system)
FROM cpu
WHERE time > now() - 7d
GROUP BY time(1m), host
```
Query (2):
```
SELECT mean(usage_user), mean(usage_system)
FROM cpu
WHERE time > now() - 7d
GROUP BY time(5m), host
```
● Shard duration = 24h
● Hosts = 1,000
● Retention = 2w
Question: Which is faster?
Answer
● (1) = 10,080 time buckets * 2 Fields * 1,000 hosts = 20,160,000 series-time groups
● (2) = 2,016 time buckets * 2 Fields * 1,000 hosts = 4,032,000 series-time groups
● Query (2) produces one fifth as many groups, so it is faster
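The arithmetic behind these numbers, as a short Python check. The function name is hypothetical; the inputs (7-day window, 1m vs. 5m intervals, 2 fields, 1,000 hosts) come from the exercise.

```python
from datetime import timedelta

def series_time_groups(window, interval, fields, hosts):
    """Number of (time bucket, field, host) groups a GROUP BY time(), host query produces."""
    buckets = window // interval       # timedelta // timedelta yields an int
    return buckets * fields * hosts

week = timedelta(days=7)
q1 = series_time_groups(week, timedelta(minutes=1), fields=2, hosts=1000)
q2 = series_time_groups(week, timedelta(minutes=5), fields=2, hosts=1000)
print(q1, q2, q1 // q2)
# 20160000 4032000 5
```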
Exercise
Query (1):
```
SELECT mean(usage_user), mean(usage_system)
FROM cpu
WHERE time > now() - 1d
GROUP BY time(1m), host, container_id, app
```
Query (2):
```
SELECT mean(*)
FROM docker
WHERE time > now() - 1d
GROUP BY time(5m), host
```
● Shard duration = 7d
● Hosts = 10,000
● Retention = 2w
Question: Which is faster?
Answer
● It depends. The docker measurement has 100+ Fields.
● The wildcard (`*`) in query (2) is going to grab every single one of those from every single host vs. the mere 2 from query (1)
Tuning Parameters
• max-concurrent-queries
• max-select-point
• max-select-series
• Rate limiting compactions
  • max-concurrent-compactions
  • compact-full-write-cold-duration
  • compact-throughput
  • compact-throughput-burst
Tuning Parameters
• cache-max-memory-size
• cache-snapshot-memory-size
• cache-snapshot-write-cold-duration
• max-series-per-database
• max-values-per-tag
• Fine Grained Auth instead of multiple databases
sam@influxdata.com @SDillard12
THANKS!
