InfluxEnterprise Architectural Patterns by Dean Sheehan, Senior Director, Pre & Post Sales | InfluxData

Dean Sheehan
Snr Director Sales Engineering
InfluxEnterprise
Architectural Patterns

What we will be covering
✓ Enterprise Overview
✓ Other Features
✓ Ingestion & Query Rates
✓ Deployment Examples
✓ Replication Patterns
✓ General Advice

Signs You’re Ready for InfluxEnterprise
6. Your CPU average is >= 70%
5. Increasing throughput is causing dropped-write errors
4. A sprawling number of single-node deployments
3. Vertical scaling is no longer providing further benefit
2. Data recording and availability matter
1. The sales team starts calling you on weekends

InfluxEnterprise
• Open Source Core
• High Availability
• Horizontal Scalability
• Enterprise Security
• Support from InfluxData
• On-Premises/Cloud Deployment Options

What Problem Are You Trying to Solve?

What are you dealing with?
• Metrics
• Events
• Log Data
• Sensors
• Apps
• Servers
• Long-Term Storage
• Vendor Replacement
• Time-Series Alerts
• Visualization
• Network Data
• Custom Solution
• Real-Time Analytics
• Virtualization Monitoring
• Managed Service (InfluxCloud)

InfluxEnterprise Cluster Architecture: Meta Nodes

InfluxEnterprise Clustering: Data Nodes

InfluxEnterprise Cluster Architecture

Security
• LDAP Support
– Enterprise customers can configure the database to use LDAP as a backing authentication source for users, roles, and permissions
– The connection between the database and the LDAP server is secured once established
• Fine-grained authorization
– Controls access at the measurement or series level (rather than only at the database level)
– Enable authentication in your configuration file
– Create users through the query API
– Grant users explicit read and/or write privileges
– Set restrictions that define a combination of database, measurement, and tags which cannot be accessed without an explicit grant
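The "create users" and "grant privileges" steps can be sketched as InfluxQL statements posted to the `/query` endpoint. This is a minimal sketch, not a definitive procedure: the host, user name, password, and database name are hypothetical, the request is only built (not sent), and the measurement- and tag-level restrictions themselves are configured separately and are not shown here.

```python
from urllib.parse import urlencode


def influxql_request(base_url, statement):
    """Return (url, form_body) for POSTing one InfluxQL statement to /query."""
    return f"{base_url}/query", urlencode({"q": statement}).encode("utf-8")


# Hypothetical user, password, and database names, for illustration only.
statements = [
    "CREATE USER \"alice\" WITH PASSWORD 'example-password'",
    'GRANT READ ON "metrics" TO "alice"',
]

for statement in statements:
    url, body = influxql_request("http://localhost:8086", statement)
    print(url, body.decode("utf-8"))
```

Each pair would be sent as an HTTP POST by an already authenticated admin user; how the credentials are supplied depends on the deployment.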

© 2018 InfluxData. All rights reserved.
Eventual Consistency
• Anti-Entropy Service
– Expands on capabilities to detect and copy full shards
– Now allows for detection and repair of inconsistent shards
• Hinted-Handoff Queue
– Queues inbound points destined for other nodes in the cluster which may currently be down
– Stored by node and shard (10 GB by default)
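The hinted-handoff idea above can be illustrated with a toy model. This is a simplified sketch, not InfluxEnterprise's implementation: the class name, the single global byte cap, and the in-memory storage are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class HintedHandoffQueue:
    """Toy model: buffers points for unreachable nodes, keyed by (node, shard)."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes          # assumed single cap for the whole queue
        self.used_bytes = 0
        self.queues = defaultdict(deque)    # (node, shard) -> queued points

    def enqueue(self, node, shard, point):
        """Queue a point for a down node; return False if the cap would be exceeded."""
        size = len(point.encode("utf-8"))
        if self.used_bytes + size > self.max_bytes:
            return False
        self.queues[(node, shard)].append(point)
        self.used_bytes += size
        return True

    def drain(self, node):
        """Remove and return (shard, point) pairs queued for a node that is back up."""
        drained = []
        for key in sorted(k for k in self.queues if k[0] == node):
            for point in self.queues.pop(key):
                self.used_bytes -= len(point.encode("utf-8"))
                drained.append((key[1], point))
        return drained


queue = HintedHandoffQueue(max_bytes=20)
queue.enqueue("node2", 1, "cpu v=1")
queue.enqueue("node2", 2, "cpu v=2")
print(queue.drain("node2"))
```

Once the queue is full, further writes for the down node can no longer be buffered, which is why the queue size matters when sizing for outages.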

Backup and Restore
• Useful for: Disaster recovery, Debugging, Restoring clusters to a consistent state
• What it does: Creates a copy of the metastore and the shard data
• Backup is compressed and is not human readable
• Export is not compressed but is human readable
• OSS and InfluxEnterprise ARE NOW compatible – aka portable
• Full or partial backup options
• Move data into a new database (with new Retention Policies, etc)

Cluster
[Diagram: four data nodes (Data Node 1 to Data Node 4). Database X (Replication=1) has Shards 1 to 4 (a, b, c, d), one per data node.]
≈ 4x ingest rate
≤ 1x concurrent query rate

Cluster
[Diagram: each shard is replicated to all four data nodes (a, a’, a’’, a’’’ and b, b’, b’’, b’’’).]
≈ 4x concurrent query rate

Cluster
[Diagram: each shard has two copies across the four data nodes (a, a’ and b, b’).]
≤ 2x concurrent query rate
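The three cluster slides suggest a rule of thumb that can be written down as a small calculation. This is a rough sketch under stated assumptions: the function name is hypothetical, the query multiplier is an upper bound taken from the slides, and the ingest formula (data nodes divided by replication factor) generalizes the single "≈ 4x ingest rate" figure shown for replication factor 1 and is not stated by the slides for other factors.

```python
def cluster_scaling(data_nodes, replication_factor):
    """Estimate (ingest multiplier, max concurrent-query multiplier) vs. one node."""
    if replication_factor < 1 or data_nodes < replication_factor:
        raise ValueError("need 1 <= replication_factor <= data_nodes")
    if data_nodes % replication_factor != 0:
        # Follows the advice that node count be a multiple of the replication factor.
        raise ValueError("data_nodes should be a multiple of replication_factor")
    ingest = data_nodes / replication_factor   # assumed generalization
    query_concurrency = replication_factor     # upper bound per the slides
    return ingest, query_concurrency


for rf in (1, 2, 4):
    print(rf, cluster_scaling(4, rf))
```

Real numbers depend on hardware, schema, and query shape, so these multipliers are only a starting point for capacity discussions.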

How does InfluxEnterprise Fit?

Example 1: Mothership
[Diagram: Data Center 1 and Data Center 2 each run Telegraf agents, InfluxDB Ent, Kapacitor, and Chronograf. A central Enterprise Cluster (Data Node 1 to Data Node n, with Chronograf and Kapacitor) sits behind a Firewall/LoadBalancer.]

Example 2: Durable Data Ingest
[Diagram: Telegraf or other sources → Kafka Queue → Telegraf Cluster (several Telegraf instances) → LoadBalancer → InfluxDB Cluster.]
• Put each Telegraf instance in the same Kafka Consumer Group
• How fast is fast? For example: six data nodes at 2.5M values per second
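The reason for putting every Telegraf instance in the same consumer group is that each Kafka partition is then read by exactly one instance, so no point is written twice. The sketch below is a toy round-robin model of that behavior, not Kafka's actual partition assignor; the function and instance names are hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Toy round-robin split of partitions among the consumers of one group."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


group = ["telegraf-1", "telegraf-2", "telegraf-3"]
print(assign_partitions(range(6), group))
```

If each Telegraf instance used its own group instead, every instance would receive every partition and the cluster would see duplicate writes of the same points.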

Example 3: Influx with ElasticSearch
[Diagram: Telegraf, LoadBalancer, InfluxDB Cluster (Metrics), ElasticSearch (Logs), Kapacitor, and You. Metrics and Logs include a common Session ID or other UID.]
• Discover trends before and during the Error from metrics
• Perform Root Cause Analysis from Logs
• Query using the common Session ID or UID received from the Alert

Data Replication
Generally there are two types of data that we care about replicating:
• New Data – Data coming from our raw sources
• Derived Data – The output of a SELECT INTO or a TICKscript

Replication of New Data – Pattern 1
[Diagram: Cluster 1 and Cluster 2, each with a Firewall/Load Balancer in front of Data Node 1 to Data Node n, and each fed by Telegraf.]

Replication of New Data – Pattern 2
[Diagram: Cluster 1 and Cluster 2, each with a Firewall/Load Balancer in front of Data Node 1 to Data Node n, and each fed by Telegraf, with a Kafka Queue in the path.]

Replication of Derived Data – Pattern 3
[Diagram: Cluster 1 and Cluster 2, each with a Load Balancer, Data Nodes, Kapacitor, and Telegraf. The output of Kapacitor on one cluster is written to the other cluster.]

General Cluster Advice
• Batch your writes!
• The number of data nodes should be a multiple of your replication factor
• Use a single node of InfluxDB to monitor your cluster
• Put a load balancer in front of each of your data nodes
• Higher replication factors result in higher query concurrency, but higher write latency
• Use Fine Grained Authorization instead of multiple databases
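As an illustration of "Batch your writes!", the sketch below groups line-protocol lines into batches and builds one HTTP request per batch for the `/write` endpoint. It is a minimal sketch: the helper names, host, database name, and the tiny batch size are assumptions for illustration, and the requests are only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def batches(lines, batch_size):
    """Yield consecutive slices of at most batch_size lines."""
    for start in range(0, len(lines), batch_size):
        yield lines[start:start + batch_size]


def build_write_request(base_url, database, lines):
    """Build one POST to /write carrying many newline-separated points."""
    params = urlencode({"db": database, "precision": "ns"})
    body = "\n".join(lines).encode("utf-8")
    return Request(f"{base_url}/write?{params}", data=body, method="POST")


# Hypothetical points in line protocol: measurement,tag field timestamp
lines = [f"cpu,host=h{i} value={i} {i}" for i in range(12)]
requests = [
    build_write_request("http://localhost:8086", "metrics", batch)
    for batch in batches(lines, batch_size=5)
]
print(len(requests), "requests instead of", len(lines))
```

In practice the batch size would be far larger than five and tuned to the workload; the point is that one request per batch replaces one request per point.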
