FLUENTD: UNIFIED LOGGING
LAYER
John Hammink
October 14, 2015
Cask Big Data Application Meetup
About Me
• A recovering software & QA engineer turned digital
artist once interested in fractals;
• now into data visualization based on large datasets
rendered directly to GPU (RGL, various Python GL
libraries, etc.)
• github: jammink2; twitter: rijksband
Tweet NOW!
“At @caskdata learning how to collect
more event data using #Fluentd”
WHAT’S FLUENTD?
An extensible & reliable data collection
tool
simple core + plugins
buffering, HA (failover),
load balancing, etc.
like syslogd
What’s Fluentd?
> Data collector for unified logging layer
> Streaming data transfer based on JSON
> Written in Ruby
> Various gem-based plugins
> http://www.fluentd.org/plugins
> Working in production
> http://www.fluentd.org/testimonials
data collection tool
✓ duplicated code for error handling...
✓ messy code for retry mechanisms...
Blueflood
MongoDB
Hadoop
Metrics
Amazon S3
Analysis
Archiving
MySQL
Apache
Frontend
Access logs
syslogd
App logs
System logs
Backend
Your system
bash scripts ruby scripts
rsync
log file
bash
python scripts
custom logger
cron
other custom scripts...
(this is painful!!!)
Blueflood
MongoDB
Hadoop
Metrics
Amazon S3
Analysis
Archiving
MySQL
Apache
Frontend
Access logs
syslogd
App logs
System logs
Backend
Your system
filter / buffer / route
extensible
CORE PLUGINS
• Divide & Conquer
• Buffering & Retries
• Error Handling
• Message Routing
• Parallelism
• Read Data
• Parse Data
• Buffer Data
• Write Data
• Format Data
Common
Concerns
Use Case
Specific
architecture
INTERNAL ARCHITECTURE
“input-ish” “output-ish”
Input Parser Buffer Output Formatter Filter
Internal Architecture (Simplified)
Input Buffer Output
Plugin Plugin Plugin
time:   2012-02-04 01:33:51
tag:    myapp.buylog
record: {
  "user": "me",
  "path": "/buyItem",
  "price": 150,
  "referer": "/landing"
}
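The time / tag / record triple on this slide can be modeled in a few lines. This is a minimal Python sketch, not Fluentd's internal representation: the `Event` class and the `to_json` helper are hypothetical names introduced for illustration, the values come from the slide, and UTC is assumed because the slide gives no time zone.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, NamedTuple


class Event(NamedTuple):
    """Hypothetical model of one event: a timestamp, a routing tag, a record."""
    time: datetime
    tag: str
    record: Dict[str, Any]


def to_json(event: Event) -> str:
    """Serialize the event; encoding time as Unix seconds is an assumption of this sketch."""
    return json.dumps({
        "time": int(event.time.timestamp()),
        "tag": event.tag,
        "record": event.record,
    })


# Values taken from the slide; the time zone (UTC) is assumed.
event = Event(
    time=datetime(2012, 2, 4, 1, 33, 51, tzinfo=timezone.utc),
    tag="myapp.buylog",
    record={"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"},
)
print(to_json(event))
```

The tag is what routing decisions are made on, while the record is free-form JSON, which is why the same event can be sent to very different destinations.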
Architecture :: Input plugins
Input
HTTP+JSON (in_http)
File tail (in_tail)
Syslog (in_syslog)
...
Plugin
✓ Receive logs
✓ Or pull logs from data sources
✓ in non-blocking manner
Architecture :: Output plugins
Plugin
✓ Write or send event logs
Output
File (out_file)
Amazon S3 (out_s3)
MongoDB (out_mongo)
...
Architecture :: Buffer plugins
Plugin
✓ Improve performance
✓ Provide reliability
✓ Provide thread-safety
Buffer
Memory (buf_memory)
File (buf_file)
Architecture :: Buffer plugins
Plugin
✓ Improve performance
✓ Provide reliability
✓ Provide thread-safety
chunk
chunk
chunk output
Input
Fluentd and Docker - running fluentd within a docker container
reliable data transfer
DIVIDE & CONQUER & RETRY
error retry
error retry retry
retry
reliable process
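The "error, retry, retry" idea can be sketched as a buffered chunk that is re-sent with a growing wait between attempts. This is a generic exponential-backoff sketch in Python, not Fluentd's actual retry code: `flush_with_retry`, its parameters, and the `flaky_write` destination are all hypothetical names, and the doubling wait is an assumption made for illustration.

```python
import time
from typing import Callable, List


def flush_with_retry(
    chunk: List[dict],
    write: Callable[[List[dict]], None],
    max_retries: int = 5,
    initial_wait: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Try to write a chunk, doubling the wait after each failure.

    Returns the number of attempts used. Re-raises the last error once
    max_retries retries have failed, so the caller can keep the chunk.
    """
    wait = initial_wait
    for attempt in range(1, max_retries + 2):  # first try + max_retries retries
        try:
            write(chunk)
            return attempt
        except Exception:
            if attempt > max_retries:
                raise
            sleep(wait)
            wait *= 2


# Hypothetical destination that fails twice, then succeeds.
calls = {"n": 0}
stored: List[dict] = []


def flaky_write(chunk: List[dict]) -> None:
    calls["n"] += 1
    if calls["n"] <= 2:
        raise IOError("destination unavailable")
    stored.extend(chunk)


waits: List[float] = []  # record the waits instead of really sleeping
attempts = flush_with_retry([{"path": "/buyItem"}], flaky_write, sleep=waits.append)
print(attempts, waits)  # 3 [1.0, 2.0]
```

Because the chunk stays in the buffer until a write succeeds, a temporary outage at the destination delays delivery rather than dropping data.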
THIS?
OR THIS?
M × N → M + N
Nagios
MongoDB
Hadoop
Alerting
Amazon S3
Analysis
Archiving
MySQL
Apache
Frontend
Access logs
syslogd
App logs
System logs
Backend
Databases
buffer/filter/route
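The M × N → M + N claim is simple arithmetic: with point-to-point scripts every source needs its own integration with every destination, while with a shared collection layer each source and each destination connects once. A small Python sketch, using hypothetical counts (4 sources, 5 destinations) that are not taken from the slides:

```python
def point_to_point(sources: int, destinations: int) -> int:
    """Integrations needed when every source talks to every destination."""
    return sources * destinations


def via_shared_layer(sources: int, destinations: int) -> int:
    """Integrations needed when everything connects to one collection layer."""
    return sources + destinations


# Hypothetical counts chosen for illustration.
sources, destinations = 4, 5
print(point_to_point(sources, destinations))    # 20
print(via_shared_layer(sources, destinations))  # 9
```

The gap widens as the system grows: adding one more destination costs one new integration with the shared layer, versus one per source without it.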
use cases
SIMPLE FORWARDING
# logs from a file
<source>
  type tail
  path /var/log/httpd.log
  format apache2
  tag backend.apache
</source>

# logs from client libraries
<source>
  type forward
  port 24224
</source>

# store logs to MongoDB
<match backend.*>
  type mongo
  database fluent
  collection test
</match>
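The `<match backend.*>` section selects events by tag. The sketch below is a simplified Python model of that routing, covering only the `*` wildcard (one tag part) and first-match-wins ordering; `tag_matches` and `route` are hypothetical helper names, and the other pattern forms Fluentd supports are deliberately left out.

```python
from typing import List, Optional, Tuple


def tag_matches(pattern: str, tag: str) -> bool:
    """Simplified matcher: '*' stands for exactly one dot-separated tag part."""
    pattern_parts = pattern.split(".")
    tag_parts = tag.split(".")
    if len(pattern_parts) != len(tag_parts):
        return False
    return all(p == "*" or p == t for p, t in zip(pattern_parts, tag_parts))


def route(tag: str, routes: List[Tuple[str, str]]) -> Optional[str]:
    """Return the output name of the first pattern that matches the tag."""
    for pattern, output in routes:
        if tag_matches(pattern, tag):
            return output
    return None


# One route, mirroring the config on the slide.
routes = [("backend.*", "mongo")]

print(route("backend.apache", routes))        # mongo
print(route("backend.apache.error", routes))  # None: '*' covers one part only
print(route("web.access", routes))            # None: first part differs
```

Events that match no pattern are not delivered anywhere, so tag naming and match patterns need to be designed together.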
LESS SIMPLE FORWARDING
LAMBDA ARCHITECTURE
# logs from a file
<source>
  type tail
  path /var/log/httpd.log
  format apache2
  tag web.access
</source>

# logs from client libraries
<source>
  type forward
  port 24224
</source>

# store logs to ES and HDFS
<match *.*>
  type copy
  <store>
    type elasticsearch
    logstash_format true
  </store>
  <store>
    type webhdfs
    host namenode
    port 50070
    path /path/on/hdfs/
  </store>
</match>
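The `copy` output above sends every matched event to each `<store>`. A minimal Python sketch of that fan-out follows. It is not the plugin's implementation: `copy_to_stores` is a hypothetical helper, and two in-memory lists stand in for Elasticsearch (the fast, searchable path) and HDFS (the batch/archive path).

```python
from typing import Callable, Dict, List, Tuple

Store = Callable[[str, Dict], None]


def copy_to_stores(tag: str, record: Dict, stores: List[Store]) -> None:
    """Deliver the same event to every store, each with its own copy of the record."""
    for store in stores:
        store(tag, dict(record))


# In-memory stand-ins for the two destinations in the config.
search_index: List[Tuple[str, Dict]] = []
archive: List[Tuple[str, Dict]] = []

stores: List[Store] = [
    lambda tag, rec: search_index.append((tag, rec)),
    lambda tag, rec: archive.append((tag, rec)),
]

copy_to_stores("web.access", {"path": "/landing", "code": 200}, stores)
print(len(search_index), len(archive))  # 1 1
```

Giving each store its own copy of the record means one destination can reshape the data without affecting what the other receives.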
FLUENTD ON KUBERNETES (NOV 2015)
FLUENTD LOGGING DRIVER (APR 2015)
Tweet Again!
“Happy v1 #k8s and congrats #Fluentd for
becoming a #docker logging driver”
DEMO: FLUENTD + DOCKER
THANK YOU!
AND TREASURE DATA IS HIRING!
WWW.TREASUREDATA.COM/CAREERS
