Ines Sombra
Director of Engineering
The burden of a successful feature: 

Scaling our real time logging platform
presents
Today’s Agenda
A delightful
demo &
context
A deep dive
into logging
Challenges
& future
But first… A bit of context
Observability
tl;dr
https://vimeo.com/267641392
Fresh from Altitude NYC
—Peter Bourgon, Altitude NYC
Observability is an umbrella term. There are different
techniques to achieve observability in a system.
Peter’s classification of Observability
TECHNIQUES SYSTEMS
* Lovingly stolen from Peter Bourgon
TODAY 👉
STOP
Demo
Time!
But Why?
This pipeline is one of the oldest systems at Fastly
Born out of our dissatisfaction with the status quo
We wanted something that would send you logs
extremely fast (streamed in near real time) to
anywhere you want (many endpoints)
Log Streaming 

at Fastly
Logging @ Fastly
Caches Aggregators Endpoints
s3
syslog
gcs
sumologic
bigquery
ftp
papertrail
…
Logging pipeline is Stateless
We don’t batch your logs
We don’t store your logs
We stream your logs in
near real-time to your
defined endpoints
We really don’t want your
logs on disk
Logging @ Fastly
Caches + Senders Aggregators
Varnish
Logging pipeline is Best Effort
We try our best to send logs to
your defined endpoint
Your endpoint must be up &
healthy in order for us to be
able to send data to it
We have minimal buffering
Pipeline optimized for log
streaming speed
Logging Endpoints
We don’t limit the number
of endpoints or log lines
per request
~8.6K active endpoints
Ecosystem of endpoints in
different stages of
evolution
Logging Streams data
File-based endpoints (time ranged)
Streaming endpoints (protocol or http-requests)
s3 gcs ftp sftp
syslog
sumologic
bigquery logentries papertrail
splunk scalyr honeycomb
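The two endpoint families above can be written down as a small lookup table. This is only an illustrative sketch: `EndpointKind`, `ENDPOINT_KINDS` and `kind_of` are hypothetical names, and the grouping simply mirrors the slide rather than any internal Fastly identifiers.

```python
from enum import Enum


class EndpointKind(Enum):
    FILE_BASED = "file-based (time ranged)"
    STREAMING = "streaming (protocol or HTTP requests)"


# Grouping copied from the slide; names are illustrative only.
ENDPOINT_KINDS = {
    **{name: EndpointKind.FILE_BASED for name in ("s3", "gcs", "ftp", "sftp")},
    **{
        name: EndpointKind.STREAMING
        for name in (
            "syslog", "sumologic", "bigquery", "logentries",
            "papertrail", "splunk", "scalyr", "honeycomb",
        )
    },
}


def kind_of(endpoint: str) -> EndpointKind:
    """Return the delivery family for an endpoint name (KeyError if unknown)."""
    return ENDPOINT_KINDS[endpoint.lower()]
```

For example, `kind_of("S3")` returns `EndpointKind.FILE_BASED`, while `kind_of("syslog")` returns `EndpointKind.STREAMING`.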
Logging Growth (2014-2015)
😳
~430K LPS ~1.2K endpoints ~2GBps
Logging Growth (2017-2018)
~3M LPS ~8.6K endpoints ~4GBps
Logging Growth (8X!!)
Logging Endpoints
We send a lot of data continuously to
our supported endpoints
Syslog continues to be our most
popular endpoint but S3 & GCS have
the highest volume
The 70's are still alive with a very
respectable 13 MBps to ftp and 74
kBps to sftp*
* for the non-millennials
Challenges & 

Lessons learned
Volume Challenges
No hard limits to what you
can log, this can be
challenging
System is multi-tenant. Noisy
neighbors can affect delivery
Consider sampling for high
volume logging
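As a rough illustration of the sampling advice, here is a minimal sketch of probabilistic sampling. It is written in Python purely for readability; the function names and the `sample_rate` parameter are assumptions for this example, not part of any Fastly API.

```python
import random


def should_log(sample_rate: float, rng: random.Random) -> bool:
    """Return True for roughly `sample_rate` of calls (0.0 <= rate <= 1.0)."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    return rng.random() < sample_rate


def sample_lines(lines, sample_rate, seed=None):
    """Yield a random subset of `lines`, keeping about `sample_rate` of them."""
    rng = random.Random(seed)
    for line in lines:
        if should_log(sample_rate, rng):
            yield line
```

With `sample_rate=0.1`, roughly one line in ten is kept, which cuts log volume by about 90% while leaving aggregate trends visible. The trade-off is that any individual request may be missing from the logs.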
Burden of many
endpoints
Classic integration
challenges (each endpoint is
a downstream dependency)
Standard endpoint clients
often don’t meet our needs
Having our own clients
affords us extra optimizations
Endpoints & Health
Some endpoints have known
limitations (infamous
examples: S3, BigQuery, GCS)
Difficult to infer if an
endpoint is working or not
(Hard to test setup too)
Structured logging (JSON via
VCL) is challenging
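One plausible reason hand-assembled structured logs are fragile is escaping: any value containing a quote or backslash can produce an invalid line. The sketch below shows the general problem in Python rather than VCL; the field names and helper functions are hypothetical.

```python
import json


def naive_json(url: str, user_agent: str) -> str:
    """Build a JSON log line by string formatting, with no escaping."""
    return '{"url": "%s", "user_agent": "%s"}' % (url, user_agent)


def safe_json(url: str, user_agent: str) -> str:
    """Build the same log line with a JSON encoder that escapes values."""
    return json.dumps({"url": url, "user_agent": user_agent})


def is_valid_json(line: str) -> bool:
    """Return True if `line` parses as JSON."""
    try:
        json.loads(line)
        return True
    except json.JSONDecodeError:
        return False


# A user agent containing a double quote breaks the naive version.
tricky_agent = 'curl/7.58 "custom"'
```

Here `is_valid_json(naive_json("/a", tricky_agent))` is false, while the `safe_json` version parses and round-trips the original value. A log format built by string concatenation needs an equivalent escaping step for every user-controlled field.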
Service Isolation
Prioritize delivery of content over
log retention
An aggregator discards the oldest
logs it has when it can’t deliver
them fast enough
In a cache node we are our own
customers so senders do the
same when they can’t reach
aggregators fast enough
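The drop-oldest behaviour described above can be sketched as a bounded buffer. This is a minimal illustration under assumptions, not Fastly's implementation: the `DropOldestBuffer` class, its capacity, and the `dropped` counter are all hypothetical.

```python
from collections import deque


class DropOldestBuffer:
    """Bounded in-memory buffer that evicts the oldest line when full."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._lines = deque(maxlen=capacity)
        self.dropped = 0  # number of lines discarded so far

    def push(self, line: str) -> None:
        """Add a line; if the buffer is full, the oldest line is discarded."""
        if len(self._lines) == self._lines.maxlen:
            self.dropped += 1  # deque(maxlen=...) discards the leftmost item
        self._lines.append(line)

    def drain(self, max_lines: int) -> list:
        """Remove and return up to `max_lines` of the oldest buffered lines."""
        out = []
        while self._lines and len(out) < max_lines:
            out.append(self._lines.popleft())
        return out
```

Pushing five lines into a buffer of capacity three leaves the three newest lines and a `dropped` count of two. Memory use stays bounded no matter how slow the downstream endpoint is, which matches the stated priority of content delivery over log retention.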
Expectation Mismatch
Burden of a system that works so well is that it
makes you believe you have strong guarantees
Design constraints determine the SLA of the
pipeline
General advice: Understand the design choices of
the systems you use because they limit what is
possible to guarantee *
The Future of
Logging
The team have been Busy bees
H1: Platform performance & addressing the challenges of individual endpoints
H2: We are getting fancy!
We are getting fancy!
Platform Performance
Reducing lock contention & CPU usage
Smarter memory allocation &
management
Overhauling all endpoints
Halving the time it takes for a log line to
be processed (from sender read to
aggregator line preparation)
Getting fancy
BigQuery improvements
New endpoints: Kafka
More integrations with
cloud services
Make endpoints easier to
debug
Want More?
Want more endpoints?
Want metrics?
Want easier structured logging?
Want VCL counters + secondly
aggregation + a higher SLA?
Dom Fee
tl;dr LOGGING
Fastly lets you extend the
visibility of your system to the
edge & gain meaningful insights
in near real-time
Is a pipeline with very specific
constraints & guarantees
Exciting things are coming!
(l,d)ogs of Fastly
https://github.com/Randommood/Altitude2018
