Elasticsearch In Netflix
Danny Yuan, Jae Bae
Welcome
Hashtag: #ES_in_Netflix
@Elasticsearch - Elasticsearch
@stonse - Sudhir Tonse
@g9yuayon - Danny Yuan
@metacret - Jae Bae
Who Are We?
Software engineers in Netflix’s Platform Engineering team, working on large-scale data infrastructure
Building and operating Netflix’s cloud real-time query service
Why Are We Here?
How We Use Elasticsearch
Why Elasticsearch
How We Run Elasticsearch
To Seek Your Feedback
How We Use Elasticsearch
Querying Log Events
Tracking Service Deployments
Querying Log Events
A Little Historical Perspective
Netflix is a log generating company that also happens to stream movies
- Adrian Cockcroft

photo credit: http://www.flickr.com/photos/decade_null/142235888/sizes/o/in/photostream/
A Humble Beginning
Things Changed
[Diagram: multiple Application instances]
70,000,000,000
1,500,000
Making Sense of Billions of Events
So We Evolved

hgrep -C 10 -k 5,2,3 'users.*[1-9]{3}' *catalina.out s3//bucket
select * from log_events where dateint=20140101
Field Name      Field Value
Client          “API”
Server          “Cryptex”
StatusCode      200
ResponseTime    73
[Diagram labels: Server Farm, Log data, Log Collectors]
What Could Go Wrong?
You thought parallelization would save the day?
Think again
What Is Missing?
Interactive Exploration
Functional Requirements
Arbitrary Boolean Queries
Aggregated Query
- Top N Query
- Trend
- Distribution
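
The two functional requirements above (arbitrary Boolean filters plus an aggregated Top N view) map naturally onto a single Elasticsearch search body. The following is a minimal sketch, not the queries used at Netflix: the field names are borrowed from the log-event example earlier in the deck, and the helper name `build_top_n_query` is hypothetical. It only builds the request body and does not contact a cluster.

```python
import json


def build_top_n_query(must_terms, must_not_terms, group_by, n=10):
    """Build a search body: a Boolean filter plus a Top N terms aggregation."""
    return {
        "size": 0,  # ask for aggregation buckets only, no individual hits
        "query": {
            "bool": {
                "must": [{"term": {f: v}} for f, v in must_terms.items()],
                "must_not": [{"term": {f: v}} for f, v in must_not_terms.items()],
            }
        },
        "aggs": {
            # "top_n" is an arbitrary label for the aggregation result
            "top_n": {"terms": {"field": group_by, "size": n}}
        },
    }


# Hypothetical question: which servers produce the most non-200 responses
# for the "API" client?
body = build_top_n_query(
    must_terms={"Client": "API"},
    must_not_terms={"StatusCode": 200},
    group_by="Server",
    n=5,
)
print(json.dumps(body, indent=2))
```

Trend and distribution views would swap the `terms` aggregation for a date-based or numeric histogram; that part is left out here to keep the sketch short.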
Non-Functional Requirements
- Interactive (response within seconds)
- Quickly locates the right log events
- Minimal programming effort
It’s All about Extracting Small Data
Out of Big Data
Now Back to the Use Case
Intelligent Alerts
Guided Debugging in the Right Context
A Useful Pattern
Aggregated Query -> Individual Query
Examples
- S3 diagnostics
- Tracking email campaigns
- Request traces
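
The "Aggregated Query -> Individual Query" pattern can be illustrated without a cluster at all. The sketch below runs both steps over a small in-memory list: first an aggregate to find the interesting bucket, then a lookup of the individual events behind it. The sample events and the helper names are hypothetical and only serve to show the shape of the pattern.

```python
from collections import Counter

# Hypothetical sample events; field names follow the log-event example.
EVENTS = [
    {"Server": "Cryptex", "StatusCode": 500, "ResponseTime": 910},
    {"Server": "Cryptex", "StatusCode": 200, "ResponseTime": 73},
    {"Server": "Cryptex", "StatusCode": 500, "ResponseTime": 870},
    {"Server": "Gateway", "StatusCode": 500, "ResponseTime": 120},
    {"Server": "Gateway", "StatusCode": 200, "ResponseTime": 40},
]


def is_error(event):
    """Treat any 5xx status code as an error."""
    return event["StatusCode"] >= 500


def top_n(events, field, predicate, n=1):
    """Aggregated query: count matching events per value of `field`."""
    counts = Counter(e[field] for e in events if predicate(e))
    return counts.most_common(n)


def drill_down(events, field, value, predicate):
    """Individual query: fetch the matching events behind one bucket."""
    return [e for e in events if e[field] == value and predicate(e)]


# Step 1: the aggregate points at the server with the most errors.
(worst_server, error_count), = top_n(EVENTS, "Server", is_error)
# Step 2: fetch only that server's error events for inspection.
details = drill_down(EVENTS, "Server", worst_server, is_error)
print(worst_server, error_count, [e["ResponseTime"] for e in details])
```

Against a real cluster, step 1 would be an aggregation request and step 2 an ordinary filtered search that reuses the bucket key as an extra term.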
Status:200

RequestId    Parent Id  Node Id  Service Name  Status
4965-4a74    0          123      Edge Service  200
4965-4a74    123        456      Gateway       200
4965-4a74    456        789      Service A     200
4965-4a74e   456        abc      Service B     200
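
The Parent Id and Node Id columns in the trace rows above are enough to rebuild the call tree of a request. The sketch below shows one way to do that; it assumes (this is not stated on the slide) that a Parent Id of 0 marks the root of the trace, and the function names are hypothetical.

```python
from collections import defaultdict

# Rows taken from the request-trace example:
# (request_id, parent_id, node_id, service_name)
ROWS = [
    ("4965-4a74", "0", "123", "Edge Service"),
    ("4965-4a74", "123", "456", "Gateway"),
    ("4965-4a74", "456", "789", "Service A"),
    ("4965-4a74e", "456", "abc", "Service B"),
]


def build_children(rows):
    """Index trace rows by parent id so each node can list its callees."""
    children = defaultdict(list)
    for _request_id, parent_id, node_id, service in rows:
        children[parent_id].append((node_id, service))
    return children


def render(children, parent_id="0", depth=0):
    """Walk the tree depth-first and return one indented line per node."""
    lines = []
    for node_id, service in children.get(parent_id, []):
        lines.append("  " * depth + f"{service} ({node_id})")
        lines.extend(render(children, node_id, depth + 1))
    return lines


TREE = render(build_children(ROWS))
print("\n".join(TREE))
```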
Edge Service (456) ---> Gateway (789)

Data Name      Value
Request ID     4965-4a74
Response Time  25 ms
Endpoints      /rest/service
Status Code    200
Why Elasticsearch?
Automatic Sharding and Replication
Flexible Schema
- Schemaless
- Reasonable defaults
Nice Extension Model
- Customizable REST Actions

- Site Plugins
- River Plugins
- Discovery Module
Ecosystem - Plugins, Kibana
Tracking Service Deployments
{ edda }
Built by Netflix Monitoring Eng Team
Tracks History and Changes to Service Deployments

Keeps Many Revisions
Tracks Dozens of Document Types
Why Elasticsearch?
Schemas may change at any time
Go schemaless
Users may search for any combination of fields

This is what a search engine is designed for
Users often need only a few fields
Projection via “fields” query
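
As a rough illustration of projection, the sketch below builds a search body that asks Elasticsearch to return only selected fields. It uses `_source` filtering, which current releases support in the search body; the slide refers to the older `fields` option, so treat the exact key as an assumption that depends on the Elasticsearch version. The field names and the `project` helper are hypothetical.

```python
def project(query, fields):
    """Return a search body that asks for only the given fields per hit."""
    return {"query": query, "_source": list(fields)}


# Hypothetical deployment document type and field names.
body = project({"term": {"type": "instance"}}, ["id", "revision"])
print(body)
```

Returning a handful of fields instead of whole documents keeps responses small when each stored revision is large.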
Need range queries on date and revisions
Natively supported by Elasticsearch
Route by document ID
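
A sketch of what a revision-history lookup might look like: range clauses on a date field and a revision field, with the document ID passed as the routing value so the request can be served by a single shard. The field names (`id`, `timestamp`, `revision`) and the helper names are assumptions for illustration, not Edda's actual schema. The code only builds the request parameters and body.

```python
def range_clause(field, gte=None, lte=None):
    """Build a range query clause, skipping bounds that are not set."""
    bounds = {k: v for k, v in (("gte", gte), ("lte", lte)) if v is not None}
    return {"range": {field: bounds}}


def history_request(doc_id, since, min_revision):
    """Return (url_params, body) for a routed revision-history search."""
    params = {"routing": doc_id}  # same value must be used at index time
    body = {
        "query": {
            "bool": {
                "must": [
                    {"term": {"id": doc_id}},
                    range_clause("timestamp", gte=since),
                    range_clause("revision", gte=min_revision),
                ]
            }
        }
    }
    return params, body


params, body = history_request("i-12345", since="2014-01-01", min_revision=10)
print(params)
print(body)
```

Routing by document ID only helps if all revisions of a document were indexed with that same routing value, which is the assumption made here.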
Running ES in Netflix
Operational Challenges
Back pressure when indexing
Diverse configurations and data
Dynamic flow of log events
Needs extensive monitoring and alerting
Tolerating outage at different scales
Favor Pulling Over Pushing
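
One way to read "pulling over pushing" is that the indexer fetches the next batch only after the previous bulk request has completed, so a slow cluster slows consumption instead of being flooded. The slides do not show how this was implemented at Netflix; the following is a minimal sketch of the idea in which `bulk_index` is a hypothetical stand-in for a real bulk-indexing call.

```python
def pull_batches(source, batch_size):
    """Yield batches lazily; nothing is read until the consumer asks."""
    batch = []
    for event in source:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller batch


def index_all(source, bulk_index, batch_size=3):
    """Pull one batch at a time and index it before pulling the next."""
    indexed = 0
    for batch in pull_batches(source, batch_size):
        bulk_index(batch)  # the next batch is not pulled until this returns
        indexed += len(batch)
    return indexed


# Stand-in sink that just records the batches it receives.
received = []
total = index_all(range(7), received.append)
print(total, received)
```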
Choose Config with Data
Integrating ES
AMI for Deployment by Asgard
Archaius for Configuration
Eureka for Server Discovery
Suro for Data Delivery
Servo for Monitoring Metrics
Zone-aware Replication
Multi-region Deployment
Discovery over Cassandra

Region-aware replication
Favor Index Rolling Over TTL
A dedicated service manages index rolling

Uses index template and routing
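
Index rolling usually means writing to one index per time period and dropping whole indices once they age out, rather than expiring documents one by one. The sketch below shows the bookkeeping such a rolling service might do. The daily `prefix-YYYY.MM.DD` naming scheme and the function names are assumptions, not details from the talk.

```python
from datetime import date, datetime, timedelta


def index_name(prefix, day):
    """Name of the daily index for `day`, e.g. logs-2014.01.03."""
    return f"{prefix}-{day:%Y.%m.%d}"


def expired_indices(existing, prefix, today, keep_days):
    """Return the daily indices that fall outside the retention window."""
    cutoff = today - timedelta(days=keep_days - 1)  # oldest day to keep
    expired = []
    for name in existing:
        if not name.startswith(prefix + "-"):
            continue  # not one of ours
        day = datetime.strptime(name[len(prefix) + 1:], "%Y.%m.%d").date()
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


today = date(2014, 1, 3)
existing = [index_name("logs", today - timedelta(days=i)) for i in range(3)]
print(index_name("logs", today))
print(expired_indices(existing, "logs", today, keep_days=2))
```

An index template matching `logs-*` would then apply the same settings and mappings to every newly rolled index.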
Worth Trying G1
Not recommended by ES team, but

Has fewer and shorter GC pauses

Occasional SIGSEGV, but it’s okay
Simple Majority for Master Election
Split-brain problem


discovery.zen.minimum_master_nodes

Dynamically updated
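
The simple-majority rule is small enough to write down: with N master-eligible nodes, the quorum is floor(N / 2) + 1. The sketch below computes it and builds the body one might send to the cluster settings API to update `discovery.zen.minimum_master_nodes` at runtime. The helper names are hypothetical, and nothing is sent to a cluster.

```python
def minimum_master_nodes(master_eligible):
    """Simple majority of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def settings_update(master_eligible):
    """Body for a cluster settings update (sketch; not sent anywhere)."""
    return {
        "persistent": {
            "discovery.zen.minimum_master_nodes": minimum_master_nodes(master_eligible)
        }
    }


for n in (3, 4, 5):
    print(n, minimum_master_nodes(n))
print(settings_update(3))
```

With a majority required, two partitions of the same cluster cannot both elect a master, which is what prevents the split-brain problem.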
Future Work
Automatic incremental backup and restore


Auto scaling

Fully automated deployment

Support more use cases
We’re Hiring
Thank You!
