1© Cloudera, Inc. All rights reserved.
Solr consistency and recovery internals
Mano Kovacs | July 13, 2017
Intro
• Mano Kovacs
• Cloudera Search engineer
• Working on “Why is my Solr cluster down?” mysteries.
• 15 years of development: high-performance web services, IoT platform
• Amateur slideshow enthusiast
Agenda
• Consistency basics (leader/follower)
• Leader election
• When to recover
• General recovery (peersync, replication)
• Recovery in detail
• Leader-Initiated Recovery
• Auto Add Replica
Basics
• Shards in collection
• One leader per shard
• Leader gets writes
• Leader replicates writes to the followers
Leader Election
• ZooKeeper leader election recipe
• Sequential, ephemeral nodes for each replica
• The order dictates the leader candidates
• First in order becomes leader candidate
• Replicas watch the previous candidate to get notified
• If leader fails, next in line will be the candidate
• Leader candidates follow leader preparation process
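The ordering and watch chain above can be sketched with a toy in-memory model. This is not Solr or ZooKeeper client code: the Election class and its method names are hypothetical, and a plain dict stands in for the sequential, ephemeral nodes.

```python
class Election:
    """Toy model of the election recipe; a dict stands in for ZooKeeper znodes."""

    def __init__(self):
        self._next_seq = 0
        self._nodes = {}  # sequence number -> replica name

    def join(self, replica):
        """Create a 'sequential, ephemeral node' for the replica; return its sequence."""
        seq = self._next_seq
        self._next_seq += 1
        self._nodes[seq] = replica
        return seq

    def leave(self, replica):
        """Simulate a lost session: the replica's ephemeral node disappears."""
        for seq, name in list(self._nodes.items()):
            if name == replica:
                del self._nodes[seq]

    def candidate(self):
        """The replica with the lowest live sequence number is the leader candidate."""
        if not self._nodes:
            return None
        return self._nodes[min(self._nodes)]

    def watched_by(self, replica):
        """Each replica watches the node immediately ahead of it (None for the first)."""
        seqs = sorted(self._nodes)
        for i, seq in enumerate(seqs):
            if self._nodes[seq] == replica:
                return self._nodes[seqs[i - 1]] if i > 0 else None
        raise KeyError(replica)


election = Election()
for name in ["replica1", "replica2", "replica3"]:
    election.join(name)
election.leave("replica1")   # leader fails
print(election.candidate())  # next in line becomes the candidate
```

Watching only the previous node means a failure wakes up one replica rather than all of them.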
Leader Election - leader candidate
• On restart: waits for all replicas to participate (default 3 minutes)
• Syncs changes from other replicas
• Verifies that its last state was ACTIVE, unless this is a startup
• If all were DOWN, shard hangs (SOLR-7065)
• Verify there was no error reported (LIR… tbd)
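A rough sketch of the checks a leader candidate goes through, following only the bullets above. The function name, parameters, and return values are hypothetical, the sync step is left out, and the real Solr logic has more cases than this.

```python
def leader_candidate_decision(live_replicas, expected_replicas, waited_seconds,
                              last_state, is_startup, lir_flagged,
                              wait_limit_seconds=180):
    """Return 'wait', 'rejoin' or 'lead' for a leader candidate (illustrative only)."""
    # On restart, wait for all replicas to show up, but only up to the time limit
    # (180 s mirrors the slide's "default 3 mins").
    if (is_startup and live_replicas < expected_replicas
            and waited_seconds < wait_limit_seconds):
        return "wait"
    # Outside of startup, a candidate whose last state was not ACTIVE steps back.
    if not is_startup and last_state != "ACTIVE":
        return "rejoin"
    # A candidate that was flagged by leader-initiated recovery must not lead.
    if lir_flagged:
        return "rejoin"
    return "lead"


print(leader_candidate_decision(2, 3, 30, "ACTIVE", True, False))
```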
What causes Recovery?
• Routine events
  • Add or move replica: the new replica does not have the data
  • Restart (upgrade/tuning): the replica might have missed updates
• Non-routine events
  • Server crash
    • Leader
    • Replica
  • Network failure (lost ZK connection)
  • Replica partitioned: can access ZK, but not the leader
Recovery (from 30,000 ft)
• Replay unfinished updates from the tlog
• Check whether we are in sync
• If not: “How far am I behind?”
  • If N (default 100) docs or fewer
    • Retrieve the delta
  • Else
    • Replication: pull the full index
• Go ACTIVE
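The decision in this flow can be sketched as a small function. It is a simplification under stated assumptions: choose_recovery and PEERSYNC_WINDOW are hypothetical names, and comparing full version sets stands in for whatever bookkeeping Solr actually uses to decide how far behind a replica is.

```python
PEERSYNC_WINDOW = 100  # the slide's "N (def=100)", treated here as a plain constant


def choose_recovery(leader_versions, replica_versions, window=PEERSYNC_WINDOW):
    """Pick a recovery path from the update versions each side has seen."""
    missing = sorted(set(leader_versions) - set(replica_versions))
    if not missing:
        return ("in_sync", [])        # nothing to do, go ACTIVE
    if len(missing) <= window:
        return ("peersync", missing)  # small gap: fetch only the delta
    return ("replication", [])        # large gap: pull the full index


print(choose_recovery(range(10), range(8)))
```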
Recovery (from 1,000 ft)
• Buffer new updates
  • So we don’t fall behind over and over again
• Wait for the leader to notice us
  • Otherwise we don’t get updates
• Replay buffered updates
  • Hopefully the replay catches up with incoming updates
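A minimal sketch of buffering and replay, assuming a replica that keeps the highest version per document. RecoveringReplica and its methods are hypothetical names, not Solr classes.

```python
class RecoveringReplica:
    """Buffers updates while recovering, then replays them before going ACTIVE."""

    def __init__(self):
        self.index = {}    # doc id -> (version, doc)
        self.buffer = []   # updates that arrived while recovering
        self.state = "RECOVERING"

    def on_update(self, doc_id, version, doc):
        if self.state == "RECOVERING":
            self.buffer.append((doc_id, version, doc))  # do not apply yet
        else:
            self._apply(doc_id, version, doc)

    def _apply(self, doc_id, version, doc):
        current = self.index.get(doc_id)
        # Only a newer version may overwrite what is already indexed.
        if current is None or current[0] < version:
            self.index[doc_id] = (version, doc)

    def finish_recovery(self, recovered):
        """Install the recovered data, replay the buffer, then go ACTIVE."""
        self.index = dict(recovered)
        for update in self.buffer:
            self._apply(*update)
        self.buffer.clear()
        self.state = "ACTIVE"


replica = RecoveringReplica()
replica.on_update("doc1", 5, "new text")         # arrives mid-recovery
replica.finish_recovery({"doc1": (3, "old text")})
print(replica.index["doc1"])
```

Without the buffer, every update that arrived during recovery would be missing again as soon as recovery finished.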
Recovery (from 100 ft)
• Updates are versioned
  • Timestamp + counter
• PeerSync: last N updates by version
• Index has a fingerprint (hash of doc versions)
  • If any other updates are missing, the fingerprint check will fail
  • Consistency safety net if the other checks fail
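To make "timestamp + counter" and "hash of doc versions" concrete, here is a small sketch. The 20-bit shift, the SHA-256 based hash, and the names VersionClock and fingerprint are all assumptions for illustration; Solr's real fingerprint contains more fields and uses its own hash function.

```python
import hashlib
import time


class VersionClock:
    """Monotonic versions: millisecond timestamp in the high bits, counter in the low bits."""

    def __init__(self, now_ms=lambda: int(time.time() * 1000)):
        self._now_ms = now_ms
        self._last = 0

    def next_version(self):
        candidate = self._now_ms() << 20  # assumed layout: leave 20 low bits for the counter
        # Within the same millisecond, fall back to incrementing the last version.
        self._last = candidate if candidate > self._last else self._last + 1
        return self._last


def fingerprint(versions):
    """Order-independent digest of a set of doc versions: (count, sum of hashes mod 2^64)."""
    count = 0
    total = 0
    for v in versions:
        digest = hashlib.sha256(str(v).encode("ascii")).digest()
        total = (total + int.from_bytes(digest[:8], "big")) % (1 << 64)
        count += 1
    return count, total


leader = [101, 102, 103]
replica = [103, 101]  # one update missing
print(fingerprint(leader) == fingerprint(replica))
```

Because the hashes are summed, two replicas holding the same versions in a different order produce the same fingerprint, while any missing version changes it.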
Leader-Initiated Recovery
• Leader is partitioned from a replica, but not from ZK
• Leader sends recovery requests to the replica (with retries)
• If the replica went down, it will go through the normal recovery process anyway
• If the replica is partitioned but up, it will still serve stale reads :(
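A toy sketch of the leader's side of this: it notices a failed forward, remembers the replica, and retries the recovery request. The Leader class, its methods, and the use of ConnectionError to model the partition are hypothetical; Solr also records LIR state in ZooKeeper, which is not modeled here.

```python
class Leader:
    """Forwards updates to replicas and asks unreachable ones to recover."""

    def __init__(self, replicas):
        # name -> callable(update); raises ConnectionError when the replica is unreachable
        self.replicas = replicas
        self.needs_recovery = set()

    def forward(self, update):
        for name, deliver in self.replicas.items():
            try:
                deliver(update)
            except ConnectionError:
                self.needs_recovery.add(name)  # replica missed this update

    def request_recovery(self, name, send, max_attempts=3):
        """Call send(name) until it succeeds; return attempts used, or None if all failed."""
        for attempt in range(1, max_attempts + 1):
            try:
                send(name)
            except ConnectionError:
                continue
            self.needs_recovery.discard(name)
            return attempt
        return None


def reachable(update):
    pass


def partitioned(update):
    raise ConnectionError("no route to replica")


leader = Leader({"replica1": reachable, "replica2": partitioned})
leader.forward({"id": "doc1"})
print(leader.needs_recovery)
```

If every attempt fails, the replica never hears that it is out of date, which is how a partitioned but running replica ends up serving stale reads.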
LIR problems - SOLR-9555
• Race condition between LIR and standard recovery
• Mike Drob’s patch is almost done
• It also solves the partitioned-replica problem by using ZK watches
AutoAddReplica
• Uses a shared file system (e.g. HDFS)
  • Provides durability
  • Instances share index folders
• Moves cores to live nodes on failure
  • Uses the same index folder
• Pros
  • Durability with replication factor 1
  • Handles permanent node loss
• Cons
  • Still no HA or read scalability with a single replica
• Lots of fixes from Mark Miller lately
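The "move cores to live nodes, use same index folder" idea can be sketched as a reassignment function. The reassign_cores name, the dict layout, the least-loaded placement rule, and the hdfs:// paths are hypothetical; Solr's actual placement logic is not described in these slides.

```python
def reassign_cores(cores, live_nodes):
    """Move cores from dead nodes to the least-loaded live node, keeping data_dir."""
    if not live_nodes:
        raise ValueError("no live nodes to move cores to")
    load = {node: 0 for node in live_nodes}
    for core in cores:
        if core["node"] in load:
            load[core["node"]] += 1
    result = []
    for core in cores:
        if core["node"] in load:
            result.append(dict(core))  # node is alive, nothing to do
            continue
        target = min(sorted(load), key=load.get)  # least loaded, ties broken by name
        load[target] += 1
        # Only the hosting node changes; the index stays on the shared file system.
        result.append({**core, "node": target})
    return result


cores = [
    {"name": "core1", "node": "node1", "data_dir": "hdfs://namenode/solr/core1"},
    {"name": "core2", "node": "node2", "data_dir": "hdfs://namenode/solr/core2"},
]
print(reassign_cores(cores, ["node2", "node3"]))  # node1 has failed
```

No index data is copied here, which is why this gives durability at replication factor 1 but not availability while the core is being moved.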
Summary
• Details about SolrCloud cluster
• Help to improve!
• PlantUML is a cool tool for documentation
Thank you
E: manokovacs@cloudera.com
T: @manokovacs
