Deploying and managing Solr at Scale
Who am I?
• Anshum Gupta, Apache Lucene/Solr committer, Lucidworks employee.
• Interested in search and related stuff.
• Working with Apache Lucene since 2006 and Solr since 2010.
• Organizations I am or have been a part of:
Apache Solr has a huge install base and tremendous momentum
• The most widely used search solution on the planet
• 8M+ total downloads
• Solr is both established and growing: 250,000+ monthly downloads
• Tens of thousands of applications in production
• You use Solr every day
• 2,500+ open Solr jobs
Activity Summary (via https://www.openhub.net/p/solr)
30 Day Summary (Dec 06, 2014 - Jan 05, 2015)
• 135 Commits
• 17 Contributors
12 Month Summary (Jan 5, 2014 - Jan 5, 2015)
• 1363 Commits
• 30 Contributors
Getting started with Solr
• Download
• Untar/Unzip
• bin/solr start -e cloud -noprompt
• open http://localhost:8983/solr
Recent usability improvements
• Start scripts
• Schema APIs
• Config API - Register custom handlers using API
• Status APIs and more
SolrCloud Architecture
[Diagram: Shard 1 and Shard 2, each with a leader and followers, coordinated through a ZooKeeper ensemble]
Multiple Nodes = Need for Coordination
Production scale?
• ZooKeeper ensemble, NOT the embedded ZooKeeper
• Multiple nodes
• Manually run (or script) the 4 getting-started steps for each node?
Solr Scale Toolkit
• Open Source!
• Fabric (Python) toolset for deploying and managing SolrCloud clusters in the cloud
• Code to support benchmark tests (Pig script for data generation / indexing, JMeter samplers)
• EC2 for now, more cloud providers coming soon via Apache libcloud
• No *need* to know Python!
The building blocks: A lot of Python!
• boto – Python API for AWS (EC2, S3, etc.)
• Fabric – Python-based tool for automating system admin tasks over SSH
• pysolr – Python library for Solr (sending commits, queries, ...)
• kazoo – Python client tools for ZooKeeper
• Supporting Cast:
• JMeter – run tests, generate reports
• collectd – system monitoring
• Logstash4Solr – log aggregation
• JConsole/VisualVM – monitor JVM during indexing / queries
Overview of features:
• Provisioning N machine instances in EC2
• Configuring / starting ZooKeeper (1 to n servers)
• Configuring / starting N Solr instances in cloud mode (M x N nodes)
• Integrating with Logstash4Solr and other supporting services, e.g. collectd
• Day-to-day operations on an existing cluster
Architecture
[Diagram: Solr-Scale-Toolkit architecture. N x M SolrCloud nodes: M machines built from the custom AMI, each running Solr nodes 1 to N (ports 8983 to 89xx) with several cores per node. ZK ensemble: ZooKeeper 1 to N on ZK hosts 1 to N. Meta node: SiLK, with system monitoring of the M machines via collectd and JMX.]
Provisioning cluster nodes
• Custom built AMI (one for PV instances and one for HVM instances) – Amazon Linux
• Dedicated disk per Solr node
• Launch and then poll status until they are live
• Verify SSH connectivity
• Tag each instance with a cluster ID and username
fab new_ec2_instances:test1,n=3,instance_type=m3.xlarge
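The "launch and then poll status until they are live" step above can be sketched as a generic polling loop. This is a minimal sketch under stated assumptions, not the toolkit's actual code: `wait_until_live`, `get_state`, and the state string "running" are illustrative names, and in a real deployment `get_state` would wrap an EC2 status lookup (for example through boto).

```python
import time


def wait_until_live(instance_ids, get_state, timeout_secs=300, poll_secs=5,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll get_state(instance_id) until every instance reports "running".

    get_state is a caller-supplied (hypothetical) function. Returns the set
    of live instance ids; raises TimeoutError if the deadline passes first.
    """
    pending = set(instance_ids)
    live = set()
    deadline = clock() + timeout_secs
    while pending:
        for instance_id in sorted(pending):
            if get_state(instance_id) == "running":
                live.add(instance_id)
        pending -= live
        if not pending:
            break
        if clock() >= deadline:
            raise TimeoutError(
                "not live after %ss: %s" % (timeout_secs, sorted(pending)))
        sleep(poll_secs)  # back off before asking again
    return live
```

The clock and sleep functions are parameters only so the loop can be exercised without real waiting; SSH verification and tagging would follow once this returns.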
Deploy ZooKeeper ensemble
• Two options to use the ensemble:
• Provision 1 to N nodes when you launch the Solr cluster
• Use an existing named ensemble
• Fabric command simply creates the myid files and zoo.cfg file for the ensemble, plus some cron scripts for managing snapshots
• Basic health checking of ZooKeeper status:
• echo srvr | nc localhost 2181
fab new_zk_ensemble:zk1,n=3
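To make the "creates the myid files and zoo.cfg file" step concrete, here is a minimal sketch of what generating those files could look like. It is not the toolkit's code: the function names, the host names, the data directory, and the timing values (taken from ZooKeeper's sample configuration) are assumptions for illustration; the 2888/3888 peer and election ports are the conventional defaults.

```python
def render_zoo_cfg(hosts, data_dir="/var/lib/zookeeper", client_port=2181):
    """Return zoo.cfg text for an ensemble made of the given hosts."""
    lines = [
        "tickTime=2000",   # values below mirror ZooKeeper's sample config
        "initLimit=10",
        "syncLimit=5",
        "dataDir=%s" % data_dir,          # assumed path
        "clientPort=%d" % client_port,
    ]
    # One server.N line per ensemble member; N must match that host's myid.
    for server_id, host in enumerate(hosts, start=1):
        lines.append("server.%d=%s:2888:3888" % (server_id, host))
    return "\n".join(lines) + "\n"


def render_myid_files(hosts):
    """Map each host to the contents of its dataDir/myid file."""
    return {host: "%d\n" % server_id
            for server_id, host in enumerate(hosts, start=1)}


def zk_connect_string(hosts, client_port=2181):
    """Build the comma-separated zkHost string that Solr nodes are given."""
    return ",".join("%s:%d" % (host, client_port) for host in hosts)
```

Every host gets the same zoo.cfg but a different myid, which is why the two are generated separately.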
Deploy SolrCloud cluster
• Uses bin/solr in Solr 4.10 to control Solr nodes
• Set system props: jetty.port, host, zkHost, JVM opts
• One or more Solr nodes per machine
• JVM mem opts dependent on instance type and # of Solr nodes per instance
• Optionally configure log4j.properties to append messages to RabbitMQ for SiLK integration
fab new_solrcloud:test1,zk=zk1,nodesPerHost=2
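As a rough illustration of "one or more Solr nodes per machine", the sketch below builds one `bin/solr start` command line per node, with consecutive ports starting at 8983. The function name, the fixed heap size, and the host and ZooKeeper addresses are assumptions; the flags (`-c`, `-h`, `-p`, `-z`, `-m`) follow the `bin/solr start` usage as I understand it, so check `bin/solr start -help` for your version. The sketch also ignores the separate directory each node needs when several run on one machine.

```python
def solr_start_commands(host, zk_host, nodes_per_host=2, base_port=8983,
                        heap_per_node="4g", solr_bin="bin/solr"):
    """Return one argument list per Solr node to start on this host.

    heap_per_node is a placeholder; the slides say the real value depends
    on the instance type and the number of nodes per instance.
    """
    commands = []
    for node_index in range(nodes_per_host):
        port = base_port + node_index  # 8983, 8984, ...
        commands.append([
            solr_bin, "start",
            "-c",                 # SolrCloud mode
            "-h", host,           # host name registered in ZooKeeper
            "-p", str(port),      # Jetty port for this node
            "-z", zk_host,        # ZooKeeper connection string
            "-m", heap_per_node,  # JVM heap size
        ])
    return commands


if __name__ == "__main__":
    for cmd in solr_start_commands("10.0.0.5", "zk1:2181,zk2:2181,zk3:2181"):
        print(" ".join(cmd))
```

Returning argument lists rather than one shell string keeps the commands safe to hand to `subprocess.run` or to a remote-execution tool without extra quoting.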
Demo
• Launch ZooKeeper Ensemble
• 3 nodes to establish quorum
• Launch SolrCloud cluster
• Create new collection and index some docs
• Run a healthcheck on the collection
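Why 3 nodes for quorum: ZooKeeper needs a strict majority of the ensemble to be up. The arithmetic is small enough to sketch; the function names are mine, not part of any library.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can be lost while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

A 3-node ensemble survives one failure; a 4-node ensemble still survives only one, which is why odd sizes are the usual choice.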
Dashboards
Other useful stuff
• Patch from a local build
• fab mine: See clusters I’m running (or for other users too)
• fab kill_mine: Terminate all instances I’m running
• fab ssh_to: Quick way to SSH to one of the nodes in a cluster
• fab stop/recover/kill: Basic commands for controlling specific Solr nodes in the cluster
• fab jmeter: Execute a JMeter test plan against your cluster
• Example test plan and Java sampler are included with the source
Testing Methodology
• Transparent repeatable results
• Ideally hoping for something owned by the community
• Synthetic docs ~ 1K each on disk, mix of field types
• Data set created using code borrowed from PigMix
• English text fields generated using a Zipfian distribution
• Java 1.7u67, Amazon Linux, r3.2xlarge nodes
• enhanced networking enabled, placement group, same AZ
• Stock Solr (cloud) 4.10
• Using custom GC tuning parameters and auto-commit settings
• Use Elastic MapReduce to generate indexing load
• As many nodes as I need to drive Solr!
Indexing performance
Cluster Size   # of Shards   # of Replicas   Reducers   Time (secs)   Docs / sec
10             10            1               48         1762          73,780
10             10            2               34         3727          34,881
10             20            1               48         1282          101,404
10             20            2               34         3207          40,536
10             30            1               72         1070          121,495
10             30            2               60         3159          41,152
15             15            1               60         1106          117,541
15             15            2               42         2465          52,738
15             30            1               60         827           157,195
15             30            2               42         2129          61,062
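The table can be sanity-checked with a little arithmetic: multiplying time by docs/sec gives the number of documents indexed in each run, and dividing the 1-replica rate by the 2-replica rate shows how much the second replica costs. The sketch below uses only the numbers from the table; the function names are mine, and note that the reducer counts differ between paired runs, so the comparison is indicative rather than controlled.

```python
# (cluster_size, shards, replicas, reducers, time_secs, docs_per_sec)
ROWS = [
    (10, 10, 1, 48, 1762, 73780),
    (10, 10, 2, 34, 3727, 34881),
    (10, 20, 1, 48, 1282, 101404),
    (10, 20, 2, 34, 3207, 40536),
    (10, 30, 1, 72, 1070, 121495),
    (10, 30, 2, 60, 3159, 41152),
    (15, 15, 1, 60, 1106, 117541),
    (15, 15, 2, 42, 2465, 52738),
    (15, 30, 1, 60, 827, 157195),
    (15, 30, 2, 42, 2129, 61062),
]


def total_docs(row):
    """Approximate docs indexed in a run: time (secs) * docs/sec."""
    return row[4] * row[5]


def replication_slowdown(rows):
    """Map (cluster_size, shards) to rate with 1 replica / rate with 2."""
    rates = {}
    for cluster, shards, replicas, _reducers, _secs, rate in rows:
        rates.setdefault((cluster, shards), {})[replicas] = rate
    return {key: by_rep[1] / by_rep[2] for key, by_rep in sorted(rates.items())}


if __name__ == "__main__":
    for row in ROWS:
        print(row[:3], "about %.1fM docs" % (total_docs(row) / 1e6))
    for key, factor in replication_slowdown(ROWS).items():
        print(key, "slowdown with 2 replicas: %.2fx" % factor)
```

Each run works out to roughly the same document count, and in every pairing the 2-replica run indexes at less than half the rate of the 1-replica run.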
Indexing performance lessons
• Solr has no built-in throttling support – will accept work until it falls over; need to build this into your indexing application logic
• Oversharding helps parallelize indexing work and gives you an easy way to add more hardware to your cluster
• GC tuning is critical
• Auto-hard commit to keep transaction logs manageable
• Auto soft-commit to see docs as they are indexed
• Replication is expensive! (Work in progress, SOLR-6816)
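Since throttling has to live in the indexing application, one simple option is a client-side rate limiter wrapped around batch sends. The sketch below is one possible approach, not something the slides prescribe: `DocRateLimiter`, `index_in_batches`, and `send_batch` are hypothetical names, and `send_batch` stands in for whatever actually posts documents (for example a pysolr `add` call).

```python
import time


class DocRateLimiter:
    """Client-side limiter that caps the average docs/sec sent to Solr."""

    def __init__(self, max_docs_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(max_docs_per_sec)
        self.clock = clock
        self.sleep = sleep
        self.next_free = clock()  # earliest time the next batch may be sent

    def acquire(self, n_docs):
        """Block until n_docs may be sent; return the seconds waited."""
        now = self.clock()
        wait = max(0.0, self.next_free - now)
        if wait > 0:
            self.sleep(wait)
        # Reserve the time slice this batch uses up at the target rate.
        self.next_free = max(now, self.next_free) + n_docs / self.rate
        return wait


def index_in_batches(docs, send_batch, limiter, batch_size=500):
    """Send docs through send_batch in batches, pausing to respect the limiter."""
    sent = 0
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            limiter.acquire(len(batch))
            send_batch(batch)
            sent += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        limiter.acquire(len(batch))
        send_batch(batch)
        sent += len(batch)
    return sent
```

A fixed rate is the crudest form of back-pressure; a real indexer might also slow down when requests start failing or timing out.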
Query Performance
• Still a work in progress!
• Sustained QPS & Execution time of 99th Percentile
• Stable: ~5,000 QPS / 99th at 300ms while indexing ~10,000 docs / sec
• Using the TermsComponent to build queries based on the terms in each field.
• Harder to accurately simulate user queries over synthetic data
• Need mix of faceting, paging, sorting, grouping, boolean clauses, range queries, boosting, filters (some cached, some not), etc.
• Start with one server (1 shard) to determine baseline query performance.
• Look for inefficiencies in your schema and other config settings
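The two headline metrics above, sustained QPS and 99th-percentile execution time, are easy to compute from raw per-query timings. This is a minimal sketch using the nearest-rank percentile definition; the function names are mine, and a load tool such as JMeter may use a slightly different percentile method.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    if not values:
        raise ValueError("no values")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def sustained_qps(num_queries, elapsed_secs):
    """Average queries per second over the whole measurement window."""
    return num_queries / elapsed_secs


if __name__ == "__main__":
    latencies_ms = list(range(1, 101))  # stand-in data: 1 ms .. 100 ms
    print("p99 (ms):", percentile(latencies_ms, 99))
    print("QPS:", sustained_qps(len(latencies_ms), 0.02))
```

Reporting the 99th percentile alongside QPS matters because a good average can hide the slow tail caused by GC pauses or uncached filters.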
More on query performance…
• Higher risk of full GC pauses (facets, filters, sorting)
• Use optimized data structures (DocValues) for facet / sort fields, Trie-based numeric fields for range queries, facet.method=enum for low cardinality fields
• Add more replicas; load-balance
• -Dhttp.maxConnections=## (default = 5; increase to accommodate more threads sending queries)
• Avoid increasing the ZooKeeper client timeout; ~15000 ms (15 seconds) is about right
• Don’t just keep throwing more memory at Java! (e.g. -Xmx128G)
Roadmap
• Not just AWS
• No need for a custom AMI; configurable download paths and versions
Questions?
References
• Solr Scale Toolkit
• Blog: http://lucidworks.com/blog/introducing-the-solr-scale-toolkit/
• Podcast: http://solrcluster.podbean.com/e/tim-potter-on-the-solr-scale-toolkit/
• GitHub: https://github.com/LucidWorks/solr-scale-tk
Connect @
http://www.twitter.com/anshumgupta
http://www.linkedin.com/in/anshumgupta/
anshum@apache.org
