OCTOBER 11-14, 2016 • BOSTON, MA
Why is my Solr slow?
Mike Drob
Software Engineer, Cloudera
Who Am I?
● Apache HTrace (Incubating) Committer
● Software Engineer @ Cloudera
● Middle School LEGO Robotics Mentor
Outline
● The need for tracing
● HTrace introduction
● Examples with Solr
● Finding the slow server!
Scenario
● User complains that queries are “slow”
● Caches?
● Or faceting?
● Or sorting?
● Or...?
The State of the Art
Run a query with debug=timing:
"timing":{
  "time":178.0,
  "prepare":{"time":7.0,
    "query":{"time":4.0}},
  "process":{"time":160.0,
    "query":{"time":118.0},
    "facet":{"time":33.0}}}
What is Apache HTrace
● Distributed Tracing Library
● Primitive Type is the “Span”
  ● Unique ID
  ● Source + Description
  ● Start/Stop Time
  ● Additional Metadata
● See Also: Dapper, Zipkin, Wingtips
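To make the span fields concrete, here is a rough sketch of a span as a plain data holder. SketchSpan is a hypothetical class written for this illustration; it is not the HTrace Span type, and the node name and field layout are assumptions:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical stand-in for illustration; this is NOT the HTrace Span class.
final class SketchSpan {
    final long id;                 // unique ID (random here for simplicity)
    final String source;           // which process produced the span
    final String description;      // which unit of work it covers
    final long startMillis;        // start time
    long stopMillis = -1;          // stop time; -1 while the span is still running
    final Map<String, String> metadata = new LinkedHashMap<>(); // additional metadata

    SketchSpan(String source, String description, long startMillis) {
        this.id = ThreadLocalRandom.current().nextLong();
        this.source = source;
        this.description = description;
        this.startMillis = startMillis;
    }

    void stop(long stopMillis) {
        this.stopMillis = stopMillis;
    }

    long durationMillis() {
        if (stopMillis < 0) {
            throw new IllegalStateException("span still running");
        }
        return stopMillis - startMillis;
    }
}

class SpanDemo {
    public static void main(String[] args) {
        // "solr-node-1" is an invented source name.
        SketchSpan span = new SketchSpan("solr-node-1", "query", 1_000L);
        span.metadata.put("q", "*:*");
        span.stop(1_118L);
        System.out.println(span.description + " on " + span.source
                + " took " + span.durationMillis() + " ms " + span.metadata);
    }
}
```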
Reference Architecture
[Diagram: four Application nodes, each with a Trace Gen component, feeding a single Collector]
Example Architecture
[Diagram: four Solr nodes, each emitting trace data to a single htraced instance]
How Do I Trace?
● Can Trace Any “Unit Of Work”
  ● Method Calls
  ● Threads
  ● RPCs
Before:
doQuery();
After:
try (TraceScope scope = tracer.newScope("work")) {
  doQuery();
}
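The try-with-resources form matters because the scope is closed, and the span stopped, even when the traced work throws. A minimal sketch of that mechanism follows. SketchScope is a hypothetical stand-in, not HTrace's TraceScope, and its FINISHED list only takes the place of a real span receiver:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tracing scope; NOT the HTrace TraceScope class.
final class SketchScope implements AutoCloseable {
    // Finished scopes are recorded here in place of a real span receiver.
    static final List<String> FINISHED = new ArrayList<>();

    private final String description;
    private final long startNanos = System.nanoTime();

    SketchScope(String description) {
        this.description = description;
    }

    @Override
    public void close() {
        // Runs when the try block exits, normally or via an exception.
        long elapsedNanos = System.nanoTime() - startNanos;
        FINISHED.add(description);
        System.out.println(description + " took " + elapsedNanos + " ns");
    }
}

class ScopeDemo {
    static void doQuery() {
        // Placeholder for the real unit of work.
    }

    static void tracedQuery() {
        try (SketchScope scope = new SketchScope("work")) {
            doQuery();
        }
    }

    public static void main(String[] args) {
        tracedQuery();
    }
}
```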
What Does It Look Like?
● Trace of q=*:* and resulting sub-queries
● Data modeled on debug=timing
Aside: Benefits over debug=timing
● Separate time-lines for each server
● Ability to mark annotations
● Data stored for later analysis
● Trace non-query requests
● Trace calls to other systems
Something More Complex
techproducts/select?q=*:*&facet.field=popularity
And Other Commands
Configuration
● New section in solr.xml
<trace>
  <str name="span.receiver.classes">org.apache.htrace.impl.HTracedSpanReceiver</str>
  <str name="htraced.receiver.address">trace-server:9075</str>
  <str name="sampler.classes">org.apache.htrace.core.ProbabilitySampler</str>
</trace>
● Sampler Options
● Advanced: Force Trace
… traceid=<64-bit> …
● Advanced: Buffer Options
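A probability sampler decides per request whether to trace it, so only a configured fraction of requests pays the tracing cost. The idea can be sketched as below. SketchProbabilitySampler is a hypothetical class for illustration, not HTrace's ProbabilitySampler, and the 1% fraction is an arbitrary choice:

```java
import java.util.Random;

// Hypothetical sketch of probability-based sampling; NOT HTrace's ProbabilitySampler.
final class SketchProbabilitySampler {
    private final double fraction;
    private final Random random;

    SketchProbabilitySampler(double fraction, Random random) {
        if (fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in [0, 1]");
        }
        this.fraction = fraction;
        this.random = random;
    }

    // Returns true when the current request should be traced.
    boolean next() {
        // nextDouble() is uniform in [0, 1), so this is true with probability 'fraction'.
        return random.nextDouble() < fraction;
    }
}

class SamplerDemo {
    public static void main(String[] args) {
        SketchProbabilitySampler sampler =
                new SketchProbabilitySampler(0.01, new Random(42));
        int requests = 100_000;
        int traced = 0;
        for (int i = 0; i < requests; i++) {
            if (sampler.next()) {
                traced++;
            }
        }
        System.out.println("traced " + traced + " of " + requests + " requests");
    }
}
```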
Performance
● Each trace adds ~25ns of overhead
● Search handler can generate ~20 traces
● Traces are buffered in memory
● May consider multiple receivers
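Taking the two approximate figures above at face value, the per-request cost is easy to estimate. A small sketch of the arithmetic (the class name is invented, and the 178 ms request time is borrowed from the earlier debug=timing example purely for scale):

```java
// Back-of-the-envelope estimate using the approximate figures above.
class TraceOverhead {

    // Total tracing overhead for one request, in nanoseconds.
    static long overheadNanos(int tracesPerRequest, long nanosPerTrace) {
        return (long) tracesPerRequest * nanosPerTrace;
    }

    public static void main(String[] args) {
        long overhead = overheadNanos(20, 25L);   // ~20 traces at ~25 ns each
        long requestNanos = 178L * 1_000_000L;    // 178 ms, from the debug=timing example
        double percent = 100.0 * overhead / requestNanos;
        System.out.println("overhead per request: " + overhead + " ns");
        System.out.printf("share of a 178 ms request: %.6f%%%n", percent);
    }
}
```

At roughly 500 ns per search request, the in-process cost is negligible next to the query itself; buffering and shipping spans to the receiver is where any real cost would show up.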
Future Work
● Add tracing to clients (SolrJ)
● Leverage MDC logging
● Async Requests
Demo
Demo (Backup)
"responseHeader":{
"zkConnected":true,
"status":0,
"QTime":5251,
"params":{
"q":"*:*",
"indent":"on",
"wt":"json",
"_":"1476334957442"}},
Thank you
mdrob@apache.org
@mikhaildrob
