Mastering Solr
   Jur de Vries
Who am I?

Developer/architect at Triquanta
Trainer at Wizzlern
Use case

Market place
Advertisements
Adjust relevancy
Paid boosting of ads
Of course we use Drupal and Apache Solr
Running Solr Locally

Download latest version (3.6)
Be sure to download the binary distribution (not src)
Unpack Solr
Go to example directory
Run
 java -jar start.jar
Drupal: which contrib?

2 Possibilities
  Apache Solr Search
  Search API with Solr backend
Apache Solr Search

Strengths
  Supported by Acquia
  Easy to set up
  Mature
Weaknesses
  Integration with views (still in dev)
Search API

Strengths
  Flexible
  Indexes all entities
  Excellent views integration
  Related fields are easy to add to index
Weaknesses
  Not supported (yet) by Acquia
  Solr backend has some issues
Drupal: which contrib?

Apache Solr Search Integration
  Quick setup
  Acquia
Search API
  Exportable configuration
  Views integration
  Index all entities
Depends on your needs
Basic use of Search API

Create server
Create index
  Select fields to index
  Define data alterations
  Define processors
Start indexing
Field types

Integer, date, boolean
String or fulltext?
  Fulltext will get processed!
      Tokenize
      Stopwords
      Ignore case
  String is as is
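
A minimal Python sketch of what "fulltext will get processed" means. The tokenizer, the stopword list, and the function names are illustrative assumptions, not Solr's actual analyzers; the point is only the contrast with a string field, which is kept as is.

```python
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "for", "of", "or", "the"}

def analyze_fulltext(value):
    """Mimic a simple fulltext chain: tokenize, ignore case, drop stopwords."""
    tokens = re.findall(r"\w+", value)                 # tokenize on word characters
    tokens = [t.lower() for t in tokens]               # ignore case
    return [t for t in tokens if t not in STOPWORDS]   # remove stopwords

def analyze_string(value):
    """A string field is indexed as one unmodified token."""
    return [value]

print(analyze_fulltext("The Bike and the Trailer for Sale"))
print(analyze_string("The Bike and the Trailer for Sale"))
```

A search for "bike" can match the fulltext version; the string version only matches the complete, exact value.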
Demo

Run Solr
Copy schema.xml and solrconfig.xml from the Search API Solr module into Solr's conf directory (!)
Create server
Create index
Create view
  ads
  Ad filter exposed: search
Advanced use of Search API

This talk is about Solr, not about Search API
Understand Solr first!
Many resources on the web
Watch screencasts etc
Mastering Solr

Mastering Solr is understanding Solr
What happens after the Drupal module hands over the search?
Let's have a look at the request
Solr request

Look at the Solr log
Parameters:
  start (offset of first result)
  rows (number of results)
  q (query)
  qf (query fields)
  fl (fields to return)
  fq (filter query)
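
The parameters above can be put together in a small Python sketch that builds a request URL. The base URL, the field names, and the filter value are assumptions for a local test setup, not values taken from a real index.

```python
from urllib.parse import urlencode

# Assumed local Solr URL; adjust to your own setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"

def build_select_url(keywords, page=0, per_page=10):
    """Build a Solr select URL from the basic request parameters."""
    params = [
        ("q", keywords),                # the user's query
        ("qf", "t_title t_body"),       # fields to search in (example names)
        ("fl", "item_id,score"),        # fields returned for each hit
        ("fq", "index_id:ads"),         # filter: narrows results without scoring
        ("start", page * per_page),     # offset of the first result
        ("rows", per_page),             # number of results to return
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)

print(build_select_url("red bike", page=2))
```

Pasting such a URL into the browser is a quick way to compare what Drupal sends with what you expect.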
Field names

item_id, id
t_..., ss_..., etc. → why?
Solr has to know how to handle fields
Field API: field names differ
Dynamic field names: the prefix tells Solr the field type!
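
A rough Python sketch of how a name prefix can select a field type. The pattern list is an assumption in the spirit of dynamicField rules (only t_ and ss_ come from the slides); the real rules live in the schema.xml you installed, and Solr applies its own precedence when patterns overlap.

```python
from fnmatch import fnmatchcase

# Illustrative prefix rules; these patterns do not overlap.
DYNAMIC_FIELDS = [
    ("t_*", "text"),      # fulltext, analyzed
    ("ss_*", "string"),   # string, single-valued
    ("is_*", "integer"),  # hypothetical integer prefix
]

def resolve_field_type(field_name):
    """Return the type of the first pattern that matches the field name."""
    for pattern, field_type in DYNAMIC_FIELDS:
        if fnmatchcase(field_name, pattern):
            return field_type
    return None  # no matching rule for this name

print(resolve_field_type("t_title"))
print(resolve_field_type("ss_category"))
```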
Schema.xml

Defines field types and fields
The real tweaking starts here!
Let's have a look!
  dynamicField
  field type
  analyzers
  copyField
What can you do in schema.xml?

Synonyms (disabled by default)
Stopwords (and, or, etc)
Stemming
Proper multilingual handling
Browse the schema

Solr offers schema browsing
Go to: http://localhost:8983/solr/admin
Search relevancy

Types of boosting:
  Field level boost
  Boost function
  Boost query
  (QueryElevation)
Boost parameters

Field level boosting: qf
   qf=t_body^20
   score in field is multiplied by 20
Boost function: bf
   bf=product(fieldname,2)
   result of function is added to score
Boost query: bq
   score of the boost query is added to score
Multiplicative boost: boost (edismax only)
   like bf, but the result is multiplied with the score
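
A toy Python model of how the boost types differ: qf multiplies per field, bf adds, and boost (edismax) multiplies. This is a deliberate simplification, not Lucene's real scoring (no tie breaker, no normalization), and the function name and numbers are made up for illustration.

```python
def boosted_score(field_scores, qf_boosts, bf_value=0.0, boost_value=1.0):
    """Combine per-field scores with qf, bf and boost in a simplified way."""
    # qf: the score in each field is multiplied by that field's boost factor
    weighted = [
        score * qf_boosts.get(field, 1.0)
        for field, score in field_scores.items()
    ]
    base = max(weighted)  # dismax keeps the best matching field
    # bf: function result is added; boost: function result is multiplied
    return (base + bf_value) * boost_value

scores = {"t_title": 0.5, "t_body": 0.25}
print(boosted_score(scores, {"t_body": 20}))
print(boosted_score(scores, {"t_body": 20}, bf_value=1.5))
print(boosted_score(scores, {"t_body": 20}, bf_value=1.5, boost_value=2.0))
```

An additive boost matters less as base scores grow, while a multiplicative boost keeps the same relative effect.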
Let's boost title

Field level boost is incorporated in Search API...
But, where are the numbers in the request???
Search API Solr forgot to add them!
There is a patch :-)
But let's do it another way...
Debugging Solr

Let's add &echoParams=all to the request...
Where do all these parameters come from?
Solrconfig.xml!!!
Among other things: request handler
Let's look at the dismax request handler
Solrconfig.xml

(Default) Request handler:
  Default parameters
  Add Spellcheck
  Tweak all kinds of search behavior!
  Let's add default search fields with boost
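
To make "default search fields with boost" concrete, here is a Python sketch that reads the defaults of a request handler. The XML fragment is a hypothetical example of the usual requestHandler layout; the handler name, field names, and boost values are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical request handler fragment in the style of solrconfig.xml.
SOLRCONFIG_FRAGMENT = """
<requestHandler name="/select" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <str name="qf">t_title^5 t_body^1</str>
  </lst>
</requestHandler>
"""

def handler_defaults(xml_text):
    """Return the default parameters of a request handler as a dict."""
    handler = ET.fromstring(xml_text)
    defaults = handler.find("lst[@name='defaults']")
    return {node.get("name"): node.text for node in defaults}

def parse_qf(qf):
    """Split a qf value such as 't_title^5 t_body^1' into {field: boost}."""
    boosts = {}
    for part in qf.split():
        field, _, boost = part.partition("^")
        boosts[field] = float(boost) if boost else 1.0
    return boosts

defaults = handler_defaults(SOLRCONFIG_FRAGMENT)
print(defaults["defType"])
print(parse_qf(defaults["qf"]))
```

Defaults set this way apply to every request that does not override them, which is why they show up when echoParams is switched on.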
Boost function

Mathematical functions on field values
Available functions:
  sum(x,y): x + y
  product(x,y): x * y
  scale(x, minTarget, maxTarget)
  recip(x, m, a, b): a / (m * x + b)
  ms(a, b): time difference in milliseconds → ms(NOW/DAY, created)
  Many more!
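
A small Python sketch of three of these functions, to make the arithmetic tangible. recip follows the definition in Solr's function query documentation, a / (m*x + b). The scale and ms versions are simplified stand-ins: scale here works on a plain list of values, and ms takes two datetime objects.

```python
from datetime import datetime, timezone

def recip(x, m, a, b):
    """Reciprocal function: a / (m*x + b)."""
    return a / (m * x + b)

def scale(values, min_target, max_target):
    """Linearly map a list of values onto [min_target, max_target]."""
    low, high = min(values), max(values)
    if high == low:
        return [min_target for _ in values]
    factor = (max_target - min_target) / (high - low)
    return [min_target + (v - low) * factor for v in values]

def ms(later, earlier):
    """Milliseconds between two datetimes, like ms(NOW, created)."""
    return (later - earlier).total_seconds() * 1000.0

print(recip(1000, 1, 1000, 1000))
print(scale([0, 5, 10], 1, 2))
print(ms(datetime(2012, 1, 2, tzinfo=timezone.utc),
         datetime(2012, 1, 1, tzinfo=timezone.utc)))
```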
Boost date

We need ms(): big values!
Linear? Too much difference
Recip!
recip(x,1,1000,1000)
if x = 1000: half (if x = 0: 1)
1 year ≈ 3.1e10 ms
recip(ms(NOW/YEAR?, created),1,3.1e10,3.1e10)
bf=recip(ms(NOW/YEAR?, created),1,3.1e10,3.1e10)^3
Use a graphing tool!
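
If no graphing tool is at hand, a few lines of Python show the shape of the curve. This sketch assumes recip(x, m, a, b) = a / (m*x + b) with m = 1 and a = b = one year in milliseconds (the exact value rather than the rounded 3.1e10), and it ignores the ^3 weight.

```python
MS_PER_YEAR = 365 * 24 * 60 * 60 * 1000  # 31,536,000,000 ms, roughly 3.1e10

def recip(x, m, a, b):
    """Reciprocal function: a / (m*x + b)."""
    return a / (m * x + b)

def date_boost(age_in_years):
    """Boost for a document of the given age, using a one-year scale."""
    age_ms = age_in_years * MS_PER_YEAR
    return recip(age_ms, 1, MS_PER_YEAR, MS_PER_YEAR)

for years in (0, 1, 2, 5, 10):
    print(f"{years:>2} years old -> boost {date_boost(years):.3f}")
```

With these constants the boost is 1 for a brand-new document, halves after one year, and falls to a third after two, so old content fades gradually instead of dropping off linearly.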
Boost queries

Same query syntax as fq:
Boost ads:
  content_type:add
  bq=content_type:add
  bq=(content_type:add)^20
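
A toy Python comparison of fq and bq on the same clause. The documents, the scores, and the fixed bonus are invented for illustration; in Solr the extra score comes from the boost query itself.

```python
def apply_fq(docs, field, value):
    """fq: keep only matching documents; scores are untouched."""
    return [d for d in docs if d[field] == value]

def apply_bq(docs, field, value, bonus):
    """bq: keep every document, but raise the score of matching ones."""
    boosted = [
        dict(d, score=d["score"] + bonus) if d[field] == value else dict(d)
        for d in docs
    ]
    return sorted(boosted, key=lambda d: d["score"], reverse=True)

docs = [
    {"id": 1, "content_type": "page", "score": 1.2},
    {"id": 2, "content_type": "add", "score": 1.0},
]
print([d["id"] for d in apply_fq(docs, "content_type", "add")])
print([d["id"] for d in apply_bq(docs, "content_type", "add", bonus=0.5)])
```

The filter drops document 1 entirely, while the boost query keeps it and only moves document 2 ahead of it.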
Debugging relevancy

We know how to boost
How can fine-tuning be done?
Solr has the solution:
  add debugQuery=on
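
A short Python helper that adds debugQuery=on to an existing request URL, for example one copied from the Solr log. The example URL is an assumption for a local setup.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def with_debug(url):
    """Return the same request URL with debugQuery=on set exactly once."""
    parts = urlsplit(url)
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "debugQuery"          # drop any earlier debugQuery value
    ]
    params.append(("debugQuery", "on"))
    return urlunsplit(parts._replace(query=urlencode(params)))

print(with_debug("http://localhost:8983/solr/select?q=bike&rows=10"))
```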
debugQuery=on

Screenshots: debug output in the normal browser view and in the page source view
Relevancy

Choose your boosting methods
Try in your browser
Fine-tuning: debugQuery=on, view the page source
Add parameters to solrconfig.xml
Or...
Add parameters in code

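Implement hook_search_api_solr_query_alter():
// "mymodule" is a placeholder for your own module's machine name.
function mymodule_search_api_solr_query_alter(array &$call_args, SearchApiQueryInterface $query) {
  // Boost query: documents whose title matches "foo" score higher.
  $call_args['params']['bq'] = '(t_title:foo)^20';
  // Boost function: add the value of the b_promote field to the score.
  $call_args['params']['bf'][] = 'b_promote';
}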
Override solr service class

In Search API: define server class
Extend the Solr service class
Only change key methods
It's all about passing parameters!
Conclusion

Tweak indexing in schema.xml
  Stopwords
  Multilingual
Tweak searching in solrconfig.xml
Tweak searching by passing parameters
This is only an introduction!
Questions?
Feedback & follow-up:
http://drupalcampgent.be/feedback
