Balancing the Dimensions of User Intent

Trey Grainger
Chief Algorithms Officer
October 28, 2019

Trey Grainger
Chief Algorithms Officer
• Previously: SVP of Engineering @ Lucidworks; Director of Engineering @ CareerBuilder
• Georgia Tech – MBA, Management of Technology
• Furman University – BA, Computer Science, Business, & Philosophy
• Stanford University – Information Retrieval & Web Search
Other fun projects:
• Co-author of Solr in Action, plus numerous research publications
• Advisor to Presearch, the decentralized search engine
• Lucene / Solr contributor
About Me

• About Lucidworks
• What is AI-powered Search?
• The Dimensions of User Intent
• Content Understanding:
• Keyword Search
• User Understanding:
• Collaborative Recommendations
• Content Understanding + User Understanding:
• Personalized Search
• Domain Understanding:
• Knowledge Graphs
• Domain Understanding + User Understanding:
• Domain-aware Matching
• Content Understanding + Domain Understanding:
• Semantic Search
• Balancing Approaches:
• Keyword vs. Vector vs. Knowledge Graph Search
• Vector Search
• Knowledge Graph Search
• Combining it all together
Agenda

Who are we?
300+ CUSTOMERS ACROSS THE
FORTUNE 1000
400+ EMPLOYEES
OFFICES IN
San Francisco, CA (HQ)
Raleigh-Durham, NC
Cambridge, UK
Bangalore, India
Hong Kong
The Search & AI Conference
COMPANY BEHIND
DEVELOPMENT, HOSTING, & SUPPORT

Proudly built with open-source
tech at its core: Apache Solr &
Apache Spark
Personalizes search
with applied
machine learning
Proven on the
world’s biggest
information systems

http://aiPoweredSearch.com
... is my new book!
(Haystack discount code: ctwhay19)

AI-powered Search
Question / Answer
Systems
Virtual Assistants
• Signals Boosting Models
• Learning to Rank
• Semantic Search
• Collaborative Filtering
• Personalized Search
• Content Clustering
• NLP / Entity Resolution
• Semantic Knowledge Graphs
• Document Classification
• etc.
• Neural Search
• Word Embeddings
• Vector Search
• Image / Voice Search
• etc.
• Question / Answer Systems
• Virtual Assistants
• Chatbots
• Rules-based Relevancy
• etc.

We have a big toolbox - great!

But how do we properly apply
those tools?

Dimensions of User Intent
Content
Understanding
Domain
Understanding
User
Understanding
User Intent

Keyword
Search
Content
Understanding
Domain
Understanding
User
Understanding
User Intent

/solr/collection/select/?q=apache solr
Term Documents
… …
apache doc1, doc3, doc4, doc5
…
lucene doc2, doc4, doc6
… …
solr doc1, doc3, doc4, doc7, doc8
… …
[Venn diagram: apache only: doc5; solr only: doc7, doc8; apache solr (both terms): doc1, doc3, doc4]
Matching queries to documents
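The term lookup on this slide can be sketched in a few lines of Python. This is only an illustration of the set logic, not how Solr or Lucene store postings internally; the function names `match_all` and `match_any` are made up for this sketch.

```python
# Toy inverted index holding the postings shown on the slide.
inverted_index = {
    "apache": {"doc1", "doc3", "doc4", "doc5"},
    "lucene": {"doc2", "doc4", "doc6"},
    "solr": {"doc1", "doc3", "doc4", "doc7", "doc8"},
}


def postings_for(query):
    """Look up the posting set of each whitespace-separated query term."""
    return [inverted_index.get(term, set()) for term in query.lower().split()]


def match_all(query):
    """Documents containing every query term (AND semantics)."""
    postings = postings_for(query)
    return set.intersection(*postings) if postings else set()


def match_any(query):
    """Documents containing at least one query term (OR semantics)."""
    postings = postings_for(query)
    return set.union(*postings) if postings else set()


print(sorted(match_all("apache solr")))
print(sorted(match_any("apache solr")))
```

The intersection corresponds to the overlapping region of the diagram, and the union to the whole of both circles.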

BM25 (Relevance Scoring between Query and Documents)
$$\mathrm{Score}(q, d) = \sum_{t \in q} \mathrm{idf}(t) \cdot \frac{\mathrm{tf}(t \in d) \cdot (k + 1)}{\mathrm{tf}(t \in d) + k \cdot \left(1 - b + b \cdot \frac{|d|}{\mathrm{avgdl}}\right)}$$
Where:
t = term; d = document; q = query; i = index
$\mathrm{tf}(t \in d) = \sqrt{\text{numTermOccurrencesInDocument}}$
$\mathrm{idf}(t) = 1 + \log\left(\frac{\text{numDocs}}{\text{docFreq} + 1}\right)$
$|d| = \sum_{t \in d} 1$
$\mathrm{avgdl} = \frac{\sum_{d \in i} |d|}{\sum_{d \in i} 1}$
k = Free parameter. Usually ~1.2 to 2.0. Increases term frequency saturation point.
b = Free parameter. Usually ~0.75. Increases impact of document normalization.
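A minimal sketch of the scoring formula above, written term by term from the slide's definitions. The tiny corpus and the function name `bm25_score` are assumptions for illustration; production engines compute these statistics from the index rather than by scanning documents.

```python
import math


def bm25_score(query_terms, doc_terms, corpus, k=1.2, b=0.75):
    """Score one tokenized document against a query, following the slide's definitions."""
    num_docs = len(corpus)
    avgdl = sum(len(d) for d in corpus) / num_docs  # average document length
    score = 0.0
    for t in query_terms:
        occurrences = doc_terms.count(t)
        if occurrences == 0:
            continue  # a term that is absent contributes nothing
        tf = math.sqrt(occurrences)
        doc_freq = sum(1 for d in corpus if t in d)
        idf = 1 + math.log(num_docs / (doc_freq + 1))
        length_norm = k * (1 - b + b * len(doc_terms) / avgdl)
        score += idf * (tf * (k + 1)) / (tf + length_norm)
    return score


# Hypothetical four-document corpus, already tokenized.
corpus = [
    ["apache", "solr", "search"],
    ["apache", "lucene"],
    ["solr", "in", "action"],
    ["cooking", "recipes"],
]
scores = [bm25_score(["apache", "solr"], doc, corpus) for doc in corpus]
print(scores)
```

The document matching both query terms scores highest, and the document matching neither scores zero.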

Keyword
Search
Content
Understanding
Domain
Understanding
Collaborative
Recommendations User
Understanding
User Intent

Collaborative Filtering (Recommendations)
User
Searches
User
Sees
Results
User
takes an
action
Users’ actions
inform system
improvements
User Query Results
Alonzo ipad doc10, doc22, doc12, …
Elena printer doc84, doc2, doc17, …
Ming ipad doc10, doc22, doc12, …
… … …
User Action Document
Alonzo click doc22
Elena click doc17
Ming click doc12
Alonzo purchase doc22
Ming click doc22
Ming purchase doc12
Elena click doc2
… … …
User Item Weight
Alonzo doc22 1.0
Alonzo doc12 0.4
… … …
Ming doc12 0.9
Ming doc22 0.6
… … …
ipad ⌕
Matrix Factorization
Recommendations for Alonzo:
• doc22: “iPad Pro”
• doc12: “Kindle Fire”
…
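The matrix factorization step can be sketched as follows. This is one simple variant (stochastic gradient descent on observed weights), not necessarily the algorithm used in any particular product. The user "Sam" and all hyperparameters are hypothetical; the other weights come from the User / Item / Weight table above.

```python
import random


def factorize(observations, rank=2, epochs=2000, lr=0.05, reg=0.01, seed=0):
    """Learn user and item factor vectors whose dot products approximate the weights."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in observations})
    items = sorted({i for _, i, _ in observations})
    P = {u: [rng.uniform(0.1, 0.6) for _ in range(rank)] for u in users}
    Q = {i: [rng.uniform(0.1, 0.6) for _ in range(rank)] for i in items}
    for _ in range(epochs):
        for u, i, w in observations:
            err = w - sum(p * q for p, q in zip(P[u], Q[i]))
            for f in range(rank):
                p, q = P[u][f], Q[i][f]
                # Gradient step with a small L2 penalty on both factors.
                P[u][f] += lr * (err * q - reg * p)
                Q[i][f] += lr * (err * p - reg * q)
    return P, Q


def predict(P, Q, user, item):
    """Estimated weight for a user-item pair, including pairs never observed."""
    return sum(p * q for p, q in zip(P[user], Q[item]))


observations = [
    ("Alonzo", "doc22", 1.0),
    ("Alonzo", "doc12", 0.4),
    ("Ming", "doc12", 0.9),
    ("Ming", "doc22", 0.6),
    ("Sam", "doc22", 0.9),  # hypothetical user with no signal yet for doc12
]
P, Q = factorize(observations)
print(predict(P, Q, "Sam", "doc12"))  # estimated weight for an unseen pair
```

Recommendations for a user are then the items with the highest predicted weights among those the user has not yet interacted with.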

Recommendations (User-Item, Item-Item, Query-Item)
User Item Weight
Alonzo doc22 1.0
Alonzo doc12 0.4
… … …
Ming doc12 0.9
Ming doc22 0.6
… … …
Recommendations for Alonzo:
…
Item Item Weight
doc22 doc22 1.0
doc22 doc12 0.85
… … …
doc12 doc12 1.0
doc12 doc22 0.83
… … …
Query Item Weight
ipad doc22 0.98
ipad doc12 0.6
… … …
kindle doc12 0.96
apple doc22 0.90
… … …
Recommendations for Doc22:
…
Recommendations for “ipad”:
…
Matrix Factorization

Keyword
Search
Knowledge Graph
Content
Understanding
Domain
Understanding
Collaborative
Understanding
User Intent

What is a Knowledge Graph?
(vs. Ontology vs. Taxonomy vs. Synonyms, etc.)

Overly Simplistic Definitions
Alternative Labels: Substitute words with identical meanings
[ CTO => Chief Technology Officer; specialise => specialize ]
Synonyms List: Provides substitute words that can be used to represent
the same or very similar things
[ human => homo sapiens, mankind; food => sustenance, meal ]
Taxonomy: Classifies things into Categories
[ john is Human; Human is Mammal; Mammal is Animal ]
Ontology: Defines relationships between types of things
[ animal eats food; human is animal ]
Knowledge Graph: Instantiation of an
Ontology (contains the things that are related)
[ john is human; john eats food ]
A Knowledge Graph subsumes the other types.
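One way to picture these definitions is as a set of (subject, predicate, object) triples. The sketch below uses the slide's own examples; the helper names `ancestors` and `has_relationship` are invented for illustration and do not correspond to any specific graph library.

```python
# Facts as (subject, predicate, object) triples, taken from the slide's examples.
triples = {
    ("john", "is", "human"),     # knowledge graph: an instance
    ("human", "is", "mammal"),   # taxonomy
    ("mammal", "is", "animal"),  # taxonomy
    ("animal", "eats", "food"),  # ontology: relationship between types
}


def ancestors(entity):
    """Follow 'is' edges transitively (the taxonomy portion of the graph)."""
    found, frontier = set(), [entity]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "is" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def has_relationship(entity, predicate, obj):
    """True if the entity, or any type it belongs to, has the relationship."""
    candidates = {entity} | ancestors(entity)
    return any((c, predicate, obj) in triples for c in candidates)


print(sorted(ancestors("john")))
print(has_relationship("john", "eats", "food"))
```

Here "john eats food" is never stored directly; it follows from the instance fact plus the taxonomy and ontology edges.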

Keyword
Search
Knowledge Graph
User Intent
Personalized
Search
Content
Understanding
Domain
Understanding
Collaborative
Understanding

Keyword Search
(Completely User-specified)
Traditional
Recommendations
(Completely driven by
user behavior)

Keyword Search
User-guided
Recommendations
(Mostly driven by user profile,
partially user-specified)
Traditional
Recommendations
(Completely driven by
user behavior)
Keyword Search
Personalized
Queries
(Mostly user-specified,
partially driven by user profile)

Personalized
Queries
(Mostly user-specified,
partially driven by user profile)
Keyword Search
User-guided
Recommendations
(Mostly driven by user profile,
partially user-specified)
Traditional
Recommendations
(Completely driven by
user behavior)
Personalized Search

Regular Search Results:
Personalized Search Results:
User:

Nice - personalization is awesome!
Let’s roll it out everywhere!

Keyword
Search
Knowledge Graph
User Intent
Personalized
Search
Domain-aware
Matching
Content
Understanding
Domain
Understanding
Collaborative
Understanding

Knowledge Graph
(Understanding conceptual
and logical relationships
between domain-specific entities)
Collaborative
Recommendations
user behavior)

Personas / User Profiles
(User attributes and preferences in
knowledge graph)
Multimodal Recommendations
(Recommendations combining
collaborative filtering plus user-based
profile attribute matching/ranking)
Knowledge Graph
Collaborative
Recommendations
user behavior)

Personas / User Profiles
(User attributes and preferences in
knowledge graph)
(Recommendations combining
collaborative filtering plus user-based
profile attribute matching/ranking)
Knowledge Graph
Collaborative
Recommendations
user behavior)
Domain-aware Matching

http://localhost:8983/solr/jobs/select/?
fl=jobtitle,city,state,salary&
q=(
jobtitle:"nurse educator"^25 OR jobtitle:(nurse educator)^10
)
AND (
(city:"Boston" AND state:"MA")^15
OR state:"MA")
AND _val_:"map(salary, 40000, 60000,10, 0)"
AND similar_users:{!terms}u99,u1,u50,u2311,u253,u70,u99
*Example derived from chapter 16 of Solr in Action
Jane is a nurse educator in Boston seeking between $40K and $60K
She has interacted with the same content as the following users:
u99,u1,u50,u2311,u253,u70,u99
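A hedged sketch of how an application might assemble the request above from a stored user profile. The function name `build_profile_query` and its parameters are hypothetical; the field names, boosts, and query structure are copied from the slide.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_profile_query(base_url, title, city, state, min_salary, max_salary, similar_users):
    """Build a Solr select URL that mixes profile attributes with similar-user matching."""
    users = ",".join(similar_users)
    q = (
        f'(jobtitle:"{title}"^25 OR jobtitle:({title})^10) '
        f'AND ((city:"{city}" AND state:"{state}")^15 OR state:"{state}") '
        f'AND _val_:"map(salary, {min_salary}, {max_salary}, 10, 0)" '
        f"AND similar_users:{{!terms}}{users}"
    )
    params = {"fl": "jobtitle,city,state,salary", "q": q}
    return base_url + "?" + urlencode(params)  # urlencode escapes quotes, braces, spaces


url = build_profile_query(
    "http://localhost:8983/solr/jobs/select/",
    title="nurse educator",
    city="Boston",
    state="MA",
    min_salary=40000,
    max_salary=60000,
    similar_users=["u99", "u1", "u50"],
)
print(url)
```

Building the query string from parameters keeps the boosts in one place, which makes them easier to tune later.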

Keyword
Search
Knowledge Graph
User Intent
Personalized
Search
Semantic
Search
Domain-aware
Matching
Content
Understanding
Domain
Understanding
Collaborative
Understanding

Keyword Search
(Finding and
Ranking Keyword)
Knowledge Graph
(Understanding conceptual and
logical relationships between
domain-specific entities)

Language Understanding
(Understanding syntax
and query structure)
Keyword Search
(Finding and
Ranking Keyword)
Terminology Understanding
(Understanding domain-specific
terms and conceptual meaning)
Knowledge Graph

Language Understanding
(Understanding syntax
and query structure)
Terminology Understanding
(Understanding domain-specific
terms and conceptual meaning)
Keyword Search
(Finding and
Ranking Keyword)
Knowledge Graph
Semantic Search

Sentence Embeddings:
[ 2, 3, 2, 4, 2, 1, 5, 3 ]
[ 5, 3, 2, 3, 4, 0, 3, 4 ]
. . .
Document Embedding:
[ 4, 1, 4, 2, 1, 2, 4, 3 ]
Word Embeddings:
[ 5, 1, 3, 4, 2, 1, 5, 3 ]
[ 4, 1, 3, 0, 1, 1, 4, 2 ]
. . .
Paragraph Embeddings:
[ 5, 1, 4, 1, 0, 2, 4, 0 ]
[ 1, 1, 4, 2, 1, 0, 0, 0 ]
. . .
Thought Vectors

apple caffeine cheese coffee drink donut food juice pizza tea water … term N
cappuccino 0 0 0 0 0 0 0 0 0 0 0 ...
apple 1 0 0 0 0 0 0 0 0 0 0 ...
juice 0 0 0 0 0 0 0 1 0 0 0 ...
cheese 0 0 1 0 0 0 0 0 0 0 0 ...
pizza 0 0 0 0 0 0 0 0 1 0 0 ...
donut 0 0 0 0 0 1 0 0 0 0 0 ...
green 0 0 0 0 0 0 0 0 0 0 0 ...
tea 0 0 0 0 0 0 0 0 0 1 0 ...
bread 0 0 1 0 0 0 0 0 0 0 0 ...
sticks 0 0 0 0 0 0 0 0 0 0 0 ...
exact term lookup in inverted index
query
Single Term Searches (as a Vector)

query
juice 0 0 0 0 0 0 0 1 0 0 0 ...
+
apple 1 0 0 0 0 0 0 0 0 0 0 ...
=
apple juice 1 0 0 0 0 0 0 1 0 0 0 ... (Combined Vector)
Multi-term Query Vectors
apple caffeine cheese coffee drink donut food juice pizza tea water … term N
latte 0 0 0 0 0 0 0 0 0 0 0 ...
cappuccino 0 0 0 0 0 0 0 0 0 0 0 ...
apple juice 1 0 0 0 0 0 0 1 0 0 0 ...
cheese pizza 0 0 1 0 0 0 0 0 1 0 0 ...
donut 0 0 0 0 0 1 0 0 0 0 0 ...
soda 0 0 0 0 0 0 0 0 0 0 0 ...
green tea 0 0 0 0 0 0 0 0 0 1 0 ...
water 0 0 0 0 0 0 0 0 0 0 1 ...
cheese bread sticks 0 0 1 0 0 0 0 0 0 0 0 ...
cinnamon sticks 0 0 0 0 0 0 0 0 0 0 0 ...
exact term lookup in inverted index
query
Multi-term Searches

food drink dairy bread caffeine sweet calories healthy
apple juice 0 5 0 0 0 4 4 3
cappuccino 0 5 3 0 4 1 2 3
cheese bread sticks 5 0 4 5 0 1 4 2
cheese pizza 5 0 4 4 0 1 5 2
cinnamon bread sticks 5 0 1 5 0 3 4 2
donut 5 0 1 5 0 4 5 1
green tea 0 5 0 0 2 1 1 5
latte 0 5 4 0 4 1 3 3
soda 0 5 0 0 3 5 5 0
water 0 5 0 0 0 0 0 5
Dimensionality Reduction

Phrase: Vector:
apple juice: [ 0, 5, 0, 0, 0, 4, 4, 3 ]
cappuccino: [ 0, 5, 3, 0, 4, 1, 2, 3 ]
cheese bread sticks: [ 5, 0, 4, 5, 0, 1, 4, 2 ]
cheese pizza: [ 5, 0, 4, 4, 0, 1, 5, 2 ]
cinnamon bread sticks: [ 5, 0, 1, 5, 0, 3, 4, 2 ]
donut: [ 5, 0, 1, 5, 0, 4, 5, 1 ]
green tea: [ 0, 5, 0, 0, 2, 1, 1, 5 ]
latte: [ 0, 5, 4, 0, 4, 1, 3, 3 ]
soda: [ 0, 5, 0, 0, 3, 5, 5, 0 ]
water: [ 0, 5, 0, 0, 0, 0, 0, 5 ]
Ranked Results: Green Tea
0.94 water
0.85 cappuccino
0.80 latte
0.78 apple juice
0.60 soda
… …
0.19 donut
Vector Similarity Scores:
Vector Similarity (a, b):
$\cos(\theta) = \frac{a \cdot b}{|a| \times |b|}$
Ranked Results: Cheese Pizza
0.99 cheese bread sticks
0.91 cinnamon bread sticks
0.89 donut
0.47 latte
0.46 apple juice
… …
0.19 water
Vector Similarity Scoring
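The ranking on this slide can be reproduced with a short cosine similarity sketch. The vectors are taken from the Dimensionality Reduction table (columns: food, drink, dairy, bread, caffeine, sweet, calories, healthy); the function names are illustrative.

```python
import math

VECTORS = {
    "apple juice": [0, 5, 0, 0, 0, 4, 4, 3],
    "cappuccino": [0, 5, 3, 0, 4, 1, 2, 3],
    "cheese bread sticks": [5, 0, 4, 5, 0, 1, 4, 2],
    "cheese pizza": [5, 0, 4, 4, 0, 1, 5, 2],
    "cinnamon bread sticks": [5, 0, 1, 5, 0, 3, 4, 2],
    "donut": [5, 0, 1, 5, 0, 4, 5, 1],
    "green tea": [0, 5, 0, 0, 2, 1, 1, 5],
    "latte": [0, 5, 4, 0, 4, 1, 3, 3],
    "soda": [0, 5, 0, 0, 3, 5, 5, 0],
    "water": [0, 5, 0, 0, 0, 0, 0, 5],
}


def cosine(a, b):
    """cos(theta) = (a . b) / (|a| * |b|)"""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def rank(phrase):
    """Rank every other phrase by cosine similarity to the given phrase."""
    scored = [(cosine(VECTORS[phrase], vec), name)
              for name, vec in VECTORS.items() if name != phrase]
    return sorted(scored, reverse=True)


for score, name in rank("green tea")[:3]:
    print(f"{score:.2f} {name}")
```

Every document is compared with the query here, which is exactly the cost that the approximate approaches later in the deck try to avoid.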

Performance Considerations
Problem: Vector Scoring is Slow
• Unlike keyword search, which looks up pre-indexed answers to queries, vector search must calculate the similarity between the query vector and every document’s vector to determine the best matches, which is slow at scale.
Solution: Quantized Vectors
• “Quantization” is the process of mapping vector features to discrete values.
• Creating “tokens” that map to similar regions of the vector space enables matching on those tokens to perform an ANN (Approximate Nearest Neighbor) search.
• This converts vector scoring into a search problem (term lookup and scoring), which is fast again, at the expense of some recall and scoring accuracy.
Recommended Approach: Quantized Vector Search + Vector Similarity Reranking
• Combine the best of both worlds by running an initial ANN search on a quantized vector representation, and then re-ranking the top-N results using full vector similarity scoring.
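The two-stage approach can be sketched with random-hyperplane hashing, which is one common way to quantize vectors; the deck does not say which quantization scheme is intended, so treat the scheme, the bit count, and the Hamming threshold below as assumptions. Signatures are recomputed on the fly to keep the sketch short; a real system would index them as terms.

```python
import math
import random

DOCS = {
    "cappuccino": [0, 5, 3, 0, 4, 1, 2, 3],
    "cheese pizza": [5, 0, 4, 4, 0, 1, 5, 2],
    "donut": [5, 0, 1, 5, 0, 4, 5, 1],
    "green tea": [0, 5, 0, 0, 2, 1, 1, 5],
    "water": [0, 5, 0, 0, 0, 0, 0, 5],
}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def make_hyperplanes(dim, num_bits, seed=42):
    """Random hyperplanes; each one contributes one bit of the signature."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, planes):
    """Quantize a vector into bits: which side of each hyperplane it falls on."""
    return tuple(1 if sum(v * p for v, p in zip(vec, plane)) >= 0 else 0
                 for plane in planes)


def ann_search(query, planes, max_hamming=4, top_n=3):
    qsig = signature(query, planes)
    # Stage 1: cheap candidate selection on the quantized representation.
    shortlist = [name for name, vec in DOCS.items()
                 if sum(x != y for x, y in zip(qsig, signature(vec, planes))) <= max_hamming]
    # Stage 2: exact cosine similarity, computed only for the shortlist.
    return sorted(((cosine(query, DOCS[n]), n) for n in shortlist), reverse=True)[:top_n]


planes = make_hyperplanes(dim=8, num_bits=16)
results = ann_search(DOCS["green tea"], planes)
print(results)
```

Documents whose signatures differ from the query's in more than `max_hamming` bits are never scored exactly, which is where the loss of recall mentioned above comes from.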

Option 1: Streaming Expressions

Send Documents to Solr:
curl -X POST -H "Content-Type: application/json" \
"http://localhost:8983/solr/food/update?commit=true" \
--data-binary ' [
{"id": "1", "name_s":"donut", "vector_fs":[5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0]},
{"id": "2", "name_s":"apple juice",
"vector_fs":[1.0,5.0,0.0,0.0,0.0,4.0,4.0,3.0]},
{"id": "3", "name_s":"cappuccino",
"vector_fs":[0.0,5.0,3.0,0.0,4.0,1.0,2.0,3.0]},
{"id": "4", "name_s":"cheese pizza",
"vector_fs":[5.0,0.0,4.0,4.0,0.0,1.0,5.0,2.0]},
{"id": "5", "name_s":"green tea",
"vector_fs":[0.0,5.0,0.0,0.0,2.0,1.0,1.0,5.0]},
{"id": "6", "name_s":"latte", "vector_fs":[0.0,5.0,4.0,0.0,4.0,1.0,3.0,3.0]},
{"id": "7", "name_s":"soda", "vector_fs":[0.0,5.0,0.0,0.0,3.0,5.0,5.0,0.0]},
{"id": "8", "name_s":"cheese bread sticks",
"vector_fs":[5.0,0.0,4.0,5.0,0.0,1.0,4.0,2.0]},
{"id": "9", "name_s":"water", "vector_fs":[0.0,5.0,0.0,0.0,0.0,0.0,0.0,5.0]},
{"id": "10", "name_s":"cinnamon bread sticks",
"vector_fs":[5.0,0.0,1.0,5.0,0.0,3.0,4.0,2.0]}
] '
Streaming Expressions

Option 2:
Streaming Expressions Query Parser

Request:
http://localhost:8983/solr/food/select?q=*:*&fl=id,name_s&
fq={!streaming_expression}top(
select(
search(food, q="*:*", fl="id,vector_fs", sort="id asc"),
cosineSimilarity(vector_fs, array(5.1,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as cos, id),
n=5, sort="cos desc"
)
Response:
{ "responseHeader":{
… },
"response":{"numFound":5,"start":0,"docs":[
{ "name_s":"donut", "id":"1"},
{ "name_s":"apple juice", "id":"2"},
{ "name_s":"cheese pizza", "id":"4"},
{ "name_s":"cheese bread sticks", "id":"8"},
{ "name_s":"cinnamon bread sticks", "id":"10"}]
}}
Streaming Expressions Query Parser

Option 3:
Solr Vector Scoring Plugin

curl -X POST -H "Content-Type: application/json" \
"http://localhost:8983/solr/{your-collection-name}/update?commit=true" \
--data-binary '
[
{"name":"example 0", "vector":"0|1.55 1|3.53 2|2.3 3|0.7 4|3.44 5|2.33"},
{"name":"example 4", "vector":"0|4.01 1|3.69 2|2 3|4.36 4|1.09 5|0.1"},
{"name":"example 5", "vector":"0|0.64 1|3.95 2|1.03 3|1.65 4|0.99 5|0.09"}
]'

Request:
http://localhost:8983/solr/{your-collection-name}/query?fl=name,score,vector&q={!vp f=vector
vector="0.1,4.75,0.3,1.2,0.7,4.0"
}
Response:
{ "responseHeader":{ "status":0, "QTime":1},
"response":{ "numFound":6,"start":0,"maxScore":0.99984086,
"docs":[
{ "name":["example 3"], "vector":["0|0.06 1|4.73 2|0.29 3|1.27 4|0.69 5|3.9 "],
"score":0.99984086},
{ "name":["example 0"], "vector":["0|1.55 1|3.53 2|2.3 3|0.7 4|3.44 5|2.33 "], "score":0.7693964},
{ "name":["example 4"], "vector":["0|4.01 1|3.69 2|2 3|4.36 4|1.09 5|0.1 "], "score":0.5328145},
{ "name":["example 2"], "vector":["0|1.11 1|0.6 2|1.47 3|1.99 4|2.91 5|1.01 "], "score":0.44909418}]
}}

Option 4:
Solr Vector Scoring + LSH Plugin

curl -X POST -H "Content-Type: application/json" \
"http://localhost:8983/solr/{your-collection-name}/update?update.chain=LSH&commit=true" \
--data-binary '
[
{"id":"1", "vector":"1.55,3.53,2.3,0.7,3.44,2.33"},
{"id":"2", "vector":"3.54,0.4,4.16,4.88,4.28,4.25"}
]'
Request:
… vector="1.55,3.53,2.3,0.7,3.44,2.33" lsh="true"
reRankDocs="5"}&fl=name,score,vector,_vector_,_lsh_hash_

Response:
{
"responseHeader":{ "status":0, "QTime":8 }, "response":{"numFound":1,"start":0,"maxScore":36.65736,
"docs":[
{ "id": "1", "vector":"1.55,3.53,2.3,0.7,3.44,2.33",
"_vector_":"/z/GZmZAYeuFQBMzMz8zMzNAXCj2QBUeuA==",
"_lsh_hash_":["0_8", "1_35", "2_7", "3_10", "4_2", "5_35", "6_16", "7_30", "8_27", "9_12", "10_7",
"11_32", "12_48", "13_36", "14_10", "15_7", "16_42", "17_5", "18_3", "19_2", "20_1",
"21_0", "22_24", "23_18", "24_42", "25_31", "26_35", "27_8", "28_1", "29_24", "30_47",
"31_14", "32_22", "33_39", "34_0", "35_34", "36_34", "37_39", "38_27", "39_27",
"40_45", "41_10", "42_21", "43_34", "44_41", "45_9", "46_31", "47_0", "48_4", "49_43"],
"score":36.65736}
] } }

Option 5 (Work in Progress):
First-class Vector Fields in Lucene/Solr

ANN Benchmarks
(Approximate Nearest Neighbor)
https://github.com/erikbern/ann-benchmarks

• Take queries, documents, sentences, paragraphs, etc. and
transform them into vectors.
• Usually leverage deep learning, which can discover rich language
usage rules and map them to combinations of features in the
vector
• Popular Libraries:
• BERT
• ELMo
• Universal Sentence Encoder
• Word2Vec
• Sentence2Vec
• GloVe
• fastText
• many more …
Vector Encoders

Query Type Likely Outcome
Obscure keyword combinations
Q. (software OR hardware) AND enginee*
• Keyword search succeeds
• Vector Search fails
Natural Language Queries
Q. Can my wife drive on my insurance?
• Keyword search might get
lucky, but probably fails
• Vector Search succeeds
Fuzzy Language Queries
Q. famous french tower
• Keyword search mismatch
yields poor results
• Vector Search succeeds
Structured Relationship Queries
Q. popular bbq near Activate
• Keyword search fails
• Vector search fails
• Need a Knowledge Graph!
Keyword Search vs. Vector Search

Giant Graph of Relationships...
Trey Grainger works for Lucidworks.
He spoke at the Activate 2019
conference.
#Activate19
(Activate) was held in Washington, DC
September 9-12, 2019.
Trey got his master's degree from
Georgia Tech.
Trey’s Voicemail

id: 1
job_title: Software Engineer
desc: software engineer at a
great company
skills: .Net, C#, java
id: 2
job_title: Registered Nurse
desc: a registered nurse at
hospital doing hard work
skills: oncology, phlebotomy
id: 3
job_title: Java Developer
desc: a software engineer or a
java engineer doing work
skills: java, scala, hibernate
field doc term
desc 1 a, at, company, engineer, great, software
desc 2 a, at, doing, hard, hospital, nurse, registered, work
desc 3 a, doing, engineer, java, or, software, work
job_title 1 Software Engineer
… … …
Terms-Docs Inverted Index
Docs-Terms Forward Index
Documents
Source: Trey Grainger, Khalifeh AlJadda, Mohammed Korayem, Andries Smith. “The Semantic Knowledge Graph: A compact, auto-generated model for real-time traversal and ranking of any relationship within a domain”. DSAA 2016.
Knowledge
Graph
field term postings list (doc: pos)
desc a 1: 4; 2: 1; 3: 1, 5
desc at 1: 3; 2: 4
desc company 1: 6
desc doing 2: 6; 3: 8
desc engineer 1: 2; 3: 3, 7
desc great 1: 5
desc hard 2: 7
desc hospital 2: 5
desc java 3: 6
desc nurse 2: 3
desc or 3: 4
desc registered 2: 2
desc software 1: 1; 3: 2
desc work 2: 8; 3: 9
job_title java developer 3: 1
… … … …
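The idea of traversing from a term to its documents (inverted index) and from those documents to other terms (forward index) can be sketched as follows. The data is the three example documents above. This sketch only counts co-occurrences and does not attempt the relationship ranking described in the cited paper; the name `related_skills` is made up.

```python
from collections import defaultdict

docs = {
    1: {"desc": "software engineer at a great company",
        "skills": [".Net", "C#", "java"]},
    2: {"desc": "a registered nurse at hospital doing hard work",
        "skills": ["oncology", "phlebotomy"]},
    3: {"desc": "a software engineer or a java engineer doing work",
        "skills": ["java", "scala", "hibernate"]},
}

# Terms-docs inverted index for the desc field: term -> doc id -> positions.
inverted = defaultdict(lambda: defaultdict(list))
for doc_id, doc in docs.items():
    for pos, term in enumerate(doc["desc"].split(), start=1):
        inverted[term][doc_id].append(pos)


def related_skills(term):
    """Traverse term -> matching docs -> the skills listed on those docs."""
    counts = defaultdict(int)
    for doc_id in inverted.get(term, {}):
        # The docs dict doubles as the docs-terms forward index here.
        for skill in docs[doc_id]["skills"]:
            counts[skill] += 1
    return dict(counts)


print(dict(inverted["engineer"]))
print(related_skills("engineer"))
```

Each hop between the two indexes is one edge traversal, so no explicit graph needs to be built ahead of time.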

Related term vector (for query concept expansion)
http://localhost:8983/solr/stack-exchange-health/skg

Disambiguation by Category Example
Meaning 1: Restaurant => bbq, brisket, ribs, pork, …
Meaning 2: Outdoor Equipment => bbq, grill, charcoal, propane, …

Demo Data
Places (also includes geonames database)
Entities (includes search commands)
Text Content
[ Web crawl of restaurant and product reviews sites ]

Solr Knowledge Graph Traversal Query
"bbq",

Why this Semantic Nuance Matters

popular barbeque near Activate
(popular same as "good", "top", "best")
Hotels near Haystack EU
hotels near popular BBQ in Berlin
BBQ near airports near Berlin
hotels near movie theaters in Berlin …
Other Knowledge Graph Search examples:

News Search : popularity and freshness drive relevance
Restaurant Search: geographical proximity and price range are critical
Ecommerce: likelihood of a purchase is key
Movie search: More popular titles are generally more relevant
Job search: category of job, salary range, and geographical proximity matter
The right ranking algorithm is domain and context-dependent

Example Combining Content + Domain + User Context
News website:
/select?
fq={!cache=false v=$keywords}&
q= {!func}scale(query($keywords),0,25)
{!func}scale(geodist(),0,25)
{!func}recip(rord(publicationDate),1,25,0)
{!func}scale(popularity,0,25)&
keywords="fall festival"&
sfield=location&
pt=33.748,-84.391
(Each of the four function queries above contributes up to 25% of the total score.)
*Example from chapter 16 of Solr in Action
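The equal weighting in this query can be illustrated outside Solr. The sketch below rescales each signal to the range 0 to 25, loosely mirroring the `scale(..., 0, 25)` calls, and sums them. The signal values are invented, and each is assumed to be already oriented so that higher is better (for example, a distance would be inverted first).

```python
def min_max_scale(values, low=0.0, high=25.0):
    """Linearly rescale values so the minimum maps to low and the maximum to high."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [low for _ in values]  # no spread, so the signal cannot discriminate
    return [low + (v - lo) * (high - low) / (hi - lo) for v in values]


def combine(signals):
    """Sum the rescaled signals per document; four signals give a 0-100 score."""
    scaled = [min_max_scale(vals) for vals in signals.values()]
    return [sum(per_doc) for per_doc in zip(*scaled)]


# Hypothetical raw signal values for three documents.
signals = {
    "keyword_score": [1, 2, 3],
    "proximity": [3, 2, 1],
    "freshness": [0, 5, 10],
    "popularity": [10, 10, 10],
}
print(combine(signals))
```

Changing the upper bound per signal is the simplest way to shift the balance, for example giving freshness a larger share on a news site.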

But how do we figure out the right
balance of weights?

Learning to Rank
User
Searches
User
Sees
Results
User
takes an
action
Users’ actions
inform system
improvements
User Query Re…
Alonzo ipad do…, do…, do…
Elena printer do…, do…, do…
Ming ipad do…, do…, do…
… … …
User Action Document
Alonzo click doc22
Elena click doc17
Ming click doc12
Alonzo purchase doc22
Ming click doc22
Ming purchase doc22
Elena click doc2
… … …
Feature Weight
title_match_all_terms 15.25
exact_phrase_match 10
signal_boost 9.5
content_age 9.2
user_geo_distance 6.5
personalization_cat_1 2.8
doc_popularity 2.75
… …
ipad ⌕
Initial Results:
1) doc1
2) doc2
3) doc3
Build Ranking Classifier
(from Implicit Relevance Judgements)
Final Results:
1) doc3
2) doc1
3) doc2
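Once a ranking model has learned feature weights like those in the table above, applying it is a weighted sum per document. The sketch below assumes a simple linear model, which is only one of several model types used for learning to rank. The per-document feature values are invented so that the reordering matches the slide's example.

```python
# Learned weights, copied from the Feature / Weight table.
WEIGHTS = {
    "title_match_all_terms": 15.25,
    "exact_phrase_match": 10.0,
    "signal_boost": 9.5,
    "content_age": 9.2,
    "user_geo_distance": 6.5,
    "personalization_cat_1": 2.8,
    "doc_popularity": 2.75,
}


def model_score(features):
    """Linear model: sum of weight * feature value; absent features count as 0."""
    return sum(WEIGHTS.get(name, 0.0) * value for name, value in features.items())


def rerank(candidates):
    """Reorder the initial results by model score, highest first."""
    return sorted(candidates, key=lambda doc_id: model_score(candidates[doc_id]), reverse=True)


# Hypothetical feature values for the three initial results.
initial_results = {
    "doc1": {"exact_phrase_match": 1.0, "signal_boost": 0.2},
    "doc2": {"doc_popularity": 1.0, "signal_boost": 0.1},
    "doc3": {"title_match_all_terms": 1.0, "signal_boost": 0.9},
}
print(rerank(initial_results))
```

The weights themselves would come from training on the implicit judgements (clicks and purchases) shown in the User / Action / Document table.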

Facet,
Topic &
Cluster
Query Rule
Matching
Natural
Language
Machine
Learning
Boosted
Results
Signals
Content
Index
System Generated
Human Generated
Application Generated
Solution
Data

We operationalize AI for the
largest businesses on the planet.

Trey Grainger
trey@lucidworks.com
@treygrainger
Other presentations:
http://www.treygrainger.com
40% Discount code: ctwhay19
http://aiPoweredSearch.com
http://solrinaction.com
Books:
Thank You!
