A BETTER MATCH MEANS BETTER CARE®
Kyruus, Inc. CONFIDENTIAL. DO NOT DISTRIBUTE
The Search for NLP
Standing up “QuickLP” for PoC
Me
I’m an engineer because I’m a curious person who likes products & problem-solving.
- Interpersonal rhetoric
- HCI
- Healthcare IT
- Solr
- Data “intuition”
A better match means better care
Kyruus Search
A better match means better care.
The Kyruus Search & API team exists to connect humans to relevant care by
connecting them to relevant data.
Agenda
The problem
The space
The options
Expectations
Ideas, not solutions
Expectations
Where to look, not what to see
Problem
The user journey is simple:
Need (intent) -> Input (query) -> Results (documents) -> $$
Problem
Information retrieval 101
User Information
😬(you)
The Space | Statistical relevance
https://towardsdatascience.com/tf-term-frequency-idf-inverse-document-frequency-from-scratch-in-python-6c2b61b78558
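As a rough illustration of what "statistical relevance" means in practice, here is a minimal TF-IDF sketch in Python. It is not taken from the talk or from the linked article; the whitespace tokenizer, the smoothed IDF formula, and the three sample documents are all assumptions chosen for brevity.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: term frequency normalised by document length,
    times a smoothed inverse document frequency log((1 + N) / (1 + df)) + 1.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return weights

# Hypothetical mini-corpus, not real provider data.
docs = [
    "pediatric cardiologist",
    "adult cardiologist",
    "pediatric dentist",
]
scores = tf_idf(docs)
```

In the second document, "adult" appears in fewer documents than "cardiologist", so it receives the higher weight: rarer terms count for more.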
The Space | NLP
https://hackernoon.com/various-optimisation-techniques-and-their-impact-on-generation-of-word-embeddings-3480bd7ed54f
Context-aware embeddings
The Space | NLP
Reality check
Your users don’t care
Door No. 3, Johnny!

Approach: Spray & pray
The work: Tune, tweak, and “test” all sorts of configurations, settings, analyzers, etc.
The hope: That you make enough permutations to catch most people and that the parts you can’t cover just don’t show up (head in the sand)
The reality:
- You’re leaving some users out in the cold
- You’re spending valuable engineering resources trying to fix it in a way that will never last, simply building a house of cards that will fall as soon as something changes

Approach: Host Sesame Street
The work: Spend lots of time to source a great ML/AI/NLP candidate, spend lots of money to secure the best candidate, and spend a lot of time trying to get your organization suddenly ready for the work they will do (e.g. analytics, logging, tracking, monitoring, etc.)
The hope: That you’ll spend enough money to buy yourself a silver bullet, and that this person will save the day
The reality:
- There are no silver bullets
- Having the right person is only part of the equation; the organization must be at a point of maturation to support them and their work long-term
- You just bet all your chips on red, and the house always wins

Approach: Crawl, walk, run
The work: Find areas of opportunity and exploit them creatively with the tools on hand
The hope: You solve discrete use cases, one at a time, while learning deeper opportunities and greater nuances in the user’s experience
The reality:
- You really do solve painful user experiences
- You, your team, and your organization are given the requisite time to grow & mature into a new competency, at a fraction of the cost, while delivering on user value throughout the whole process
Pareto principle
Focus on the outsized gains
Do the work
Eyes before AIs
The heart of NLP
NLP is ultimately about understanding your users
Simple Query
pediatric cardiologist 46220
Query segmentation / query understanding
pediatric cardiologist 46220
age_group_id: 5
specialty_id: 1
location_id: 142
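The segmentation on this slide can be approximated without any ML at all. The sketch below is one possible way to do it, assuming small in-memory lookup tables; the table names are hypothetical and the IDs simply mirror the slide's example values.

```python
import re

# Hypothetical lookup tables; the IDs mirror the slide's example only.
AGE_GROUPS = {"pediatric": 5}
SPECIALTIES = {"cardiologist": 1}
LOCATIONS = {"46220": 142}

ZIP_RE = re.compile(r"^\d{5}$")

def segment_query(query):
    """Map each token to a structured field, keeping unknown tokens as free text."""
    parsed = {"free_text": []}
    for token in query.lower().split():
        if token in AGE_GROUPS:
            parsed["age_group_id"] = AGE_GROUPS[token]
        elif token in SPECIALTIES:
            parsed["specialty_id"] = SPECIALTIES[token]
        elif ZIP_RE.match(token) and token in LOCATIONS:
            parsed["location_id"] = LOCATIONS[token]
        else:
            # Anything unrecognised stays available for ordinary text search.
            parsed["free_text"].append(token)
    return parsed

parsed = segment_query("pediatric cardiologist 46220")
```

A real vocabulary would need multi-word phrases and synonyms, which this token-by-token version does not attempt.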
Some Ideas
Remember the fundamentals
Remember the fundamentals
Pediatric
Cardiologist
46220
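The speaker notes name dismax as the fundamental in question. As a hedged sketch, the helper below builds an Elasticsearch `dis_max` request body as a plain Python dict, with one `match` clause per field. The field names and boosts are hypothetical, and in Solr the same idea is expressed through the DisMax query parser rather than a JSON body.

```python
def dismax_query(user_query, field_boosts, tie_breaker=0.3):
    """Build an Elasticsearch dis_max request body with one match clause per field."""
    return {
        "query": {
            "dis_max": {
                "queries": [
                    {"match": {field: {"query": user_query, "boost": boost}}}
                    for field, boost in field_boosts.items()
                ],
                # Lets non-best-matching fields contribute a fraction of their score.
                "tie_breaker": tie_breaker,
            }
        }
    }

# Hypothetical field names and boosts, for illustration only.
body = dismax_query(
    "pediatric cardiologist 46220",
    {"specialty": 3.0, "age_group": 2.0, "zip_code": 1.0},
)
```

The resulting dict could be sent as the body of a search request; the boosts are the knobs to tune before reaching for anything heavier.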
Some Ideas
Use every tool in the toolbox
Use every tool in the toolbox
pediatric cardiologist 46220[ ]
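The speaker notes for this idea suggest pulling the zip code out with a regex in the app layer and either filtering on it or sorting in concentric rings. A minimal sketch of that rewrite follows. The `text` and `zip_code` field names are assumptions, and the `NEARBY_ZIPS` dict is a hypothetical stand-in for the Redis lookup the notes mention.

```python
import re

ZIP_RE = re.compile(r"\b\d{5}\b")

# Hypothetical stand-in for a Redis lookup: zip -> rings of nearby zips,
# closest ring first.
NEARBY_ZIPS = {"46220": [["46220"], ["46221", "46222"], ["46223"]]}

def rewrite_query(raw_query, strict=False):
    """Pull a zip code out of the text and turn it into a filter or ranked rings."""
    match = ZIP_RE.search(raw_query)
    if match is None:
        return {"query": {"match": {"text": raw_query}}}
    zip_code = match.group()
    # Remove the zip from the free text and tidy the whitespace.
    text = " ".join(ZIP_RE.sub(" ", raw_query, count=1).split())
    bool_query = {"must": [{"match": {"text": text}}]}
    if strict:
        # Hard filter: only documents in the submitted zip code.
        bool_query["filter"] = [{"term": {"zip_code": zip_code}}]
    else:
        # Soft preference: closer rings receive a larger boost.
        rings = NEARBY_ZIPS.get(zip_code, [[zip_code]])
        bool_query["should"] = [
            {"terms": {"zip_code": ring, "boost": float(len(rings) - i)}}
            for i, ring in enumerate(rings)
        ]
    return {"query": {"bool": bool_query}}

filtered = rewrite_query("pediatric cardiologist 46220", strict=True)
ranked = rewrite_query("pediatric cardiologist 46220")
```

Whether to filter or to rank is a product decision; the point is that either option is a few lines of application code rather than a new model.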
Some Ideas
Facets are features
Facets are features
pediatric cardiologist 46220
46221
46222
46223
adolescent
geriatric
Some Ideas
Honorable mentions
Honorable mentions
Wikipedia et al. as source of truth
“Near me”
PMI
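PMI (point-wise mutual information) can be estimated from simple counts. The sketch below scores an adjacent word pair as log2(p(a, b) / (p(a) * p(b))), using unigram and bigram frequencies from a token list. The tiny corpus is made up for illustration; real use would draw on query logs or a reference corpus.

```python
import math
from collections import Counter

def pmi(tokens, word_a, word_b):
    """PMI of the adjacent pair (word_a, word_b): log2(p(a, b) / (p(a) * p(b)))."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    p_a = unigrams[word_a] / len(tokens)
    p_b = unigrams[word_b] / len(tokens)
    p_ab = bigrams[(word_a, word_b)] / (len(tokens) - 1)
    if p_ab == 0:
        # The pair never occurs together in this order.
        return float("-inf")
    return math.log2(p_ab / (p_a * p_b))

# Hypothetical corpus, not real query data.
corpus = "heart attack symptoms heart attack treatment heart surgery".split()
heart_attack = pmi(corpus, "heart", "attack")
attack_heart = pmi(corpus, "attack", "heart")
```

A positive score for "heart attack" suggests the two words co-occur more often than chance, which is a cheap signal for treating them as a phrase.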
Final words
Focus and have fun
Resources
Ted Sullivan @ Lucidworks
Giovanni Fernandez-Kincade
Berlin Buzzwords
Haystack Conference
Activate Conference
SparkNLP Slack
Relevant Search -- book & Slack
Thank you

Searching for NLP: Using Elasticsearch to Create MVPs of NLP-enabled User Experiences


Editor's Notes

  • #7, #8: This isn’t a master class on exactly what to do. It is intended to stir your creativity, prod at some things you hadn’t thought about before, and get you pointed in the right direction.
  • #10, #11, #12: Bad news: you are the reason they can’t get to it. Good news: you are the only way they will be able to get to it. Their life is in your hands.
  • #14: Word embeddings: GloVe, Word2Vec, bag of words, FastText
  • #15: Polysemy: “I got the invite to do this talk.” “I got anxious.” Hope you can say afterwards, “I got it.”
  • #16: The group grew: Bert, Ernie, Big Bird, etc.
  • #17: It’s getting a bit out of hand
  • #18, #19: As much as your users likely love Sesame Street, they don’t care how bleeding edge your solution is. They’ll be grouchier than Oscar when your solution doesn’t work. They’ll be happy as Elmo when it does, regardless of how.
  • #20: Cute, lovable puppet characters notwithstanding
  • #23: There is likely a very fat initial part of your tail wherein you can get outsized gains and improvements. You won’t solve all the issues, but you’ll solve 80% of them with only 20% of the work or investment. Or perhaps you’ll solve the problems that equate to 80% of the user value, company bottom-line, etc. The point is this: focus on the wins, not the hows; the value, not the tech.
  • #24: You can probably spend 1-2 days tops and comb through some logs to find areas of opportunity for your application. If you can’t see it clearly then how could you spec it clearly?
  • #28: Dismax
  • #29: E.g. Dismax. Optimize what you have before you build new, costly tech that needs to be optimized. “Pediatric cardiologist 46220” will have a greater chance of being properly tuned to relevance once we appropriate our data and shift to the terms-centric approach found in a dismax query. Suggested reading: Relevant Search by Doug Turnbull and John Berryman.
  • #30: It’s not cheating to use your app layer, other technologies, etc. Redis + Zip.
  • #31: It’s not cheating to use your app layer, other technologies, etc. Redis + Zip. Maybe instead of just ranking higher on a zip match you want to filter on it: regex the zip out and modify your query to use it as a facet. Or maybe you don’t want to filter out other zips but have concentric rings of sorting done based on your user’s submitted zip code. Adding this layer is a trivial amount of work for your engineers, a trivial amount of impact to your infrastructure (e.g. Redis storage), and a trivial amount of added latency to the overall request time, but it’s a non-trivial upgrade to your user’s overall experience.
  • #32: Facets are features. They’re facts about your data, simple truths that help you navigate it. This is why users use them when your precision isn’t good and your recall is really high: facets are the features they wish they’d given you or that you’d discerned. This being the case, you could be very aggressive and stuff keywords (highly sought-after terms and phrases) into a special junk-drawer field for your documents and have that field boosted in your dismax query. If nothing else, it can help you see the shape of your data a bit better in terms of density & distribution, which will then help you best facilitate the “cheapLP” solution needed to get users to the data. Note, of course, that facets can be one of the first stops for your new ML/NLP engineer to get relevant feature data for their models. (Ted Sullivan from Lucidworks)
  • #35: Wikipedia: “heart attackers” isn’t a phrase, “heart attack” is; leverage this for phrase recognition. Don’t build a huge algorithm to know what “near me” means -- there are plenty of fat-head areas of opportunity. Point-wise mutual information is a means of measuring associations in information theory, e.g. “heart” and “attack” vs. “heart attack”.
  • #36: The reality is that there are a LOT of problems to solve. If you don’t focus on one at a time then you probably won’t get to any of them. Sesame Street is smart in that they take one letter and one number at a time and teach it to kids. Boiling the ocean is frustrating. Solving real user problems is a lot of fun, especially if you do it in a way that is cost-effective and creates a rapidity or momentum to your stream of value-delivery.