Lisa Jung
Developer Advocate @Elastic
Beginner’s Crash Course to Elastic Stack Series
Part 1.2: Understanding the Relevance of your search using
Elasticsearch and Kibana
Beginner’s crash course to the Elastic Stack Series
● Part 1.1: Intro to Elasticsearch and Kibana
○ use cases of Elasticsearch and Kibana
○ the basic architecture of Elasticsearch
○ performing CRUD (Create, Read, Update, and Delete) operations with
Elasticsearch and Kibana
Missed the first workshop? No worries!
● Part 1.1: Intro to Elasticsearch and Kibana
○ Repo: https://ela.st/workshop-1-repo
The Elastic Stack
Reliably and securely take data from
any source, in any format, then search,
analyze, and visualize it in real time.
Elasticsearch
Store | Search | Analyze
Elastic is a search company.
We focus on value to users by producing fast results
that operate at scale and are relevant. This is our
DNA. We believe search is an experience. It is what
defines us, and makes us unique.
Scale, Relevance
Relevance
How do we measure relevance?
● Precision
● Recall
Store | Search | Analyze
I store data as
documents!
Documents with similar traits
are grouped into an index!
[Diagram: an index containing six documents]
When a search query is sent, Elasticsearch retrieves relevant
documents and presents them as search results.
[Diagram: an index of six documents, with the retrieved documents shown as search results]
These two diagrams depict the same thing!
[Diagram: two equivalent depictions of an index and its documents]
True positives are relevant documents that are
returned to the user.
[Diagram: an index in which five returned documents, marked T, are labeled as true positives]
False positives are irrelevant documents that are
returned to the user.
[Diagram: an index in which two returned documents, marked F, are labeled as false positives next to the five true positives]
True negatives are irrelevant documents that are
not returned to the user.
[Diagram: an index in which the irrelevant documents left out of the search results are labeled as true negatives]
False negatives are relevant documents that were
not returned to the user.
[Diagram: an index in which the relevant documents left out of the search results are labeled as false negatives, next to the true negatives]
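The four outcomes defined above can be sketched with plain Python sets. This is only an illustration: the document IDs and the contents of `relevant` and `returned` are made up, chosen so that there are five true positives and two false positives as on the slides.

```python
# Hypothetical document IDs, invented for this illustration.
all_documents = set(range(1, 11))   # every document in the index
relevant = {1, 2, 3, 4, 5, 6, 7}    # documents that actually answer the query
returned = {1, 2, 3, 4, 5, 8, 9}    # documents shown in the search results

true_positives = relevant & returned                   # relevant and returned
false_positives = returned - relevant                  # irrelevant but returned
false_negatives = relevant - returned                  # relevant but not returned
true_negatives = all_documents - relevant - returned   # irrelevant and not returned

print("true positives: ", sorted(true_positives))    # [1, 2, 3, 4, 5]
print("false positives:", sorted(false_positives))   # [8, 9]
print("false negatives:", sorted(false_negatives))   # [6, 7]
print("true negatives: ", sorted(true_negatives))    # [10]
```

Every document lands in exactly one of the four sets, which is why the counts can later be plugged into the precision and recall formulas.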
What is precision?
Precision = True positives / (True positives + False positives)
What portion of the retrieved data is actually relevant to
the search query?
What is recall?
Recall = True positives / (True positives + False negatives)
What portion of relevant data is being returned as search
results?
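As a small worked example, the precision and recall formulas can be written as two Python functions. The counts used below are hypothetical, and returning 0.0 when the denominator is zero is a convention chosen for this sketch, not something the slides specify.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of the returned documents that are relevant."""
    returned = true_positives + false_positives
    # Assumption for this sketch: report 0.0 when nothing was returned.
    return true_positives / returned if returned else 0.0


def recall(true_positives: int, false_negatives: int) -> float:
    """Share of the relevant documents that were returned."""
    relevant = true_positives + false_negatives
    # Assumption for this sketch: report 0.0 when nothing is relevant.
    return true_positives / relevant if relevant else 0.0


# Hypothetical counts for a single query.
tp, fp, fn = 5, 2, 4
print(f"precision = {precision(tp, fp):.2f}")  # 5 / 7
print(f"recall    = {recall(tp, fn):.2f}")     # 5 / 9
```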
Precision and recall are inversely related.
Precision: I want all the retrieved results to be a perfect match to
the query, even if it means returning fewer documents.
Recall: I want to retrieve more results, even if the documents may
not be a perfect match to the query.
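The trade-off between precision and recall can be seen in a toy simulation. The sketch below assumes a tiny hypothetical collection of titles with hand-assigned relevance labels, and compares a strict rule (every query term must appear) with a loose rule (any query term may appear). It is not how Elasticsearch matches documents internally.

```python
# Hypothetical titles; True marks the ones assumed relevant to "good habits".
documents = {
    "Good habits 101": True,
    "Good habits are easy to master": True,
    "Habits of successful people": True,
    "Good chicken recipe": False,
    "Good times rolling": False,
}
query_terms = {"good", "habits"}


def evaluate(require_all_terms: bool) -> tuple[float, float]:
    """Return (precision, recall) for a strict or a loose matching rule."""
    match = all if require_all_terms else any
    returned = [
        title for title in documents
        if match(term in title.lower().split() for term in query_terms)
    ]
    true_positives = sum(documents[title] for title in returned)
    total_relevant = sum(documents.values())
    return true_positives / len(returned), true_positives / total_relevant


strict = evaluate(require_all_terms=True)   # every term must appear
loose = evaluate(require_all_terms=False)   # any term may appear
print("strict (precision, recall):", strict)
print("loose  (precision, recall):", loose)
```

With this data the strict rule returns only relevant titles but misses one of them, while the loose rule finds every relevant title at the cost of also returning irrelevant ones.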
Precision and recall determine which documents are
included in the search results.
Precision and recall do not determine which of the returned
documents are more relevant than the others!
Ranking refers to the ordering of the results (from the most relevant
results at the top to the least relevant at the bottom).
[Diagram: results for the query "How to form good habits", ordered from most relevant (highest score) to least relevant (lowest score)]
What is score?
● The score is a value that represents how relevant a document is to that
specific query
● A score is computed for each document that is a hit
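Ranking by score amounts to sorting the hits in descending order of their score. The following sketch uses made-up titles and scores purely for illustration. In practice the scores are computed by the search engine for each hit.

```python
# Hypothetical hits and scores, invented for this illustration.
hits = [
    {"title": "Good chicken recipe", "score": 0.4},
    {"title": "Good habits are easy to master!", "score": 2.1},
    {"title": "How to form a band", "score": 0.9},
    {"title": "Good habits 101", "score": 1.7},
]

# Ranking: highest score first, lowest score last.
ranked = sorted(hits, key=lambda hit: hit["score"], reverse=True)

for position, hit in enumerate(ranked, start=1):
    print(position, hit["score"], hit["title"])
```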
What is score?
● Term Frequency (TF)
● Inverse Document Frequency (IDF)
What is score?
[Diagram: the search query "How to form good habits" and its hits, ordered from most relevant (highest score) to least relevant (lowest score)]
Term Frequency (TF) measures how many times each search
term appears in a document.
If search terms are found in high frequency in a document, the
document is considered more relevant to the search query.
[Diagram: the search terms of the query "How to form good habits", with example documents at TF = 4 and TF = 1]
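Term frequency can be illustrated by counting how often each search term occurs in a document. The two documents below are invented so that the term "habits" appears four times in one and once in the other. The simple lowercase tokenizer is an assumption of this sketch and is much cruder than the text analysis Elasticsearch performs.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_frequencies(query: str, document: str) -> dict[str, int]:
    """Count how often each query term occurs in the document."""
    counts = Counter(tokenize(document))
    return {term: counts[term] for term in tokenize(query)}


query = "How to form good habits"
# Hypothetical documents, written for this example.
doc_a = "Good habits start with small habits. Track your habits daily and the habits will stick."
doc_b = "A short note on habits and routines."

print(term_frequencies(query, doc_a))
print(term_frequencies(query, doc_b))
```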
What is Inverse Document Frequency (IDF)?
Search query: How to form good habits
Hits:
● How to form a meetup group
● Good chicken recipe
● How to form a band
● Good times rolling
● Good habits 101
● Good habits are easy to master!
The irrelevant hits: "We may contain some of the search terms,
but we have nothing to do with forming good habits!"
IDF diminishes the weight of terms that occur very
frequently in the document set and increases the weight
of terms that occur rarely!
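To make the idea concrete, here is a sketch that computes a textbook form of IDF, log(N / number of documents containing the term), over the six hit titles from the slide. This is an illustrative formula only. Elasticsearch's default scoring (BM25) uses a different, smoothed variant, and the tokenizer here is a simplifying assumption.

```python
import math
import re

# The six hit titles from the slide.
hits = [
    "How to form a meetup group",
    "Good chicken recipe",
    "How to form a band",
    "Good times rolling",
    "Good habits 101",
    "Good habits are easy to master!",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its distinct alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def inverse_document_frequency(term: str, documents: list[str]) -> float:
    """Textbook IDF: log(N / number of documents containing the term)."""
    containing = sum(term in tokenize(doc) for doc in documents)
    # Assumption for this sketch: a term found in no document gets weight 0.0.
    return math.log(len(documents) / containing) if containing else 0.0


for term in ["how", "to", "form", "good", "habits"]:
    print(term, round(inverse_document_frequency(term, hits), 3))
```

In this small set "good" appears in four of the six titles, so it receives the lowest weight, while "habits" appears in only two and is weighted more heavily.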
Fine tuning precision or recall using
Elasticsearch and Kibana
Click on the link to the workshop repo:
https://ela.st/workshop-2-repo
Scroll down to the Resources section and open the Free
Elastic Cloud Trial link in a new tab.
Go to the email account you signed up with and
verify your email.
Open the email from Elastic and click the Verify and Accept
button.
Enter your password.
Click Start your free trial.
Select the Elastic Stack.
Configure your settings.
Name your deployment (for example, Beginner’s Crash Course), then create the deployment.
Save your deployment credentials.
Open Kibana.
Click the Explore on my own option.
Click the Upload a file option.
Download and unzip the News Category Dataset from Kaggle.
Drag and drop the file you want to upload.
Kibana analyzes the first 1000 lines of your data and
gives you a summary of your dataset.
The field section displays the fields identified, high-level statistics,
and top occurring values.
Click the Import button.
Name your index (for example, news_headlines) and click Import.
Then Elasticsearch will do the rest!
Click the menu icon and open Dev Tools.
Click on Explore on my own option.
Fine tuning precision or recall using
Elasticsearch and Kibana
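One way such tuning might look is sketched below as Python that builds the JSON body of an Elasticsearch match query. The slides do not show the actual workshop queries, so treat this as an assumption: the field name `headline` is a guess based on the News Category Dataset, and `match_query` is a hypothetical helper. The `operator` parameter of the match query is real: "or" (the default) matches documents containing any term, which favors recall, and "and" requires every term, which favors precision.

```python
import json


def match_query(text: str, operator: str = "or") -> dict:
    """Build a match query body; the field name "headline" is an assumption."""
    return {
        "query": {
            "match": {
                "headline": {"query": text, "operator": operator}
            }
        }
    }


recall_oriented = match_query("how to form good habits")            # any term may match
precision_oriented = match_query("how to form good habits", "and")  # every term must match

# Either body could be sent with `GET news_headlines/_search` from Dev Tools.
print(json.dumps(precision_oriented, indent=2))
```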
Questions?
Join the Elastic Austin User Group for
updates on future workshops!
Lisa Jung
https://discuss.elastic.co/

