How We Incrementally Improved
           Search




        Ravi Mynampaty
         @ravimynampaty
Agenda


→ Background
• Roadmap
• Implementation
• Analytics
• Benefits
• Challenges
• Next Steps
Background: A few years ago


• Out-of-the-box Ultraseek
• No optimization, no customization
• Only a fraction of HBS content indexed / searchable
• Many dead ends
• Proliferation of different search tools
• User sentiment
     • “search sucks”
     • “why can’t it be more like Google”
Background: Our Vision

• One Search Box to Rule Them All
• The long-term goal: enterprise search
• One-stop searching
• Google-like simplicity
• Handle refinement / navigation on results page
Agenda


• Background
→ Roadmap
• Implementation
• Analytics
• Benefits
• Challenges
• Next Steps
Roadmap: Preliminary Steps

• Inventory document collections

• Inventory search-type tools

• Of the above, identify
      – most heavily used
      – strategically significant
      – high impact
      – low-hanging fruit (LHF)
Roadmap: Implementation Plan

• Prioritize tasks by ease of content access
  and implementation (LHF)

• Develop timeline

• Build prototypes and iterate the design
Agenda


• Background
• Roadmap
→ Implementation
• Analytics
• Benefits
• Challenges
• Next Steps
Implementation: How we built it

• Customized Ultraseek’s results display code
• Worked with owners of software apps, who either
      – Provided JSON APIs
      – Allowed us to spider their app/repository
• HTML is the API!!
• In other words:
      No rocket science involved
Implementation: Three Integration Approaches


• Blended Search (e.g., Faculty/Staff Directory)

• Brokered Query (e.g., Video Catalog)

• Query Resubmit (e.g., Alumni Directory)
Implementation: Blended Search

Spider HBS web content outside of HBS.EDU
• Harbus.org (student newspaper)
• Club and affiliated sites

Spider HBS content located in other applications
• Faculty and staff phone book
• Alumni Class Notes application
Implementation: Optimize and clean up search indexes

• Work with content owners to create good HTML page titles
   – Faculty Publications pages
   – 20th Century Leadership database
   – Address MS-Office / PDF files too

• Eliminate duplicate search results / use filters

• Adjust relevance per collection / source / file path
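
As a rough illustration of the last two items, the sketch below drops results that share a normalized URL and then rescales scores with a per-collection weight. The result fields (`url`, `score`, `collection`), the weights, and the function names are assumptions invented for this example; they are not Ultraseek settings or APIs. Ignoring the query string when comparing URLs is a simplification that would be wrong for pages that differ only by their parameters.

```python
from urllib.parse import urlsplit

# Hypothetical per-collection boosts; real values would be tuned in the engine.
COLLECTION_WEIGHTS = {"faculty": 1.5, "news": 1.0, "archive": 0.5}


def normalize_url(url):
    """Build a comparison key: lowercase host, no query/fragment, no trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return f"{parts.netloc.lower()}{path}"


def dedupe_and_rerank(results):
    """Keep the best weighted hit per normalized URL, then sort by weighted score."""
    best = {}
    for hit in results:
        weight = COLLECTION_WEIGHTS.get(hit["collection"], 1.0)
        weighted = dict(hit, score=hit["score"] * weight)
        key = normalize_url(hit["url"])
        if key not in best or weighted["score"] > best[key]["score"]:
            best[key] = weighted
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)
```

Collections missing from the weight table fall back to a neutral weight of 1.0, so new sources can be added before anyone decides how to rank them.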
Implementation: Create Best Bets

Top 10 Queries, Oct – Dec
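
One way to pick candidates for Best Bets is to count the most frequent queries in the search log and attach a hand-picked link to each. The sketch below assumes a plain list of logged query strings and a hypothetical `BEST_BETS` table; the queries and URLs are invented placeholders, not HBS data.

```python
from collections import Counter

# Hypothetical editor-curated table: normalized query -> recommended URL.
BEST_BETS = {
    "library hours": "https://example.edu/library/hours",
    "course catalog": "https://example.edu/courses",
}


def normalize(query):
    """Lowercase and collapse whitespace so variants of a query count together."""
    return " ".join(query.lower().split())


def top_queries(logged_queries, n=10):
    """Return the n most frequent normalized queries with their counts."""
    return Counter(normalize(q) for q in logged_queries).most_common(n)


def best_bet_for(query):
    """Look up the curated link for a query, or None if no Best Bet exists."""
    return BEST_BETS.get(normalize(query))
```

Running `top_queries` over a quarter of log data gives a short list to review by hand, which is presumably all a Top 10 report needs.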
Implementation: Unify Blended Search + Query Resubmit
Query refinement options
(Blended Search)




Query resubmit options
“Integration-lite”
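
Query Resubmit needs no indexing at all: the results page only has to build links that replay the user's query against another tool's own search form. A minimal sketch, assuming each target accepts the query as a URL parameter; the second target, all URLs, and the parameter names below are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical targets: label -> (search URL, name of the query parameter).
RESUBMIT_TARGETS = {
    "Alumni Directory": ("https://alumni.example.edu/search", "q"),
    "Library Catalog": ("https://library.example.edu/find", "term"),
}


def resubmit_links(query):
    """Build one 'search this elsewhere' link per target for the given query."""
    links = {}
    for label, (base_url, param) in RESUBMIT_TARGETS.items():
        # urlencode escapes spaces and reserved characters in the query.
        links[label] = f"{base_url}?{urlencode({param: query})}"
    return links
```

Because the target application runs the search itself, its own login and permission checks still apply, which is one reason this approach can suit sources that cannot be spidered.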
Implementation: Expanding the Net w/ Brokered Search


• When direct indexing isn’t practical
    – Harvard.edu search
    – HBS VideoTools (intranet only)
    – MBA Event Calendar (intranet only)
• A query is handed off to another search engine
• Results are returned “behind the scenes” as
  JavaScript Object Notation (JSON) / Python
• Ajax-like support of asynchronous search
  processes
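
The hand-off described above can be sketched as a small broker that sends one query to several engines in parallel and parses the JSON each returns. This is only an illustration under stated assumptions: the backend functions are local stand-ins for remote engines, and the `{"results": [...]}` response shape is invented, not the format used at HBS.

```python
import json
from concurrent.futures import ThreadPoolExecutor


# Stand-ins for remote engines: each takes a query and returns a JSON string.
# A real broker would issue an HTTP request here instead.
def video_backend(query):
    return json.dumps({"results": [{"title": f"Video about {query}"}]})


def calendar_backend(query):
    return json.dumps({"results": []})


def failing_backend(query):
    raise TimeoutError("backend did not answer")


def broker(query, backends):
    """Send the query to every backend in parallel and collect parsed results.

    A backend that raises, or returns something unparseable, yields an
    empty list instead of failing the whole request.
    """
    merged = {}
    with ThreadPoolExecutor(max_workers=len(backends) or 1) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in backends.items()}
        for name, future in futures.items():
            try:
                merged[name] = json.loads(future.result())["results"]
            except Exception:
                merged[name] = []
    return merged
```

Calling `broker("strategy", {"video": video_backend, "calendar": calendar_backend})` returns one result list per source, which a results page could render in separate panels as each source answers.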
Implementation: Brokered Query in Action
Implementation: One-offs

• Software Dev Docs (cmd line)

$ find ./software/docs -name '*html' \
         | xargs grep -i oracle | less

(returns 100s of docs)

• Built web-based search UI
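
For comparison, the same lookup can be written as a function that a small web UI could call. This is a sketch, not the tool that was built: the function name and parameters are made up, and unlike `grep` it returns matching file paths rather than matching lines.

```python
import os


def search_docs(root, term, suffix="html"):
    """Return sorted paths under root whose names end with suffix and whose
    text contains term, ignoring case."""
    needle = term.lower()
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on files with odd encodings.
            with open(path, encoding="utf-8", errors="replace") as handle:
                if needle in handle.read().lower():
                    matches.append(path)
    return sorted(matches)
```

A call such as `search_docs("./software/docs", "oracle")` mirrors the command above; a web front end would add paging and snippets on top of the returned paths.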
Agenda


• Background
• Roadmap
• Implementation
→ Analytics
• Benefits
• Challenges
• Next Steps
Analytics: Tracking Usage of Features
Analytics: Tracking Best Bets
Agenda


• Background
• Roadmap
• Implementation
• Analytics
→ Benefits
• Challenges
• Next Steps
Benefits

• Single point of access for various repositories

• Shortcomings of underlying tools overcome

• Better access to content from rest of Harvard

• Traffic boost to e-commerce site
Agenda


• Background
• Roadmap
• Implementation
• Analytics
• Benefits
→ Challenges
• Next Steps
Challenges


• Search is never done

• Complex permissions issues

• SERP design convergence

• SharePoint
Agenda


• Background
• Roadmap
• Implementation
• Analytics
• Benefits
• Challenges
→ Next Steps
Next Steps

• Tackling the mixed-mode situation
• Integration with taxonomies
• Search experience within HBS applications
• Faceted search where rich metadata is
  available
• Analytics feeding website design and
  vocabulary development
Conclusion


• Tactical, iterative approach enabled
  significant progress

• Implementing simpler features/tweaks may
  have higher impact

• Your existing search engine may have more
  gas in it than you realize
