QUERY EXPANSION WITH ENRICHED USER PROFILES FOR PERSONALIZED
SEARCH UTILIZING FOLKSONOMY DATA
ABSTRACT
Query expansion has been widely adopted in Web search as a way of tackling the ambiguity of
queries. Personalized search utilizing folksonomy data has demonstrated an extreme vocabulary
mismatch problem that requires even more effective query expansion methods. Co-occurrence
statistics, tag-tag relationships and semantic matching approaches are among those favored by
previous research. However, user profiles which only contain a user’s past annotation
information may not be enough to support the selection of expansion terms, especially for users
with limited previous activity with the system. We propose a novel model to construct enriched
user profiles with the help of an external corpus for personalized query expansion. Our model
integrates the current state-of-the-art text representation learning framework, known as word
embeddings, with topic models in two groups of pseudo-aligned documents. Based on user
profiles, we build two novel query expansion techniques. These two techniques are based on
topical weights-enhanced word embeddings and on the topical relevance between the query and the
terms inside a user profile, respectively. The results of an in-depth experimental evaluation,
performed on two real-world datasets using different external corpora, show that our approach
outperforms traditional techniques, including existing non-personalized and personalized query
expansion methods.
EXISTING SYSTEM
Over the past few years, personalized search algorithms that utilize folksonomy data
have attracted significant attention in the literature. This is partially due to the relative
unavailability of users’ search and click-through history to independent researchers not
employed by, or engaged with, a commercial search engine. Another reason for utilizing
folksonomy data is that tags are highly ambiguous, representing a typical real-world Web search
scenario of short queries formulated by users. “Folksonomy” is a term typically used to describe
the social classification phenomenon. Online folksonomy services are used by millions of users
world-wide, enabling users to save and organize their online bookmarks with freely chosen short
text descriptors.
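
As a minimal sketch of the data involved, folksonomy annotations can be modelled as (user, resource, tag) records, and a basic user profile as the counts of tags a user has applied. The names `Annotation` and `build_profiles` and the sample data are hypothetical, chosen only for illustration; the source does not specify a data model.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Annotation(NamedTuple):
    """One bookmarking event (hypothetical record layout)."""
    user: str
    resource: str  # e.g. a bookmarked URL
    tag: str       # freely chosen short text descriptor


def build_profiles(annotations):
    """Map each user to a Counter of the (lower-cased) tags they have used."""
    profiles = defaultdict(Counter)
    for a in annotations:
        profiles[a.user][a.tag.lower()] += 1
    return profiles


# Toy data, invented for illustration only.
annotations = [
    Annotation("alice", "http://example.org/a", "python"),
    Annotation("alice", "http://example.org/b", "Python"),
    Annotation("alice", "http://example.org/b", "snake"),
    Annotation("bob", "http://example.org/c", "java"),
]
profiles = build_profiles(annotations)
```

A profile built this way contains only past annotations, so a user with few bookmarks ends up with a very sparse profile.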
DISADVANTAGES:
• User profiles which contain only a user’s past annotation information may not be enough
to support the effective selection of expansion terms, especially for users who have had
limited previous activity with the system.
• Previous personalized QE research either favors tag-tag relationships or relies on the
co-occurrence statistics of two terms.
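
To illustrate the kind of co-occurrence baseline mentioned above, the following sketch counts how often two tags are applied to the same resource and expands a query term with its most frequent companions. It is an assumed, simplified form of co-occurrence-based expansion, not the method of any particular prior work; the function names and toy data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_counts(tagged_resources):
    """Count, for each tag, how often every other tag shares a resource with it."""
    counts = defaultdict(Counter)
    for tags in tagged_resources:
        for a, b in combinations(sorted(set(tags)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def expand(query_term, counts, k=2):
    """Return the k tags that co-occur most often with query_term."""
    neighbours = counts.get(query_term, Counter())
    return [tag for tag, _ in neighbours.most_common(k)]


# Toy data: each inner list holds the tags attached to one resource.
tagged_resources = [
    ["python", "programming", "tutorial"],
    ["python", "programming"],
    ["python", "snake"],
]
counts = cooccurrence_counts(tagged_resources)
top = expand("python", counts, k=1)
```

Because the scores depend only on observed tag pairs, a term that never co-occurs with the query term cannot be suggested, which is one way the vocabulary mismatch problem shows up.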
PROPOSED SYSTEM
We tackle the challenge of personalized QE utilizing folksonomy data in a novel way by
integrating latent and deep semantics. We propose a novel model that integrates word
embeddings with topic models to construct enriched user profiles with the help of an external
corpus. We suggest two novel personalized QE techniques based on topical weights-enhanced
word embeddings, and the topical relevance between the query and the terms inside a user
profile. The techniques demonstrate significantly better results than previously proposed
non-personalized and personalized QE methods.
ADVANTAGES
Our model integrates the current state-of-the-art text representation learning framework, known
as word embeddings, with topic models in two groups of pseudo-aligned documents between
user annotations and documents from the external corpus. Based on these enhanced user profiles,
we then present two novel QE techniques.
The first technique approaches the problem by using topical weights-enhanced word embeddings
to select the best possible expansion terms.
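
One possible reading of this first technique is sketched below: each candidate term is scored by the cosine similarity between its embedding and the query embedding, multiplied by a topical weight for that term. The source does not give the scoring formula, so the product form, the function names, and the toy vectors and weights are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_expansion_terms(query_vec, embeddings, topical_weight, k=2):
    """Rank candidates by (assumed) score = topical weight * cosine similarity."""
    scored = {
        term: topical_weight.get(term, 0.0) * cosine(query_vec, vec)
        for term, vec in embeddings.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:k]


# Toy 2-dimensional embeddings and topical weights, invented for illustration.
embeddings = {
    "programming": [0.9, 0.1],
    "snake": [0.1, 0.9],
    "tutorial": [0.8, 0.3],
}
topical_weight = {"programming": 0.7, "snake": 0.1, "tutorial": 0.2}
query_vec = [1.0, 0.0]

ranked = rank_expansion_terms(query_vec, embeddings, topical_weight, k=2)
```

In a real system the embeddings would be learned from the user annotations and the external corpus, and the weights would come from the topic model; here both are hard-coded.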
The second technique calculates the topical relevance between the query and the terms inside a
user profile.
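
The second technique can be sketched in a similarly hedged way. Assume a topic model supplies a topic distribution for each term; the query's distribution is taken as the average over its terms, and a profile term's relevance is the dot product of the two distributions. This particular relevance measure, the function names, and the toy distributions are assumptions, since the source does not define them.

```python
def query_topics(query_terms, term_topics, n_topics):
    """Average the topic distributions of the known query terms (uniform if none)."""
    known = [term_topics[t] for t in query_terms if t in term_topics]
    if not known:
        return [1.0 / n_topics] * n_topics
    return [sum(d[z] for d in known) / len(known) for z in range(n_topics)]


def topical_relevance(q_dist, t_dist):
    """Assumed relevance: dot product of two topic distributions."""
    return sum(q * t for q, t in zip(q_dist, t_dist))


def select_terms(query_terms, profile_terms, term_topics, n_topics, k=1):
    """Pick the k profile terms most topically relevant to the query."""
    q = query_topics(query_terms, term_topics, n_topics)
    candidates = [
        t for t in profile_terms if t in term_topics and t not in query_terms
    ]
    return sorted(
        candidates,
        key=lambda t: topical_relevance(q, term_topics[t]),
        reverse=True,
    )[:k]


# Toy two-topic distributions, invented for illustration.
term_topics = {
    "python": [0.8, 0.2],
    "django": [0.9, 0.1],
    "zoo": [0.1, 0.9],
}
chosen = select_terms(["python"], ["django", "zoo", "python"], term_topics, 2, k=1)
```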
OBJECTIVES
• We tackle the challenge of personalized QE utilizing folksonomy data in a novel way by
integrating latent and deep semantics.
• We propose a novel model that integrates word embeddings with topic models to
construct enriched user profiles with the help of an external corpus.
• We suggest two novel personalized QE techniques based on topical weights-enhanced
word embeddings, and the topical relevance between the query and the terms inside a
user profile. The techniques demonstrate significantly better results than previously
proposed non-personalized and personalized QE methods.
Architecture Diagram
SYSTEM REQUIREMENTS
H/W SYSTEM CONFIGURATION
Processor - Pentium IV
Speed - 1.5 GHz
RAM - 512 MB (min)
Hard Disk - 40 GB
S/W SYSTEM CONFIGURATION
• Operating System : Windows 95/98/2000/XP
• Application Server : Tomcat 5.0/6.x
• Front End : HTML, Java, JSP
• Scripts : JavaScript
• Server-side Script : Java Server Pages
• Database Connectivity : MySQL
