Aggregating Multiple Dimensions
for Computing Document Relevance
Mauro Dragoni
Fondazione Bruno Kessler (FBK), Shape and Evolving Living Knowledge Unit (SHELL)
2nd KEYSTONE Summer School
Santiago de Compostela, July 21st 2016
1
How will we spend time today?
 Our Goal:
to understand how documents can be evaluated by adopting a multi-
criteria framework
2
Presentation of the theoretical framework
Case Study 1
Representing
documents through
different layers
Case Study 2
Combining user
profiles, queries, and
document content for
computing relevance
Case Study 3
Merge and explode
Case Study 1 and Case
Study 2… 
Why is this topic interesting?
 Indexing documents and querying repositories is not only a matter of
weighting terms
 At the end of this lesson you should be able to:
 consider a document from different perspectives
 understand why YOU can be part of the document score
 know how to treat different types of information content
 What might I expect from you?
 To see a paper on this topic published in the near future… 
 To get new ideas, proposed by you…
3
Some Background
 The main idea behind this topic is “multi-criteria decision making”
 What does it mean?
 Suppose we have an entity E and a set C of n criteria
 We need to evaluate, for each criterion Ci, how much E satisfies Ci
 We have to aggregate all the satisfaction degrees to evaluate E
 Some suggested papers
 Ronald R. Yager. Modeling prioritized multicriteria decision making. IEEE Trans. Systems, Man, and
Cybernetics, Part B 34(6): 2396-2404 (2004)
 Ronald R. Yager. Prioritized aggregation operators. Int. J. Approx. Reasoning 48(1): 263-274 (2008)
 Célia da Costa Pereira, Mauro Dragoni, Gabriella Pasi. Multidimensional relevance: Prioritized
aggregation in a personalized Information Retrieval setting. Inf. Process. Manage. 48(2): 340-357
(2012)
 Francesco Corcoglioniti, Mauro Dragoni, Marco Rospocher, Alessio Palmero Aprosio. Knowledge
Extraction for Information Retrieval. ESWC 2016: 317-333
4
Further Readings
 Fuzzy Logic
 Zadeh book and papers
 Knowledge Extraction
 Semantic Web (ISWC conference series, KBS and JWS journals, …)
 Knowledge Management (KR, IJCAI, AAAI, …)
 Natural Language Processing (ACL, COLING, …)
 User Modeling and Interaction
 UMAP proceedings
 HCI papers
5
Introductory Example
 John is looking for a bicycle for his little son
 John takes care of two criteria: “safety” and “inexpensiveness”
 John considers “safety” > “inexpensiveness”
 We may face two scenarios:
1. John is not able to find a “safe” bicycle that is also “cheap”.
2. John has a low budget. Thus, he has to find a trade-off between the two criteria.
6
[Diagram: entity E evaluated against criteria C1 and C2]
Problem Representation
 Components
 the set C of the n considered criteria: C = {C1, …, Cn};
 the collection D of entities (documents in the specific case of IR);
 an aggregation function F computing the score F(C1(d),…, Cn(d)) of each
document d contained in D;
 a priority model P defined by… someone (user, system maintainer, etc.);
 a weighting schema W.
7
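A minimal sketch of these components (illustrative names, not code from the cited papers): each criterion maps a document to a satisfaction degree in [0, 1], the priority model is an ordering over the criteria, the weighting schema assigns one weight per criterion, and the aggregation operator turns the weighted degrees into a single score.

```python
from typing import Callable, Dict, List, Tuple

Criterion = Callable[[dict], float]                        # C_i: document -> satisfaction degree in [0, 1]
Aggregator = Callable[[List[Tuple[float, float]]], float]  # [(weight, degree), ...] -> overall score

def score_document(doc: dict,
                   criteria: Dict[str, Criterion],
                   priority: List[str],          # criterion names, most important first
                   weights: Dict[str, float],
                   operator: Aggregator) -> float:
    """Compute F(C1(d), ..., Cn(d)) for a single document d."""
    pairs = [(weights[name], criteria[name](doc)) for name in priority]
    return operator(pairs)

def scoring(pairs: List[Tuple[float, float]]) -> float:
    """One possible operator (the "scoring" one discussed later): sum of weight * degree."""
    return sum(w * s for w, s in pairs)
```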
Weighting Schema – Expert-based choice
 Weights are arbitrarily chosen by an expert.
 No rules for computing them.
 For example:
 C1 → λ1 = 0.7
 C2 → λ2 = 0.5
 C3 → λ3 = 0.6
 C4 → λ4 = 0.3
 You need to justify the values you choose.
8
Weighting Schema – Priority-based choice
 Weights are computed “automatically” based on the priority between
criteria.
 For each document d, the weight of the most important criterion C1 is set
to 1.0 by definition.
 The weights of the other criteria are computed along the priority order, each one equal to the previous weight multiplied by the previous criterion's satisfaction degree: w2 = w1 * C1(d), w3 = w2 * C2(d), and so on (a small sketch follows).
9
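A minimal sketch of this priority-based schema, assuming the recurrence used in the worked example later in the talk (w1 = 1, and each further weight equals the previous weight times the previous criterion's satisfaction degree); names are illustrative.

```python
from typing import List

def priority_weights(degrees_in_priority_order: List[float]) -> List[float]:
    """degrees_in_priority_order[i] is the satisfaction degree of the (i+1)-th
    most important criterion for document d. Returns one weight per criterion:
    w1 = 1.0, and each further weight is the previous weight multiplied by the
    previous criterion's degree."""
    weights = [1.0]
    for previous_degree in degrees_in_priority_order[:-1]:
        weights.append(weights[-1] * previous_degree)
    return weights

# With the degrees used in the worked example (priority C1 > C2 > C3 > C4,
# C1 = 0.5, C2 = 0.8, C3 = 0.2, C4 = 0.7):
print(priority_weights([0.5, 0.8, 0.2, 0.7]))   # [1.0, 0.5, 0.4, 0.08] (up to float rounding)
```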
Weighting Schema – Considerations
 A weighting schema can be decided a-priori but…
 We can learn a new weighting schema:
 from a learning-to-rank dataset, or
 from the IR system usage.
 The choice of the weighting schema, obviously, affects the effectiveness
of your information retrieval system.
 Where can we apply such a weighting schema?
10
Three (not exhaustive) Operators
 As you can imagine… there are different ways for combining weights and
criteria
 Operator 1: “Scoring”
 weighted criteria scores are summed
 Operator 2: “Min” or “And”
 among weighted criteria scores, minimum score is selected
 Operator 3: “Max” or “Or”
 among weighted criteria scores, maximum score is selected
11
The “Scoring” Operator
 The overall document score is computed by summing the weighted
scores computed for all criteria.
 The score computed on the most important criterion drives the overall
document score.
 Less important criteria help in refining the overall document score.
12
The “And” (or “Min”) Operator
 The document score is strongly dependent on the degree of satisfaction
of the least satisfied criterion
 Very restrictive operator
 Suggestion: consider criteria that are really relevant for a user!!!
13
The “Or” (or “Max”) Operator
 Dangerous operator!
 Recommendation: criteria with a satisfaction degree of zero should not
be considered.
 It is useful only when priority between criteria is not used.
 Weighting schema is manually defined
 Weights of less important criteria are not based on the values of the most
important ones.
14
Operators’ Properties
 Boundary Conditions
 Continuity
 Monotonicity (just for Scoring)
 Absorbing Element (“0”, for Scoring and Min operators)
15
The Operators in Action
 Assume we have a document D composed as follows:
16
Title
Abstract
Introduction
Content
Title → C1
Abstract → C2
Introduction → C3
Content → C4
The Operators in Action
 Suppose we perform the following query:
 Q = {qt1, qt2, qt3}
 Assume that, for each document field, you have a normalized similarity
value:
 sim(Q, D_Title) = 0.5
 sim(Q, D_Abstract) = 0.8
 sim(Q, D_Introduction) = 0.2
 sim(Q, D_Content) = 0.7
 As you can imagine, by using different priorities and different
aggregations, the document score will be different.
17
The Operators in Action
Criteria scores: C1 = 0.5; C2 = 0.8; C3 = 0.2; C4 = 0.7
Priority schemas:
P1: C1 > C2 > C3 > C4
P2: C1 > C2 > C4 > C3
Weights (w1 = 1.0 by definition; each subsequent weight is the previous weight multiplied by the previous criterion's score, following the priority order):
for P1: w1 = 1.0; w2 = 1.0 * 0.5 = 0.5; w3 = 0.5 * 0.8 = 0.4; w4 = 0.4 * 0.2 = 0.08
for P2: w1 = 1.0; w2 = 1.0 * 0.5 = 0.5; w3 = 0.5 * 0.8 = 0.4; w4 = 0.4 * 0.7 = 0.28
18
The Operators in Action
 Document score
 “Scoring” operator:
• DP1 = (0.5 * 1.0) + (0.8 * 0.5) + (0.2 * 0.4) + (0.7 * 0.08) = 1.036
• DP2 = (0.5 * 1.0) + (0.8 * 0.5) + (0.7 * 0.4) + (0.2 * 0.28) = 1.236
 “And” operator:
• DP1 = min(0.5^1.0, 0.8^0.5, 0.2^0.4, 0.7^0.08) = min(0.5, 0.89, 0.53, 0.97) = 0.5
• DP2 = min(0.5^1.0, 0.8^0.5, 0.7^0.4, 0.2^0.28) = min(0.5, 0.89, 0.87, 0.64) = 0.5
 “Or” operator:
• DP1 = max(0.5^1.0, 0.8^0.5, 0.2^0.4, 0.7^0.08) = max(0.5, 0.89, 0.53, 0.97) = 0.97
• DP2 = max(0.5^1.0, 0.8^0.5, 0.7^0.4, 0.2^0.28) = max(0.5, 0.89, 0.87, 0.64) = 0.89
19
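The computations above can be reproduced with a short script. This is a sketch under the reading used on these slides: "Scoring" sums weight * score, while "And"/"Or" take the minimum/maximum of each score raised to the power of its weight; names are illustrative.

```python
from typing import Dict, List, Tuple

def scoring(pairs: List[Tuple[float, float]]) -> float:    # pairs = [(weight, score), ...]
    return sum(w * s for w, s in pairs)

def and_op(pairs: List[Tuple[float, float]]) -> float:
    return min(s ** w for w, s in pairs)

def or_op(pairs: List[Tuple[float, float]]) -> float:
    return max(s ** w for w, s in pairs)

scores: Dict[str, float] = {"C1": 0.5, "C2": 0.8, "C3": 0.2, "C4": 0.7}

def weighted_pairs(priority: List[str]) -> List[Tuple[float, float]]:
    """Weights follow the priority order: w1 = 1, and each next weight is the
    previous weight multiplied by the previous criterion's score."""
    w, pairs = 1.0, []
    for i, name in enumerate(priority):
        if i > 0:
            w *= scores[priority[i - 1]]
        pairs.append((w, scores[name]))
    return pairs

for label, priority in [("P1", ["C1", "C2", "C3", "C4"]),
                        ("P2", ["C1", "C2", "C4", "C3"])]:
    p = weighted_pairs(priority)
    print(label, round(scoring(p), 3), round(and_op(p), 2), round(or_op(p), 2))
# Expected output:
#   P1 1.036 0.5 0.97
#   P2 1.236 0.5 0.89
```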
Any question so far?
20
Timeout…
Case Study 1 – The Scenario
 Keyword search over a multi-layer representation of documents
 Documents and queries structure:
 Textual layer: natural language text
 Metadata layers:
• Entity Linking
• Predicates
• Roles/Types
• Timing Information
 Problems:
 How to compute the score for each layer?
 How to aggregate such scores?
 How to weight each layer?
21
Case Study 1 – The Scenario
 Natural language content is enriched with four metadata/semantic layers
 URI Layer: links to entities detected in the text and mapped to DBpedia
entities
 TYPE Layer: conceptual classification of the named entities detected in the
text, mapped to both the DBpedia and YAGO knowledge bases
 TIME Layer: metadata related to the temporal mentions found in the text by
a temporal expression recognizer (e.g. “the eighteenth century”, “2015-18-
12”, etc.)
 FRAME Layer: output of semantic role labeling techniques. Generally, this
output includes predicates and their arguments, each describing a
specific role in the context of the predicate.
Example:
“He has been influenced by Carl Gauss” →
[framebase:Subjective_influence; dbpedia:Carl_Friedrich_Gauss]
22
Case Study 1 – Example
 Text: “astronomers influenced by Gauss”
 Layers
 URI Layer: “dbpedia:Carl_Friedrich_Gauss”
 TYPE Layer: “yago:GermanMathematicians”, “yago:NumberTheorists”,
“yago:FellowsOfTheRoyalSociety”
 TIME Layer: “day:1777-04-30”, “day:1855-02-23”, “century:1700”
 FRAME Layer: “Subjective_influence.v_Carl_Friedrich_Gauss”
 Annotations provided by PIKES (https://pikes.fbk.eu)
23
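To make the layered representation concrete, here is a purely illustrative sketch of the example above as one term set per layer, with a toy per-layer overlap score. The TEXTUAL terms are assumed (the slide lists only the semantic layers), and the real annotations produced by PIKES carry weights and richer structure.

```python
# Illustrative only: the example text represented as one term set per layer.
query_layers = {
    "TEXTUAL": {"astronomer", "influence", "gauss"},        # assumed lemmatized terms
    "URI":     {"dbpedia:Carl_Friedrich_Gauss"},
    "TYPE":    {"yago:GermanMathematicians", "yago:NumberTheorists",
                "yago:FellowsOfTheRoyalSociety"},
    "TIME":    {"day:1777-04-30", "day:1855-02-23", "century:1700"},
    "FRAME":   {"Subjective_influence.v_Carl_Friedrich_Gauss"},
}

def layer_overlap(query_layer: set, doc_layer: set) -> float:
    """A toy per-layer similarity: fraction of query-layer terms also found in
    the corresponding document layer."""
    return len(query_layer & doc_layer) / len(query_layer) if query_layer else 0.0
```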
Case Study 1 - Evaluation
 331 documents, 35 queries
 Jörg Waitelonis, Claudia Exeler, Harald Sack. Enabled Generalized Vector Space Model to
Improve Document Retrieval. NLP-DBPEDIA@ISWC 2015: 33-44
 Multi-value relevance (1=irrelevant, 5=relevant)
 Diverse queries: from keyword-based search to queries requiring semantic
capabilities
24
Case Study 1 - Evaluation
 2 baselines:
 Google custom search API
 Textual layer only (~Lucene)
 Measures: Prec1,5,10, MAP, MAP10, NDCG, NDCG10
 Equal total weight for the textual layer and for the semantic layers, split uniformly among the latter:
 TEXTUAL (50%)
 URI (12.5%), TYPE (12.5%), FRAME (12.5%), TIME (12.5%)
25
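A minimal sketch of how such expert-based weights could combine per-layer similarities, assuming a plain weighted sum (the "Scoring" operator from the first part of the talk); the per-layer similarity functions themselves are outside the scope of this sketch, and the names are illustrative.

```python
LAYER_WEIGHTS = {"TEXTUAL": 0.50, "URI": 0.125, "TYPE": 0.125,
                 "FRAME": 0.125, "TIME": 0.125}

def document_score(layer_similarities: dict) -> float:
    """layer_similarities maps a layer name to a normalized query/document
    similarity in [0, 1]; layers absent from the dict simply contribute 0."""
    return sum(LAYER_WEIGHTS[layer] * sim
               for layer, sim in layer_similarities.items()
               if layer in LAYER_WEIGHTS)

# e.g. document_score({"TEXTUAL": 0.6, "URI": 0.8, "TYPE": 0.4}) ≈ 0.45
```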
Case Study 1 - Evaluation
26
Approach/System Prec1 Prec5 Prec10 NDCG NDCG10 MAP MAP10
Google 0.543 0.411 0.343 0.434 0.405 0.255 0.219
Textual 0.943 0.669 0.453 0.832 0.782 0.733 0.681
KE4IR 0.971 0.680 0.474 0.854 0.806 0.758 0.713
KE4IR vs. Textual 3.03% 1.71% 4.55% 2.64% 2.99% 3.50% 4.74%
Case Study 1 - Evaluation
27
Layers (TEXTUAL+) Prec1 Prec5 Prec10 NDCG NDCG10 MAP MAP10
URI,TYPE,FRAME,TIME 0.971 0.680 0.474 0.854 0.806 0.758 0.713
URI,TYPE,FRAME 0.971 0.680 0.474 0.853 0.804 0.757 0.712
URI,TYPE,TIME 0.971 0.680 0.474 0.851 0.802 0.757 0.712
URI,TYPE 0.971 0.680 0.474 0.849 0.801 0.755 0.710
URI,FRAME,TIME 0.971 0.674 0.465 0.844 0.796 0.750 0.702
URI,FRAME 0.971 0.674 0.465 0.842 0.795 0.749 0.702
URI,TIME 0.971 0.674 0.465 0.840 0.791 0.747 0.700
TYPE,FRAME,TIME 0.943 0.674 0.471 0.848 0.799 0.745 0.700
TYPE,TIME 0.943 0.674 0.471 0.843 0.794 0.743 0.697
TYPE,FRAME 0.943 0.674 0.468 0.847 0.797 0.743 0.695
FRAME,TIME 0.943 0.674 0.462 0.842 0.793 0.741 0.693
Case Study 1 - Evaluation
28
Case Study 1 – What We Learnt
 How the effectiveness of a system is affected when we change the weights.
 In this specific case, the use of an expert-based weighting schema helps
you balance the importance of the semantic information…
 … however, we are using learning to rank to identify potential
priorities between the layers.
 Further lessons relate more to the use of the semantic layers themselves.
 Future work: to apply the approach to larger collections.
29
Any question on
Case Study 1?
30
Timeout…
Case Study 2 – The Scenario
 Combine document information with user profiles.
 Assumption: you already have computed user profiles.
 Which information can you use?
 RELIABILITY: How much a user trusts the document source.
 COVERAGE: How strongly a user profile is represented in a document
(inclusion of the user profile in the document).
 APPROPRIATENESS: How much a document satisfies a user profile (similarity
between user profile and document).
 ABOUTNESS: Trivial criterion: how much a document matches the submitted
query.
31
Case Study 2 – Reliability
 Why do we trust information sources differently?
 How much do you trust an information source?
 you might fix such values;
 you might infer them.
32
Case Study 2 – Coverage
 The “coverage” criterion computes how strongly a user profile is
contained in the document
 Suppose we have the profile of a user interested in the following topics:
 c = {sports, economics}
 Suppose we have a document talking about the following topics:
 d = {violence, politics, economics, sports}
 c = {0, 0, 1, 1} d = {1, 1, 1, 1} → Coverage(c,d) = 1.0
33
Case Study 2 – Appropriateness
 The “appropriateness” criterion computes how much a
document satisfies a user profile
 Suppose we have the profile of a user interested in the following topics:
 c = {sports, economics}
 Suppose we have a document talking about the following topics:
 d = {violence, politics, economics, sports}
 c = {0, 0, 1, 1} d = {1, 1, 1, 1} → Appropriateness(c,d) = 0.5
34
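The two examples above can be reproduced with a crisp (0/1) sketch: coverage as the fraction of the user's topics appearing in the document, and appropriateness read here as the fraction of the document's topics belonging to the profile. This is only an illustration consistent with the slides; the IP&M 2012 paper cited earlier works with graded (fuzzy) representations.

```python
def coverage(profile: set, doc_topics: set) -> float:
    """How strongly the user profile is contained in the document."""
    return len(profile & doc_topics) / len(profile) if profile else 0.0

def appropriateness(profile: set, doc_topics: set) -> float:
    """How much the document sticks to the user's topics of interest
    (one crisp reading consistent with the slide example)."""
    return len(profile & doc_topics) / len(doc_topics) if doc_topics else 0.0

c = {"sports", "economics"}
d = {"violence", "politics", "economics", "sports"}
print(coverage(c, d), appropriateness(c, d))   # 1.0 0.5
```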
Case Study 2 – Aboutness
 The “classic” similarity between a query and documents contained in a
repository.
 Many models are available… and various adaptations based on the context.
35
Case Study 2 – Validation
 The Reuters RCV1 Collection has been used for creating user profiles and
for generating user queries.
 20 users have been involved in the evaluation campaign.
 Different aggregation schemas have been tested.
36
Case Study 2 – Validation (Ab > Ap > C > R)
37
Case Study 2 – What We Learnt
 When users are involved, it is very difficult to define an aggregation
schema.
 The same occurs for the priority between criteria.
 Creating (or learning) a user profile is already a big problem in itself.
 The quality of user profiles significantly affects the effectiveness of the
retrieval algorithm.
 If you start playing with criteria and weight schemas, you will never end!!!
38
Any question on
Case Study 2?
39
Timeout…
Case Study 3
 Let’s get back to the first simple example…
40
Title
Abstract
Introduction
Content
Title → C1
Abstract → C2
Introduction → C3
Content → C4
Case Study 3 – Suppose that…
 Each field has been annotated with different ontologies, but belonging to
the same domain
 this means that you have, for the same field, many layers with different
annotations… one for each used ontology
 Your repository contains documents coming from different sources
 is the reliability of each source the same?
 Your users have a history
 User profiles need to be updated
 this aspect is out of the scope of this talk… but you should be aware of it… 
 Any other idea?
41
Exploding Fields
42
You have something to think about… Good luck!!!
So… to conclude
 Considering retrieval as a multi-criteria decision making problem is
interesting to explore.
 There is room for investigating a lot of stuff.
 Do not be scared of using user profiles.
 I invite you to consider recent works on simulating user interactions with IR
systems
• David Maxwell, Leif Azzopardi. Simulating Interactive Information Retrieval: SimIIR: A
Framework for the Simulation of Interaction. SIGIR 2016: 1141-1144 (+ the tutorial he
gave)
 My suggestion: try to combine
 content
 semantic metadata
 user history
43
44
It’s time for questions…
Mauro Dragoni
Fondazione Bruno Kessler
https://shell.fbk.eu/index.php/Mauro_Dragoni
dragoni@fbk.eu
