SlideShare a Scribd company logo
AI in between online and offline discourse - and
what is the role of ChatGPT in all of that?
Denkreise, Universität Bonn
Stefan Dietze, 10.05.2023
2
Discourse Interactions
Algorithms/AI
Discourse, interactions, AI: complex interdependencies
Society, Media, Politics & Policies
(Offline & Online)
3
Discourse, interactions, AI: complex interdependencies
AI-based methods to understand & influence online
discourse & interactions (PART I)
▪ Recommendation & ranking in Web search (e.g.
Google), shopping or social media
▪ Social media feeds & timelines
▪ Fact checking, detection of misinformation (e.g.
news & social media)
▪ Classification of users (e.g. into political
leanings, experts/lay people etc)
▪ Hate-speech detection
▪ Stance & sentiment detection
▪ …
AI-based methods trained on & influencing data
from discourse & interactions
Discourse Interactions
Algorithms/AI
4
Discourse, interactions, AI: complex interdependencies
Contemporary and future AI challenges (PART II)
▪ Self-reinforcing cycle between AI &
online/offline behaviour: AI influenced by /
trained on data from discourse/interactions
that are in turn influenced by AI and so on
▪ Biases (in models & behaviour) => echo
chambers, filter bubbles etc
▪ Transparency, reproducibility, decay
▪ ….
Discourse Interactions
Algorithms/AI
5
AI for mining claims and stances in online discourse
stance,
claim trustworthiness?
stance,
claim trustworthiness?
6
Detecting stances towards claims/opinions in online discourse
Motivation
▪ Problem: detecting stance of documents (e.g. Web pages, scientific
publication) towards a given claim (unbalanced class distribution)
▪ Motivation: stance of documents (in particular disagreement) useful
(a) as signal for truthfulness (fake news detection) and (b) document or
source classification (PLDs, publishers)
Approach
▪ Cascading binary classifiers: addressing individual issues (e.g.
misclassification costs) per step
▪ Features, e.g. textual similarity (Word2Vec etc), sentiments, LIWC, etc.
▪ Best-performing models: 1) SVM with class-wise penalty, 2) CNN, 3)
SVM with class-wise penalty
▪ Experiments on FNC-1 dataset (and FNC baselines)
Results
▪ Minor overall performance improvement
▪ Improvement on disagree class by 27%
(but still far from robust)
A. Roy, A. Ekbal, S. Dietze, P. Fafalios, Exploiting stance hierarchies for cost-sensitive stance detection of
Web documents, IJIS2022
7
ClaimsKG: a knowledge graph of fact-checked claims (to feed AI)
Motivation
▪ Fact-checked claims spread across various
(unstructured) fact-checking sites
▪ Example: finding claims about / made by US
republican politicians across the Web?
Approach
▪ Harvesting claims & metadata from fact-
checking/news sites (e.g. snopes.com,
Politifact.com etc); currently approx. 70.000
claims
▪ Information extraction & linking
o Linking mentioned entities eg POTUS and
Trump to „wikipedia:DonaldTrump“
o Normalisation of ratings (true, false,
mixture, other); coreference resolution of
claims
o Exposing data through established
vocabularies and W3C standards and APIs
https://data.gesis.org/claimskg/
A. Tchechmedjiev, P. Fafalios, K. Boland, S. Dietze, B. Zapilko, K. Todorov,
ClaimsKG – A Live Knowledge Graph of fact-checked Claims, ISWC2019
8
ClaimsKG as training corpus for AI models for fact-checking
▪ Ground truth data for AI-based methods for fact-
checking
▪ Initial step for any fact-checker when „check-worthy
claims“ appear: „Claim Retrieval“ i.e. identifying if an
arbitrary statement (eg from a news article) has
already been fact-checked (ClaimsKG as knowledge
base)
▪ Shared AI task: „CLEF2022-CheckThat! Lab“
▪ Unsupervised approach „SimBA“ achieving state-of-
the-art performance (P@1 = 94%)
Hövelmeyer, A., Boland, K., Dietze, S., SimBa at CheckThat! 2022: Lexical
and Semantic Similarity Based Detection of Verified Claims in an
Unsupervised and Supervised Way, CLEF Working Notes 2022, CLEF 2022-
Conference and Labs of the Evaluation Forum.
http://dbpedia.org/resource/Tim_Berners-Lee
wna:positive-emotion
onyx:Intensity "0.75"
onyx:Intensity "0.0"
http://dbpedia.org/resource/Solid
wna:negative-emotion
AI for mining opinions: the case of Twitter
Challenges
▪ Heterogenity: multimodal, multilingual, informal,
“noisy” language
▪ Context dependence: interpretation of
tweets/posts (entities, sentiments) requires
consideration of context (e.g. time, linked content),
“Dusseldorf” => City or Football team
▪ Dynamics & scale: e.g. 6000 tweets per second,
plus interactions (retweets etc) and context (e.g.
25% of tweets contain URLs)
▪ Evolution and temporal aspects: evolution of
interactions over time crucial for many social
sciences questions
▪ Representativity and bias: demographic
distributions not known a priori in archived data
collections
TweetsKB – a large-scale research corpus of societal opinions
▪ Harvesting & archiving of 14 Billion tweets
(permanent collection from Twitter 1% sample since
2013)
▪ Information extraction pipeline to build a KG of
entities, interactions & sentiments
(distributed Map/Reduce batch processing)
o Entity linking with knowledge graph/DBpedia
(“president”/“potus”/”trump” =>
dbp:DonaldTrump)
o Sentiment analysis/annotation
o Geotagging
o Lifting into knowledge graph schema
Dimitrov, D., Baran, E., Fafalios, P., Yu, R., Zhu, X., Zloch, M., Dietze, S., TweetsCOV19 – A Knowledge Base of Semantically Annotated Tweets about the
COVID-19 Pandemic, CIKM2020
https://data.gesis.org/tweetskb
https://data.gesis.org/tweetskb
▪ Harvesting & archiving of 14 Billion tweets
(permanent collection from Twitter 1% sample since
2013)
▪ Information extraction pipeline to build a KG of
entities, interactions & sentiments
(distributed Map/Reduce batch processing)
o Entity linking with knowledge graph/DBpedia
(“president”/“potus”/”trump” =>
dbp:DonaldTrump)
o Sentiment analysis/annotation
o Geotagging
o Lifting into knowledge graph schema
▪ Public, privacy-aware, large-scale research corpus
of public opinions and their evolution
=> interdisciplinary research
Dimitrov, D., Baran, E., Fafalios, P., Yu, R., Zhu, X., Zloch, M., Dietze, S., TweetsCOV19 – A Knowledge Base of Semantically Annotated Tweets about the
COVID-19 Pandemic, CIKM2020
TweetsKB – a large-scale research corpus of societal opinions
Social science research using TweetsKB
https://dd4p.gesis.org
Investigating Vaccine Hesitancy in DACH countries
Germany suspends
vaccinations with Astra
Zeneca
Twitter discourse zu “Impfbereitschaft”
Social science research using TweetsKB
Investigating Vaccine Hesitancy in DACH countries
https://dd4p.gesis.org
Social science research using TweetsKB
Vaccine Hesitancy– key topics in “safety” category
„Schwangerschaft“ „Kimmich“
„Alter“
„Nebenwirkungen“
„Herzinfarkt“
„Zulassung“
https://dd4p.gesis.org
17
▪ AI4Sci project: understanding and classification of science discourse online (news, social Web)
▪ Percentage of tweets containing
links to scientific articles (journals,
publishers, science blogs etc)
▪ Uses list of > 30 K science web
domains
▪ Data source: TweetsKB
(https://data.gesis.org/tweetskb/),
> 10 bn tweets archived since 2013
https://ai4sci-project.org/
How about scientific discourse on the Web?
Example: Twitter
How about scientific discourse on the Web?
Example: Twitter
18
https://ai4sci-project.org/
Science claim
Science reference
Science relevance
No science
Science reference
Hafid, S., Schellhammer, S., Bringay, S., Todorov, K., Dietze, S., "SciTweets - A Dataset and Annotation Framework for Detecting Scientific Online
Discourse", CIKM2022
SciTweets dataset & classifier
20
▪ Ground truth dataset, heuristics-based sampling
strategy and annotation framework for testing
classification models
▪ 1261 expert-labeled tweets across all
classes/labels
▪ Baseline classifiers based on SciBERT transformer
model (fine-tuned/tested on SciTweets)
▪ Ongoing: analysis of large-scale science discourse
and its evolution
Hafid, S., Schellhammer, S., Bringay, S., Todorov, K., Dietze, S., SciTweets - A Dataset and Annotation Framework for Detecting Scientific Online Discourse,
CIKM2022
https://ai4sci-project.org/
21
But: do we actually need dedicated NLP methods after ChatGPT?
• ChatGPT does not outperform specialized
models in most NLP tasks
• ChatGPT’s impressiveness stems from its
capacity to do many things reasonably
well
• ChatGPT does not mark the end of AI/NLP
research
Source: http://opensamizdat.com/posts/chatgpt_survey/
24
Discourse, interactions, AI: complex interdependencies
Contemporary and future AI challenges (PART II)
▪ Self-reinforcing cycle between AI &
online/offline behaviour: AI influenced by /
trained on data from discourse/interactions
that are in turn influenced by AI and so on
▪ Biases (in models & behaviour) => echo
chambers, filter bubbles etc
▪ Transparency, reproducibility, decay
▪ ….
Discourse Interactions
Algorithms/AI
25
What‘s the role of ChatGPT et al
in all this?
• ChatGPT didn’t come out of nowhere but is a
continuation of trends throughout the last decades
• Mostly powered by progress in (deep) representation
& transfer learning
26
What is driving the recent AI advances?
Deep & representation learning for language understanding
Percentage of deep learning papers in major NLP conferences
(Source: Young et al., Recent Trends in Deep Learning Based Natural Language Processing)
• Pretraining of embeddings: predicting
low-dimensional vector representations
of words & text, e.g. Word2Vec
[Mikolov et al., 2013]
• RNN/CNN architectures in
encoder/decoder settings (e.g. for
machine translation) [Vaswani et al.,
2017]
• Pretraining language models for task-
specific transfer learning, e.g., BERT -
Bidirectional Encoder Representations
from Transformers [Devlin et al., 2018]
T. Mikolov et al., Distributed Representations of Words and Phrases and their Compositionality, NIPS (2013)
J. Devlin et al., BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (2018)
A. Vaswani et al. Attention is all you need, NIPS (2017)
27
Transfer learning for language understanding
Goal: Language Model (“reusable language
representation”)
(Paris:France, Tokyo:Japan)
Approach: un-/self-supervised Machine Learning
(e.g. „Masked Language Modeling“)
Data: the more the merrier, e.g. Wikipedia, Google Book
Corpus (200 Bn words), Common Crawl, Google crawl of
indexed Web pages etc
Data cheap, computation costly
Goal: dedicated model for downstream task
(e.g. dialog system, Tweet classification => Tweet X:
RightWing)
Approach: supervised machine learning, reinforcement
learning
Data: labelled data
(e.g. expert annotated tweets, weak labels)
Data costly, computation cheap
Pretraining of foundational language models Finetuning/supervised ML („the old ways“)
28
Transfer learning for language understanding
Goal: Language Model (“reusable language
representation”)
(Paris:France, Tokyo:Japan)
Approach: un-/self-supervised Machine Learning
(e.g. „Masked Language Modeling“)
Data: the more the merrier, e.g. Wikipedia, Google Book
Corpus (200 Bn words), Common Crawl, Google crawl of
indexed Web pages etc
Data cheap, computation costly
Goal: dedicated model for downstream task
(e.g. dialog system, Tweet classification => Tweet X:
RightWing)
Approach: supervised machine learning, reinforcement
learning
Data: labelled data
(e.g. expert annotated tweets, weak labels)
Data costly, computation cheap
Pretraining of foundational language models Finetuning/supervised ML („the old ways“)
• Language understanding capabilities of
models established during pretraining
stage
• Paradigm shift in NLP / NLU due to
transferable foundational models
• Little/no finetuning required for specific
down-stream tasks („Zero-Shot Learning“)
• Requires no costly data annotation but can
be fed with massive volumes of text
• „The larger the better“
(GPT3 = 175 Bn parameters)
29
AI progress & societal impact/challenges
▪ Stopping the „arms race“? The bigger the
better?
▪ Concentration of power: platform providers
(gatekeeper to data), AI companies (gatekeeper
to models, infrastructure)
▪ Environmental cost (Bender et al., On the Dangers of
Stochastic Parrots: Can Language Models Be Too Big?)
▪ Misinformation from generative models (e.g.
deep fakes)
▪ But it‘s not only about AI: data, infrastructure,
information has been controlled by large
commercial platform providers for quite some
time
30
Crucial AI challenges that we/AI research is trying to tackle
▪ Transparency of AI models
▪ Access to data, infrastructure, models
▪ Reproducibility & explainability
▪ Bias of data & AI models
▪ Decay of AI models
▪ …
31
AI transparency
Source: https://motherboard.vice.com/en_us/article/j5npeg/why-is-google-translate-spitting-out-sinister-religious-prophecies/
• Neural machine translation as black box system => hard to pin down causes of „learned“ translations
• Potential reasons: adversarial data fed into NN; NN learned to generate syntactically well formed
sentences/statements even on nonsensical input
32
AI transparency: how to evaluate AI if it outperforms humans?
• Saturation of NLP/image processing
benchmarks over time. Initial
performance at -1, human performance
at 0.
• Challenges for performance evaluation
• Creating novel kinds of AI benchmarks
• Note: humans still outperform AI in
many (less specialized) tasks/respects,
e.g. “common-sense reasoning”
Kiela, D. et al., Dynabench: Rethinking Benchmarking in NLP, NAACL2021
33
AI decay: evolving language and discourse context
▪ LMs trained on large volumes of text
▪ Yet: strong vocabulary shift, over-
/underrepresentation of topics/vocabulary in
particular time periods (e.g. Twitter COVID19-
discourse 2020 vs 2019)
▪ AI models for online discourse analysis
require frequent training and updates
(access to data/infrastructure?)
▪ Efficient methods for updating large language
models Source: Hombaiah et al., “Dynamic Language Models for continuously
evolving Content”, SIGKDD2021
34
AI reproducibility: „A worrying analysis of neural recommender approaches”
Dacrema, M. F., Cremonesi, P., Jannach, D., 2019. Are we really making much progress? A worrying analysis of
recent neural recommendation approaches. In ACM RecSys2019. DOI:https://doi.org/10.1145/3298689.3347058
Are DL-based methods reproducible?
Do they actually beat simple baselines?
• Reproducibility crisis (due to lack of transparency)
• State-of-the-art crisis (driven by peer review crisis)
AI biases: learned and elevated by AI/deep learning models
Source: https://techcrunch.com/2016/03/24/microsoft-silences-its-new-a-i-bot-tay-after-twitter-users-teach-it-racism/
35
[N-word]
• Biases in human interactions can be learned and elevated by ML models => fairness and ethics of AI, data
quality and representativeness
• Crucial: research into understanding of AI models
36
Model understanding: do LLMs actually „understand“ language?
▪ Language model (e.g. GPT) prompting
as a way to assess its capacity to
comprehend language („intelligence“)
▪ How does prompt syntax influence
„knowledge retrieval“?
Linzbach, S., Tressel, T., Kallmeyer, L., Dietze, S., Jabeen, H., 2023. Decoding Prompt Syntax: Analysing its Impact on Knowledge Retrieval in Large Language Models. In
Companion Proceedings of the ACM Web Conference 2023
The role of syntax in question-answering through LLMs
37
▪ Experiments on BERT-based
(Transformer) models
▪ Testing for sentence typology (simple,
compound, etc) and
active/passive/nominalised verb
morphology
Linzbach, S., Tressel, T., Kallmeyer, L., Dietze, S., Jabeen, H., 2023. Decoding Prompt Syntax: Analysing its Impact on Knowledge Retrieval in Large Language Models. In
Companion Proceedings of the ACM Web Conference 2023
Model understanding: do LLMs actually „understand“ language?
The role of syntax in question-answering through LLMs
38
Model understanding: how to measure actual „intelligence“ of AI?
• AI benchmark datasets/tests, eg to probe their
commonsense reasoning skills, e.g. AI2Reasoning (ARC)
• Models learn to perform well on a task while struggling
to actually understand
• Example: ChatGPT produces reasonable-sounding
answers that are actually wrong
• Constructing benchmark datasets for evaluation of AI
models remains a huge challenge
Branco et al., Shortcutted Commonsense: Data Spuriousness in Deep Learning of Commonsense Reasoning, EMNLP 2021
39
To sum up: how to beat the cycle?
▪ AI research = creating understanding AI models, e.g. what
and how do LLMs learn?
(e.g. SFB Deep Linguistic Modeling)
▪ FAIR access to AI data & models (e.g. GESIS Methods Hub)
▪ Values-oriented ML: e.g. explainability > accuracy
▪ Realistic and fair benchmark datasets & setups for AI
evaluation
▪ Maintainability: efficient ways on how to train/update
foundational models (e.g. DD4P)
▪ Interdisiciplinary research into complex interdependencies
between AI & online / offline discourse (e.g. NEWORDER)
▪ Governance & incentives: monopolies, ecological aspects
and access to data/infrastructure are not only technical but
societal & political issues
Discourse Interactions
Algorithms/AI
Society, Media, Politics & Policies
(Offline & Online)
40
@stefandietze
https://stefandietze.net
http://gesis.org/en/kts
Thanks!
41
References
▪ Branco et al., Shortcutted Commonsense: Data Spuriousness in Deep Learning of Commonsense Reasoning, EMNLP 2021
▪ Linzbach, S., Tressel, T., Kallmeyer, L., Dietze, S., Jabeen, H., 2023. Decoding Prompt Syntax: Analysing its Impact on Knowledge
Retrieval in Large Language Models. In Companion Proceedings of the ACM Web Conference 2023
▪ Dacrema, M. F., Cremonesi, P., Jannach, D., 2019. Are we really making much progress? A worrying analysis of recent neural
recommendation approaches. In ACM RecSys2019. DOI:https://doi.org/10.1145/3298689.3347058
▪ Kiela, D. et al., Dynabench: Rethinking Benchmarking in NLP, NAACL2021
▪ Hombaiah et al., “Dynamic Language Models for continuously evolving Content”, SIGKDD2021
▪ Hafid, S., Schellhammer, S., Bringay, S., Todorov, K., Dietze, S., SciTweets - A Dataset and Annotation Framework for Detecting
Scientific Online Discourse, CIKM2022
▪ Dimitrov, D., Baran, E., Fafalios, P., Yu, R., Zhu, X., Zloch, M., Dietze, S., TweetsCOV19 – A Knowledge Base of Semantically
Annotated Tweets about the COVID-19 Pandemic, CIKM2020
▪ A. Tchechmedjiev, P. Fafalios, K. Boland, S. Dietze, B. Zapilko, K. Todorov, ClaimsKG – A Live Knowledge Graph of fact-checked
Claims, ISWC2019
▪ Hövelmeyer, A., Boland, K., Dietze, S., SimBa at CheckThat! 2022: Lexical and Semantic Similarity Based Detection of Verified Claims
in an Unsupervised and Supervised Way, CLEF Working Notes 2022, CLEF 2022- Conference and Labs of the Evaluation Forum.
▪ A. Roy, A. Ekbal, S. Dietze, P. Fafalios, Exploiting stance hierarchies for cost-sensitive stance detection of Web documents, IJIS2022
42
@stefandietze
https://stefandietze.net
http://gesis.org/en/kts

More Related Content

Similar to AI in between online and offline discourse - and what has ChatGPT to do with all of that? (20)

PDF
Thesis Topics and Proposals @ Polimi Data Science Lab - 2023 - prof. Brambill...
Marco Brambilla
 
PDF
ISWC2023-McGuinnessTWC16x9FinalShort.pdf
Deborah McGuinness
 
PPTX
‘Big models’: the success and pitfalls of Transformer models in natural langu...
Leiden University
 
PDF
Research Knowledge Graphs at NFDI4DS & GESIS
Stefan Dietze
 
PDF
Credibility Ranking of Tweets during High Impact Events
IIIT Hyderabad
 
PDF
Analyzing Social media’s real data detection through Web content mining using...
IRJET Journal
 
PDF
Big Data Analytics course: Named Entities and Deep Learning for NLP
Christian Morbidoni
 
PDF
15. political discourseinthenewskb
ingeangevaare
 
PDF
On the ethical challenges in the use of AI/ML for research in science and eng...
Xi'an Jiaotong-Liverpool University
 
PDF
Can Social Media Analysis Improve Collective Awareness of Climate Change?
Diana Maynard
 
PDF
Wimmics Research Team 2015 Activity Report
Fabien Gandon
 
PDF
Real-time Tweet Analysis w/ Maltego Carbon 3.5.3
Shalin Hai-Jew
 
PDF
Research Knowledge Graphs at GESIS & NFDI4DataScience
Stefan Dietze
 
PDF
Natural Language Engineering in the Golden Age of Artificial Intelligence
Haithem Afli
 
PDF
Provenance in Data Science From Data Models to Context Aware Knowledge Graphs...
ammaroyantan
 
PDF
Towards the study of sentiment in the public opinion of science in Spanish
Technological Ecosystems for Enhancing Multiculturality
 
PDF
Identifying Relevant Temporal Expressions for Real-world Events
Nattiya Kanhabua
 
PPTX
2022 AAAI DSTC10 Invited Talk
Verena Rieser
 
PPT
Natural Language Processing & Semantic Models in an Imperfect World
Vital.AI
 
PPTX
Semantics of the Black-Box: Using knowledge-infused learning approach to make...
Amit Sheth
 
Thesis Topics and Proposals @ Polimi Data Science Lab - 2023 - prof. Brambill...
Marco Brambilla
 
ISWC2023-McGuinnessTWC16x9FinalShort.pdf
Deborah McGuinness
 
‘Big models’: the success and pitfalls of Transformer models in natural langu...
Leiden University
 
Research Knowledge Graphs at NFDI4DS & GESIS
Stefan Dietze
 
Credibility Ranking of Tweets during High Impact Events
IIIT Hyderabad
 
Analyzing Social media’s real data detection through Web content mining using...
IRJET Journal
 
Big Data Analytics course: Named Entities and Deep Learning for NLP
Christian Morbidoni
 
15. political discourseinthenewskb
ingeangevaare
 
On the ethical challenges in the use of AI/ML for research in science and eng...
Xi'an Jiaotong-Liverpool University
 
Can Social Media Analysis Improve Collective Awareness of Climate Change?
Diana Maynard
 
Wimmics Research Team 2015 Activity Report
Fabien Gandon
 
Real-time Tweet Analysis w/ Maltego Carbon 3.5.3
Shalin Hai-Jew
 
Research Knowledge Graphs at GESIS & NFDI4DataScience
Stefan Dietze
 
Natural Language Engineering in the Golden Age of Artificial Intelligence
Haithem Afli
 
Provenance in Data Science From Data Models to Context Aware Knowledge Graphs...
ammaroyantan
 
Towards the study of sentiment in the public opinion of science in Spanish
Technological Ecosystems for Enhancing Multiculturality
 
Identifying Relevant Temporal Expressions for Real-world Events
Nattiya Kanhabua
 
2022 AAAI DSTC10 Invited Talk
Verena Rieser
 
Natural Language Processing & Semantic Models in an Imperfect World
Vital.AI
 
Semantics of the Black-Box: Using knowledge-infused learning approach to make...
Amit Sheth
 

More from Stefan Dietze (20)

PDF
NEWORDER Project - Science in the online knowledge order
Stefan Dietze
 
PDF
An interdisciplinary journey with the SAL spaceship – results and challenges ...
Stefan Dietze
 
PDF
Human-in-the-loop: the Web as Foundation for interdisciplinary Data Science M...
Stefan Dietze
 
PDF
Human-in-the-Loop: das Web als Grundlage interdisziplinärer Data Science Meth...
Stefan Dietze
 
PDF
Towards research data knowledge graphs
Stefan Dietze
 
PDF
Beyond research data infrastructures: exploiting artificial & crowd intellige...
Stefan Dietze
 
PDF
Using AI to understand everyday learning on the Web
Stefan Dietze
 
PDF
Analysing User Knowledge, Competence and Learning during Online Activities
Stefan Dietze
 
PDF
Analysing & Improving Learning Resources Markup on the Web
Stefan Dietze
 
PDF
Beyond Linked Data - Exploiting Entity-Centric Knowledge on the Web
Stefan Dietze
 
PDF
Big Data in Learning Analytics - Analytics for Everyday Learning
Stefan Dietze
 
PDF
Retrieval, Crawling and Fusion of Entity-centric Data on the Web
Stefan Dietze
 
PDF
Mining and Understanding Activities and Resources on the Web
Stefan Dietze
 
PDF
Towards embedded Markup of Learning Resources on the Web
Stefan Dietze
 
PDF
Semantic Linking & Retrieval for Digital Libraries
Stefan Dietze
 
PDF
Linked Data for Architecture, Engineering and Construction (AEC)
Stefan Dietze
 
PDF
Dietze linked data-vr-es
Stefan Dietze
 
PDF
Open Education Challenge 2014: exploiting Linked Data in Educational Applicat...
Stefan Dietze
 
PDF
Turning Data into Knowledge (KESW2014 Keynote)
Stefan Dietze
 
PDF
From Data to Knowledge - Profiling & Interlinking Web Datasets
Stefan Dietze
 
NEWORDER Project - Science in the online knowledge order
Stefan Dietze
 
An interdisciplinary journey with the SAL spaceship – results and challenges ...
Stefan Dietze
 
Human-in-the-loop: the Web as Foundation for interdisciplinary Data Science M...
Stefan Dietze
 
Human-in-the-Loop: das Web als Grundlage interdisziplinärer Data Science Meth...
Stefan Dietze
 
Towards research data knowledge graphs
Stefan Dietze
 
Beyond research data infrastructures: exploiting artificial & crowd intellige...
Stefan Dietze
 
Using AI to understand everyday learning on the Web
Stefan Dietze
 
Analysing User Knowledge, Competence and Learning during Online Activities
Stefan Dietze
 
Analysing & Improving Learning Resources Markup on the Web
Stefan Dietze
 
Beyond Linked Data - Exploiting Entity-Centric Knowledge on the Web
Stefan Dietze
 
Big Data in Learning Analytics - Analytics for Everyday Learning
Stefan Dietze
 
Retrieval, Crawling and Fusion of Entity-centric Data on the Web
Stefan Dietze
 
Mining and Understanding Activities and Resources on the Web
Stefan Dietze
 
Towards embedded Markup of Learning Resources on the Web
Stefan Dietze
 
Semantic Linking & Retrieval for Digital Libraries
Stefan Dietze
 
Linked Data for Architecture, Engineering and Construction (AEC)
Stefan Dietze
 
Dietze linked data-vr-es
Stefan Dietze
 
Open Education Challenge 2014: exploiting Linked Data in Educational Applicat...
Stefan Dietze
 
Turning Data into Knowledge (KESW2014 Keynote)
Stefan Dietze
 
From Data to Knowledge - Profiling & Interlinking Web Datasets
Stefan Dietze
 
Ad

Recently uploaded (20)

PDF
MusicVideoProjectRubric Animation production music video.pdf
ALBERTIANCASUGA
 
PDF
Web Scraping with Google Gemini 2.0 .pdf
Tamanna
 
PDF
Building Production-Ready AI Agents with LangGraph.pdf
Tamanna
 
PDF
R Cookbook - Processing and Manipulating Geological spatial data with R.pdf
OtnielSimopiaref2
 
PPTX
TSM_08_0811111111111111111111111111111111111111111111111
csomonasteriomoscow
 
PPTX
DATA-COLLECTION METHODS, TYPES AND SOURCES
biggdaad011
 
PPTX
Introduction to Artificial Intelligence.pptx
StarToon1
 
PDF
T2_01 Apuntes La Materia.pdfxxxxxxxxxxxxxxxxxxxxxxxxxxxxxskksk
mathiasdasilvabarcia
 
PPT
01 presentation finyyyal معهد معايره.ppt
eltohamym057
 
PPTX
Advanced_NLP_with_Transformers_PPT_final 50.pptx
Shiwani Gupta
 
DOCX
AI/ML Applications in Financial domain projects
Rituparna De
 
PPTX
Module-5-Measures-of-Central-Tendency-Grouped-Data-1.pptx
lacsonjhoma0407
 
PPT
Data base management system Transactions.ppt
gandhamcharan2006
 
PPTX
Exploring Multilingual Embeddings for Italian Semantic Search: A Pretrained a...
Sease
 
PDF
List of all the AI prompt cheat codes.pdf
Avijit Kumar Roy
 
PPTX
Usage of Power BI for Pharmaceutical Data analysis.pptx
Anisha Herala
 
PDF
Context Engineering vs. Prompt Engineering, A Comprehensive Guide.pdf
Tamanna
 
PDF
WEF_Future_of_Global_Fintech_Second_Edition_2025.pdf
AproximacionAlFuturo
 
PDF
apidays Helsinki & North 2025 - How (not) to run a Graphql Stewardship Group,...
apidays
 
PPTX
apidays Munich 2025 - Building an AWS Serverless Application with Terraform, ...
apidays
 
MusicVideoProjectRubric Animation production music video.pdf
ALBERTIANCASUGA
 
Web Scraping with Google Gemini 2.0 .pdf
Tamanna
 
Building Production-Ready AI Agents with LangGraph.pdf
Tamanna
 
R Cookbook - Processing and Manipulating Geological spatial data with R.pdf
OtnielSimopiaref2
 
TSM_08_0811111111111111111111111111111111111111111111111
csomonasteriomoscow
 
DATA-COLLECTION METHODS, TYPES AND SOURCES
biggdaad011
 
Introduction to Artificial Intelligence.pptx
StarToon1
 
T2_01 Apuntes La Materia.pdfxxxxxxxxxxxxxxxxxxxxxxxxxxxxxskksk
mathiasdasilvabarcia
 
01 presentation finyyyal معهد معايره.ppt
eltohamym057
 
Advanced_NLP_with_Transformers_PPT_final 50.pptx
Shiwani Gupta
 
AI/ML Applications in Financial domain projects
Rituparna De
 
Module-5-Measures-of-Central-Tendency-Grouped-Data-1.pptx
lacsonjhoma0407
 
Data base management system Transactions.ppt
gandhamcharan2006
 
Exploring Multilingual Embeddings for Italian Semantic Search: A Pretrained a...
Sease
 
List of all the AI prompt cheat codes.pdf
Avijit Kumar Roy
 
Usage of Power BI for Pharmaceutical Data analysis.pptx
Anisha Herala
 
Context Engineering vs. Prompt Engineering, A Comprehensive Guide.pdf
Tamanna
 
WEF_Future_of_Global_Fintech_Second_Edition_2025.pdf
AproximacionAlFuturo
 
apidays Helsinki & North 2025 - How (not) to run a Graphql Stewardship Group,...
apidays
 
apidays Munich 2025 - Building an AWS Serverless Application with Terraform, ...
apidays
 
Ad

AI in between online and offline discourse - and what has ChatGPT to do with all of that?

  • 1. AI in between online and offline discourse - and what is the role of ChatGPT in all of that? Denkreise, Universität Bonn Stefan Dietze, 10.05.2023
  • 2. 2 Discourse Interactions Algorithms/AI Discourse, interactions, AI: complex interdependencies Society, Media, Politics & Policies (Offline & Online)
  • 3. 3 Discourse, interactions, AI: complex interdependencies AI-based methods to understand & influence online discourse & interactions (PART I) ▪ Recommendation & ranking in Web search (e.g. Google), shopping or social media ▪ Social media feeds & timelines ▪ Fact checking, detection of misinformation (e.g. news & social media) ▪ Classification of users (e.g. into political leanings, experts/lay people etc) ▪ Hate-speech detection ▪ Stance & sentiment detection ▪ … AI-based methods trained on & influencing data from discourse & interactions Discourse Interactions Algorithms/AI
  • 4. 4 Discourse, interactions, AI: complex interdependencies Contemporary and future AI challenges (PART II) ▪ Self-reinforcing cycle between AI & online/offline behaviour: AI influenced by / trained on data from discourse/interactions that are in turn influenced by AI and so on ▪ Biases (in models & behaviour) => echo chambers, filter bubbles etc ▪ Transparency, reproducibility, decay ▪ …. Discourse Interactions Algorithms/AI
  • 5. 5 AI for mining claims and stances in online discourse stance, claim trustworthiness? stance, claim trustworthiness?
  • 6. 6 Detecting stances towards claims/opinions in online discourse Motivation ▪ Problem: detecting stance of documents (e.g. Web pages, scientific publication) towards a given claim (unbalanced class distribution) ▪ Motivation: stance of documents (in particular disagreement) useful (a) as signal for truthfulness (fake news detection) and (b) document or source classification (PLDs, publishers) Approach ▪ Cascading binary classifiers: addressing individual issues (e.g. misclassification costs) per step ▪ Features, e.g. textual similarity (Word2Vec etc), sentiments, LIWC, etc. ▪ Best-performing models: 1) SVM with class-wise penalty, 2) CNN, 3) SVM with class-wise penalty ▪ Experiments on FNC-1 dataset (and FNC baselines) Results ▪ Minor overall performance improvement ▪ Improvement on disagree class by 27% (but still far from robust) A. Roy, A. Ekbal, S. Dietze, P. Fafalios, Exploiting stance hierarchies for cost-sensitive stance detection of Web documents, IJIS2022
  • 7. 7 ClaimsKG: a knowledge graph of fact-checked claims (to feed AI) Motivation ▪ Fact-checked claims spread across various (unstructured) fact-checking sites ▪ Example: finding claims about / made by US republican politicians across the Web? Approach ▪ Harvesting claims & metadata from fact- checking/news sites (e.g. snopes.com, Politifact.com etc); currently approx. 70.000 claims ▪ Information extraction & linking o Linking mentioned entities eg POTUS and Trump to „wikipedia:DonaldTrump“ o Normalisation of ratings (true, false, mixture, other); coreference resolution of claims o Exposing data through established vocabularies and W3C standards and APIs https://data.gesis.org/claimskg/ A. Tchechmedjiev, P. Fafalios, K. Boland, S. Dietze, B. Zapilko, K. Todorov, ClaimsKG – A Live Knowledge Graph of fact-checked Claims, ISWC2019
  • 8. 8 ClaimsKG as training corpus for AI models for fact-checking ▪ Ground truth data for AI-based methods for fact- checking ▪ Initial step for any fact-checker when „check-worthy claims“ appear: „Claim Retrieval“ i.e. identifying if an arbitrary statement (eg from a news article) has already been fact-checked (ClaimsKG as knowledge base) ▪ Shared AI task: „CLEF2022-CheckThat! Lab“ ▪ Unsupervised approach „SimBA“ achieving state-of- the-art performance (P@1 = 94%) Hövelmeyer, A., Boland, K., Dietze, S., SimBa at CheckThat! 2022: Lexical and Semantic Similarity Based Detection of Verified Claims in an Unsupervised and Supervised Way, CLEF Working Notes 2022, CLEF 2022- Conference and Labs of the Evaluation Forum.
  • 9. http://dbpedia.org/resource/Tim_Berners-Lee wna:positive-emotion onyx:Intensity "0.75" onyx:Intensity "0.0" http://dbpedia.org/resource/Solid wna:negative-emotion AI for mining opinions: the case of Twitter Challenges ▪ Heterogenity: multimodal, multilingual, informal, “noisy” language ▪ Context dependence: interpretation of tweets/posts (entities, sentiments) requires consideration of context (e.g. time, linked content), “Dusseldorf” => City or Football team ▪ Dynamics & scale: e.g. 6000 tweets per second, plus interactions (retweets etc) and context (e.g. 25% of tweets contain URLs) ▪ Evolution and temporal aspects: evolution of interactions over time crucial for many social sciences questions ▪ Representativity and bias: demographic distributions not known a priori in archived data collections
  • 10. TweetsKB – a large-scale research corpus of societal opinions ▪ Harvesting & archiving of 14 Billion tweets (permanent collection from Twitter 1% sample since 2013) ▪ Information extraction pipeline to build a KG of entities, interactions & sentiments (distributed Map/Reduce batch processing) o Entity linking with knowledge graph/DBpedia (“president”/“potus”/”trump” => dbp:DonaldTrump) o Sentiment analysis/annotation o Geotagging o Lifting into knowledge graph schema Dimitrov, D., Baran, E., Fafalios, P., Yu, R., Zhu, X., Zloch, M., Dietze, S., TweetsCOV19 – A Knowledge Base of Semantically Annotated Tweets about the COVID-19 Pandemic, CIKM2020 https://data.gesis.org/tweetskb
  • 11. https://data.gesis.org/tweetskb ▪ Harvesting & archiving of 14 Billion tweets (permanent collection from Twitter 1% sample since 2013) ▪ Information extraction pipeline to build a KG of entities, interactions & sentiments (distributed Map/Reduce batch processing) o Entity linking with knowledge graph/DBpedia (“president”/“potus”/”trump” => dbp:DonaldTrump) o Sentiment analysis/annotation o Geotagging o Lifting into knowledge graph schema ▪ Public, privacy-aware, large-scale research corpus of public opinions and their evolution => interdisciplinary research Dimitrov, D., Baran, E., Fafalios, P., Yu, R., Zhu, X., Zloch, M., Dietze, S., TweetsCOV19 – A Knowledge Base of Semantically Annotated Tweets about the COVID-19 Pandemic, CIKM2020 TweetsKB – a large-scale research corpus of societal opinions
  • 12. Social science research using TweetsKB https://dd4p.gesis.org Investigating Vaccine Hesitancy in DACH countries
  • 13. Germany suspends vaccinations with Astra Zeneca Twitter discourse zu “Impfbereitschaft” Social science research using TweetsKB Investigating Vaccine Hesitancy in DACH countries https://dd4p.gesis.org
  • 14. Social science research using TweetsKB Vaccine Hesitancy– key topics in “safety” category „Schwangerschaft“ „Kimmich“ „Alter“ „Nebenwirkungen“ „Herzinfarkt“ „Zulassung“ https://dd4p.gesis.org
  • 15. 17 ▪ AI4Sci project: understanding and classification of science discourse online (news, social Web) ▪ Percentage of tweets containing links to scientific articles (journals, publishers, science blogs etc) ▪ Uses list of > 30 K science web domains ▪ Data source: TweetsKB (https://data.gesis.org/tweetskb/), > 10 bn tweets archived since 2013 https://ai4sci-project.org/ How about scientific discourse on the Web? Example: Twitter
  • 16. How about scientific discourse on the Web? Example: Twitter 18 https://ai4sci-project.org/ Science claim Science reference Science relevance No science Science reference Hafid, S., Schellhammer, S., Bringay, S., Todorov, K., Dietze, S., "SciTweets - A Dataset and Annotation Framework for Detecting Scientific Online Discourse", CIKM2022
  • 17. SciTweets dataset & classifier 20 ▪ Ground truth dataset, heuristics-based sampling strategy and annotation framework for testing classification models ▪ 1261 expert-labeled tweets across all classes/labels ▪ Baseline classifiers based on SciBERT transformer model (fine-tuned/tested on SciTweets) ▪ Ongoing: analysis of large-scale science discourse and its evolution Hafid, S., Schellhammer, S., Bringay, S., Todorov, K., Dietze, S., SciTweets - A Dataset and Annotation Framework for Detecting Scientific Online Discourse, CIKM2022 https://ai4sci-project.org/
  • 18. 21 But: do we actually need dedicated NLP methods after ChatGPT? • ChatGPT does not outperform specialized models in most NLP tasks • ChatGPT’s impressiveness stems from its capacity to do many things reasonably well • ChatGPT does not mark the end of AI/NLP research Source: http://opensamizdat.com/posts/chatgpt_survey/
  • 19. 24 Discourse, interactions, AI: complex interdependencies Contemporary and future AI challenges (PART II) ▪ Self-reinforcing cycle between AI & online/offline behaviour: AI influenced by / trained on data from discourse/interactions that are in turn influenced by AI and so on ▪ Biases (in models & behaviour) => echo chambers, filter bubbles etc ▪ Transparency, reproducibility, decay ▪ …. Discourse Interactions Algorithms/AI
  • 20. 25 What‘s the role of ChatGPT et al in all this? • ChatGPT didn’t come out of nowhere but is a continuation of trends throughout the last decades • Mostly powered by progress in (deep) representation & transfer learning
  • 21. 26 What is driving the recent AI advances? Deep & representation learning for language understanding Percentage of deep learning papers in major NLP conferences (Source: Young et al., Recent Trends in Deep Learning Based Natural Language Processing) • Pretraining of embeddings: predicting low-dimensional vector representations of words & text, e.g. Word2Vec [Mikolov et al., 2013] • RNN/CNN architectures in encoder/decoder settings (e.g. for machine translation) [Vaswani et al., 2017] • Pretraining language models for task- specific transfer learning, e.g., BERT - Bidirectional Encoder Representations from Transformers [Devlin et al., 2018] T. Mikolov et al., Distributed Representations of Words and Phrases and their Compositionality, NIPS (2013) J. Devlin et al., BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (2018) A. Vaswani et al. Attention is all you need, NIPS (2017)
  • 22. 27 Transfer learning for language understanding Goal: Language Model (“reusable language representation”) (Paris:France, Tokyo:Japan) Approach: un-/self-supervised Machine Learning (e.g. „Masked Language Modeling“) Data: the more the merrier, e.g. Wikipedia, Google Book Corpus (200 Bn words), Common Crawl, Google crawl of indexed Web pages etc Data cheap, computation costly Goal: dedicated model for downstream task (e.g. dialog system, Tweet classification => Tweet X: RightWing) Approach: supervised machine learning, reinforcement learning Data: labelled data (e.g. expert annotated tweets, weak labels) Data costly, computation cheap Pretraining of foundational language models Finetuning/supervised ML („the old ways“)
  • 23. 28 Transfer learning for language understanding Goal: Language Model (“reusable language representation”) (Paris:France, Tokyo:Japan) Approach: un-/self-supervised Machine Learning (e.g. „Masked Language Modeling“) Data: the more the merrier, e.g. Wikipedia, Google Book Corpus (200 Bn words), Common Crawl, Google crawl of indexed Web pages etc Data cheap, computation costly Goal: dedicated model for downstream task (e.g. dialog system, Tweet classification => Tweet X: RightWing) Approach: supervised machine learning, reinforcement learning Data: labelled data (e.g. expert annotated tweets, weak labels) Data costly, computation cheap Pretraining of foundational language models Finetuning/supervised ML („the old ways“) • Language understanding capabilities of models established during pretraining stage • Paradigm shift in NLP / NLU due to transferable foundational models • Little/no finetuning required for specific down-stream tasks („Zero-Shot Learning“) • Requires no costly data annotation but can be fed with massive volumes of text • „The larger the better“ (GPT3 = 175 Bn parameters)
  • 24. 29 AI progress & societal impact/challenges ▪ Stopping the „arms race“? The bigger the better? ▪ Concentration of power: platform providers (gatekeeper to data), AI companies (gatekeeper to models, infrastructure) ▪ Environmental cost (Bender et al., On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?) ▪ Misinformation from generative models (e.g. deep fakes) ▪ But it‘s not only about AI: data, infrastructure, information has been controlled by large commercial platform providers for quite some time
  • 25. 30 Crucial AI challenges that we/AI research is trying to tackle ▪ Transparency of AI models ▪ Access to data, infrastructure, models ▪ Reproducibility & explainability ▪ Bias of data & AI models ▪ Decay of AI models ▪ …
  • 26. 31 AI transparency Source: https://motherboard.vice.com/en_us/article/j5npeg/why-is-google-translate-spitting-out-sinister-religious-prophecies/ • Neural machine translation as black box system => hard to pin down causes of „learned“ translations • Potential reasons: adversarial data fed into NN; NN learned to generate syntactically well formed sentences/statements even on nonsensical input
  • 27. 32 AI transparency: how to evaluate AI if it outperforms humans? • Saturation of NLP/image processing benchmarks over time. Initial performance at -1, human performance at 0. • Challenges for performance evaluation • Creating novel kinds of AI benchmarks • Note: humans still outperform AI in many (less specialized) tasks/respects, e.g. “common-sense reasoning” Kiela, D. et al., Dynabench: Rethinking Benchmarking in NLP, NAACL2021
  • 28. 33 AI decay: evolving language and discourse context ▪ LMs trained on large volumes of text ▪ Yet: strong vocabulary shift, over- /underrepresentation of topics/vocabulary in particular time periods (e.g. Twitter COVID19- discourse 2020 vs 2019) ▪ AI models for online discourse analysis require frequent training and updates (access to data/infrastructure?) ▪ Efficient methods for updating large language models Source: Hombaiah et al., “Dynamic Language Models for continuously evolving Content”, SIGKDD2021
  • 29. 34 AI reproducibility: „A worrying analysis of neural recommender approaches” Dacrema, M. F., Cremonesi, P., Jannach, D., 2019. Are we really making much progress? A worrying analysis of recent neural recommendation approaches. In ACM RecSys2019. DOI:https://doi.org/10.1145/3298689.3347058 Are DL-based methods reproducible? Do they actually beat simple baselines? • Reproducibility crisis (due to lack of transparency) • State-of-the-art crisis (driven by peer review crisis)
  • 30. AI biases: learned and elevated by AI/deep learning models Source: https://techcrunch.com/2016/03/24/microsoft-silences-its-new-a-i-bot-tay-after-twitter-users-teach-it-racism/ 35 [N-word] • Biases in human interactions can be learned and elevated by ML models => fairness and ethics of AI, data quality and representativeness • Crucial: research into understanding of AI models
  • 31. 36 Model understanding: do LLMs actually „understand“ language? ▪ Language model (e.g. GPT) prompting as a way to assess its capacity to comprehend language („intelligence“) ▪ How does prompt syntax influence „knowledge retrieval“? Linzbach, S., Tressel, T., Kallmeyer, L., Dietze, S., Jabeen, H., 2023. Decoding Prompt Syntax: Analysing its Impact on Knowledge Retrieval in Large Language Models. In Companion Proceedings of the ACM Web Conference 2023 The role of syntax in question-answering through LLMs
  • 32. 37 ▪ Experiments on BERT-based (Transformer) models ▪ Testing for sentence typology (simple, compound, etc) and active/passive/nominalised verb morphology Linzbach, S., Tressel, T., Kallmeyer, L., Dietze, S., Jabeen, H., 2023. Decoding Prompt Syntax: Analysing its Impact on Knowledge Retrieval in Large Language Models. In Companion Proceedings of the ACM Web Conference 2023 Model understanding: do LLMs actually „understand“ language? The role of syntax in question-answering through LLMs
  • 33. 38 Model understanding: how to measure actual „intelligence“ of AI? • AI benchmark datasets/tests, eg to probe their commonsense reasoning skills, e.g. AI2Reasoning (ARC) • Models learn to perform well on a task while struggling to actually understand • Example: ChatGPT produces reasonable-sounding answers that are actually wrong • Constructing benchmark datasets for evaluation of AI models remains a huge challenge Branco et al., Shortcutted Commonsense: Data Spuriousness in Deep Learning of Commonsense Reasoning, EMNLP 2021
  • 34. 39 To sum up: how to beat the cycle? ▪ AI research = creating understanding AI models, e.g. what and how do LLMs learn? (e.g. SFB Deep Linguistic Modeling) ▪ FAIR access to AI data & models (e.g. GESIS Methods Hub) ▪ Values-oriented ML: e.g. explainability > accuracy ▪ Realistic and fair benchmark datasets & setups for AI evaluation ▪ Maintainability: efficient ways on how to train/update foundational models (e.g. DD4P) ▪ Interdisiciplinary research into complex interdependencies between AI & online / offline discourse (e.g. NEWORDER) ▪ Governance & incentives: monopolies, ecological aspects and access to data/infrastructure are not only technical but societal & political issues Discourse Interactions Algorithms/AI Society, Media, Politics & Policies (Offline & Online)
  • 36. 41 References ▪ Branco et al., Shortcutted Commonsense: Data Spuriousness in Deep Learning of Commonsense Reasoning, EMNLP 2021 ▪ Linzbach, S., Tressel, T., Kallmeyer, L., Dietze, S., Jabeen, H., 2023. Decoding Prompt Syntax: Analysing its Impact on Knowledge Retrieval in Large Language Models. In Companion Proceedings of the ACM Web Conference 2023 ▪ Dacrema, M. F., Cremonesi, P., Jannach, D., 2019. Are we really making much progress? A worrying analysis of recent neural recommendation approaches. In ACM RecSys2019. DOI:https://doi.org/10.1145/3298689.3347058 ▪ Kiela, D. et al., Dynabench: Rethinking Benchmarking in NLP, NAACL2021 ▪ Hombaiah et al., “Dynamic Language Models for continuously evolving Content”, SIGKDD2021 ▪ Hafid, S., Schellhammer, S., Bringay, S., Todorov, K., Dietze, S., SciTweets - A Dataset and Annotation Framework for Detecting Scientific Online Discourse, CIKM2022 ▪ Dimitrov, D., Baran, E., Fafalios, P., Yu, R., Zhu, X., Zloch, M., Dietze, S., TweetsCOV19 – A Knowledge Base of Semantically Annotated Tweets about the COVID-19 Pandemic, CIKM2020 ▪ A. Tchechmedjiev, P. Fafalios, K. Boland, S. Dietze, B. Zapilko, K. Todorov, ClaimsKG – A Live Knowledge Graph of fact-checked Claims, ISWC2019 ▪ Hövelmeyer, A., Boland, K., Dietze, S., SimBa at CheckThat! 2022: Lexical and Semantic Similarity Based Detection of Verified Claims in an Unsupervised and Supervised Way, CLEF Working Notes 2022, CLEF 2022- Conference and Labs of the Evaluation Forum. ▪ A. Roy, A. Ekbal, S. Dietze, P. Fafalios, Exploiting stance hierarchies for cost-sensitive stance detection of Web documents, IJIS2022