Table 4 Percentage of stylometric features from the sets of the most important model features for each sub-domain

From: Improving medical experts’ efficiency of misinformation detection: an exploratory study

| Category | LIWC | NER | POS | DEP | Sent | Lexical |
|---|---|---|---|---|---|---|
| TF-IDF | | | | | | |
| statins | 5.3% | 0.0% | 0.5% | 0.5% | 0.5% | 93.2% |
| vaccines | 2.8% | 0.7% | 0.7% | 1.4% | 0.0% | 94.4% |
| psychiatry | 4.3% | 0.0% | 0.0% | 0.0% | 0.0% | 95.7% |
| allergy testing | 8.2% | 0.0% | 0.0% | 0.0% | 0.0% | 91.9% |
| antioxidants | 14.7% | 0.0% | 0.0% | 0.0% | 0.0% | 85.3% |
| steroids for kids | 12.3% | 0.0% | 0.0% | 0.0% | 0.0% | 87.7% |
| children antibiotics | 3.1% | 0.0% | 0.0% | 0.0% | 0.0% | 96.9% |
| diet and autism | 5.5% | 0.0% | 0.0% | 0.0% | 0.0% | 94.5% |
| heart supplements | 12.0% | 2.0% | 0.0% | 0.0% | 0.0% | 86.0% |
| cc vs. nb | 3.9% | 0.0% | 0.0% | 0.0% | 0.0% | 96.1% |
| BioBERT | | | | | | |
| statins | 12.1% | 3.2% | 3.2% | 4.2% | 0.5% | 76.8% |
| vaccines | 13.9% | 2.9% | 2.9% | 10.0% | 0.7% | 69.7% |
| psychiatry | 10.7% | 0.0% | 2.9% | 3.6% | 0.0% | 82.9% |
| allergy testing | 10.4% | 3.00% | 2.2% | 4.4% | 0.0% | 80.0% |
| antioxidants | 14.7% | 1.4% | 0.0% | 0.0% | 0.0% | 84.0% |
| steroids for kids | 21.5% | 0.0% | 2.8% | 8.3% | 0.0% | 67.4% |
| children antibiotics | 13.9% | 3.1% | 3.1% | 10.8% | 0.0% | 69.2% |
| diet and autism | 14.6% | 1.8% | 1.8% | 5.5% | 0.0% | 76.36% |
| heart supplements | 22.0% | 8.0% | 2.0% | 16.0% | 2.0% | 50.0% |
| cc vs. nb | 22.0% | 0.0% | 2.0% | 14.0% | 4.0% | 58.0% |

  1. LIWC: Linguistic Inquiry and Word Count; NER: named entity counts; POS: part-of-speech counts; DEP: dependency parsing element counts; Sent: either polarity or subjectivity of the text; Lexical: features that are not stylometric, retrieved either by TF-IDF transformation or from the BioBERT model
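
As an illustration of the arithmetic behind a row of this table, the following minimal sketch computes the share of each feature category within a set of most important features. It is not the authors' code: the function name `category_shares`, the set size of 20, and the category labels assigned to the example features are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Column order used in the table.
CATEGORIES = ["LIWC", "NER", "POS", "DEP", "Sent", "Lexical"]


def category_shares(feature_categories):
    """Return the percentage of features in each category, rounded to one decimal.

    `feature_categories` holds one category label per important feature.
    """
    total = len(feature_categories)
    if total == 0:
        raise ValueError("feature list is empty")
    counts = Counter(feature_categories)
    return {cat: round(100.0 * counts.get(cat, 0) / total, 1) for cat in CATEGORIES}


# Hypothetical set of 20 important features (made-up counts, not study data).
example = ["LIWC"] * 3 + ["POS"] * 1 + ["Lexical"] * 16
shares = category_shares(example)

for cat in CATEGORIES:
    print(f"{cat}: {shares[cat]}%")
```

Because each share is rounded independently, a row computed this way can sum to 99.9% or 100.1% rather than exactly 100%, which may explain rows in the table whose values do not add up to exactly 100%.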