Venkat Java Projects
Mobile:+91 9966499110
Visit:www.venkatjavaprojects.com Email:venkatjavaprojects@gmail.com
A Credibility Analysis System for Assessing Information on Twitter
Twitter's popularity keeps growing because it disseminates (spreads) messages
quickly, and assessing the credibility of those messages (whether a message is
fake or genuine) has become an active research topic. Researchers have already
proposed several techniques for deciding whether a given tweet is fake or
genuine, but those techniques do not use all of the available information.
On online social networks, malicious users sometimes spread fake news, and
once such news is published on social media it can harm society. These
malicious users hire people or use bots to create fake accounts and then
publish fake messages from them. Such accounts usually have weak profile
features: few favourites, followers, following, hashtags, retweets and so on.
By analysing these features we can determine whether a tweet is credible
(genuine) or non-credible (fake).
The proposed paper uses four components to decide whether a tweet is
credible (genuine) or non-credible (fake). The four components used to
compute user reputation and tweet credibility are described below.
1) Reputation Based Component: This component reads the number of
followers, following, retweets, hashtags and favourites from the tweets
dataset to calculate a user's reputation. For example, if a tweet is
genuine, more users show interest in its topic, so its retweet and
favourite counts rise, and the followers/following counts rise as
well. If a tweet is not genuine, very few users follow it and the
reputation score is low. Formula to calculate reputation:
Math.log(favourites) / Math.log(Math.max(U_followers, Total_retweet_for_topic))
A score is calculated for hashtags in the same way, and the sum of the
favourites and hashtag scores gives the reputation score. A user with few
followers or retweets gets a low score.
If the calculated value is less than 0.1 we consider the user's tweet
Non-Credible (fake), and if it is greater than 0.1 we consider it Credible
(genuine).
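The reputation calculation above can be sketched in Java as follows. This is a minimal sketch, not the project's actual code: the class and method names are hypothetical, and two behaviours are assumptions because the text does not cover them. A term is treated as 0 when its logarithm ratio is undefined (a count below 1, or a denominator base below 2), and a score of exactly 0.1 is treated as non-credible.

```java
final class ReputationSketch {
    // Cut-off stated in the text: above 0.1 is credible, below is non-credible.
    static final double THRESHOLD = 0.1;

    // log(count) / log(max(followers, topicRetweets)).
    // Assumption: return 0 when the ratio is undefined (log of 0, or log(1) = 0 below).
    static double ratio(long count, long followers, long topicRetweets) {
        long base = Math.max(followers, topicRetweets);
        if (count < 1 || base < 2) {
            return 0.0;
        }
        return Math.log(count) / Math.log(base);
    }

    // Reputation = favourites term + hashtag term, as described in the text.
    static double reputation(long favourites, long hashtags,
                             long followers, long topicRetweets) {
        return ratio(favourites, followers, topicRetweets)
             + ratio(hashtags, followers, topicRetweets);
    }

    // Assumption: a score of exactly 0.1 counts as non-credible.
    static boolean isCredible(double score) {
        return score > THRESHOLD;
    }

    public static void main(String[] args) {
        // Illustrative numbers only, not taken from the dataset.
        double score = reputation(100, 10, 10_000, 500);
        System.out.println("score=" + score + " credible=" + isCredible(score));
    }
}
```

With the illustrative inputs in `main`, the favourites term is log(100)/log(10000) = 0.5 and the hashtag term is log(10)/log(10000) = 0.25, so the score is about 0.75 and the tweet is labelled credible.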
2) Classifier Engine: Using data mining algorithms such as Random Forest
and Naïve Bayes, or the proposed FeatureRank_NB, we can classify/predict
whether a given tweet is genuine or fake. First, a trained model is
generated from all existing data with one of these algorithms; the
trained model is then applied to a new test tweet to determine whether
it is genuine or fake. The proposed FeatureRank_NB algorithm applies a
ranking algorithm to all features (favourites, followers, following,
hashtags etc.) to find the features most relevant for predicting
whether tweets are genuine or fake. The ranking algorithm gives a high
score to attributes that occur more often; those attributes are
considered important and are assigned more weight, or a higher rank.
3) User Experience Component: This module applies a sentiment analysis
algorithm to determine whether a given tweet contains more positive or
more negative words. The assumption is that an experienced, genuine
user uses more positive words and a fake user uses more negative
words. Sentiment detection therefore gives an estimate of user
experience and helps identify whether the user's tweet is fake or
genuine. To calculate sentiment we use the Stanford Natural Language
Processing APIs, which classify a given tweet as Positive, Negative
or Neutral.
4) Feature Ranking Algorithm: This module is applied to all tweet
features to select only the important, higher-ranked attributes.
All algorithms given in the paper work on the four components described above.
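The ranking idea (attributes that occur more often get a higher rank) can be illustrated with a small Java sketch. It is an assumption about what "occurs more often" means, not the paper's FeatureRank_NB algorithm: here a feature's score is the number of profiles in which it has a non-zero value. The class name, method name and feature keys are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FeatureRankSketch {
    // Scores each feature by the number of profiles where its value is non-zero,
    // then returns feature names ordered from highest to lowest score.
    static List<String> rank(List<Map<String, Integer>> profiles) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map<String, Integer> profile : profiles) {
            for (Map.Entry<String, Integer> e : profile.entrySet()) {
                if (e.getValue() != null && e.getValue() > 0) {
                    counts.merge(e.getKey(), 1, Integer::sum);
                } else {
                    // Keep the feature in the ranking even if it never occurs.
                    counts.putIfAbsent(e.getKey(), 0);
                }
            }
        }
        List<String> ranked = new ArrayList<>(counts.keySet());
        // Stable sort, descending by count.
        ranked.sort((a, b) -> Integer.compare(counts.get(b), counts.get(a)));
        return ranked;
    }

    public static void main(String[] args) {
        // Illustrative profiles only, not taken from the dataset.
        List<Map<String, Integer>> profiles = List.of(
            Map.of("followers", 10, "hashtag", 0, "favourites", 3),
            Map.of("followers", 5, "hashtag", 2, "favourites", 0),
            Map.of("followers", 1, "hashtag", 0, "favourites", 4));
        System.out.println(rank(profiles));
    }
}
```

A classifier such as Naïve Bayes could then be trained on only the top-ranked features, or weight them more heavily; how FeatureRank_NB does this is not specified in the text.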
Implementation
To implement the paper's concept we use a tweets dataset from Twitter.
Screenshots
To run the project, double-click the ‘run.bat’ file to get the screen below.
In the above screen, click the ‘Upload Tweets Dataset’ button to upload the
tweets folder. Each file in the tweets folder contains the tweets and profile
of one user. After uploading the tweets folder, we get the screen below.
In the above screen we can see that all features were extracted from every
user profile. The ‘Tweet Content’ column contains the text of each tweet. Now
click the ‘Calculate User Reputation’ button to calculate user reputation from
the extracted features.
In the above screen, for all tweets from each user, we calculated the
reputation score and whether the tweets contain positive or negative messages.
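The positive/negative labelling can be illustrated with a simplified word-count sketch in Java. This is a stand-in, not the Stanford NLP API the project uses: it only counts words from two tiny hand-written lexicons, and the word lists, class name and method name are all hypothetical.

```java
import java.util.Locale;
import java.util.Set;

final class SentimentSketch {
    // Tiny illustrative lexicons (assumption); not from the project or Stanford NLP.
    static final Set<String> POSITIVE =
        Set.of("good", "great", "happy", "love", "excellent");
    static final Set<String> NEGATIVE =
        Set.of("bad", "fake", "hate", "terrible", "worst");

    // Labels a tweet by comparing its counts of positive and negative words.
    static String classify(String tweet) {
        int score = 0;
        // Split on anything that is not a lowercase letter; empty tokens match nothing.
        for (String token : tweet.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (POSITIVE.contains(token)) {
                score++;
            } else if (NEGATIVE.contains(token)) {
                score--;
            }
        }
        if (score > 0) {
            return "Positive";
        }
        if (score < 0) {
            return "Negative";
        }
        return "Neutral";
    }

    public static void main(String[] args) {
        System.out.println(classify("Great game, love it!"));
        System.out.println(classify("This is the worst, terrible service"));
        System.out.println(classify("Meeting at 5pm"));
    }
}
```

A trained sentiment model handles negation and context that a word count misses, which is one reason to prefer a library over a lexicon like this one.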
Now click the ‘Extract Features’ button to extract features and calculate a
credibility score from those tweet features.
In the above screen, for each user tweet, we calculated credibility based on
the sentiment and reputation scores. If the score is less than 0.60% the tweet
is considered non-credible, and if it is greater than 0.60% it is considered
credible. From these features we now train the Random Forest algorithm to
build a classification model and count the correctly classified instances
(records). Click the ‘Run Random Forest Algorithm’ button to build the trained
model.
In the selected text in the above screen we can see that a total of 36 tweet
users were found, of which Random Forest correctly classified 27. Now classify
the same data with the proposed ‘FeatureRank NB’ classifier: click the ‘Run
Feature Rank NB’ button to classify with the proposed feature ranking technique.
In the above screen we can see that the proposed feature ranking technique
correctly classified 28 records, one more than the existing Random Forest
technique. Now click ‘Correctly Classified Instances Comparison Graph’ to
compare the correctly classified instances of the existing and proposed
techniques.
In the above graph, the x-axis represents the algorithm name and the y-axis
represents the correctly classified count. The existing Random Forest has fewer
correctly classified instances than the proposed Feature Ranking NB algorithm.
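The counts reported above (27 and 28 correct out of 36) can be turned into accuracy percentages with a few lines of Java. The class and method names are hypothetical; only the three counts come from the text.

```java
final class AccuracySketch {
    // Percentage of correctly classified instances.
    static double accuracyPercent(int correct, int total) {
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        return 100.0 * correct / total;
    }

    public static void main(String[] args) {
        int total = 36; // tweet users found, as reported in the text
        System.out.println("Random Forest:  " + accuracyPercent(27, total));
        System.out.println("FeatureRank NB: " + accuracyPercent(28, total));
    }
}
```

This gives 75% for Random Forest (27/36) and about 77.78% for FeatureRank NB (28/36).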
