Sentiment Classification
Unsupervised Sentiment Classification
• Unsupervised methods do not require labeled examples.
• Knowledge about the task is usually added by using lexical resources and
  hard-coded heuristics, e.g.:
  – Lexicons + patterns: VADER
  – Patterns + a simple language model: SO-PMI (a sketch follows this slide)
• Neural language models have been found to learn to recognize sentiment with
  no explicit knowledge about the task.
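As an illustration of the second item, here is a minimal sketch of Turney-style SO-PMI,
computed from entirely hypothetical co-occurrence counts of a phrase with positive and
negative seed words:

```python
import math

def so_pmi(phrase_hits, phrase_pos_hits, phrase_neg_hits,
           pos_hits, neg_hits, total_hits):
    """SO-PMI(phrase) = PMI(phrase, positive seed) - PMI(phrase, negative seed),
    with PMI estimated from raw co-occurrence counts."""
    # PMI(a, b) = log2( count(a, b) * N / (count(a) * count(b)) )
    pmi_pos = math.log2(phrase_pos_hits * total_hits / (phrase_hits * pos_hits))
    pmi_neg = math.log2(phrase_neg_hits * total_hits / (phrase_hits * neg_hits))
    return pmi_pos - pmi_neg

# Made-up counts: a positive result suggests a positive semantic orientation.
print(so_pmi(phrase_hits=1_000, phrase_pos_hits=120, phrase_neg_hits=15,
             pos_hits=500_000, neg_hits=400_000, total_hits=100_000_000))
```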
Supervised/unsupervised
• Supervised learning methods are the most commonly used ones, yet some
  unsupervised methods have also been used successfully.
• Unsupervised methods rely on the shared and recurrent characteristics of the
  sentiment dimension across topics to perform classification by means of
  hand-crafted heuristics and simple language models.
• Supervised methods rely on a training set of labeled examples that specify
  the correct classification label to be assigned to a number of documents.
  A learning algorithm then exploits the examples to model a general
  classification function.
VADER
• VADER (Valence Aware Dictionary for sEntiment Reasoning) uses a curated
  lexicon, derived from well-known sentiment lexicons, that assigns a
  positivity/negativity score to 7k+ words/emoticons.
• It also uses a number of hand-written pattern-matching rules (e.g., negation,
  intensifiers) to modify the contribution of the original word scores to the
  overall sentiment of the text.
• Hutto and Gilbert. VADER: A Parsimonious Rule-based Model for Sentiment
  Analysis of Social Media Text. ICWSM 2014.
• VADER is integrated into NLTK (see the example below).
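A minimal usage sketch of the VADER implementation bundled with NLTK (the example
sentences are only illustrative):

```python
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download('vader_lexicon')  # one-off download of the VADER lexicon

analyzer = SentimentIntensityAnalyzer()
# polarity_scores returns neg/neu/pos components and a normalized 'compound' score
print(analyzer.polarity_scores("The movie was GREAT!!!"))
print(analyzer.polarity_scores("Not going to the beach tomorrow :-("))
```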
The classification pipeline
The elements of a classification pipeline are:
1. Tokenization
2. Feature extraction
3. Feature selection
4. Weighting
5. Learning
• Steps 1 to 4 define the feature space and how text is converted into vectors.
• Step 5 creates the classification model (a scikit-learn sketch of such a
  pipeline follows).
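A minimal sketch of the five steps with scikit-learn; the toy corpus and the choice of
components (CountVectorizer, chi-squared selection, tf-idf weighting, linear SVM) are
only illustrative:

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.svm import LinearSVC

docs = ["I loved this movie", "Terrible, boring film", "Great acting", "Awful plot"]
labels = ["pos", "neg", "pos", "neg"]

pipeline = Pipeline([
    ("features", CountVectorizer()),         # 1. tokenization + 2. feature extraction
    ("selection", SelectKBest(chi2, k=8)),   # 3. feature selection
    ("weighting", TfidfTransformer()),       # 4. weighting
    ("learning", LinearSVC()),               # 5. learning
])

pipeline.fit(docs, labels)
print(pipeline.predict(["boring plot but great acting"]))
```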
Scikit-learn
• The scikit-learn library provides a rich set of data processing and machine
  learning algorithms.
• Most modules in scikit-learn implement a 'fit-transform' interface:
  – the fit method learns the parameters of the module from the input data
  – the transform method applies the transformation implemented by the module
    to the data
  – fit_transform does both actions in sequence, and is useful to connect
    modules in a pipeline (see the example below).
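A minimal sketch of the fit/transform interface (toy corpus for illustration):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

train_docs = ["good movie", "bad movie", "good acting"]
test_docs = ["bad acting"]

vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_docs)  # fit (learn vocabulary and idf) + transform
X_test = vectorizer.transform(test_docs)        # reuse the parameters learned on training data

print(X_train.shape, X_test.shape)  # both matrices live in the same feature space
```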
Deep Learning for Sentiment Analysis
Convolutional Neural Network
• A convolutional layer in a NN is composed of a set of filters.
• A filter combines a "local" selection of input values into an output value.
• All filters are "swept" across the whole input.
  – A filter with a window length of 5 is applied to all the sequences of 5
    words in a text.
  – 3 filters with a window of 5 applied to a text of 10 words produce 18
    output values. Why?
  – Filters have additional parameters that define their behavior at the
    start/end of documents (padding), the size of the sweep step (stride), and
    the possible presence of holes in the filter window (dilation).
• During training, each filter specializes in recognizing some kind of relevant
  combination of features.
• CNNs work well on stationary features, i.e., those independent of position
  (a small sketch of a 1-D convolution over word embeddings follows).
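A minimal sketch of a 1-D convolution over word embeddings (PyTorch is an assumption,
the slides do not prescribe a framework); it also answers the "Why?" above:

```python
import torch
import torch.nn as nn

batch, seq_len, emb_dim = 1, 10, 300      # a text of 10 words, embedding size 300
x = torch.randn(batch, emb_dim, seq_len)  # Conv1d expects (batch, channels, length)

conv = nn.Conv1d(in_channels=emb_dim, out_channels=3, kernel_size=5)  # 3 filters, window 5
out = conv(x)

print(out.shape)  # torch.Size([1, 3, 6]): 10 - 5 + 1 = 6 positions per filter, 3 * 6 = 18 values
```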
[Figure: CNN sentiment classifier applied to "Not going to the beach tomorrow :-(" —
embeddings for each word, a convolutional layer with multiple filters, max-over-time
pooling, and a multilayer perceptron with dropout producing the +/− prediction]
CNN for Sentiment Classification
1. Embeddings layer, R^d (d = 300)
2. Convolutional layer with ReLU activation
   – multiple filters with sliding windows of various sizes h:
     c_i = f(F ⊙ S_{i:i+h−1} + b)
     where ⊙ is the Frobenius matrix product between the filter F and the
     window S_{i:i+h−1} of the sentence matrix S
3. Max-pooling layer
4. Dropout layer
5. Linear layer with tanh activation
6. Softmax layer
(a sketch of this stack follows)
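A minimal sketch of the layer stack described above (PyTorch assumed; hyper-parameters
such as the number of filters and window sizes are illustrative, not from the slides):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentimentCNN(nn.Module):
    def __init__(self, vocab_size, num_classes, d=300, windows=(3, 4, 5), n_filters=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d)                        # 1. embeddings
        self.convs = nn.ModuleList(
            [nn.Conv1d(d, n_filters, h) for h in windows])              # 2. convolutions
        self.dropout = nn.Dropout(0.5)                                  # 4. dropout
        self.linear = nn.Linear(n_filters * len(windows), num_classes)  # 5. linear layer

    def forward(self, tokens):                      # tokens: (batch, seq_len)
        x = self.embed(tokens).transpose(1, 2)      # (batch, d, seq_len)
        feats = [F.relu(conv(x)).max(dim=2).values  # 2. ReLU + 3. max pooling over time
                 for conv in self.convs]
        h = self.dropout(torch.cat(feats, dim=1))
        logits = torch.tanh(self.linear(h))         # 5. tanh on the linear layer
        return F.softmax(logits, dim=1)             # 6. softmax

model = SentimentCNN(vocab_size=10_000, num_classes=3)
print(model(torch.randint(0, 10_000, (2, 20))).shape)  # (2, 3): class probabilities
```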
Sentiment Specific Word Embeddings
• Sentiment Specific Word Embeddings (SSWE) use an annotated corpus with
  polarities (e.g. tweets).
• SS word embeddings achieve SotA accuracy on tweet sentiment classification.
[Figure: network scoring the word window "the cat sits on", trained with a
combined LM likelihood + polarity objective]
Learning
• Generic (ranking) loss function:
  L_CW(x, x^c) = max(0, 1 − f(x) + f(x^c))
• SS loss function:
  L_SS(x, x^c) = max(0, 1 − δ_s(x) f(x)_1 + δ_s(x) f(x^c)_1)
  where x^c is a corrupted version of the window x, δ_s(x) is +1 for positive
  and −1 for negative examples, and f(·)_1 is the sentiment output of the
  network (both losses are sketched numerically below).
• Gradients
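A minimal numeric sketch of the two hinge losses above (function names and values are
illustrative; f_x and f_xc stand for the network scores of the original and corrupted
windows):

```python
def loss_cw(f_x, f_xc):
    """Generic ranking loss: the true window should score higher than the corrupted one."""
    return max(0.0, 1.0 - f_x + f_xc)

def loss_ss(f_x_sent, f_xc_sent, delta_s):
    """Sentiment-specific loss: delta_s is +1 for positive and -1 for negative examples."""
    return max(0.0, 1.0 - delta_s * f_x_sent + delta_s * f_xc_sent)

# The combined training objective is typically a weighted sum of the two losses.
print(loss_cw(0.9, 0.2), loss_ss(0.7, -0.1, delta_s=+1))
```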
SemEval 2015 Sentiment on Tweets

Team                   Phrase-Level Polarity   Tweet
Attardi (unofficial)                           67.28
Moschitti              84.79                   64.59
KLUEless               84.51                   61.20
IOA                    82.76                   62.62
WarwickDCS             82.46                   57.62
Webis                                          64.84
SwissCheese at SemEval 2016
• Three-phase procedure:
  1. Creation of word embeddings for the initialization of the first layer,
     using word2vec on an unlabelled corpus of 200M tweets.
  2. Distant-supervision phase, where the network weights and word embeddings
     are trained to capture aspects related to sentiment. Emoticons are used to
     infer the polarity of a balanced set of 90M tweets (a labeling sketch
     follows this slide).
  3. Supervised phase, where the network is trained on the provided supervised
     training data.
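A minimal sketch of emoticon-based distant labeling (the emoticon sets and the example
tweet are illustrative; they are not the ones used by SwissCheese):

```python
POSITIVE = {":)", ":-)", ":D", ";)"}
NEGATIVE = {":(", ":-(", ":'("}

def distant_label(tweet):
    """Assign a noisy polarity label from emoticons, or None to discard the tweet."""
    tokens = tweet.split()
    if any(t in POSITIVE for t in tokens):
        return "positive"
    if any(t in NEGATIVE for t in tokens):
        return "negative"
    return None

print(distant_label("Not going to the beach tomorrow :-("))  # negative
```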
Ensemble of Classifiers
• Ensemble obtained by combining the outputs of two 2-layer CNNs with similar
  architectures but differing in the choice of certain parameters (such as the
  number of convolutional filters).
• The networks were also initialized with different word embeddings and used
  slightly different training data for the distant-supervision phase.
• A total of 7 outputs were combined (one common combination rule is sketched
  below).
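One common way to combine several classifiers is to average their predicted class
probabilities; a minimal sketch follows (this may differ from the exact combination rule
used by SwissCheese, and the probabilities are randomly generated placeholders):

```python
import numpy as np

# 7 classifiers, 4 example tweets, 3 classes (negative, neutral, positive)
rng = np.random.default_rng(0)
outputs = rng.dirichlet(alpha=[1.0, 1.0, 1.0], size=(7, 4))  # shape (7, 4, 3)

avg = outputs.mean(axis=0)        # average the 7 probability distributions per tweet
predictions = avg.argmax(axis=1)  # predicted class index per tweet
print(predictions)
```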
Results

Team                      2013 Tweet  2013 SMS    2014 Tweet  Sarcasm     LiveJournal 2015 Tweet  2016 Avg F1 2016 Acc
SwissCheese combination   70.05       63.72       71.62       56.61       69.57       67.11       63.31       64.61
SwissCheese single        67.00                   69.12       62.00       71.32       61.01       57.19
UniPI                     59.2 (18)   58.5 (11)   62.7 (18)   38.1 (25)   65.4 (12)   58.6 (19)   57.1 (18)   63.9 (3)
UniPI SWE                 64.2        60.6        68.4        48.1        66.8        63.5        59.2        65.2
Breakdown over all test sets

SwissCheese   Prec.   Rec.    F1
positive      67.48   74.14   70.66
negative      53.26   67.86   59.68
neutral       71.47   59.51   64.94
Avg F1                        65.17
Accuracy                      64.62

UniPI 3       Prec.   Rec.    F1
positive      70.88   65.35   68.00
negative      50.29   58.93   54.27
neutral       68.02   68.12   68.07
Avg F1                        61.14
Accuracy                      65.64
Sentiment Classification from a Single Neuron
• A char-level LSTM with 4096 units has been trained on 82 million Amazon
  reviews.
• The model is trained only to predict the next character in the text.
• After training, one of the units had a very high correlation with sentiment,
  resulting in state-of-the-art accuracy when used as a classifier (sketched
  below).
• The model can also be used to generate text: by setting the value of the
  sentiment unit, one can control the sentiment of the resulting text.
Radford et al. Learning to Generate Reviews and Discovering Sentiment.
arXiv:1704.01444 (see also the accompanying blog post).
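A minimal sketch of the idea of classifying from a single unit: fit a classifier on the
activation of one hidden unit only. All names, the unit index, and the synthetic data
below are placeholders; in the real setting the features are the final hidden states of
the trained char-level model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_reviews, n_units, sentiment_unit = 200, 4096, 1234   # the unit index is a placeholder

# Placeholder for the final LSTM hidden state of each review.
hidden_states = rng.normal(size=(n_reviews, n_units))
# Synthetic labels correlated with the chosen unit, only to make the sketch runnable.
labels = (hidden_states[:, sentiment_unit] + 0.3 * rng.normal(size=n_reviews) > 0).astype(int)

# A single-feature classifier: only the "sentiment unit" activation is used.
clf = LogisticRegression().fit(hidden_states[:, [sentiment_unit]], labels)
print(clf.score(hidden_states[:, [sentiment_unit]], labels))
```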
