International Journal of Trend in Scientific Research and Development (IJTSRD)
Volume 5 Issue 4, May-June 2021 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470
@ IJTSRD | Unique Paper ID – IJTSRD42372 | Volume – 5 | Issue – 4 | May-June 2021 Page 720
Amazon Product Review Sentiment
Analysis with Machine Learning
Ravi Kumar Singh¹, Dr. Kamalraj Ramalingam²
¹Student, ²Associate Professor,
¹,²Department of Master of Computer Applications, School of CS,
Jain Deemed to be University, Bangalore, Karnataka, India
ABSTRACT
Users of Amazon's online shopping service are allowed to leave feedback for
the items they buy. Amazon makes no effort to monitor or limit the scope of
these reviews. Although the number of reviews for various items varies, the
reviews provide easily accessible and abundant data for a variety of
applications. This paper aims to apply and expand existing natural language
processing and sentiment analysis research to data obtained from Amazon.
The number of stars given to a product by a user is used as training data for
supervised machine learning. Since more people are dependent on online
products these days, the value of a review is increasing. Before making a
purchase, a buyer must read thousands of reviews to fully comprehend a
product. In this day and age of machine learning, however, sorting through
thousands of comments and learning from them would be much easier if a
model were used to polarize and learn from them. We used supervised learning
to polarize a massive Amazon dataset and achieved satisfactory accuracy.
KEYWORDS: Sentiment analysis, machine learning, Amazon customer reviews,
Logistic Regression Classifier, Decision Tree Classifier, SVM
How to cite this paper: Ravi Kumar Singh | Dr. Kamalraj Ramalingam,
"Amazon Product Review Sentiment Analysis with Machine Learning",
published in International Journal of Trend in Scientific Research and
Development (IJTSRD), ISSN: 2456-6470, Volume-5 | Issue-4, June 2021,
pp. 720-723, URL: www.ijtsrd.com/papers/ijtsrd42372.pdf
Copyright © 2021 by author(s) and International Journal of Trend in
Scientific Research and Development Journal. This is an Open Access
article distributed under the terms of the Creative Commons Attribution
License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)
INTRODUCTION
As online marketplaces have grown in popularity over the
years, online retailers and vendors have encouraged their
customers to share their thoughts on the items they've
purchased. Thousands of reviews are written every day on
the Internet about a wide range of products, programmes,
and locations. As a result, the Internet has surpassed all
other sources for collecting information and opinions on a
product or service.
The Internet has revolutionized the way we purchase
products. In the retail e-commerce environment of an online
marketplace, testing a product before buying it is not feasible.
Furthermore, in today's retail environment, a large
number of new products are introduced on a regular basis.
As a result, consumers rely heavily on product feedback
to shape their opinions before the more complex
cognitive process of making a purchase. Users, on
the other hand, often find searching through and comparing text
reviews challenging. We therefore want a better
numerical rating system that is backed up by the review text, so
that consumers can easily make a buying decision.
Customers may need a rating tool at some point
during their decision-making process in order to locate
useful feedback as quickly as possible. As a result, models
that can predict a user's rating from the text of a
review are critical. Capturing the overall sentiment of a
textual review can improve customer service. It
can also help businesses increase sales and develop their
products by gaining a better understanding of what their
customers want.
The Amazon electronic product review dataset was taken
into account. The reviews and ratings given by
customers to various products, as well as reviews about
the customer's product(s), were also taken into account.
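The abstract states that star ratings serve as the training labels. A minimal sketch of that labeling step is shown below; the thresholds (4-5 stars positive, 3 neutral, 1-2 negative) and the sample reviews are assumptions made for illustration, since the paper does not state its exact mapping.

```python
def rating_to_label(stars: int) -> str:
    """Map a 1-5 star rating to a sentiment label.

    The cut-off points are an assumption, not taken from the paper.
    """
    if not 1 <= stars <= 5:
        raise ValueError(f"star rating must be between 1 and 5, got {stars}")
    if stars >= 4:
        return "positive"
    if stars == 3:
        return "neutral"
    return "negative"


# Hypothetical (review text, star rating) pairs, for illustration only.
reviews = [
    ("Great sound quality and battery life", 5),
    ("It works, nothing special", 3),
    ("Stopped charging after a week", 1),
]

# Pair each review text with the label derived from its rating.
labeled = [(text, rating_to_label(stars)) for text, stars in reviews]
```

Moving the thresholds changes the class balance, so they would need to be reported alongside any accuracy figure.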
LITERATURE SURVEY
Sentiment analysis has received a lot of attention in recent
years thanks to the abundance of online reviews. As a result,
numerous studies have been conducted in this area. Some of
the research works most relevant to this thesis are discussed
in this section.
SVM was tested for text classification by Joachims (1998),
who found that it performed well in all experiments with
lower error levels than other classification methods.
Using SVM, Naive Bayes, and maximum
entropy classification, Pang, Lee, and Vaithyanathan (2002)
attempted supervised learning for classifying movie reviews
into two groups, positive and negative. In terms of precision,
all three methods performed well. In this analysis, they
experimented with different features and discovered that
the machine learning algorithms performed better
when a bag of words was used as the feature set in the classifiers.
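To make the bag-of-words feature concrete, here is a small sketch using only the Python standard library. The tokenizer (lowercase letters and apostrophes) and the two sample sentences are assumptions for illustration, not the setup used in the cited study.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count how often each lowercase word occurs, ignoring word order."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


def to_vector(bag: Counter, vocabulary: list) -> list:
    """Turn a bag into a fixed-length count vector over a shared vocabulary."""
    return [bag[word] for word in vocabulary]


# Hypothetical documents, for illustration only.
docs = ["The movie was good, really good", "The plot was bad"]
bags = [bag_of_words(d) for d in docs]

# The vocabulary is the sorted set of all words seen in any document.
vocabulary = sorted(set().union(*bags))
vectors = [to_vector(b, vocabulary) for b in bags]
```

Each review becomes one row of counts, which is the kind of input a classifier such as SVM or Naive Bayes can consume.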
Three supervised machine learning algorithms, Naive Bayes,
SVM, and an N-gram model, were tested on online feedback
about various travel destinations around the world in a
study conducted by Ye et al. (2009). They found
that well-trained machine learning algorithms
classify travel destination reviews
with high accuracy. They also showed
that the SVM and N-gram models outperformed the Naive
Bayes model. However, increasing the size of the training
data set decreased the gap between the algorithms
significantly.
Chaovalit and Zhou (2005) compared a supervised machine
learning algorithm with an unsupervised approach to movie
review classification called semantic orientation, and found that the
supervised approach was more efficient than the
unsupervised one.
Naive Bayes and SVM are two of the most widely used
methods for sentiment classification problems, according to
several studies (Joachims 1998; Pang et al. 2002; Ye et al.
2009). As a result, this study attempts to apply supervised
machine learning algorithms such as Naive Bayes and SVM to
Amazon's beauty product reviews.
PROPOSED SYSTEM
The method entails gathering product-based datasets from
various e-commerce sites such as amazon.com, epinion.com,
and others. The feedback concerns items such as
phones, iPods, and other electronic devices. The aim of this
project is to use algorithms like random forest, decision tree,
and SVM to evaluate and forecast product reviews by
classifying them as positive, negative, or neutral. Since the
input consists of unstructured product reviews, we conduct
pre-processing, extract the features on which comments are
made, measure the polarity of the feedback, and plot a graph of the
results.
Dealing with negation is also covered in the results. For
instance, "the Nokia phone is not bad" is a positive review
despite the negative word "not." The flow diagram of the approach
is shown below, and its steps are explained in detail
in the following subsections.
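One common way to handle negation during pre-processing is to tag every word that follows a negation word, up to the next punctuation mark, so that "bad" and "not bad" become different features. The sketch below illustrates that general technique; the paper does not say which negation method it used, and the `NEGATIONS` word list and `NOT_` prefix are assumptions.

```python
import re

# Assumed list of negation cues; a real system would likely use a longer one.
NEGATIONS = {"not", "no", "never", "cannot"}
PUNCTUATION = set(".,!?;")


def mark_negation(text: str) -> list:
    """Lowercase and tokenize text, prefixing negated words with 'NOT_'.

    Negation starts at a negation cue and ends at the next punctuation mark.
    Punctuation tokens themselves are dropped from the output.
    """
    tokens = re.findall(r"[a-z']+|[.,!?;]", text.lower())
    result = []
    negated = False
    for token in tokens:
        if token in PUNCTUATION:
            negated = False  # a clause boundary ends the negation scope
            continue
        if token in NEGATIONS or token.endswith("n't"):
            negated = True
            result.append(token)
            continue
        result.append("NOT_" + token if negated else token)
    return result


tokens = mark_negation("The Nokia phone is not bad.")
```

With this tagging, a classifier can learn that the feature `NOT_bad` tends to appear in positive reviews even though `bad` alone is negative.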
Sentiment Classification Algorithm:
Sentiment analysis, also known as opinion mining, is a
problem in natural language processing (NLP) that entails
recognizing and extracting subjective knowledge from text
sources. The aim of sentiment classification is to interpret
user feedback and categorize it as positive or negative,
without requiring the system to fully comprehend the
semantics of each phrase or text.
Sentiment analysis is becoming a powerful method for
monitoring and analyzing consumer sentiment as people
share their thoughts and feelings more freely than ever
before. Brands can learn what makes consumers happy or
sad by automatically analyzing consumer reviews such as
survey responses and social media interactions. This allows
them to tailor goods and services to their customers' specific
requirements.
Sentiment classification has been applied to different areas,
such as movie reviews, travel destination reviews, and
product reviews.
Random forest Classifier (RFC)
Random Forest is an ensemble method that builds a classifier
by combining multiple decision
trees. Single-tree classifiers, such as decision tree
classifiers, can run into issues with outliers or noisy data,
which can affect the performance of the classifier,
while Random Forest introduces
randomness and is therefore highly resistant to noise and
outliers. This classifier uses two different forms of
randomness: randomness in the data and randomness in the features.
Because it combines multiple decision trees, this classifier has a
number of hyperparameters, such as:
• The number of trees to build in the forest.
• The maximum number of features that can be
selected at random.
• The maximum depth of each tree.
Since it uses the concepts of bootstrapping and bagging,
Random Forest is thought to be a reliable and accurate
classifier.
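The bootstrapping and bagging steps mentioned above can be sketched as follows. This is not a full random forest: the tree-fitting step is left as a comment, and the values for the number of trees, the feature count, and the random seed are assumptions chosen only for illustration.

```python
import random
from collections import Counter


def bootstrap_sample(rows: list, rng: random.Random) -> list:
    """Draw len(rows) rows with replacement (randomness in the data)."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(n_features: int, max_features: int,
                          rng: random.Random) -> list:
    """Pick max_features distinct feature indices (randomness in the features)."""
    return sorted(rng.sample(range(n_features), max_features))


def majority_vote(predictions: list) -> str:
    """Bagging for classification: return the most frequent prediction."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)   # fixed seed so the sketch is repeatable
rows = list(range(10))   # stand-ins for ten training reviews
n_trees = 5              # hyperparameter: number of trees

plans = []
for _ in range(n_trees):
    sample = bootstrap_sample(rows, rng)
    features = random_feature_subset(n_features=8, max_features=3, rng=rng)
    # A real forest would fit one depth-limited decision tree on
    # (sample, features) here; this sketch only records the plan.
    plans.append((sample, features))

# Combining the (hypothetical) votes of three trees for one review.
final_label = majority_vote(["positive", "negative", "positive"])
```

Because each tree sees a different sample and a different subset of features, a single noisy review or outlier can mislead only some of the trees, which is the intuition behind the robustness claim.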
Support vector machine (SVM)
Support vector machines (SVMs) are a type of supervised
learning method that can be used to solve sentiment
classification problems (Cristianini & Shawe-Taylor 2000).
This approach places labeled training data in a feature
space, then uses an algorithm to find an optimal
hyperplane that divides the data into groups or classes. As
shown in Figure 1, the best hyperplane is the one that
separates the groups by the largest margin. This is done by
choosing the hyperplane that is furthest from the
nearest data point of each class (Berk 2016). “The groups are not
separated in H1. H2 has a slight advantage, but only by a
small margin. H3 divides them by the greatest possible
margin.” Weinberg, Zack (2012).
Fig1: Support Vector Machine
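The idea of preferring the hyperplane with the largest margin can be checked numerically. The sketch below computes the geometric margin of three candidate hyperplanes on a tiny two-dimensional data set; the points, the labels, and the three candidates (named H1 to H3 to mirror the figure caption) are all made up for illustration, and no SVM training is performed.

```python
import math


def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any point to the hyperplane w.x + b = 0.

    Labels are +1 or -1. A negative result means at least one point lies
    on the wrong side, i.e. the hyperplane does not separate the classes.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )


# Hypothetical toy data: two positive and two negative points.
points = [(2.0, 0.0), (3.0, 1.0), (0.0, 0.0), (-1.0, 1.0)]
labels = [1, 1, -1, -1]

# Candidate hyperplanes as (w, b) pairs, for illustration only.
candidates = {
    "H1": ((0.0, 1.0), -0.5),  # the line y = 0.5, which does not separate
    "H2": ((1.0, 0.0), -0.5),  # the line x = 0.5, which separates narrowly
    "H3": ((1.0, 0.0), -1.0),  # the line x = 1.0, which separates widely
}

margins = {
    name: geometric_margin(w, b, points, labels)
    for name, (w, b) in candidates.items()
}
best = max(margins, key=margins.get)
```

An SVM solver searches over all hyperplanes rather than a fixed list of candidates, but the quantity it maximizes is this same margin.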
Logistic Regression Classifier (LRC)
Logistic regression predicts the likelihood of an outcome
that has only two possible values (i.e. a dichotomy). The
prediction is made from one or more predictors, which may be
numerical or categorical. Linear
regression is unsuitable for predicting the value of a binary
variable for two reasons:
• A linear regression would predict values outside the
acceptable range (e.g. probabilities outside the
range 0 to 1).
• Since a dichotomous experiment can only have one of two
possible values, the residuals would not be normally
distributed around the predicted line.
A logistic regression, on the other hand, yields a logistic
curve with values ranging from 0 to 1. In logistic regression,
the curve is constructed using the natural logarithm of the
target variable's "odds" rather than the probability itself.
Furthermore, the predictors do not have to be normally
distributed or have equal variance in each group.
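The relationship between the logistic curve and the log-odds can be written out directly. In the sketch below, the weights, bias, and feature values are hypothetical numbers chosen so that the linear score falls outside the range 0 to 1 while the logistic output stays inside it.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(p: float) -> float:
    """Natural logarithm of the odds p / (1 - p); the inverse of sigmoid."""
    return math.log(p / (1.0 - p))


def linear_score(weights, bias, features) -> float:
    """Weighted sum of the predictors, as in linear regression."""
    return bias + sum(w * x for w, x in zip(weights, features))


# Hypothetical model and input, for illustration only.
weights = [1.5, -2.0]
bias = 0.5
features = [2.0, 0.25]   # e.g. counts of positive and negative words

score = linear_score(weights, bias, features)   # 3.0: not a valid probability
probability = sigmoid(score)                    # about 0.95: always in (0, 1)
```

Applying `log_odds` to `probability` recovers `score`, which is why logistic regression is described as a linear model on the log-odds scale.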
Decision Tree Classifier (DTC)
A decision tree is a hierarchical tree structure in which
decision nodes represent attributes and edges represent attribute values.
This tree-like representation makes it possible to create
decision rules for classifying new data instances.
A decision tree is a decision-making tool that uses a tree-
like model of decisions and their possible outcomes, such as
chance event outcomes, resource costs, and utility. It is one
way of displaying an algorithm that consists entirely of
conditional control statements.
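Because a decision tree is equivalent to nested conditional statements, a small one can be written by hand. The tree below is a toy: its attributes (`positive_words`, `negative_words`, `has_negation`) and split rules are hypothetical and were not learned from the paper's data.

```python
def classify_review(features: dict) -> str:
    """Walk a hand-written decision tree and return a sentiment label.

    Each 'if' is a decision node testing one attribute; each branch is an
    edge for one attribute value; each 'return' is a leaf.
    """
    if features["negative_words"] > features["positive_words"]:
        if features["has_negation"]:
            return "positive"   # e.g. "not bad"
        return "negative"
    if features["positive_words"] > features["negative_words"]:
        return "positive"
    return "neutral"


# Hypothetical feature values for the review "the Nokia phone is not bad".
example = {"positive_words": 0, "negative_words": 1, "has_negation": True}
label = classify_review(example)
```

A learned tree has the same shape, but the attributes tested and the thresholds are chosen automatically from the training data.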
Result and Discussion
The predictive accuracy of each model is calculated after
training and testing on the dataset to decide which model is the
best classifier for classifying feedback. As the table shows,
the SVM model has the best predictive accuracy of the four
models, whereas the Decision Tree model has the worst
predictive accuracy.
Model Name                        Accuracy
Logistic Regression Classifier    93.92%
Support Vector Machine            93.94%
Random Forest Classifier          93.50%
Decision Tree Classifier          90.10%
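Accuracy here means the share of test reviews whose predicted label matches the true label. A minimal sketch of that calculation follows; the label lists are made-up examples and do not reproduce the paper's results.

```python
def accuracy(y_true: list, y_pred: list) -> float:
    """Fraction of predictions that equal the corresponding true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical test labels and model predictions, for illustration only.
y_true = ["positive", "negative", "neutral", "positive"]
y_pred = ["positive", "negative", "positive", "positive"]

score = accuracy(y_true, y_pred)          # three of four correct
print(f"Accuracy: {score:.2%}")
```

When the classes are imbalanced, as is common with star ratings, accuracy alone can be misleading, so per-class precision and recall are often reported alongside it.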
• After testing a few arbitrary reviews, our
features appear to work properly, producing Positive, Neutral,
and Negative outcomes.
• We can also see that our Support Vector Machine
classifier improved to 94.08 percent
accuracy after running the grid search.
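Grid search tries every combination of candidate hyperparameter values and keeps the best-scoring one. The sketch below shows the procedure in plain Python; the parameter names `C` and `kernel`, their candidate values, and the `toy_score` function are assumptions standing in for a real cross-validated SVM, since the paper does not list the grid it searched.

```python
from itertools import product


def grid_search(param_grid: dict, score_fn):
    """Evaluate score_fn on every parameter combination and return the best."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params: dict) -> float:
    """Stand-in for 'train the model and measure validation accuracy'.

    This made-up function simply peaks at C=10 with the 'rbf' kernel.
    """
    penalty = 0.05 if params["kernel"] == "linear" else 0.0
    return 1.0 - abs(params["C"] - 10) / 100 - penalty


# Hypothetical search grid, for illustration only.
grid = {"C": [0.1, 1, 10, 100], "kernel": ["linear", "rbf"]}
best_params, best_score = grid_search(grid, toy_score)
```

The cost grows with the product of the list lengths (eight evaluations here), which is why grids are usually kept small.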
Conclusion and Future Work
Sentiment analysis is the process of recognizing and
aggregating user sentiment or opinions. More specifically, it is the
method of deciding whether the polarity of text in a document or
sentence is positive, negative, or neutral.
Four approaches have
been compared, and the accuracy of
each approach has been calculated on the product review dataset. The accuracy of
Logistic Regression is found to be 93.92 %, SVM is found to
be 93.94 %, Decision Tree is found to be 90.10 %, and
Random Forest is found to be 93.50 %. Among the four
models, the SVM model has the highest predictive accuracy.
We also observed that very large text files take a long time to
process. Automatic sentiment analysis is a powerful tool
for detecting and forecasting current and future patterns.
While opinions at the feature level have been extracted, there
are still many limitations that can be explored further.
Possible directions for future development are:
• Handling product reviews in a variety of languages.
• Addressing the issue of slang mapping.
• Dealing with sarcastically expressed views.
• Identifying comparative views and determining which of
the two products under consideration is the best.
• Dealing with anaphora resolution, that is, determining what the
opinion is really about.
In the future, the work could be expanded to conduct
multiclass classification of reviews, which would give
consumers a clearer picture of the review's essence, allowing
them to make better product decisions. It can also be used to
predict a product's rating based on the review. This would
provide consumers with a trustworthy rating, because the
product's rating and the sentiment of the review will often
contradict each other. The proposed extension of this work would be
extremely beneficial to the e-commerce industry by
increasing customer loyalty and confidence.
ACKNOWLEDGEMENT:
I acknowledge the support and encouragement of everyone
who helped me throughout the completion of this
project.
I would like to thank Dr. Dinesh Nilkhant, Director -
JGI, Knowledge Campus, Bangalore, Karnataka, for providing
the facilities to carry out this research work. His leadership and
management skills are a continuous source of inspiration.
I also wish to thank Dr. M. N Nachappa, Dean,
School of Computer Science & IT, Jain Deemed to be
University, Knowledge Campus, Bangalore, Karnataka, for his
support and cordial cooperation.
I would like to thank our MCA program
coordinator, Dr. Bhuvana J, Mentor and Associate Professor,
Department of Master of Computer Application, for
providing the support and guidance to carry out this research
work. Her timely direction and motivation helped me stay
patient throughout this journey.
Finally, I would like to express my sincere
gratitude to the project coordinators, Dr. Lakshmi
JVN and Dr. Gangotri, Assistant Professor, Department of
Master of Computer Application, for sharing their experience,
which helped me complete my thesis in the best possible
way. In addition, they also helped in critically reviewing and
proofreading my work and my project thesis.
References
[1] S. Brownfield and J. Zhou, "Sentiment Analysis of
Amazon Product Reviews," in Proceedings of the
Computational Methods in Systems and Software,
Springer, 2020, pp. 739--750.
[2] T. Haque, N. Saber and F. Shah, "Sentimentanalysison
large scale Amazon product reviews," in 2018 IEEE
international conference on innovative research and
development (ICIRD), IEEE, 2018, pp. 1--6.
[3] R. Jagdale, V. Shirsat and S. Deshmukh, "Sentiment
analysis on product reviews using machine learning
techniques," in Cognitive Informatics and Soft
Computing, Springer, 2019, pp. 639--647.
[4] N. Nandal, R. Tanwar and J. Pruthi, "Machine learning
based aspect level sentiment analysis for Amazon
products," Spatial Information Research, vol. 28, pp.
601--607, 2020.
International Journal of Trend in Scientific Research and Development (IJTSRD) @ www.ijtsrd.com eISSN: 2456-6470
@ IJTSRD | Unique Paper ID – IJTSRD42372 | Volume – 5 | Issue – 4 | May-June 2021 Page 723
[5] A. Rathor, A. Agarwal and P. Dimri, "Comparative
study of machine learning approaches for amazon
reviews," Procedia computer science, vol. 132, pp.
1552--1561, 2018.
[6] A. Ravi, A. Khettry and S. Sethumadhavachar,
"Amazon Reviews as Corpus for Sentiment Analysis
Using Machine Learning," in International Conference
on Advances in ComputingandDataSciences,Springer,
2019, pp. 403--411.
[7] J. Sing, G. Singh and R. Singh, "Optimization of
sentiment analysis using machine learning
classifiers," Human-centric Computing and
information Sciences, vol. 7, pp. 1--12, 2017.
[8] Z. Singla, S. Randhawa and S. Jain, "Sentimentanalysis
of customer productreviews usingmachinelearning,"
in 2017 international conference on intelligent
computing and control (I2C2), IEEE, 2017, pp. 1--5.
[9] K. Srujan, S. Nikhil, H. Rao and K. Karthik,
"Classification of amazon book reviews based on
sentiment analysis," in Information Systems Design
and Intelligent Applications, Springer, 2018, pp. 401--
411.
[10] W. Tan, X. Wang and X. Xu, "Sentiment analysis for
Amazon reviews," in International Conference, 2018,
pp. 1--5.
[11] S. Wassan, X. Chen, T. Shen and M. Waqar, "Amazon
Product Sentiment Analysis using Machine Learning
Techniques," Revista Argentina de Cl{'i}nica
Psicol{'o}gica, vol. 30, p. 695, 2021.
[12] S. Dey, S. Wasif, D. Tonmoy and S. Sultana, "A
Comparative Study of Support Vector Machine and
Naive Bayes Classifier for Sentiment Analysis on
Amazon Product Reviews," in 2020 International
Conference on Contemporary Computing and
Applications (IC3A), IEEE, 2020, pp. 217--220.

More Related Content

What's hot (20)

PPTX
Sentiment Analysis
Aditya Nag
 
PPTX
Text MIning
Prakhyath Rai
 
PDF
Sentiment Analysis of Twitter Data
Sumit Raj
 
PPT
Machine learning
Sanjay krishne
 
PDF
Sentiment Analysis
Dinesh V
 
PPTX
Sentiment analysis using ml
Pravin Katiyar
 
PPTX
Machine learning
Saurabh Agrawal
 
PPTX
New sentiment analysis of tweets using python by Ravi kumar
Ravi Kumar
 
PDF
Text classification & sentiment analysis
M. Atif Qureshi
 
PDF
Sentiment Analysis
Data Science Society
 
PPTX
Sentiment Analysis on Twitter
SmritiAgarwal26
 
PDF
Machine learning
Dr Geetha Mohan
 
PPTX
Sentiment analysis
Makrand Patil
 
PDF
Twitter sentimentanalysis report
Savio Aberneithie
 
PPTX
Project prSentiment Analysis of Twitter Data Using Machine Learning Approach...
Geetika Gautam
 
DOCX
Tweet sentiment analysis
Anil Shrestha
 
PPTX
Sentiment Analysis using Twitter Data
Hari Prasad
 
PDF
Machine Learning in Banking Sector
Knoldus Inc.
 
PPTX
Credit card fraud detection
vineeta vineeta
 
PPTX
Sentiment analysis
Seher Can
 
Sentiment Analysis
Aditya Nag
 
Text MIning
Prakhyath Rai
 
Sentiment Analysis of Twitter Data
Sumit Raj
 
Machine learning
Sanjay krishne
 
Sentiment Analysis
Dinesh V
 
Sentiment analysis using ml
Pravin Katiyar
 
Machine learning
Saurabh Agrawal
 
New sentiment analysis of tweets using python by Ravi kumar
Ravi Kumar
 
Text classification & sentiment analysis
M. Atif Qureshi
 
Sentiment Analysis
Data Science Society
 
Sentiment Analysis on Twitter
SmritiAgarwal26
 
Machine learning
Dr Geetha Mohan
 
Sentiment analysis
Makrand Patil
 
Twitter sentimentanalysis report
Savio Aberneithie
 
Project prSentiment Analysis of Twitter Data Using Machine Learning Approach...
Geetika Gautam
 
Tweet sentiment analysis
Anil Shrestha
 
Sentiment Analysis using Twitter Data
Hari Prasad
 
Machine Learning in Banking Sector
Knoldus Inc.
 
Credit card fraud detection
vineeta vineeta
 
Sentiment analysis
Seher Can
 

Similar to Amazon Product Review Sentiment Analysis with Machine Learning (20)

PDF
Sentiment Analysis on Product Reviews Using Supervised Learning Techniques
IRJET Journal
 
PPTX
Business Analytics Final Capstone Project Presenation PPT.pptx
Kavitha860274
 
PDF
A Novel Jewellery Recommendation System using Machine Learning and Natural La...
IRJET Journal
 
PDF
IRJET - Online Product Scoring based on Sentiment based Review Analysis
IRJET Journal
 
PDF
IRJET- Sentimental Analysis for Online Reviews using Machine Learning Algorithms
IRJET Journal
 
PDF
IRJET- Physical Design of Approximate Multiplier for Area and Power Efficiency
IRJET Journal
 
PDF
A Novel Hybrid Classification Approach for Sentiment Analysis of Text Document
IJECEIAES
 
DOCX
Customer_Analysis.docx
KevalKabariya
 
PDF
IRJET- Sentimental Analysis of Product Reviews for E-Commerce Websites
IRJET Journal
 
PDF
Sentiment Analysis Using Hybrid Approach: A Survey
IJERA Editor
 
PDF
IRJET - Support Vector Machine versus Naive Bayes Classifier:A Juxtaposition ...
IRJET Journal
 
PDF
Sentimental Analysis and Opinion Mining on Online Customer Review
IRJET Journal
 
PDF
K1802056469
IOSR Journals
 
PDF
IRJET- Sentiment Analysis of Customer Reviews on Laptop Products for Flip...
IRJET Journal
 
PDF
IRJET- Analysis of Brand Value Prediction based on Social Media Data
IRJET Journal
 
PPTX
1.pptx
NRakesh8
 
PDF
OPINION MINING AND ANALYSIS: A SURVEY
ijnlc
 
PDF
IRJET- Comparative Study of Classification Algorithms for Sentiment Analy...
IRJET Journal
 
PDF
A Survey on Evaluating Sentiments by Using Artificial Neural Network
IRJET Journal
 
Sentiment Analysis on Product Reviews Using Supervised Learning Techniques
IRJET Journal
 
Business Analytics Final Capstone Project Presenation PPT.pptx
Kavitha860274
 
A Novel Jewellery Recommendation System using Machine Learning and Natural La...
IRJET Journal
 
IRJET - Online Product Scoring based on Sentiment based Review Analysis
IRJET Journal
 
IRJET- Sentimental Analysis for Online Reviews using Machine Learning Algorithms
IRJET Journal
 
IRJET- Physical Design of Approximate Multiplier for Area and Power Efficiency
IRJET Journal
 
A Novel Hybrid Classification Approach for Sentiment Analysis of Text Document
IJECEIAES
 
Customer_Analysis.docx
KevalKabariya
 
IRJET- Sentimental Analysis of Product Reviews for E-Commerce Websites
IRJET Journal
 
Sentiment Analysis Using Hybrid Approach: A Survey
IJERA Editor
 
IRJET - Support Vector Machine versus Naive Bayes Classifier:A Juxtaposition ...
IRJET Journal
 
Sentimental Analysis and Opinion Mining on Online Customer Review
IRJET Journal
 
K1802056469
IOSR Journals
 
IRJET- Sentiment Analysis of Customer Reviews on Laptop Products for Flip...
IRJET Journal
 
IRJET- Analysis of Brand Value Prediction based on Social Media Data
IRJET Journal
 
1.pptx
NRakesh8
 
OPINION MINING AND ANALYSIS: A SURVEY
ijnlc
 
IRJET- Comparative Study of Classification Algorithms for Sentiment Analy...
IRJET Journal
 
A Survey on Evaluating Sentiments by Using Artificial Neural Network
IRJET Journal
 
Ad

More from ijtsrd (20)

PDF
A Study of School Dropout in Rural Districts of Darjeeling and Its Causes
ijtsrd
 
PDF
Pre extension Demonstration and Evaluation of Soybean Technologies in Fedis D...
ijtsrd
 
PDF
Pre extension Demonstration and Evaluation of Potato Technologies in Selected...
ijtsrd
 
PDF
Pre extension Demonstration and Evaluation of Animal Drawn Potato Digger in S...
ijtsrd
 
PDF
Pre extension Demonstration and Evaluation of Drought Tolerant and Early Matu...
ijtsrd
 
PDF
Pre extension Demonstration and Evaluation of Double Cropping Practice Legume...
ijtsrd
 
PDF
Pre extension Demonstration and Evaluation of Common Bean Technology in Low L...
ijtsrd
 
PDF
Enhancing Image Quality in Compression and Fading Channels A Wavelet Based Ap...
ijtsrd
 
PDF
Manpower Training and Employee Performance in Mellienium Ltdawka, Anambra State
ijtsrd
 
PDF
A Statistical Analysis on the Growth Rate of Selected Sectors of Nigerian Eco...
ijtsrd
 
PDF
Automatic Accident Detection and Emergency Alert System using IoT
ijtsrd
 
PDF
Corporate Social Responsibility Dimensions and Corporate Image of Selected Up...
ijtsrd
 
PDF
The Role of Media in Tribal Health and Educational Progress of Odisha
ijtsrd
 
PDF
Advancements and Future Trends in Advanced Quantum Algorithms A Prompt Scienc...
ijtsrd
 
PDF
A Study on Seismic Analysis of High Rise Building with Mass Irregularities, T...
ijtsrd
 
PDF
Descriptive Study to Assess the Knowledge of B.Sc. Interns Regarding Biomedic...
ijtsrd
 
PDF
Performance of Grid Connected Solar PV Power Plant at Clear Sky Day
ijtsrd
 
PDF
Vitiligo Treated Homoeopathically A Case Report
ijtsrd
 
PDF
Vitiligo Treated Homoeopathically A Case Report
ijtsrd
 
PDF
Uterine Fibroids Homoeopathic Perspectives
ijtsrd
 
A Study of School Dropout in Rural Districts of Darjeeling and Its Causes
ijtsrd
 
Pre extension Demonstration and Evaluation of Soybean Technologies in Fedis D...
ijtsrd
 
Pre extension Demonstration and Evaluation of Potato Technologies in Selected...
ijtsrd
 
Pre extension Demonstration and Evaluation of Animal Drawn Potato Digger in S...
ijtsrd
 
Pre extension Demonstration and Evaluation of Drought Tolerant and Early Matu...
ijtsrd
 
Pre extension Demonstration and Evaluation of Double Cropping Practice Legume...
ijtsrd
 
Pre extension Demonstration and Evaluation of Common Bean Technology in Low L...
ijtsrd
 
Enhancing Image Quality in Compression and Fading Channels A Wavelet Based Ap...
ijtsrd
 
Manpower Training and Employee Performance in Mellienium Ltdawka, Anambra State
ijtsrd
 
A Statistical Analysis on the Growth Rate of Selected Sectors of Nigerian Eco...
ijtsrd
 
Automatic Accident Detection and Emergency Alert System using IoT
ijtsrd
 
Corporate Social Responsibility Dimensions and Corporate Image of Selected Up...
ijtsrd
 
The Role of Media in Tribal Health and Educational Progress of Odisha
ijtsrd
 
Advancements and Future Trends in Advanced Quantum Algorithms A Prompt Scienc...
ijtsrd
 
A Study on Seismic Analysis of High Rise Building with Mass Irregularities, T...
ijtsrd
 
Descriptive Study to Assess the Knowledge of B.Sc. Interns Regarding Biomedic...
ijtsrd
 
Performance of Grid Connected Solar PV Power Plant at Clear Sky Day
ijtsrd
 
Vitiligo Treated Homoeopathically A Case Report
ijtsrd
 
Vitiligo Treated Homoeopathically A Case Report
ijtsrd
 
Uterine Fibroids Homoeopathic Perspectives
ijtsrd
 
Ad

Recently uploaded (20)

PPTX
A PPT on Alfred Lord Tennyson's Ulysses.
Beena E S
 
PDF
The-Ever-Evolving-World-of-Science (1).pdf/7TH CLASS CURIOSITY /1ST CHAPTER/B...
Sandeep Swamy
 
PDF
Biological Bilingual Glossary Hindi and English Medium
World of Wisdom
 
PDF
Stokey: A Jewish Village by Rachel Kolsky
History of Stoke Newington
 
PDF
Exploring the Different Types of Experimental Research
Thelma Villaflores
 
PPTX
grade 5 lesson matatag ENGLISH 5_Q1_PPT_WEEK4.pptx
SireQuinn
 
PDF
ARAL_Orientation_Day-2-Sessions_ARAL-Readung ARAL-Mathematics ARAL-Sciencev2.pdf
JoelVilloso1
 
PPTX
ASRB NET 2023 PREVIOUS YEAR QUESTION PAPER GENETICS AND PLANT BREEDING BY SAT...
Krashi Coaching
 
PPTX
How to Set Up Tags in Odoo 18 - Odoo Slides
Celine George
 
PDF
ARAL-Orientation_Morning-Session_Day-11.pdf
JoelVilloso1
 
PPTX
How to Create a PDF Report in Odoo 18 - Odoo Slides
Celine George
 
PPT
Talk on Critical Theory, Part II, Philosophy of Social Sciences
Soraj Hongladarom
 
PPTX
Unit 2 COMMERCIAL BANKING, Corporate banking.pptx
AnubalaSuresh1
 
PPTX
Neurodivergent Friendly Schools - Slides from training session
Pooky Knightsmith
 
PDF
Lesson 2 - WATER,pH, BUFFERS, AND ACID-BASE.pdf
marvinnbustamante1
 
PPTX
Universal immunization Programme (UIP).pptx
Vishal Chanalia
 
PDF
LAW OF CONTRACT (5 YEAR LLB & UNITARY LLB )- MODULE - 1.& 2 - LEARN THROUGH P...
APARNA T SHAIL KUMAR
 
PDF
Reconstruct, Restore, Reimagine: New Perspectives on Stoke Newington’s Histor...
History of Stoke Newington
 
PPTX
How to Set Maximum Difference Odoo 18 POS
Celine George
 
PPTX
How to Handle Salesperson Commision in Odoo 18 Sales
Celine George
 
A PPT on Alfred Lord Tennyson's Ulysses.
Beena E S
 
The-Ever-Evolving-World-of-Science (1).pdf/7TH CLASS CURIOSITY /1ST CHAPTER/B...
Sandeep Swamy
 
Biological Bilingual Glossary Hindi and English Medium
World of Wisdom
 
Stokey: A Jewish Village by Rachel Kolsky
History of Stoke Newington
 
Exploring the Different Types of Experimental Research
Thelma Villaflores
 
grade 5 lesson matatag ENGLISH 5_Q1_PPT_WEEK4.pptx
SireQuinn
 
ARAL_Orientation_Day-2-Sessions_ARAL-Readung ARAL-Mathematics ARAL-Sciencev2.pdf
JoelVilloso1
 
ASRB NET 2023 PREVIOUS YEAR QUESTION PAPER GENETICS AND PLANT BREEDING BY SAT...
Krashi Coaching
 
How to Set Up Tags in Odoo 18 - Odoo Slides
Celine George
 
ARAL-Orientation_Morning-Session_Day-11.pdf
JoelVilloso1
 
How to Create a PDF Report in Odoo 18 - Odoo Slides
Celine George
 
Talk on Critical Theory, Part II, Philosophy of Social Sciences
Soraj Hongladarom
 
Unit 2 COMMERCIAL BANKING, Corporate banking.pptx
AnubalaSuresh1
 
Neurodivergent Friendly Schools - Slides from training session
Pooky Knightsmith
 
Lesson 2 - WATER,pH, BUFFERS, AND ACID-BASE.pdf
marvinnbustamante1
 
Universal immunization Programme (UIP).pptx
Vishal Chanalia
 
LAW OF CONTRACT (5 YEAR LLB & UNITARY LLB )- MODULE - 1.& 2 - LEARN THROUGH P...
APARNA T SHAIL KUMAR
 
Reconstruct, Restore, Reimagine: New Perspectives on Stoke Newington’s Histor...
History of Stoke Newington
 
How to Set Maximum Difference Odoo 18 POS
Celine George
 
How to Handle Salesperson Commision in Odoo 18 Sales
Celine George
 

Amazon Product Review Sentiment Analysis with Machine Learning

  • 1. International Journal of Trend in Scientific Research and Development (IJTSRD) Volume 5 Issue 4, May-June 2021 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470 @ IJTSRD | Unique Paper ID – IJTSRD42372 | Volume – 5 | Issue – 4 | May-June 2021 Page 720 Amazon Product Review Sentiment Analysis with Machine Learning Ravi Kumar Singh1, Dr. Kamalraj Ramalingam2 1Student,2Associate Professor, 1,2Department of Master of Computer Applications, School of CS, Jain Deemed to be University, Bangalore, Karnataka, India ABSTRACT Users of Amazon's online shopping service are allowed to leave feedback for the items they buy. Amazon makes no effort to monitor or limit the scope of these reviews. Although the amount of reviews for various items varies, the reviews provide easily accessible and abundant data for a variety of applications. This paper aims to apply and expand existing natural language processing and sentiment analysis research to data obtained from Amazon. The number of stars given to a product by a user is used as training data for supervised machine learning. Since more people are dependent on online products these days, the value of a review is increasing. Before making a purchase, a buyer must read thousands of reviews to fully comprehend a product. In this day and age of machine learning, however, sorting through thousands of comments and learning from them would be much easier if a model was used to polarize and learn from them.Weused supervisedlearning to polarize a massive Amazon dataset and achieve satisfactory accuracy. KEYWORDS: Sentiment analysis, machine learning, Amazon customer reviews, Logistic Regression Classifier, Decision Tree Classifier, SVM How to cite this paper: Ravi Kumar Singh | Dr. 
Kamalraj Ramalingam "Amazon Product Review Sentiment Analysis with Machine Learning" Published in International Journal of Trend in Scientific Research and Development(ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-4, June 2021, pp.720-723, URL: www.ijtsrd.com/papers/ijtsrd42372.pdf Copyright © 2021 by author (s) and International Journal ofTrendinScientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0) INTRODUCTION As online marketplaces have grown in popularity over the years, online retailers and vendors have encouraged their customers to share their thoughts on the items they've purchased. Thousands of reviews are written every day on the Internet about a wide range of products, programmes, and locations. As a result, the Internet has surpassed all other sources for collecting information and opinions on a product or service. The Internet has revolutionized the way we purchase products. Wherever product testing is not feasible in the retail e-commerce environment of online marketplace. Furthermore, in today's retail sale environment, a large number of new products are introduced on a regular basis. As a result, consumers can rely heavily on product feedback to shape their opinions in preparation for a more complex cognitive process during the purchasing process. Users, on the other hand, always find looking out and comparing text reviews to be challenging. As a result, we want a higher numerical rating system that is backed up by feedback, so that consumers can easily make a buying decision. Clients can require the use of a score device at some point during their decision-making process in order to locate useful feedback as quickly as possible. As a result, models that can predict a person's score based on a textual content assessment are critical. 
Capturing the overall sentiment of a written review can improve customer service. It can also help businesses increase sales and develop their products by gaining a better understanding of what their customers want. The Amazon electronic product review dataset was used in this work. The ratings and reviews that customers gave to various products were also taken into account.

LITERATURE SURVEY

Sentiment analysis has received a lot of attention in recent years thanks to the abundance of online reviews, and numerous studies have been conducted in this area. Some of the research works most relevant to this thesis are discussed in this section.

SVM was tested for text classification by Joachims (1998), who found that it performed well in all experiments, with lower error levels than other classification methods. Using SVM, Naive Bayes, and maximum entropy classification, Pang, Lee, and Vaithyanathan (2002) applied supervised learning to classify movie reviews into two groups, positive and negative. In terms of accuracy, all three methods performed well. In this analysis, they experimented with different features and found that the machine learning algorithms performed better when a bag of words was used as the feature set in the classifiers.

Three supervised machine learning algorithms, Naive Bayes, SVM, and an N-gram model, were tested on online feedback about various travel destinations around the world in a
  • 2. recent survey conducted by Ye et al. (2009). They found that well-trained machine learning algorithms classify travel destination reviews with very high accuracy. They also showed that the SVM and N-gram models outperformed Naive Bayes. However, increasing the amount of training data narrowed the gap between the algorithms significantly.

Chaovalit and Zhou (2005) compared a supervised machine learning algorithm with an unsupervised approach to movie review classification called semantic orientation, and found that the supervised approach was more effective than the unsupervised one.

Naive Bayes and SVM are two of the most widely used methods for sentiment classification, according to several studies (Joachims 1998; Pang et al. 2002; Ye et al. 2009). As a result, this study attempts to apply supervised machine learning algorithms such as Naive Bayes and SVM to Amazon's beauty product reviews.

PROPOSED SYSTEM

The method entails gathering product-based datasets from various e-commerce sites such as amazon.com, epinion.com, and others. The reviews concern items such as phones, iPods, and other electronic devices. The aim of this project is to use algorithms such as random forest, decision tree, and SVM to evaluate and predict product reviews by classifying them as positive, negative, or neutral. Since the input consists of unstructured product reviews, we conduct pre-processing, extract the features on which comments are made, measure the polarity of the feedback, and plot a graph of the result. Dealing with negation is also covered in the results. For instance, "the Nokia phone is not bad" is a positive review despite the negative word "not."
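One common way to handle negation during pre-processing is to mark every word that follows a negation word, up to the next punctuation mark, so that "bad" and "not bad" become different features. The Python sketch below illustrates this idea only; the paper does not say which negation technique it uses, and the NEGATIONS word list, the NOT_ prefix, and the function name tokenize_with_negation are assumptions.

```python
import re

# Assumed negation cues; a real system would likely use a longer list.
NEGATIONS = {"not", "no", "never", "cannot"}
PUNCTUATION = set(".,!?;")


def tokenize_with_negation(text: str) -> list:
    """Lower-case and tokenize text, prefixing negated words with NOT_."""
    tokens = []
    negated = False
    for tok in re.findall(r"[a-z']+|[.,!?;]", text.lower()):
        if tok in PUNCTUATION:
            # Punctuation ends the scope of a negation and is dropped.
            negated = False
        elif tok in NEGATIONS or tok.endswith("n't"):
            negated = True
            tokens.append(tok)
        elif negated:
            tokens.append("NOT_" + tok)
        else:
            tokens.append(tok)
    return tokens


tokens = tokenize_with_negation("the Nokia phone is not bad")
```

With this marking, a bag-of-words classifier can learn that the feature NOT_bad tends to appear in positive reviews even though bad on its own is negative.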
The flow diagram of the approach is shown below, and its steps are explained in detail in the following subsections.

Sentiment Classification Algorithm:

Sentiment analysis, also known as opinion mining, is a problem in natural language processing (NLP) that entails recognizing and extracting subjective information from text sources. The aim of sentiment classification is to interpret user reviews and categorize them as positive or negative, without requiring the system to fully comprehend the semantics of each phrase or text. Sentiment analysis is becoming a powerful method for monitoring and analyzing consumer sentiment as people share their thoughts and feelings more freely than ever before. By automatically analyzing consumer feedback such as survey responses and social media interactions, brands can learn what makes consumers happy or unhappy. This allows them to tailor goods and services to their customers' specific requirements. Sentiment classification has been applied in different areas, such as movie reviews, travel destination reviews, and product reviews.

Random forest Classifier (RFC)

Random Forest is an ensemble method that combines multiple decision trees. Single-tree classifiers, such as decision tree classifiers, can run into issues such as outliers or noisy data, which can degrade the classifier's performance. Random Forest, in contrast, introduces randomness and is therefore highly resistant to noise and outliers. This classifier uses two different forms of randomness: data randomness and feature randomness. Because it combines multiple decision trees, this classifier has a number of hyperparameters, such as the number of trees to build in the forest, the maximum number of features that can be selected at random, and the maximum height of each tree.
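The two sources of randomness and the hyperparameters listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the names ForestConfig, draw_tree_inputs, and majority_vote are hypothetical, no trees are actually trained, and the feature subset is drawn once per tree for simplicity, whereas standard random forests redraw it at every split.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass
class ForestConfig:
    n_trees: int       # how many trees to build in the forest
    max_features: int  # how many features are selected at random
    max_depth: int     # maximum height of each tree (not used in this sketch)


def draw_tree_inputs(n_samples, n_features, config, seed=0):
    """Return one (row_indices, feature_indices) pair per tree."""
    rng = random.Random(seed)
    plans = []
    for _ in range(config.n_trees):
        # Data randomness: a bootstrap sample, drawn with replacement.
        rows = [rng.randrange(n_samples) for _ in range(n_samples)]
        # Feature randomness: a random subset of columns, without replacement.
        cols = sorted(rng.sample(range(n_features), config.max_features))
        plans.append((rows, cols))
    return plans


def majority_vote(votes):
    """Combine the per-tree predictions into the forest's prediction."""
    return Counter(votes).most_common(1)[0][0]


config = ForestConfig(n_trees=3, max_features=2, max_depth=5)
plans = draw_tree_inputs(n_samples=6, n_features=4, config=config)
prediction = majority_vote(["positive", "negative", "positive"])
```

Because each tree sees a different sample of rows and features, an outlier or noisy review influences only some of the trees, and the majority vote dampens its effect.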
Since it uses the concepts of bootstrapping and bagging, Random Forest is considered a reliable and accurate classifier.

Support vector machine (SVM)

Support vector machines (SVMs) are a type of supervised learning system that can be used to solve sentiment classification problems (Cristianini & Shawe-Taylor 2000). This approach places labelled training data in a feature space, then uses an algorithm to find an optimal hyperplane that divides the data into groups or classes. As shown in Figure 1, the best hyperplane is the one that separates the groups by the largest margin. This is done by choosing the hyperplane that is furthest from the nearest data points of each class (Berk 2016).

"H1 does not separate the classes. H2 does, but only with a small margin. H3 separates them with the maximal margin." Weinberg, Zack (2012).

Fig 1: Support Vector Machine

Logistic Regression Classifier (LRC)

Logistic regression predicts the likelihood of an outcome that has only two possible values (i.e. a dichotomy). The prediction is based on one or more predictors (numerical and categorical). Linear regression is ill-suited to predicting the value of a binary variable for two reasons. First, a linear regression would predict values outside the acceptable range (e.g. probabilities outside the range 0 to 1). Second, because a dichotomous outcome can take only one of two possible values for each observation, the residuals would not be normally distributed around the predicted line.
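The range problem described above can be made concrete with a small Python sketch. The intercept and slope are hypothetical values chosen for illustration, and the function names are not from the paper. The sketch shows a linear score leaving the interval from 0 to 1, the logistic function mapping the same score back into that interval, and the natural-log odds acting as the inverse of the logistic function.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_odds(p: float) -> float:
    """Natural logarithm of the odds p / (1 - p); the inverse of sigmoid."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients, not fitted to any data.
INTERCEPT, SLOPE = -0.5, 0.4


def linear_score(x: float) -> float:
    """Plain linear predictor; nothing keeps it between 0 and 1."""
    return INTERCEPT + SLOPE * x


for x in (-5.0, 0.0, 5.0):
    z = linear_score(x)
    print(f"x={x:+.1f}  linear={z:+.2f}  logistic={sigmoid(z):.3f}")
```

Read as a probability, the linear score is invalid at both extremes of this example, while the logistic output always stays strictly between 0 and 1.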
  • 3. A logistic regression, on the other hand, yields a logistic curve with values limited to the range 0 to 1. In logistic regression, the curve is constructed using the natural logarithm of the "odds" of the target variable, rather than the probability itself. Furthermore, the predictors do not have to be normally distributed or have equal variance in each group.

Decision Tree Classifier (DTC)

A decision tree is a hierarchical tree structure in which decision nodes represent attributes and edges represent attribute values. This tree-like representation makes it possible to derive decision rules for classifying new data instances. A decision tree is a decision-making tool that uses a tree-like model of decisions and their possible outcomes, such as chance event outcomes, resource costs, and utility. It is one way of displaying an algorithm that consists entirely of conditional control statements.

Result and Discussion

The predictive accuracy of each model is calculated after training and testing on the dataset, to decide which model is the best classifier for the reviews. As the table shows, the SVM model has the best predictive accuracy of the four models, whereas the Decision Tree model has the worst.

Model Name                        Accuracy
Logistic Regression Classifier    93.92%
Support Vector Machine            93.94%
Random Forest Classifier          93.50%
Decision Tree Classifier          90.10%

After testing a few arbitrary reviews, it seems that our features work properly, producing Positive, Neutral, and Negative outcomes. We can also see that our Support Vector Machine classifier improved to 94.08 percent accuracy after running the grid search.

Conclusion and Future Work

Sentiment analysis is the process of recognizing and aggregating user sentiment or opinions.
It involves deciding whether the polarity of the text in a document or sentence is positive, negative, or neutral. Four approaches were compared on the product review dataset, and the accuracy of each was calculated: Logistic Regression achieved 93.92 %, SVM 93.94 %, Decision Tree 90.10 %, and Random Forest 93.50 %. Among the four models, the SVM model has the highest predictive accuracy. We also observed that very large text files take a long time to process. Automatic sentiment analysis is a powerful tool for detecting current patterns and forecasting future ones.

While opinions at the feature level have been examined, there are still many limitations that can be explored further. Potential directions for future development include: supporting product reviews in a variety of languages; addressing the issue of slang mapping; dealing with sarcastically expressed views; identifying comparative views and determining which of the two products under consideration is the better one; and resolving anaphora to determine what the opinion actually refers to.

In the future, the work could be extended to multiclass classification of reviews, which would give consumers a clearer picture of a review's essence and allow them to make better product decisions. It could also be used to predict a product's rating based on the review. This would provide consumers with a trustworthy rating, because a product's rating and the sentiment of the review often contradict each other. The proposed extension would be highly beneficial to the e-commerce industry by increasing customer loyalty and confidence.

ACKNOWLEDGEMENT:

I acknowledge the support and encouragement of everyone who helped me throughout the completion of this project. I would like to thank Dr.
Dinesh Nilkhant, Director - JGI, Knowledge Campus, Bangalore, Karnataka, for providing the facilities to carry out this research work. His leadership and management skills are a continual source of inspiration. I also wish to thank Dr. M. N Nachappa, Dean, School of Computer Science & IT, Jain Deemed to be University, Knowledge Campus, Bangalore, Karnataka, for his support and cordial cooperation. I would like to thank our MCA program coordinator, Dr. Bhuvana J, Mentor and Associate Professor, Department of Master of Computer Application, for providing the support and guidance to carry out this research work. Her timely direction and motivation helped me stay patient throughout this journey. Further, I would like to express my sincere gratitude to the project coordinators, Dr. Lakshmi JVN and Dr. Gangotri, Assistant Professors, Department of Master of Computer Application, for sharing their experience, which helped me complete my thesis in the best possible way. They also helped by critically reviewing and proofreading my work and my project thesis.