International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 05 | May 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2799
Recommender System- Analyzing products by mining Data Streams
Siddhi Divekar1, Gunashree Attarde2, Adesh Chavan3, Ankita Dahiphale4, Prof. Rahul Patil5
1,2,3,4Students, Dept. of computer engineering, Pimpri Chinchwad College of engineering, Pune, Maharashtra, India.
5Professor, Dept. of computer engineering, Pimpri Chinchwad College of engineering, Pune, Maharashtra, India.
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Due to the spread of COVID-19, many people lost their
jobs, and to earn a living they started small occupations.
But these occupations are still largely unknown and are not able to
earn profits. As a helping hand to these people, we have
come up with an e-commerce website which will help them
earn profits and get real reviews from customers, which will
help them improve in their sectors. To help them earn profits, we
plan to build a recommendation system that analyses the
best-selling products using the Boyer-Moore Voting
Algorithm. The analysis of the products will also be shown
through data visualization in the Power BI software. We will
use various algorithms such as SVM, LinearSVC and Naïve
Bayes to detect whether a provided review is real or
fake.
Key Words: Data stream mining, Power BI,
Recommendation, Review, SVM, Naïve Bayes and
Ecommerce.
1. INTRODUCTION
Due to the pandemic, many people started their
own small-scale businesses. We are providing an e-commerce
platform for these small-scale entrepreneurs which will
help them sell their products and get product reviews
from customers. The recommendations will be
of two types, supported by a fake-review check:
1. The reviews from the customers will arrive in streaming
form, will then be converted into a data
visualization, and will further feed the product
recommendation system.
2. The supplementary occupations, from which customers can
also buy supplementary products along with the actual
product purchased, will feed the occupation
recommendation system.
3. We will also ensure that the submitted reviews are not
fake by applying various fake-review detection algorithms.
2. BACKGROUND
[1] Many of our regular activities have been affected by the
Internet's fast expansion. E-commerce is one of the
fastest-growing areas. Customers can post evaluations
of e-commerce services in general. These reviews
can be used as a source of data. Companies, for
example, can use them to develop goods or services, while
potential customers can use them to decide whether to
buy or use a product. Unfortunately, some people have
tried to generate false reviews in order to boost the
popularity of a product or to discredit it. The goal of
this study is to use the language and rating properties
of a review to detect fraudulent product reviews. In
summary, the suggested system (ICF++) assesses
the honesty of a review, the trustworthiness of the
reviewers, and the product's dependability. Text mining
and opinion mining techniques are used to
determine a review's honesty value. The results of the
experiment demonstrate that the suggested system has
a higher accuracy than the iterative computation
framework (ICF) method.
[2] Fake review detection has received a lot of attention in
recent years from both the business and research
communities. For reviews to represent actual user
experiences and opinions, detecting fake reviews is a significant
issue. Supervised learning has been one of the primary
methods for resolving the issue. Obtaining labelled fake
reviews for training, on the other hand, is challenging
because it is extremely difficult, if not impossible, to
reliably identify fakes by manual examination. Previous
studies have therefore used various forms of pseudo-fake
reviews for training. The pseudo-fake reviews created
with the Amazon Mechanical Turk (AMT)
crowdsourcing tool are perhaps the most intriguing.
Using only word n-gram features, an earlier study reported an
accuracy of 89.6% on AMT-created fake reviews.
This level of accuracy is both surprising and promising.
The AMT-produced reviews, albeit fake, are not actual
fake reviews from an e-commerce website. The
Turkers are unlikely to be in the same psychological
state as the authors of actual fake reviews, who
have businesses to promote or competitors to demote
while writing such reviews. This notion
is supported by our experiments. Following that, it is
reasonable to compare fake review detection accuracies
on pseudo-fake AMT data with real-life data to determine whether
different states of mind lead to different writing
and, as a result, different classification accuracies. We
undertake a comprehensive set of classification tests using
only n-gram features on real review data, using
filtered (fake) and unfiltered (non-fake) reviews from Yelp.com. Although
the accuracy of fake review detection on Yelp's
real-life data is just 68.8%, this accuracy suggests that
n-gram features are indeed useful. The
information-theoretic measure KL-divergence and its
asymmetric property are then used to offer a novel and
principled technique for determining the precise
difference between the two types of review data. This
exposes some fascinating psycholinguistic phenomena
concerning fake reviewers, both forced and natural. We
offer a new set of behavioral features about
reviewers and their reviews for learning, which
substantially improves the classification result on
real-life Yelp opinion spam data.
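The asymmetry of KL-divergence that [2] relies on can be illustrated with a small sketch. The word probabilities below are hypothetical and are not taken from [2]; the function name `kl_divergence` is likewise only an illustrative choice. The sketch assumes both distributions are defined over the same vocabulary and that the second one has no zero entries where the first is positive.

```python
import math


def kl_divergence(p, q):
    """Return KL(p || q) in nats for two discrete distributions.

    Assumes p and q are aligned over the same vocabulary and that
    q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical unigram probabilities over a three-word vocabulary.
fake_reviews = [0.9, 0.05, 0.05]
real_reviews = [0.4, 0.4, 0.2]

# The two directions give different values, which is the asymmetry.
fake_to_real = kl_divergence(fake_reviews, real_reviews)
real_to_fake = kl_divergence(real_reviews, fake_reviews)
```

Because KL(fake || real) and KL(real || fake) differ, comparing the two directions indicates which word distribution departs more strongly from the other.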
[3] Power BI has completely changed the worlds of business data
visualisation, intelligence, and analytics. Power
BI is a web-based application that enables users to
search for data, transform it, visualise it, and share the
reports and dashboards they create with other users in
the same or different departments/organizations, as
well as the general public. As of February 2017, Power
BI was used by over 200,000 businesses in 205
countries. Power BI has emerged as a viable competitor
for use as a business intelligence tool in small and
medium businesses, thanks to a free version that
includes sufficient features and capabilities. Power BI's
Quick Insights feature (Michael Hart, 2017) is a new
tool built on a growing collection of powerful logical
algorithms. After a dataset is uploaded to Power BI, a single
click activates this feature, which generates a
number of reports based on analysis of the data without
the need for human interaction. This also helps
reduce human mistakes in computations and statistical
procedures, which may otherwise result in research that isn't
verified. Power BI is simple to use as a platform for
research data analysis and visualization, and accepts
even Excel files as input. The goal of this article is to
demonstrate how quickly Power BI can turn a dataset
of research data into a collection of reports and
dashboards that can be easily shared.
[4] The ability to store, gather, and manipulate data has
greatly increased as technology has advanced. Data
analysis has grown more crucial as the amount of
information and its complexity grow at a rapid pace.
The purpose of this article is to suggest to the user
goods that are more likely to be purchased. This paper
first discusses several recommendation approaches
and research on recommendation systems, before
proposing a better strategy for a successful
recommendation system and explaining the outcomes
of that approach. On a transactional dataset, a
combination of the k-means clustering method and the
Apriori algorithm is used to provide a better
recommendation list.
[5] The quantity and influence of online reviews grow
with the growing significance of the internet
worldwide. Comments, reviews and feedback about
services are very important for product and service
providers because they influence consumers and
are frequently the most convenient way for a
customer to decide whether to buy a particular product.
Reviews can have a positive as well as a negative
impact. Hence, trusting reviews blindly is not
advisable because it involves risk for both
customers and sellers. Some selling organizations
offer incentives to people who post positive
reviews and feedback for their services;
others may pay people to write
negative reviews of their competitors' products or
services, thus exerting a bad influence on
consumers and distorting their decision to buy a
product or not. Such false reviews are called spam
reviews and are very common in online e-commerce
systems. Moreover, consumers must also be careful
while going through the reviews and selecting a
particular product or service based on them.
In this article, we explain how the
suggested system aids in the detection and removal of
false reviews, with a focus on data mining techniques
utilizing the "J48 Algorithm," as well as the system's
performance.
[6] User input in the form of app ratings and reviews is
becoming increasingly common in app stores.
Researchers and, more recently, tool providers have
provided analytics and data mining solutions to
developers and analysts, e.g. to assist release choices.
Positive feedback, according to research, boosts app
downloads and revenue, and therefore an app's success. As a
result, a market for bogus, incentivized app
reviews arose, with yet-to-be-determined
ramifications for developers, app users and
owners. This study investigates fake reviews, their
sources, characteristics, and the degree to which they
may be identified automatically. To understand their
tactics and services, we ran disguised questionnaires
with 43 fake review providers and analyzed their
review rules. We discovered substantial differences
in the corresponding apps, reviewers, rating
distribution, and frequency by comparing 60 thousand
fake reviews with 62 million reviews from the App
Store. This prompted the creation of a simple classifier
that can automatically detect fake app store
reviews. Our classifier has a recall of 91 percent and an
AUC/ROC value of 98 percent on a labelled and
imbalanced dataset with one-tenth fake reviews, as
documented in other areas. Our findings are discussed,
as well as their implications for software engineering,
app consumers, and app store owners.
[7] The importance of online reviews for businesses
has risen dramatically in recent years, and they are now
critical in determining business performance in a wide
range of industries, from restaurants to hotels to
e-commerce. Unfortunately, some individuals use
unethical methods to boost their online image, such
as creating false reviews of their own companies or
competitors. Fake review detection has already been
studied in a variety of sectors, including product and
business reviews for restaurants and hotels. Despite
its economic importance, however, the consumer
electronics industry has yet to be properly investigated.
This paper presents a feature framework for identifying
fraudulent reviews in the consumer electronics domain,
which has been tested. The four-part contribution is as
follows: a) creating a dataset covering four different cities
for the consumer electronics domain in order to classify
fake reviews; b) defining a feature framework for the
detection of fake reviews; c) developing a classification
method based on the proposed framework; and d)
analysing the results for each city. The AdaBoost
classifier has been shown to be the best by statistical
methods according to the Friedman test, with an
F-score of 82 percent on the classification task.
[8] In this field of study, two types of datasets are typically
used: pseudo-fake and real-life reviews. The
literature shows that classification models perform
worse on real-life datasets than on pseudo-fake
reviews. Following our analysis, we discovered that
behavioral and contextual features are crucial for
detecting fraudulent reviews. In particular, we used
an important behavioral feature of reviewers known as
"reviewer deviation." Our research focuses on the
relationship between reviewer deviation and other
contextual and behavioral features. The relevance of
a selected feature set for a classification algorithm to
detect fraudulent reviews was empirically
demonstrated. We ranked the features in the chosen feature
set, and reviewer deviation came in eighth. We scaled
the dataset to test the feasibility of the selected feature
set and found that scaling the dataset can increase both
recall and accuracy. A contextual feature in our chosen
feature set captures text similarity between a
reviewer's reviews. For calculating the text similarity of
reviews, we used the NNC, LTC, and BM25 term
weighting schemes. BM25 outperformed the other term
weighting schemes, according to our findings.
3. PROPOSED SYSTEM
Fig-3.1-System Diagram
3.1 WORKING OF THE SYSTEM
The main motive of our system is to provide recommendations
and to identify whether a review is true or fake.
To achieve the first motive, recommendation, we
will create a Google Form which will take feedback
from the customers about the purchased products. The
data from the form will then be converted into an Excel sheet,
which will be the input to the Power BI software, which will
give us a clear data visualization of the product sales.
Further, after data preprocessing, this product sales data will be
given as input to the streaming algorithm (Boyer-Moore voting
streaming algorithm), which will help us
identify the best-selling products and help the
small-scale entrepreneurs analyze their profits and losses.
The next motive is to let the small-scale entrepreneurs know
whether the reviews provided through the Google feedback
form are true or fake. We will use various machine
learning algorithms such as Naïve Bayes, SVM and random forest,
which will help us classify whether a review is true or fake.
The parameters on which a review will be classified are:
Time span of the review
Technical terms in the review
Ratings
Verification of the purchase
Inspection of the user profile
Customer Jacking
3.2 STREAMING ALGORITHMS
Boyer-Moore voting streaming algorithm:
The Boyer-Moore voting method is one of the most commonly
used optimal algorithms for finding the majority element,
that is, the element with more than N/2 occurrences among N elements.
Using this algorithm, we will obtain the best-selling product as
the output for recommendation purposes.
Time Complexity = O(N) Space Complexity = O(1)
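A minimal sketch of the voting method is given here for illustration; it is not the authors' implementation, and the product names and function names are hypothetical. The first pass keeps a single candidate and a counter, which is where the O(1) space bound comes from. Since the candidate is only guaranteed to be correct when a majority element actually exists, the sketch adds a second counting pass to confirm it.

```python
from typing import Hashable, Iterable, Optional


def majority_candidate(stream: Iterable[Hashable]) -> Optional[Hashable]:
    """Single pass, O(1) extra space: return the only possible majority element."""
    candidate, count = None, 0
    for item in stream:
        if count == 0:
            # No surviving votes: adopt the current item as the new candidate.
            candidate, count = item, 1
        elif item == candidate:
            count += 1
        else:
            # A different item cancels one vote of the candidate.
            count -= 1
    return candidate


def majority_product(sales: list) -> Optional[Hashable]:
    """Return the product with more than N/2 sales, or None if there is none."""
    candidate = majority_candidate(sales)
    # Second pass: the candidate is a majority only if it really exceeds N/2.
    if sales and sales.count(candidate) > len(sales) // 2:
        return candidate
    return None


# Hypothetical stream of product names, one entry per sale.
sales_log = ["soap", "candle", "soap", "soap", "bag", "soap", "soap"]
best_seller = majority_product(sales_log)
no_majority = majority_product(["soap", "candle", "bag"])
```

In this sketch `best_seller` is "soap", which accounts for five of the seven sales, while `no_majority` is None because no product exceeds half of the three sales. When no product holds a strict majority, the voting method alone cannot name a best seller, so a different frequency-counting approach would be needed in that case.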
3.3 ML Algorithms for Fake Review Detection
Gaussian Naïve Bayes:
Gaussian Naïve Bayes is a variant of the Naïve Bayes
algorithm which assumes that the features follow a Gaussian
(normal) distribution. It is therefore suited to continuous
data.
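The idea can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the system's actual classifier: the two features (star rating and review length in words), the tiny training set and the function names are all hypothetical. Each class stores a prior plus a per-feature mean and variance, and prediction picks the class with the highest log-posterior.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate the prior and the per-feature mean and variance of each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        columns = list(zip(*members))
        means = [sum(col) / len(col) for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / len(col) + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (len(members) / len(rows), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log-posterior for one feature row."""
    def log_posterior(prior, means, variances):
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            # Log of the Gaussian density with mean m and variance v.
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=lambda label: log_posterior(*model[label]))


# Hypothetical features per review: (star rating, review length in words).
train_rows = [(5, 8), (5, 6), (1, 7), (4, 60), (3, 45), (4, 80)]
train_labels = ["fake", "fake", "fake", "real", "real", "real"]
model = fit_gaussian_nb(train_rows, train_labels)
short_extreme = predict_gaussian_nb(model, (5, 7))
long_moderate = predict_gaussian_nb(model, (4, 55))
```

On this toy data the short five-star review is labelled "fake" and the longer four-star review "real". In practice a library implementation such as scikit-learn's `GaussianNB` would normally be used instead of hand-written code.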
LinearSVC:
This classifier separates the data into classes by finding the
best-fitting hyperplane.
SVM:
Various investigations have shown that SVC's
default kernel, the Radial Basis Function (RBF) kernel,
yields a nonlinear decision boundary, which can
greatly outperform a linear decision boundary on
this dataset.
Random Forest: This approach, which is provided by the
sklearn package, has also been used for classification. It
builds numerous decision trees, each fitted on a random sample
of the training data.
After applying all of these classifiers, their accuracies
in detecting fake reviews are evaluated and compared.
Fig-3.2-Flow Diagram
3.4 SDLC MODEL:
We will be using the iterative model, because the iterative
methodology starts with a modest implementation of a
limited set of software requirements and repeatedly
improves the evolving versions until the entire system is
built and ready for deployment. The Iterative and
Incremental model is depicted in the figure below.
Fig-3.3-Model
3.5 UML
Fig-3.4-Use case Diagram
4. DEMO OF POWER BI STREAMING
Fig-4.1-Streaming data from the form
Fig-4.2- Successfully updated timestamp of streaming data
Fig-4.3-Data visualization of the product sales
6. CONCLUSION
We presented an overview of our e-commerce website,
which will help people earn profits through similar-occupation
recommendations for the searched product and
also get true reviews of their sales, so that these reviews
help them improve in their field. As future scope, the
website can also be used for marketing and
advertising the products to earn more profits.
7. ACKNOWLEDGEMENT
We express our heartfelt gratitude to Prof. Rahul Patil, our
Project Guide, for his encouragement and support
throughout our project, particularly for the helpful suggestions
made during the project and for laying the groundwork for
the accomplishment of our work.
We would also like to express our sincere gratitude to Prof. Dr.
S. V. Shinde, our Research & Innovation coordinator, and
Prof. S. R. Vispute, our Project Coordinator, for their help,
sincere support, and guidance from the beginning of the
seminar until the end. We would like to express our gratitude to
Prof. Dr. K. Rajeswari, Head of the Computer Engineering
Department, for her unflinching support during the seminar.
REFERENCES
[1] https://www.researchgate.net/publication/303499094_Fake_Review_Detection_From_a_Product_Review_Using_Modified_Method_of_Iterative_Computation_Framework
[2] http://www2.cs.uh.edu/~arjun/papers/UIC-CS-TR-yelp-spam.pdf
[3] http://ir.inflibnet.ac.in:8080/ir/bitstream/1944/2116/1/2
[4] Application of Data Mining to E-Commerce Recommendation Systems
[5] https://www.ijsr.net/archive/v7i10/ART20191163.pdf
[6] https://ir.inflibnet.ac.in/bitstream/1944/2116/1/24.pdf
[7] https://www.youtube.com/watch?v=AGrl-H87pRU

Power BI is simple to use as a platform for research data analysis and visualization, and it accepts even Excel files as input. The goal of this article is to demonstrate how quickly Power BI can turn a dataset of research data into a collection of reports and dashboards that can be easily shared.

[4] The ability to store, gather, and manipulate data has greatly increased as technology has advanced. Data analysis has grown more important as the amount and complexity of information grow at a rapid pace. The purpose of this article is to suggest to the user goods that they are more likely to purchase. The paper first discusses several recommendation approaches and research on recommendation systems, then proposes a better strategy for a successful recommendation system and explains the outcomes of that approach. A combination of the k-means clustering method and the Apriori algorithm is applied to a transactional dataset to produce a better recommendation list.

[5] The quantity and influence of online reviews grow with the growing significance of the Internet worldwide. Comments, reviews and feedback about services are very important for item and service providers because they influence consumers and are frequently the most convenient way for a customer to decide whether to buy a particular product. Reviews can have a positive as well as a negative impact. Trusting reviews blindly is therefore not advisable, because it involves risk for both customers and sellers. Some selling organizations offer incentives to people who post positive reviews and feedback for their services, while others pay people to write negative reviews of their competitors' products and services. Such practices mislead consumers and distort their decision on whether to buy a product. Such false reviews are called spam reviews and are very common in online e-commerce systems.
Moreover, consumers must be careful when reading reviews and choosing a particular product or service on the basis of them. This article explains how the suggested system aids in the detection and removal of false reviews, with a focus on data mining techniques using the "J48 Algorithm", and reports the system's performance.

[6] User input in the form of app ratings and reviews is becoming increasingly common in app stores. Researchers and, more recently, tool providers have offered analytics and data mining solutions to developers and analysts, for example to assist release decisions. According to research, positive feedback boosts app downloads and revenue, and therefore an app's success. As a result, a market for fake, incentivized app reviews arose, with yet-to-be-determined ramifications for developers, app users and app store owners. This study investigates fake reviews, their sources, their characteristics, and the degree to which they can be identified automatically. To understand their tactics and services, the authors ran disguised questionnaires with 43 fake review providers and analyzed their review policies. Comparing 60,000 fake reviews with 62 million reviews from the App Store revealed substantial differences in the corresponding apps, reviewers, rating distribution, and frequency. This prompted the creation of a simple classifier that can automatically detect fake app store reviews. The classifier has a recall of 91 percent and an AUC/ROC value of 98 percent on a labelled and imbalanced dataset containing one-tenth fake reviews, as reported in other domains. The findings are discussed, as well as their implications for software engineering, app users, and app store owners.

[7] The importance of online reviews to businesses has risen dramatically in recent years, and they are now critical in determining business performance in a wide range of industries, from restaurants to hotels to e-commerce. Unfortunately, some individuals use unethical methods to boost their online image, such as creating false reviews of their own companies or of competitors. Fake review detection has already been studied in a variety of sectors, including product and company reviews for restaurants and hotels. Despite its economic importance, however, the consumer electronics industry has yet to be properly investigated. This paper presents and tests a feature framework for identifying fraudulent reviews in the consumer electronics domain. The four-part contribution is as follows: a) creating a dataset covering four different cities for the consumer electronics domain in order to classify fake reviews; b) identifying a feature framework for the detection of false reviews; c) developing a classification method based on the proposed framework; d) analysing the output for each city. The AdaBoost classifier was shown to be the best by statistical methods, according to the Friedman test, with an F-score of 82 percent on the classification task.

[8] In this field of study, two types of datasets are typically used: pseudo-fake and real-life reviews. The literature shows that classification models perform worse on real-world datasets than on pseudo-fake reviews. The analysis in this study found that behavioral and contextual factors are crucial for detecting fraudulent reviews. In particular, it uses an important behavioral aspect of reviewers known as "reviewer deviation". The research focuses on the relationship between reviewer deviation and other contextual and behavioral factors. The relevance of a certain feature set for a classification algorithm to detect fraudulent reviews was empirically demonstrated.
The features in the chosen feature set were ranked, and reviewer deviation came in eighth. The dataset was scaled to test the feasibility of the selected feature set, and scaling was found to increase both recall and accuracy. A contextual feature in the chosen feature set captures the text similarity between a reviewer's reviews. To calculate the text similarity of reviews, the NNC, LTC, and BM25 term weighting methods were used. According to the findings, BM25 outperformed the other term weighting schemes.

3. PROPOSED SYSTEM

Fig-3.1-System Diagram

3.1 WORKING OF THE SYSTEM

The main motive of our system is to produce recommendations and to identify whether a review is true or fake. To achieve the first motive, recommendation, we will create a Google Form that collects feedback from customers about the products they purchased. The form data will then be exported to an Excel sheet, which serves as input to Power BI and gives us a clear visualization of product sales. After data preprocessing, this product sales data will be given as input to the streaming algorithm (the Boyer-Moore voting algorithm), which will help us identify the best-selling product and so help small-scale entrepreneurs analyze their profit and loss.

The next motive is to let the small-scale entrepreneurs know whether the reviews submitted through the Google feedback form are true or fake. We will use several machine learning algorithms, such as Naïve Bayes, SVM and random forest, which will help us classify whether a review is true or fake. Parameters on which the review will be classified are: time span of the review; technical terms in the review; ratings; verification of the purchase; inspection of the user profile; customer jacking.

3.2 STREAMING ALGORITHMS

Boyer-Moore voting algorithm: The Boyer-Moore voting algorithm is one of the most widely used one-pass methods for finding the majority element of a sequence, that is, the element that occurs more than N/2 times among N elements, when such an element exists.
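The voting step described in Section 3.2 can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function names and the sample sales list are hypothetical, and the second counting pass is an added check, since the voting pass alone only yields a candidate and is correct only when some product accounts for more than half of the records.

```python
from typing import Hashable, Iterable, List, Optional


def majority_candidate(stream: Iterable[Hashable]) -> Optional[Hashable]:
    """Single pass, O(1) extra space: keep one candidate and a counter."""
    candidate = None
    count = 0
    for item in stream:
        if count == 0:
            # No current candidate: adopt the incoming item.
            candidate = item
            count = 1
        elif item == candidate:
            count += 1
        else:
            # A different item cancels one vote for the candidate.
            count -= 1
    return candidate


def majority_element(items: List[Hashable]) -> Optional[Hashable]:
    """Return the item occurring more than N/2 times, or None if there is none."""
    candidate = majority_candidate(items)
    # Verification pass: the voting pass does not prove a majority exists.
    if candidate is not None and items.count(candidate) > len(items) // 2:
        return candidate
    return None


# Hypothetical product names taken from preprocessed sales records.
sales = ["mask", "soap", "mask", "mask", "candle", "mask", "mask"]
print(majority_element(sales))  # prints: mask
```

If no single product exceeds half of all sales, `majority_element` returns `None`; in that case a plain frequency count would be needed to rank products.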
Using this algorithm, we obtain the best-selling product as the output, which is then used for recommendation.
Time Complexity = O(N)
Space Complexity = O(1)

3.3 ML Algorithms for Fake Review detection

Gaussian Naïve Bayes: Gaussian Naïve Bayes is a variant of the Naïve Bayes algorithm that assumes each feature follows a Gaussian (normal) distribution within each class. It is suited to continuous-valued data.

LinearSVC: This classifier divides data into groups by finding the best-fitting separating hyperplane.

SVM: With SVC's default kernel, the Radial Basis Function (RBF) kernel, the classifier learns a nonlinear decision boundary. Various investigations have found that such a boundary can greatly outperform a linear decision boundary, depending on the dataset.

Random Forest: This approach, which is provided by the sklearn package, is also used for classification. It builds numerous decision trees, each trained on a random sample of the training data.

After all of these classifiers are applied, their accuracies in detecting false reviews are evaluated and compared.

Fig-3.2-Flow Diagram

3.4 SDLC MODEL

We will be using the ITERATIVE MODEL, because the iterative methodology starts with a modest implementation of a limited set of software requirements and repeatedly improves the evolving versions until the entire system is built and ready for deployment. The Iterative and Incremental model is depicted in the figure below.

Fig-3.3-Model

3.5 UML

Fig-3.4-Use case Diagram
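To make the Gaussian Naïve Bayes description in Section 3.3 concrete, here is a small from-scratch sketch. Everything in it is an assumption made for illustration: the two features (star rating and review word count), the toy training rows, the function names, and the `min_variance` floor are not from the paper, and a real system would more likely use `GaussianNB` from sklearn alongside the other classifiers.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, min_variance=1e-9):
    """Estimate a prior plus per-feature mean and variance for each class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Floor the variance so a constant feature cannot cause division by zero.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, min_variance)
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Pick the class with the highest log prior plus Gaussian log-likelihood."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up toy data: [star rating, review length in words].
X_train = [[4, 60], [3, 85], [5, 70], [2, 95], [5, 8], [5, 12], [1, 10], [5, 6]]
y_train = ["genuine"] * 4 + ["fake"] * 4

model = fit_gaussian_nb(X_train, y_train)
print(predict_gaussian_nb(model, [5, 9]))   # short 5-star review
print(predict_gaussian_nb(model, [3, 80]))  # longer 3-star review
```

The same train-then-predict pattern applies to LinearSVC, SVC and random forest, so their accuracies can be compared on one held-out set of labelled reviews.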
4. DEMO OF POWER BI STREAMING

Fig-4.1-Streaming data from the form

Fig-4.2-Successfully updated timestamp of streaming data

Fig-4.3-Data visualization of the product sales

6. CONCLUSION

We presented an overview of our e-commerce website, which will help people earn profits by recommending similar occupations for a searched product, and will give them genuine reviews of their sales so that they can improve in their field. As future scope, the website can also be used for marketing and advertising the products to earn more profit.

7. ACKNOWLEDGEMENT

We express our heartfelt gratitude to Prof. Rahul Patil, our Project Guide, for his encouragement and support throughout our project, particularly for the helpful suggestions made during the project and for laying the groundwork for its completion. We would also like to express our sincere gratitude to Prof. Dr. S. V. Shinde, our Research & Innovation Coordinator, and Prof. S. R. Vispute, our Project Coordinator, for their help, support, and guidance from the beginning of the seminar until the end. We would like to express our gratitude to Prof. Dr. K. Rajeswari, Head of the Computer Engineering Department, for her unflinching support during the seminar.
REFERENCES

[1] https://www.researchgate.net/publication/303499094_Fake_Review_Detection_From_a_Product_Review_Using_Modified_Method_of_Iterative_Computation_Framework
[2] http://www2.cs.uh.edu/~arjun/papers/UIC-CS-TR-yelp-spam.pdf
[3] http://ir.inflibnet.ac.in:8080/ir/bitstream/1944/2116/1/2
[4] Application of Data Mining to E-Commerce Recommendation Systems
[5] https://www.ijsr.net/archive/v7i10/ART20191163.pdf
[6] https://ir.inflibnet.ac.in/bitstream/1944/2116/1/24.pdf
[7] https://www.youtube.com/watch?v=AGrl-H87pRU