CYBERBULLYING DETECTION USING
MACHINE LEARNING
PRESENTED BY GROUP I
Under the Guidance of
Ms. Surya Ashok,
HOD, Computer Science Department
TEAM MEMBERS:
ANITHA R
KRITHIKA V S
MEGHA M S
PRANIDHI K J
ABSTRACT
● With the widespread use of social media in this era,
cyberbullying has increased rapidly as a form of cybercrime.
● Cyberbullying is willful and repeated harm inflicted
through the use of computers, cell phones, and other electronic devices.
● The proposed system aims to detect cyberbullying by identifying abusive
comments and messages on social media platforms.
● The machine learning algorithm Naive Bayes is used to classify comments and
messages as bullying or non-bullying.
● The project ‘Cyberbullying Detection Using Machine Learning’ discusses and
implements a machine learning approach to address the threat of
cyberbullying, and thus make social media a safer place for its users.
SYSTEM SPECIFICATIONS
Hardware Specification
Processor : Intel Core i5
Speed : Above 1 GHz
RAM capacity : 4 GB or above
Hard Disk Space Required : 5 GB or above
Keyboard : Standard Keyboard
Mouse : Standard Mouse
Monitor : Standard color monitor
Software Specification
● Language Used : Python 3.10, HTML5, JavaScript ES6
➔ Here, HTML and JavaScript are used for designing the web application.
➔ The main advantage of using Python in this project is that it is open source.
➔ It also has a vast ecosystem of machine learning libraries.
● Web Framework : Django 3.7
➔ Django is preferred in this project because of its simplicity, flexibility, reliability and scalability.
● Database : SQL Server 2019
➔ SQL Server 2019 (15.x) introduces new ways to work with SQL Server containers, such as
Machine Learning Services.
➔ It supports query interleaving, a tabular-mode system configuration in Analysis Services
that can improve user query response times in high-concurrency scenarios.
EXISTING SYSTEM
● For several years, researchers have worked intensively on cyberbullying
detection to find a way to control or reduce cyberbullying on social media
platforms.
● In research at the Massachusetts Institute of Technology, a system was
developed to detect cyberbullying through textual context in YouTube video
comments, but it showed less precise classification and an increased number
of false positives.
● Most existing systems focus on the effects after a cyberbullying incident,
and there is no accurate system for online cyberbullying detection.
PROPOSED SYSTEM
● The proposed system employs machine learning to avoid human
intervention.
● A dataset containing bullying and non-bullying comments is used to
train the machine learning model using the scikit-learn (sklearn) library in Python.
● The Naive Bayes algorithm is used for detecting abusive comments and
messages on social media.
● The Naive Bayes algorithm is based on Bayes' theorem, which states that:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
● The proposed system implements automated detection of bullying comments
on social media.
● The proposed system is platform independent: it can be implemented on
any operating system, and it is free to use.
MODULE DESCRIPTION
● User module.
● Admin module.
● Machine learning module.
MODULE FUNCTIONALITIES
❏ USER MODULE
● Users can sign up to the web application by registering with
details such as user name and password.
● Registered users can sign in to their profile using their user id and password.
● They can post videos, stories and photos in the web application.
● Users can send friend requests to other users and can also chat with their
friends.
● Users can view, like and comment on the videos and photos posted by their
friends in the web application.
❏ ADMIN MODULE
● The admin can manage and make changes to the web application.
● They can also view the requests from users.
● They can also view the comments that have been classified as bullying
or non-bullying.
● They can manage the notifications of users.
❏ MACHINE LEARNING MODULE
● The Machine Learning module is responsible for classifying
comments and messages as bullying or non-bullying.
● From a vast set of comments and messages, the Naive Bayes
algorithm is used to predict bullying comments and messages.
● This module includes the following steps :
➢ Data collection
➢ Data preprocessing
➢ Segmentation
➢ Feature extraction
➢ Training
➢ Testing
FLOWCHART OF CYBERBULLYING DETECTION SYSTEM
1. DATA COLLECTION
● Collecting data for training the Machine Learning model is the basic step
in the machine learning pipeline.
● The predictions made by Machine Learning systems can only be as good as
the data on which they have been trained.
● In this system, the dataset contains both bullying and non-bullying
comments and messages.
● The dataset is downloaded from the Kaggle website.
● 80% of the dataset is used for training and the remaining 20% is used for
testing.
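The 80/20 split described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function name `train_test_split_80_20`, the fixed seed, and the toy `(text, label)` pairs are assumptions made for the example.

```python
import random

def train_test_split_80_20(samples, seed=42):
    """Shuffle labelled samples and split them into 80% train / 20% test."""
    shuffled = list(samples)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    cut = int(len(shuffled) * 0.8)         # index where the test portion starts
    return shuffled[:cut], shuffled[cut:]

# Hypothetical toy data: (text, label) with 1 = bullying, 0 = non-bullying.
data = [(f"comment {i}", i % 2) for i in range(10)]
train, test = train_test_split_80_20(data)
print(len(train), len(test))
```

Shuffling before splitting keeps the two parts from inheriting any ordering in the downloaded file, such as all bullying comments appearing first.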
2. DATA PREPROCESSING
● Real-world raw data and images are often incomplete, inconsistent and lacking in
certain behaviors or trends. They are also likely to contain many errors. So, once
collected, they are pre-processed into a format the machine learning algorithm
can use for the model.
● Data preprocessing in Machine Learning is a crucial step that helps enhance the
quality of data to promote the extraction of meaningful insights from the data.
● The preprocessing step also includes the removal of stop words and special characters,
and the conversion of uppercase letters to lowercase.
● The lemmatization step converts inflected words into their root form. For
example, the word "running" is converted to its root word "run".
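The preprocessing steps above (lowercasing, removing special characters and stop words, lemmatization) can be sketched as follows. The stop-word list and the `LEMMAS` lookup table are tiny hypothetical stand-ins chosen for this example; a real system would more likely use a dictionary-based lemmatizer from an NLP library.

```python
import re

# Assumption: a tiny illustrative stop-word list and lemma table.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "to", "and", "in", "of"}
LEMMAS = {"running": "run", "ran": "run", "studies": "study"}

def preprocess(text):
    """Lowercase, strip special characters, drop stop words, lemmatize."""
    text = text.lower()                    # uppercase -> lowercase
    text = re.sub(r"[^a-z\s]", " ", text)  # replace non-letters with spaces
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return [LEMMAS.get(t, t) for t in tokens]  # unknown words pass through

cleaned = preprocess("The boy is RUNNING to school!!!")
print(cleaned)  # ['boy', 'run', 'school']
```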
3. SEGMENTATION
● Segmentation can be defined as the process of separating sentences
into different tokens.
● N-grams are used for grouping tokens.
● N-grams have many uses, for example auto-completion of
sentences.
● In this project, 2-grams (bigrams) are used to group tokens.
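As a small illustration of grouping tokens into 2-grams, the sketch below slides a window of size `n` over a token list. The helper name `ngrams` and the sample sentence are assumptions made for this example.

```python
def ngrams(tokens, n=2):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "you are so stupid".split()
bigrams = ngrams(tokens, 2)
print(bigrams)  # [('you', 'are'), ('are', 'so'), ('so', 'stupid')]
```

Bigrams keep some word order, so a pair such as ("so", "stupid") can carry more signal than the two words counted separately.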
4. FEATURE EXTRACTION
● Feature extraction is the process of taking out a list of words from the text data
and then transforming them into a feature set which is usable by a classifier.
● In this system, a TF-IDF vectorizer is used for feature extraction.
● TF-IDF stands for term frequency-inverse document frequency. It is a
measure used to quantify the importance or relevance of string
representations in a document.
● TF-IDF associates each word in a document with a number that represents how
relevant that word is in that document.
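To show how TF-IDF assigns a number to each word, here is a minimal sketch of the textbook formula: term frequency (count divided by document length) times the natural log of (number of documents / documents containing the term). This is an illustrative simplification, not the project's code: library vectorizers such as scikit-learn's `TfidfVectorizer` apply a smoothed variant with normalization by default, so their numbers differ. The function name and toy documents are assumptions.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute textbook TF-IDF weights for a list of tokenized documents."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus of already-tokenized comments.
docs = [["you", "are", "stupid"], ["you", "are", "kind"], ["nice", "photo"]]
w = tf_idf(docs)
print(w[0])
```

In the first document, "stupid" appears in only one of the three documents, so it receives a higher weight than "you", which appears in two.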
5. TRAINING
● Model training is the key step in machine learning that results in a model ready
to be validated, tested, and deployed.
● The performance of the model determines the quality of the applications that
are built using it.
● The quality of the training data and the choice of training algorithm are both
important factors during the model training phase.
● Typically, the dataset is split into training and testing sets.
● All these aspects of model training make it both an involved and important
process in the overall machine learning development cycle.
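A training step along these lines could look like the sketch below, which chains scikit-learn's `TfidfVectorizer` and `MultinomialNB` in a pipeline. The toy comments and labels are hypothetical, and `ngram_range=(1, 2)` (unigrams plus bigrams) is an assumption; the slides only state that 2-grams are used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; 1 = bullying, 0 = non-bullying.
texts = [
    "you are stupid",
    "you are ugly",
    "nobody likes you",
    "have a nice day",
    "great photo",
    "see you tomorrow",
]
labels = [1, 1, 1, 0, 0, 0]

# Vectorize text into TF-IDF features, then fit Multinomial Naive Bayes.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),
    MultinomialNB(),
)
model.fit(texts, labels)

predictions = model.predict(["you are stupid", "have a nice day"])
print(predictions)
```

Wrapping both steps in one pipeline means new comments pass through the same vectorizer that was fitted on the training data.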
6. TESTING
● In machine learning, model testing is the process in which
the performance of a fully trained model is evaluated on a testing set.
● The testing set, which consists of a set of testing samples, should be
kept separate from both the training and validation sets, but it should
follow the same probability distribution as the training set.
● Each testing sample has a known value of the target.
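Because each testing sample has a known target, evaluation reduces to comparing predictions with those targets. The sketch below computes accuracy on a held-out set. The `keyword_model` function is a hypothetical stand-in for a trained classifier, and the test samples are invented for illustration.

```python
def accuracy(predict, test_set):
    """Fraction of test samples whose prediction matches the known label."""
    correct = sum(1 for text, label in test_set if predict(text) == label)
    return correct / len(test_set)

def keyword_model(text):
    """Hypothetical stand-in for a trained classifier (1 = bullying)."""
    return 1 if "stupid" in text.lower() else 0

# Held-out samples with known labels; never used during training.
test_set = [
    ("you are stupid", 1),
    ("nice photo", 0),
    ("nobody likes you", 1),
    ("see you tomorrow", 0),
]
score = accuracy(keyword_model, test_set)
print(score)  # 0.75
```

The stand-in misses "nobody likes you", which is why the score is 3 out of 4.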
DOMAIN THEORY
➔ Machine learning
● Machine learning (ML) is the study of computer algorithms that improve
automatically through experience.
● Machine learning involves computers discovering how they can perform tasks
without being explicitly programmed to do so.
● The Machine Learning process starts with inputting training data into the
selected algorithm.
● New input data is fed into the machine learning algorithm to test whether the
algorithm works correctly.
➔ NAIVE BAYES
● A Naive Bayes classifier is a probabilistic machine learning model
used for classification tasks.
● The classifier is based on Bayes' theorem.
Bayes' Theorem:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
Here A is the class (bullying or non-bullying) and B is the observed text.
● This system uses the Multinomial Naive Bayes classifier.
● The features/predictors used by the classifier are the frequencies of
the words present in the document.
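To make the role of word frequencies concrete, the following is a minimal pure-Python sketch of Multinomial Naive Bayes. Add-one (Laplace) smoothing, skipping of unseen words, the function names, and the toy samples are all assumptions made for this illustration; the slides do not specify these details, and a library implementation such as scikit-learn's `MultinomialNB` would normally be used instead.

```python
import math
from collections import Counter

def train_nb(samples):
    """samples: list of (tokens, label). Returns counts needed for prediction."""
    class_docs = Counter(label for _, label in samples)
    word_counts = {label: Counter() for label in class_docs}
    for tokens, label in samples:
        word_counts[label].update(tokens)
    vocab = {t for tokens, _ in samples for t in tokens}
    return class_docs, word_counts, vocab

def predict_nb(model, tokens):
    """Pick the class with the highest log P(class) + sum of log P(word | class)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label in class_docs:
        total_words = sum(word_counts[label].values())
        score = math.log(class_docs[label] / total_docs)  # log prior
        for t in tokens:
            if t in vocab:  # words never seen in training are skipped
                score += math.log(
                    (word_counts[label][t] + 1) / (total_words + len(vocab))
                )
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy data; 1 = bullying, 0 = non-bullying.
samples = [
    ("you are stupid".split(), 1),
    ("nobody likes you".split(), 1),
    ("nice photo".split(), 0),
    ("see you tomorrow".split(), 0),
]
nb_model = train_nb(samples)
print(predict_nb(nb_model, "you are so stupid".split()))  # 1
print(predict_nb(nb_model, "nice photo".split()))         # 0
```

P(B) is the same for every class, so it can be dropped when comparing classes; working in log space avoids multiplying many small probabilities.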
CONFUSION MATRIX
Fig : Confusion Matrix
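The figure itself is not reproduced in this text. As a hedged illustration of what a confusion matrix for a two-class bullying detector contains, the sketch below counts true/false positives and negatives and derives precision and recall. The label lists are invented for the example and are not the project's results.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) treating label 1 (bullying) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

# Hypothetical labels: 1 = bullying, 0 = non-bullying.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
precision = tp / (tp + fp)  # how many flagged comments were truly bullying
recall = tp / (tp + fn)     # how many bullying comments were caught
print(tp, fp, fn, tn, precision, recall)  # 3 1 1 3 0.75 0.75
```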
DATABASE TABLE
ADMIN
USER
POST
MESSAGES
COMMENTS
USER PROFILE
DATA FLOW DIAGRAMS
Fig. : Level 0 DFD
Fig.: Level 1 DFD
Fig.: Level 1 DFD of user
LEVEL 1.1 DFD OF ADMIN
ER DIAGRAM
ADMIN LOGIN
ADMIN HOME PAGE
SIGNUP PAGE
LOGIN PAGE
HOME PAGE
WARNING MESSAGE
RESTRICTED ACCOUNT
CONCLUSION
The overall aim of the project “Cyberbullying Detection Using Machine
Learning” is to develop a system that automatically classifies comments
and messages as bullying or non-bullying and also removes the bullying
comments from the web application.
