CmpE 274 – Business Intelligence Technologies
Jinal Shah (ID-005242095), Sohel Dadia (ID-005177251), Ankit Khera (ID-005226495), Riddhi Shah (ID-005359513), Vivek Modi (ID-005208581), Parth Vora (ID-005169100)
-- Knowledge Discovery
-- KDD Process
-- Data Mining Algorithms
-- Different Forms of Mining Models
-- Classification of Algorithms
-- Weka
-- Demo
-- Questions
Knowledge discovery is the process of extracting knowledge from data; it focuses on the high-level application of data mining methods. Its main goal is to extract information from raw data in the context of large databases, using different data mining algorithms to do so.
KDD is used in machine learning, pattern recognition, databases, AI, MIS, and many other applications. It transforms data into knowledge according to specified measures and thresholds, and it takes into account any preprocessing, sub-sampling, and transformation of the database that may be required.
1. Data Cleaning
2. Data Integration
3. Data Selection
4. Data Transformation
5. Data Mining
6. Pattern Evaluation
7. Knowledge Presentation
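The seven steps can be sketched as a chain of small functions. This is only an illustrative sketch: the records, the field names, the function names, and the 0.5 support threshold are all hypothetical, and the "mining" step is reduced to counting how many customer baskets contain each item.

```python
from collections import Counter

# Hypothetical raw records from two sources; None marks a missing value.
store_sales = [
    {"customer": "a", "item": "Bread", "qty": 2},
    {"customer": "b", "item": "milk ", "qty": None},  # dirty: quantity missing
    {"customer": "a", "item": "milk", "qty": 1},
]
web_sales = [
    {"customer": "c", "item": "bread", "qty": 1},
    {"customer": "d", "item": "eggs", "qty": 3},
]

def clean(records):
    """Step 1: drop records with missing values and normalize item names."""
    return [
        {**r, "item": r["item"].strip().lower()}
        for r in records
        if all(v is not None for v in r.values())
    ]

def integrate(*sources):
    """Step 2: combine several sources into one collection."""
    return [r for source in sources for r in source]

def select(records, fields):
    """Step 3: keep only the attributes relevant to the task."""
    return [{f: r[f] for f in fields} for r in records]

def transform(records):
    """Step 4: reshape into one basket (set of items) per customer."""
    baskets = {}
    for r in records:
        baskets.setdefault(r["customer"], set()).add(r["item"])
    return baskets

def mine(baskets):
    """Step 5: count how many baskets contain each item."""
    return Counter(item for items in baskets.values() for item in items)

def evaluate(counts, n_baskets, min_support):
    """Step 6: keep only patterns whose support reaches the threshold."""
    return {i: c / n_baskets for i, c in counts.items() if c / n_baskets >= min_support}

def present(patterns):
    """Step 7: render the surviving patterns for a human reader."""
    return [f"{item}: support {s:.2f}" for item, s in sorted(patterns.items())]

def run_kdd(min_support=0.5):
    records = integrate(clean(store_sales), clean(web_sales))
    baskets = transform(select(records, ["customer", "item"]))
    return present(evaluate(mine(baskets), len(baskets), min_support))

print(run_kdd())
```

With this toy data only "bread" appears in at least half of the baskets, so it is the only pattern that survives the evaluation step.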
 
The data mining algorithm is the mechanism that creates mining models.  To create a model, an algorithm first analyzes a set of data, looking for specific patterns and trends. The algorithm then uses the results of this analysis to define the parameters of the mining model.
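As a small illustration of an algorithm analyzing data and then fixing the parameters of a model, the sketch below fits a straight line by ordinary least squares. The choice of a linear model, the function names, and the monthly sales figures are assumptions made for the example, not something the slides prescribe.

```python
def fit_line(xs, ys):
    """Analyze the data and return the model parameters (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the fitted parameters to a new case."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical monthly sales figures.
months = [1, 2, 3, 4]
sales = [12.0, 14.0, 16.0, 18.0]

model = fit_line(months, sales)   # the "mining model" is just two numbers
forecast = predict(model, 5)      # forecast for month 5
print(model, forecast)
```

Here the mining model is nothing more than the pair (slope, intercept) that the algorithm derived from the data.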
-- Decision Trees and Rules
-- Non-linear Regression and Classification Methods
-- Example-based Methods
-- Probabilistic Graphical Dependency Models
-- Relational Learning Models
A set of rules that describe how products are grouped together in a transaction. A decision tree that predicts whether a particular customer will buy a product. A mathematical model that forecasts sales. A set of clusters that describe how the cases in a dataset are related.
Classification algorithms predict one or more discrete variables, based on the other attributes in the dataset.
Regression algorithms predict one or more continuous variables, such as profit or loss, based on other attributes in the dataset.
Segmentation algorithms divide data into groups, or clusters, of items that have similar properties.
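Two of these algorithm types can be illustrated on a single numeric attribute. The sketch below is an assumption-laden toy: the classifier is a one-level decision tree (a "stump") that predicts a discrete yes/no label, the segmentation routine is plain k-means in one dimension, and the income figures and labels are invented.

```python
def fit_stump(values, labels):
    """Classification: pick the threshold on one attribute that misclassifies
    the fewest training cases (predict True when value >= threshold)."""
    best_threshold, best_errors = None, len(values) + 1
    for t in sorted(set(values)):
        errors = sum((v >= t) != label for v, label in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold

def kmeans_1d(values, k, iterations=10):
    """Segmentation: group values around k centers (k-means in one dimension)."""
    # Start from evenly spaced values of the sorted data.
    centers = sorted(values)[:: max(1, len(values) // k)][:k]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters

# Hypothetical customer incomes (in thousands) and whether each customer bought.
incomes = [20, 25, 30, 60, 70, 80]
bought = [False, False, False, True, True, True]

threshold = fit_stump(incomes, bought)   # discrete prediction: buy / not buy
segments = kmeans_1d(incomes, k=2)       # unlabeled grouping by similarity
print(threshold, segments)
```

The classifier needs the `bought` labels to learn from, while the segmentation step uses only the incomes themselves.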
Association algorithms find correlations between different attributes in a dataset. The most common application of this kind of algorithm is creating association rules, which can be used in market basket analysis.
Sequence analysis algorithms summarize frequent sequences or episodes in data, such as a Web path flow.
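An association rule such as "bread => milk" is usually judged by its support (the fraction of transactions containing all the items) and its confidence (how often the consequent appears when the antecedent does). A minimal sketch of both measures, using an invented set of market baskets:

```python
# Hypothetical market baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated probability of the consequent given the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

print(support({"bread", "milk"}, transactions))       # 2 of 4 baskets
print(confidence({"bread"}, {"milk"}, transactions))  # 2 of the 3 bread baskets
```

In this toy data the rule "bread => milk" has support 0.5 and confidence of about 0.67.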
The Apriori algorithm is a classic algorithm for learning association rules. It is designed to operate on databases containing transactions (for example, collections of items bought by customers, or details of visits to a website). Apriori uses breadth-first search and a hash tree structure to count candidate itemsets efficiently.
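The breadth-first, level-by-level search can be sketched as follows. This is a simplified illustration rather than a full implementation: candidates are counted with a plain scan over the transactions instead of a hash tree, `min_support` is an absolute count, and the function name and sample baskets are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset found in at least
    `min_support` transactions, searching one itemset size at a time."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: every single item is a candidate.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count each candidate with one pass over the transactions.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: a candidate survives only if all of its
        # (k-1)-item subsets were frequent at the previous level.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

for itemset, count in sorted(apriori(baskets, 2).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The prune step is what keeps the search tractable: {bread, butter, milk} is never counted, because its subset {butter, milk} already failed the support threshold.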
What is Weka? Weka is a collection of machine learning algorithms for data mining tasks.
Why Weka? It is open source, and its algorithms can either be applied directly to a dataset or called from your own Java code.
It contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well suited for developing new machine learning schemes.
Java 1.4 (or later) is required to run Weka 3.4.x and older versions. The developer versions, starting with 3.5.3, require Java 5.0.
Platforms: Windows, Linux
 
 
 
 
http://www.cs.waikato.ac.nz/ml/weka/
http://msdn2.microsoft.com/En-US/library/ms175595.aspx
http://en.wikipedia.org/wiki/Apriori_algorithm
Textbook: "Data Mining" by Jiawei Han and Micheline Kamber
 



Knowledge Discovery Using Data Mining
