BY
G.GOWRILATHA,M.Sc(Info Tech)
Department Of CS & IT
Nadar Saraswathi College Of Arts and Science,
Theni.
• A desired feature of data mining systems is the ability to support ad hoc
and interactive data mining in order to facilitate flexible and effective
knowledge discovery. Data mining query languages can be designed to support
such a feature.
• The standardization of relational query languages, which occurred at the
early stages of relational database development, is widely credited for the
success of the relational database field. The more recent standardization
activities in database systems, such as the work relating to SQL-3, further
illustrate the importance of having a standard database language for the
successful development and commercialization of database systems.
A data mining task can be specified in terms of the following primitives:
• The set of task-relevant data to be mined
• The kind of knowledge to be mined
• The background knowledge to be used in the
discovery process
• The interestingness measures and thresholds for
pattern evaluation
• The expected representation for visualizing the
discovered patterns
The first step in defining a data mining task is the
specification of the task-relevant data, that is, the data
on which mining is to be performed. This involves
specifying the database and tables or data warehouse
containing the relevant data, the conditions for selecting
the relevant data, the relevant attributes or dimensions
for exploration, and instructions regarding the ordering
or grouping of the data retrieved.
use database <database_name> or use data warehouse
<data_warehouse_name>: the use clause directs the mining
task to the database or data warehouse specified.
from <relation(s)/cube(s)> [where <condition>]: the
from and where clauses respectively specify the
database tables or data cubes involved, and the
conditions defining the data to be retrieved.
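As a rough illustration of how the use, from, and where clauses fit together, the following Python sketch composes them into query text. The helper `task_relevant_clause` and the names `sales_db`, `customer`, and `purchases` are hypothetical and are not part of DMQL itself.

```python
def task_relevant_clause(source_name, relations, condition=None, warehouse=False):
    """Compose the 'use' and 'from ... [where ...]' clauses as plain text."""
    source_kind = "data warehouse" if warehouse else "database"
    lines = [f"use {source_kind} {source_name}"]
    from_line = "from " + ", ".join(relations)
    if condition:
        # The where clause is optional, mirroring the square brackets in the syntax.
        from_line += f" where {condition}"
    lines.append(from_line)
    return "\n".join(lines)


# Hypothetical database, tables, and condition, used only for illustration.
query = task_relevant_clause(
    "sales_db", ["customer", "purchases"], "purchases.amount >= 100"
)
print(query)
```

The helper only assembles text; a real system would also have to parse and execute the resulting statement.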
<DMQL> ::= <DMQL_Statement>; {<DMQL_Statement>}
<DMQL_Statement> ::= <Data_Mining_Statement>
| <Concept_Hierarchy_Definition_Statement>
| <Visualization_and_Presentation>
Syntax for Specifying the Kind of Knowledge to be mined
The (Mine_Knowledge_Specification) statement is used to
specify the kind of knowledge to be mined.
In other words, it indicates the data mining functionality to be
performed. Its syntax is defined below for
characterization, discrimination, association, and classification.
<Mine_Knowledge_Specification> ::= mine
characteristics [as <pattern_name>]
analyze <measure(s)>
This specifies that characteristic descriptions are to be
mined. The analyze clause, when used for
characterization, specifies aggregate measures, such as
count, sum, or count% (percentage count, i.e.,
the percentage of tuples in the relevant data set with
the specified characteristics). These measures are
to be computed for each data characteristic found.
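The count and count% measures can be illustrated with a small Python sketch. The tuples below are invented sample data, assumed to be already generalized to an (age_group, income_level) pair; the function name `characterize` is hypothetical.

```python
from collections import Counter

# Hypothetical task-relevant tuples, already generalized to
# (age_group, income_level). Invented purely for illustration.
tuples = [
    ("young", "low"), ("young", "low"), ("middle", "high"),
    ("middle", "high"), ("middle", "high"), ("senior", "medium"),
    ("young", "medium"), ("senior", "medium"),
]


def characterize(rows):
    """Return {characteristic: (count, count%)} over the given tuples."""
    counts = Counter(rows)
    total = len(rows)
    return {c: (n, 100.0 * n / total) for c, n in counts.items()}


summary = characterize(tuples)
for characteristic, (count, pct) in sorted(summary.items()):
    print(characteristic, count, f"{pct:.1f}%")
```

Here count% is computed against the whole task-relevant set, so the percentages over all characteristics sum to 100.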
Concept hierarchies allow the mining of knowledge at
multiple levels of abstraction. In order to
accommodate the different viewpoints of users with regard
to the data, there may be more than one
concept hierarchy per attribute or dimension. For instance,
some users may prefer to organize branch
locations by provinces and states, while others may prefer to
organize them according to languages
used. In such cases, a user can indicate which concept
hierarchy is to be used with the statement
use hierarchy <hierarchy_name> for <attribute_or_dimension>
Otherwise, a default hierarchy per attribute
or dimension is used.
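The idea of several concept hierarchies for one attribute, with a default fallback, can be sketched in Python as follows. The hierarchy names, the city values, and the `generalize` helper are assumptions made for this example only.

```python
# Two hypothetical concept hierarchies for the same attribute (branch location):
# one groups cities by province, the other by the main language used.
hierarchies = {
    "location_by_province": {
        "Vancouver": "British Columbia",
        "Montreal": "Quebec",
        "Toronto": "Ontario",
    },
    "location_by_language": {
        "Vancouver": "English",
        "Montreal": "French",
        "Toronto": "English",
    },
}
DEFAULT_HIERARCHY = "location_by_province"


def generalize(values, hierarchy_name=None):
    """Map each value one level up, using the named or the default hierarchy."""
    mapping = hierarchies[hierarchy_name or DEFAULT_HIERARCHY]
    # Values that the hierarchy does not cover are left unchanged.
    return [mapping.get(v, v) for v in values]


cities = ["Vancouver", "Montreal", "Toronto"]
print(generalize(cities))                          # default hierarchy
print(generalize(cities, "location_by_language"))  # user-selected hierarchy
```

Passing a hierarchy name plays the role of the use hierarchy statement; omitting it falls back to the default.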
The user can help control the number of uninteresting
patterns returned by the data mining
system by specifying measures of pattern
interestingness and their corresponding thresholds.
Interestingness measures and thresholds can be
specified by the user with the statement
with [<interest_measure_name>] threshold = <threshold_value>
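A minimal Python sketch of threshold-based filtering is given below. It assumes support and confidence as the interestingness measures, which are common choices for association rules; the rules and their values are invented, and `filter_patterns` is a hypothetical helper.

```python
# Hypothetical discovered rules with two interestingness measures.
rules = [
    {"rule": "bread => butter", "support": 0.30, "confidence": 0.80},
    {"rule": "milk => eggs", "support": 0.02, "confidence": 0.90},
    {"rule": "tea => sugar", "support": 0.10, "confidence": 0.40},
]


def filter_patterns(patterns, thresholds):
    """Keep only patterns that meet every (measure, minimum value) threshold."""
    return [
        p for p in patterns
        if all(p[measure] >= minimum for measure, minimum in thresholds.items())
    ]


# Roughly corresponds to: with support threshold = 0.05
#                         with confidence threshold = 0.70
interesting = filter_patterns(rules, {"support": 0.05, "confidence": 0.70})
print([p["rule"] for p in interesting])
```

Raising either threshold shrinks the result set, which is how the user controls the number of uninteresting patterns returned.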
Our data mining query language needs syntax that allows
users to specify the display of discovered patterns in one
or more forms, including rules, tables, crosstabs, pie or
bar charts, decision trees, cubes, curves, or surfaces. We
define the DMQL display statement for this purpose:
display as <result_form>
where <result_form> could be any of the knowledge
presentation or visualization forms listed above.
Interactive mining should allow the discovered patterns
to be viewed at different concept levels or from
different angles.
The user can alternately view the patterns at
different levels of abstraction with the following
DMQL syntax:
<Multilevel_Manipulation> ::= roll up on
<attribute_or_dimension>
| drill down on <attribute_or_dimension>
| add <attribute_or_dimension>
| drop <attribute_or_dimension>
An attribute added with the add operation must be one of
the attributes listed in the "in relevance to" clause of
the task-relevant data specification.
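The four multilevel operations can be pictured as changes to a small piece of view state. The Python sketch below is an assumption-laden illustration: the `PatternView` class, the `LEVELS` table, and the attribute names are all hypothetical, and a real system would recompute the patterns after each operation.

```python
# Hypothetical abstraction levels per attribute, ordered from lowest to highest.
LEVELS = {
    "location": ["city", "province", "country"],
    "time": ["day", "month", "year"],
}


class PatternView:
    """Tracks which attributes are shown and at which abstraction level."""

    def __init__(self, relevant_attributes):
        self.relevant = set(relevant_attributes)  # the task-relevant attributes
        self.levels = {}                          # attribute -> index into LEVELS

    def add(self, attribute):
        # Only attributes named in the task-relevant specification may be added.
        if attribute not in self.relevant:
            raise ValueError(f"{attribute} is not a task-relevant attribute")
        self.levels[attribute] = 0

    def drop(self, attribute):
        self.levels.pop(attribute, None)

    def roll_up(self, attribute):
        top = len(LEVELS[attribute]) - 1
        self.levels[attribute] = min(self.levels[attribute] + 1, top)

    def drill_down(self, attribute):
        self.levels[attribute] = max(self.levels[attribute] - 1, 0)

    def current(self):
        return {a: LEVELS[a][i] for a, i in self.levels.items()}


view = PatternView(["location", "time"])
view.add("location")
view.add("time")
view.roll_up("location")   # city -> province
view.drop("time")
print(view.current())
```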
A data mining graphical user interface (GUI) may consist of
the following functional components:
• Data collection and data mining query composition:
This component allows the user to specify task-relevant data
sets and to compose data mining queries. It is similar to
GUIs used for the specification of relational queries.
• Presentation of discovered patterns:
This component allows the display of the discovered
patterns in various forms, including tables, graphs, charts,
curves, and other visualization techniques.
• Hierarchy specification and manipulation:
This component allows for concept hierarchy specification,
either manually by the user or automatically (based on
analysis of the data at hand). In addition, this component
should allow concept hierarchies to be modified by the user
or adjusted automatically based on the given data set
distribution.
• Manipulation of data mining primitives:
This component may allow the dynamic adjustment of
the data mining thresholds, as well as the selection,
display, and modification of concept
hierarchies. It may also allow the modification of
previous data mining queries or conditions.
• Interactive multilevel mining:
This component should allow roll-up or drill-down
operations on discovered patterns.
• Other miscellaneous information:
This component may include online help
manuals, indexed search, debugging, and other
interactive graphical facilities.
The design of a graphical user interface should also
take into consideration the different classes of users
of a data mining system.
No coupling:
No coupling means that a DM system will not utilize any
function of a DB or DW system. It may fetch
data from a particular source (such as a file system), process
the data using some data mining algorithms,
and then store the mining results in another file. Such a
system, though simple, suffers from several
drawbacks. First, a DB system provides a great deal of
flexibility and efficiency in storing, organizing,
accessing, and processing data. Without using a DB/DW
system, a DM system may spend a substantial
amount of time finding, collecting, cleaning, and
transforming data. In DB and/or DW systems, data tend
to be well organized, indexed, cleaned, integrated, or
consolidated, so that finding the task-relevant,
high-quality data becomes an easy task. Second, there are
many tested, scalable algorithms and data
structures implemented in DB and DW systems, which an
uncoupled DM system cannot reuse.
Loose coupling:
Loose coupling means that a DM system will use some
facilities of a DB or DW system, fetching
data from a data repository managed by these systems,
performing data mining, and then storing the
mining results either in a file or in a designated place in
a database or data warehouse. Loose coupling is
better than no coupling because it can fetch any portion of
the data stored in databases or data warehouses by
using query processing, indexing, and other system
facilities. It gains some of the advantages of the flexibility,
efficiency, and other features provided by such systems.
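The fetch, mine, and store cycle of loose coupling can be sketched with Python's standard sqlite3 module. The in-memory database, the `purchases` and `mining_results` tables, and the simple frequency count standing in for a mining algorithm are all assumptions made for illustration.

```python
import sqlite3
from collections import Counter

# A throwaway in-memory database standing in for the DB/DW system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT, item TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        ("ann", "bread", 3.0), ("ann", "butter", 4.5),
        ("bob", "bread", 3.0), ("bob", "butter", 4.5),
        ("cid", "milk", 2.0),
    ],
)

# 1. Fetch only the task-relevant portion using the database's query processing.
rows = conn.execute("SELECT item FROM purchases WHERE amount >= 2.5").fetchall()

# 2. Perform the "mining" outside the database (a plain frequency count here).
item_counts = Counter(item for (item,) in rows)

# 3. Store the mining results in a designated place in the database.
conn.execute("CREATE TABLE mining_results (item TEXT, frequency INTEGER)")
conn.executemany("INSERT INTO mining_results VALUES (?, ?)", item_counts.items())
conn.commit()

stored = conn.execute(
    "SELECT item, frequency FROM mining_results ORDER BY item"
).fetchall()
print(stored)
```

The database is used only as a data source and a result store; the mining step itself gains nothing from the database engine, which is the limitation the tighter coupling schemes address.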
Semi-tight coupling:
Semi-tight coupling means that, besides linking a DM
system to a DB/DW system, efficient
implementations of a few essential data mining
primitives (identified by the analysis of frequently
encountered data mining functions) can be provided in
the DB/DW system. These
primitives can include sorting, indexing, aggregation,
histogram analysis, multiway join, and
precomputation of some essential statistical measures, such
as sum, count, max, min, standard deviation,
and so on. Moreover, some frequently used intermediate
mining results can be precomputed and
stored in the DB/DW system. Since these intermediate
mining results are either precomputed or can
be computed efficiently, this design will enhance the
performance of a DM system.
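One way to picture semi-tight coupling is to let the database compute and store aggregation primitives that the mining code then reuses. The sketch below uses Python's sqlite3 module; the `sales` and `sales_summary` tables and the `branch_average` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (branch TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10.0), ("north", 30.0), ("south", 20.0)],
)

# Precompute essential statistical measures inside the database and keep them
# as a stored intermediate result that later mining steps can reuse.
conn.execute(
    """
    CREATE TABLE sales_summary AS
    SELECT branch,
           COUNT(*)    AS n,
           SUM(amount) AS total,
           MIN(amount) AS lowest,
           MAX(amount) AS highest
    FROM sales
    GROUP BY branch
    """
)


def branch_average(branch):
    """Derive an average from the precomputed summary, not from the raw rows."""
    n, total = conn.execute(
        "SELECT n, total FROM sales_summary WHERE branch = ?", (branch,)
    ).fetchone()
    return total / n


print(branch_average("north"))
```

The mining side never rescans the raw `sales` rows; it reads the small summary table, which is the performance benefit the paragraph above describes.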
Tight coupling:
Tight coupling means that a DM system is smoothly
integrated into the DB/DW system. The data mining
subsystem is treated as one functional component of an
information system. Data mining queries and
functions are optimized based on mining query
analysis, data structures, indexing schemes, and query
processing methods of the DB/DW system. With further
technology advances, DM, DB, and DW systems
will evolve and integrate together as one information
system with multiple functionalities. This will
provide a uniform information processing environment.
ThankYou…

Data mining query language
