Data Mining Concepts
Overview
  • History of DMX
  • DMX introduction
  • DMX objects
  • Query syntax
  • Prediction

History of DMX
DMX was first introduced in the OLE DB for Data Mining specification, authored by Microsoft in conjunction with other vendors in 1999. The goal of DMX is to define common concepts and common query expressions for the data mining world, much as SQL has done for databases.

Overview of DMX
Data Mining Extensions (DMX) is a query language for data mining models. It consists of two parts:
  • DDL (Data Definition Language): used to create new mining models and structures, export and import mining structures, copy or transfer data from one mining model to another, and delete existing mining models and mining structures.
  • DML (Data Manipulation Language): used to search and browse the data in mining models, update mining models by inserting and updating data, and derive predictions with a prediction query.

DMX objects
DMX is a language that you can use to create and work with data mining models in Microsoft SQL Server Analysis Services. It is used to create the structure of new data mining models, to train these models, and to browse, manage, and predict against them. Two major objects carry out this transformation:
  • The mining structure
  • The mining model

The mining structure
A mining structure is defined as a list of columns, with their data types and information describing how they should be handled. When a mining structure is processed, it contains a compressed cache, or copy, of the source data, which can be queried to return its data or the distinct states that exist in any structure column. This cache is used to train any models that are subsequently added to the structure. The cache need not be kept: it can be dropped at any time.

The mining model
A mining model is the object that transforms rows of data into cases and performs the machine learning using a specified data mining algorithm. A model is described by a subset of columns from the structure, a specification of how those columns are to be used as attributes, and the algorithm and parameters used to perform machine learning on the structure data. Statistics about predictions are available as well. Additionally, the learned patterns themselves can be queried to discover what the algorithm found; these patterns are generally referred to as the model content.
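
The relationship between the two objects can be sketched in Python. This is only an illustration of the idea that a model uses a subset of its structure's columns; the class names (StructureColumn, MiningStructure, MiningModel) and the usage labels are assumptions made for this sketch and are not part of DMX or Analysis Services.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StructureColumn:
    name: str
    data_type: str      # e.g. "LONG", "TEXT"
    content_type: str   # e.g. "KEY", "DISCRETE"


@dataclass
class MiningStructure:
    name: str
    columns: list[StructureColumn]

    def column_names(self) -> set[str]:
        return {c.name for c in self.columns}


@dataclass
class MiningModel:
    name: str
    structure: MiningStructure
    usage: dict[str, str]   # column name -> hypothetical label: KEY, INPUT, PREDICT
    algorithm: str
    parameters: dict[str, object] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # A model may only use columns that its structure defines.
        unknown = set(self.usage) - self.structure.column_names()
        if unknown:
            raise ValueError(f"columns not in structure: {sorted(unknown)}")


new_mailing = MiningStructure(
    "New Mailing",
    [
        StructureColumn("CustomerKey", "LONG", "KEY"),
        StructureColumn("Gender", "TEXT", "DISCRETE"),
        StructureColumn("Number Cars Owned", "LONG", "DISCRETE"),
        StructureColumn("Bike Buyer", "LONG", "DISCRETE"),
    ],
)

naive_bayes = MiningModel(
    "Naive Bayes",
    new_mailing,
    {"CustomerKey": "KEY", "Gender": "INPUT", "Bike Buyer": "PREDICT"},
    "Microsoft_Naive_Bayes",
    {"MAXIMUM_STATES": 50},
)
```

Several models with different algorithms or column subsets can point at the same structure object, which mirrors how models added to one structure are trained from the same cached data.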

Query Syntax
DMX statements are used to create, process, delete, copy, browse, and predict against data mining models. The three basic steps of the data mining process are:
  • Creation
  • Training
  • Prediction

CREATING MINING STRUCTURES:
Creating a mining structure is similar to creating a table in SQL.
Syntax:
    CREATE [SESSION] MINING STRUCTURE <structure>
    (
        [(<column definition list>)]
    )
structure: A unique name for the structure.
column definition list: A comma-separated list of column definitions.

CREATING MINING STRUCTURES:
The following example creates a new mining structure called New Mailing.
    CREATE MINING STRUCTURE [New Mailing]
    (
        CustomerKey LONG KEY,
        Gender TEXT DISCRETE,
        [Number Cars Owned] LONG DISCRETE,
        [Bike Buyer] LONG DISCRETE
    )
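
When column definitions come from a script rather than being typed by hand, the statement text for the New Mailing example can be assembled from a list of (name, data type, content type) tuples. The sketch below is a hypothetical Python helper, not part of any Microsoft library; it only builds the string and does not connect to Analysis Services, and its bracket-quoting rule is a simplification that does not escape a closing bracket inside a name.

```python
def quote_identifier(name: str) -> str:
    """Wrap a name in square brackets unless it is a plain identifier."""
    return name if name.isidentifier() else f"[{name}]"


def render_create_structure(name, columns):
    """Build CREATE MINING STRUCTURE text from (name, data type, content type) tuples."""
    body = ",\n".join(
        f"    {quote_identifier(col)} {data_type} {content_type}"
        for col, data_type, content_type in columns
    )
    return f"CREATE MINING STRUCTURE {quote_identifier(name)}\n(\n{body}\n)"


statement = render_create_structure(
    "New Mailing",
    [
        ("CustomerKey", "LONG", "KEY"),
        ("Gender", "TEXT", "DISCRETE"),
        ("Number Cars Owned", "LONG", "DISCRETE"),
        ("Bike Buyer", "LONG", "DISCRETE"),
    ],
)
print(statement)
```

Names containing spaces are bracketed, which matches how [Number Cars Owned] and [Bike Buyer] are written in the example.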

ALTERING MINING STRUCTURES:
The ALTER MINING STRUCTURE statement creates a new mining model that is based on an existing mining structure. The structure must already exist when you use this statement to add a model.
Syntax:
    ALTER MINING STRUCTURE <structure>
    ADD MINING MODEL <model>
    (
        <column definition list>
        [(<nested column definition list>) [WITH FILTER (<nested filter criteria>)]]
    )
    USING <algorithm> [(<parameter list>)]
The FILTER keyword specifies a filter condition; in the WITH FILTER form shown here, the condition applies to the rows of a nested table.

ALTERING MINING STRUCTURES
The following example adds a Naive Bayes mining model to the New Mailing mining structure and limits the maximum number of attribute states to 50.
    ALTER MINING STRUCTURE [New Mailing]
    ADD MINING MODEL [Naive Bayes]
    (
        CustomerKey,
        Gender,
        [Number Cars Owned],
        [Bike Buyer] PREDICT
    )
    USING Microsoft_Naive_Bayes (MAXIMUM_STATES = 50)

Data Types and Content Types
The following table shows the list of data types and content types for mining structure columns:
[Table not recovered from the slide. Surviving table notes: Time Series models; Sequence Clustering models in nested tables.]

DROP MINING MODEL
Deletes a mining model from the database.
Syntax:
    DROP MINING MODEL <model>
model: A model identifier.
Example: The following sample code drops the mining model NBSample.
    DROP MINING MODEL [NBSample]

NESTED TABLES
Example: Consider a case derived from two tables, one that contains customer information and another that contains customer purchases. A single customer in the customer table may have multiple purchases in the purchases table, which makes it difficult to describe the data using a single row. Analysis Services handles these cases by using nested tables. The concept of a nested table is demonstrated in the following illustration.
[Illustration: the customer table, the purchases table, and the two tables combined.]
The first table, the parent table, holds information about customers and associates a unique identifier with each customer. The second table, the child table, contains the purchases for each customer. The purchases in the child table are related back to the parent table by the unique identifier, the CustomerKey column. The third table in the diagram shows the two tables combined.
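
The parent/child combination can be imitated in a few lines of Python. This is a conceptual sketch only: the customer and purchase rows are invented for illustration, and the helper name build_cases and the nested column name Purchases are assumptions, not Analysis Services terms.

```python
from collections import defaultdict

# Hypothetical parent table: one row per customer.
customers = [
    {"CustomerKey": 1, "Gender": "M"},
    {"CustomerKey": 2, "Gender": "F"},
    {"CustomerKey": 3, "Gender": "F"},
]

# Hypothetical child table: zero or more rows per customer.
purchases = [
    {"CustomerKey": 1, "Product": "Helmet"},
    {"CustomerKey": 1, "Product": "Tire"},
    {"CustomerKey": 2, "Product": "Bottle"},
]


def build_cases(parents, children, key="CustomerKey", nested_name="Purchases"):
    """Attach each parent's child rows as a nested list, giving one case per parent."""
    by_key = defaultdict(list)
    for row in children:
        # Drop the key from the nested row: the parent row already carries it.
        by_key[row[key]].append({k: v for k, v in row.items() if k != key})
    return [{**parent, nested_name: by_key.get(parent[key], [])} for parent in parents]


cases = build_cases(customers, purchases)
for case in cases:
    print(case)
```

Each customer remains a single case regardless of how many purchases are attached, which is the property that a flat join of the two tables would lose.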

Prediction
Prediction means applying the patterns that were found in the data to estimate unknown information. Examples include predicting whether a customer is a good candidate for a loan, estimating a credit score, determining which cluster a case belongs to, and predicting future values of a time series.

Prediction Join
Using a prediction join in the slide's example (not reproduced in this transcript), we can conclude: "If the kid is male and the class is 5, then the highest-scoring subject is science."
Prediction Join syntaxSELECT [TOP <count>] <column references> FROM <mining model>    [[NATURAL] PREDICTION JOIN    <source-data> [ ON <mapping clause> ]   [ WHERE <condition clause> ]   [ ORDER BY <order clause> [DESC | ASC] ]]Count  Optional, An integer that specifies how many rows to return.column referencesA comma-separated list of column identifiers an expressions that are derived from the mining model.mining  modelA model identifier.source -dataThe source query.mapping clauseOptional, A logical expression that compares columns from the model to columns from the source query.condition clause Optional, A condition to restrict the values that are returned from the column list.order clause Optional, An expression that returns a scalar value.

Summary
  • History of DMX
  • DMX introduction
  • DMX objects
  • Query syntax
  • Prediction join syntax

MS SQL Server: Data mining concepts and dmx

Visit more self-help tutorials
Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free and self-guiding, and does not involve any additional support. Visit us at www.dataminingtools.net