Copyright © 2024 Jayanti Rajdevendra Pande. All rights reserved.
RASHTRASANT TUKDOJI MAHARAJ NAGPUR UNIVERSITY
MBA
SEMESTER: 3
SPECIALIZATION
BUSINESS ANALYTICS (BA 2)
SUBJECT
DATA MINING
MODULE NO : 4
ASSOCIATION RULES
- Jayanti R Pande
DGICM College, Nagpur
Q1 What is Market Basket Analysis? What is its significance for retailers? What are the necessary steps for
implementing Market Basket Analysis?
MARKET BASKET ANALYSIS (MBA) is a data mining technique used in the field of retail and marketing to discover associations
and correlations between items that customers frequently buy together. The primary goal is to identify patterns and
relationships within transactional data, helping retailers understand customer behavior and preferences. Here's a breakdown of
its significance and essential steps for implementation:
SIGNIFICANCE OF MARKET BASKET ANALYSIS FOR RETAILERS
1. Increases Customer Engagement: By understanding the relationships between products, retailers can create targeted
marketing campaigns and promotions, increasing customer engagement.
2. Boosts Sales and Increases ROI: Tailoring promotions based on customer buying patterns can lead to higher sales and a
better return on investment.
3. Improves Customer Experience: Personalized recommendations and promotions enhance the overall shopping experience
for customers.
4. Optimizes Marketing Strategies and Campaigns: Retailers can optimize their marketing efforts by focusing on promoting
items that are frequently purchased together.
5. Helps Understand Customers Better: MBA provides insights into customer preferences, enabling retailers to stock relevant
products and improve overall satisfaction.
6. Identifies Customer Behavior and Patterns: Retailers can uncover hidden patterns and trends in customer behavior, aiding in
strategic decision-making.
ESSENTIAL STEPS FOR IMPLEMENTING MARKET BASKET ANALYSIS
1 Define Minimum Support and Confidence:
Support: The proportion of transactions that contain a particular itemset.
Confidence: For a rule A -> B, the proportion of transactions containing the antecedent A that also contain the consequent B.
2 Identify Subsets with Higher Support: Find all itemsets (subsets of items) whose support is at least the defined minimum support threshold.
3 Generate Association Rules: For each high-support itemset, generate the association rules that meet the defined minimum confidence
threshold.
4 Sort Association Rules: Rank the association rules in decreasing order of confidence.
5 Analyze Rules: Examine the association rules along with their confidence and support values. Identify meaningful and actionable insights
from the discovered patterns.
Implementing Market Basket Analysis typically involves using algorithms like the Apriori algorithm or FP-growth algorithm. These
algorithms efficiently mine frequent itemsets and generate association rules from transactional data.
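The support and confidence measures defined in step 1 can be illustrated with a minimal sketch. The transactions below are hypothetical and invented only for illustration; the function names `support` and `confidence` are likewise assumptions, not part of any library.

```python
# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)

# {bread, milk} appears in 2 of the 4 transactions.
print(support({"bread", "milk"}, transactions))       # 0.5
# bread appears in 3 transactions, 2 of which also contain milk: 2/3.
print(confidence({"bread"}, {"milk"}, transactions))
```

A rule such as bread -> milk would be kept only if both values reach the thresholds chosen in step 1.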
Q2 Compare Apriori and FP Growth Algorithm
Apriori Algorithm
The Apriori algorithm is a classic algorithm used for association rule mining, a technique in data mining that identifies relationships
between variables in large datasets. It was proposed by Agrawal and Srikant in 1994. The primary objective of the Apriori algorithm is to
find frequent itemsets in a transaction database, that is, sets of items that frequently occur together. These frequent itemsets are
then used to generate association rules.
FP-Growth Algorithm
The FP-Growth (Frequent Pattern Growth) algorithm is an alternative approach to association rule mining that aims to address some of
the limitations of the Apriori algorithm. It was proposed by Han, Pei, and Yin in 2000.
Apriori                                             | FP Growth
• Array-based algorithm                             | • Tree-based algorithm
• Uses Join and Prune techniques                    | • Constructs conditional frequent pattern trees
• Utilizes breadth-first search                     | • Utilizes depth-first search
• Level-wise approach for pattern generation        | • Pattern-growth approach that extends only patterns present in the data
• Number of candidates can grow exponentially       | • No candidate generation; the tree is built in time linear in the database size
• Candidate counting is easy to parallelize         | • Harder to parallelize: nodes are interdependent, every path starts at the root
• Scans the database multiple times                 | • Scans the dataset only twice to construct the tree
• Performance impacted by the number of items       | • Less impacted by the number of items
• Memory-intensive due to candidate generation      | • More memory-efficient thanks to the compact tree structure
Q3 What is FP Growth algorithm? State the advantages and Disadvantages of FP Growth Algorithm.
FP-GROWTH ALGORITHM
The FP-Growth Algorithm is an approach for finding frequent item sets in a database without using candidate generation. It
utilizes a divide-and-conquer strategy and employs a special data structure known as the frequent-pattern tree (FP-tree).
Algorithm Workflow:
1. Compresses the input database by creating an FP-tree to represent frequent items.
2. Divides the compressed database into a set of conditional databases, each associated with one frequent pattern.
3. Mines each conditional database separately.
Search Cost Reduction: Reduces search costs by recursively looking for short patterns and then concatenating them into long
frequent patterns.
Handling Large Databases: In large databases, where holding the FP tree in main memory is impractical, the algorithm partitions
the database into smaller databases (projected databases) and constructs an FP-tree for each.
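The compression step of the workflow can be sketched as follows. This is a minimal illustration of FP-tree construction only (the two database scans), not the full mining procedure; the class name `FPNode`, the function `build_fp_tree` and the sample transactions are assumptions made for this example.

```python
from collections import Counter

class FPNode:
    """One node of the FP-tree: an item, its count, and child nodes."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(transactions, min_count):
    # Scan 1: count how many transactions contain each item.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_count}

    root = FPNode(None)
    # Scan 2: insert each transaction's frequent items, ordered by
    # descending frequency so that common prefixes share tree paths.
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent

# Hypothetical transactions, invented for illustration only.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
root, frequent_counts = build_fp_tree(transactions, min_count=2)
print(frequent_counts)            # item "d" is dropped: it occurs only once
print(root.children["a"].count)   # 3 transactions share the prefix "a"
```

Because three of the four transactions start with the same frequent item, they share one branch of the tree, which is where the compact representation comes from.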
ADVANTAGES OF FP-GROWTH ALGORITHM
1. Reduced Database Scans: Needs to scan the database only twice, as opposed to Apriori, which scans the transactions once for each iteration.
2. Faster Execution: No candidate itemsets are generated and tested, which makes it faster than Apriori in many cases.
3. Compact Memory Storage: Stores the database in a compact version in memory, improving efficiency.
4. Scalability: Efficient and scalable for mining both long and short frequent patterns.
DISADVANTAGES OF FP-GROWTH ALGORITHM
1. Complex FP Tree Construction: Building the FP-tree is more cumbersome and harder to implement than the Apriori algorithm.
2. Potential Expense: Building the tree may be relatively expensive, particularly when the minimum support is low and the tree grows large.
3. Memory Constraints: The FP-tree may not fit into main memory when dealing with large databases.
Q4 What are the different types of Association rules in data mining? Briefly describe the algorithms used for Association Rule
mining.
TYPES OF ASSOCIATION RULES IN DATA MINING
1. Multi-Relational Association Rule (MRAR): Derived from multi-relational databases. Each rule involves one entity together with its
different relationships, and so represents indirect relationships between entities.
2. Generalized Association Rule: Built over categorical attributes, usually with a concept hierarchy (taxonomy) on the items, so that a
rule can be stated at a more general level (for example, a product category rather than a single product). Such rules give a rough idea
of the interesting patterns hidden in the data.
3. Quantitative Association Rule: Involves numeric attributes in at least one part of the rule (antecedent or consequent), which
distinguishes it from generalized association rules, where both sides consist of categorical attributes.
ALGORITHMS FOR ASSOCIATION RULE MINING
1 Apriori Algorithm:
Description: Identifies frequent individual items in a database and expands them to larger item sets, ensuring that these item
sets appear sufficiently often in the database.
Key Characteristics: Utilizes a breadth-first search algorithm and generates candidate itemsets with the Apriori property.
2 Eclat Algorithm:
Description: Eclat (Equivalence Class Clustering and bottom-up Lattice Traversal) is considered by some as a more efficient
version of the Apriori algorithm. It stores the data in a vertical format (each item with the set of IDs of the transactions that
contain it) and traverses the itemset lattice to find frequent itemsets.
Key Characteristics: Computes the support of an itemset by intersecting transaction-ID sets, which avoids rescanning the database
for every candidate.
3 FP-Growth Algorithm:
Description: Operates in two stages: FP-tree construction, followed by extraction of the frequent itemsets from the tree.
Particularly useful for finding frequent patterns without candidate generation.
Key Characteristics: Uses a divide-and-conquer strategy, creating an FP-tree to represent frequent items and then dividing the
database into conditional databases.
These algorithms play a crucial role in discovering associations and patterns within large datasets. While Apriori and Eclat
focus on candidate generation and support counting, FP-Growth eliminates the need for explicit candidate generation,
making it more efficient for certain scenarios. Each algorithm has its strengths and weaknesses, and the choice of the
algorithm depends on the specific requirements of the data mining task and the characteristics of the dataset.
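To make the contrast with Apriori concrete, the following is a minimal sketch of the Eclat idea: support is obtained by intersecting sets of transaction IDs rather than by rescanning the database. The function name `eclat` and the sample data are assumptions made for this illustration; it is a simplified depth-first version, not a tuned implementation.

```python
def eclat(transactions, min_count):
    """Return {itemset (tuple): support count} using tid-set intersection."""
    # Vertical format: item -> set of IDs of the transactions containing it.
    tidsets = {}
    for tid, t in enumerate(transactions):
        for item in t:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # candidates: list of (item, tidset) pairs, all frequent with `prefix`.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                common = tids & other_tids  # support via intersection
                if len(common) >= min_count:
                    suffix.append((other, common))
            if suffix:
                extend(itemset, suffix)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_count),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return frequent

# Hypothetical transactions, invented for illustration only.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"}]
result = eclat(transactions, min_count=2)
for itemset, count in sorted(result.items()):
    print(itemset, count)
```

The database is read once to build the vertical layout; after that every support count comes from a set intersection.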
Q5 How to Apply the Apriori algorithm for the given data.
Step-1: K=1
1.Create Candidate Set C1: Count the occurrences of each individual item (I1, I2, I3, I4, I5) in the dataset.
2.Generate Frequent Itemset L1: Keep only the items with a support count greater than or equal to the minimum support
count (min_support=2).
Step-2: K=2
1.Generate Candidate Set C2: Join the items in L1 to form pairs and filter out those pairs that do not satisfy the Apriori
property (every non-empty subset of a frequent itemset must itself be frequent). Count the occurrences of each candidate pair in the dataset.
2.Generate Frequent Itemset L2: Keep only the pairs with a support count greater than or equal to the minimum support
count.
Step-3: K=3
1.Generate Candidate Set C3: Join the pairs in L2 to form triplets and filter out those triplets that do not satisfy the Apriori
property (any triplet containing a pair that is not in L2 is dropped). Count the occurrences of each candidate triplet in the dataset.
2.Generate Frequent Itemset L3: Keep only the triplets with a support count greater than or equal to the minimum support
count.
Continue this process until no more frequent itemsets can be generated.
Association Rule Generation:
1.For each frequent itemset, generate all non-empty proper subsets A; for each A, let B be the remaining items of the itemset (its complement within the itemset).
2.Calculate the confidence for each rule: Confidence(A->B) = Support_count(A∪B) / Support_count(A).
3.Keep only the rules with confidence greater than or equal to the minimum confidence threshold (min_confidence=50%).
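The steps above can be sketched in code. The dataset referred to in the question is not reproduced in these notes, so the four transactions below are hypothetical and serve only to show the mechanics; the function names `apriori` and `generate_rules` are also assumptions. The thresholds follow the text (minimum support count 2, minimum confidence 50%).

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {frozenset itemset: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})

    # K=1: count single items (C1) and keep the frequent ones (L1).
    current = {}
    for i in items:
        n = sum(1 for t in transactions if i in t)
        if n >= min_count:
            current[frozenset([i])] = n
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join: combine frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: every (k-1)-subset must itself be frequent (Apriori property).
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        # Count the remaining candidates and keep those meeting min_count.
        current = {}
        for c in candidates:
            n = sum(1 for t in transactions if c <= t)
            if n >= min_count:
                current[c] = n
        frequent.update(current)
        k += 1
    return frequent

def generate_rules(frequent, min_conf):
    """Return (A, B, confidence) for every rule A -> B meeting min_conf."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for a in combinations(sorted(itemset), r):
                a = frozenset(a)
                conf = count / frequent[a]  # count(A and B) / count(A)
                if conf >= min_conf:
                    rules.append((a, itemset - a, conf))
    return rules

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"I1", "I2", "I5"},
    {"I2", "I4"},
    {"I1", "I2", "I3"},
    {"I1", "I2", "I3", "I5"},
]
frequent_itemsets = apriori(transactions, min_count=2)
found_rules = generate_rules(frequent_itemsets, min_conf=0.5)
for a, b, conf in found_rules:
    print(sorted(a), "->", sorted(b), round(conf, 2))
```

On this sample, I4 is dropped at K=1, the pair {I3, I5} is dropped at K=2, and the triplets containing {I3, I5} are pruned before counting at K=3.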
Q6 Write about Associative Classification method.
ASSOCIATIVE CLASSIFICATION is a method that combines principles from association rule mining and classification algorithms. It
aims to leverage the discovered associations in a dataset to enhance the performance of classification models. The following are
the key steps involved in the Associative Classification method:
1.Association Rule Mining: The first step involves mining association rules from the dataset. Association rule mining identifies
interesting relationships or associations between different attributes in the data. Common algorithms used for this step include
Apriori and FP-Growth.
2.Rule Pruning: Once the association rules are generated, a pruning step is often applied to filter out less relevant or less
significant rules. Pruning criteria may include measures like support, confidence, or other relevance measures.
3.Rule-to-Class Transformation: The association rules, typically in the form of "if-then" statements, are transformed into
classification rules. The antecedent part of the association rule becomes the condition for classifying instances, and the
consequent part becomes the predicted class label.
4.Building the Classification Model: The transformed rules are used to build a classification model. This model captures the
relationships and dependencies identified during the association rule mining phase.
5.Classifying New Instances: When a new, unseen instance needs to be classified, the rules are applied to determine the
predicted class label. Multiple rules may apply to a single instance, and conflict resolution strategies are employed to handle
such situations.
6.Conflict Resolution: Conflict resolution addresses cases where multiple rules predict conflicting class labels for the same
instance. Strategies include selecting the rule with the highest confidence, using a voting mechanism, or considering additional
criteria to resolve conflicts.
7.Evaluation and Fine-Tuning: The performance of the associative classification model is evaluated using standard metrics such as
accuracy, precision, recall, and F1-score. Fine-tuning may be performed to improve the model's performance.
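The steps above can be sketched with a small, simplified classifier. This is an illustrative assumption rather than a specific published method: rules are limited to antecedents of at most two attribute values, pruning uses only support and confidence, and conflicts are resolved by taking the first matching rule after sorting by confidence. The data, function names and thresholds are hypothetical.

```python
from collections import Counter
from itertools import combinations

def mine_class_rules(rows, labels, min_count, min_conf, max_len=2):
    """Mine rules 'attribute values -> class label' and rank them."""
    rule_counts = Counter()  # (antecedent, label) -> count
    ante_counts = Counter()  # antecedent -> count
    for row, label in zip(rows, labels):
        items = sorted(row.items())
        for r in range(1, max_len + 1):
            for ante in combinations(items, r):
                ante_counts[ante] += 1
                rule_counts[(ante, label)] += 1

    rules = []
    for (ante, label), n in rule_counts.items():
        conf = n / ante_counts[ante]
        # Rule pruning: keep rules meeting both thresholds.
        if n >= min_count and conf >= min_conf:
            rules.append((conf, n, ante, label))
    # Rank: higher confidence, then higher support, then shorter antecedent.
    rules.sort(key=lambda r: (-r[0], -r[1], len(r[2]), r[2], r[3]))
    return rules

def classify(row, rules, default):
    """Conflict resolution: the first (best-ranked) matching rule wins."""
    items = set(row.items())
    for conf, n, ante, label in rules:
        if items.issuperset(ante):
            return label
    return default

# Hypothetical training data, invented for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "play", "stay"]

class_rules = mine_class_rules(rows, labels, min_count=2, min_conf=0.6)
default_label = Counter(labels).most_common(1)[0][0]

print(classify({"outlook": "rainy", "windy": "yes"}, class_rules, default_label))
print(classify({"outlook": "sunny", "windy": "yes"}, class_rules, default_label))
```

In the second call two rules disagree (sunny suggests "play", windy suggests "stay"); the higher-confidence rule is applied, which is one of the conflict resolution strategies mentioned in step 6.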
Copyright © 2024 Jayanti Rajdevendra Pande.
All rights reserved.
This content may be printed for personal use only. It may not be copied, distributed, or used for any other purpose
without the express written permission of the copyright owner.
This content is protected by copyright law. Any unauthorized use of the content may violate copyright laws and
other applicable laws.
For any further queries contact on email: jayantipande17@gmail.com

More Related Content

What's hot (20)

PDF
Business Analytics 1 Module 2.pdf
Jayanti Pande
 
PDF
Web & Social Media Analytics Module 2.pdf
Jayanti Pande
 
PDF
Business Analytics 1 Module 5.pdf
Jayanti Pande
 
PDF
Business Analytics 1 Module 4.pdf
Jayanti Pande
 
PDF
22PCOAM16 _ML_Unit 3 Notes & Question bank
Guru Nanak Technical Institutions
 
PPTX
UNIT-2 SOCIAL MEDIA AND WEB ANALYTICS (KMBN MK05)- by aditi narain.pptx
Ashishmishra500698
 
PPTX
How different between Big Data, Business Intelligence and Analytics ?
Thanakrit Lersmethasakul
 
PPT
Data Mining
Nour El Houda Megherbi
 
PPT
Visual Analytics in Big Data
Saurabh Shanbhag
 
PPTX
ETL Process
Rashmi Bhat
 
PDF
Introduction au Data Mining et Méthodes Statistiques
Giorgio Pauletto
 
PPT
Data Mining in Life Insurance Business
Ankur Khanna
 
PPTX
Data mining concepts and work
Amr Abd El Latief
 
PPTX
MODULE 5 _ Mining frequent patterns and associations.pptx
nikshaikh786
 
PPTX
Web mining (1)
ajaybabu1314
 
PPT
Data Mining: Concepts and techniques: Chapter 13 trend
Salah Amean
 
PDF
DWM-MODULE 6.pdf
nikshaikh786
 
PDF
Cours datamining
sarah Benmerzouk
 
PPTX
Cluster analysis
Pushkar Mishra
 
Business Analytics 1 Module 2.pdf
Jayanti Pande
 
Web & Social Media Analytics Module 2.pdf
Jayanti Pande
 
Business Analytics 1 Module 5.pdf
Jayanti Pande
 
Business Analytics 1 Module 4.pdf
Jayanti Pande
 
22PCOAM16 _ML_Unit 3 Notes & Question bank
Guru Nanak Technical Institutions
 
UNIT-2 SOCIAL MEDIA AND WEB ANALYTICS (KMBN MK05)- by aditi narain.pptx
Ashishmishra500698
 
How different between Big Data, Business Intelligence and Analytics ?
Thanakrit Lersmethasakul
 
Visual Analytics in Big Data
Saurabh Shanbhag
 
ETL Process
Rashmi Bhat
 
Introduction au Data Mining et Méthodes Statistiques
Giorgio Pauletto
 
Data Mining in Life Insurance Business
Ankur Khanna
 
Data mining concepts and work
Amr Abd El Latief
 
MODULE 5 _ Mining frequent patterns and associations.pptx
nikshaikh786
 
Web mining (1)
ajaybabu1314
 
Data Mining: Concepts and techniques: Chapter 13 trend
Salah Amean
 
DWM-MODULE 6.pdf
nikshaikh786
 
Cours datamining
sarah Benmerzouk
 
Cluster analysis
Pushkar Mishra
 

Similar to Data Mining Module 4 Business Analytics.pdf (20)

PDF
Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351
IRJET Journal
 
PDF
Gr2411971203
IJERA Editor
 
PDF
Multiple Minimum Support Implementations with Dynamic Matrix Apriori Algorith...
ijsrd.com
 
PDF
IRJET- Classification of Pattern Storage System and Analysis of Online Shoppi...
IRJET Journal
 
PDF
Data Mining based on Hashing Technique
ijtsrd
 
PDF
Frequent Item Set Mining - A Review
ijsrd.com
 
PDF
Volume 2-issue-6-2081-2084
Editor IJARCET
 
PDF
Volume 2-issue-6-2081-2084
Editor IJARCET
 
PDF
H044063843
IJERA Editor
 
PDF
Dy33753757
IJERA Editor
 
PDF
Dy33753757
IJERA Editor
 
PDF
Review on: Techniques for Predicting Frequent Items
vivatechijri
 
PDF
International Journal of Engineering Research and Development
IJERD Editor
 
PPTX
Association rule mining.pptx
maha797959
 
PDF
Data Mining For Supermarket Sale Analysis Using Association Rule
ijtsrd
 
DOCX
IEEE 2014 JAVA DATA MINING PROJECTS Secure mining of association rules in hor...
IEEEFINALYEARSTUDENTPROJECTS
 
DOCX
2014 IEEE JAVA DATA MINING PROJECT Secure mining of association rules in hori...
IEEEMEMTECHSTUDENTSPROJECTS
 
PDF
IRJET- Improving the Performance of Smart Heterogeneous Big Data
IRJET Journal
 
PDF
Adaptive and Fast Predictions by Minimal Itemsets Creation
IJERA Editor
 
PDF
Jurnal REMIK 2019
Sita Anggraeni
 
Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351
IRJET Journal
 
Gr2411971203
IJERA Editor
 
Multiple Minimum Support Implementations with Dynamic Matrix Apriori Algorith...
ijsrd.com
 
IRJET- Classification of Pattern Storage System and Analysis of Online Shoppi...
IRJET Journal
 
Data Mining based on Hashing Technique
ijtsrd
 
Frequent Item Set Mining - A Review
ijsrd.com
 
Volume 2-issue-6-2081-2084
Editor IJARCET
 
Volume 2-issue-6-2081-2084
Editor IJARCET
 
H044063843
IJERA Editor
 
Dy33753757
IJERA Editor
 
Dy33753757
IJERA Editor
 
Review on: Techniques for Predicting Frequent Items
vivatechijri
 
International Journal of Engineering Research and Development
IJERD Editor
 
Association rule mining.pptx
maha797959
 
Data Mining For Supermarket Sale Analysis Using Association Rule
ijtsrd
 
IEEE 2014 JAVA DATA MINING PROJECTS Secure mining of association rules in hor...
IEEEFINALYEARSTUDENTPROJECTS
 
2014 IEEE JAVA DATA MINING PROJECT Secure mining of association rules in hori...
IEEEMEMTECHSTUDENTSPROJECTS
 
IRJET- Improving the Performance of Smart Heterogeneous Big Data
IRJET Journal
 
Adaptive and Fast Predictions by Minimal Itemsets Creation
IJERA Editor
 
Jurnal REMIK 2019
Sita Anggraeni
 
Ad

More from Jayanti Pande (20)

PDF
UGC NET 2025 Current Affairs Module 3.pdf
Jayanti Pande
 
PDF
UGC NET 2025 Current Affairs Module 2.pdf
Jayanti Pande
 
PDF
UGC NET 2025 Current Affairs Module 1.pdf
Jayanti Pande
 
PDF
BBA Business Law Unit 4 Summary Notes.pdf
Jayanti Pande
 
PDF
BBA Business Law Unit 3 Summary Notes.pdf
Jayanti Pande
 
PDF
BBA Business Law Unit 2 Summary Notes.pdf
Jayanti Pande
 
PDF
BBA Business Law Unit 1 Summary Notes.pdf
Jayanti Pande
 
PDF
Asst Prof most probable Interview Questions.pdf
Jayanti Pande
 
PDF
Digital and Social Media Marketing Module 2.pdf
Jayanti Pande
 
PDF
Digital & Social Media Marketing Module 1.pdf
Jayanti Pande
 
PDF
Marketing Management Paper 3 Module 5.pdf
Jayanti Pande
 
PDF
Marketing Management Paper 3 Module 4.pdf
Jayanti Pande
 
PDF
Marketing Management Paper 3 Module 3 .pdf
Jayanti Pande
 
PDF
Marketing Management Paper 3 Module 2.pdf
Jayanti Pande
 
PDF
World Tread Organization [WTO] Overview.pdf
Jayanti Pande
 
PDF
Marketing Management Paper 3 Module 1.pdf
Jayanti Pande
 
PDF
Research Aptitude MCQ Series 1 for MAH SET Exam.pdf
Jayanti Pande
 
PDF
Strategy to qualify MH SET Exam in Management.pdf
Jayanti Pande
 
PDF
Digital Marketing Careers after MBA..pdf
Jayanti Pande
 
PDF
HRM Guide| Covering All HRM important topics | Best for Interview Preparation...
Jayanti Pande
 
UGC NET 2025 Current Affairs Module 3.pdf
Jayanti Pande
 
UGC NET 2025 Current Affairs Module 2.pdf
Jayanti Pande
 
UGC NET 2025 Current Affairs Module 1.pdf
Jayanti Pande
 
BBA Business Law Unit 4 Summary Notes.pdf
Jayanti Pande
 
BBA Business Law Unit 3 Summary Notes.pdf
Jayanti Pande
 
BBA Business Law Unit 2 Summary Notes.pdf
Jayanti Pande
 
BBA Business Law Unit 1 Summary Notes.pdf
Jayanti Pande
 
Asst Prof most probable Interview Questions.pdf
Jayanti Pande
 
Digital and Social Media Marketing Module 2.pdf
Jayanti Pande
 
Digital & Social Media Marketing Module 1.pdf
Jayanti Pande
 
Marketing Management Paper 3 Module 5.pdf
Jayanti Pande
 
Marketing Management Paper 3 Module 4.pdf
Jayanti Pande
 
Marketing Management Paper 3 Module 3 .pdf
Jayanti Pande
 
Marketing Management Paper 3 Module 2.pdf
Jayanti Pande
 
World Tread Organization [WTO] Overview.pdf
Jayanti Pande
 
Marketing Management Paper 3 Module 1.pdf
Jayanti Pande
 
Research Aptitude MCQ Series 1 for MAH SET Exam.pdf
Jayanti Pande
 
Strategy to qualify MH SET Exam in Management.pdf
Jayanti Pande
 
Digital Marketing Careers after MBA..pdf
Jayanti Pande
 
HRM Guide| Covering All HRM important topics | Best for Interview Preparation...
Jayanti Pande
 
Ad

Recently uploaded (20)

PDF
DIGESTION OF CARBOHYDRATES,PROTEINS,LIPIDS
raviralanaresh2
 
PDF
The Constitution Review Committee (CRC) has released an updated schedule for ...
nservice241
 
PPTX
How to Create a PDF Report in Odoo 18 - Odoo Slides
Celine George
 
PPTX
PATIENT ASSIGNMENTS AND NURSING CARE RESPONSIBILITIES.pptx
PRADEEP ABOTHU
 
PDF
ARAL-Orientation_Morning-Session_Day-11.pdf
JoelVilloso1
 
PPTX
Soil and agriculture microbiology .pptx
Keerthana Ramesh
 
PPTX
PPT on the Development of Education in the Victorian England
Beena E S
 
PDF
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - GLOBAL SUCCESS - CẢ NĂM - NĂM 2024 (VOCABULARY, ...
Nguyen Thanh Tu Collection
 
PPTX
Unit 2 COMMERCIAL BANKING, Corporate banking.pptx
AnubalaSuresh1
 
PDF
CEREBRAL PALSY: NURSING MANAGEMENT .pdf
PRADEEP ABOTHU
 
PPTX
2025 Winter SWAYAM NPTEL & A Student.pptx
Utsav Yagnik
 
PDF
Dimensions of Societal Planning in Commonism
StefanMz
 
PPTX
Views on Education of Indian Thinkers Mahatma Gandhi.pptx
ShrutiMahanta1
 
PDF
Isharyanti-2025-Cross Language Communication in Indonesian Language
Neny Isharyanti
 
PDF
People & Earth's Ecosystem -Lesson 2: People & Population
marvinnbustamante1
 
PPTX
SPINA BIFIDA: NURSING MANAGEMENT .pptx
PRADEEP ABOTHU
 
PPTX
MENINGITIS: NURSING MANAGEMENT, BACTERIAL MENINGITIS, VIRAL MENINGITIS.pptx
PRADEEP ABOTHU
 
PPT
Talk on Critical Theory, Part II, Philosophy of Social Sciences
Soraj Hongladarom
 
PDF
CHILD RIGHTS AND PROTECTION QUESTION BANK
Dr Raja Mohammed T
 
PPTX
Stereochemistry-Optical Isomerism in organic compoundsptx
Tarannum Nadaf-Mansuri
 
DIGESTION OF CARBOHYDRATES,PROTEINS,LIPIDS
raviralanaresh2
 
The Constitution Review Committee (CRC) has released an updated schedule for ...
nservice241
 
How to Create a PDF Report in Odoo 18 - Odoo Slides
Celine George
 
PATIENT ASSIGNMENTS AND NURSING CARE RESPONSIBILITIES.pptx
PRADEEP ABOTHU
 
ARAL-Orientation_Morning-Session_Day-11.pdf
JoelVilloso1
 
Soil and agriculture microbiology .pptx
Keerthana Ramesh
 
PPT on the Development of Education in the Victorian England
Beena E S
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - GLOBAL SUCCESS - CẢ NĂM - NĂM 2024 (VOCABULARY, ...
Nguyen Thanh Tu Collection
 
Unit 2 COMMERCIAL BANKING, Corporate banking.pptx
AnubalaSuresh1
 
CEREBRAL PALSY: NURSING MANAGEMENT .pdf
PRADEEP ABOTHU
 
2025 Winter SWAYAM NPTEL & A Student.pptx
Utsav Yagnik
 
Dimensions of Societal Planning in Commonism
StefanMz
 
Views on Education of Indian Thinkers Mahatma Gandhi.pptx
ShrutiMahanta1
 
Isharyanti-2025-Cross Language Communication in Indonesian Language
Neny Isharyanti
 
People & Earth's Ecosystem -Lesson 2: People & Population
marvinnbustamante1
 
SPINA BIFIDA: NURSING MANAGEMENT .pptx
PRADEEP ABOTHU
 
MENINGITIS: NURSING MANAGEMENT, BACTERIAL MENINGITIS, VIRAL MENINGITIS.pptx
PRADEEP ABOTHU
 
Talk on Critical Theory, Part II, Philosophy of Social Sciences
Soraj Hongladarom
 
CHILD RIGHTS AND PROTECTION QUESTION BANK
Dr Raja Mohammed T
 
Stereochemistry-Optical Isomerism in organic compoundsptx
Tarannum Nadaf-Mansuri
 

Data Mining Module 4 Business Analytics.pdf

  • 1. Copyright © 2024 Jayanti Rajdevendra Pande. All rights reserved. RASHTRASANT TUKDOJI MAHARAJ NAGPUR UNIVERSITY MBA SEMESTER: 3 SPECIALIZATION BUSINESS ANALYTICS (BA 2) SUBJECT DATA MINING MODULE NO : 4 ASSOCIATION RULES - Jayanti R Pande DGICM College, Nagpur
  • 2. Copyright © 2024 Jayanti Rajdevendra Pande. All rights reserved. Q1 What is Market Basket Analysis? Give significance of Market Basket Analysis for retailers? What are necessary steps for implementing Market Basket Analysis? MARKET BASKET ANALYSIS (MBA) is a data mining technique used in the field of retail and marketing to discover associations and correlations between items that customers frequently buy together. The primary goal is to identify patterns and relationships within transactional data, helping retailers understand customer behavior and preferences. Here's a breakdown of its significance and essential steps for implementation: SIGNIFICANCE OF MARKET BASKET ANALYSIS FOR RETAILERS 1. Increases Customer Engagement: By understanding the relationships between products, retailers can create targeted marketing campaigns and promotions, increasing customer engagement. 2. Boosts Sales and Increases ROI: Tailoring promotions based on customer buying patterns can lead to higher sales and a better return on investment. 3. Improves Customer Experience: Personalized recommendations and promotions enhance the overall shopping experience for customers. 4. Optimizes Marketing Strategies and Campaigns: Retailers can optimize their marketing efforts by focusing on promoting items that are frequently purchased together. 5. Helps Understand Customers Better: MBA provides insights into customer preferences, enabling retailers to stock relevant products and improve overall satisfaction. 6. Identifies Customer Behavior and Patterns: Retailers can uncover hidden patterns and trends in customer behavior, aiding in strategic decision-making.
  • 3. Copyright © 2024 Jayanti Rajdevendra Pande. All rights reserved. ESSENTIAL STEPS FOR IMPLEMENTING MARKET BASKET ANALYSIS 1 Define Minimum Support and Confidence: Support: The proportion of transactions that contain a particular itemset. Confidence: The probability that a rule is true, given that the antecedent is true. 2 Identify Subsets with Higher Support: Find all itemsets (subsets) with a support higher than the defined minimum support threshold. 3 Generate Association Rules: For each high-support itemset, generate association rules based on the defined minimum confidence threshold. 4 Sort Association Rules : Rank the association rules in decreasing order of confidence. 5 Analyze Rules: Examine the association rules along with their confidence and support values. Identify meaningful and actionable insights from the discovered patterns. Implementing Market Basket Analysis typically involves using algorithms like the Apriori algorithm or FP-growth algorithm. These algorithms efficiently mine frequent itemsets and generate association rules from transactional data. 1 Define Minimum Support and Confidence 2 Identify Subsets with Higher Support 3 Generate Association Rules 5 Analyse Rules 4 Sort Association Rules
  • 4. Copyright © 2024 Jayanti Rajdevendra Pande. All rights reserved. Apriori FP Growth • Array-based algorithm • Tree-based algorithm • Uses Join and Prune techniques • Constructs conditional frequent pattern tree • Utilizes breadth-first search algorithm • Utilizes depth-first search algorithm • Level-wise approach for pattern generation • Pattern growth approach considering existing data • Exponentially slow candidate generation • Linear runtime complexity • Highly parallelizable candidate generation • Data interdependency, each node needs root • Requires large memory space • Requires less memory space due to compact structure • Scans database multiple times • Scans dataset only twice for constructing the tree • Performance impacted by the number of items • Less impacted by the number of items • Memory-intensive due to candidate generation • More memory-efficient with a compact structure Q2 Compare Apriori and FP Growth Algorithm Apriori Algorithm The Apriori algorithm is a classic algorithm used for association rule mining, a technique in data mining that identifies relationships between variables in large datasets. It was proposed by Agrawal and Srikant in 1994. The primary objective of the Apriori algorithm is to find frequent item sets in a transaction database, which are sets of items that frequently occur together. These frequent itemsets are then used to generate association rules. FP-Growth Algorithm The FP-Growth (Frequent Pattern Growth) algorithm is an alternative approach to association rule mining that aims to address some of the limitations of the Apriori algorithm. It was proposed by Han, Pei, and Yin in 2000.
  • 5. Copyright © 2024 Jayanti Rajdevendra Pande. All rights reserved. Q3 What is FP Growth algorithm? State the advantages and Disadvantages of FP Growth Algorithm. FP-GROWTH ALGORITHM The FP-Growth Algorithm is an approach for finding frequent item sets in a database without using candidate generation. It utilizes a divide-and-conquer strategy and employs a special data structure known as the frequent-pattern tree (FP-tree). Algorithm Workflow: Compresses the input database by creating an FP-tree to represent frequent items. Divides the compressed database into sets of conditional databases, each associated with one frequent pattern. Mines each conditional database separately. Search Cost Reduction: Reduces search costs by recursively looking for short patterns and then concatenating them into long frequent patterns. Handling Large Databases: In large databases, where holding the FP tree in main memory is impractical, the algorithm partitions the database into smaller databases (projected databases) and constructs an FP-tree for each. ADVANTAGES OF FP-GROWTH ALGORITHM 1. Reduced Database Scans: Needs to scan the database twice, as opposed to Apriori, which scans transactions for each iteration. 2. Faster Execution: The pairing of items is not performed, making it faster compared to some other algorithms. 3. Compact Memory Storage: Stores the database in a compact version in memory, improving efficiency. 4. Scalability: Efficient and scalable for mining both long and short frequent patterns. DISADVANTAGES OF FP-GROWTH ALGORITHM 1. Complex FP Tree Construction: Building the FP tree is more cumbersome and challenging than the Apriori algorithm. 2. Potential Expense: May be relatively expensive, particularly in certain scenarios. 3. Memory Constraints: The algorithm may face challenges fitting into shared memory when dealing with large databases.
  • 6. Copyright © 2024 Jayanti Rajdevendra Pande. All rights reserved. Q4 What are different types of Association rules in data mining? Briefly mention about the algorithms used for Association Rule mining TYPES OF ASSOCIATION RULES IN DATA MINING 1.Multi-Relational Association Rule (MRAR) : Derived from multi-relational databases, MRAR involves rules with one entity having different relationships, representing indirect relationships between entities. 2.Generalized Association Rule : Used to discover hidden patterns in data, generalized association rules provide a rough idea about interesting patterns. 3.Quantitative Association Rules : Involves numeric attributes in at least one part of the rule, distinguishing it from generalized association rules where both sides consist of categorical attributes. 1 Multi-Relational Association Rule 2 Generalized Association Rule 3 Quantitative Association Rules TYPES OF ASSOCIATION RULES IN DATA MINING
  • 7. Copyright © 2024 Jayanti Rajdevendra Pande. All rights reserved. ALGORITHMS FOR ASSOCIATION RULE MINING 1 Apriori Algorithm: Description: Identifies frequent individual items in a database and expands them to larger item sets, ensuring that these item sets appear sufficiently often in the database. Key Characteristics: Utilizes a breadth-first search algorithm and generates candidate itemsets with the Apriori property. 2 Eclat Algorithm: Description: Also known as Equivalence Class Clustering and bottom-up, Eclat is considered by some as a more efficient version of the Apriori algorithm. It employs lattice traversal to find frequent itemsets. Key Characteristics: Focuses on intersection and support counting, avoiding the need for candidate generation. 3 FP-Growth Algorithm: Description: Operates in two stages, including FP-tree construction and the extraction of frequently used item sets. Particularly useful for finding frequent patterns without candidate generation. Key Characteristics: Uses a divide-and-conquer strategy, creating an FP-tree to represent frequent items and then dividing the database into conditional databases. These algorithms play a crucial role in discovering associations and patterns within large datasets. While Apriori and Eclat focus on candidate generation and support counting, FP-Growth eliminates the need for explicit candidate generation, making it more efficient for certain scenarios. Each algorithm has its strengths and weaknesses, and the choice of the algorithm depends on the specific requirements of the data mining task and the characteristics of the dataset.
  • 8. Q5 How to apply the Apriori algorithm to the given data?
Step-1: K=1
1. Create Candidate Set C1: Count the occurrences of each individual item (I1, I2, I3, I4, I5) in the dataset.
2. Generate Frequent Itemset L1: Keep only the items with a support count greater than or equal to the minimum support count (min_support=2).
Step-2: K=2
1. Generate Candidate Set C2: Join L1 with itself to form pairs, and discard any pair that violates the Apriori property, that is, any pair with a subset that is not frequent. Count the occurrences of each remaining candidate pair in the dataset.
2. Generate Frequent Itemset L2: Keep only the pairs with a support count greater than or equal to the minimum support count.
Step-3: K=3
1. Generate Candidate Set C3: Join L2 with itself to form triplets, and discard any triplet that violates the Apriori property. Count the occurrences of each remaining candidate triplet in the dataset.
2. Generate Frequent Itemset L3: Keep only the triplets with a support count greater than or equal to the minimum support count.
Continue this process until no more frequent itemsets can be generated.
Association Rule Generation:
1. For each frequent itemset, take every non-empty proper subset A and pair it with its complement B (the remaining items of the itemset).
2. Calculate the confidence for each rule: Confidence(A->B) = Support_count(A∪B) / Support_count(A).
3. Keep only the rules with confidence greater than or equal to the minimum confidence threshold (min_confidence=50%).
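The level-wise steps and the rule generation described above can be sketched in Python as follows. The transaction table from the slide is not reproduced in this transcript, so the nine transactions below are an illustrative assumption; the function names apriori and generate_rules are likewise hypothetical. The thresholds min_support=2 and min_confidence=50% are the ones stated in the slide.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Step 1: count single items (C1) and keep the frequent ones (L1).
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(level)
    k = 2
    while level:
        prev = list(level)
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        # Count each surviving candidate (Ck) and keep the frequent ones (Lk).
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {s: n for s, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
    return frequent


def generate_rules(frequent, min_confidence):
    """Return (A, B, confidence) for rules A -> B meeting min_confidence."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):  # non-empty proper subsets only
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                b = itemset - a
                # Confidence(A->B) = Support_count(A U B) / Support_count(A)
                confidence = count / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, b, confidence))
    return rules


# Hypothetical transactions over I1..I5 (an assumption, not the slide's table).
TRANSACTIONS = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
frequent = apriori(TRANSACTIONS, min_support=2)
rules = generate_rules(frequent, min_confidence=0.5)
for a, b, conf in rules:
    print(sorted(a), "->", sorted(b), f"confidence={conf:.2f}")
```

With these assumed transactions, the process stops at K=3: the largest frequent itemsets are {I1, I2, I3} and {I1, I2, I5}, each with a support count of 2. A rule such as {I5} -> {I1, I2} has confidence 2/2 = 100% and is kept, while {I1} -> {I2, I5} has confidence 2/6, about 33%, and is dropped.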
  • 9. Q6 Write about the Associative Classification method.
ASSOCIATIVE CLASSIFICATION is a method that combines principles from association rule mining and classification algorithms. It uses the associations discovered in a dataset to improve the performance of classification models. The key steps are:
1. Association Rule Mining: Association rules are mined from the dataset. This step identifies interesting relationships between different attributes in the data. Common algorithms include Apriori and FP-Growth.
2. Rule Pruning: Once the association rules are generated, a pruning step usually filters out less relevant or less significant rules. Pruning criteria include support, confidence, and other relevance measures.
3. Rule-to-Class Transformation: The association rules, typically "if-then" statements, are transformed into classification rules. Only rules whose consequent is a class label are usable here: the antecedent becomes the condition for classifying an instance, and the consequent becomes the predicted class label.
4. Building the Classification Model: The transformed rules are used to build a classification model that captures the relationships and dependencies identified during the mining phase.
5. Classifying New Instances: When a new, unseen instance needs to be classified, the rules are applied to determine the predicted class label. Several rules may apply to a single instance, so a conflict resolution strategy is needed.
6. Conflict Resolution: Handles cases where multiple rules predict conflicting class labels for the same instance. Strategies include selecting the rule with the highest confidence, using a voting mechanism, or applying additional criteria.
7. Evaluation and Fine-Tuning: The performance of the associative classification model is evaluated using standard metrics such as accuracy, precision, recall, and F1-score. Fine-tuning may then be performed to improve the model's performance.
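The steps above can be illustrated with a minimal Python sketch. It mines class rules by brute-force enumeration of attribute-value combinations rather than with Apriori or FP-Growth, prunes by support and confidence, and resolves conflicts by picking the highest-confidence matching rule. The function names, the thresholds, and the six-row toy dataset are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_class_rules(rows, labels, min_support, min_confidence, max_len=2):
    """Return (antecedent, label, confidence, support) tuples, best rule first."""
    condition_counts = Counter()  # how often each antecedent occurs
    rule_counts = Counter()       # how often antecedent and label occur together
    for row, label in zip(rows, labels):
        pairs = sorted(row.items())
        for r in range(1, max_len + 1):
            for combo in combinations(pairs, r):
                condition_counts[combo] += 1
                rule_counts[(combo, label)] += 1
    rules = []
    for (combo, label), n in rule_counts.items():
        confidence = n / condition_counts[combo]
        # Rule pruning: drop rules below the support or confidence threshold.
        if n >= min_support and confidence >= min_confidence:
            rules.append((frozenset(combo), label, confidence, n))
    # Order rules so the highest-confidence (then highest-support) rule wins.
    rules.sort(key=lambda rule: (-rule[2], -rule[3]))
    return rules


def classify(instance, rules, default):
    """Predict with the first (highest-confidence) rule whose antecedent matches."""
    items = set(instance.items())
    for antecedent, label, _confidence, _support in rules:
        if antecedent <= items:
            return label
    return default  # no rule applies


# Hypothetical training data (an assumption, not taken from the slides).
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["yes", "yes", "no", "no", "yes", "no"]

class_rules = mine_class_rules(rows, labels, min_support=2, min_confidence=0.8)
prediction = classify({"outlook": "sunny", "windy": "yes"}, class_rules, default="yes")
print(prediction)
```

A voting mechanism would replace the early return in classify with a tally over all matching rules; the highest-confidence choice is used here only because it is the simplest of the strategies listed.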
  • 10. Copyright © 2024 Jayanti Rajdevendra Pande. All rights reserved. This content may be printed for personal use only. It may not be copied, distributed, or used for any other purpose without the express written permission of the copyright owner. This content is protected by copyright law. Any unauthorized use of the content may violate copyright laws and other applicable laws. For any further queries, contact by email: [email protected]