Azqa Saleem Khan (SP22-RCS-003): FP-Growth

The FP-Growth algorithm is an improvement on the Apriori algorithm for frequent pattern mining. It avoids candidate generation and instead constructs a frequent-pattern tree (FP-tree) to store transaction data, compressing the database. Frequent patterns are generated by traversing the FP-tree without candidate generation. The algorithm scans the database to determine frequent items, constructs the FP-tree, and then mines the tree to find frequent patterns.


FP-Growth Algorithm

(Frequent Pattern Growth Algorithm)

Azqa Saleem Khan


(SP22-RCS-003)
Department of Computer Science
Advanced Algorithm Analysis
Dr. Nadeem Javaid

(Date: 30/05/2022)

COMSATS University Islamabad, Islamabad Pakistan


Preliminaries (1/3)
 Artificial Intelligence: Artificial intelligence (AI) refers to the simulation of human intelligence in
machines that are programmed to think like humans and mimic their actions.

 Machine Learning: Machine learning is a branch of artificial intelligence (AI) and computer science
which focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its
accuracy.[1]

 Unsupervised Learning: Unsupervised learning is a machine learning technique in which users do
not need to supervise the model. Instead, the model works on its own to discover patterns and
information that were previously undetected. It mainly deals with unlabeled data. [2]

 Unsupervised Learning Algorithms: Unsupervised machine learning algorithms are used when the
training data is neither classified nor labeled. They study how systems can infer a function to
describe a hidden structure from unlabeled data. At no point does the system know the correct output with
certainty; instead, it draws inferences from the dataset about what the output should be. [2]

1
Preliminaries (2/3)
Frequent itemset: A frequent itemset is a set of items whose support is greater than or equal to a
threshold value, the user-specified minimum support. By the downward-closure property, if {A, B} is a
frequent itemset, then A and B must individually be frequent itemsets as well.
 Suppose there are two transactions: A = {1,2,3,4,5} and B = {2,3,7}. In these two transactions, the items
2 and 3 form a frequent itemset.[3]
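The support counting behind this example can be sketched in a few lines of Python (an illustrative sketch, not part of the original slides):

```python
from collections import Counter

# Mini-database from the slide: two transactions A and B.
transactions = [
    {1, 2, 3, 4, 5},  # A
    {2, 3, 7},        # B
]

# Support count of an item = number of transactions containing it.
support = Counter(item for t in transactions for item in t)

# With min_support = 2, only the items appearing in both transactions survive.
min_support = 2
frequent = {item for item, count in support.items() if count >= min_support}
print(frequent)  # {2, 3}
```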

Association Rule: [3]


 Association rule learning is an unsupervised learning technique that checks for the dependency of one data
item on another data item and maps them accordingly so that the result can be used profitably. It tries to
find interesting relations or associations among the variables of a dataset, based on rules that discover
these relations between variables in the database.
 Association rule learning is one of the very important concepts of machine learning, and it is employed
in market basket analysis, web usage mining, continuous production, etc.
 An association rule is an implication expression of the form X → Y, where X and Y are any two itemsets.

2
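As a concrete illustration of these definitions, the standard support and confidence of a rule X → Y can be computed as follows (the helper name `rule_metrics` and the toy transactions are our own, not from the slides):

```python
def rule_metrics(transactions, X, Y):
    """Return (support, confidence) of the rule X -> Y over a list of set-valued transactions."""
    n = len(transactions)
    both = sum(1 for t in transactions if X <= t and Y <= t)  # transactions with X ∪ Y
    only_x = sum(1 for t in transactions if X <= t)           # transactions with X
    support = both / n
    confidence = both / only_x if only_x else 0.0
    return support, confidence

txns = [{'bread', 'milk'}, {'bread', 'butter'},
        {'bread', 'milk', 'butter'}, {'milk'}]
sup, conf = rule_metrics(txns, {'bread'}, {'milk'})
print(sup, conf)  # 0.5 and 2/3: bread and milk co-occur in 2 of 4 baskets
```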
Preliminaries (3/3)
Apriori Algorithm: The Apriori algorithm finds the most frequent itemsets in a transaction
database and identifies association rules between the items. It uses "join" and "prune" steps to reduce
the search space, iterating level by level to discover the most frequent itemsets.
 The two primary drawbacks of the Apriori algorithm are:
1. At each step, candidate sets have to be built.
2. To build the candidate sets, the algorithm has to repeatedly scan the database.
 These two properties inevitably make the algorithm slow. To avoid these redundant steps, a new
association-rule mining algorithm was developed: the Frequent Pattern Growth (FP-Growth) algorithm. [3]

3
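To make these drawbacks concrete, here is a minimal sketch of Apriori-style level-wise candidate generation (the toy data and helper are our own): each level joins the previous frequent itemsets into candidates, and then the whole database must be scanned again to count them.

```python
from itertools import combinations

transactions = [{'a', 'b', 'c'}, {'a', 'c'}, {'a', 'd'}, {'b', 'c'}]
min_support = 2

def count(candidate):
    """One pass over the database per call — the cost Apriori keeps paying."""
    return sum(1 for t in transactions if candidate <= t)

# Level 1: one full scan to count single items.
items = sorted({i for t in transactions for i in t})
frequent1 = [frozenset({i}) for i in items if count(frozenset({i})) >= min_support]

# Level 2: join frequent 1-itemsets into candidate pairs, then scan the
# database AGAIN to count each candidate.
candidates = [a | b for a, b in combinations(frequent1, 2)]
frequent2 = [c for c in candidates if count(c) >= min_support]
print(sorted(sorted(c) for c in frequent2))  # [['a', 'c'], ['b', 'c']]
```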
FP-Growth- Introduction
 The FP-Growth algorithm was introduced by Han, Pei and Yin in 2000 to eliminate the candidate generation of
the Apriori algorithm.
 This algorithm is an improvement over the Apriori method.
 Frequent patterns are generated without the need for candidate generation. FP-Growth represents the
database in the form of a tree called a frequent-pattern tree, or FP-tree.
 This tree structure maintains the associations between the itemsets. The database is fragmented using one
frequent item at a time; each fragmented part is called a "pattern fragment".
 Apriori is a join-based algorithm, while FP-Growth is a tree-based algorithm, for frequent itemset mining
(frequent pattern mining) in market basket analysis.
 An FP-tree is a compact data structure that represents the data set in tree form. Each transaction is read
and then mapped onto a path in the FP-tree, until all transactions have been read. Transactions that share
common prefixes keep the tree compact because their paths overlap.

4
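A minimal sketch of the FP-tree structure just described (the class and function names are our own, not from the original paper): each node holds an item, a counter, and child links, and inserting a transaction reuses any existing prefix path.

```python
class FPNode:
    """One node of the FP-tree: an item, its count, and child links."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}  # item -> FPNode

def insert_transaction(root, items):
    """Map one (frequency-ordered) transaction onto a path, reusing prefixes."""
    node = root
    for item in items:
        node = node.children.setdefault(item, FPNode(item, node))
        node.count += 1

root = FPNode(None)
insert_transaction(root, ['f', 'c', 'a', 'm', 'p'])
insert_transaction(root, ['f', 'c', 'a', 'b', 'm'])
# The shared prefix f -> c -> a is stored once, with count 2 on each node.
print(root.children['f'].count)  # 2
```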
Flowchart
1. Scan the data set to determine the support count of
each item, discard the infrequent items, and sort the
frequent items in decreasing order of support.
2. Scan the data set one transaction at a time to create
the FP-tree. For each transaction:
a. If it shares no prefix with an existing path, form
a new path and set the counter for each node to 1.
b. If it shares a common prefix itemset, increment
the common prefix node counters and create new
nodes where needed.
3. Continue until every transaction has been mapped
onto the tree.

Source: https://arxiv.org/ftp/arxiv/papers/1901/1901.11376.pdf 5
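Steps 1 and 2 above can be sketched as follows, using the example transactions from the following slides. Note that ties between items with equal support may be broken arbitrarily; here the priority order is fixed to match the slides.

```python
from collections import Counter

transactions = [
    ['f','a','c','d','g','i','m','p'],  # T1
    ['a','b','c','f','l','m','o'],      # T2
    ['b','f','h','j','o'],              # T3
    ['b','c','k','s','p'],              # T4
    ['a','f','c','e','l','p','m','n'],  # T5
]
min_support = 3

# Step 1: count supports; only items with count >= min_support are kept.
counts = Counter(i for t in transactions for i in t)
# Decreasing-support order; ties fixed to match the slides.
priority = {item: rank for rank, item in enumerate(['f', 'c', 'a', 'b', 'm', 'p'])}

# Step 2 preparation: drop infrequent items and sort each transaction
# by the global frequency order before inserting it into the tree.
ordered = [sorted((i for i in t if i in priority), key=priority.get)
           for t in transactions]
print(ordered[0])  # ['f', 'c', 'a', 'm', 'p']
```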
FP-Growth Algorithm

FP-Growth(Tree, α) {
    for each item aᵢ in the header table of Tree {
        generate pattern β = aᵢ ∪ α, with support = aᵢ.support
        construct β's conditional pattern base and
        β's conditional FP-tree, Tree_β
        if Tree_β ≠ ∅
            call FP-Growth(Tree_β, β)
    }
    return the frequent patterns generated
}

6
Source: https://arxiv.org/ftp/arxiv/papers/1901/1901.11376.pdf/
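The recursion above can be written as a runnable sketch. For brevity, this version represents each conditional FP-tree by its conditional pattern base, i.e. a list of weighted prefix paths, rather than a linked node structure; the function name and representation are our own, but the recursion mirrors the pseudocode.

```python
from collections import Counter

def fp_growth(paths, min_support, suffix=()):
    """Recursive FP-growth sketch.

    `paths` is a list of (items, count) pairs, where each `items` tuple is
    frequency-ordered: initially the ordered transactions, each with count 1;
    in recursive calls, a conditional pattern base.
    Returns {frozenset(pattern): support}.
    """
    counts = Counter()
    for items, c in paths:
        for item in set(items):
            counts[item] += c

    patterns = {}
    for item, support in counts.items():
        if support < min_support:
            continue
        pattern = suffix + (item,)       # beta = a_i ∪ alpha
        patterns[frozenset(pattern)] = support
        # Conditional pattern base for `item`: the prefix of each path
        # that ends just before `item`, carrying the path's count.
        base = [(items[:items.index(item)], c)
                for items, c in paths
                if item in items and items.index(item) > 0]
        patterns.update(fp_growth(base, min_support, pattern))
    return patterns

ordered = [('f','c','a','m','p'), ('f','c','a','b','m'), ('f','b'),
           ('c','b','p'), ('f','c','a','m','p')]
result = fp_growth([(t, 1) for t in ordered], min_support=3)
print(result[frozenset({'c', 'p'})])  # 3
```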
FP-Growth- Example
The given data is a hypothetical dataset of transactions with each letter representing an item.

min_support = 3.

TID Items Bought


T1 f,a,c,d,g,i,m,p

T2 a,b,c,f,l,m,o

T3 b,f,h,j,o

T4 b,c,k,s,p

T5 a,f,c,e,l,p,m,n

TABLE-1

Source: https://www.vtupulse.com/big-data-analytics/frequent-pattern-fp-growth-algorithm-example/ 7
We find the frequency of each item.
The following table gives the frequency of each item in the given data.
This is the support count of each item; for example, item c has been bought in 4 transactions
(T1, T2, T4, and T5), so the support count of c is 4.

Item Frequency Item Frequency


a 3 j 1
b 3 k 1
c 4 l 2
d 1 m 3
e 1 n 1
f 4 o 2
g 1 p 3
h 1 s 1
i 1

TABLE-2
8
A Frequent Pattern set (L) is built, which will contain all the elements whose frequency is greater than
or equal to the minimum support.
These elements are stored in descending order of their respective frequencies.
As minimum support is 3.
After insertion of the relevant items, the set L looks like this:
L = { (f:4), (c:4), (a:3), (b:3), (m:3), (p:3) }

Now, for each transaction, the respective Ordered-Item set is built.

TID Items Bought (Ordered) Frequent Items

T1 f,a,c,d,g,i,m,p f,c,a,m,p

T2 a,b,c,f,l,m,o f,c,a,b,m

T3 b,f,h,j,o f,b

T4 b,c,k,s,p c,b,p

T5 a,f,c,e,l,p,m,n f,c,a,m,p

TABLE-3
9
 Now, all the Ordered-Item sets are to be inserted into a Trie Data
Structure (frequent pattern tree).

[Figure: create the root, then insert Transaction 1]

10

[Figure: insert Transaction 2, then Transaction 3]

11

[Figure: insert Transaction 4, then Transaction 5]

12
 For each item, the Conditional Pattern Base is computed: the prefix paths of all the paths in the
frequent-pattern tree that lead to any node of the given item.

Item Conditional Pattern Base


p {{f,c,a,m:2}, {c,b:1}}

m {{f,c,a:2},{f,c,a,b:1}}

b {{f,c,a:1},{f:1},{c:1}}

a {{f,c:3}}

c {{f:3}}

f Ø

TABLE-4

13
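Table-4 can be reproduced directly from the ordered transactions of Table-3: for each occurrence of an item, take the prefix path that precedes it and merge identical prefixes. (In the real algorithm these prefixes are read off the FP-tree via its node links; this is an illustrative shortcut, and the helper name is our own.)

```python
from collections import Counter

# Ordered frequent-item transactions from Table-3.
ordered = [('f','c','a','m','p'), ('f','c','a','b','m'), ('f','b'),
           ('c','b','p'), ('f','c','a','m','p')]

def conditional_pattern_base(item):
    """Merge the prefix paths that lead to `item` in the ordered transactions."""
    base = Counter()
    for t in ordered:
        if item in t:
            prefix = t[:t.index(item)]
            if prefix:  # an empty prefix contributes nothing
                base[prefix] += 1
    return dict(base)

print(conditional_pattern_base('p'))
# {('f', 'c', 'a', 'm'): 2, ('c', 'b'): 1}  — matching Table-4
```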
 Now the Conditional Frequent Pattern Tree is built.
It is done by taking the set of elements that are common to all the paths in the Conditional Pattern
Base of that item, and calculating each element's support count by summing the support counts of all
the paths in the Conditional Pattern Base.

Item Conditional Pattern Base Conditional FP-Tree


p {{f,c,a,m:2}, {c,b:1}} {c:3}

m {{f,c,a:2},{f,c,a,b:1}} {f,c,a:3}

b {{f,c,a:1},{f:1},{c:1}} Ø

a {{f,c:3}} {f,c:3}

c {{f:3}} {f:3}

f Ø Ø

TABLE-5

14
 Next, the Frequent Pattern rules are generated by pairing the items of the Conditional Frequent
Pattern Tree set to the corresponding item.

Item   Conditional Pattern Base     Conditional FP-Tree   Frequent Patterns Generated

p      {{f,c,a,m:2}, {c,b:1}}       {c:3}                 {<c,p:3>}

m      {{f,c,a:2}, {f,c,a,b:1}}     {f,c,a:3}             {<f,m:3>, <c,m:3>, <a,m:3>, <f,c,m:3>, <f,a,m:3>, <c,a,m:3>}

b      {{f,c,a:1}, {f:1}, {c:1}}    Ø                     {}

a      {{f,c:3}}                    {f,c:3}               {<f,a:3>, <c,a:3>, <f,c,a:3>}

c      {{f:3}}                      {f:3}                 {<f,c:3>}

f      Ø                            Ø                     {}

TABLE-6
15
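When an item's conditional FP-tree is a single path, as it is for m above, its frequent patterns are exactly every non-empty subset of that path combined with the item, each carrying the path's support. A sketch:

```python
from itertools import combinations

# Conditional FP-tree of item 'm' from Table-5: the single path f -> c -> a, support 3.
path, support, item = ('f', 'c', 'a'), 3, 'm'

# Every non-empty subset of the path, each combined with 'm'.
patterns = [(subset + (item,), support)
            for r in range(1, len(path) + 1)
            for subset in combinations(path, r)]
print(patterns)
```

Note that this enumeration also yields <f,c,a,m:3>, one pattern beyond those listed for m in the table above.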
FP-Growth
 For each row of the table above, association rules can be inferred from each frequent pattern.
For example, from the frequent pattern <c,p:3> in the first row, the rules c → p and p → c can be inferred.

 Advantages Of FP Growth Algorithm


1. This algorithm needs to scan the database only twice, whereas Apriori scans the
transactions once per iteration.
2. Candidate pairing of items is not done in this algorithm, which makes it faster.
3. The database is stored in a compact version in memory.
4. It is efficient and scalable for mining both long and short frequent patterns.

 Disadvantages Of FP-Growth Algorithm


1. The FP-tree is more cumbersome and difficult to build than Apriori's candidate sets.
2. It may be expensive to construct.
3. When the database is large, the tree may not fit in main memory.

16
Source: https://www.vtupulse.com/big-data-analytics/frequent-pattern-fp-growth-algorithm-example/
References
1. https://www.ibm.com/cloud/learn/machine-learning
2. https://www.guru99.com/unsupervised-machine-learning.html
3. https://www.javatpoint.com/apriori-algorithm-in-machine-learning
4. https://arxiv.org/ftp/arxiv/papers/1901/1901.11376.pdf
5. https://www.vtupulse.com/big-data-analytics/frequent-pattern-fp-growth-algorithm-example/
6. https://www.geeksforgeeks.org/ml-frequent-pattern-growth-algorithm/

17
Thank You!!