Final_Presentation_ENDSEMFORNITJSRI.pptx

CONTENTS
• Introduction
• Literature Survey
• Problem Statement
• Model Architecture
• Dataset
• Preprocessing
• Feature Selection using MCDM and Statistical methods
• Training model
• Comparison between MCDM and Statistical method

Introduction
•Network security is paramount in today's interconnected world.
•However, emerging attacks such as DoS, deauthentication, and Kr00k
pose risks to even the strongest security measures.
•This presentation explores how Multi-Criteria Decision Making
(MCDM) can help us identify and implement the most effective
strategies to enhance network security.

Literature Survey
INDEX | AUTHOR | TITLE | SUMMARY
1 | Abhijit Sharad Warhekar | A Wireless Intrusion Detection System using Feature Selection with Random Forest | Uses ensemble methods to select features and compares Random Forest performance.
2 | Şevval Şolpan | Wi-Fi Network Intrusion Detection: Enhanced with Feature Extraction and Machine Learning | Feature extraction followed by a CNN.

Literature Survey
INDEX | AUTHOR | TITLE | SUMMARY
1 | Abhijit Kamble | Feature Selection in Wireless Intrusion Detection System for Evil Twin Attack Detection | Gini index feature extraction followed by a CNN.
2 | Rayed S. Ahmad | A GAF and CNN based Wi-Fi Network Intrusion Detection System | GAF converts CSV data to image files, which a CNN then classifies.

Problem Statement
• Wireless networks based on the IEEE 802.11 standard (Wi-Fi) are increasingly vulnerable to
sophisticated, evolving cyber-attacks, posing severe threats to organizational and user
security. Traditional multiclass intrusion detection systems typically rely on static, statistically-
driven feature selection methods, which fail to adequately capture the complex, multi-
dimensional nature of different attack types. Moreover, existing methods often neglect crucial
decision-making criteria such as computational efficiency, interpretability, and attack-specific
criticality. Consequently, current detection systems lack robustness, adaptability, and
accuracy, particularly when distinguishing among multiple intrusion scenarios.
• Therefore, there is a critical need to develop an effective and adaptive feature selection
framework, incorporating Multi-Criteria Decision-Making (MCDM) techniques, to improve the
detection and classification accuracy of multiclass Wi-Fi network intrusions. This research
addresses this gap by integrating a novel feature-level MCDM approach, specifically using
methods such as Fuzzy AHP and Fuzzy TOPSIS, applied to the AWID3 dataset, to enhance
the accuracy, interpretability, and practical applicability of Wireless Intrusion Detection
Systems (WIDS).

Model Architecture
[Data Preprocessing] → [Initial Features] → [Criteria Definition]
         │                     │                      │
         │                     │                      ▼
         │                     ▼           [Entropy Weight Method]
         │          [Fuzzy Decision Matrix]           │
         │                     │                      │
         │                     ▼                      │
         │         [FPIS & FNIS Calculation] ←────────┘
         │                     │
         │                     ▼
         │          [Distance Calculation]
         │                     │
         │                     ▼
         └────────────►[Feature Ranking]
                               │
                               ▼
                 [Final Selected Feature Set]
                               │
                               ▼
                    [Classification Models]
                               │
                               ▼
                      [Model Evaluation]
                               │
                               ▼
                [Intrusion Detection Decision]

DATASET
• AWID3 dataset:
• The AWID3 dataset is a popular benchmark dataset for network intrusion detection research.
• Shape of AWID3 at each stage: (37592300, 255) → (3759230, 66) → (3759230, 31) → (3759230, 15)

Libraries
• NumPy (for numerical operations)
• pandas (for data handling using DataFrames)
• TensorFlow and Keras (for building and training neural networks)
• matplotlib.pyplot and seaborn (for creating plots and charts)
• os (for interacting with the operating system)
• warnings (for managing warnings)

Preprocessing
• The AWID3 dataset consists of 13 folders containing CSV files.
• First, all CSV files in a folder are combined into one CSV file using a loop.
• We used this loop-based approach for all 13 folders, apart from a few exceptions.
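A loop of this kind could look like the following minimal sketch. The function name, the `*.csv` pattern, and the assumption that every file in a folder shares the same columns are illustrative and not taken from the original code.

```python
from pathlib import Path

import pandas as pd


def combine_folder(folder, output_csv):
    """Concatenate every CSV file in `folder` into one CSV; return the row count."""
    # Lexicographic order: "10.csv" sorts before "2.csv" unless names are zero-padded.
    csv_files = sorted(Path(folder).glob("*.csv"))
    # Assumption: all files share the same header, so row-wise concatenation is valid.
    frames = [pd.read_csv(path, low_memory=False) for path in csv_files]
    combined = pd.concat(frames, ignore_index=True)
    combined.to_csv(output_csv, index=False)
    return len(combined)
```

Writing the output outside the source folder avoids picking it up again on a second run.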

Preprocessing
• Not all folders could be combined with this procedure. Some were very large, so we had
to use a chunking method.
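One common way to chunk with pandas is the `chunksize` argument of `read_csv`, which streams a file piece by piece instead of loading it whole. The sketch below assumes that approach; the function name and the default chunk size are hypothetical.

```python
from pathlib import Path

import pandas as pd


def combine_folder_chunked(folder, output_csv, chunksize=100_000):
    """Append every CSV in `folder` to `output_csv` one chunk at a time."""
    write_header = True
    rows_written = 0
    for path in sorted(Path(folder).glob("*.csv")):
        # Only `chunksize` rows are held in memory at any moment.
        for chunk in pd.read_csv(path, chunksize=chunksize):
            # Assumption: `output_csv` does not exist yet and all files share one header.
            chunk.to_csv(output_csv, mode="a", header=write_header, index=False)
            write_header = False
            rows_written += len(chunk)
    return rows_written
```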

Preprocessing
• After all folders were combined, we needed to combine the data for all 13 attacks into one file.
• We first combined the attacks two at a time, reducing 13 files to 7 files.
• We then repeated this until everything was combined.
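The pairwise reduction can be sketched as follows. This version works on in-memory DataFrames for brevity, whereas the original worked on files; the function name is hypothetical.

```python
import pandas as pd


def merge_in_pairs(frames):
    """Concatenate neighbouring DataFrames pairwise until a single one remains."""
    frames = list(frames)  # expects at least one DataFrame
    round_sizes = [len(frames)]
    while len(frames) > 1:
        # Each round merges items (0, 1), (2, 3), ...; an odd last item passes through.
        frames = [
            pd.concat(frames[i:i + 2], ignore_index=True)
            for i in range(0, len(frames), 2)
        ]
        round_sizes.append(len(frames))
    return frames[0], round_sizes
```

With 13 inputs the rounds shrink as 13, 7, 4, 2, 1, which matches the first reduction from 13 files to 7.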

Preprocessing
• First, we dropped columns that are 95% null, because they have hardly any
impact on our algorithm. The drop command is sufficient for this.
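A minimal sketch of this step with pandas, assuming "95% null" means a null fraction of at least 0.95 per column (the exact cutoff rule is an assumption):

```python
import pandas as pd


def drop_mostly_null_columns(df, threshold=0.95):
    """Return `df` without columns whose fraction of nulls is >= `threshold`."""
    null_fraction = df.isna().mean()  # per-column share of missing values
    keep = null_fraction[null_fraction < threshold].index
    return df[keep]
```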

Preprocessing
• After dropping columns, the datatypes of the remaining columns are inconsistent. We therefore
inspect the DataFrame, note the correct datatype for each column, and declare those datatypes
up front.
• The next step converts object-dtype columns into numeric dtypes.
• After dropping the unusable features, 66 features remain, of which 35 are objects.
• Each of these must be preprocessed individually, because most of them use a different
data format, such as:
• datetime (Dec 15- 2020 15:27:38.167117000 GTB Standard Time)
• negative signal values, Wi-Fi BSSIDs, hexadecimal addresses, flags, payload text values, etc.
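Per-column converters for these formats might look like the sketch below. The function bodies are assumptions: the timestamp parser ignores the time-zone name and truncates nanoseconds to microseconds, the signal parser assumes several negative readings concatenated in one string (for example `-62-60-61`), and the hexadecimal parser assumes an optional `0x` prefix or colon separators.

```python
import re
from datetime import datetime

_FRAME_TIME = re.compile(r"^(\w{3})\s+(\d{1,2})\W+(\d{4})\s+(\d{2}:\d{2}:\d{2})\.(\d+)")


def datestamp(text):
    """Parse a capture timestamp string into a naive datetime (time zone dropped)."""
    match = _FRAME_TIME.match(text.strip())
    if match is None:
        return None
    month, day, year, clock, fraction = match.groups()
    micros = fraction[:6].ljust(6, "0")  # datetime stores microseconds, not nanoseconds
    return datetime.strptime(
        f"{month} {day} {year} {clock}.{micros}", "%b %d %Y %H:%M:%S.%f"
    )


def compute_mean(text):
    """Average all signed integer readings found in a signal-strength string."""
    readings = [int(value) for value in re.findall(r"-?\d+", text)]
    return sum(readings) / len(readings) if readings else None


def hex_to_int(text):
    """Convert a hexadecimal string such as '0x1f' or '00:1a:2b' to an integer."""
    return int(text.replace(":", ""), 16)
```

If a base-2 string is needed, `format(hex_to_int(value), "b")` renders the integer in binary.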

Preprocessing
• We wrote separate conversion functions, for example datestamp to convert time strings to datetime.
• Compute_mean converts signal values to numeric values.
• Similarly, hexadecimal values (base 16) are converted to base 2.
• Attack names in the Label column are replaced using an encoder.
• Filling missing values:
• We used IterativeImputer (round-robin estimation).
• We also have an instance running in the cloud that uses KNNImputer.
Preprocessing
• Downscaling is needed because there are approximately 30 million rows, far too many
to process on our machine. Downsampling of the attack data was therefore implemented.
• The ratio maintained was 1:10 (Attack:Normal).
• Oversampling is needed for attacks that have few rows.
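One way to realise both steps with pandas is sketched below. It assumes a `Label` column with a `Normal` class, oversamples rare attacks by drawing rows with replacement, and downsamples normal traffic to ten rows per attack row. The column name, class name, and `min_rows_per_attack` value are all hypothetical.

```python
import pandas as pd


def rebalance(df, label_col="Label", normal_label="Normal",
              normal_per_attack=10, min_rows_per_attack=1000, seed=0):
    """Oversample rare attack classes, then downsample normal rows to a 1:10 ratio."""
    normal = df[df[label_col] == normal_label]
    attacks = df[df[label_col] != normal_label]

    parts = []
    for _, group in attacks.groupby(label_col):
        if len(group) < min_rows_per_attack:
            # Rare attack: draw rows with replacement up to the minimum size.
            group = group.sample(n=min_rows_per_attack, replace=True, random_state=seed)
        parts.append(group)
    attacks = pd.concat(parts)  # expects at least one attack class

    # Keep at most `normal_per_attack` normal rows for every attack row.
    n_normal = min(len(normal), normal_per_attack * len(attacks))
    normal = normal.sample(n=n_normal, random_state=seed)

    # Shuffle so classes are not grouped together.
    return pd.concat([attacks, normal]).sample(frac=1, random_state=seed).reset_index(drop=True)
```

Because oversampling duplicates rows, it is usually applied to the training split only, so that copies of the same row do not end up in both training and test data.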

Preprocessing
• Input values needed to be standardized, so we wrote a function to choose between:
• StandardScaler
• Min-Max
• The choice between these two methods is based on outliers.
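A possible selection rule is sketched below with NumPy. It is an assumption, not the project's function: a column with values outside 1.5 times the interquartile range gets standard (z-score) scaling, because min-max scaling would be dominated by the extreme values, and every other column gets min-max scaling. The formulas match scikit-learn's `StandardScaler` and `MinMaxScaler` defaults.

```python
import numpy as np


def has_outliers(values, k=1.5):
    """Flag a column if any value lies outside the k * IQR fences."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return bool(np.any((values < q1 - k * iqr) | (values > q3 + k * iqr)))


def scale_column(values):
    """Return (scaled_values, method_name) for one numeric column."""
    values = np.asarray(values, dtype=float)
    if has_outliers(values):
        std = values.std()  # population std, as StandardScaler uses
        scaled = (values - values.mean()) / std if std > 0 else np.zeros_like(values)
        return scaled, "standard"
    span = values.max() - values.min()
    scaled = (values - values.min()) / span if span > 0 else np.zeros_like(values)
    return scaled, "minmax"
```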

Feature Selection
• Initially we had 255 features.
• 1st drop: dropped columns that are 95% null, reducing to 66 features.
• 2nd drop: dropped columns that cause the model to memorize (the main reason we are using
MCDM), reducing to 31 features.
• We used RandomForestClassifier to filter out noise, leaving only the necessary features,
but we still need to perform feature selection because of aligning around data points.
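Noise filtering with RandomForestClassifier is commonly done through its impurity-based `feature_importances_`. The sketch below assumes that approach; the function name and hyperparameters are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def rank_by_forest_importance(X, y, feature_names, seed=0):
    """Fit a random forest and return (feature, importance) pairs, best first."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X, y)
    importances = forest.feature_importances_  # impurity-based, sums to 1
    order = np.argsort(importances)[::-1]
    return [(feature_names[i], float(importances[i])) for i in order]
```

Features with importance near zero are candidates for removal before the MCDM ranking step.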

Limitations of Existing Feature Selection
Methods
• Typically rely on standard statistical methods (Information Gain, Mutual Information, Chi-
Square, etc.).
• Usually focus on global accuracy, neglecting:
• Real-time computational constraints.
• Domain-specific interpretability.
• Criticality and impact of individual attacks.
• Limited adaptability and robustness to multiclass attack scenarios.

Multi-Criteria Decision-Making
• Multi-Criteria Decision-Making (MCDM): powerful decision support methods considering
multiple conflicting criteria.
• Popular MCDM Techniques:
• Fuzzy Analytic Hierarchy Process (Fuzzy-AHP)
• Fuzzy Technique for Order of Preference by Similarity to Ideal Solution (Fuzzy-TOPSIS)
• Why MCDM in Intrusion Detection?
• Balances various practical criteria: computational cost, interpretability, criticality, and
accuracy impact.
• Provides adaptive, robust, and interpretable feature selection.
• AWID3 dataset: comprehensive data for testing Wi-Fi-specific attacks, but rarely explored
from an MCDM perspective.
• Need for adaptive feature selection addressing the complexities of real-world multiclass
intrusion detection scenarios.

Multi-Criteria Decision-Making
• Selecting the most effective and adaptive feature subset using a Multi-Criteria Decision
Making (MCDM) framework, specifically using Fuzzy AHP for criteria weighting and Fuzzy
TOPSIS for ranking.
• Criteria:
• C1: Computational Cost
How computationally expensive or time-consuming is it to extract the feature? (lower
cost = better)
• C2: Interpretability
How easily interpretable or understandable is the feature to a security analyst? (high
interpretability = better)
• C3: Criticality (attack sensitivity)
How critical or influential is the feature in detecting specific attacks? (higher
criticality = better)
• C4: Accuracy Impact
How significantly does the feature contribute to overall classification accuracy?
(higher accuracy impact = better)

Assigning Criteria Weights
• Entropy Weight Method (EWM): an objective weighting
method. EWM calculates weights directly from the dataset
itself based on the degree of variation (entropy) within the
data.
• How to populate the decision matrix:
• Computational Cost: measure computational runtime
per feature extraction (normalized).
• Interpretability: number of domain literature references,
or easily interpretable (1/0).
• Criticality & Accuracy Impact: use statistical metrics.
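The standard Entropy Weight Method can be sketched with NumPy as follows. Rows are features (the alternatives), columns are criteria, and the matrix is assumed to be non-negative with non-zero column sums. Each column is normalized to proportions, its entropy is computed, and criteria with lower entropy (more variation) receive higher weight. The function name is hypothetical.

```python
import numpy as np


def entropy_weights(matrix):
    """Return one weight per criterion (column) using the Entropy Weight Method."""
    X = np.asarray(matrix, dtype=float)
    m = X.shape[0]                      # number of alternatives (features)
    P = X / X.sum(axis=0)               # column-wise proportions
    logs = np.zeros_like(P)
    np.log(P, out=logs, where=P > 0)    # treat 0 * log(0) as 0
    entropy = -(P * logs).sum(axis=0) / np.log(m)
    divergence = 1.0 - entropy          # degree of variation per criterion
    return divergence / divergence.sum()
```

A criterion whose values are identical across all features has entropy 1 and therefore weight 0; if every criterion is constant, the weights are undefined.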

Feature Ranking Using Fuzzy TOPSIS
• tcp.flags.ack 0.907168
• ip.src 0.847907
• ip.ttl 0.847480
• tcp.ack 0.832926
• frame.len 0.820400
• radiotap.timestamp.ts 0.740591
• frame.time_epoch 0.732114
• ip.dst 0.668655
• tcp.srcport 0.651035
• wlan_radio.duration 0.647041
• frame.time 0.627127
• wlan.fc.type 0.618783
• tcp.time_relative 0.605249
• tcp.seq 0.560216
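Scores like these are closeness coefficients. A minimal Fuzzy TOPSIS sketch with triangular fuzzy numbers is given below, under several assumptions that the slides do not confirm: crisp scores are fuzzified with a fixed ±10% spread, criterion weights are crisp (as produced by an entropy weighting step), FPIS and FNIS are taken as the element-wise maximum and minimum of the weighted matrix, and distances use the vertex method. All scores must be positive.

```python
import numpy as np


def fuzzify(crisp, spread=0.1):
    """Turn crisp scores into triangular fuzzy numbers (l, m, u); spread is assumed."""
    crisp = np.asarray(crisp, dtype=float)
    return np.stack([crisp * (1 - spread), crisp, crisp * (1 + spread)], axis=-1)


def fuzzy_topsis(tfn, weights, is_benefit):
    """Closeness coefficient per alternative; `tfn` has shape (alternatives, criteria, 3)."""
    tfn = np.asarray(tfn, dtype=float)
    weights = np.asarray(weights, dtype=float)
    normalized = np.empty_like(tfn)
    for j, benefit in enumerate(is_benefit):
        l, m, u = tfn[:, j, 0], tfn[:, j, 1], tfn[:, j, 2]
        if benefit:   # higher is better: divide by the largest upper bound
            normalized[:, j] = np.stack([l, m, u], axis=1) / u.max()
        else:         # lower is better: smallest lower bound over each value
            normalized[:, j] = l.min() / np.stack([u, m, l], axis=1)
    weighted = normalized * weights[None, :, None]
    fpis = weighted.max(axis=0)   # fuzzy positive ideal solution
    fnis = weighted.min(axis=0)   # fuzzy negative ideal solution
    # Vertex distance between triangular numbers, summed over criteria.
    d_plus = np.sqrt(((weighted - fpis) ** 2).mean(axis=2)).sum(axis=1)
    d_minus = np.sqrt(((weighted - fnis) ** 2).mean(axis=2)).sum(axis=1)
    return d_minus / (d_plus + d_minus)


# Toy example: three features scored on cost (lower is better) and accuracy impact.
scores = [[1, 9], [5, 5], [9, 1]]
closeness = fuzzy_topsis(fuzzify(scores), weights=[0.5, 0.5], is_benefit=[False, True])
```

Features are then ranked by descending closeness coefficient, with values nearer 1 lying closer to the ideal solution.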

Feature Ranking Using Statistical methods
• 'ip.ttl', 'ip.src', 'wlan.fc.type', 'tcp.ack', 'frame.time_epoch', 'frame.len', 'frame.time',
'radiotap.timestamp.ts', 'frame.number', 'wlan.fc.protected', 'tcp.time_relative',
'wlan_radio.data_rate', 'tcp.srcport', 'wlan.ta', 'ip.dst'
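The two selections can be compared directly with set operations. The lists below are copied from the two ranking slides; only the variable names are new.

```python
mcdm_features = [
    "tcp.flags.ack", "ip.src", "ip.ttl", "tcp.ack", "frame.len",
    "radiotap.timestamp.ts", "frame.time_epoch", "ip.dst", "tcp.srcport",
    "wlan_radio.duration", "frame.time", "wlan.fc.type", "tcp.time_relative",
    "tcp.seq",
]
statistical_features = [
    "ip.ttl", "ip.src", "wlan.fc.type", "tcp.ack", "frame.time_epoch",
    "frame.len", "frame.time", "radiotap.timestamp.ts", "frame.number",
    "wlan.fc.protected", "tcp.time_relative", "wlan_radio.data_rate",
    "tcp.srcport", "wlan.ta", "ip.dst",
]

shared = sorted(set(mcdm_features) & set(statistical_features))
mcdm_only = sorted(set(mcdm_features) - set(statistical_features))
statistical_only = sorted(set(statistical_features) - set(mcdm_features))

print(len(shared), "shared:", shared)
print(len(mcdm_only), "MCDM only:", mcdm_only)
print(len(statistical_only), "statistical only:", statistical_only)
```

The two methods agree on 11 features. Fuzzy TOPSIS alone selects tcp.flags.ack, tcp.seq, and wlan_radio.duration, while the statistical ranking alone selects frame.number, wlan.fc.protected, wlan.ta, and wlan_radio.data_rate.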

Model Summary
Dataset: AWID3 Dataset (IEEE 802.11 enterprise networks)
Preprocessing Steps: Missing Value Handling, Normalization, Encoding
Initial Features: 31 Features extracted from AWID3
Criteria for MCDM: Computational Cost, Interpretability, Criticality, Accuracy Impact
Weight Calculation: Entropy Weight Method (Objective and data-driven)
Feature Ranking Method: Fuzzy TOPSIS (Triangular Fuzzy Numbers)
Final Selected Features: Top-ranked features based on closeness coefficient
ML & DL Models Used: Random Forest, XGBoost, CNN
Evaluation Metrics: Accuracy, Precision, Recall, F1-Score, ROC
Key Contributions:
• Proposed a novel adaptive MCDM-based feature selection framework.
• Improved robustness and interpretability of Wi-Fi multiclass intrusion detection.
• Demonstrated significant performance improvement over traditional feature selection methods.

Conclusion
• Results with MCDM-selected features:
• Accuracy: 0.9703
• Precision: 0.9614
• Recall: 0.9703
• F1-Score: 0.9650
• Results with features selected by statistical methods:
• Accuracy: 0.9661
• Precision: 0.9546
• Recall: 0.9661
• F1-Score: 0.9591
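In both result sets the recall equals the accuracy, which is what class-weighted averaging produces in a multiclass setting, since weighted recall reduces to the share of correctly classified samples. Whether weighted averaging was actually used is an assumption. Under that assumption, the metrics could be computed with scikit-learn as follows; the function name is hypothetical.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def weighted_report(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1 for multiclass labels."""
    accuracy = accuracy_score(y_true, y_pred)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="weighted", zero_division=0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

With an imbalanced attack-to-normal ratio, weighted averages are dominated by the majority class, so per-class or macro-averaged scores give a clearer view of how well rare attacks are detected.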

Future Work

• Implementation of Specialized Models
• Develop 13 specialized (attack-specific) models, each trained
for a specific intrusion attack (Evil Twin, Kr00k, Deauth, etc.).
• Combine predictions of these specialized models to further boost
classification accuracy.
• Enhanced Multi-Criteria Decision Making (MCDM)
• Further optimize and refine the MCDM-based feature selection process.
• Investigate additional MCDM methods (e.g., VIKOR, PROMETHEE) to
potentially increase the accuracy and interpretability of selected features.
• Continuous Learning and Adaptation
• Develop mechanisms for models to adapt continuously to emerging Wi-Fi
intrusion threats.
• Information Gain, Gini Index, Mutual Information, Correlation Coefficient, and ReliefF
score are five popular metrics that we will use to further improve efficiency.
