GNN
PROACTIVE VULNERABILITY DETECTION USING GRAPH NEURAL NETWORKS (GNNS)
Transforming Software Security with AI
XHIELD.TECH
BY RANJAN KUMAR BAISAK
THE CHALLENGE
Codebases are becoming massive and
complex.
Traditional static analysis tools often
miss vulnerabilities.
Security testing is reactive, not
proactive.
NEURAL NETWORKS IN CODE ANALYSIS
How Neural Networks Help:
Learn patterns of insecure code
from historical data.
Generalize across languages and
styles.
Predict potential vulnerabilities early
in the SDLC (Software Development
Life Cycle).
WHY GRAPH NEURAL NETWORKS (GNNS)?
Code is a Graph:
Code can be represented as ASTs
(Abstract Syntax Trees), Control Flow Graphs,
or Call Graphs.
GNNs understand structured data:
Nodes = code elements (functions, variables,
classes)
Edges = relationships (calls, dependencies,
inheritance)
GNNs operate on this graph structure directly,
whereas CNNs and RNNs treat code as a grid or a
flat token sequence.
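To make "code is a graph" concrete, here is a minimal sketch that turns a snippet into nodes and edges. It assumes Python source and uses the standard-library `ast` module; the helper name `ast_to_graph` is made up for this illustration, and a real pipeline would add control-flow and call edges on top.

```python
import ast

def ast_to_graph(source: str):
    """Return (nodes, edges) for the AST of `source`.

    nodes: list of AST node-type names, indexed by node id
    edges: list of (parent_id, child_id) pairs
    """
    tree = ast.parse(source)
    nodes, edges = [], []

    def visit(node, parent_id=None):
        node_id = len(nodes)
        nodes.append(type(node).__name__)
        if parent_id is not None:
            edges.append((parent_id, node_id))
        for child in ast.iter_child_nodes(node):
            visit(child, node_id)

    visit(tree)
    return nodes, edges

# Toy input: one function that calls another.
nodes, edges = ast_to_graph("def f(x):\n    return g(x)\n")
for parent, child in edges:
    print(f"{nodes[parent]} -> {nodes[child]}")
```

Every node except the root has exactly one parent here, so the result is a tree; control-flow and data-flow edges are what turn it into a general graph.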
HOW GNNS ANALYZE CODE
Parse code into a graph (AST/CFG/PDG)
Initialize node embeddings (syntax, type
info, etc.)
Message passing between nodes (learn
context)
Predict vulnerability scores at node or
graph level
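The pipeline on this slide (graph, node embeddings, message passing, score) can be sketched in plain Python. This is an illustration under strong assumptions: mean aggregation stands in for a learned GNN layer, the readout weights are hand-picked rather than trained, and the function names are hypothetical.

```python
import math

def message_passing(features, edges, rounds=2):
    """Mean-aggregation message passing on an undirected graph.

    features: list of equal-length float vectors, one per node
    edges: list of (u, v) node-id pairs
    """
    n = len(features)
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    h = [list(f) for f in features]
    for _ in range(rounds):
        new_h = []
        for i in range(n):
            group = [h[i]] + [h[j] for j in neighbors[i]]  # self + neighbors
            new_h.append([sum(col) / len(group) for col in zip(*group)])
        h = new_h
    return h

def graph_score(h, weights, bias=0.0):
    """Graph-level readout: mean-pool node states, then a logistic unit."""
    pooled = [sum(col) / len(h) for col in zip(*h)]
    z = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Toy path graph 0 - 1 - 2; only node 0 carries a "tainted" feature.
features = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
edges = [(0, 1), (1, 2)]
h1 = message_passing(features, edges, rounds=1)
h2 = message_passing(features, edges, rounds=2)
score = graph_score(h2, weights=[1.0, 0.0])  # untrained, illustrative weights
```

After one round node 2 still sees nothing of node 0; after two rounds it does. Information travels one hop per round, so the number of rounds bounds how far a dependency such as a taint flow can be tracked.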
PROPER GNN DESIGN FOR VULNERABILITY DETECTION
Graph Construction: AST + semantic information (e.g.,
variable types, data flow)
Rich Node Features: Token types, function names, data
types
Deep Message Passing: Capture long-range
dependencies (e.g., taint flows)
Attention Mechanisms: Focus on critical code paths
Multi-task Learning: Predict multiple vulnerability types
at once
BENEFITS OF GNN-BASED DETECTION
Proactive: Flag previously unknown (zero-day)
vulnerabilities that resemble learned patterns.
Scalable: Works across large codebases automatically.
Explainable AI: Highlight suspicious code snippets
(important for developer trust).
REAL-WORLD APPLICATIONS
Facebook’s “SapFix” and “Getafix” for
automated bug fixing
Microsoft’s “DeepVul” model for
vulnerability detection
Open-source tooling built on Code
Property Graphs (CPG), such as Joern
AI is a tool, not a threat!
CHALLENGES AND LIMITATIONS
Labelled data scarcity for vulnerabilities
Imbalanced datasets (few vulnerabilities vs lots of clean
code)
Risk of false positives/negatives
Model interpretability
FUTURE OPPORTUNITIES
Combining GNNs with Large Language Models (LLMs)
Dynamic analysis + static GNN models
Automated code patch suggestions
Self-training with weak supervision
CONCLUSION
GNNs represent a powerful frontier for proactive
vulnerability detection.
With the right design and training, GNNs can shift
security left, saving organizations millions.
"Think like an attacker, code like a graph!"
THANK YOU!
REAL-WORLD APPLICATIONS
CHALLENGES
OR
THE IMPORTANCE OF PRECISION
Why it matters:
High false positive rates = Developer fatigue
False trust can be worse than no detection
Security tools must be reliable and explainable
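A short worked example shows why false positives dominate the developer experience. All counts below are hypothetical and chosen only to show the arithmetic: when vulnerable code is rare, even a modest false-alarm rate on clean code swamps the true findings.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of reported findings that are real vulnerabilities."""
    reported = true_positives + false_positives
    return true_positives / reported if reported else 0.0

# Hypothetical scan of 10,000 functions, of which 100 are vulnerable.
vulnerable = 100
clean = 9_900
detected = 90        # 90% of the vulnerable functions are flagged
false_alarms = 495   # 5% of the clean functions are flagged too

p = precision(detected, false_alarms)
print(f"precision = {p:.3f}")  # roughly 0.15
```

With these numbers more than five of every six alerts are noise, even though the detector finds 90% of the real issues.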
STRATEGY #1: BETTER GRAPH DESIGN
Combine AST + Control Flow Graph + Data Flow
Graph
Enrich nodes with:
Token type, data type, symbol role
API risk classification
Diagram: Side-by-side of AST vs Hybrid Graph
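One way to picture the hybrid graph is as a single multigraph whose edges carry a type label. The sketch below assumes the three edge lists have already been extracted over a shared set of node ids; the toy statements, feature names, and `build_hybrid_graph` helper are all illustrative.

```python
from collections import defaultdict

def build_hybrid_graph(ast_edges, cfg_edges, dfg_edges):
    """Merge three edge lists over the same node ids into one typed multigraph.

    Returns {src: [(dst, edge_type), ...]}.
    """
    graph = defaultdict(list)
    typed = (("AST", ast_edges), ("CFG", cfg_edges), ("DFG", dfg_edges))
    for edge_type, edge_list in typed:
        for src, dst in edge_list:
            graph[src].append((dst, edge_type))
    return dict(graph)

# Hand-written toy graph for:
#   0: function body
#   1: buf = read_input()
#   2: if len(buf) < N:
#   3:     copy(dst, buf)
ast_edges = [(0, 1), (0, 2), (2, 3)]   # syntactic nesting
cfg_edges = [(1, 2), (2, 3)]           # execution order
dfg_edges = [(1, 2), (1, 3)]           # buf defined at 1, used at 2 and 3

# Node enrichment as listed on the slide (values are illustrative).
node_features = {
    1: {"token_type": "assignment", "data_type": "bytes", "api_risk": "source"},
    3: {"token_type": "call", "data_type": "None", "api_risk": "sink"},
}

hybrid = build_hybrid_graph(ast_edges, cfg_edges, dfg_edges)
```

The AST alone has no edge from statement 1 to statement 3; the data-flow edge is what links the input source directly to the risky sink.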
STRATEGY #2: CLEAN AND BALANCED DATA
Use high-quality, labeled datasets (e.g., Juliet,
Devign, CodeXGLUE)
Address data imbalance:
Oversample rare vulnerabilities
Apply cost-sensitive loss functions
Visual: Pie chart of class imbalance and how
sampling improves it
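Both imbalance remedies on this slide fit in a few lines. The sketch below assumes binary labels (1 = vulnerable, 0 = clean) with both classes present; the inverse-frequency weighting shown is one common choice, not necessarily the one used in the deck.

```python
import math
import random

def class_weights(labels):
    """Inverse-frequency weights: n / (2 * class_count) for each class."""
    n = len(labels)
    pos = sum(labels)
    neg = n - pos
    return {1: n / (2 * pos), 0: n / (2 * neg)}

def weighted_bce(probs, labels, weights):
    """Cost-sensitive binary cross-entropy, averaged over samples."""
    eps = 1e-12
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        total += weights[y] * loss
    return total / len(labels)

def oversample(samples, labels, seed=0):
    """Duplicate random vulnerable samples until both classes are equal in size."""
    rng = random.Random(seed)
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg_count = len(labels) - len(pos)
    extra = [rng.choice(pos) for _ in range(neg_count - len(pos))]
    return list(samples) + extra, list(labels) + [1] * len(extra)

labels = [0] * 9 + [1]                      # 9 clean functions, 1 vulnerable
samples = [f"func_{i}" for i in range(10)]  # placeholder sample ids
w = class_weights(labels)
balanced_samples, balanced_labels = oversample(samples, labels)
```

With a 9:1 imbalance the vulnerable class gets weight 5.0, so missing a vulnerability costs nine times as much as an equally confident false alarm on clean code.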
STRATEGY #3: FOCUS WITH ATTENTION
Add attention layers to the GNN
Prioritize user input, dangerous function calls,
control paths
Highlight how attention reduces noise from
irrelevant code
Diagram: GNN with attention heatmap on code
graph
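The noise-reduction effect of attention can be shown with a simplified dot-product version. Graph attention layers such as GAT learn their scoring function; here the `query` vector is hand-set and the neighbor features are toy values, so treat this as an illustration of the weighting step only.

```python
import math

def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, neighbor_feats):
    """Dot-product attention: score each neighbor, then take a weighted sum."""
    scores = [sum(q * x for q, x in zip(query, f)) for f in neighbor_feats]
    alphas = softmax(scores)
    dim = len(query)
    aggregated = [
        sum(a * f[d] for a, f in zip(alphas, neighbor_feats)) for d in range(dim)
    ]
    return alphas, aggregated

# Feature layout (illustrative): [is_dangerous_call, is_boilerplate]
neighbor_feats = [
    [1.0, 0.0],  # e.g. an unchecked copy call
    [0.0, 1.0],  # e.g. a logging statement
    [0.0, 1.0],  # e.g. another logging statement
]
query = [2.0, 0.0]  # hand-set stand-in for a learned attention vector

alphas, aggregated = attend(query, neighbor_feats)
```

The dangerous call receives most of the attention weight, so the aggregated message is dominated by it rather than by the two boilerplate neighbors. Plain mean aggregation would have given it only one third.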
STRATEGY #4: POST-PREDICTION FILTERING
Rule-based filtering after GNN output:
Example: Suppress a finding if the flagged input is already sanitized
Hybrid model = AI + domain rules
Benefits:
Remove obvious FPs
Improve trust in model output
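A post-prediction filter can be as simple as a function applied to the model's findings. In the sketch below the `Finding` fields, the sanitizer names, and the 0.5 threshold are all assumptions made for illustration; a real system would derive the sanitizers on a path from data-flow analysis.

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    function: str
    score: float                      # model output in [0, 1]
    sink: str                         # API the tainted value reaches
    sanitizers_on_path: list = field(default_factory=list)

# Hypothetical project-specific list of trusted sanitizers.
KNOWN_SANITIZERS = {"escape_html", "quote_sql"}

def post_filter(findings, threshold=0.5):
    """Keep findings that score above the threshold and are not already sanitized."""
    kept = []
    for f in findings:
        if f.score < threshold:
            continue  # model is not confident enough
        if any(s in KNOWN_SANITIZERS for s in f.sanitizers_on_path):
            continue  # domain rule: input is sanitized before the sink
        kept.append(f)
    return kept

findings = [
    Finding("render_profile", 0.91, "write_html", ["escape_html"]),
    Finding("run_query", 0.88, "execute_sql"),
    Finding("load_config", 0.22, "open"),
]
reported = post_filter(findings)
```

Only `run_query` survives: one finding is dropped by the rule and one by the score threshold. Keeping the rules separate from the model makes each suppression easy to audit.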
STRATEGY #5: EXPLAINABILITY
Use GNNExplainer or saliency maps for:
Highlighting vulnerable code paths
Making predictions interpretable
Screenshot: Sample output with highlighted risky
lines
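As a rough stand-in for saliency maps, the sketch below uses occlusion: mask one node at a time and record how much the score drops. This is not GNNExplainer, which learns edge and feature masks; the scoring function, its weights, and the features are toy values chosen for illustration.

```python
import math

def toy_score(features, weights=(3.0, -1.0)):
    """Stand-in for a trained model: mean-pool node features, then a logistic unit."""
    pooled = [sum(col) / len(features) for col in zip(*features)]
    z = sum(w * x for w, x in zip(weights, pooled))
    return 1.0 / (1.0 + math.exp(-z))

def occlusion_saliency(score_fn, features):
    """Score drop when each node's features are zeroed out, one node at a time."""
    base = score_fn(features)
    saliency = []
    for i in range(len(features)):
        masked = [[0.0] * len(f) if j == i else f for j, f in enumerate(features)]
        saliency.append(base - score_fn(masked))
    return saliency

# One node per source line; feature layout (illustrative): [risk_signal, benign_signal]
lines = ["buf = read_input()", "log(buf)", "strcpy(dst, buf)"]
features = [[0.2, 0.0], [0.0, 1.0], [1.0, 0.0]]

saliency = occlusion_saliency(toy_score, features)
riskiest = lines[saliency.index(max(saliency))]
for line, s in zip(lines, saliency):
    print(f"{s:+.3f}  {line}")
```

The line whose removal lowers the score the most is the one to highlight for the developer. A negative value means the line was pulling the score down.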
STRATEGY #6: FEEDBACK LOOP
Deploy GNN with human feedback
Collect true/false positive flags from developers
Periodically fine-tune model using this data
Visual: Lifecycle diagram of GNN improvement via
feedback
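The bookkeeping behind the feedback loop might look like the sketch below. The record fields, the verdict codes "tp" and "fp", and the retraining trigger are all hypothetical; the fine-tuning step itself is left out.

```python
from dataclasses import dataclass

@dataclass
class Feedback:
    finding_id: str
    predicted_score: float
    developer_verdict: str  # "tp" (real issue) or "fp" (false alarm)

def observed_precision(feedback):
    """Share of reviewed findings that developers confirmed as real."""
    if not feedback:
        return 0.0
    confirmed = sum(1 for f in feedback if f.developer_verdict == "tp")
    return confirmed / len(feedback)

def fine_tuning_batch(feedback):
    """Turn developer verdicts into (finding_id, label) training pairs."""
    return [(f.finding_id, 1 if f.developer_verdict == "tp" else 0) for f in feedback]

RETRAIN_AFTER = 3  # hypothetical: minimum new labels before fine-tuning

log = [
    Feedback("F-1", 0.91, "tp"),
    Feedback("F-2", 0.77, "fp"),
    Feedback("F-3", 0.64, "fp"),
]
prec = observed_precision(log)
batch = fine_tuning_batch(log)
should_retrain = len(log) >= RETRAIN_AFTER
```

Tracking observed precision over time also shows whether each fine-tuning round actually reduced false alarms.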
STRATEGY #7: ENSEMBLE MODELS
Combine multiple GNN types (GAT, GCN,
GraphSAGE)
Combine their predictions by majority voting or
learned fusion
Lower model variance = fewer false alarms
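Both combination schemes are short to write down. In this sketch the per-model probabilities and the fusion weights are made-up numbers; in a learned fusion the weights would be fitted on held-out data rather than fixed by hand.

```python
def majority_vote(votes):
    """votes: list of 0/1 predictions, one per model. Ties count as not vulnerable."""
    return 1 if 2 * sum(votes) > len(votes) else 0

def weighted_fusion(probs, weights):
    """Weighted average of model probabilities."""
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, probs)) / total

# Hypothetical outputs of three GNN variants for the same function.
model_probs = {"GAT": 0.82, "GCN": 0.46, "GraphSAGE": 0.71}

votes = [int(p >= 0.5) for p in model_probs.values()]
decision = majority_vote(votes)
fused = weighted_fusion(list(model_probs.values()), weights=[0.5, 0.2, 0.3])
```

Treating ties as "not vulnerable" is a deliberate bias toward fewer false alarms; a team that prefers recall would flip that choice.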
SUMMARY TABLE
FINAL THOUGHTS
GNNs are powerful, but not perfect.
Combining machine learning + human insight is key.
The goal: Actionable, accurate, explainable
vulnerability detection.
REFERENCES
GNNs for Code Representation & Vulnerability Detection
[1] Allamanis, M., Barr, E. T., Devanbu, P., & Sutton, C. (2018). A Survey of Machine Learning for Big Code and
Naturalness.
DOI: 10.1145/3212695
Overview of ML and GNNs for code representation.
[2] Zhou, Y., Liu, S., Siow, J., Du, X., & Liu, Y. (2019). Devign: Effective Vulnerability Identification by Learning Comprehensive
Program Semantics via Graph Neural Networks.
Introduced GNN-based vulnerability detection using joint AST/CFG models.
[3] Lin, Z., Sun, Y., Wang, H., Wang, Z., & Liu, X. (2020). Graph-based Deep Learning for Software Vulnerability Detection: A
Survey.
Comprehensive survey of graph-based vulnerability detection methods.
Graph Construction & Feature Engineering
[4] Fernandes, E., Pauck, F., & Bodden, E. (2022). A Review of Graph Representations for Source Code.
Discusses AST, PDG, DFG, and hybrid graph approaches.
https://arxiv.org/abs/2211.03138
[5] Yamaguchi, F., Golde, N., Arp, D., & Rieck, K. (2014). Modeling and discovering vulnerabilities with code property graphs.
https://www.usenix.org/system/files/conference/sp14/sp14-paper-yamaguchi.pdf
Seminal work introducing Code Property Graphs (CPG) for vulnerability mining.
Reducing False Positives
[6] Demetrio, L., Pascarella, L., Palomba, F., & Russo, B. (2021). An Empirical Evaluation of Vulnerability Prediction
Models Using Real-World Data.
https://arxiv.org/abs/2103.06788
Highlights the need for realistic training data and discusses model overfitting and false positives.
[7] Shastry, S., & Sankaranarayanan, S. (2022). Improving Software Vulnerability Detection using Ensemble Learning.
Shows benefits of combining multiple models to reduce noise and improve accuracy.
[8] Wang, S., Liu, S., Yang, J., Zhang, X., & Chen, Z. (2022). AlphaVul: Exploiting Attention and Multi-View Graph Learning
for Vulnerability Detection.
https://arxiv.org/abs/2203.05396
Demonstrates the use of attention layers in GNNs for code vulnerability detection.
Explainability in GNNs
[9] Ying, R., Bourgeois, D., You, J., Zitnik, M., & Leskovec, J. (2019). GNNExplainer: Generating Explanations for Graph Neural
Networks.
[10] Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why Should I Trust You?” Explaining the Predictions of Any Classifier.
CONTACT
RANJAN KUMAR BAISAK
RANJAN.BAISAK@GMAIL.COM
+919880398951


Proactive Vulnerability Detection in Source Code Using Graph Neural Networks: Reducing False Positives and Improving Reliability
