Leveraging Pre-Trained
Transformer Models for
Protein Function
Prediction
Tia Pope
I am not a
biochemist, but…
• Cybersecurity
• Healthcare
• Software Engineering
• Interoperability
• Machine Intelligence
I computationally
study proteins…
Protein AI
• Novel Proteins
• In Silico Evaluation
• 3rd Year Ph.D. Candidate
• Collaborations with J&J, Lilly, MIT LL
My
Background
Protein Wars
The Good Side – Our Body’s Defenders
• Antibodies – fight off viruses and bacteria.
• Enzymes – builders and repair workers
• Hormones – messengers telling our body what to do.
The Enemy – Bad or Sneaky Proteins
• Virus Proteins – Like spies (e.g., COVID-19's spike
protein) that trick our cells into letting them in.
• Mutated Proteins – Sometimes, our own proteins go
rogue and cause diseases (like cancer).
• Misfolded Proteins – When proteins fold wrong, they
can damage the brain (like in Alzheimer’s).
Image: https://mutantreviewersmovies.com/2022/06/07/osmosis-jones-2001-what-you-never-wanted-to-know-about-your-own-body/
Image: https://www.pbs.org/wnet/americanmasters/rod-serling-about-rod-serling/702/
Leveraging Pre-Trained Transformer Models for Protein Function Prediction - Tia Pope, North Carolina A&T
Is COVID-19 a Protein? No, but the virus behind it (SARS-CoV-2) uses them.
• Membrane (M) protein – Helps
shape the virus.
• Envelope (E) protein – Plays a
role in viral assembly and release.
• Nucleocapsid (N) protein – Binds
to the viral RNA genome.
• Spike (S) protein – Helps the virus
attach to and enter human cells.
Enter Artificial Intelligence
“Open-Source Tools”
Advances in Biochemistry
AlphaFold revolutionized protein structure prediction by using deep learning to model 3D protein structures from amino acid sequences with high accuracy, cutting the time to obtain a structure from months or years of experimental work to hours or minutes of computation.
There’s…
But there is also…
Feature | RNN (Recurrent Neural Network) | Transformer
Processing | Sequential (one word at a time) | Parallel (all words at once)
Speed | Slower (can't process in parallel) | Faster (processes entire input at once)
Long-Range Dependencies | Struggles with long sentences (information gets lost over time) | Captures long-range dependencies efficiently
Memory Usage | Uses less memory per step but needs more steps | Requires more memory but fewer steps
Training Efficiency | Harder to train (vanishing gradient problem) | Easier to train with large datasets
Example Task | Predicting the next word in a sequence | Translating entire paragraphs accurately
Real-World Example | Older chatbots, speech recognition (e.g., Siri’s early versions) | GPT, BERT, modern AI chatbots
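The "sequential versus parallel" row in the comparison above can be made concrete with a toy sketch. The code below is an illustration only, not taken from the talk: `rnn_states` is a one-dimensional recurrent cell with made-up weights, and `self_attention` is scaled dot-product attention with identity projections (no learned Q/K/V matrices), written in plain Python so it runs without extra libraries.

```python
import math


def rnn_states(inputs, w_in=0.5, w_rec=0.9):
    """Toy scalar RNN. Each state needs the previous state,
    so the loop cannot be reordered or run in parallel."""
    h, states = 0.0, []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.
    Every output row reads all input rows directly, and the rows are
    independent of each other, so they could be computed in parallel."""
    d = len(vectors[0])
    out = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(d)])
    return out
```

In the RNN, information from the first input reaches the last state only by passing through every intermediate step, which is where long-range signal tends to fade. In attention, the first and last positions interact in a single step, at the cost of comparing every position with every other one.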
AI for Treatments
AI dramatically accelerated experimentation, testing and vaccine development. One example is its role in helping bring COVID-19 vaccines to the public in less than a year, a process that usually takes many years.
Protein Function
Using AI Prediction Tools
Protein
Production
A natural biological process that we do not fully understand.
The majority of known protein sequences lack experimental validation and are labeled as hypothetical or uncharacterized proteins.
Image: https://slideplayer.com/slide/15368149/
Every BODY is Different & More to STUDY
https://www.thebureauinvestigates.com/stories/2024-09-16/superbugs-will-kill-three-every-minute-by-2050
Protein Function Prediction
Helps us understand biological processes, develop new drugs, diagnose
diseases and engineer novel proteins for medicine, biotechnology and
environmental applications.
• Disease Research: Identifies disease-related proteins, aiding in treatments for
cancer, Alzheimer’s, and viral infections (e.g., COVID-19, Superbug X).
• Drug Discovery: Helps design targeted drugs by identifying protein interactions and binding sites, or by identifying novel sequences and variants of existing proteins.
• Synthetic Biology: Enables the design of custom proteins for bioengineering,
like enzymes for sustainable fuels.
• Environmental Science: Helps in breaking down pollutants or engineering
bacteria for waste management.
ESM: Evolutionary Scale Modeling
ProtGPT2: Built on GPT-2, not ProteinGPT
ProtGPT2 for Novel Protein Generation
• Install the required libraries
• Load the ProtGPT2 model from Hugging Face (Pre-Trained)
• Provide a seed sequence (optional) or generate a new one
• Generate a new protein sequence
Pre-Trained Decoder
Transformer
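The four steps listed above can be sketched roughly as follows. This is a minimal sketch under assumptions, not the speaker's notebook: it assumes `pip install transformers torch`, the public Hugging Face checkpoint `nferruz/ProtGPT2`, and the sampling settings suggested on that model card. The helper names (`clean_generated`, `is_valid_protein`, `generate_proteins`) are hypothetical.

```python
import re

# The 20 standard amino acids in one-letter code.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def clean_generated(text: str) -> str:
    """Remove the end-of-text token and all whitespace/newlines from model output."""
    return re.sub(r"\s+", "", text.replace("<|endoftext|>", ""))


def is_valid_protein(seq: str) -> bool:
    """True if the sequence is non-empty and uses only standard residues."""
    return len(seq) > 0 and set(seq) <= STANDARD_AA


def generate_proteins(seed: str = "", n: int = 3, max_length: int = 100):
    """Sample n sequences from ProtGPT2, optionally continuing a seed.
    Downloads the checkpoint on first use (assumed id: nferruz/ProtGPT2)."""
    from transformers import pipeline  # imported lazily; only needed for generation

    generator = pipeline("text-generation", model="nferruz/ProtGPT2")
    prompt = seed if seed else "<|endoftext|>"  # no seed: start from the special token
    outputs = generator(
        prompt,
        max_length=max_length,      # length in tokens, not residues
        do_sample=True,
        top_k=950,
        repetition_penalty=1.2,
        num_return_sequences=n,
        eos_token_id=0,
    )
    return [clean_generated(o["generated_text"]) for o in outputs]
```

A call such as `generate_proteins("MKTLLLTLV")` would return sampled continuations of the seed, while `generate_proteins()` generates unconditionally. Because sampling is random, the outputs differ between runs, and filtering with `is_valid_protein` is one simple way to drop sequences containing non-standard symbols.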
ProtGPT2
Example 1:
• Input (Seed Sequence): "MKTLLLTLV"
• Output (Generates): "MKTLLLTLVVVTIVCLDLGYTGTVNNSM..."
Example 2:
• Input (Unconditional or No Seed): " "
• Output (Generates): "MGRKYLTVASM..."
The Actual 3D Structure
Predicted with ESMFold
Using ESM for Function Prediction
• Install the ESM model dependencies
• Load the ESM model from Hugging Face (Pre-Trained)
• Prepare the test sequence
• Run the function prediction
• Interpret the output
Example:
• Input (Protein Sequence): "MKTLLLTLVVVTIVCLDLGYTGRKYLTVNNSM..."
• Output (Predicted Function): Enzyme (Kinase)
Gene Ontology: GO:0004674 (Protein Serine/Threonine Kinase Activity)
Pre-Trained Encoder
Transformer
Some ESM versions have decoder capabilities
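A rough sketch of the steps above follows. A pre-trained ESM encoder produces embeddings rather than function labels, so some classifier on top is needed. Here a nearest-neighbour lookup against a few labeled reference proteins stands in for that classifier; it is an illustrative substitute, not the method used in the talk. The checkpoint name `facebook/esm2_t6_8M_UR50D` and the helper names are assumptions.

```python
import math


def embed_sequences(seqs, model_name="facebook/esm2_t6_8M_UR50D"):
    """Return one mean-pooled ESM-2 embedding (list of floats) per sequence.
    Requires `pip install transformers torch`; the checkpoint name is an assumption."""
    import torch
    from transformers import AutoTokenizer, EsmModel

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = EsmModel.from_pretrained(model_name).eval()
    batch = tokenizer(list(seqs), return_tensors="pt", padding=True)
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state        # (batch, tokens, dim)
    # Average over real tokens only (padding is masked out; special tokens are included).
    mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
    return pooled.tolist()


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def predict_label(query, labeled):
    """Return (label, similarity) of the most similar labeled embedding.
    `labeled` is a list of (embedding, label) pairs."""
    best_vec, best_label = max(labeled, key=lambda pair: cosine(query, pair[0]))
    return best_label, cosine(query, best_vec)
```

In use, one would embed a handful of proteins with known annotations, embed the test sequence with the same function, and pass both to `predict_label`. The similarity score gives a crude sense of confidence. A low score suggests that none of the references resemble the query, in which case the predicted label should not be trusted.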
ESM (Encoder-Based) Use Cases
Task | Input | Output
Protein Folding (ESMFold) | Sequence | 3D structure (PDB)
Mutation Effects | Wild-type & mutant sequence | Mutation heatmap
Evolutionary Velocity | Two sequences | Evolutionary direction
Binding Site Prediction | Sequence | Binding site classification (0/1)
PPI Prediction | Two sequences | Interaction probability (0-1)
AI-Guided Protein Design | Protein motif | Designed full sequence
MSA-Free Homology Detection | Sequence | List of similar proteins
Function Prediction | Sequence | Biological function label
Intrinsic Dimension Analysis | Sequence | Complexity score
Persistent Homology Clustering | Protein embeddings | Cluster assignments
Fine-Tuning (LoRA) | Labeled protein sequences | Improved function prediction
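As one example from the use cases above, the "Mutation Effects" row is often approached by masking a position and comparing the model's log-probabilities for the wild-type and mutant residues. The sketch below shows that idea under assumptions: the checkpoint `facebook/esm2_t6_8M_UR50D`, the `A123G`-style mutation notation, and all function names are illustrative choices, not details from the slides.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def parse_mutation(mutation: str):
    """Split a point mutation such as 'L5P' into (wild_type, 1-based position, mutant)."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if m is None:
        raise ValueError(f"not a point mutation: {mutation!r}")
    return m.group(1), int(m.group(2)), m.group(3)


def log_odds(log_probs: dict, wild_type: str, mutant: str) -> float:
    """Mutant minus wild-type log-probability; negative means the model
    finds the mutant residue less likely at this position."""
    return log_probs[mutant] - log_probs[wild_type]


def masked_log_probs(sequence, position, model_name="facebook/esm2_t6_8M_UR50D"):
    """Mask one residue (1-based position) and return log-probabilities for
    each standard amino acid there. Requires transformers and torch."""
    import torch
    from transformers import AutoTokenizer, EsmForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = EsmForMaskedLM.from_pretrained(model_name).eval()
    batch = tokenizer(sequence, return_tensors="pt")
    # Token 0 is the <cls> token, so residue i (1-based) sits at token index i.
    batch["input_ids"][0, position] = tokenizer.mask_token_id
    with torch.no_grad():
        logits = model(**batch).logits[0, position]
    log_p = torch.log_softmax(logits, dim=-1)
    return {aa: log_p[tokenizer.convert_tokens_to_ids(aa)].item() for aa in AMINO_ACIDS}
```

Scoring every possible substitution at every position with `log_odds` and arranging the results in a positions-by-residues grid gives the kind of mutation heatmap named in the table. Before scoring, it is worth checking that `sequence[position - 1]` matches the parsed wild-type residue.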
Bacta Tank
• Learn to Leverage Pre-Trained Transformer Models for Protein Function Prediction
• Maybe we can work together to take down the next superbug…
• Or create a Bacta!
Connect. Fact Check. Try it Out! Contribute
Notebook &
Sample Data
Referenced
Papers & Articles
LinkedIn
