Empower Your AI Journey
Hands-on Machine Learning with fastai for Graphics and NLP
David vonThenen
Software Engineer/Developer Advocate
@dvonthenen
San Francisco, CA
May 29-30, 2024
David vonThenen
● Are you Human or AI?
● I want 5 Kubernetes
● Virtual Machines are Real
● Replacing Myself with Bots…
● Cloudy, cloudy, cloudy…
● There is storage for that!
Agenda
● What is fastai?
● Image Classification
● Natural Language Processing
○ Dataset Considerations
● Resources
● Q&A
What is fastai?
Accelerate Model Creation
fastai: Deep Learning Library for All
fastai - https://github.com/fastai/fastai
● Deep Learning Abstraction on top of PyTorch
● Top-Down Approach to Learning AI/ML
● Fast and Easy Neural Network Training
○ ML Model is Your Output
● Usability Without Sacrificing Performance
● Provides High-level Components
● Enables Transfer Learning with Minimal Code
● Audience: Everyone!
○ Beginners in Machine Learning
○ Rapid Prototyping for Experts
Great! What Do I Need to Know?
Core Components:
● Data Loaders - Flexible Data Loading and Transformations
● Learner - Training Models
Wide Variety of Domains (these are Data Loaders!)
● Images
● Text
● Tabular
● Collaborative Filtering
● Data Block (Generic/Custom)
● And more!
Opinionated Data, Pipeline, Memory Management
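The two core components and the domain list above can be sketched roughly as follows. The class names (`ImageDataLoaders`, `TextDataLoaders`, `TabularDataLoaders`, `CollabDataLoaders`, `DataBlock`) come from fastai itself; the helper names `DOMAIN_LOADERS`, `loader_class`, and `custom_image_dataloaders`, the 20% validation split, and the 224-pixel resize are illustrative assumptions, not something the slides specify. fastai is imported lazily so the snippet can be read and loaded without it installed.

```python
import importlib

# Domain from the slide -> (fastai module, DataLoaders class for that domain).
DOMAIN_LOADERS = {
    "images": ("fastai.vision.all", "ImageDataLoaders"),
    "text": ("fastai.text.all", "TextDataLoaders"),
    "tabular": ("fastai.tabular.all", "TabularDataLoaders"),
    "collaborative filtering": ("fastai.collab", "CollabDataLoaders"),
}


def loader_class(domain):
    """Return the fastai DataLoaders class for a domain (needs fastai installed)."""
    module_name, class_name = DOMAIN_LOADERS[domain.lower()]
    return getattr(importlib.import_module(module_name), class_name)


def custom_image_dataloaders(image_root):
    """Data Block route: describe the data generically, then build DataLoaders.

    Hypothetical helper; assumes one sub-folder per category under image_root.
    """
    from fastai.vision.all import (
        CategoryBlock, DataBlock, ImageBlock, RandomSplitter, Resize,
        get_image_files, parent_label,
    )

    block = DataBlock(
        blocks=(ImageBlock, CategoryBlock),               # input type, target type
        get_items=get_image_files,                        # collect image files under the root
        splitter=RandomSplitter(valid_pct=0.2, seed=42),  # assumed 80/20 split
        get_y=parent_label,                               # label = parent folder name
        item_tfms=Resize(224),                            # assumed size; makes images uniform
    )
    return block.dataloaders(image_root)
```

The resulting `DataLoaders` object is what gets handed to a Learner for training; the domain-specific classes are shortcuts for common cases, while `DataBlock` is the generic route.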
Image Classifier
Discussion and Demo
Problem to Blueprint
Image Classifier for Dogs (Boxer, German Shepherd, Golden Retriever)
● Need to Create a Dataset
○ 30 Unique Images of Each Type of Dog
○ Make Images Uniform
○ Incorrectly Downloaded Dogs?
● Build and Train the Model
○ Create Dataset Based on Categories
○ Augment Pre-Trained Model with Layers
○ Train the New Model (Apply Weights)
● Test the Model with Images
○ Success? Probability?
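A minimal sketch of the blueprint above, assuming the images sit in one folder per breed. The folder names, the `resnet18` backbone, the epoch count, and the helper names (`count_images`, `train_dog_classifier`, `classify`) are assumptions for illustration; the demo may well differ. The fastai calls used (`ImageDataLoaders.from_folder`, `vision_learner`, `fine_tune`, `predict`) are part of the library.

```python
from pathlib import Path

# Assumed folder names, one per breed from the slide.
BREEDS = ("boxer", "german_shepherd", "golden_retriever")
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def count_images(image_root, breeds=BREEDS):
    """Count image files per breed folder so gaps show up before training."""
    root = Path(image_root)
    counts = {}
    for breed in breeds:
        folder = root / breed
        files = folder.iterdir() if folder.is_dir() else []
        counts[breed] = sum(1 for p in files if p.suffix.lower() in IMAGE_EXTS)
    return counts


def train_dog_classifier(image_root, epochs=3):
    """Fine-tune a pre-trained network on the breed folders (needs fastai)."""
    from fastai.vision.all import (
        ImageDataLoaders, Resize, error_rate, resnet18, vision_learner,
    )

    # Categories come from the folder names; Resize makes the images uniform.
    dls = ImageDataLoaders.from_folder(
        image_root, valid_pct=0.2, seed=42, item_tfms=Resize(224)
    )
    # vision_learner adds a new classification head to the pre-trained body.
    learn = vision_learner(dls, resnet18, metrics=error_rate)
    learn.fine_tune(epochs)
    return learn


def classify(learn, image_path):
    """Return the predicted breed and the probability assigned to it."""
    label, index, probs = learn.predict(image_path)
    return str(label), float(probs[index])
```

Running `count_images` first is a cheap way to confirm each breed has roughly the 30 images the slide calls for; `classify` answers the "Success? Probability?" question for a single test image.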
Demo
Who Doesn't Love Dogs?!?
Natural Language Processing
Discussion and Demo
https://youtu.be/DC9Q6hlQiTs
Question vs Non-Question Sentences
How to Approach This Problem:
● Need to Create a Dataset
○ Big Problem: Where to Get Data?
○ Qs: Stanford Question Answering Dataset
■ Remove the Answers
○ Non-Qs: Use Answers OR Rand Wiki Pages
● Build and Train the Model
○ Using Tabular/DataFrame
■ 2 Cols: Sentence, Is Question
● Test the Model
○ Create a Q and Non-Q and Test!
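The dataset step above might look something like this. It assumes the standard SQuAD JSON layout (`data` → `paragraphs` → `qas`, each with a `question`, plus a `context` paragraph). Using sentences from the context paragraphs as non-questions is a stand-in for the "Answers OR Rand Wiki Pages" option, since SQuAD contexts are Wikipedia paragraphs. The naive sentence splitter, the helper names, and the training hyperparameters are illustrative assumptions; `TextDataLoaders.from_df` and `text_classifier_learner` are real fastai calls.

```python
import re


def squad_to_rows(squad):
    """Flatten SQuAD-style JSON into two columns: sentence, is_question."""
    rows = []
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            # Questions: keep the question text, drop the answers.
            for qa in paragraph["qas"]:
                rows.append({"sentence": qa["question"].strip(), "is_question": True})
            # Non-questions: naive split of the context on '.' or '!'.
            for sentence in re.split(r"(?<=[.!])\s+", paragraph["context"]):
                sentence = sentence.strip()
                if sentence and not sentence.endswith("?"):
                    rows.append({"sentence": sentence, "is_question": False})
    return rows


def train_question_classifier(rows, epochs=4):
    """Train a text classifier from the two-column rows (needs pandas and fastai)."""
    import pandas as pd
    from fastai.text.all import (
        AWD_LSTM, TextDataLoaders, accuracy, text_classifier_learner,
    )

    df = pd.DataFrame(rows, columns=["sentence", "is_question"])
    dls = TextDataLoaders.from_df(
        df, text_col="sentence", label_col="is_question", valid_pct=0.2, seed=42
    )
    learn = text_classifier_learner(dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy)
    learn.fine_tune(epochs)
    return learn
```

Testing then comes down to calling `learn.predict(...)` on one hand-written question and one hand-written statement, as the last bullet suggests.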
Demo
Identifying Questions in Text
https://youtu.be/jjzgaRIrYSg
Right Data for the Right Problem
Why Did We Fail At First?
● Dataset Needs to be Representative
○ Obtain the Right Kind of Data
○ Verify the Data is Correct
○ Modify the Data to Fit?!?
● Verify the Model
○ Test and Retest
○ More Data or More "Alignment"
○ Repeat!
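One way to make "Verify the Data is Correct" concrete is a small report over the two-column rows before any training. This is a sketch under assumptions: the row shape (`sentence`, `is_question`) follows the earlier slide, and the specific checks (label balance, duplicates, conflicting labels, share of questions ending in "?") are examples of things worth looking at, not the checks used in the talk. The last one flags a hypothetical mismatch: if every training question ends in a question mark but the real input does not, the data is not representative.

```python
from collections import Counter


def dataset_report(rows):
    """Summarize label balance, duplicates, and punctuation in a Q/non-Q dataset."""
    labels = Counter(row["is_question"] for row in rows)

    seen = {}        # normalized sentence -> first label seen for it
    duplicates = 0   # repeated sentences (after case/whitespace normalization)
    conflicts = 0    # repeated sentences whose label disagrees with the first
    for row in rows:
        key = " ".join(row["sentence"].lower().split())
        if key in seen:
            duplicates += 1
            if seen[key] != row["is_question"]:
                conflicts += 1
        else:
            seen[key] = row["is_question"]

    questions = [row["sentence"] for row in rows if row["is_question"]]
    with_mark = sum(1 for q in questions if q.rstrip().endswith("?"))

    return {
        "per_label": dict(labels),
        "duplicates": duplicates,
        "conflicting_labels": conflicts,
        "questions_ending_in_qmark": with_mark / len(questions) if questions else 0.0,
    }
```

Re-running the report after each round of data changes fits the "Test and Retest" loop: fix the data, retrain, check again.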
Getting Answers to Questions
Use Case: In Meetings, Phone Calls, etc., Use an LLM to Answer Questions
● Useful Because:
○ Provide "Level Set" on Participant Knowledge
○ Get Answers Now vs "Taking Offline"
○ Provide Other Lesser Known Facts
● LLM Could Be:
○ Geared Toward a Domain (e.g., Medical)
○ Trained on Company Docs
○ Generalized
Demo: Deepgram STT + OpenAI + Deepgram TTS
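The shape of the STT → LLM → TTS flow can be sketched without committing to any vendor SDK. Everything here is hypothetical scaffolding: `answer_questions` and its callable parameters are placeholders into which a speech-to-text transcript, a question classifier, an LLM call, and a text-to-speech call would be plugged. None of it is the Deepgram or OpenAI API, and routing only classified questions to the LLM is an assumption about how the pieces fit together.

```python
def answer_questions(utterances, is_question, ask_llm, speak=None):
    """Send only the utterances classified as questions to the LLM.

    utterances  -- iterable of transcript strings (e.g. output of an STT step)
    is_question -- callable(str) -> bool, e.g. a trained question classifier
    ask_llm     -- callable(str) -> str, wraps whichever LLM is in use
    speak       -- optional callable(str), e.g. a TTS step for the reply
    """
    answers = []
    for text in utterances:
        if not is_question(text):
            continue  # statements pass through without an LLM call
        reply = ask_llm(text)
        if speak is not None:
            speak(reply)
        answers.append((text, reply))
    return answers


if __name__ == "__main__":
    # Stand-in callables so the flow can be exercised offline.
    transcript = ["Let's get started.", "What is fastai?"]
    for question, reply in answer_questions(
        transcript,
        is_question=lambda t: t.rstrip().endswith("?"),
        ask_llm=lambda q: f"(LLM reply to: {q})",
        speak=print,
    ):
        print(question, "->", reply)
```

Keeping the three services behind plain callables makes it easy to swap a general-purpose LLM for a domain-specific one or one trained on company docs, the options listed above.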
Demo
Answering Questions with LLMs in Real-Time
Using Deepgram and OpenAI
https://youtu.be/NCDLFGmK9Sw
Resources
Materials, Links, Docs, Extras, etc.
All Of The Things
[CLICK HERE] for All Material Contained in this Session
All the Things in This Presentation
● Example: Which Breed of Dog?
● Example: Question vs Non-Question
● Example: LLM Answering Questions in Meetings
● STT & TTS Deepgram Speech-to-Text and Text-to-Speech
● Additional Readings:
○ Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter by Wes McKinney
○ Deep Learning for Coders with fastai and PyTorch by Jeremy Howard and Sylvain Gugger
All Of The Things (part 2)
[CLICK HERE] for All Material Contained in this Session
How To Get Started
● Tutorials
○ FREE Online Courses - https://course.fast.ai/
○ Examples via Jupyter Notebook - https://github.com/fastai/fastbook
● Image Processing
○ Kaggle Jupyter Book by Jeremy Howard
○ FastAI Data Tutorial - Image Classification by Julius Simonelli
● Natural Language Processing
○ NLP from Scratch with PyTorch, fastai, and HuggingFace by Amar Saini
○ A Hackers' Guide to Language Models
Thank you
David vonThenen
Software Engineer/Developer Advocate
@dvonthenen

