Where there’s a will,
there’s a way
The mad genius of using LLMs as classifiers
Katherine Munro, DSC Belgrade 2024
About Me
Data Scientist,
Computational Linguist,
Conversational AI Engineer
Catch me talking about all
things data, AI, NLP, and
innovation, here…
linkedin.com/in/katherine-munro
katherine-munro.com
Or else, just find me in the
mountains …
Katherine Munro, DSC Belgrade, November 2024
What we’ll talk about:
• LLMs as classifiers: A weird idea you should probably try anyway
• Intent Detection: A real-world classification use case
• Possible architectures and techniques
• Generalizing to other classification tasks
LLMs as classifiers is a weird idea…
Classification: A machine learning (ML) technique where a model must classify an input into one of a set of possible classes, denoted by a single label. For example:
• Sentiment Detection
• Email Triage
Large Language Models: Complex algorithms able to solve diverse tasks and respond with diverse, long-form outputs.
Typically, classification should be:
• Accurate
• Consistent
• Interpretable
• Fast
LLMs are:
• Slow
• Random
• Not at all interpretable
Depending on the method used, LLM classifiers offer the following benefits:
• Useful for prototyping.
• No training required.
• Scalability and flexibility when adding new classes.
• An existing body of data sets, evaluation metrics and best practices.
• Ability to handle diverse, inconsistent, multi-modal data.
• Easily transferable to multiple languages.
… but you might want to do classification with LLMs anyway
Introduction to Intent
Detection
A real-world classification use case
When customers call, we need to quickly and accurately route
them to a customer service team or a self-service use case.
This requires detecting which product or service they’re calling
about, and what need they’re trying to solve.
This is a classification task.
Intent detection for a conversational AI agent
Variability in how customers express themselves is confusing, even for humans.
Meaning is built up across multiple turns of dialogue.
Context is key: Which services does the customer have? What’s the status of those services?
Customers don’t always know what they want or need.
Misalignment between how customers think about problems and how we’re set up to help them.
Customers can have multiple intents.
Data “noise”, e.g. speech transcription errors for our voicebot.
Why intent detection is hard…
Helpful resource:
“How Your Digital Personal Assistant Understands What You Want (And Gets It Done)”
Despite the challenges, there are also benefits to this approach. Intent-based logic:
• helps us simplify natural language to make it workable
• helps us plan quality customer experiences using conversation design
• makes our system more deterministic, interpretable and testable
• can reduce hallucinations and improve efficiency by eradicating unnecessary function or API calls
• can be practical and logical for other parts of the business, e.g. for reporting and planning employee training
… But we do it anyway
Helpful resource:
“Why you still need NLP skills in the ‘age of ChatGPT’”
If intents are so tricky, and LLMs so great, why do intent detection at all?
Why not use an LLM end to end?
It’s possible. But:
• Routine use cases don’t need an LLM’s creativity or spontaneity.
• LLMs can be a better “front door” when you have good routing logic but poor intent detection accuracy.
• You might not trust an LLM with your most valuable interactions –
customer contacts.
• Converting an open problem to a closed one, and breaking it down into
stages, are already prompting best practices.
• You can still use LLMs for other use cases, e.g. RAG and chit-chat.
Couldn’t we use LLMs for the lot?
Intent Detection Strategies
Possible Techniques and Architectures
In a classic NLU bot, a trained ML model performs the intent (and possibly topic) detection.
Business logic, encoded as rules, uses these predictions to define a final routing.
LLMs can be used not for detection, but for rephrasing system utterances.
“The classic”
Pros: ML models can be simple, small, fast, interpretable, and highly specialized.
Evaluation is concrete and can be automated.
Cons: Adding new prediction classes and implementing the business logic is not very scalable.
Real-world example: Many, many Conversational AI systems that exist today.
This approach still uses an ML model for the initial classification,
but defers to an LLM when the prediction uncertainty is high.
“The hybrid”
Helpful resource:
Vux Podcast: The AI chatbot
serving 10m customers a year
Pros: All the benefits of ML models, plus LLM strengths, at relatively low cost.
Cons: System becomes more complex to deploy and maintain.
LLMs add expense, latency, opacity and unpredictability.
Real-world example: Lufthansa Group
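The hybrid routing logic can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: `DummyIntentModel` stands in for a real trained classifier, and `llm_classify` is a placeholder for an actual LLM API call.

```python
CONFIDENCE_THRESHOLD = 0.7  # tune on held-out data


class DummyIntentModel:
    """Stand-in for a trained intent classifier that returns a confidence score."""

    def predict(self, utterance: str) -> tuple[str, float]:
        if "invoice" in utterance.lower():
            return "billing", 0.92
        return "other", 0.30


def llm_classify(utterance: str) -> str:
    """Placeholder for a prompted LLM call."""
    return "cancel_contract"


def hybrid_classify(utterance: str, model: DummyIntentModel) -> tuple[str, str]:
    """Trust the ML model when it is confident; defer to the LLM otherwise.

    Returns (intent, source), where source records which system decided."""
    label, confidence = model.predict(utterance)
    if confidence >= CONFIDENCE_THRESHOLD:
        return label, "ml"
    return llm_classify(utterance), "llm"
```

Logging the `source` field also gives you a cheap way to monitor how often the (more expensive) LLM path is taken.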
Another hybrid approach: An NLU model retrieves the top N most likely intents.
These are injected into a prompt and given to an LLM to make the final classification.
“The filter”
Helpful resource: Benchmarking
hybrid LLM classification
systems, Voiceflow
Pros: Gives a chance to recover from poor ML accuracy or model drift.
Cons: Same as before.
Real-world example: Voiceflow
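The prompt-injection step of the filter can look like this. A sketch only: the intent names and descriptions are invented, and the returned string would be sent to whichever LLM you use.

```python
def build_filter_prompt(utterance: str, top_intents: list[tuple[str, str]]) -> str:
    """Inject only the NLU model's top-N candidate intents into the LLM prompt."""
    options = "\n".join(f"- {name}: {desc}" for name, desc in top_intents)
    return (
        "Classify the customer utterance into exactly one of these intents:\n"
        f"{options}\n"
        f'Utterance: "{utterance}"\n'
        "Answer with the intent name only."
    )


# Example: the NLU model's top-2 candidates for this utterance
top_n = [
    ("billing", "questions about invoices or charges"),
    ("cancel_contract", "requests to end a subscription"),
]
prompt = build_filter_prompt("I was charged twice this month", top_n)
```

Because only N candidates appear in the prompt, prompt length stays constant as the total number of intents grows.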
AKA “few-shot prompting”: have your bot make a prediction based on the descriptions, and a few examples, of the possible
intents.
“The fast learner”
Pros: Uses LLMs’ out-of-the-box capabilities, making implementing new cases much more scalable.
Cons: All the issues LLMs bring, especially costs, security and latency challenges.
Real-world example: Rasa
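A few-shot classification prompt can be assembled directly from intent descriptions and examples. The intents and wording below are invented for illustration; this builds the prompt only, and does not call any particular LLM.

```python
# Each intent maps to (description, example utterances). Invented examples.
INTENTS = {
    "billing": (
        "questions about invoices or charges",
        ["Why is my bill so high?", "I was charged twice"],
    ),
    "cancel_contract": (
        "requests to end a subscription",
        ["I want to cancel my plan"],
    ),
}


def build_few_shot_prompt(utterance: str) -> str:
    """Build a few-shot prompt from intent descriptions and examples."""
    lines = ["You are an intent classifier. The possible intents are:"]
    for name, (description, examples) in INTENTS.items():
        lines.append(f"{name}: {description}")
        lines.extend(f'  Example: "{e}"' for e in examples)
    lines.append(f'Utterance: "{utterance}"')
    lines.append("Respond with the intent name only.")
    return "\n".join(lines)
```

Adding a new intent means adding one dictionary entry, which is what makes this approach so scalable for prototyping.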
With few-shot inference, as you add more use cases, the prompt explodes, causing latency and accuracy issues.
More classes can also lead to lower confidence and more fallbacks.
In an embedding approach:
• you embed the intent labels and descriptions
• retrieve the ones that are most similar to the (embedded) user utterance
• and inject only those into the LLM prompt.
“The embedder”
Pros: Tackles latency and accuracy problems and adds some interpretability.
Cons: Dissimilarities between customers’ spoken style and developer descriptions can make matching difficult, and thus
impact accuracy.
Real-world example: Rasa
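The retrieval step of the embedder can be sketched with a toy bag-of-words "embedding" and cosine similarity; in practice you would swap in a real sentence-embedding model. Only the top-k intents returned here would be injected into the LLM prompt.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. Replace with a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def top_k_intents(utterance: str, descriptions: dict[str, str], k: int = 2) -> list[str]:
    """Return the k intent names whose descriptions best match the utterance."""
    u = embed(utterance)
    ranked = sorted(
        descriptions,
        key=lambda name: cosine(u, embed(descriptions[name])),
        reverse=True,
    )
    return ranked[:k]
```

Note the con mentioned above shows up even in this toy version: matching only works when the customer's words overlap with the developer-written descriptions.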
LLM next-token prediction is essentially a classification task, where the possible labels are its token vocabulary.
To fine-tune an LLM for classification:
• Attach a custom head to the model, e.g. a gradient-boosted tree or logistic regression
• Fine-tune it to map the logit distribution of the whole vocabulary to just the output labels you want.
• You could also fine-tune the LLM together with this head, e.g. LoRA training for the LLM component.
“The tuner”
Pros: Could provide more control, interpretability and accuracy.
Cons: All the complexity that comes with adding an ML model to your stack.
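The core mapping idea can be shown in a toy form: restrict the full-vocabulary logit distribution to the tokens that begin each label, then take the argmax. The logit values and token names below are invented for illustration; a real tuner would learn this mapping from data.

```python
def classify_from_logits(vocab_logits: dict[str, float],
                         label_tokens: dict[str, str]) -> str:
    """Pick the label whose first token has the highest logit,
    ignoring every other token in the vocabulary."""
    return max(label_tokens, key=lambda label: vocab_logits[label_tokens[label]])


# Invented logit values over a tiny 'vocabulary'
logits = {"the": 5.2, "bill": 2.1, "canc": 0.3}
labels = {"billing": "bill", "cancel_contract": "canc"}
```

Note that "the" having the largest logit is irrelevant once we restrict attention to the label tokens, which is exactly why constraining the output space helps.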
Helpful resources:
“LLMs for Classification Tasks with LLM Studio”
“Mastering Classification & Regression with LLMs: Insights from Kaggle Competitions”
Before: The Classic: An NLU-based bot, featuring logical rules based on predictions by ML classifiers for intent and topic.
Right Now: The Fast Learner: Few-shot inference is working well, especially for prototyping.
In the future, we’d like to try:
• Approaches which reduce the number of possible labels for the LLM, e.g. The Embedder
• Multi-modal models: to (hopefully) make our system simpler, faster, and less sensitive to ASR issues
What we’ve tried so far
Helpful resource:
5 Ways to Optimize your Prompts
for LLM Intent Classification
What’s in it for you?
Generalizing to other classification tasks
These architectures can be applied to any kind of labelled classification problem.
Start simple and try it out:
• Grab some production data
• Label it: It’s worth the effort!
• Establish your baselines: simple majority, and existing classifier accuracy
• Try few-shot inference, either via an API or direct in the UI
• Hybrid Approach: Test only the samples where the current classifiers fail; add the results
• Fast-learner Approach: Compare all samples against existing classifier predictions
If results are promising, you can think about testing more complex approaches, e.g. filtering possible intents using ML confidence scores (The Filter) or embeddings (The Embedder), or fine-tuning a classifier component (The Tuner).
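The baseline step above is quick to implement. A majority-class baseline is just the accuracy of always predicting the most frequent label in your data, and any classifier worth deploying should beat it.

```python
from collections import Counter


def majority_baseline_accuracy(labels: list[str]) -> float:
    """Accuracy of always predicting the most frequent class."""
    if not labels:
        return 0.0
    return Counter(labels).most_common(1)[0][1] / len(labels)
```

If your intent distribution is very skewed, this number can be surprisingly high, which is exactly why it is worth computing before celebrating any model's accuracy.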
Putting it into practice
Prompting “best practices” can be unexpected and model-dependent.
Experiment a lot, and document what works best for your problem.
Include domain experts, e.g. conversation designers or call-centre agents, when
designing prompts and exploring model outputs.
Remember data protection laws when testing a public LLM!
Situate your problem in a business context. For example:
• What are the business impacts of different kinds of misclassifications?
• Is your data telling you the full story?
Get clever about how to measure progress.
A Few Top Tips
Helpful resource:
No Baseline? No Benchmarks? No
Biggie! An Experimental Approach to
Agile Chatbot Development
Questions? Get in Touch!
linkedin.com/in/katherine-munro
katherine-munro.com
