NLP IN THE WILD
COLLEEN M. FARRELLY, DATASEMBLY
COMMON INDUSTRY NLP
PROBLEMS
• Sentiment analysis/tracking of customer
feedback
• Computational linguistics/psychology of
language usage
• Chatbots
• Translation services
• Supervised learning
• Document summary
[Workflow diagram: Problem formulation → Data collection → Choice of tools (math/ML) → Application → Results]
CASE STUDY 1: CONSUMER GROUP
CLUSTERING
• Want to understand how
different groups interact
with a chatbot
• Sales implications
• Group-specific needs
for future feature builds
• Chatbot conversation data
sample
• NLP to derive salient text
features
• Persistent homology to
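A minimal sketch of the last two bullets, assuming bag-of-words counts as the text features and cosine distance between conversations. The final bullet on the slide is cut off, so the purpose of the persistent homology step is not stated; this sketch only computes 0-dimensional persistence (the thresholds at which connected components merge) with a union-find instead of a TDA library. The sample conversations and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def bag_of_words(text):
    """Lowercased token counts; a stand-in for richer NLP features."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """1 - cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / (norm_a * norm_b) if norm_a and norm_b else 1.0


def h0_death_times(vectors):
    """Death times of connected components in a Vietoris-Rips filtration.

    Every point is born at threshold 0; a component dies at the distance
    where it merges into another one (Kruskal-style union-find).
    """
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (cosine_distance(vectors[i], vectors[j]), i, j)
        for i, j in combinations(range(len(vectors)), 2)
    )
    deaths = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(dist)
    return deaths  # n - 1 finite deaths; one component never dies


# Hypothetical chatbot conversations, not data from the talk.
conversations = [
    "price of premium plan",
    "premium plan price discount",
    "reset my password",
    "password reset link",
]
deaths = h0_death_times([bag_of_words(c) for c in conversations])
```

A large gap between consecutive death times suggests well-separated groups: here the two pricing conversations and the two password conversations merge early, and the two groups only join at the maximum distance.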
CASE 2: SUPERVISED LEARNING
• Want to classify products by
type (such as fruit or canned
soup) using title text
• Data includes a small sample of
scraped titles from a sample of
retailers with manual annotation
of product type
• Text cleaning and embedding
algorithms to prepare the text
data for machine learning
• Supervised learning algorithm
to create the classifier
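The pipeline in these bullets can be sketched as follows. The talk does not name the embedding or the learning algorithm, so this example assumes simple token counts in place of an embedding and a multinomial naive Bayes classifier with add-one smoothing. The titles, labels, and class name are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def clean(title):
    """Lowercase and keep alphabetic tokens only (drops sizes and punctuation)."""
    return re.findall(r"[a-z]+", title.lower())


class NaiveBayesTitleClassifier:
    """Multinomial naive Bayes with add-one smoothing over title tokens."""

    def fit(self, titles, labels):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.class_counts = Counter(labels)      # label -> number of titles
        for title, label in zip(titles, labels):
            self.word_counts[label].update(clean(title))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, title):
        # Tokens never seen in training carry no information and are skipped.
        tokens = [t for t in clean(title) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical manually annotated titles, not data from the talk.
train_titles = [
    "Organic Gala Apples 3 lb bag",
    "Fresh Bananas, bunch",
    "Seedless Red Grapes 2 lb",
    "Chicken Noodle Soup 10.5 oz can",
    "Condensed Tomato Soup, 4 pack cans",
    "Cream of Mushroom Soup can",
]
train_labels = ["fruit", "fruit", "fruit", "canned soup", "canned soup", "canned soup"]

model = NaiveBayesTitleClassifier().fit(train_titles, train_labels)
prediction = model.predict("Tomato Basil Soup 15 oz can")
```

With a small annotated sample, a simple model like this is a reasonable baseline before trying learned embeddings.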
CASE 3: TOPIC
MODELING
• Want to find main
topics discussed in a
corpus of documents
(poems)
• Poetry data sample
across genres of
poetry by a single
author
• Topic modeling to
classify poems
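To make the topic-modeling step concrete, here is a minimal sketch that assumes latent Dirichlet allocation (LDA) fitted by collapsed Gibbs sampling. The talk does not say which topic model was used; in practice a library such as Gensim would replace this hand-written sampler. The tokenized "poems", the function name, and the hyperparameter values are hypothetical.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Returns (proportions, topic_word): smoothed per-document topic
    proportions and per-topic word counts.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(n_topics)]   # count of word w in topic k
    topic_total = [0] * n_topics                        # tokens assigned to topic k
    assignments = []

    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional weight of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    proportions = [
        [(c + alpha) / (sum(row) + alpha * n_topics) for c in row]
        for row in doc_topic
    ]
    return proportions, topic_word


# Hypothetical tokenized poems, not data from the talk.
poems = [
    "sea wave salt tide sea gull".split(),
    "tide wave shore salt foam".split(),
    "city street neon rain taxi".split(),
    "street taxi crowd neon city".split(),
]
doc_topics, topic_words = lda_gibbs(poems, n_topics=2)
# Classify each poem by its dominant topic.
labels = [max(range(2), key=row.__getitem__) for row in doc_topics]
```

Topic indices are arbitrary, so the labels only indicate which poems group together, not what the topics mean; inspecting the most frequent words in each entry of `topic_words` is the usual way to interpret them.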
CASE 4: TIME-BASED ANALYSIS OF MINDSET
• Want to quickly understand
changes in a leader’s behavior at
the onset of war
• Public statement sample by
president over course of several
weeks as input data
• NLP to derive linguistic features
• Longitudinal models and topology-
based changepoint algorithm on
linguistic feature time series
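The last two bullets can be illustrated with a small sketch. It assumes a single linguistic feature (the rate of first-person-plural pronouns per statement) and substitutes a simple least-squares mean-shift detector for the topology-based changepoint algorithm, which the slide does not specify. The statements, the pronoun list, and the function names are hypothetical.

```python
import re
from statistics import mean

# Assumed feature vocabulary; the talk does not list its linguistic features.
FIRST_PERSON_PLURAL = {"we", "us", "our", "ours"}


def pronoun_rate(statement):
    """Share of tokens that are first-person-plural pronouns."""
    tokens = re.findall(r"[a-z']+", statement.lower())
    if not tokens:
        return 0.0
    return sum(t in FIRST_PERSON_PLURAL for t in tokens) / len(tokens)


def mean_shift_changepoint(series):
    """Index that best splits the series into two constant-mean segments.

    Minimizes the summed squared error of the two segments; this is a
    simple stand-in, not the topology-based method from the talk.
    """
    def sse(segment):
        m = mean(segment)
        return sum((x - m) ** 2 for x in segment)

    return min(
        range(1, len(series)),
        key=lambda i: sse(series[:i]) + sse(series[i:]),
    )


# Hypothetical public statements in chronological order.
statements = [
    "The economy is growing and the budget is balanced.",
    "The new schools opened on schedule this month.",
    "Trade talks continue with partners in the region.",
    "We will defend our borders and our people.",
    "Our forces stand with us, and we will not yield.",
    "We ask our nation to stand with us tonight.",
]
rates = [pronoun_rate(s) for s in statements]
change_at = mean_shift_changepoint(rates)  # index of the first post-change statement
```

In a real analysis several features would be tracked together, and the detected changepoint would be compared against the known timeline of events.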
HELPFUL PYTHON PACKAGES
• NLP:
  • NLTK (part-of-speech tagging, data munging…)
  • Gensim (topic models)
  • VADER (sentiment analysis)
• TDA:
  • Persim/ripser (persistent homology)
  • Kmapper (Mapper algorithm)
• Structural equation modeling/latent class modeling:
  • Semopy (similar to lavaan in R)
CONTACT ME
• cfarrelly@med.miami.edu
• LinkedIn (Colleen M. Farrelly)
