Natural language processing 2
 Overview
 Basic knowledge
 Demonstration

USER : Men are all alike.
ELIZA : In what way?
USER : They’re always bugging us about something or other.
ELIZA : Can you think of a specific example?
USER : Well, my boyfriend made me come here.
ELIZA : Your boyfriend made you come here ?
USER : He says I’m depressed much of the time.
ELIZA : I am sorry to hear you are depressed.
USER : It’s true, I am unhappy.
ELIZA : Do you think coming here will help you not to be unhappy?
USER : I need some help; that much seems certain.
ELIZA : What would it mean to you if you got some help?
USER : Perhaps I could learn to get along with my mother.
ELIZA : Tell me more about your family.
USER : My mother takes care of me.
ELIZA : Who else in your family takes care of you?
USER : My father.
ELIZA : Your father ?
USER : You are like my father in some ways.
ELIZA : What resemblance do you see?
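
The dialogue above can be approximated with a handful of pattern-and-response rules. The following is a minimal sketch, not the original ELIZA script: the rule list, the `respond` function and the default reply are all hypothetical and chosen only to reproduce a few of the turns shown.

```python
import re

# Hypothetical rules for illustration; the real ELIZA script was far larger
# and also swapped pronouns ("my" -> "your") in the echoed text.
RULES = [
    (re.compile(r"\bmy (mother|father|family)\b", re.I), "Tell me more about your family."),
    (re.compile(r"\bi am (.+)", re.I), "Why do you say you are {0}?"),
    (re.compile(r"\ball alike\b", re.I), "In what way?"),
    (re.compile(r"\balways\b", re.I), "Can you think of a specific example?"),
]
DEFAULT_REPLY = "Please go on."


def respond(utterance: str) -> str:
    """Return the reply of the first rule whose pattern matches the input."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Captured groups (if any) are echoed back inside the reply.
            return template.format(*match.groups())
    return DEFAULT_REPLY


if __name__ == "__main__":
    for line in ["Men are all alike.", "It's true, I am unhappy."]:
        print("USER  :", line)
        print("ELIZA :", respond(line))
```

The program has no model of meaning at all; it only matches surface patterns, which is why such systems can sound human-like without understanding anything.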
 A sub-field of Artificial Intelligence, since the 1960s …

 Concerned with the interactions between computers and
  human languages with one ultimate goal: computers can
  “understand” human language

 Many applications in the real world
 Natural language unit?
    Natural language understanding
    Natural language generation


 Data?
    Speech processing
    Text processing


Natural language text understanding!
 Natural language generation (NLG): the task of generating
  natural language from a machine representation
 May be viewed as the opposite of natural language
  understanding.

 Applications:
    Joke generation
    Textual summaries of databases
    Enhancing accessibility
 Natural language understanding (NLU): an advanced subtopic
  of NLP that deals with reading comprehension
 More complex than NLG
 Much commercial interest in this field
    News-gathering
    Data mining
    Voice activation
    Large-scale content analysis
 Formal logic is too rigid: the loss of flexibility causes
  difficulties in NLP

 Examples:
    Time flies like an arrow
      Can be understood in 7 ways!

    I never said she stole my money!
      The meaning changes with the stressed word:
      I: Someone else said it, but I didn't
      never: I simply didn't ever say it
      said: I might have implied it in some way, but I never explicitly said it
      she: I said someone took it; I didn't say it was she
      stole: I just said she probably borrowed it
      my: I said she stole someone else's money
      money: I said she stole something, but not my money
 Word combination and division
 Stress placement on words
 The properties of subjects
   We gave the monkeys the bananas because they were
    hungry
   We gave the monkeys the bananas because they were
    over-ripe
 Specifying which word an adjective applies to
   A pretty little girls' school
 Involves reasoning about the world
 Embedded in a social system of people interacting
    persuading, insulting and amusing them
    changing over time
 Homonyms
 Automatic Summarization
 Information Extraction
 Grammar Testing
 ePi Group:
   Automatic Vietnamese processing system
   www.baomoi.com
      Collecting news from all Vietnamese e-newspapers

 EVTrans – Softex Co Ltd.
 Cyclop
 VnKim
 Morphological analysis
    Individual words are analyzed into their components
 Syntactic analysis
    Linear sequences of words are transformed into structures
     that show how the words relate to each other
 Semantic analysis
    A transformation is made from the input text to an internal
     representation that reflects the meaning
 Pragmatic analysis
    Reinterpreting what was said as what was actually meant
 Discourse analysis
    Resolving references between sentences
 Morphemes: the smallest meaningful spoken units of language
    Stem: book, cat, car, …
    Affixes: un-, -s, -es, …
    Clitics: 've, 'm

 Morphological parsing: parsing a word into stem and affixes
  and identifying the parts and their relationships
 Word classes
    Parts of speech: noun, verb, adjective, etc.
    Word class dictates how a word combines with morphemes
     to form new words

 Examples
    Books = book + s
    Unladylike = un + lady + like
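
The two examples above can be reproduced with a very small affix-stripping parser. This is only a sketch under strong assumptions: the stem, prefix and suffix lists are hypothetical toy data, and real morphological parsers (for instance those built on finite state transducers) also handle spelling changes, which this one does not.

```python
# Toy lexicon and affix lists, assumed for illustration only.
STEMS = {"book", "cat", "car", "lady"}
PREFIXES = ["un"]
SUFFIXES = ["like", "es", "s"]


def parse_word(word):
    """Split a word into prefixes + stem + suffixes, or return None."""
    word = word.lower()
    if word in STEMS:
        return [word]
    # Try peeling a prefix off the left and parsing the remainder.
    for prefix in PREFIXES:
        if word.startswith(prefix):
            rest = parse_word(word[len(prefix):])
            if rest:
                return [prefix + "-"] + rest
    # Otherwise try peeling a suffix off the right.
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            rest = parse_word(word[:-len(suffix)])
            if rest:
                return rest + ["-" + suffix]
    return None


if __name__ == "__main__":
    for w in ["Books", "Unladylike", "xyz"]:
        print(w, "->", parse_word(w))
```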
 Vietnamese?
    Ăn = ăn (to eat)
    Uống = uống (to drink)
    Xe = xe (vehicle)

 No ‘Xes’ in Vietnamese!
 The main problem is text tokenization.
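
Because a Vietnamese word may span several space-separated syllables, tokenization means deciding which syllables belong together. One common baseline is greedy longest matching against a word list. The sketch below assumes a tiny hypothetical lexicon and a maximum word length of three syllables; it is not the method of any system named in these slides.

```python
# Hypothetical mini-lexicon: "học sinh" (pupil), "xe đạp" (bicycle), "đi" (to go).
LEXICON = {"học sinh", "học", "sinh", "đi", "xe", "xe đạp"}
MAX_WORD_SYLLABLES = 3  # assumed upper bound on syllables per word


def segment(sentence):
    """Greedy longest-match segmentation of space-separated syllables."""
    syllables = sentence.lower().split()
    words = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first, then shrink it.
        for n in range(min(MAX_WORD_SYLLABLES, len(syllables) - i), 0, -1):
            candidate = " ".join(syllables[i:i + n])
            # A single syllable is always accepted as a fallback.
            if n == 1 or candidate in LEXICON:
                words.append(candidate)
                i += n
                break
    return words


if __name__ == "__main__":
    print(segment("Học sinh đi xe đạp"))
```

Greedy matching can pick the wrong split when words overlap, which is one reason practical segmenters add statistical scoring.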
 Why parse words?
    To identify a word’s part of speech
    To identify a word’s stem (IR)

 … then?
    Spell-checking
    To predict next words
    To predict the word’s accent
 Ambiguity
    I want her to go to the cinema with me
      To - infinitive?
      To - preposition?

    Con ngựa đá đá con ngựa đá.
      (roughly: “The stone horse kicks the stone horse.”)
      đá = đá? (“stone” or “to kick”)
 How to implement?
    Regular expressions
    Finite State Transducers (FST)
    Finite State Acceptors (FSA)

  *.exe
  ir??man
  \b[0-9]+ *(Mb|[Mm]egabytes?)\b
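
The three patterns above can be tried directly in Python. The first two look like shell-style wildcards, so this sketch checks them with the standard `fnmatch` module; the third is a regular expression, assumed here to use `\b` word boundaries. The helper names `find_sizes` and `matches_wildcard` are made up for the example.

```python
import fnmatch
import re

# \b marks a word boundary; "?" makes the final "s" of "megabytes" optional.
SIZE_PATTERN = re.compile(r"\b[0-9]+ *(Mb|[Mm]egabytes?)\b")


def find_sizes(text):
    """Return every substring that looks like a size in megabytes."""
    return [m.group(0) for m in SIZE_PATTERN.finditer(text)]


def matches_wildcard(name, pattern):
    """Shell-style match: '*' is any run of characters, '?' exactly one."""
    return fnmatch.fnmatchcase(name, pattern)


if __name__ == "__main__":
    print(find_sizes("The file is 500 Mb and the disk holds 20 megabytes."))
    print(matches_wildcard("setup.exe", "*.exe"))
    print(matches_wildcard("ironman", "ir??man"))
```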
 Related terms:
    Stem, stemming
    Part of speech
    N-gram
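
As an illustration of the N-gram term, the sketch below extracts n-grams from a token list and uses bigram counts to guess the next word. The training sentences and the function names are assumptions for the example; a usable model would need far more data and some form of smoothing.

```python
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def train_bigrams(sentences):
    """Count, for each word, which words follow it and how often."""
    model = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for first, second in ngrams(tokens, 2):
            model[first][second] += 1
    return model


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


if __name__ == "__main__":
    model = train_bigrams(["I ate an apple", "I ate an orange", "I ate a building"])
    print(predict_next(model, "ate"))
```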
SYNTAX
 Linear sequences of words are transformed into
  structures that show how the words relate to
  each other.
 Determine grammatical structure.

 I am a boy = [Subject] [Verb] [Determiner] [Noun]
 Syntax
    Actual structure of a sentence
 Grammar
    The rule set used in the analysis
 A grammar defines syntactically legal sentences
    I ate an apple     (syntactically legal)
    I ate apple        (not syntactically legal)
    I ate a building   (syntactically legal, but?)

    Being syntactically legal doesn’t mean that it’s meaningful!
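
The three sentences above can be checked against a toy grammar. This is a minimal sketch with an assumed lexicon and three assumed rules (S → NP VP, NP → Pronoun | Det Noun, VP → Verb NP); it ignores agreement such as "a" versus "an" and is not a general parser.

```python
# Hypothetical lexicon mapping each word to a single word class.
LEXICON = {"i": "Pronoun", "ate": "Verb", "a": "Det", "an": "Det",
           "apple": "Noun", "building": "Noun"}

# Hypothetical grammar: each symbol maps to its alternative expansions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Pronoun"], ["Det", "Noun"]],
    "VP": [["Verb", "NP"]],
}


def parse(symbol, tags, start):
    """Return every position where `symbol` can end if it starts at `start`."""
    if symbol not in GRAMMAR:  # a word class, matched against one tag
        if start < len(tags) and tags[start] == symbol:
            return {start + 1}
        return set()
    ends = set()
    for production in GRAMMAR[symbol]:
        positions = {start}
        for part in production:
            positions = {e for p in positions for e in parse(part, tags, p)}
        ends |= positions
    return ends


def is_legal(sentence):
    """True if the whole sentence can be derived from S."""
    tags = [LEXICON.get(word) for word in sentence.lower().split()]
    return None not in tags and len(tags) in parse("S", tags, 0)


if __name__ == "__main__":
    for s in ["I ate an apple", "I ate apple", "I ate a building"]:
        print(s, "->", is_legal(s))
```

Note that "I ate a building" is accepted: the grammar only checks structure, so ruling it out is a job for semantic analysis.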
 Ambiguities
SEMANTICS
 What could this mean…
    Representations of linguistic inputs that capture
     the meanings of those inputs

 For us it means
    Representations that permit or facilitate
     semantic processing
    Permit us to reason about their truth
     (relationship to some world)
    Permit us to answer questions based on their
     content
    Permit us to perform inference (answer
     questions and determine the truth of things we
     don’t actually know)
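
To make "permit us to perform inference" concrete, here is a small sketch of a meaning representation as a table of is-a facts. The facts, the `is_a` function and the `eating_makes_sense` check are all hypothetical; the point is only that a stored representation lets a program answer a question that was never stated explicitly, such as whether an apple is food.

```python
# Hypothetical world knowledge: each thing points to its more general category.
IS_A = {"apple": "fruit", "fruit": "food", "building": "structure"}


def is_a(thing, category):
    """Follow is-a links upward; True if `category` is reached."""
    seen = set()
    while thing is not None and thing not in seen:
        if thing == category:
            return True
        seen.add(thing)  # guard against accidental cycles in the facts
        thing = IS_A.get(thing)
    return False


def eating_makes_sense(obj):
    """Inference: an object can sensibly be eaten if it is a kind of food."""
    return is_a(obj, "food")


if __name__ == "__main__":
    print(eating_makes_sense("apple"))     # apple -> fruit -> food
    print(eating_makes_sense("building"))  # building -> structure
```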
 Requirements
    Verifiability
    Unambiguous representations
    Canonical form
    Inference
    Expressiveness
 Pragmatics: concerns how sentences are
  used in different situations and how use
  affects the interpretation of the sentence

 Discourse: concerns how the
  immediately preceding sentences affect
  the interpretation of the next sentence
 ‘He’, ‘it’, ‘his’ can be inferred from
  the previous sentence

 It’s discourse
 WordNet
 MindNet
 Stanford Tagger
 Stanford Parser
 ……..
 Machine translation
 Search engine
 Information extraction
 Chat bot
 Can we use previously translated text to learn how to
 translate new texts?
   Yes! But, it’s not so easy
   Two paradigms: statistical MT and example-based MT (EBMT)
 Requirements:
   Aligned large parallel corpus of translated sentences
   {S source  S target }
   Bilingual dictionary for intra-S alignment
   Generalization patterns (names, numbers, dates…)
 Simplest: Translation Memory
   If S new = S source in the corpus, output the aligned S target


 Compositional EBMT
   If a fragment of S new matches a fragment of S source, output
    the corresponding fragment of the aligned S target
   Prefer maximal-length fragments
   Maximize grammatical compositionality
      Via a target language grammar
      Or, via an N-gram statistical language model
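
The two ideas above, exact-match translation memory and fragment-based composition, can be sketched in a few lines. The English to Vietnamese entries and the function name `translate` are assumptions for illustration. The sketch prefers maximal-length fragments but does not score grammatical compositionality with a target grammar or language model.

```python
# Hypothetical aligned data (English -> Vietnamese), for illustration only.
SENTENCE_MEMORY = {"thank you": "cảm ơn"}
FRAGMENT_MEMORY = {
    "hello": "xin chào",
    "thank you": "cảm ơn",
    "thank you very much": "cảm ơn rất nhiều",
}


def translate(sentence):
    """Translation memory first, then greedy longest-fragment composition."""
    key = sentence.lower().strip()
    if key in SENTENCE_MEMORY:  # whole sentence seen before
        return SENTENCE_MEMORY[key]
    tokens = key.split()
    output, i = [], 0
    while i < len(tokens):
        # Prefer the longest fragment that starts at position i.
        for n in range(len(tokens) - i, 0, -1):
            fragment = " ".join(tokens[i:i + n])
            if fragment in FRAGMENT_MEMORY:
                output.append(FRAGMENT_MEMORY[fragment])
                i += n
                break
        else:
            # No known fragment: copy the source token unchanged.
            output.append(tokens[i])
            i += 1
    return " ".join(output)


if __name__ == "__main__":
    print(translate("Thank you"))
    print(translate("hello thank you very much"))
```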
 Requires an Interlingua - language-neutral Knowledge
  Representation (KR)
 Philosophical debate: Is there an interlingua?
   FOL is not totally language neutral (predicates,
    functions, expressed in a language)
   Other near-interlinguas (Conceptual Dependency)
 Requires a fully-disambiguating parser
   Domain model of legal objects, actions, relations
 Requires an NL generator (KR -> text)
 Applicable only to well-defined technical domains
 Produces high-quality MT in those domains
 Interlingua-based MT
 Rule-based MT
 Each approach has its own strengths

    Rapidly adaptable: statistical, example-based
    Good grammar: rule-based (grammar)
    High precision in narrow domain: interlingua
 Google
 Yahoo
 AltaVista
 Answer.com
 Spider - a browser-like program that downloads web pages.
 Crawler - a program that automatically follows all of the
  links on each web page.
 Indexer - a program that analyzes web pages downloaded
  by the spider and the crawler.
 Database - storage for downloaded and processed pages.
 Results engine - extracts search results from the database.
 Web server - a server that is responsible for interaction
  between the user and other search engine components.
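
The indexer and results engine described above can be illustrated with an inverted index, a mapping from each word to the pages that contain it. This is a minimal sketch: the page URLs and text are made up, the spider and crawler are replaced by an in-memory dictionary, and no ranking is performed.

```python
from collections import defaultdict


def build_index(pages):
    """Indexer: map each lowercased word to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Results engine: return the pages that contain every query word."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


if __name__ == "__main__":
    # Hypothetical downloaded pages standing in for the spider's output.
    pages = {
        "a.example/1": "natural language processing",
        "b.example/2": "language of search engines",
    }
    index = build_index(pages)
    print(search(index, "natural language"))
```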
 The idea is to ‘extract’ particular types of information from
  arbitrary text or transcribed speech

 Examples:
    Named entities: people, places, organizations
    Telephone numbers
    Dates
 Many uses:
    Question answering systems, gisting of news or mail…
    Job ads, financial information, terrorist attacks
 Often use a set of simple templates or frames with slots
 to be filled in from input text. Ignore everything else.
   Husni’s number is 966-3-860-2624.
   The inventor of the first plane was Abbas ibnu Fernas
   The British King died in March of 1932.
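
Template filling for two of the example sentences can be sketched with regular expressions. The slot names, the phone number format and the date format are assumptions derived only from the examples above; real extraction systems use much richer patterns or trained models.

```python
import re

# Assumed phone format, taken from the single example: 3-1-3-4 digit groups.
PHONE = re.compile(r"\b\d{3}-\d-\d{3}-\d{4}\b")
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
# Matches "March 1932" as well as "March of 1932".
DATE = re.compile(rf"\b({MONTHS})(?: of)? (\d{{4}})\b")


def fill_template(text):
    """Fill the slots we can find in the text and ignore everything else."""
    template = {"phone": None, "month": None, "year": None}
    phone = PHONE.search(text)
    if phone:
        template["phone"] = phone.group(0)
    date = DATE.search(text)
    if date:
        template["month"] = date.group(1)
        template["year"] = int(date.group(2))
    return template


if __name__ == "__main__":
    print(fill_template("Husni's number is 966-3-860-2624."))
    print(fill_template("The British King died in March of 1932."))
```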
 Named Entity recognition (NE)
   Finds and classifies names, places etc.
 Co-reference Resolution (CO)
   Identifies identity relations between entities in texts.
 Template Element construction (TE)
   Adds descriptive information to NE results (using CO).
 Template Relation construction (TR)
   Finds relations between TE entities.
 Scenario Template production (ST)
   Fits TE and TR results into specified event scenarios.
 AIML = Artificial Intelligence Markup Language
 Alice
 A.L.I.C.E. (Artificial Linguistic Internet Computer
  Entity)
    an award-winning free natural language artificial
     intelligence chat robot.

 Rule-based
 Human-like answers without a complicated “brain”
 Multi-language
 NLP course, Husni Al-Muhtaseb
 Lexical descriptions for Vietnamese language
  processing
 en.wikipedia.org
 www.xulyngonngu.com
Natural language processing 2

More Related Content

PPTX
Semantics - Introduction to Linguistic
Aviihms
 
PPTX
Paradigmatic and sintagmatic relation
ryufaliza
 
PDF
Natural Language Ambiguity and its Effect on Machine Learning
IJMER
 
PDF
Corpus-based part-of-speech disambiguation of Persian
IDES Editor
 
PPTX
Semantic
Moni Moni
 
PDF
Cognitive Grammar: Word Network
JESSIE GRACE RUBRICO
 
PDF
OPTIMIZE THE LEARNING RATE OF NEURAL ARCHITECTURE IN MYANMAR STEMMER
ijnlc
 
PDF
Cognitive Grammar: teaching the verb 'to be'
JESSIE GRACE RUBRICO
 
Semantics - Introduction to Linguistic
Aviihms
 
Paradigmatic and sintagmatic relation
ryufaliza
 
Natural Language Ambiguity and its Effect on Machine Learning
IJMER
 
Corpus-based part-of-speech disambiguation of Persian
IDES Editor
 
Semantic
Moni Moni
 
Cognitive Grammar: Word Network
JESSIE GRACE RUBRICO
 
OPTIMIZE THE LEARNING RATE OF NEURAL ARCHITECTURE IN MYANMAR STEMMER
ijnlc
 
Cognitive Grammar: teaching the verb 'to be'
JESSIE GRACE RUBRICO
 

What's hot (19)

PPT
Group presentation lexical semantics
blessedkkr
 
PPT
Langacker's cognitive grammar
JOy Verzosa
 
PDF
MORPHOLOGICAL SEGMENTATION WITH LSTM NEURAL NETWORKS FOR TIGRINYA
ijnlc
 
DOCX
Minimalist program
Amusan Kayode
 
DOCX
ACTIVIDAD 7
Maira Roxana Garcia
 
PDF
Translation
mjkay
 
PPTX
Presentation1
rebeccaTorres123
 
PDF
Feature Structure Unification Syntactic Parser 2.0
rcaneba
 
PDF
5a use of annotated corpus
ThennarasuSakkan
 
PDF
Semantics
Aef Tony
 
PDF
A Constructive Mathematics approach for NL formal grammars
Federico Gobbo
 
PPT
Unit 1 Semantics
mjgvalcarce
 
PPTX
Natural language-processing
Hareem Naz
 
PDF
Constructive Hybrid Logics
Valeria de Paiva
 
PDF
Constructive Description Logics 2006
Valeria de Paiva
 
DOCX
Narrative
irbaz khan
 
PPTX
Prosodic Morphology
Maroua Harrif
 
PPTX
Text : Definition, Elaboration and Examples
Alaahussein81
 
PPTX
Minimalist program
RabbiaAzam
 
Group presentation lexical semantics
blessedkkr
 
Langacker's cognitive grammar
JOy Verzosa
 
MORPHOLOGICAL SEGMENTATION WITH LSTM NEURAL NETWORKS FOR TIGRINYA
ijnlc
 
Minimalist program
Amusan Kayode
 
ACTIVIDAD 7
Maira Roxana Garcia
 
Translation
mjkay
 
Presentation1
rebeccaTorres123
 
Feature Structure Unification Syntactic Parser 2.0
rcaneba
 
5a use of annotated corpus
ThennarasuSakkan
 
Semantics
Aef Tony
 
A Constructive Mathematics approach for NL formal grammars
Federico Gobbo
 
Unit 1 Semantics
mjgvalcarce
 
Natural language-processing
Hareem Naz
 
Constructive Hybrid Logics
Valeria de Paiva
 
Constructive Description Logics 2006
Valeria de Paiva
 
Narrative
irbaz khan
 
Prosodic Morphology
Maroua Harrif
 
Text : Definition, Elaboration and Examples
Alaahussein81
 
Minimalist program
RabbiaAzam
 
Ad

Viewers also liked (20)

DOCX
NLP and its applications
Utphala P
 
PPTX
Natural Language Processing: Definition and Application
Stephen Shellman
 
PPTX
Statistical machine translation
Hrishikesh Nair
 
PPTX
Jeeves -natural language interface application
Karan Harsh Wardhan
 
PPTX
Startupfest 2015: HARPER REED (Modest, Inc.) - Lightning Keynote
Startupfest
 
PDF
Statistical machine translation in a few slides
Forcada Mikel
 
PDF
Natural language procesing in R
Olabanji Shonibare
 
PPTX
Machine translation with statistical approach
vini89
 
PDF
Intro to nlp
Rutu Mulkar-Mehta
 
PPSX
Gordana Panajotović - NLP Master
NLP Centar Beograd
 
PPTX
Text Mining Infrastructure in R
Ashraf Uddin
 
PDF
Introduction to nlp 2014
Grant Hamel
 
PPT
Types of machine translation
Rushdi Shams
 
PPTX
Text analytics in Python and R with examples from Tobacco Control
Ben Healey
 
PDF
Natural language processing (NLP) introduction
Robert Lujo
 
PDF
Practical Natural Language Processing
Jaganadh Gopinadhan
 
PPT
Introduction to Natural Language Processing
rohitnayak
 
PDF
Introducing natural language processing(NLP) with r
Vivian S. Zhang
 
PPTX
Natural language processing
Yogendra Tamang
 
PPTX
Natural Language Processing in R (rNLP)
fridolin.wild
 
NLP and its applications
Utphala P
 
Natural Language Processing: Definition and Application
Stephen Shellman
 
Statistical machine translation
Hrishikesh Nair
 
Jeeves -natural language interface application
Karan Harsh Wardhan
 
Startupfest 2015: HARPER REED (Modest, Inc.) - Lightning Keynote
Startupfest
 
Statistical machine translation in a few slides
Forcada Mikel
 
Natural language procesing in R
Olabanji Shonibare
 
Machine translation with statistical approach
vini89
 
Intro to nlp
Rutu Mulkar-Mehta
 
Gordana Panajotović - NLP Master
NLP Centar Beograd
 
Text Mining Infrastructure in R
Ashraf Uddin
 
Introduction to nlp 2014
Grant Hamel
 
Types of machine translation
Rushdi Shams
 
Text analytics in Python and R with examples from Tobacco Control
Ben Healey
 
Natural language processing (NLP) introduction
Robert Lujo
 
Practical Natural Language Processing
Jaganadh Gopinadhan
 
Introduction to Natural Language Processing
rohitnayak
 
Introducing natural language processing(NLP) with r
Vivian S. Zhang
 
Natural language processing
Yogendra Tamang
 
Natural Language Processing in R (rNLP)
fridolin.wild
 
Ad

Similar to Natural language processing 2 (20)

PPTX
Semantics
gandesAM
 
PPTX
Mental grammar
Rona Andres
 
PPTX
Visual Word Recognition. The Journey from Features to Meaning
fawzia
 
PPTX
Structural grammar iii
flakcute
 
PPTX
Syntactic Features in Mother Tongue.pptx
JamelMirafuentes
 
PDF
Understanding ASL Grammatical Features and Discourse Mapping
Doug Stringham
 
PPTX
Language in cognitive psychology
Ali Bahrani
 
PDF
Nlp ambiguity presentation
Gurram Poorna Prudhvi
 
PPT
Lecture Number 2 of Natural Language Processing
abcdefghijklmtuvwxyz
 
PPTX
Nlp Sentemental analysis of Tweetr And CaseStudy
Raza Azeem
 
PPSX
Semantics
Kocaeli University
 
PDF
05 linguistic theory meets lexicography
Duygu Aşıklar
 
PDF
Syntax
amirasoul
 
PDF
Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...
Guy De Pauw
 
KEY
Grammar Presentation
tickingmindpd
 
DOC
Assignment on morphology
Linda Midy
 
PPT
What English Do University Students Really Need
Hala Nur
 
PPTX
Grammar 4
liliaindriani
 
Semantics
gandesAM
 
Mental grammar
Rona Andres
 
Visual Word Recognition. The Journey from Features to Meaning
fawzia
 
Structural grammar iii
flakcute
 
Syntactic Features in Mother Tongue.pptx
JamelMirafuentes
 
Understanding ASL Grammatical Features and Discourse Mapping
Doug Stringham
 
Language in cognitive psychology
Ali Bahrani
 
Nlp ambiguity presentation
Gurram Poorna Prudhvi
 
Lecture Number 2 of Natural Language Processing
abcdefghijklmtuvwxyz
 
Nlp Sentemental analysis of Tweetr And CaseStudy
Raza Azeem
 
05 linguistic theory meets lexicography
Duygu Aşıklar
 
Syntax
amirasoul
 
Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge...
Guy De Pauw
 
Grammar Presentation
tickingmindpd
 
Assignment on morphology
Linda Midy
 
What English Do University Students Really Need
Hala Nur
 
Grammar 4
liliaindriani
 

Recently uploaded (20)

PDF
Data_Analytics_vs_Data_Science_vs_BI_by_CA_Suvidha_Chaplot.pdf
CA Suvidha Chaplot
 
PDF
OFFOFFBOX™ – A New Era for African Film | Startup Presentation
ambaicciwalkerbrian
 
PDF
Software Development Methodologies in 2025
KodekX
 
PPTX
Introduction to Flutter by Ayush Desai.pptx
ayushdesai204
 
PPTX
cloud computing vai.pptx for the project
vaibhavdobariyal79
 
PDF
Tea4chat - another LLM Project by Kerem Atam
a0m0rajab1
 
PDF
Get More from Fiori Automation - What’s New, What Works, and What’s Next.pdf
Precisely
 
PDF
Automating ArcGIS Content Discovery with FME: A Real World Use Case
Safe Software
 
PDF
CIFDAQ's Market Wrap : Bears Back in Control?
CIFDAQ
 
PDF
SparkLabs Primer on Artificial Intelligence 2025
SparkLabs Group
 
PPTX
What-is-the-World-Wide-Web -- Introduction
tonifi9488
 
PDF
Orbitly Pitch Deck|A Mission-Driven Platform for Side Project Collaboration (...
zz41354899
 
PPTX
Agile Chennai 18-19 July 2025 Ideathon | AI Powered Microfinance Literacy Gui...
AgileNetwork
 
PPTX
The-Ethical-Hackers-Imperative-Safeguarding-the-Digital-Frontier.pptx
sujalchauhan1305
 
PDF
How ETL Control Logic Keeps Your Pipelines Safe and Reliable.pdf
Stryv Solutions Pvt. Ltd.
 
PDF
Using Anchore and DefectDojo to Stand Up Your DevSecOps Function
Anchore
 
PDF
Structs to JSON: How Go Powers REST APIs
Emily Achieng
 
PPTX
Applied-Statistics-Mastering-Data-Driven-Decisions.pptx
parmaryashparmaryash
 
PPTX
The Future of AI & Machine Learning.pptx
pritsen4700
 
PDF
Trying to figure out MCP by actually building an app from scratch with open s...
Julien SIMON
 
Data_Analytics_vs_Data_Science_vs_BI_by_CA_Suvidha_Chaplot.pdf
CA Suvidha Chaplot
 
OFFOFFBOX™ – A New Era for African Film | Startup Presentation
ambaicciwalkerbrian
 
Software Development Methodologies in 2025
KodekX
 
Introduction to Flutter by Ayush Desai.pptx
ayushdesai204
 
cloud computing vai.pptx for the project
vaibhavdobariyal79
 
Tea4chat - another LLM Project by Kerem Atam
a0m0rajab1
 
Get More from Fiori Automation - What’s New, What Works, and What’s Next.pdf
Precisely
 
Automating ArcGIS Content Discovery with FME: A Real World Use Case
Safe Software
 
CIFDAQ's Market Wrap : Bears Back in Control?
CIFDAQ
 
SparkLabs Primer on Artificial Intelligence 2025
SparkLabs Group
 
What-is-the-World-Wide-Web -- Introduction
tonifi9488
 
Orbitly Pitch Deck|A Mission-Driven Platform for Side Project Collaboration (...
zz41354899
 
Agile Chennai 18-19 July 2025 Ideathon | AI Powered Microfinance Literacy Gui...
AgileNetwork
 
The-Ethical-Hackers-Imperative-Safeguarding-the-Digital-Frontier.pptx
sujalchauhan1305
 
How ETL Control Logic Keeps Your Pipelines Safe and Reliable.pdf
Stryv Solutions Pvt. Ltd.
 
Using Anchore and DefectDojo to Stand Up Your DevSecOps Function
Anchore
 
Structs to JSON: How Go Powers REST APIs
Emily Achieng
 
Applied-Statistics-Mastering-Data-Driven-Decisions.pptx
parmaryashparmaryash
 
The Future of AI & Machine Learning.pptx
pritsen4700
 
Trying to figure out MCP by actually building an app from scratch with open s...
Julien SIMON
 

Natural language processing 2

  • 2.  Overview  Basic knowledge  Demonstration
  • 3. LOGO USER : Men are all alike. ELIZA : In what way? USER : They’re always bugging us about something or other. ELIZA : Can you think of a specific example? USER : Well, my boyfriend made me come here. ELIZA : Your boyfriend made you come here ? USER : He says I’m depressed much of the time. ELIZA : I am sorry to hear you are depressed. USER : It’s true, I am unhappy. ELIZA : Do you think coming here will help you not to be unhappy? USER : I need some help; that much seems certain. ELIZA : What would it mean to you if you got some help? USER : Perhaps I could learn to get along with my mother. ELIZA : Tell me more about your family. USER : My mother takes care of me. ELIZA : Who else in your family takes care of you? USER : My father. ELIZA : Your father ? USER : You are like my father in some ways. ELIZA : What resemblance do you see?
  • 4.  A sub-field of Artificial Intelligent, since 1960s …  Concerned with the interactions between computers and human languages with one ultimate goal : Computers can “understand” human  Many applications in real world
  • 5.  Natural language unit?  Natural language understanding  Natural language generation  Data?  Speech processing  Text processing Natural language text understanding!
  • 6.  Task of generating natural language from a machine representation  May be viewed as the opposite of natural language understanding .  Applications:  Jokes generation  Textual summaries of databases  Enhancing accessibility
  • 7.  An advanced subtopic of NLP deals with reading comprehension  More complex than NLG  Many commercial interest in this field  News-gathering  Data-Mining  Voice-Activation  Large-scale content analysis
  • 8.  Logic is too clear, the lost of flexibility cause difficulties in NLP  Examples :  Time flies like an arrow Can be understood in 7 ways !!!  I never said she stole my money !  Someone else said it, but I didn't.
  • 9.  Logic is too clear, the lost of flexibility become difficulties in NLP  Examples :  Time flies like an arrow Can be understood in 7 ways !!!  I never said she stole my money !  I simply didn't ever say it
  • 10.  Logic is too clear, the lost of flexibility become difficulties in NLP  Examples :  Time flies like an arrow Can be understood in 7 ways !!!  I never said she stole my money !  I might have implied it in some way, but I never explicitly said it
  • 11.  Logic is too clear, the lost of flexibility become difficulties in NLP  Examples :  Time flies like an arrow Can be understood in 7 ways !!!  I never said she stole my money !  I said someone took it; I didn't say it was she
  • 12.  Logic is too clear, the lost of flexibility become difficulties in NLP  Examples:  Time flies like an arrow Can be understood in 7 ways !!!  I never said she stole my money !  I just said she probably borrowed it
  • 13.  Logic is too clear, the lost of flexibility become difficulties in NLP  Examples :  Time flies like an arrow Can be understood in 7 ways !!!  I never said she stole my money !  I said she stole someone else's money
  • 14.  Logic is too clear, the lost of flexibility become difficulties in NLP  Examples :  Time flies like an arrow Can be understood in 7 ways !!!  I never said she stole my money !  I said she stole something, but not my money
  • 15.  Words combination and division  Stress placing on words  The properties of subjects  We gave the monkeys the bananas because they were hungry  We gave the monkeys the bananas because they were over-ripe  Specifying which word an adjective applies to  A pretty little girls' school
  • 16.  Involves reasoning about the world  Embedded a social system of people interacting  persuading, insulting and amusing them  changing over time  Homonymous
  • 28.  ePi Group:  Automatic Vietnamese processing system  www.baomoi.com  Collecting news from all Vietnamese e-newspapers  EVTrans – Softex Co Ltd.  Cyclop  VnKim
  • 33.  Morphological analysis : Individual words are analyzed into their components  Syntactic analysis Linear sequence of words are transformed into structures that show how the words relate to each other  Semantic analysis  A transformation is made from the input text to an internal representation that reflects the meaning  Pragmatic analysis  To reinterpret what was said to what was actually meant  Discourse analysis  Resolving references between sentences
  • 36.  Morphemes: smallest meaningful unit spoken units of language.  Stem: book, cat, car, …  Affixes : un-, -s, -es, .. Morphology  Clitic: ‘ve, ‘m Syntax Semantic  Morphological parsing: parsing a word Pragmatic into stem and affixes and identifying the Discourse parts and their relationships
  • 37.  Word Classes  Parts of speech: noun, verb, adjectives, etc. Morphology  Word class dictates how a word combines with morphemes to form new words Syntax Semantic  Examples Pragmatic  Books: book + s Discourse  Unladylike = un + lady + like
  • 38.  Vietnamese?  Ăn = ăn Morphology  Uống = uống  Xe = xe Syntax Semantic  No ‘Xes’ in Vietnamese! Pragmatic  Problems are text tokenizing. Discourse
  • 39.  Why parse words? Morphology  To identify a word’s part-of-speech  To identify a word’s stem (IR) Syntax Semantic … then? Pragmatic  Spell- checking Discourse  To predict next words  To predict the word’s accent
  • 40.  Ambiguity  I want her to go to the cinema with me Morphology To - infinitive? Syntax To - preposition? Semantic Pragmatic  Con ngựa đá đá con ngựa đá. Discourse đá = đá?
  • 41.  How to implement?  Regular expression  Finite State Transducers (FST)  Finite State Accepter (FSA) Morphology Syntax *.exe Semantic ir??man Pragmatic b[0-9]+ *(Mb|[Mm]egabytes?)b Discourse
  • 43.  Relate terms:  Stem, stemming Morphology  Part of speech Syntax  N-gram Semantic Pragmatic Discourse
  • 45. Morphology SYNTAX Syntax Semantic Pragmatic Discourse
  • 46.  Linear sequences of words are transformed into structures that show how the words relate to each other.  Determine grammatical structure.  I am a boy = [Subject] [Verb] [Determiner] [Noun]
  • 48.  Syntax  Actual structure of a sentence  Grammar  The rule set used in the analysis
  • 49.  A grammar defines syntactically legal sentences  I ate an apple (syntactically legal)  I ate apple (not syntactically legal)  I ate a building (syntactically legal, but?)  Syntactically legal doesn’t mean that it’s meaningful!
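The three example sentences can be checked against a toy grammar. The rules below (S → NP V NP, NP → Pron | Det N) and the word list are assumptions made for this sketch. The grammar ignores a/an agreement and, as the slide points out, knows nothing about meaning.

```python
# Assumed part-of-speech lexicon covering only the slide's examples.
LEXICON = {"i": "Pron", "ate": "V", "a": "Det", "an": "Det",
           "apple": "N", "building": "N"}

def parse_np(tags, i):
    """Return the index just past a noun phrase starting at i, or None."""
    if i < len(tags) and tags[i] == "Pron":
        return i + 1
    if i + 1 < len(tags) and tags[i] == "Det" and tags[i + 1] == "N":
        return i + 2
    return None

def is_legal(sentence):
    """Check whether the sentence fits S -> NP V NP under the toy lexicon."""
    words = sentence.lower().split()
    if any(w not in LEXICON for w in words):
        return False
    tags = [LEXICON[w] for w in words]
    i = parse_np(tags, 0)
    if i is None or i >= len(tags) or tags[i] != "V":
        return False
    return parse_np(tags, i + 1) == len(tags)

for s in ["I ate an apple", "I ate apple", "I ate a building"]:
    print(s, "->", is_legal(s))
```

"I ate a building" is accepted because only structure is checked; rejecting it would need semantic knowledge.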
  • 50.  Ambiguities
  • 52. SEMANTICS
  • 53.  What could this mean…  Representations of linguistic inputs that capture the meanings of those inputs  For us it means  Representations that permit or facilitate semantic processing  Permit us to reason about their truth (relationship to some world)  Permit us to answer questions based on their content  Permit us to perform inference (answer questions and determine the truth of things we don’t actually know)
  • 55.  Requirements  Verifiability  Ambiguity (representations must be unambiguous)  Canonical form  Inference  Expressiveness
  • 57.  Pragmatics: concerns how sentences are used in different situations and how use affects the interpretation of the sentence  Discourse: concerns how the immediately preceding sentences affect the interpretation of the next sentence
  • 58.  ‘He’, ‘it’, ‘his’ can be inferred from the previous sentence  This is discourse
  • 64.  WordNet  MindNet  Stanford Tagger  Stanford Parser  …
  • 65.  Machine translation  Search engine  Information extraction  Chat bot
  • 69.  Can we use previously translated text to learn how to translate new texts?  Yes! But it’s not so easy  Two paradigms: statistical MT and example-based MT (EBMT)  Requirements:  Large aligned parallel corpus of translated sentence pairs {S source, S target}  Bilingual dictionary for intra-S alignment  Generalization patterns (names, numbers, dates…)
  • 70.  Simplest: Translation Memory  If S new = S source in the corpus, output the aligned S target  Compositional EBMT  If a fragment of S new matches a fragment of S source, output the corresponding fragment of the aligned S target  Prefer maximal-length fragments  Maximize grammatical compositionality  Via a target language grammar  Or via an N-gram statistical language model
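The translation-memory lookup and the preference for maximal-length fragments can be sketched as follows. The sketch assumes the aligned fragments have already been extracted into a dictionary (the `memory` argument, a hypothetical structure), copies unknown words through unchanged, and skips the grammatical-compositionality step entirely.

```python
def translate(sentence, memory):
    """Translate by exact lookup, else by greedy longest-fragment matching."""
    # Translation memory: the whole sentence was seen before.
    if sentence in memory:
        return memory[sentence]
    words = sentence.split()
    output, i = [], 0
    while i < len(words):
        # Try the longest fragment starting at position i first.
        for j in range(len(words), i, -1):
            fragment = " ".join(words[i:j])
            if fragment in memory:
                output.append(memory[fragment])
                i = j
                break
        else:
            # No fragment matched: pass the source word through untranslated.
            output.append(words[i])
            i += 1
    return " ".join(output)

memory = {
    "good morning": "bonjour",
    "thank you": "merci",
    "thank you very much": "merci beaucoup",
}
print(translate("good morning thank you very much", memory))
```

Because longer fragments are tried first, "thank you very much" is translated as one unit rather than as "thank you" plus leftover words.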
  • 71.  Requires an Interlingua - language-neutral Knowledge Representation (KR)  Philosophical debate: Is there an interlingua?  FOL is not totally language neutral (predicates, functions, expressed in a language)  Other near-interlinguas (Conceptual Dependency)  Requires a fully-disambiguating parser  Domain model of legal objects, actions, relations  Requires a NL generator (KR -> text)  Applicable only to well-defined technical domains  Produces high-quality MT in those domains
  • 73.  Each approach has its own strengths  Rapidly adaptable: statistical, example-based  Good grammar: rule-based (grammar)  High precision in narrow domain: Interlingua
  • 74.  Google  Yahoo  AltaVista  Answer.com
  • 75.  Spider: a browser-like program that downloads web pages.  Crawler: a program that automatically follows all of the links on each web page.  Indexer: a program that analyzes web pages downloaded by the spider and the crawler.  Database: storage for downloaded and processed pages.  Results engine: extracts search results from the database.  Web server: a server that is responsible for interaction between the user and other search engine components.
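The indexer and results engine can be illustrated with a tiny inverted index. In this sketch the downloaded pages are assumed to be available as an in-memory dictionary from URL to text (the URLs are placeholders), standing in for the spider, crawler and database, and a query matches only pages that contain every query word.

```python
from collections import defaultdict

def build_index(pages):
    """Indexer: map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Results engine: return the sorted URLs that contain every query word."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

pages = {
    "a.example": "natural language processing",
    "b.example": "language models",
    "c.example": "image processing",
}
index = build_index(pages)
print(search(index, "language processing"))
```

Real engines also rank the hits; this sketch only returns them in alphabetical order.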
  • 80.  Idea is to ‘extract’ particular types of information from arbitrary text or transcribed speech  Examples:  Named entities: people, places, organizations  Telephone numbers  Dates  Many uses:  Question answering systems, gisting of news or mail…  Job ads, financial information, terrorist attacks
  • 81.  Often uses a set of simple templates or frames with slots to be filled in from the input text, ignoring everything else.  Husni’s number is 966-3-860-2624.  The inventor of the First plane was Abbas ibnu Fernas  The British King died in March of 1932.
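Template filling of this kind is often approximated with regular expressions whose named groups act as the slots. The two templates below are hypothetical and written only to fit two of the example sentences on this slide; everything that does not match a template is ignored.

```python
import re

# Each template is a name plus a regex whose named groups are the slots (assumed patterns).
TEMPLATES = [
    ("phone", re.compile(
        r"(?P<person>[A-Z]\w+)[’']s number is (?P<number>[0-9][0-9-]*[0-9])")),
    ("death", re.compile(
        r"(?P<person>The [A-Z]\w+ [A-Z]\w+) died in "
        r"(?P<month>[A-Z][a-z]+) of (?P<year>[0-9]{4})")),
]

def extract(text):
    """Return one filled frame (a dict of slot values) per template match."""
    frames = []
    for name, pattern in TEMPLATES:
        for match in pattern.finditer(text):
            frames.append({"template": name, **match.groupdict()})
    return frames

print(extract("Husni's number is 966-3-860-2624."))
print(extract("The British King died in March of 1932."))
```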
  • 82.  Named Entity recognition (NE): finds and classifies names, places etc.  Co-reference Resolution (CO): identifies identity relations between entities in texts.  Template Element construction (TE): adds descriptive information to NE results (using CO).  Template Relation construction (TR): finds relations between TE entities.  Scenario Template production (ST): fits TE and TR results into specified event scenarios.
  • 89.  AIML = Artificial Intelligence Markup Language  Alice
  • 90.  A.L.I.C.E. (Artificial Linguistic Internet Computer Entity)  An award-winning free natural language artificial intelligence chat robot.  Rule-based  Human-like answers without a complicated “brain”  Multi-language
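The rule-based idea (a pattern paired with a response template, with no deeper "brain") can be sketched in a few lines. This is not AIML and not how A.L.I.C.E. is implemented; the rules, the default reply and the function name are all assumptions for illustration.

```python
import re

# Ordered (pattern, response template) rules; the first full match wins.
RULES = [
    (r".*\b(mother|father)\b.*", "Tell me more about your family."),
    (r"i am (.*)", "Why do you say you are {0}?"),
    (r"hello.*", "Hello! How are you?"),
]
DEFAULT_REPLY = "Please go on."

def respond(text):
    """Return the reply of the first rule whose pattern matches the whole input."""
    cleaned = text.lower().strip(" .!?")
    for pattern, template in RULES:
        match = re.fullmatch(pattern, cleaned)
        if match:
            # Captured text fills any {0}-style placeholders in the template.
            return template.format(*match.groups())
    return DEFAULT_REPLY

print(respond("My mother takes care of me."))
print(respond("I am unhappy"))
```

Adding behaviour means adding rules, which is why such bots can sound human-like on covered inputs and fall back to a stock reply elsewhere.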
  • 92.  NLP course, Husni Al-Muhtaseb  Lexical descriptions for Vietnamese language processing  en.wikipedia.org  www.xulyngonngu.com