Context-Free Grammars
Slideshare: http://www.slideshare.net/marinasantini1/lecture-contextfree-grammars
Mathematics for Language Technology
http://stp.lingfil.uu.se/~matsd/uv/uv15/mfst/
Last Updated: 6 March 2015
Marina Santini
santinim@stp.lingfil.uu.se
Department of Linguistics and Philology
Uppsala University, Uppsala, Sweden
Spring 2015
Acknowledgements
 Several slides borrowed from Jurafsky and Martin (2009).
 Practical activities by Mats Dahllöf and J&M (2009).
Reading
 Required Reading:
  Compendium (3): 8
  Mats Dahllöf: Kontext-fria grammatiker (CFG)
•  http://stp.lingfil.uu.se/~matsd/uv/uv14/mfst/dok/oh7.pdf
 Further Reading:
  Chapter 12.1-12.3 and Chapter 16.1-16.2.1 in Jurafsky D. & Martin J. (2009)
Speech and Language Processing: An introduction to natural language processing,
computational linguistics, and speech recognition.
Online draft version: http://stp.lingfil.uu.se/~santinim/ml/2014/JurafskyMartinSpeechAndLanguageProcessing2ed_draft%202007.pdf
Outline
 Context-Free Grammars (CFGs)
 A grammar for English: Examples
 Practical Activities
Context-Free Grammars
The Chomsky Hierarchy
Regular (DFA) ⊂ Context-free (PDA) ⊂ Context-sensitive (LBA) ⊂ Recursively enumerable (TM)
•  A containment hierarchy of classes of formal languages
Informal Comments
 A context-free grammar is a notation
for describing languages.
 It is more powerful than finite automata
or REs, but still cannot define all
possible languages.
 Useful for nested structures.
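A minimal sketch of why nesting matters, assuming the textbook language a^n b^n (it is not on the slide). The rules S -> a S b and S -> ε generate it, while no finite automaton or RE can, because each a has to be matched with its own b. The function names are illustrative only.

```python
def generate(n: int) -> str:
    """Apply S -> a S b exactly n times, then finish with S -> ε."""
    return "" if n == 0 else "a" + generate(n - 1) + "b"


def accepts(s: str) -> bool:
    """Check membership by undoing one S -> a S b step at a time."""
    if s == "":
        return True  # S -> ε
    return len(s) >= 2 and s[0] == "a" and s[-1] == "b" and accepts(s[1:-1])


print(generate(3))      # aaabbb
print(accepts("aabb"))  # True
print(accepts("aab"))   # False
```

The recursion in accepts mirrors the recursion in the rule: every call strips one matching outer pair, which is the nested structure a CFG expresses directly.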
Constituency
 The basic idea here is that groups of
words within utterances can be shown
to act as single units.
 And in a given language, these units
form coherent classes that can be
shown to behave in similar ways
  With respect to their internal structure
  And with respect to other units in the
language
Constituency
 Internal structure
  We can describe an internal structure to the
class (might have to use disjunctions of
somewhat unlike sub-classes to do this).
 External behavior
  For example, we can say that noun phrases
can come before verbs
Constituency
 For example, it makes sense to say
that the following are all noun phrases
in English...
 Why? One piece of evidence is that they
can all precede verbs.
  This is external evidence
Grammars and Constituency
 Of course, there’s nothing easy or obvious
about how we come up with the right set of
constituents and the rules that govern how
they combine...
 That’s why there are so many different
theories of grammar and competing analyses
of the same data.
Context-Free Grammars
 Context-free grammars (CFGs)
  Also known as
•  Phrase structure grammars
•  Backus-Naur form
 Consist of
  Rules
  Terminals
  Non-terminals
Context-Free Grammars
 Terminals
  We’ll take these to be words (for now)
 Non-Terminals
  The constituents in a language
•  Like noun phrase, verb phrase and sentence
 Rules
  Rules are equations that consist of a single
non-terminal on the left and any number of
terminals and non-terminals on the right.
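As an illustration, the three ingredients can be written down as plain data. The grammar below is a hypothetical toy example (its rules and words are assumptions, not taken from the slides): keys are non-terminals, and any right-hand-side symbol that is not a key counts as a terminal.

```python
# Hypothetical toy grammar, not taken from the slides.
# Keys are non-terminals; each value lists the alternative right-hand sides.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Noun"], ["ProperNoun"]],
    "VP": [["Verb"], ["Verb", "NP"]],
    "Det": [["the"], ["a"]],
    "Noun": [["flight"], ["plane"]],
    "ProperNoun": [["Denver"]],
    "Verb": [["left"], ["prefers"]],
}

# Every left-hand side is a single non-terminal, as the definition requires.
NONTERMINALS = set(RULES)

# Terminals are the right-hand-side symbols that never appear on a left-hand side.
TERMINALS = {sym for alts in RULES.values() for rhs in alts for sym in rhs} - NONTERMINALS

print(sorted(NONTERMINALS))
print(sorted(TERMINALS))
```

Storing alternatives as lists of lists keeps "any number of terminals and non-terminals on the right" explicit, including right-hand sides of length one.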
Some NP Rules
 Here are some rules for our noun phrases
 Together, these describe two kinds of NPs.
  One that consists of a determiner followed by a nominal
  And another that says that proper names are NPs.
  The third rule illustrates two things
•  An explicit disjunction
–  Two kinds of nominals
•  A recursive definition
–  Same non-terminal on the right and left-side of the rule
L0 Grammar
Generativity
 As with FSAs, you can view these rules
as either analysis or synthesis machines
  Generate strings in the language
  Reject strings not in the language
  Impose structures (trees) on strings in the
language
Derivations
 A derivation is a
sequence of rules
applied to a string
that accounts for
that string
  Covers all the
elements in the
string
  Covers only the
elements in the
string
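A derivation can be sketched as a list of sentential forms, each obtained from the previous one by rewriting a single non-terminal. The tiny grammar below is an assumption chosen so that it derives "a plane left"; leftmost_derivation is an illustrative helper, not part of any library.

```python
# Hypothetical one-sentence grammar; each non-terminal has a single alternative.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb"]],
    "Det": [["a"]],
    "Noun": [["plane"]],
    "Verb": [["left"]],
}


def leftmost_derivation(start, choices):
    """Rewrite the leftmost non-terminal at each step; choices[i] picks the alternative."""
    form = [start]
    steps = [list(form)]
    for choice in choices:
        # Position of the leftmost symbol that still has rules, i.e. a non-terminal.
        i = next(k for k, sym in enumerate(form) if sym in RULES)
        form = form[:i] + RULES[form[i]][choice] + form[i + 1:]
        steps.append(list(form))
    return steps


steps = leftmost_derivation("S", [0, 0, 0, 0, 0, 0])
for step in steps:
    print(" ".join(step))
```

The last line printed is the string itself: the derivation covers all of its words and nothing else.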
Definition - Repetition
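 More formally, a CFG consists of
  N: a set of non-terminal symbols
  Σ: a set of terminal symbols (disjoint from N)
  R: a set of rules of the form A → β, where A is a non-terminal
and β is a string of terminals and non-terminals
  S: a designated start symbol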
Parsing
 Parsing is the process of taking a string
and a grammar and returning one or more
parse trees for that string (an ambiguous
string has several)
  CFGs are simply a more powerful formalism
than FSAs
•  Remember this means that there are languages
we can capture with CFGs that we can’t
capture with finite-state methods
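A minimal sketch of a parser that returns every tree for a string. This is a plain exhaustive span-splitting search with memoization, not a specific algorithm from the course; it assumes no empty right-hand sides and no cyclic unit rules. The grammar and sentence are hypothetical and chosen to show an ambiguous string.

```python
from functools import lru_cache

# Hypothetical toy grammar (no empty right-hand sides, no cyclic unit rules).
RULES = {
    "S": [("NP", "VP")],
    "NP": [("Det", "Noun"), ("NP", "PP"), ("I",)],
    "VP": [("Verb", "NP"), ("VP", "PP")],
    "PP": [("Prep", "NP")],
    "Det": [("the",)],
    "Noun": [("man",), ("telescope",)],
    "Verb": [("saw",)],
    "Prep": [("with",)],
}


def parse(words, start="S"):
    """Return a tuple of all parse trees; a tree is (label, child, ...) and a leaf is a word."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def trees(sym, i, j):
        """All trees rooted in sym that cover exactly words[i:j]."""
        if sym not in RULES:  # terminal: must match a single word
            return (sym,) if j == i + 1 and words[i] == sym else ()
        found = []
        for rhs in RULES[sym]:
            for kids in expand(rhs, i, j):
                found.append((sym,) + kids)
        return tuple(found)

    def expand(rhs, i, j):
        """All ways for the symbols of rhs to cover words[i:j], each taking a non-empty span."""
        if len(rhs) == 1:
            return [(t,) for t in trees(rhs[0], i, j)]
        out = []
        # Leave at least one word for each of the remaining symbols.
        for k in range(i + 1, j - len(rhs) + 2):
            for left in trees(rhs[0], i, k):
                for rest in expand(rhs[1:], k, j):
                    out.append((left,) + rest)
        return out

    return trees(start, 0, len(words))


result = parse("I saw the man with the telescope".split())
print(len(result))  # 2
for tree in result:
    print(tree)
```

The two trees differ in where the PP attaches (to the VP or to the NP), which is the kind of case the "one or more parse trees" wording is about.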
A grammar for English: Examples
An English Grammar Fragment
 Sentences
 Noun phrases
 Verb phrases
Sentence Types
 Declaratives: A plane left.
S → NP VP
 Imperatives: Leave!
S → VP
 Yes-No Questions: Did the plane leave?
S → Aux NP VP
 WH Questions: When did the plane leave?
S → WH-NP Aux NP VP
Noun Phrases
 Let’s consider the following rule in
more detail...
NP → Det Nominal
 Most of the complexity of English noun
phrases is hidden in this rule.
 Consider the derivation for the following
example
  All the morning flights from Denver to
Tampa leaving before 10
Noun Phrases
NP Structure
 Clearly this NP is really about flights.
That’s the central, critical noun in this
NP. Let’s call that the head.
 We can dissect this kind of NP into the
stuff that can come before the head,
and the stuff that can come after it.
Determiners
 Noun phrases can start with
determiners...
 Determiners can be
  Simple lexical items: the, this, a, an, etc.
•  A car
  Or simple possessives
•  John’s car
  Or complex recursive versions of that
•  John’s sister’s husband’s son’s car
Nominals
 Contains the head and any pre- and
post- modifiers of the head.
  Pre-
•  Quantifiers, cardinals, ordinals...
–  Three cars
•  Adjectives and APs
–  large cars
•  Ordering constraints
–  Three large cars
–  ?large three cars
Postmodifiers
 Three kinds
  Prepositional phrases
•  From Seattle
  Non-finite clauses
•  Arriving before noon
  Relative clauses
•  That serve breakfast
 Same general (recursive) rule to handle these
  Nominal → Nominal PP
  Nominal → Nominal GerundVP
  Nominal → Nominal RelClause
Agreement
 By agreement, we have in mind
constraints that hold among various
constituents that take part in a rule or set
of rules
 For example, in English, determiners and
the head nouns in NPs have to agree in
their number.
This flight
Those flights
*This flights
*Those flight
Problem
 Our earlier NP rules are clearly deficient
since they don’t capture this constraint
  NP → Det Nominal
•  Accepts, and assigns correct structures to,
grammatical examples (this flight)
•  But it’s also happy with incorrect examples
(*these flight)
  Such a rule is said to overgenerate.
  We’ll come back to this in a bit
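The overgeneration can be made concrete with a small sketch. The lexicon and the two checker functions are assumptions for illustration, and Nominal is simplified to a single noun: the first function sees only categories, as the plain rule NP → Det Nominal does, and the second adds the number constraint that the plain rule cannot state.

```python
# Hypothetical mini-lexicon: word -> (category, number).
LEXICON = {
    "this": ("Det", "sg"),
    "these": ("Det", "pl"),
    "those": ("Det", "pl"),
    "flight": ("Noun", "sg"),
    "flights": ("Noun", "pl"),
}


def is_np_plain(det, noun):
    """NP -> Det Nominal, looking only at categories (all the plain CFG rule sees)."""
    return LEXICON[det][0] == "Det" and LEXICON[noun][0] == "Noun"


def is_np_agreeing(det, noun):
    """Same rule plus the number-agreement constraint."""
    return is_np_plain(det, noun) and LEXICON[det][1] == LEXICON[noun][1]


for det, noun in [("this", "flight"), ("those", "flights"),
                  ("this", "flights"), ("these", "flight")]:
    print(det, noun, is_np_plain(det, noun), is_np_agreeing(det, noun))
```

All four pairs pass the plain check; only the first two pass the agreeing one.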
Verb Phrases
 English VPs consist of a head verb along
with 0 or more following constituents
which we’ll call arguments.
Subcategorization
 But, even though there are many valid VP rules
in English, not all verbs are allowed to
participate in all those VP rules.
 We can subcategorize the verbs in a language
according to the sets of VP rules that they
participate in.
 This is a modern take on the traditional notion
of transitive/intransitive.
 Modern grammars may have 100s of such
classes.
Subcategorization
 Sneeze: John sneezed
 Find: Please find [a flight to NY]NP
 Give: Give [me]NP[a cheaper fare]NP
 Help: Can you help [me]NP[with a
flight]PP
 Prefer: I prefer [to leave earlier]TO-VP
 Told: I was told [United has a flight]S
 …
Subcategorization
 *John sneezed the book
 *I prefer United has a flight
 *Give with a flight
 As with agreement phenomena, we
need a way to formally express the
constraints
Why?
 Right now, the various rules for VPs
overgenerate.
  They permit the presence of strings containing
verbs and arguments that don’t go together
  For example
  VP -> V NP licenses “sneezed the book” as a
VP, since “sneezed” is a verb and “the book” is
a valid NP
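A sketch of the same point in code. The frame table is a hypothetical encoding loosely based on the verbs listed on the subcategorization slides; vp_plain treats V as one undifferentiated class, while vp_subcat consults each verb's frames.

```python
# Hypothetical subcategorization frames: verb -> allowed argument sequences.
FRAMES = {
    "sneezed": [()],
    "find": [("NP",)],
    "give": [("NP", "NP")],
    "help": [("NP", "PP")],
    "prefer": [("TO-VP",)],
    "told": [("S",)],
}


def vp_plain(verb, args):
    """VP -> V NP with V as a single class: any known verb may take an NP."""
    return verb in FRAMES and tuple(args) == ("NP",)


def vp_subcat(verb, args):
    """Accept the VP only if the arguments match one of the verb's own frames."""
    return tuple(args) in FRAMES.get(verb, [])


print(vp_plain("sneezed", ["NP"]))   # True: the plain rule overgenerates
print(vp_subcat("sneezed", ["NP"]))  # False
print(vp_subcat("sneezed", []))      # True
```

Inside a pure CFG, the frame lookup has to be compiled into separate verb classes and VP rules, one per frame.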
Possible CFG Solution
 Possible solution for
agreement.
 Can use the same
trick for all the verb/
VP classes.
 SgS -> SgNP SgVP
 PlS -> PlNP PlVP
 SgNP -> SgDet SgNom
 PlNP -> PlDet PlNom
 PlVP -> PlV NP
 SgVP -> SgV NP
 …
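A runnable sketch of the split-category trick. Several details are assumptions added for illustration: a top rule S -> SgS | PlS, intransitive verbs only (so the VP rules are shorter than those on the slide), and a four-word lexicon. The recognizer is a naive recursive search that is adequate here because the grammar has no left recursion.

```python
# Number is baked into the category names, so agreement is enforced by the rules.
RULES = {
    "S": [("SgS",), ("PlS",)],          # assumed top-level rule
    "SgS": [("SgNP", "SgVP")],
    "PlS": [("PlNP", "PlVP")],
    "SgNP": [("SgDet", "SgNom")],
    "PlNP": [("PlDet", "PlNom")],
    "SgVP": [("SgV",)],                 # simplified: intransitive verbs only
    "PlVP": [("PlV",)],
    "SgDet": [("this",)],
    "PlDet": [("these",)],
    "SgNom": [("flight",)],
    "PlNom": [("flights",)],
    "SgV": [("leaves",)],
    "PlV": [("leave",)],
}


def derives(sym, words):
    """True if sym can be rewritten into exactly this word sequence."""
    if sym not in RULES:  # terminal
        return len(words) == 1 and words[0] == sym
    return any(covers(rhs, words) for rhs in RULES[sym])


def covers(rhs, words):
    """True if the symbols of rhs can divide words among themselves, left to right."""
    if len(rhs) == 1:
        return derives(rhs[0], words)
    return any(derives(rhs[0], words[:k]) and covers(rhs[1:], words[k:])
               for k in range(1, len(words)))


print(derives("S", "this flight leaves".split()))   # True
print(derives("S", "this flights leave".split()))   # False
```

Every category that takes part in agreement now exists twice, which is the cost discussed on the next slide.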
CFG Solution for Agreement
 It works and stays within the power of CFGs
 But it’s ugly
 And it does not scale all that well because
the interaction among the various constraints
explodes the number of rules in our grammar
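A back-of-the-envelope sketch of the growth, assuming each combination of feature values needs its own copy of an agreeing rule such as S -> NP VP. The feature inventory is illustrative, not a claim about any particular grammar.

```python
from math import prod


def specialized_copies(feature_values):
    """Copies of one agreeing rule when every feature combination gets its own symbols."""
    return prod(len(values) for values in feature_values.values())


features = {"number": ["sg", "pl"]}
print(specialized_copies(features))  # 2

features["person"] = ["1", "2", "3"]
print(specialized_copies(features))  # 6

features["case"] = ["nom", "acc"]
print(specialized_copies(features))  # 12
```

The count multiplies with each added feature, and the verb subcategorization classes multiply the VP rules in the same way.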
The Point
 CFGs appear to be just about what we need to
account for a lot of basic syntactic structure in
English.
 But there are problems
  That can be dealt with adequately, although not
elegantly, by staying within the CFG framework.
 There are simpler, more elegant, solutions that
take us out of the CFG framework (beyond its
formal power)
  Ex: LFG (Lexical Functional Grammar), HPSG (Head-
Driven Phrase Structure Grammar), etc.
Summary
 Context-free grammars can be used to model
various facts about the syntax of a language.
 When paired with parsers, such grammars
constitute a critical component in many
applications.
 Constituency is a key phenomenon easily
captured with CFG rules.
  But agreement and subcategorization do pose
significant problems
Practical Activity 1
 The language L contains all strings over the
alphabet {a,b} that begin with a and end with b,
i.e.:
 Write context-free grammar rules that
generate the language L.
Practical Activity 1:
Possible Solution
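The content of this solution slide did not survive extraction. One grammar that appears to fit the task, offered only as a sketch and not as the lecturer's answer, is S -> a X b with X -> a X | b X | ε. The code below checks it by brute force against the description of L for all strings up to a chosen length; the function names are illustrative.

```python
from itertools import product

# Candidate grammar (an assumption, not the original slide): S -> aXb, X -> aX | bX | ε
RULES = {"S": ["aXb"], "X": ["aX", "bX", ""]}


def language(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    done, frontier = set(), {"S"}
    while frontier:
        form = frontier.pop()
        # Index of the leftmost non-terminal, or None if the form is all terminals.
        i = next((k for k, c in enumerate(form) if c in RULES), None)
        if i is None:
            done.add(form)
            continue
        for rhs in RULES[form[i]]:
            new = form[:i] + rhs + form[i + 1:]
            # Terminals never disappear, so forms with too many can be pruned.
            if sum(c not in RULES for c in new) <= max_len:
                frontier.add(new)
    return done


def expected(max_len):
    """Strings over {a, b} that begin with a and end with b, up to max_len."""
    return {"".join(p) for n in range(2, max_len + 1)
            for p in product("ab", repeat=n) if p[0] == "a" and p[-1] == "b"}


print(sorted(language(3)))           # ['aab', 'ab', 'abb']
print(language(6) == expected(6))    # True
```

Agreement up to a length bound is evidence rather than a proof, but it does catch the common slip of forgetting that the shortest string in L is ab.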
Practical Activity 2
 Look at the following CFG rules:
 Draw a parse tree for the string
aba.
Practical Activity 2:
Possible Solution
Practical Activity 3
Draw tree structures for the following phrases
and sentences:
1.  Dallas
2.  I would like to fly on American Airlines.
3.  after five p.m.
4.  Does American 487 have a first class section?
5.  early flights
6.  any delays in Denver
Practical Activity 3: Possible Solutions
The End
