Lecture 3: Syntactic Editor Services
CS4200 Compiler Construction
Eelco Visser
TU Delft
September 2018
This Lecture
Lexical syntax
- defining the syntax of tokens / terminals including layout
- making lexical syntax and layout explicit
Syntactic editor services
- more interpretations of syntax definitions
Formatting specification
- how to map (abstract syntax) trees to text
Syntactic completion
- proposing valid syntactic completions in an editor
Reading Material
The inverse of parsing is unparsing or pretty-printing or
formatting, i.e. mapping a tree representation of a program
to a textual representation. A plain context-free grammar
can be used as specification of an unparser. However, then
it is unclear where the whitespace should go.

This paper extends context-free grammars with templates
that provide hints for layout of program text when
formatting.
https://doi.org/10.1145/2427048.2427056
Based on master’s thesis project of Tobi Vollebregt
Syntax definitions can be used not just for parsing, but
for many other operations. This paper shows how
syntactic completion can be provided generically given
a syntax definition.
https://doi.org/10.1145/2427048.2427056
Part of the PhD thesis work of Eduardo Amorim
The SDF3 syntax definition formalism is documented at the metaborg.org website.
http://www.metaborg.org/en/latest/source/langdev/meta/lang/sdf3/index.html
Tiger Lexical Syntax
Tiger Lexical Syntax: Identifiers
module Identifiers
lexical syntax
Id = [a-zA-Z] [a-zA-Z0-9_]*
lexical restrictions
Id -/- [a-zA-Z0-9_]
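The follow restriction `Id -/- [a-zA-Z0-9_]` states that an `Id` may not be directly followed by another identifier character, which forces the longest match. A minimal Python sketch of that check, assuming a hypothetical helper `match_id` that is not part of SDF3 or Spoofax:

```python
import re

ID_START = re.compile(r"[a-zA-Z]")
ID_CHAR = re.compile(r"[a-zA-Z0-9_]")


def match_id(text, start, end):
    """Return True if text[start:end] is an Id that satisfies the follow restriction."""
    lexeme = text[start:end]
    if not lexeme or not ID_START.fullmatch(lexeme[0]):
        return False
    if not all(ID_CHAR.fullmatch(c) for c in lexeme[1:]):
        return False
    # Follow restriction: the next character must not be an identifier character.
    return end >= len(text) or not ID_CHAR.fullmatch(text[end])
```

Without the restriction, the prefix `fo` of `foo1` would also be accepted as an `Id`; with it, only the full lexeme passes.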
Tiger Lexical Syntax: Number Literals
module Numbers
lexical syntax
IntConst = [0-9]+
lexical syntax
RealConst.RealConstNoExp = IntConst "." IntConst
RealConst.RealConst = IntConst "." IntConst "e" Sign IntConst
Sign = "+"
Sign = "-"
context-free syntax
Exp.Int = IntConst
Tiger Lexical Syntax: String Literals
module Strings
sorts StrConst
lexical syntax
  StrConst = "\"" StrChar* "\""
  StrChar = ~[\\\"\n]
  StrChar = [\\] [n]
  StrChar = [\\] [t]
  StrChar = [\\] [\^] [A-Z]
  StrChar = [\\] [0-9] [0-9] [0-9]
  StrChar = [\\] [\"]
  StrChar = [\\] [\\]
  StrChar = [\\] [\ \t\n]+ [\\]
context-free syntax // records
  Exp.String = StrConst
Tiger Lexical Syntax: Whitespace
module Whitespace
lexical syntax
  LAYOUT = [\ \t\n\r]
context-free restrictions
  // Ensure greedy matching for comments
  LAYOUT? -/- [\ \t\n\r]
  LAYOUT? -/- [\/].[\/]
  LAYOUT? -/- [\/].[\*]
syntax
  LAYOUT-CF = LAYOUT-LEX
  LAYOUT-CF = LAYOUT-CF LAYOUT-CF {left}
Implicit composition of layout
Tiger Lexical Syntax: Comment
lexical syntax
  CommentChar = [\*]
  LAYOUT = "/*" InsideComment* "*/"
  InsideComment = ~[\*]
  InsideComment = CommentChar
lexical restrictions
  CommentChar -/- [\/]
context-free restrictions
  LAYOUT? -/- [\/].[\*]
lexical syntax
  LAYOUT = SingleLineComment
  SingleLineComment = "//" ~[\n\r]* NewLineEOF
  NewLineEOF = [\n\r]
  NewLineEOF = EOF
  EOF =
lexical restrictions
  EOF -/- ~[]
context-free restrictions
  LAYOUT? -/- [\/].[\/]
Desugaring Lexical Syntax
Explication of Lexical Syntax
Core language
- context-free grammar productions
- with constructors
- only character classes as terminals
- explicit definition of layout
Desugaring
- express lexical syntax in terms of character classes
- explicate layout between context-free syntax symbols
- separate lexical and context-free syntax non-terminals
Explication of Layout by Transformation
context-free syntax
Exp.Int = IntConst
Exp.Uminus = "-" Exp
Exp.Times = Exp "*" Exp {left}
Exp.Divide = Exp "/" Exp {left}
Exp.Plus = Exp "+" Exp {left}
syntax
Exp-CF.Int = IntConst-CF
Exp-CF.Uminus = "-" LAYOUT?-CF Exp-CF
Exp-CF.Times = Exp-CF LAYOUT?-CF "*" LAYOUT?-CF Exp-CF {left}
Exp-CF.Divide = Exp-CF LAYOUT?-CF "/" LAYOUT?-CF Exp-CF {left}
Exp-CF.Plus = Exp-CF LAYOUT?-CF "+" LAYOUT?-CF Exp-CF {left}
Symbols in context-free syntax are
implicitly separated by optional layout
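The transformation above can be sketched in Python. This is only an illustration under stated assumptions: `explicate_layout` is a hypothetical helper, productions are given as plain lists of symbols, and literals are recognized by their double quotes. It is not how the SDF3 normalizer is implemented.

```python
def explicate_layout(sort, constructor, symbols, attrs=""):
    """Desugar one context-free production into a kernel 'syntax' production."""

    def lift(symbol):
        # Literals stay as they are; sorts become context-free non-terminals.
        return symbol if symbol.startswith('"') else f"{symbol}-CF"

    # Optional layout is inserted between every pair of adjacent symbols.
    rhs = " LAYOUT?-CF ".join(lift(s) for s in symbols)
    head = f"{sort}-CF.{constructor}"
    return f"{head} = {rhs} {attrs}".rstrip()


print(explicate_layout("Exp", "Times", ["Exp", '"*"', "Exp"], "{left}"))
```

Layout is inserted only between symbols, never before the first or after the last one, which matches the `Exp-CF` productions shown on the slide.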
Separation of Lexical and Context-free Syntax
lexical syntax
  Id = [a-zA-Z] [a-zA-Z0-9_]*
  Id = "if" {reject}
  Id = "then" {reject}
context-free syntax
  Exp.Var = Id
syntax
  Id = [\65-\90\97-\122] [\48-\57\65-\90\95\97-\122]*
  Id = "if" {reject}
  Id = "then" {reject}
  Exp.Var = Id
syntax
  Id-LEX = [\65-\90\97-\122] [\48-\57\65-\90\95\97-\122]*-LEX
  Id-LEX = "if" {reject}
  Id-LEX = "then" {reject}
  Id-CF = Id-LEX
  Exp-CF.Var = Id-CF
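In the desugared grammar, character classes are written with numeric character codes instead of characters. A small Python sketch of that conversion, assuming a hypothetical helper `normalize_char_class` that takes ranges as pairs of characters (the real normalizer works on parsed grammars, not on pairs like these):

```python
def normalize_char_class(ranges):
    """Turn (low_char, high_char) pairs into a numeric, sorted character class."""
    codes = sorted((ord(lo), ord(hi)) for lo, hi in ranges)
    # A single character is written once; a proper range as low-high.
    parts = [f"\\{lo}" if lo == hi else f"\\{lo}-\\{hi}" for lo, hi in codes]
    return "[" + "".join(parts) + "]"


# [a-zA-Z0-9_] with the ranges given in source order
print(normalize_char_class([("a", "z"), ("A", "Z"), ("0", "9"), ("_", "_")]))
```

Sorting by character code is why `0-9` (48-57) comes first and `_` (95) ends up between the upper-case and lower-case letter ranges.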
Why Separation of Lexical and Context-Free Syntax?
Homework: what would go wrong if we did not do this?
lexical syntax
Id = [a-zA-Z] [a-zA-Z0-9_]*
Id = "if" {reject}
Id = "then" {reject}
context-free syntax
Exp.Var = Id
lexical syntax
  Id = [a-zA-Z] [a-zA-Z0-9_]*
  Id = "if" {reject}
  Id = "then" {reject}
lexical restrictions
  Id -/- [a-zA-Z0-9_]
  "if" "then" -/- [a-zA-Z0-9_]
context-free syntax
  Exp.Var = Id
  Exp.Call = Exp Exp {left}
  Exp.IfThen = "if" Exp "then" Exp
syntax
  "if" = [\105] [\102]
  "then" = [\116] [\104] [\101] [\110]
  [\48-\57\65-\90\95\97-\122]+-LEX = [\48-\57\65-\90\95\97-\122]
  [\48-\57\65-\90\95\97-\122]+-LEX = [\48-\57\65-\90\95\97-\122]+-LEX [\48-\57\65-\90\95\97-\122]
  [\48-\57\65-\90\95\97-\122]*-LEX =
  [\48-\57\65-\90\95\97-\122]*-LEX = [\48-\57\65-\90\95\97-\122]+-LEX
  Id-LEX = [\65-\90\97-\122] [\48-\57\65-\90\95\97-\122]*-LEX
  Id-LEX = "if" {reject}
  Id-LEX = "then" {reject}
  Id-CF = Id-LEX
  Exp-CF.Var = Id-CF
  Exp-CF.Call = Exp-CF LAYOUT?-CF Exp-CF {left}
  Exp-CF.IfThen = "if" LAYOUT?-CF Exp-CF LAYOUT?-CF "then" LAYOUT?-CF Exp-CF
  LAYOUT-CF = LAYOUT-CF LAYOUT-CF {left}
  LAYOUT?-CF = LAYOUT-CF
  LAYOUT?-CF =
restrictions
  Id-LEX -/- [\48-\57\65-\90\95\97-\122]
  "if" -/- [\48-\57\65-\90\95\97-\122]
  "then" -/- [\48-\57\65-\90\95\97-\122]
priorities
  Exp-CF.Call left Exp-CF.Call,
  LAYOUT-CF = LAYOUT-CF LAYOUT-CF left LAYOUT-CF = LAYOUT-CF LAYOUT-CF
separate lexical and context-free syntax
separate context-free symbols by optional layout
character classes as only terminals
Syntactic Editor Services
Editor Services
[Diagram: Source Code Editor, Parse, Abstract Syntax Tree, Feedback & Operations]
Feedback
- syntax coloring
- syntax checking
- outline view
Operations
- syntactic completion
- formatting
- abstract syntax tree
Language Project Configuration (ESV)
module Main
imports
Syntax
Analysis
language
extensions : tig
//provider : target/metaborg/stratego.ctree
provider : target/metaborg/stratego.jar
provider : target/metaborg/stratego-javastrat.jar
menus
menu: "Transform" (openeditor) (realtime)
action: "Desugar" = editor-desugar (source)
action: "Desugar AST" = editor-desugar-ast (source)
Configuration of Syntactic Services (ESV)
module Syntax
imports
libspoofax/color/default
completion/colorer/Tiger-cc-esv
language
table : target/metaborg/sdf-new.tbl
//table : target/metaborg/sdf.tbl
start symbols : Module
line comment : "//"
block comment : "/*" * "*/"
fences : [ ] ( ) { }
menus
menu: "Syntax" (openeditor)
action: "Format" = editor-format (source)
action: "Show parsed AST" = debug-show-aterm (source)
views
outline view: editor-outline (source)
expand to level: 3
Syntax Coloring
Generated Syntax Coloring
// compute the n-th fibonacci number
let function fib(n: int): int =
      if n <= 1 then 1
      else fib(n - 1) + fib(n - 2)
 in fib(10)
end
module libspoofax/color/default
imports
  libspoofax/color/colors
colorer // Default, token-based highlighting
  keyword : 127 0 85 bold
  identifier : default
  string : blue
  number : darkgreen
  var : 139 69 19 italic
  operator : 0 0 128
  layout : 63 127 95 italic
Customized Syntax Coloring
module Tiger-Colorer
colorer
red = 255 0 0
green = 0 255 0
blue = 0 0 255
TUDlavender = 123 160 201
colorer token-based highlighting
keyword : red
Id : TUDlavender
StrConst : darkgreen
TypeId : blue
layout : green
// compute the n-th fibonacci number
let function fib(n: int): int =
if n <= 1 then 1
else fib(n - 1) + fib(n - 2)
in fib(10)
end
From Unparsers to Pretty-Printers with Templates
Converting between Text and Tree Representations
[Diagram: parse maps text to a tree; format maps a tree back to text]
Unparsing: From Abstract Syntax Term to Text
rules
  unparse :
    Int(x) -> x
  unparse :
    Plus(e1, e2) -> $[[<unparse> e1] + [<unparse> e2]]
  unparse :
    Times(e1, e2) -> $[[<unparse> e1] * [<unparse> e2]]
Mod(
  Plus(
    Int("1")
  , Times(
      Int("2")
    , Plus(Int("3"), Int("4"))
    )
  )
)
1 + 2 * 3 + 4
take priorities into account!
context-free syntax
  Exp.Int = IntConst
  Exp.Times = Exp "*" Exp
  Exp.Plus = Exp "+" Exp
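The naive output `1 + 2 * 3 + 4` parses back to a different tree than the one that was unparsed, which is why priorities have to be taken into account. A minimal Python sketch of an unparser that inserts parentheses, assuming terms are encoded as tuples, `*` binds tighter than `+`, and both operators are left-associative (these priorities are assumptions; the grammar on the slide does not declare any):

```python
# Higher number binds tighter; both operators are treated as left-associative.
PRIORITY = {"Times": 2, "Plus": 1}
SYMBOL = {"Times": "*", "Plus": "+"}


def unparse(term, context=0, right_operand=False):
    """term is ("Int", "1") or (operator, left, right)."""
    if term[0] == "Int":
        return term[1]
    op, left, right = term
    prio = PRIORITY[op]
    text = f"{unparse(left, prio)} {SYMBOL[op]} {unparse(right, prio, True)}"
    # Parenthesize when the context binds tighter, or equally tight on the right.
    needs_parens = prio < context or (prio == context and right_operand)
    return f"({text})" if needs_parens else text


tree = ("Plus", ("Int", "1"),
        ("Times", ("Int", "2"), ("Plus", ("Int", "3"), ("Int", "4"))))
print(unparse(tree))  # 1 + 2 * (3 + 4)
```

The parentheses are derived purely from the priority and associativity tables, which mirrors the idea that parenthesization can be derived from the disambiguation declarations of the grammar.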
Pretty-Printing
From ASTs to text
- insert keywords
- insert layout: spaces, line breaks, indentation
- insert parentheses to preserve tree structure
Unparser
- derive transformation rules from context-free grammar
- keywords, literals defined in grammar productions
- parentheses determined by priority, associativity rules
- separate all symbols by a space => not pretty, or even readable
Pretty-printer
- introduce spaces, line breaks, and indentation to produce readable text
- doing that manually is tedious
Specifying Formatting Layout with Templates
context-free syntax
  Exp.Seq = <
    (
      <{Exp ";\n"}*>
    )
  >
  Exp.If = <
    if <Exp> then
      <Exp>
    else
      <Exp>
  >
  Exp.IfThen = <
    if <Exp> then
      <Exp>
  >
  Exp.While = <
    while <Exp> do
      <Exp>
  >
Inverse quotation
- template quotes literal text with <>
- anti-quotations insert non-terminals with <>
Layout directives
- whitespace (linebreaks, indentation, spaces) in template guides formatting
- is interpreted as LAYOUT? for parsing
Formatter generation
- generate rules for mapping AST to text (via box expressions)
Applications
- code generation; pretty-printing generated AST
- syntactic completions
- formatting
Templates for Tiger: Binary Expressions
context-free syntax
  Exp.Int = IntConst
  Exp.Uminus = [- [Exp]]
  Exp.Times = [[Exp] * [Exp]] {left}
  Exp.Divide = [[Exp] / [Exp]] {left}
  Exp.Plus = [[Exp] + [Exp]] {left}
  Exp.Minus = [[Exp] - [Exp]] {left}
  Exp.Eq = [[Exp] = [Exp]] {non-assoc}
  Exp.Neq = [[Exp] <> [Exp]] {non-assoc}
  Exp.Gt = [[Exp] > [Exp]] {non-assoc}
  Exp.Lt = [[Exp] < [Exp]] {non-assoc}
  Exp.Geq = [[Exp] >= [Exp]] {non-assoc}
  Exp.Leq = [[Exp] <= [Exp]] {non-assoc}
  Exp.And = [[Exp] & [Exp]] {left}
  Exp.Or = [[Exp] | [Exp]] {left}
Use [] quotes instead of <> to avoid clash with comparison operators
Templates for Tiger: Functions
context-free syntax
  Dec.FunDecs = <<{FunDec "\n"}+>> {longest-match}
  FunDec.ProcDec = <
    function <Id>(<{FArg ", "}*>) =
      <Exp>
  >
  FunDec.FunDec = <
    function <Id>(<{FArg ", "}*>) : <Type> =
      <Exp>
  >
  FArg.FArg = <<Id> : <Type>>
  Exp.Call = <<Id>(<{Exp ", "}*>)>
No space after function name in call
Space after comma!
Function declarations separated by newline
Indent body of function
Templates for Tiger: Bindings and Records
context-free syntax
  Exp.Let = <
    let
      <{Dec "\n"}*>
    in
      <{Exp ";\n"}*>
    end
  >
context-free syntax // records
  Type.RecordTy = <
    {
      <{Field ", \n"}*>
    }
  >
  Field.Field = <<Id> : <TypeId>>
  Exp.NilExp = <nil>
  Exp.Record = <<TypeId>{ <{InitField ", "}*> }>
  InitField.InitField = <<Id> = <Exp>>
  LValue.FieldVar = <<LValue>.<Id>>
Note spacing / layout in separators
Generating Pretty-Print Rules from Template Productions
context-free syntax
  FunDec.FunDec = <
    function <Id>(<{FArg ", "}*>) : <Type> =
      <Exp>
  >
rules
  prettyprint-Tiger-FunDec :
    ProcDec(t1__, t2__, t3__) -> [ H(
                                     [SOpt(HS(), "0")]
                                   , [ S("function ")
                                     , t1__'
                                     , S("(")
                                     , t2__'
                                     , S(") =")
                                     ]
                                   )
                                 , t3__'
                                 ]
    with t1__' := <pp-one-Z(prettyprint-Tiger-Id) <+ pp-one-Z(prettyprint-completion-aux)> t1__
    with t2__' := <pp-H-list(prettyprint-Tiger-FArg|", ")
                   <+ pp-one-Z(prettyprint-completion-aux)> t2__
    with t3__' := <pp-indent(|"2")> [ <pp-one-Z(prettyprint-Tiger-Exp) <+ pp-one-Z(prettyprint-completion-aux)> t3__ ]
Separation of concerns:
- generated formatter transforms AST to Box
- Box formatter produces text
Boxes for Formatting
_1    "foo"    KW [ "foo" ]
literal text, keywords, parameters
Horizontal Layout
H hs=x [ B B B ]
hs: horizontal space between boxes
Vertical Layout
V vs=y is=i [ B B B ]
vs: vertical space between boxes; is: indentation space
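To make the H and V boxes concrete, here is a rough Python sketch of a box renderer. It is a simplification under stated assumptions: boxes are plain strings, `render_h` and `render_v` are hypothetical names, and `is` is taken to indent every box after the first one. The actual Box formatter used by Spoofax is more elaborate.

```python
def render_h(boxes, hs=1):
    """H box: place single-line boxes next to each other, hs spaces apart."""
    return (" " * hs).join(boxes)


def render_v(boxes, vs=0, is_=0):
    """V box: stack boxes with vs blank lines between them.

    Assumption: boxes after the first are indented by is_ spaces.
    """
    lines = []
    for i, box in enumerate(boxes):
        indent = " " * is_ if i > 0 else ""
        if i > 0:
            lines.extend([""] * vs)
        lines.extend(indent + line for line in box.split("\n"))
    return "\n".join(lines)


header = render_h(["function", "fib(n: int)", "="])
print(render_v([header, "fib(n - 1) + fib(n - 2)"], is_=2))
```

Nesting an H box inside a V box gives the "header line, indented body" shape that the function declaration templates ask for.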
Tiger Syntax Definition with Templates
Tiger Syntax: Composition
module Tiger
imports Whitespace
imports Comments
imports Types
imports Identifiers
imports Bindings
imports Variables
imports Functions
imports Numbers
imports Strings
imports Records
imports Arrays
imports Control-Flow
context-free start-symbols Module
context-free syntax
Module.Mod = Exp
context-free priorities
Exp.Or > Exp.Array > Exp.Assign ,
{Exp.Uminus LValue.FieldVar LValue.Subscript}
> {left : Exp.Times Exp.Divide}
Tiger Syntax: Identifiers and Strings
module Identifiers
lexical syntax
  Id = [a-zA-Z] [a-zA-Z0-9_]*
lexical restrictions
  Id -/- [a-zA-Z0-9_]
lexical syntax
  Id = "nil" {reject}
  Id = "let" {reject}
  Id = … {reject}
module Strings
sorts StrConst
lexical syntax
  StrConst = "\"" StrChar* "\""
  StrChar = ~[\\\"\n]
  StrChar = [\\] [n]
  StrChar = [\\] [t]
  StrChar = [\\] [\^] [A-Z]
  StrChar = [\\] [0-9] [0-9] [0-9]
  StrChar = [\\] [\"]
  StrChar = [\\] [\\]
  StrChar = [\\] [\ \t\n]+ [\\]
context-free syntax // records
  Exp.String = StrConst
Tiger Syntax: Whitespace & Comments
module Whitespace
lexical syntax
  LAYOUT = [\ \t\n\r]
context-free restrictions
  LAYOUT? -/- [\ \t\n\r]
module Comments
lexical syntax // multiline comments
  CommentChar = [\*]
  LAYOUT = "/*" InsideComment* "*/"
  InsideComment = ~[\*]
  InsideComment = CommentChar
lexical restrictions
  CommentChar -/- [\/]
context-free restrictions
  LAYOUT? -/- [\/].[\/]
lexical syntax // single line comments
  LAYOUT = "//" ~[\n\r]* NewLineEOF
  NewLineEOF = [\n\r]
  NewLineEOF = EOF
  EOF =
  // end of file since it cannot be followed by any character
  // avoids the need for a newline to close a single line comment
  // at the last line of a file
lexical restrictions
  EOF -/- ~[]
context-free restrictions
  LAYOUT? -/- [\/].[\*]
Tiger Syntax: Numbers
module Numbers
lexical syntax
  IntConst = [0-9]+
context-free syntax
  Exp.Int = IntConst
  Exp.Uminus = [- [Exp]]
  Exp.Times = [[Exp] * [Exp]] {left}
  Exp.Divide = [[Exp] / [Exp]] {left}
  Exp.Plus = [[Exp] + [Exp]] {left}
  Exp.Minus = [[Exp] - [Exp]] {left}
  Exp.Eq = [[Exp] = [Exp]] {non-assoc}
  Exp.Neq = [[Exp] <> [Exp]] {non-assoc}
  Exp.Gt = [[Exp] > [Exp]] {non-assoc}
  Exp.Lt = [[Exp] < [Exp]] {non-assoc}
  Exp.Geq = [[Exp] >= [Exp]] {non-assoc}
  Exp.Leq = [[Exp] <= [Exp]] {non-assoc}
  Exp.And = [[Exp] & [Exp]] {left}
  Exp.Or = [[Exp] | [Exp]] {left}
context-free priorities
  {Exp.Uminus}
  > {left : Exp.Times Exp.Divide}
  > {left : Exp.Plus Exp.Minus}
  > {non-assoc : Exp.Eq Exp.Neq Exp.Gt Exp.Lt Exp.Geq Exp.Leq}
  > Exp.And
  > Exp.Or
Tiger Syntax: Variables and Functions
module Bindings
imports Control-Flow
imports Identifiers
imports Types
imports Functions
imports Variables
sorts Declarations
context-free syntax
  Exp.Let = <
    let
      <{Dec "\n"}*>
    in
      <{Exp ";\n"}*>
    end
  >
  Declarations.Declarations = <
    declarations <{Dec "\n"}*>
  >
module Variables
imports Identifiers
imports Types
sorts Var
context-free syntax
  Dec.VarDec = <var <Id> : <Type> := <Exp>>
  Dec.VarDecNoType = <var <Id> := <Exp>>
  Var.Var = Id
  LValue = Var
  Exp = LValue
  Exp.Assign = <<LValue> := <Exp>>
module Functions
imports Identifiers
imports Types
context-free syntax
  Dec.FunDecs = <<{FunDec "\n"}+>> {longest-match}
  FunDec.ProcDec = <
    function <Id>(<{FArg ", "}*>) =
      <Exp>
  >
  FunDec.FunDec = <
    function <Id>(<{FArg ", "}*>) : <Type> =
      <Exp>
  >
  FArg.FArg = <<Id> : <Type>>
  Exp.Call = <<Id>(<{Exp ", "}*>)>
Tiger Syntax: Records, Arrays, Types
module Records
imports Base
imports Identifiers
imports Types
context-free syntax // records
  Type.RecordTy = <
    {
      <{Field ", \n"}*>
    }
  >
  Field.Field = <<Id> : <TypeId>>
  Exp.NilExp = <nil>
  Exp.Record = <<TypeId>{ <{InitField ", "}*> }>
  InitField.InitField = <<Id> = <Exp>>
  LValue.FieldVar = <<LValue>.<Id>>
module Arrays
imports Types
context-free syntax // arrays
  Type.ArrayTy = <array of <TypeId>>
  Exp.Array = <<TypeId>[<Exp>] of <Exp>>
  LValue.Subscript = <<LValue>[<Index>]>
  Index = Exp
module Types
imports Identifiers
imports Bindings
sorts Type
context-free syntax // type declarations
  Dec.TypeDecs = <<{TypeDec "\n"}+>> {longest-match}
  TypeDec.TypeDec = <type <Id> = <Type>>
context-free syntax // type expressions
  Type = TypeId
  TypeId.Tid = Id
sorts Ty
context-free syntax // semantic types
  Ty.INT = <INT>
  Ty.STRING = <STRING>
  Ty.NIL = <NIL>
  Ty.UNIT = <UNIT>
  Ty.NAME = <NAME <Id>>
  Ty.RECORD = <RECORD <Id>>
  Ty.ARRAY = <ARRAY <Ty> <Id>>
  Ty.FUN = <FUN ( <{Ty ","}*> ) <Ty>>
Syntactic Completion
S. Amann, S. Proksch, S. Nadi, and M. Mezini. A study of
Visual Studio usage in practice. In SANER, 2016.
Syntactic Completion
Semantic Completion
Problems
Ad-hoc: re-implement for each language / IDE
Incomplete: not all programs reachable
Unsound: propose invalid constructs
Sound and Complete Syntactic Code Completion from Syntax Definition
Explicit Placeholders
Incomplete programs
Make incompleteness explicit using placeholders!
Deriving Syntactic Completion from Syntax Definition
context-free syntax // regular production
Statement.If = "if" "(" Exp ")" Statement "else" Statement
Placeholders as Language Constructs
context-free syntax // placeholder rule
  Statement.Statement-Plhdr = "$Statement"
context-free syntax // regular production
  Statement.If = "if" "(" Exp ")" Statement "else" Statement
Calculate Placeholder Expansions
context-free syntax // regular production
  Statement.If = "if" "(" Exp ")" Statement "else" Statement
context-free syntax // placeholder rule
  Statement.Statement-Plhdr = "$Statement"
[Diagram: Stm-Plhdr is rewritten to If with children Exp-Plhdr, Stm-Plhdr, Stm-Plhdr]
rules
  rewrite-placeholder:
    Statement-Plhdr() -> If(Exp-Plhdr(), Statement-Plhdr(),
                            Statement-Plhdr())
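The rewrite rule above can be derived mechanically from the production: literals contribute no subterm, and each sort on the right-hand side becomes a placeholder of that sort. A minimal Python sketch of that derivation, assuming productions are encoded as tuples and `placeholder_expansions` is a hypothetical helper (the actual generator produces Stratego rules, not strings):

```python
# Each production: (sort, constructor, right-hand side symbols).
PRODUCTIONS = [
    ("Statement", "If",
     ['"if"', '"("', "Exp", '")"', "Statement", '"else"', "Statement"]),
]


def placeholder_expansions(sort, productions):
    """Return one expansion term per production of the given sort."""
    expansions = []
    for prod_sort, constructor, symbols in productions:
        if prod_sort != sort:
            continue
        # Literals carry no subterm; every sort becomes a placeholder of that sort.
        args = [f"{s}-Plhdr()" for s in symbols if not s.startswith('"')]
        expansions.append(f"{constructor}({', '.join(args)})")
    return expansions


print(placeholder_expansions("Statement", PRODUCTIONS))
```

Each returned term is a completion proposal for a `Statement` placeholder; pretty-printing it with the template of the production gives the text that is inserted in the editor.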
[Diagram: program states: incomplete programs, complete programs, correct programs; transitions: expand placeholder, expand/overwrite placeholders]
How to expand a complete program?
Insert a placeholder?
Inferring Placeholders
Placeholder Inference
Optional Parent
List of Statements
Add Optional Element
No source region
Add first or last element
Add Element to List
Placeholder Inference: Optional
[Diagram: AST for the source text class A { […] }; node labels: ClassDecl, "A", NoParent, Cons, Method, NilField]
Placeholder Inference - Lists
[Diagram: AST of a method body with a list of statements; node labels: Method, Cons, VarDecl, Int, "x", Assign, VarRef, Add, 21, Nil; source fragments: { int ; = 21 return […]]
Completion Proposals for Inferred Placeholders
[Diagram: program states: incomplete programs, complete programs, correct programs; transitions: expand placeholder, infer placeholder, expand/overwrite placeholders]
Incorrect programs
Error Recovery
Insertion Rules
context-free syntax //regular syntax rules
Statement.VarDecl = <<Type> <ID>;>
Statement.Assign = <<VarRef> = <Exp>;>
// derived insertion rules for placeholders
context-free syntax
Type.Type-Plhdr = {symbol-insertion}
ID.ID-Plhdr = {symbol-insertion}
Statement.Statement-Plhdr = {symbol-insertion}
VarRef.VarRef-Plhdr = {symbol-insertion}
Exp.Exp-Plhdr = {symbol-insertion}
// derived insertion rules for literals
lexical syntax
"=" = {symbol-completion}
";" = {symbol-completion}
Empty productions
Apply Insertion Rules at Cursor
[Diagram: parse forest with an amb node under Stmt*; legend: proposal nodes, insertion nodes; node labels: VarDecl, ClassType, "x", ID-Plhdr, Assign, VarRef, Exp-Plhdr]
Limit Search Space
[Diagram: two candidate recoveries for an assignment to "x": one inserts Add with two Exp-Plhdr children, the other inserts a single Exp-Plhdr]
Use the simplest possible expansions
Greedy Recovery
[Diagram: parse forest with an amb node under Stmt*; node labels: VarDecl, ClassType, "x", ID-Plhdr, Assign, VarRef, Exp-Plhdr]
Include postfix in recovery proposal
Nested Proposal Nodes
[Diagram: assignment to "x" whose right-hand side is Add with children IntValue 1 and Exp-Plhdr]
Syntactic Completion
[Diagram: program states: incorrect, incomplete, complete, and correct programs; transitions: expand placeholder, infer placeholder, expand/overwrite placeholders, recover incomplete program, recover complete program]
Syntactic Services: Summary
Syntactic Services from Syntax Definition
Syntax definition = CFG++
- Concrete syntax: notation
- Constructors: abstract syntax / structure
- Associativity & priority: disambiguation
- Templates: formatting
Parsing
- Mapping text to abstract syntax
- Syntax checking
- Permissive grammars: error recovery
Syntax coloring
- ESV: mapping token sorts to colors
Formatting
- Unparsing: mapping from abstract syntax to concrete syntax
- Pretty-printing: derived from templates
- Parenthesization: derived from disambiguation declarations
Completion
- Make incompleteness explicit, part of the structure
- Generating proposals: derived from structure
- Pretty-printing proposals: derived from templates
Next: Parsing
Compilers: Principles, Techniques, and Tools, 2nd Edition
Alfred V. Aho, Columbia University

Monica S. Lam, Stanford University

Ravi Sethi, Avaya Labs

Jeffrey D. Ullman, Stanford University

2007 | Pearson
Classical compiler textbook

Chapter 4: Syntax Analysis
Except where otherwise noted, this work is licensed under

More Related Content

What's hot (20)

PDF
Programming languages
Eelco Visser
 
PPT
Lex (lexical analyzer)
Sami Said
 
PDF
Compiler Construction | Lecture 9 | Constraint Resolution
Eelco Visser
 
PDF
Declarative Type System Specification with Statix
Eelco Visser
 
PPTX
Introduction of flex
vip_du
 
PPTX
Introduction of bison
vip_du
 
PDF
CS4200 2019 | Lecture 2 | syntax-definition
Eelco Visser
 
PDF
Writing Parsers and Compilers with PLY
David Beazley (Dabeaz LLC)
 
PPT
Lexyacc
Hina Tahir
 
PDF
Compiler Construction | Lecture 10 | Data-Flow Analysis
Eelco Visser
 
PDF
Compiler Construction | Lecture 4 | Parsing
Eelco Visser
 
PDF
Declare Your Language: Type Checking
Eelco Visser
 
PDF
Declare Your Language: Name Resolution
Eelco Visser
 
PPT
Yacc lex
915086731
 
PDF
Declare Your Language: Transformation by Strategic Term Rewriting
Eelco Visser
 
PPTX
More on Lex
Tech_MX
 
PPT
Lex and Yacc ppt
pssraikar
 
PPTX
BUILDING BASIC STRECH SQL COMPILER
Ajeet Dubey
 
Programming languages
Eelco Visser
 
Lex (lexical analyzer)
Sami Said
 
Compiler Construction | Lecture 9 | Constraint Resolution
Eelco Visser
 
Declarative Type System Specification with Statix
Eelco Visser
 
Introduction of flex
vip_du
 
Introduction of bison
vip_du
 
CS4200 2019 | Lecture 2 | syntax-definition
Eelco Visser
 
Writing Parsers and Compilers with PLY
David Beazley (Dabeaz LLC)
 
Lexyacc
Hina Tahir
 
Compiler Construction | Lecture 10 | Data-Flow Analysis
Eelco Visser
 
Compiler Construction | Lecture 4 | Parsing
Eelco Visser
 
Declare Your Language: Type Checking
Eelco Visser
 
Declare Your Language: Name Resolution
Eelco Visser
 
Yacc lex
915086731
 
Declare Your Language: Transformation by Strategic Term Rewriting
Eelco Visser
 
More on Lex
Tech_MX
 
Lex and Yacc ppt
pssraikar
 
BUILDING BASIC STRECH SQL COMPILER
Ajeet Dubey
 

Similar to Compiler Construction | Lecture 3 | Syntactic Editor Services (20)

PDF
Declare Your Language: Syntactic (Editor) Services
Eelco Visser
 
PDF
syntaxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.pdf
MadhuCK2
 
PPTX
CH 2.pptx
Obsa2
 
PDF
CS-4337_03_Chapter3- syntax and semantics.pdf
FutureKids1
 
PPT
sabesta3.ppt
NaveedAfzal34
 
PDF
05SyntaxAnalysis in compiler design notespdf
Padamata Rameshbabu
 
PDF
Syntax Definition
Guido Wachsmuth
 
PPT
Ch2 (1).ppt
daniloalbay1
 
PDF
New compiler design 101 April 13 2024.pdf
eliasabdi2024
 
PPT
Lexical analysis, syntax analysis, semantic analysis. Ppt
ovidlivi91
 
PDF
Context free langauges
sudhir sharma
 
PPT
3 describing syntax
Munawar Ahmed
 
PPT
UNIT 1 part II.ppt
Ranjeet Reddy
 
PPTX
COMPILER DESIGN LECTURES -UNIT-2 ST.pptx
Ranjeet Reddy
 
DOCX
8-Practice problems on operator precedence parser-24-05-2023.docx
venkatapranaykumarGa
 
PDF
Syntax analysis
Akshaya Arunan
 
PDF
Context Free Grammar
niveditJain
 
PDF
Ch2_Compilers A Simple One-Pass Compiler.pdf
ssuser964532
 
DOCX
Parser
Mallikarjun Rao
 
Declare Your Language: Syntactic (Editor) Services
Eelco Visser
 
syntaxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.pdf
MadhuCK2
 
CH 2.pptx
Obsa2
 
CS-4337_03_Chapter3- syntax and semantics.pdf
FutureKids1
 
sabesta3.ppt
NaveedAfzal34
 
05SyntaxAnalysis in compiler design notespdf
Padamata Rameshbabu
 
Syntax Definition
Guido Wachsmuth
 
Ch2 (1).ppt
daniloalbay1
 
New compiler design 101 April 13 2024.pdf
eliasabdi2024
 
Lexical analysis, syntax analysis, semantic analysis. Ppt
ovidlivi91
 
Context free langauges
sudhir sharma
 
3 describing syntax
Munawar Ahmed
 
UNIT 1 part II.ppt
Ranjeet Reddy
 
COMPILER DESIGN LECTURES -UNIT-2 ST.pptx
Ranjeet Reddy
 
8-Practice problems on operator precedence parser-24-05-2023.docx
venkatapranaykumarGa
 
Syntax analysis
Akshaya Arunan
 
Context Free Grammar
niveditJain
 
Ch2_Compilers A Simple One-Pass Compiler.pdf
ssuser964532
 
Ad

More from Eelco Visser (15)

PDF
CS4200 2019 | Lecture 5 | Transformation by Term Rewriting
Eelco Visser
 
PDF
CS4200 2019 Lecture 1: Introduction
Eelco Visser
 
PDF
A Direct Semantics of Declarative Disambiguation Rules
Eelco Visser
 
PDF
Compiler Construction | Lecture 17 | Beyond Compiler Construction
Eelco Visser
 
PDF
Domain Specific Languages for Parallel Graph AnalytiX (PGX)
Eelco Visser
 
PDF
Compiler Construction | Lecture 15 | Memory Management
Eelco Visser
 
PDF
Compiler Construction | Lecture 14 | Interpreters
Eelco Visser
 
PDF
Compiler Construction | Lecture 11 | Monotone Frameworks
Eelco Visser
 
PDF
Compiler Construction | Lecture 7 | Type Checking
Eelco Visser
 
PDF
Compiler Construction | Lecture 2 | Declarative Syntax Definition
Eelco Visser
 
PDF
Compiler Construction | Lecture 1 | What is a compiler?
Eelco Visser
 
PDF
Declare Your Language: Virtual Machines & Code Generation
Eelco Visser
 
PDF
Declare Your Language: Dynamic Semantics
Eelco Visser
 
PDF
Declare Your Language: Constraint Resolution 2
Eelco Visser
 
PDF
Declare Your Language: Constraint Resolution 1
Eelco Visser
 
CS4200 2019 | Lecture 5 | Transformation by Term Rewriting
Eelco Visser
 
CS4200 2019 Lecture 1: Introduction
Eelco Visser
 
A Direct Semantics of Declarative Disambiguation Rules
Eelco Visser
 
Compiler Construction | Lecture 17 | Beyond Compiler Construction
Eelco Visser
 
Domain Specific Languages for Parallel Graph AnalytiX (PGX)
Eelco Visser
 
Compiler Construction | Lecture 15 | Memory Management
Eelco Visser
 
Compiler Construction | Lecture 14 | Interpreters
Eelco Visser
 
Compiler Construction | Lecture 11 | Monotone Frameworks
Eelco Visser
 
Compiler Construction | Lecture 7 | Type Checking
Eelco Visser
 
Compiler Construction | Lecture 2 | Declarative Syntax Definition
Eelco Visser
 
Compiler Construction | Lecture 1 | What is a compiler?
Eelco Visser
 
Declare Your Language: Virtual Machines & Code Generation
Eelco Visser
 
Declare Your Language: Dynamic Semantics
Eelco Visser
 
Declare Your Language: Constraint Resolution 2
Eelco Visser
 
Declare Your Language: Constraint Resolution 1
Eelco Visser
 
Ad

Recently uploaded (20)

PPTX
OpenChain @ OSS NA - In From the Cold: Open Source as Part of Mainstream Soft...
Shane Coughlan
 
PPTX
Finding Your License Details in IBM SPSS Statistics Version 31.pptx
Version 1 Analytics
 
PPTX
ChiSquare Procedure in IBM SPSS Statistics Version 31.pptx
Version 1 Analytics
 
PDF
Automate Cybersecurity Tasks with Python
VICTOR MAESTRE RAMIREZ
 
PPTX
Agentic Automation Journey Session 1/5: Context Grounding and Autopilot for E...
klpathrudu
 
PPTX
Foundations of Marketo Engage - Powering Campaigns with Marketo Personalization
bbedford2
 
PDF
[Solution] Why Choose the VeryPDF DRM Protector Custom-Built Solution for You...
Lingwen1998
 
PDF
4K Video Downloader Plus Pro Crack for MacOS New Download 2025
bashirkhan333g
 
PDF
Download Canva Pro 2025 PC Crack Full Latest Version
bashirkhan333g
 
PDF
유니티에서 Burst Compiler+ThreadedJobs+SIMD 적용사례
Seongdae Kim
 
PDF
Wondershare PDFelement Pro Crack for MacOS New Version Latest 2025
bashirkhan333g
 
PDF
IDM Crack with Internet Download Manager 6.42 Build 43 with Patch Latest 2025
bashirkhan333g
 
PDF
Digger Solo: Semantic search and maps for your local files
seanpedersen96
 
PDF

Compiler Construction | Lecture 3 | Syntactic Editor Services

  • 1. Lecture 3: Syntactic Editor Services CS4200 Compiler Construction Eelco Visser TU Delft September 2018
  • 2. This Lecture
      Lexical syntax
      - defining the syntax of tokens / terminals including layout
      - making lexical syntax and layout explicit
      Syntactic editor services
      - more interpretations of syntax definitions
      Formatting specification
      - how to map (abstract syntax) trees to text
      Syntactic completion
      - proposing valid syntactic completions in an editor
  • 4. The inverse of parsing is unparsing or pretty-printing or formatting, i.e. mapping a tree representation of a program to a textual representation. A plain context-free grammar can be used as the specification of an unparser. However, it is then unclear where the whitespace should go. This paper extends context-free grammars with templates that provide hints for the layout of program text when formatting. https://doi.org/10.1145/2427048.2427056 Based on the master's thesis project of Tobi Vollebregt
  • 5. Syntax definitions can be used not just for parsing, but for many other operations. This paper shows how syntactic completion can be provided generically given a syntax definition. https://doi.org/10.1145/2427048.2427056 Part of the PhD thesis work of Eduardo Amorim
  • 6. The SDF3 syntax definition formalism is documented at the metaborg.org website. http://www.metaborg.org/en/latest/source/langdev/meta/lang/sdf3/index.html
  • 8. Tiger Lexical Syntax: Identifiers
      module Identifiers
      lexical syntax
        Id = [a-zA-Z] [a-zA-Z0-9_]*
      lexical restrictions
        Id -/- [a-zA-Z0-9_]
  • 9. Tiger Lexical Syntax: Number Literals
      module Numbers
      lexical syntax
        IntConst = [0-9]+
      lexical syntax
        RealConst.RealConstNoExp = IntConst "." IntConst
        RealConst.RealConst = IntConst "." IntConst "e" Sign IntConst
        Sign = "+"
        Sign = "-"
      context-free syntax
        Exp.Int = IntConst
  • 10. Tiger Lexical Syntax: String Literals
      module Strings
      sorts StrConst
      lexical syntax
        StrConst = "\"" StrChar* "\""
        StrChar = ~[\\\"\n]
        StrChar = [\\] [n]
        StrChar = [\\] [t]
        StrChar = [\\] [\^] [A-Z]
        StrChar = [\\] [0-9] [0-9] [0-9]
        StrChar = [\\] [\"]
        StrChar = [\\] [\\]
        StrChar = [\\] [\ \t\n]+ [\\]
      context-free syntax // records
        Exp.String = StrConst
  • 11. Tiger Lexical Syntax: Whitespace
      module Whitespace
      lexical syntax
        LAYOUT = [\ \t\n\r]
      context-free restrictions
        // Ensure greedy matching for comments
        LAYOUT? -/- [\ \t\n\r]
        LAYOUT? -/- [\/].[\/]
        LAYOUT? -/- [\/].[\*]
      Implicit composition of layout:
      syntax
        LAYOUT-CF = LAYOUT-LEX
        LAYOUT-CF = LAYOUT-CF LAYOUT-CF {left}
  • 12. Tiger Lexical Syntax: Comment
      lexical syntax
        CommentChar = [\*]
        LAYOUT = "/*" InsideComment* "*/"
        InsideComment = ~[\*]
        InsideComment = CommentChar
      lexical restrictions
        CommentChar -/- [\/]
      context-free restrictions
        LAYOUT? -/- [\/].[\*]
      lexical syntax
        LAYOUT = SingleLineComment
        SingleLineComment = "//" ~[\n\r]* NewLineEOF
        NewLineEOF = [\n\r]
        NewLineEOF = EOF
        EOF =
      lexical restrictions
        EOF -/- ~[]
      context-free restrictions
        LAYOUT? -/- [\/].[\/]
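The effect of a follow restriction such as `Id -/- [a-zA-Z0-9_]` can be illustrated with a small Python sketch. This is not how an SDF3 parser is implemented; the function names `id_candidates` and `apply_follow_restriction` are hypothetical and only model the idea that a match is rejected when the next character belongs to the restricted class, which leaves the longest match.

```python
import string

ID_START = set(string.ascii_letters)                        # [a-zA-Z]
ID_REST = set(string.ascii_letters + string.digits + "_")   # [a-zA-Z0-9_]

def id_candidates(text, start):
    """Return every end offset e such that text[start:e] matches [a-zA-Z][a-zA-Z0-9_]*."""
    if start >= len(text) or text[start] not in ID_START:
        return []
    ends = [start + 1]
    pos = start + 1
    while pos < len(text) and text[pos] in ID_REST:
        pos += 1
        ends.append(pos)
    return ends

def apply_follow_restriction(text, ends):
    """Keep only matches that are not directly followed by a character in [a-zA-Z0-9_]."""
    return [e for e in ends if e == len(text) or text[e] not in ID_REST]

text = "fib10 + x"
ends = id_candidates(text, 0)
print([text[:e] for e in ends])                                  # all prefixes that are valid Ids
print([text[:e] for e in apply_follow_restriction(text, ends)])  # only the longest one survives
```

Without the restriction, the grammar alone would accept `f`, `fi`, `fib`, and so on as identifiers, which makes the lexical syntax ambiguous.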
  • 14. Explication of Lexical Syntax
      Core language
      - context-free grammar productions
      - with constructors
      - only character classes as terminals
      - explicit definition of layout
      Desugaring
      - express lexical syntax in terms of character classes
      - explicate layout between context-free syntax symbols
      - separate lexical and context-free syntax non-terminals
  • 15. Explication of Layout by Transformation
      context-free syntax
        Exp.Int = IntConst
        Exp.Uminus = "-" Exp
        Exp.Times = Exp "*" Exp {left}
        Exp.Divide = Exp "/" Exp {left}
        Exp.Plus = Exp "+" Exp {left}
      desugars to:
      syntax
        Exp-CF.Int = IntConst-CF
        Exp-CF.Uminus = "-" LAYOUT?-CF Exp-CF
        Exp-CF.Times = Exp-CF LAYOUT?-CF "*" LAYOUT?-CF Exp-CF {left}
        Exp-CF.Divide = Exp-CF LAYOUT?-CF "/" LAYOUT?-CF Exp-CF {left}
        Exp-CF.Plus = Exp-CF LAYOUT?-CF "+" LAYOUT?-CF Exp-CF {left}
      Symbols in context-free syntax are implicitly separated by optional layout
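The layout explication step can be sketched in a few lines of Python. This is only an illustration of the transformation shown on the slide, not the SDF3 implementation; the helper names `explicate_layout` and `explicate_production` and the list-of-strings encoding of a production are assumptions made for the example.

```python
def explicate_layout(symbols):
    """Suffix sorts with -CF and separate adjacent symbols by LAYOUT?-CF."""
    out = []
    for i, sym in enumerate(symbols):
        if i > 0:
            out.append("LAYOUT?-CF")
        # Quoted literals stay as they are; sorts get the -CF suffix.
        out.append(sym if sym.startswith('"') else sym + "-CF")
    return out

def explicate_production(sort, constructor, symbols, attrs=""):
    """Render the desugared production as text."""
    rhs = " ".join(explicate_layout(symbols))
    return f"{sort}-CF.{constructor} = {rhs} {attrs}".rstrip()

print(explicate_production("Exp", "Times", ["Exp", '"*"', "Exp"], "{left}"))
print(explicate_production("Exp", "Uminus", ['"-"', "Exp"]))
print(explicate_production("Exp", "Int", ["IntConst"]))
```

The three printed lines correspond to the `Exp-CF.Times`, `Exp-CF.Uminus`, and `Exp-CF.Int` productions in the desugared grammar.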
  • 16. Separation of Lexical and Context-free Syntax
      lexical syntax
        Id = [a-zA-Z] [a-zA-Z0-9_]*
        Id = "if" {reject}
        Id = "then" {reject}
      context-free syntax
        Exp.Var = Id
      desugars to:
      syntax
        Id-LEX = [\65-\90\97-\122] [\48-\57\65-\90\95\97-\122]*-LEX
        Id-LEX = "if" {reject}
        Id-LEX = "then" {reject}
        Id-CF = Id-LEX
        Exp-CF.Var = Id-CF
  • 17. Why Separation of Lexical and Context-Free Syntax?
      lexical syntax
        Id = [a-zA-Z] [a-zA-Z0-9_]*
        Id = "if" {reject}
        Id = "then" {reject}
      context-free syntax
        Exp.Var = Id
      without separation:
      syntax
        Id = [\65-\90\97-\122] [\48-\57\65-\90\95\97-\122]*
        Id = "if" {reject}
        Id = "then" {reject}
        Exp.Var = Id
      Homework: what would go wrong if we do not do this?
  • 18. lexical syntax
        Id = [a-zA-Z] [a-zA-Z0-9_]*
        Id = "if" {reject}
        Id = "then" {reject}
      lexical restrictions
        Id -/- [a-zA-Z0-9_]
        "if" "then" -/- [a-zA-Z0-9_]
      context-free syntax
        Exp.Var = Id
        Exp.Call = Exp Exp {left}
        Exp.IfThen = "if" Exp "then" Exp
      desugars to:
      syntax
        "if" = [\105] [\102]
        "then" = [\116] [\104] [\101] [\110]
        [\48-\57\65-\90\95\97-\122]+-LEX = [\48-\57\65-\90\95\97-\122]
        [\48-\57\65-\90\95\97-\122]+-LEX = [\48-\57\65-\90\95\97-\122]+-LEX [\48-\57\65-\90\95\97-\122]
        [\48-\57\65-\90\95\97-\122]*-LEX =
        [\48-\57\65-\90\95\97-\122]*-LEX = [\48-\57\65-\90\95\97-\122]+-LEX
        Id-LEX = [\65-\90\97-\122] [\48-\57\65-\90\95\97-\122]*-LEX
        Id-LEX = "if" {reject}
        Id-LEX = "then" {reject}
        Id-CF = Id-LEX
        Exp-CF.Var = Id-CF
        Exp-CF.Call = Exp-CF LAYOUT?-CF Exp-CF {left}
        Exp-CF.IfThen = "if" LAYOUT?-CF Exp-CF LAYOUT?-CF "then" LAYOUT?-CF Exp-CF
        LAYOUT-CF = LAYOUT-CF LAYOUT-CF {left}
        LAYOUT?-CF = LAYOUT-CF
        LAYOUT?-CF =
      restrictions
        Id-LEX -/- [\48-\57\65-\90\95\97-\122]
        "if" -/- [\48-\57\65-\90\95\97-\122]
        "then" -/- [\48-\57\65-\90\95\97-\122]
      priorities
        Exp-CF.Call left Exp-CF.Call,
        LAYOUT-CF = LAYOUT-CF LAYOUT-CF left LAYOUT-CF = LAYOUT-CF LAYOUT-CF
      Annotations:
      - separate lexical and context-free syntax
      - separate context-free symbols by optional layout
      - character classes as only terminals
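The step "character classes as only terminals" rewrites a class such as `[a-zA-Z0-9_]` into numeric code-point ranges. A minimal Python sketch of that normalization follows; it handles only plain characters and `x-y` ranges (no escapes or complements), and the names `char_class_to_ranges` and `show` are hypothetical helpers for this example, not part of SDF3.

```python
def char_class_to_ranges(body):
    """Turn the body of a simple class such as 'a-zA-Z0-9_' into sorted, merged code-point ranges."""
    ranges = []
    i = 0
    while i < len(body):
        lo = ord(body[i])
        if i + 2 < len(body) and body[i + 1] == "-":
            hi = ord(body[i + 2])   # a range such as a-z
            i += 3
        else:
            hi = lo                 # a single character such as _
            i += 1
        ranges.append((lo, hi))
    ranges.sort()
    merged = []
    for lo, hi in ranges:
        if merged and lo <= merged[-1][1] + 1:
            # Overlapping or adjacent ranges are joined.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged

def show(ranges):
    """Render ranges in the numeric notation used in the desugared grammar."""
    parts = [f"\\{lo}" if lo == hi else f"\\{lo}-\\{hi}" for lo, hi in ranges]
    return "[" + "".join(parts) + "]"

print(show(char_class_to_ranges("a-zA-Z")))       # class for the first character of Id
print(show(char_class_to_ranges("a-zA-Z0-9_")))   # class for the remaining characters
```

The same encoding explains the literals: `"if"` becomes the sequence of code points for `i` (105) and `f` (102).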
  • 21. Editor Services
      (diagram labels: Source Code Editor, Parse, Abstract Syntax Tree, Feedback & Operations)
      Feedback
      - syntax coloring
      - syntax checking
      - outline view
      Operations
      - syntactic completion
      - formatting
      - abstract syntax tree
  • 22. Language Project Configuration (ESV)
      module Main
      imports
        Syntax
        Analysis
      language
        extensions : tig
        //provider : target/metaborg/stratego.ctree
        provider : target/metaborg/stratego.jar
        provider : target/metaborg/stratego-javastrat.jar
      menus
        menu: "Transform" (openeditor) (realtime)
          action: "Desugar" = editor-desugar (source)
          action: "Desugar AST" = editor-desugar-ast (source)
  • 23. Configuration of Syntactic Services (ESV)
      module Syntax
      imports
        libspoofax/color/default
        completion/colorer/Tiger-cc-esv
      language
        table : target/metaborg/sdf-new.tbl
        //table : target/metaborg/sdf.tbl
        start symbols : Module
        line comment : "//"
        block comment : "/*" * "*/"
        fences : [ ] ( ) { }
      menus
        menu: "Syntax" (openeditor)
          action: "Format" = editor-format (source)
          action: "Show parsed AST" = debug-show-aterm (source)
      views
        outline view: editor-outline (source)
          expand to level: 3
  • 25. Generated Syntax Coloring
      // compute the n-th fibonacci number
      let function fib(n: int): int =
            if n <= 1 then 1
            else fib(n - 1) + fib(n - 2)
       in fib(10)
      end

      module libspoofax/color/default
      imports libspoofax/color/colors
      colorer // Default, token-based highlighting
        keyword : 127 0 85 bold
        identifier : default
        string : blue
        number : darkgreen
        var : 139 69 19 italic
        operator : 0 0 128
        layout : 63 127 95 italic
  • 26. Customized Syntax Coloring
      module Tiger-Colorer
      colorer
        red = 255 0 0
        green = 0 255 0
        blue = 0 0 255
        TUDlavender = 123 160 201
      colorer token-based highlighting
        keyword : red
        Id : TUDlavender
        StrConst : darkgreen
        TypeId : blue
        layout : green

      // compute the n-th fibonacci number
      let function fib(n: int): int =
            if n <= 1 then 1
            else fib(n - 1) + fib(n - 2)
       in fib(10)
      end
  • 28. The inverse of parsing is unparsing or pretty-printing or formatting, i.e. mapping a tree representation of a program to a textual representation. A plain context-free grammar can be used as the specification of an unparser. However, it is then unclear where the whitespace should go. This paper extends context-free grammars with templates that provide hints for the layout of program text when formatting. https://doi.org/10.1145/2427048.2427056 Based on the master's thesis project of Tobi Vollebregt
  • 29. Converting between Text and Tree Representations (diagram: parse maps text to a tree, format maps a tree to text)
  • 30. Unparsing: From Abstract Syntax Term to Text
      context-free syntax
        Exp.Int = IntConst
        Exp.Times = Exp "*" Exp
        Exp.Plus = Exp "+" Exp
      rules
        unparse : Int(x) -> x
        unparse : Plus(e1, e2) -> $[[<unparse> e1] + [<unparse> e2]]
        unparse : Times(e1, e2) -> $[[<unparse> e1] * [<unparse> e2]]
      Term:
        Mod(Plus(Int("1"), Times(Int("2"), Plus(Int("3"), Int("4")))))
      Unparsed text:
        1 + 2 * 3 + 4
      take priorities into account!
  • 31. Pretty-Printing
      From ASTs to text
      - insert keywords
      - insert layout: spaces, line breaks, indentation
      - insert parentheses to preserve tree structure
      Unparser
      - derive transformation rules from context-free grammar
      - keywords, literals defined in grammar productions
      - parentheses determined by priority, associativity rules
      - separate all symbols by a space => not pretty, or even readable
      Pretty-printer
      - introduce spaces, line breaks, and indentation to produce readable text
      - doing that manually is tedious
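The naive unparse rules print `Plus(Int("1"), Times(Int("2"), Plus(Int("3"), Int("4"))))` as `1 + 2 * 3 + 4`, which parses back to a different tree. A minimal Python sketch of an unparser that inserts parentheses based on priority and left associativity is given below. The tuple encoding of terms, the `OPS` table, and the numeric priorities are assumptions made for this illustration; Spoofax derives parenthesization from the disambiguation declarations in the syntax definition rather than from code like this.

```python
# Operator symbol and priority per constructor (higher binds tighter).
# Both operators are treated as left-associative.
OPS = {"Times": ("*", 2), "Plus": ("+", 1)}

def unparse(term, min_prio=0):
    """term is ("Int", "1") or (constructor, left, right)."""
    if term[0] == "Int":
        return term[1]
    cons, left, right = term
    symbol, prio = OPS[cons]
    # The left operand may have the same priority (left associativity);
    # the right operand must bind strictly tighter, otherwise it gets parentheses.
    text = f"{unparse(left, prio)} {symbol} {unparse(right, prio + 1)}"
    return f"({text})" if prio < min_prio else text

ast = ("Plus", ("Int", "1"),
       ("Times", ("Int", "2"), ("Plus", ("Int", "3"), ("Int", "4"))))
print(unparse(ast))
```

With the parentheses around `3 + 4`, parsing the output again yields the original tree.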
  • 32. Specifying Formatting Layout with Templates
      context-free syntax
        Exp.Seq = < ( <{Exp ";\n"}*> ) >
        Exp.If = < if <Exp> then <Exp> else <Exp> >
        Exp.IfThen = < if <Exp> then <Exp> >
        Exp.While = < while <Exp> do <Exp> >
      Inverse quotation
      - template quotes literal text with <>
      - anti-quotations insert non-terminals with <>
      Layout directives
      - whitespace (linebreaks, indentation, spaces) in template guides formatting
      - is interpreted as LAYOUT? for parsing
      Formatter generation
      - generate rules for mapping AST to text (via box expressions)
      Applications
      - code generation; pretty-printing generated AST
      - syntactic completions
      - formatting
  • 33. Templates for Tiger: Binary Expressions
      context-free syntax
        Exp.Int = IntConst
        Exp.Uminus = [- [Exp]]
        Exp.Times = [[Exp] * [Exp]] {left}
        Exp.Divide = [[Exp] / [Exp]] {left}
        Exp.Plus = [[Exp] + [Exp]] {left}
        Exp.Minus = [[Exp] - [Exp]] {left}
        Exp.Eq = [[Exp] = [Exp]] {non-assoc}
        Exp.Neq = [[Exp] <> [Exp]] {non-assoc}
        Exp.Gt = [[Exp] > [Exp]] {non-assoc}
        Exp.Lt = [[Exp] < [Exp]] {non-assoc}
        Exp.Geq = [[Exp] >= [Exp]] {non-assoc}
        Exp.Leq = [[Exp] <= [Exp]] {non-assoc}
        Exp.And = [[Exp] & [Exp]] {left}
        Exp.Or = [[Exp] | [Exp]] {left}
      Use [] quotes instead of <> to avoid clash with comparison operators
  • 34. Templates for Tiger: Functions
      context-free syntax
        Dec.FunDecs = <<{FunDec "\n"}+>> {longest-match}
        FunDec.ProcDec = < function <Id>(<{FArg ", "}*>) = <Exp> >
        FunDec.FunDec = < function <Id>(<{FArg ", "}*>) : <Type> = <Exp> >
        FArg.FArg = <<Id> : <Type>>
        Exp.Call = <<Id>(<{Exp ", "}*>)>
      Annotations:
      - No space after function name in call
      - Space after comma!
      - Function declarations separated by newline
      - Indent body of function
  • 35. Templates for Tiger: Bindings and Records
      context-free syntax
        Exp.Let = < let <{Dec "\n"}*> in <{Exp ";\n"}*> end >
      context-free syntax // records
        Type.RecordTy = < { <{Field ", \n"}*> } >
        Field.Field = <<Id> : <TypeId>>
        Exp.NilExp = <nil>
        Exp.Record = <<TypeId>{ <{InitField ", "}*> }>
        InitField.InitField = <<Id> = <Exp>>
        LValue.FieldVar = <<LValue>.<Id>>
      Note spacing / layout in separators
  • 36. Generating Pretty-Print Rules from Template Productions
      context-free syntax
        FunDec.FunDec = < function <Id>(<{FArg ", "}*>) : <Type> = <Exp> >
      rules
        prettyprint-Tiger-FunDec :
          ProcDec(t1__, t2__, t3__) -> [ H( [SOpt(HS(), "0")]
                                          , [ S("function ")
                                            , t1__
                                            , S("(")
                                            , t2__'
                                            , S(") =")
                                            ] )
                                       , t3__' ]
          with t1__' := <pp-one-Z(prettyprint-Tiger-Id) <+ pp-one-Z(prettyprint-completion-aux)> t1__
          with t2__' := <pp-H-list(prettyprint-Tiger-FArg|", ") <+ pp-one-Z(prettyprint-completion-aux)> t2__
          with t3__' := <pp-indent(|"2")> [ <pp-one-Z(prettyprint-Tiger-Exp) <+ pp-one-Z(prettyprint-completion-aux)> t3__ ]
      Separation of concerns:
      - generated formatter transforms AST to Box
      - Box formatter produces text
  • 37. Boxes for Formatting: literal text ("foo"), keywords (KW["foo"]), parameters (_1)
  • 38. Horizontal Layout: H hs=x [ B B B ] places the boxes B B B next to each other; hs: horizontal space between boxes
  • 39. Vertical Layout: V vs=y is=i [ B B B ] places the boxes B B B below each other; vs: vertical space between boxes; is: indentation space
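The division of labour between the generated formatter (AST to Box) and the Box formatter (Box to text) can be made concrete with a toy Box renderer in Python. This is a deliberately simplified reading of the H and V operators: the tuple encoding is hypothetical, `vs` is treated as the number of blank lines between boxes, and `is` indents every box after the first. The real Box language has more operators and options than this sketch covers.

```python
def render(box):
    """Render a box to a list of text lines.

    ("S", text)                      literal string
    ("H", hs, [boxes])               horizontal composition, hs spaces between boxes
    ("V", vs, indent, [boxes])       vertical composition, vs blank lines between boxes,
                                     boxes after the first are indented by `indent` spaces
    """
    kind = box[0]
    if kind == "S":
        return [box[1]]
    if kind == "H":
        _, hs, children = box
        lines = [""]
        for i, child in enumerate(children):
            sub = render(child)
            sep = "" if i == 0 else " " * hs
            col = len(lines[-1]) + len(sep)
            lines[-1] += sep + sub[0]
            # Further lines of a multi-line child are aligned under its first line.
            lines.extend(" " * col + line for line in sub[1:])
        return lines
    if kind == "V":
        _, vs, indent, children = box
        lines = []
        for i, child in enumerate(children):
            sub = render(child)
            if i > 0:
                lines.extend([""] * vs)
                sub = [" " * indent + line for line in sub]
            lines.extend(sub)
        return lines
    raise ValueError(f"unknown box kind: {kind}")

header = ("H", 1, [("S", "function"), ("S", "fib(n: int)"), ("S", "=")])
body = ("S", "n + 1")
print("\n".join(render(("V", 0, 2, [header, body]))))
```

The example prints a function header on one line and the body on the next line, indented by two spaces, which mirrors the `pp-indent(|"2")` call in the generated pretty-print rule.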
  • 41. Tiger Syntax: Composition
      module Tiger
      imports Whitespace
      imports Comments
      imports Types
      imports Identifiers
      imports Bindings
      imports Variables
      imports Functions
      imports Numbers
      imports Strings
      imports Records
      imports Arrays
      imports Control-Flow
      context-free start-symbols Module
      context-free syntax
        Module.Mod = Exp
      context-free priorities
        Exp.Or > Exp.Array > Exp.Assign,
        {Exp.Uminus LValue.FieldVar LValue.Subscript} > {left : Exp.Times Exp.Divide}
  • 42. Tiger Syntax: Identifiers and Strings
      module Identifiers
      lexical syntax
        Id = [a-zA-Z] [a-zA-Z0-9_]*
      lexical restrictions
        Id -/- [a-zA-Z0-9_]
      lexical syntax
        Id = "nil" {reject}
        Id = "let" {reject}
        Id = … {reject}

      module Strings
      sorts StrConst
      lexical syntax
        StrConst = "\"" StrChar* "\""
        StrChar = ~[\\\"\n]
        StrChar = [\\] [n]
        StrChar = [\\] [t]
        StrChar = [\\] [\^] [A-Z]
        StrChar = [\\] [0-9] [0-9] [0-9]
        StrChar = [\\] [\"]
        StrChar = [\\] [\\]
        StrChar = [\\] [\ \t\n]+ [\\]
      context-free syntax // records
        Exp.String = StrConst
  • 43. Tiger Syntax: Whitespace & Comments
      module Whitespace
      lexical syntax
        LAYOUT = [\ \t\n\r]
      context-free restrictions
        LAYOUT? -/- [\ \t\n\r]

      module Comments
      lexical syntax // multiline comments
        CommentChar = [\*]
        LAYOUT = "/*" InsideComment* "*/"
        InsideComment = ~[\*]
        InsideComment = CommentChar
      lexical restrictions
        CommentChar -/- [\/]
      context-free restrictions
        LAYOUT? -/- [\/].[\/]
      lexical syntax // single line comments
        LAYOUT = "//" ~[\n\r]* NewLineEOF
        NewLineEOF = [\n\r]
        NewLineEOF = EOF
        EOF =
        // end of file since it cannot be followed by any character
        // avoids the need for a newline to close a single line comment
        // at the last line of a file
      lexical restrictions
        EOF -/- ~[]
      context-free restrictions
        LAYOUT? -/- [\/].[\*]
  • 44. Tiger Syntax: Numbers
      module Numbers
      lexical syntax
        IntConst = [0-9]+
      context-free syntax
        Exp.Int = IntConst
        Exp.Uminus = [- [Exp]]
        Exp.Times = [[Exp] * [Exp]] {left}
        Exp.Divide = [[Exp] / [Exp]] {left}
        Exp.Plus = [[Exp] + [Exp]] {left}
        Exp.Minus = [[Exp] - [Exp]] {left}
        Exp.Eq = [[Exp] = [Exp]] {non-assoc}
        Exp.Neq = [[Exp] <> [Exp]] {non-assoc}
        Exp.Gt = [[Exp] > [Exp]] {non-assoc}
        Exp.Lt = [[Exp] < [Exp]] {non-assoc}
        Exp.Geq = [[Exp] >= [Exp]] {non-assoc}
        Exp.Leq = [[Exp] <= [Exp]] {non-assoc}
        Exp.And = [[Exp] & [Exp]] {left}
        Exp.Or = [[Exp] | [Exp]] {left}
      context-free priorities
        {Exp.Uminus}
        > {left : Exp.Times Exp.Divide}
        > {left : Exp.Plus Exp.Minus}
        > {non-assoc : Exp.Eq Exp.Neq Exp.Gt Exp.Lt Exp.Geq Exp.Leq}
        > Exp.And
        > Exp.Or
  • 45. Tiger Syntax: Variables and Functions
      module Bindings
      imports Control-Flow
      imports Identifiers
      imports Types
      imports Functions
      imports Variables
      sorts Declarations
      context-free syntax
        Exp.Let = < let <{Dec "\n"}*> in <{Exp ";\n"}*> end >
        Declarations.Declarations = < declarations <{Dec "\n"}*> >

      module Variables
      imports Identifiers
      imports Types
      sorts Var
      context-free syntax
        Dec.VarDec = <var <Id> : <Type> := <Exp>>
        Dec.VarDecNoType = <var <Id> := <Exp>>
        Var.Var = Id
        LValue = Var
        Exp = LValue
        Exp.Assign = <<LValue> := <Exp>>

      module Functions
      imports Identifiers
      imports Types
      context-free syntax
        Dec.FunDecs = <<{FunDec "\n"}+>> {longest-match}
        FunDec.ProcDec = < function <Id>(<{FArg ", "}*>) = <Exp> >
        FunDec.FunDec = < function <Id>(<{FArg ", "}*>) : <Type> = <Exp> >
        FArg.FArg = <<Id> : <Type>>
        Exp.Call = <<Id>(<{Exp ", "}*>)>
  • 46. Tiger Syntax: Records, Arrays, Types
      module Records
      imports Base
      imports Identifiers
      imports Types
      context-free syntax // records
        Type.RecordTy = < { <{Field ", \n"}*> } >
        Field.Field = <<Id> : <TypeId>>
        Exp.NilExp = <nil>
        Exp.Record = <<TypeId>{ <{InitField ", "}*> }>
        InitField.InitField = <<Id> = <Exp>>
        LValue.FieldVar = <<LValue>.<Id>>

      module Arrays
      imports Types
      context-free syntax // arrays
        Type.ArrayTy = <array of <TypeId>>
        Exp.Array = <<TypeId>[<Exp>] of <Exp>>
        LValue.Subscript = <<LValue>[<Index>]>
        Index = Exp

      module Types
      imports Identifiers
      imports Bindings
      sorts Type
      context-free syntax // type declarations
        Dec.TypeDecs = <<{TypeDec "\n"}+>> {longest-match}
        TypeDec.TypeDec = <type <Id> = <Type>>
      context-free syntax // type expressions
        Type = TypeId
        TypeId.Tid = Id
      sorts Ty
      context-free syntax // semantic types
        Ty.INT = <INT>
        Ty.STRING = <STRING>
        Ty.NIL = <NIL>
        Ty.UNIT = <UNIT>
        Ty.NAME = <NAME <Id>>
        Ty.RECORD = <RECORD <Id>>
        Ty.ARRAY = <ARRAY <Ty> <Id>>
        Ty.FUN = <FUN ( <{Ty ","}*> ) <Ty>>
  • 48. Syntax definitions can be used not just for parsing, but for many other operations. This paper shows how syntactic completion can be provided generically given a syntax definition. https://doi.org/10.1145/2427048.2427056 Part of the PhD thesis work of Eduardo Amorim
  • 49. S. Amann, S. Proksch, S. Nadi, and M. Mezini. A study of Visual Studio usage in practice. In SANER, 2016.
  • 51. Problems
      Ad-hoc: re-implement for each language / IDE
      Incomplete: not all programs reachable
      Unsound: propose invalid constructs
  • 52. Sound and Complete Syntactic Code Completion from Syntax Definition
  • 61. Deriving Syntactic Completion from Syntax Definition
      context-free syntax // regular production
        Statement.If = "if" "(" Exp ")" Statement "else" Statement
  • 62. Placeholders as Language Constructs
      context-free syntax // regular production
        Statement.If = "if" "(" Exp ")" Statement "else" Statement
      context-free syntax // placeholder rule
        Statement.Statement-Plhdr = "$Statement"
  • 63. Calculate Placeholder Expansions
      context-free syntax // regular production
        Statement.If = "if" "(" Exp ")" Statement "else" Statement
      context-free syntax // placeholder rule
        Statement.Statement-Plhdr = "$Statement"
      rules
        rewrite-placeholder:
          Statement-Plhdr() -> If(Exp-Plhdr(), Statement-Plhdr(), Statement-Plhdr())
      (diagram: Stm-Plhdr expands to If with children Exp-Plhdr, Stm-Plhdr, Stm-Plhdr)
  • 64. Calculate Placeholder Expansions
      context-free syntax // regular production
        Statement.If = "if" "(" Exp ")" Statement "else" Statement
      context-free syntax // placeholder rule
        Statement.Statement-Plhdr = "$Statement"
      rules
        rewrite-placeholder:
          Statement-Plhdr() -> If(Exp-Plhdr(), Statement-Plhdr(), Statement-Plhdr())
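The `rewrite-placeholder` rule can be derived mechanically from the production: each argument sort of a constructor becomes a placeholder of that sort. The following Python sketch shows this derivation under stated assumptions: the `PRODUCTIONS` table is a hypothetical abstract-syntax signature containing only the `If` production from the slide, and `placeholder`, `expansions`, and `show` are illustrative names rather than Spoofax functions.

```python
# Hypothetical signature: sort -> list of (constructor, argument sorts).
PRODUCTIONS = {
    "Statement": [("If", ["Exp", "Statement", "Statement"])],
    "Exp": [],
}

def placeholder(sort):
    """A placeholder term for the given sort, e.g. Statement-Plhdr()."""
    return (f"{sort}-Plhdr",)

def expansions(sort):
    """One proposal per production of the sort, with a placeholder for each argument."""
    return [
        (cons, *[placeholder(arg) for arg in args])
        for cons, args in PRODUCTIONS.get(sort, [])
    ]

def show(term):
    """Print a term in the constructor(arguments) notation used on the slides."""
    head, *args = term
    return f"{head}({', '.join(show(a) for a in args)})"

for proposal in expansions("Statement"):
    print(f"{show(placeholder('Statement'))} -> {show(proposal)}")
```

Each proposal is itself a term that still contains placeholders, so completion can be applied repeatedly until no placeholders remain.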
  • 66. Complete programs: how to expand a complete program?
  • 67. How to expand a complete program? Insert a placeholder?
  • 69. Placeholder Inference
      Optional Parent: Add Optional Element (no source region)
      List of Statements: Add Element to List (add first or last element)
  • 70-72. Placeholder Inference: Optional (AST diagram repeated on three slides; labels: ClassDecl, class, "A", NoParent, {, Cons, Method, […], NilField, })
  • 74-77. Placeholder Inference - Lists (AST diagram repeated on four slides; labels: Method, Cons, VarDecl, Int, "x", Assign, VarRef "x", Add, Int 21, Cons, Nil)
  • 78. Completion Proposals for Inferred Placeholders
  • 83. Insertion Rules
      context-free syntax // regular syntax rules
        Statement.VarDecl = <<Type> <ID>;>
        Statement.Assign = <<VarRef> = <Exp>;>
      // derived insertion rules for placeholders
      context-free syntax
        Type.Type-Plhdr = {symbol-insertion}
        ID.ID-Plhdr = {symbol-insertion}
        Statement.Statement-Plhdr = {symbol-insertion}
        VarRef.VarRef-Plhdr = {symbol-insertion}
        Exp.Exp-Plhdr = {symbol-insertion}
      // derived insertion rules for literals
      lexical syntax
        "=" = {symbol-completion}
        ";" = {symbol-completion}
      Empty productions
  • 84. Apply Insertion Rules at Cursor (diagram with proposal nodes and insertion nodes; labels: Stmt*, amb, VarDecl, ClassType "x", ID-Plhdr, ;, Assign, VarRef "x", =, Exp-Plhdr, ;)
  • 85. Limit Search Space: use the simplest possible expansions (diagram labels: Assign, VarRef "x", =, Exp-Plhdr, ; and Assign, VarRef "x", =, Add, Exp-Plhdr + Exp-Plhdr, ;)
  • 87. Nested Proposal Nodes (diagram labels: Assign, VarRef "x", =, Add, IntValue 1, +, Exp-Plhdr, ;)
  • 95. Syntactic Services from Syntax Definition
      Syntax definition = CFG++
      - Concrete syntax: notation
      - Constructors: abstract syntax / structure
      - Associativity & priority: disambiguation
      - Templates: formatting
      Parsing
      - Mapping text to abstract syntax
      - Syntax checking
      - Permissive grammars: error recovery
  • 96. Syntactic Services from Syntax Definition
      Syntax coloring
      - ESV: mapping token sorts to colors
      Formatting
      - Unparsing: mapping from abstract syntax to concrete syntax
      - Pretty-printing: derived from templates
      - Parenthesization: derived from disambiguation declarations
      Completion
      - Make incompleteness explicit, part of the structure
      - Generating proposals: derived from structure
      - Pretty-printing proposals: derived from templates
  • 98. Compilers: Principles, Techniques, and Tools, 2nd Edition. Alfred V. Aho (Columbia University), Monica S. Lam (Stanford University), Ravi Sethi (Avaya Labs), Jeffrey D. Ullman (Stanford University). 2007, Pearson. Classical compiler textbook. Chapter 4: Syntax Analysis
  • 99. Except where otherwise noted, this work is licensed under