Domain Specific Languages in
         Python




    Siddharta Govindaraj
siddharta@silverstripesoftware.com
What are DSLs?

Specialized mini-languages for specific problem
domains that make it easier to work in that
domain
Example: SQL

SQL is a mini language specialized to retrieve data
from a relational database
Example: Regular Expressions

Regular Expressions are mini languages
specialized to express string patterns to match
Life Without Regular Expressions
def is_ip_address(ip_address_string):
    components = ip_address_string.split(".")
    if len(components) != 4: return False
    try:
        int_components = [int(component) for component in components]
    except ValueError:
        return False
    for component in int_components:
        if component < 0 or component > 255:
            return False
    return True
Life With Regular Expressions
import re

def is_ip(ip_address_string):
    match = re.match(r"^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$",
                     ip_address_string)
    if not match: return False
    for component in match.groups():
        if int(component) < 0 or int(component) > 255:
            return False
    return True
The DSL that simplifies our life


 ^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$
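A small usage sketch of the pattern above. It assumes Python 3.4+ so that `re.fullmatch` is available, which removes the need for the `^` and `$` anchors and also rejects a trailing newline (with `re.match`, `$` still matches just before a final newline). The name `IP_PATTERN` is only for illustration.

```python
import re

# The dotted-quad pattern from the slide, without the ^ and $ anchors,
# because fullmatch() already requires the whole string to match.
IP_PATTERN = re.compile(r"(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})")


def is_ip(ip_address_string):
    match = IP_PATTERN.fullmatch(ip_address_string)
    if not match:
        return False
    # The pattern checks only the shape; the range check is still code.
    return all(0 <= int(part) <= 255 for part in match.groups())


print(is_ip("192.168.0.1"))
print(is_ip("256.1.1.1"))
print(is_ip("1.2.3"))
```

The regular expression describes the shape of the string; the numeric range still has to be checked in ordinary Python.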
Why DSL - Answered

When working in a particular domain, write your
code in a syntax that fits the domain.

             When working with patterns, use RegEx
              When working with RDBMS, use SQL
       When working in your domain – create your own DSL
The two types of DSLs

External DSL – The code is written in an external
file or as a string, which is read and parsed by the
application
The two types of DSLs

Internal DSL – Use features of the host language
(like metaclasses) so that people can write Python
code that resembles the domain syntax
Creating Forms – No DSL
<form>
<label>Name:</label><input type="text" name="name"/>
<label>Email:</label><input type="text" name="email"/>
<label>Password:</label><input type="password" name="password"/>
</form>
Creating Forms – No DSL

– Requires HTML knowledge to maintain
– Therefore it is not possible for the end user to
change the structure of the form by themselves
Creating Forms – External DSL
UserForm
name->CharField label:Username
email->EmailField label:Email Address
password->PasswordField




This text file is parsed and rendered by the app
Creating Forms – External DSL

+ Easy to understand form structure
+ Can be easily edited by end users
– Requires you to read and parse the file
Creating Forms – Internal DSL
class UserForm(forms.Form):
    username = forms.RegexField(regex=r'^\w+$',
          max_length=30)
    email = forms.EmailField(max_length=75)
    password = forms.CharField(
          widget=forms.PasswordInput())




Django uses metaclass magic to convert this
syntax to an easily manipulated Python class
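To make the "metaclass magic" less mysterious, here is a minimal sketch of the general idea, assuming Python 3. It is not Django's code: `Field`, `FormMeta`, `Form` and `declared_fields` are hypothetical names chosen for this illustration, and inheriting fields from base classes is left out.

```python
class Field:
    """Hypothetical stand-in for a form field declaration."""

    def __init__(self, **options):
        self.options = options


class FormMeta(type):
    def __new__(mcls, name, bases, namespace):
        # Collect the Field instances declared in the class body.
        # The class namespace preserves declaration order.
        fields = {key: value for key, value in namespace.items()
                  if isinstance(value, Field)}
        # Remove them from the class attributes and store them in one place.
        for key in fields:
            del namespace[key]
        cls = super().__new__(mcls, name, bases, namespace)
        cls.declared_fields = fields
        return cls


class Form(metaclass=FormMeta):
    pass


class UserForm(Form):
    username = Field(max_length=30)
    email = Field(max_length=75)
    password = Field(widget="password")


print(list(UserForm.declared_fields))
```

The class body reads like a declaration of the form, while the metaclass turns it into a plain dictionary that the rest of the library can loop over.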
Creating Forms – Internal DSL

+ Easy to understand form structure
+ Easy to work with the form as it is regular Python
+ No need to read and parse the file
– Cannot be used by non-programmers
– Can sometimes be complicated to implement
– Behind the scenes magic → debugging hell
Creating an External DSL
UserForm
name:CharField -> label:Username size:25
email:EmailField -> size:32
password:PasswordField




Let's write code to parse and render this form
Options for Parsing

Using string functions → You have to be crazy
Using regular expressions →
Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems. - Jamie Zawinski


Writing a parser →         ✓   (we will use PyParsing)
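For contrast, here is a rough sketch of the first option: parsing the form DSL with nothing but string methods. The function name `parse_form` and the returned tuple layout are assumptions made for this illustration. It handles well-formed input, but note how it silently accepts garbage instead of reporting an error, which is one reason a real parser is the better choice.

```python
def parse_form(text):
    """Parse the form DSL with plain string methods (no validation)."""
    lines = [line.strip() for line in text.strip().splitlines()]
    form_name, field_lines = lines[0], lines[1:]
    fields = []
    for line in field_lines:
        # "name:CharField -> label:Username size:25"
        declaration, _, property_text = line.partition("->")
        field_name, _, field_type = declaration.strip().partition(":")
        properties = {}
        for item in property_text.split():
            key, _, value = item.partition(":")
            properties[key] = value
        fields.append((field_name, field_type, properties))
    return form_name, fields


sample = """UserForm
name:CharField -> label:Username size:25
email:EmailField -> size:32
password:PasswordField
"""
print(parse_form(sample))

# Malformed input is accepted without complaint:
print(parse_form("X\nbad line"))
```

Every new rule (unknown field types, missing colons, quoted values) would mean more hand-written checks scattered through this function.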
Step 1: Get PyParsing
   pip install pyparsing
Step 2: Design the Grammar
form ::= form_name newline field+
field ::= field_name colon field_type [arrow property+] newline
property ::= key colon value
form_name ::= word
field_name ::= word
field_type ::= CharField | EmailField | PasswordField
key ::= word
value ::= alphanumeric+
word ::= alpha+
newline ::= \n
colon ::= :
arrow ::= ->
Quick Note

Backus-Naur Form (BNF) is a syntax for
specifying grammars
Step 3: Implement the Grammar
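from pyparsing import (OneOrMore, Optional, ParserElement, Word,
                       alphanums, alphas, oneOf)

# Assumption: pyparsing skips newlines as whitespace by default, so
# keep them significant; otherwise the newline token never matches.
ParserElement.setDefaultWhitespaceChars(" \t")

newline = "\n"
colon = ":"
arrow = "->"
word = Word(alphas)
key = word
value = Word(alphanums)
field_type = oneOf("CharField EmailField PasswordField")
field_name = word
form_name = word
field_property = key + colon + value
field = (field_name + colon + field_type +
     Optional(arrow + OneOrMore(field_property)) + newline)
form = form_name + newline + OneOrMore(field)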
Quick Note

PyParsing itself implements a neat little internal
DSL for you to describe the parser grammar


Notice how the PyParsing code almost perfectly
reflects the BNF grammar
Output
> print(form.parseString(input_form))


['UserForm', '\n', 'name', ':', 'CharField', '->',
'label', ':', 'Username', 'size', ':', '25', '\n',
'email', ':', 'EmailField', '->', 'size', ':', '32', '\n',
'password', ':', 'PasswordField', '\n']




PyParsing has neatly parsed our form input into
tokens. That's nice, but we can do more.
Step 4: Suppressing Noise Tokens
newline = Suppress("\n")
colon = Suppress(":")
arrow = Suppress("->")
Output
> print(form.parseString(input_form))


['UserForm', 'name', 'CharField', 'label', 'Username',
'size', '25', 'email', 'EmailField', 'size', '32',
'password', 'PasswordField']




All the noise tokens are now removed from the
parsed output
Step 5: Grouping Tokens
field_property = Group(key + colon + value)
field = Group(field_name + colon + field_type +
Group(Optional(arrow + OneOrMore(field_property))) +
newline)
Output
> print(form.parseString(input_form))


['UserForm',
  ['name', 'CharField',
    [['label', 'Username'], ['size', '25']]],
  ['email', 'EmailField',
    [['size', '32']]],
  ['password', 'PasswordField',[]]]

Related tokens are now grouped together in a list
Step 6: Give Names to Tokens
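form_name = word.setResultsName("form_name")
field_name = word.setResultsName("field_name")
field_type = oneOf("CharField EmailField PasswordField").setResultsName("field_type")
field = Group(field_name + colon + field_type +
  Group(Optional(arrow + OneOrMore(field_property))) +
  newline)
form = form_name + newline + OneOrMore(field).setResultsName("fields")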
Output
> parsed_form = form.parseString(input_form)
> print(parsed_form.form_name)


UserForm


> print(parsed_form.fields[1].field_type)


EmailField




Now we can refer to parsed tokens by name
Step 7: Convert Properties to Dict
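key = word.setResultsName("property_key")
value = Word(alphanums).setResultsName("property_value")
field_property = Group(key + colon + value)


def convert_prop_to_dict(tokens):
    # Collapse the grouped key/value pairs into a single dict token.
    prop_dict = {}
    for token in tokens:
        prop_dict[token.property_key] = token.property_value
    return prop_dict


field = Group(field_name + colon + field_type +
          Optional(arrow + OneOrMore(field_property))
             .setParseAction(convert_prop_to_dict) +
          newline)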
Output
> print(form.parseString(input_form))


['UserForm',
    ['name', 'CharField',
      {'label': 'Username', 'size': '25'}],
    ['email', 'EmailField',
      {'size': '32'}],
    ['password', 'PasswordField', {}]
]


Sweet! The field properties are parsed into a dict
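The slides show the grammar in pieces, so here is one possible end-to-end sketch that puts Steps 3 to 7 together. Some parts are assumptions rather than the talk's exact code: the `input_form` string (the slides never show it), the result names `fields`, `property_key` and `property_value`, and the call to `setDefaultWhitespaceChars`, which keeps newlines from being skipped as whitespace. The camelCase method names match the slides; newer pyparsing releases also offer snake_case equivalents.

```python
from pyparsing import (Group, OneOrMore, Optional, ParserElement, Suppress,
                       Word, alphanums, alphas, oneOf)

# Assumption: newlines must stay significant, so stop pyparsing from
# skipping them as whitespace. Call this before building any expression.
ParserElement.setDefaultWhitespaceChars(" \t")

newline = Suppress("\n")
colon = Suppress(":")
arrow = Suppress("->")
word = Word(alphas)

form_name = word.setResultsName("form_name")
field_name = word.setResultsName("field_name")
field_type = oneOf("CharField EmailField PasswordField").setResultsName("field_type")
key = word.setResultsName("property_key")
value = Word(alphanums).setResultsName("property_value")
field_property = Group(key + colon + value)


def convert_prop_to_dict(tokens):
    # Collapse the grouped key/value pairs into one dict token.
    return {token.property_key: token.property_value for token in tokens}


properties = Optional(arrow + OneOrMore(field_property))
properties.setParseAction(convert_prop_to_dict)

field = Group(field_name + colon + field_type + properties + newline)
form = form_name + newline + OneOrMore(field).setResultsName("fields")

# Hypothetical input; note the trailing newline after the last field.
input_form = (
    "UserForm\n"
    "name:CharField -> label:Username size:25\n"
    "email:EmailField -> size:32\n"
    "password:PasswordField\n"
)

parsed_form = form.parseString(input_form)
print(parsed_form.form_name)
print(parsed_form.fields[1].field_type)
print(parsed_form.fields[0][2])
```

Each field ends up as a group of name, type and a properties dict, which is the shape the rendering code in the following slides relies on (`field.field_name`, `field.field_type`, `field[2]`).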
Step 7: Generate HTML Output

We need to walk through the parsed form and
generate an HTML string from it
def get_field_html(field):
    properties = field[2]
    label = properties["label"] if "label" in properties else field.field_name
    label_html = "<label>" + label + "</label>"
    attributes = {"name": field.field_name}
    attributes.update(properties)
    if field.field_type == "CharField" or field.field_type == "EmailField":
        attributes["type"] = "text"
    else:
        attributes["type"] = "password"
    if "label" in attributes:
        del attributes["label"]
    attributes_html = " ".join([name + "='" + value + "'"
                                for name, value in attributes.items()])
    field_html = "<input " + attributes_html + "/>"
    return label_html + field_html + "<br/>"


def render(form):
    fields_html = "".join([get_field_html(field) for field in form.fields])
    return "<form id='" + form.form_name.lower() + "'>" + fields_html + "</form>"
Output
> print(render(form.parseString(input_form)))


<form id='userform'>
<label>Username</label>
<input type='text' name='name' size='25'/><br/>
<label>email</label>
<input type='text' name='email' size='32'/><br/>
<label>password</label>
<input type='password' name='password'/><br/>
</form>
It works, but....


                 Yuck!


The output rendering code is an UGLY MESS
Wish we could do this...
> print(Form(CharField(name="user", size="25", label="ID"),
             id="myform"))


<form id='myform'>
<label>ID</label>
<input type='text' name='user' size='25'/><br/>
</form>




Neat, clean syntax that matches the output domain
well. But how do we create this kind of syntax?
Let's create an Internal DSL
class HtmlElement(object):
    default_attributes = {}
    tag = "unknown_tag"

    def __init__(self, *args, **kwargs):
        # Start from the class defaults so that explicit keyword
        # arguments can override them.
        self.attributes = dict(self.default_attributes)
        self.attributes.update(kwargs)
        self.children = args

    def __str__(self):
        attribute_html = " ".join(["{}='{}'".format(name, value)
                                   for name, value in self.attributes.items()])
        if not self.children:
            return "<{} {}/>".format(self.tag, attribute_html)
        else:
            children_html = "".join([str(child) for child in self.children])
            return "<{} {}>{}</{}>".format(self.tag, attribute_html,
                                           children_html, self.tag)
> print(HtmlElement(id="test"))



<unknown_tag id='test'/>



> print(HtmlElement(HtmlElement(name="test"), id="id"))



<unknown_tag id='id'><unknown_tag name='test'/></unknown_tag>
class Input(HtmlElement):
    tag = "input"

    def __init__(self, *args, **kwargs):
        HtmlElement.__init__(self, *args, **kwargs)
        # Use the explicit label if given, otherwise fall back to the name.
        # The label is rendered separately, so it is not kept as an attribute.
        if "label" in self.attributes:
            self.label = self.attributes.pop("label")
        else:
            self.label = self.attributes["name"]

    def __str__(self):
        label_html = "<label>{}</label>".format(self.label)
        return label_html + HtmlElement.__str__(self) + "<br/>"
> print(Input(name="username"))



<label>username</label><input name='username'/><br/>



> print(Input(name="username", label="User ID"))



<label>User ID</label><input name='username'/><br/>
class Form(HtmlElement):
    tag = "form"


class CharField(Input):
    default_attributes = {"type": "text"}


class EmailField(CharField):
    pass


class PasswordField(Input):
    default_attributes = {"type": "password"}
Now...
> print(Form(CharField(name="user", size="25", label="ID"),
             id="myform"))


<form id='myform'>
<label>ID</label>
<input type='text' name='user' size='25'/><br/>
</form>




                            Nice!
Step 7 Revisited: Output HTML
def render(form):
    field_dict = {"CharField": CharField, "EmailField": EmailField,
                  "PasswordField": PasswordField}
    fields = [field_dict[field.field_type](name=field.field_name, **field[2])
              for field in form.fields]
    return Form(*fields, id=form.form_name.lower())




Now our output code uses our Internal DSL!
INPUT
UserForm
name:CharField -> label:Username size:25
email:EmailField -> size:32
password:PasswordField
                          OUTPUT
<form id='userform'>
<label>Username</label>
<input type='text' name='name' size='25'/><br/>
<label>email</label>
<input type='text' name='email' size='32'/><br/>
<label>password</label>
<input type='password' name='password'/><br/>
</form>
Get the whole code

http://bit.ly/pyconindia_dsl
Summary

+ DSLs make your code easier to read
+ DSLs make your code easier to write
+ DSLs make it easy for non-programmers to
maintain code
+ PyParsing makes it easy to write External DSLs
+ Python makes it easy to write Internal DSLs
