Introduction
               Diving In
            Building up




/regular expressions/demystified
 From deckhand to pirate in 30 minutes


Philip Tellis / philip@bluesmoon.info
                      Yahoo!




                           /regular expressions/demystified


Outline

  1   Introduction
         Who’s playing?
         Conventions

  2   Diving In
        Starting Small
        Getting meta

  3   Building up
        More or less
        Alternation
        Groups




$ whoami?




    Philip Tellis
    philip@bluesmoon.info
    @bluesmoon
    yahoo
    geek






Who are you?




     Developer
     Curious
     Interested in regular expressions
     You may or may not have used them before






What is a regular expression?




     A pattern that can match multiple strings
     A pattern matching language
     A Finite Automaton






What is a regular expression?




      But this is a hacker session, so let’s forget the theory.
                   (You can read the book later.)






Conventions used in this talk



      Text in ’single quotes’ denotes a literal string
      Text in /forward slashes/ denotes a regular expression
      The operator =~ indicates that the string on the left
      matches the pattern on the right
      The operator !~ indicates that the string on the left does
      not match the pattern on the right
      $string denotes a variable containing a string
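
The notation above is borrowed from Perl. As a rough sketch only, and assuming Python's standard `re` module (which the talk itself does not use), `=~` behaves like `re.search` finding the pattern somewhere in the string, and `!~` like it finding nothing. The helper name `matches` is hypothetical.

```python
import re

def matches(string, pattern):
    """Stand-in for `string =~ /pattern/`: True if the pattern occurs anywhere."""
    return re.search(pattern, string) is not None

string = "at"                       # plays the role of $string
print(matches(string, r"at"))       # 'at' =~ /at/  -> True
print(not matches(string, r"b"))    # 'at' !~ /b/   -> True
```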






Match a single character




        ’a’ =~ /a/






Let’s try a different character




         ’t’ =~ /t/






Building up




  Combine the previous two into a single regular expression

        ’at’ =~ /at/






You now know regular expressions




    To build a regular expression, break the pattern into small
      manageable pieces and incrementally combine them.






Metacharacters



     The regex language has its own syntax characters to do
     funky things
     Some of these act as wild cards
     Others act as modifiers to whatever comes before them
     And some of them make your brain explode
     We won’t be blowing up brains today






The . metacharacter


     Matches any ONE character, and ONLY ONE
              ’a’     =~    /./
              ’b’     =~    /./
              ’c’     =~    /./
              ’’      !~    /./

     The empty string has fewer than ONE character
     ’abc’ has ONE character... three times
              ’abc’ =~ /./
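
The same checks can be sketched in Python's `re` module (an assumption; the slides use Perl-style notation). Here `re.search` stands in for `=~`, and `re.findall` shows the three separate one-character matches in ’abc’. One flavour detail: in Python, `.` does not match a newline unless the `re.DOTALL` flag is given.

```python
import re

for s in ["a", "b", "c", "abc"]:
    assert re.search(r".", s) is not None    # each contains at least one character

assert re.search(r".", "") is None           # '' !~ /./

# 'abc' has ONE character... three times
dot_matches = re.findall(r".", "abc")
print(dot_matches)                           # ['a', 'b', 'c']
```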






The fate of gate hate date



             /.ate/




 Matches                                    Does not match
     aate bate cate date ...                    ate
     crates abates dates elated ...             ates ated
     @ate 9ate ’ ate’





Character classes



            /[a-z]ate/




Matches                                    Does not match
    aate bate cate date ...                    ate
    crates abates dates elated ...             ates ated
                                               @ate 9ate ’ ate’
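
To see how the character class narrows the wildcard, here is a small sketch in Python's `re` module (an assumption, not the talk's own tooling). It runs /.ate/ and /[a-z]ate/ over the same word list, which is made up from the examples on the slides.

```python
import re

words = ["date", "crates", "elated", "@ate", "9ate", " ate", "ate", "ates"]

# '.' accepts any character before 'ate'; [a-z] accepts only a lowercase letter
dot_hits = [w for w in words if re.search(r".ate", w)]
class_hits = [w for w in words if re.search(r"[a-z]ate", w)]

print(dot_hits)      # ['date', 'crates', 'elated', '@ate', '9ate', ' ate']
print(class_hits)    # ['date', 'crates', 'elated']
```

Both patterns reject ’ate’ and ’ates’, because each still needs one character in front of ’ate’.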






Character classes



  To match a literal ’-’, make it the first or last character in
  the class; anywhere else it denotes a range:

         /[+-*/]/                     # Incorrect: +-* is read as a range


         /[+*/-]/                     # Correct
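
A quick way to see the difference, sketched with Python's `re` module (an assumption; Python patterns are plain strings without the slash delimiters). In this flavour the first class is rejected outright, because `+-*` is read as a range that runs backwards from ’+’ (code 43) to ’*’ (code 42).

```python
import re

try:
    re.compile(r"[+-*/]")            # '-' in the middle: treated as a range
    range_error = None
except re.error as exc:
    range_error = str(exc)           # Python reports a bad character range

operators = re.compile(r"[+*/-]")    # '-' last: treated as a literal
found = operators.findall("3+4-5*6/7")
print(found)                         # ['+', '-', '*', '/']
```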






Negated character classes



           /[^a-z]ate/




Matches                                Does not match
    @ate 9ate ’ ate’                       ate ates ated
    g@ate e9ated                           aate bate cate date ...
                                           crates abates dates elated ...
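
A negated class still has to match one character; it only changes which characters are allowed. The sketch below uses Python's `re` module (an assumption) with word lists taken from the slide.

```python
import re

negated = re.compile(r"[^a-z]ate")

expected_hits = ["@ate", "9ate", " ate", "g@ate", "e9ated"]
expected_misses = ["ate", "ates", "ated", "bate", "crates", "elated"]

hits = [w for w in expected_hits + expected_misses if negated.search(w)]
print(hits == expected_hits)    # True

# 'ate' fails because there is no character at all before 'ate',
# not because the wrong character is there
print(negated.search("ate"))    # None
```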





The late fate of gate hate date rate



            /[df-hlr]ate/




 Matches                                  Does not match
     date fate gate hate late rate            ate aate bate cate eate
     dates fated billgates hated ...          iate jate kate ...





Anchors



           /^[df-hlr]ate$/




Matches                                  Does not match
    date fate gate hate late rate            ate aate bate ...
                                             dates gated berate elated ...






Anchors




     ^ matches the start of the string
     $ matches the end of the string
     Both are zero-width matches, i.e., they match a position
     and do not consume any character
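
The effect of the anchors can be sketched in Python's `re` module (an assumption; the word list is made up from the slides). The unanchored pattern finds ’rate’ inside ’berate’, the anchored one does not, and an anchor on its own matches an empty span.

```python
import re

words = ["date", "rate", "dates", "gated", "berate", "ate"]

loose = [w for w in words if re.search(r"[df-hlr]ate", w)]
strict = [w for w in words if re.search(r"^[df-hlr]ate$", w)]

print(loose)     # ['date', 'rate', 'dates', 'gated', 'berate']
print(strict)    # ['date', 'rate']

# ^ matches a position, so the match starts and ends at index 0
span = re.search(r"^", "date").span()
print(span)      # (0, 0)
```

One flavour detail: in Python, `$` also matches just before a single trailing newline.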






Matching more than one of something




     ? – matches 0 or 1 of what comes before it
     * – matches 0 or more of what comes before it
     + – matches 1 or more of what comes before it
     {n,m} – matches between n and m (inclusive) of what comes before it
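
As a sketch of the four quantifiers, the snippet below applies each one to the ’b’ in ’ab’, using Python's `re` module (an assumption). `re.fullmatch` is used so that the whole input has to fit the pattern; the patterns and inputs are made up for illustration.

```python
import re

patterns = {"?": r"ab?", "*": r"ab*", "+": r"ab+", "{2,3}": r"ab{2,3}"}
inputs = ["a", "ab", "abb", "abbb", "abbbb"]

# For each quantifier, list the inputs that fit the pattern from start to end
table = {q: [s for s in inputs if re.fullmatch(p, s)] for q, p in patterns.items()}

for quantifier, accepted in table.items():
    print(quantifier, accepted)
# ?      ['a', 'ab']
# *      ['a', 'ab', 'abb', 'abbb', 'abbbb']
# +      ['ab', 'abb', 'abbb', 'abbbb']
# {2,3}  ['abb', 'abbb']
```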






Aaargh!




                          Everyone shout “Aaarrrgh!”






How many ways can you say Aargh!?



     argh
     aaaaaargh
     aaaarrrrghhh
     aaaaarrrrrggggghhhh
     aaarrrrggggg
     aaaaarrrrrhhhh






Match ’em all




       /a+r+g+h+/           #     aarrrrgggghhhh
       /a+r+g+h*/           #     aarrgghh & aarrgg
       /a+r+g*h+/           #     aarrgghh & aarrhh
       /a+r+g*h*/           #     argh & arg & arh


  That last one also matches ’ar’ which we don’t want
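
The four attempts can be checked against the list of screams from the earlier slide. This is a sketch in Python's `re` module (an assumption), using `re.fullmatch` so that each scream must fit the pattern from start to end.

```python
import re

arghs = ["argh", "aaaaaargh", "aaaarrrrghhh", "aaaaarrrrrggggghhhh",
         "aaarrrrggggg", "aaaaarrrrrhhhh"]
patterns = [r"a+r+g+h+", r"a+r+g+h*", r"a+r+g*h+", r"a+r+g*h*"]

# For each pattern, collect the screams it fails to match
missed = {p: [s for s in arghs if not re.fullmatch(p, s)] for p in patterns}

for p in patterns:
    print(p, "misses", missed[p])

# The last pattern misses nothing, but it is too generous
too_generous = re.fullmatch(r"a+r+g*h*", "ar") is not None
print(too_generous)    # True
```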






Alternation: Match all this or all that




        /ab|cd/


  Matches either ’ab’ or ’cd’






From here to eternity

  | matches either everything on its left or everything on its right
            (That’s a pipe character, not the letter I)






Back to aaargh



       /g*h+|g+h*/


  This matches all the endings we want:
      ggggghhhhhh
      ggggg
      hhhhh






Back to aaargh


       /a+r+g*h+|g+h*/


  This doesn’t quite work: | splits the whole pattern, so the
  alternatives are a+r+g*h+ and g+h*

Matches as a whole                          Does not match as a whole
    aaarrrhhh                                  aaarrrggg
    aaarrrrggghhh
    gggg
    gggghhhh
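
The reason is that | binds more loosely than everything around it. The sketch below, in Python's `re` module (an assumption), compares the pattern with an explicitly parenthesised spelling and uses `re.fullmatch` to test whole strings. An unanchored search would still find the ’ggg’ tail of ’aaarrrggg’ through the second alternative.

```python
import re

pattern = r"a+r+g*h+|g+h*"
explicit = r"(?:a+r+g*h+)|(?:g+h*)"    # how the pattern above is actually read
samples = ["aaarrrhhh", "aaarrrrggghhh", "gggg", "gggghhhh", "aaarrrggg"]

whole = [s for s in samples if re.fullmatch(pattern, s)]
same = whole == [s for s in samples if re.fullmatch(explicit, s)]

print(whole)    # ['aaarrrhhh', 'aaarrrrggghhh', 'gggg', 'gggghhhh']
print(same)     # True

# Unanchored, only the tail of 'aaarrrggg' is matched
tail = re.search(pattern, "aaarrrggg").group()
print(tail)     # ggg
```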




Group the subexpression



      /a+r+(g*h+|g+h*)/



  Matches
      aaarrrhhh
      aaarrrggg
      aaarrrrggghhh
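
With the group in place, the alternation applies only to the ending. A sketch in Python's `re` module (an assumption), again using `re.fullmatch` for whole-string checks against the screams from the earlier slide:

```python
import re

grouped = re.compile(r"a+r+(g*h+|g+h*)")

arghs = ["argh", "aaaaaargh", "aaaarrrrghhh", "aaaaarrrrrggggghhhh",
         "aaarrrrggggg", "aaaaarrrrrhhhh"]
all_match = all(grouped.fullmatch(s) for s in arghs)
print(all_match)    # True

# Strings we did not want are now rejected
rejects = [s for s in ["ar", "gggg", "aaarrr"] if not grouped.fullmatch(s)]
print(rejects)      # ['ar', 'gggg', 'aaarrr']
```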






Grouping parentheses




     ( and ) mark a group
     | alternates within a group
     Groups may be nested - it’s like a new regex inside
     +, *, ? and {n,m} may apply to an entire group






Stop






Summary




    Start small, match the parts you understand
    Build up to more complex patterns
    Not all problems should be solved by regular expressions






More Info...




      “Mastering Regular Expressions” – Jeffrey Friedl
      http://tech.bluesmoon.info/search/label/regex






Contact me




  Philip Tellis
  philip@bluesmoon.info
  @bluesmoon
  bluesmoon.info






Image credits




     http://flickr.com/photos/practicalowl/3933514241/
     http://flickr.com/photos/loozrboy/3908830690/
     http://flickr.com/photos/thetruthabout/2680546103/
     http://flickr.com/photos/donsolo/2136923757/








     Thank You






Aargh with class



      /a+r+g*[gh]h*/



  Matches
      aaarrrhhh
      aaarrrggg
      aaarrrrggghhh
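
The class-based pattern appears to accept exactly the same whole strings as the grouped /a+r+(g*h+|g+h*)/. One way to gain confidence, sketched in Python's `re` module (an assumption), is a brute-force comparison over every short string built from the four letters. This is a sanity check, not a proof.

```python
import re
from itertools import product

grouped = re.compile(r"a+r+(g*h+|g+h*)")
with_class = re.compile(r"a+r+g*[gh]h*")

# Every string of length 0 to 6 over the letters a, r, g, h
candidates = ["".join(c) for n in range(7) for c in product("argh", repeat=n)]

disagreements = [s for s in candidates
                 if bool(grouped.fullmatch(s)) != bool(with_class.fullmatch(s))]
accepted = [s for s in candidates if with_class.fullmatch(s)]

print(len(disagreements))    # 0: the two patterns agree on every candidate
print("ar" in accepted)      # False: at least one g or h is still required
```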






Matching meta characters in a character class




        /[a-zA-Z0-9_-]/
        /[a-z^]/
        /[][]/
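
Each of these classes depends on position: ’-’ placed last, ’^’ placed anywhere but first, and ’]’ placed first are all taken literally. The sketch below checks this in Python's `re` module (an assumption); other flavours may treat /[][]/ differently, so test it in the engine you use. The input strings are made up.

```python
import re

# '-' at the end of the class is a literal dash
word_or_dash = re.findall(r"[a-zA-Z0-9_-]", "a-b_c!")
print(word_or_dash)    # ['a', '-', 'b', '_', 'c']

# '^' only negates when it comes first
caret_class = re.findall(r"[a-z^]", "x^2")
print(caret_class)     # ['x', '^']

# ']' right after the opening '[' is a literal, so this class holds ']' and '['
brackets = re.findall(r"[][]", "a[0]")
print(brackets)        # ['[', ']']
```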






Alternating multiple items




       /apples|oranges|bananas/

       /buy some (apples|oranges|ba(na){2}s)/
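
The second pattern nests a quantified group inside an alternation: `ba(na){2}s` spells ’bananas’. A sketch in Python's `re` module (an assumption), with a made-up input sentence:

```python
import re

fruit = re.compile(r"buy some (apples|oranges|ba(na){2}s)")

match = fruit.search("please buy some bananas today")
bought = match.group(1)      # the outer group: whichever alternative matched
print(bought)                # bananas

# {2} demands exactly two 'na', so a misspelling is rejected
misspelt = fruit.search("buy some banas")
print(misspelt)              # None
```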




                                    /regular expressions/demystified

More Related Content

DOCX
3 film magazine analysis
PPT
Musicademy Worship Guitar - Licks, Tricks and Cheats 2013
PDF
Project four, the essay
PDF
Project%20four%2c%20the%20essay
PDF
Project%20four%2c%20the%20essay
PDF
Feature Structure Unification Syntactic Parser 2.0
PDF
Emo Essay Rubric and Feedback Checklist
PPTX
Lake Forest Arts Presentation
3 film magazine analysis
Musicademy Worship Guitar - Licks, Tricks and Cheats 2013
Project four, the essay
Project%20four%2c%20the%20essay
Project%20four%2c%20the%20essay
Feature Structure Unification Syntactic Parser 2.0
Emo Essay Rubric and Feedback Checklist
Lake Forest Arts Presentation

Viewers also liked (20)

PPTX
"Lima una ciudad de reyes"
PPTX
EtCETra 2k15- Entertainment Quiz Finals
PPT
Ch06 1
PDF
S.S.C. Certificate0001
PPTX
English teachers who blog
 
PDF
Software Radio Implementation: A Systems Perspective
PDF
olajide doc.PDF
PPTX
Bharti B'Day Slide
PPTX
PPT
Expectation of Corporates from Professionals
PDF
Office
PDF
Learning from games : Dr Joanne O'Mara
PPTX
Social Media in the Classroom - ICTEV Conf Gail Casey May12
PDF
AATE handout
PDF
Hamare Rusoom wa Quyood: Syed Ali Naqi Naqvi Sahab t.s.
PPTX
Samit Malkani_TiE Institute_Social Media Content Strategy_280712
PPTX
Aradhana
PPTX
English curriculum studies 1 - Lecture 1
 
PPTX
Main Bitiya Rani Hoon Na
PPTX
Tori.fi ostoaie ja big data
"Lima una ciudad de reyes"
EtCETra 2k15- Entertainment Quiz Finals
Ch06 1
S.S.C. Certificate0001
English teachers who blog
 
Software Radio Implementation: A Systems Perspective
olajide doc.PDF
Bharti B'Day Slide
Expectation of Corporates from Professionals
Office
Learning from games : Dr Joanne O'Mara
Social Media in the Classroom - ICTEV Conf Gail Casey May12
AATE handout
Hamare Rusoom wa Quyood: Syed Ali Naqi Naqvi Sahab t.s.
Samit Malkani_TiE Institute_Social Media Content Strategy_280712
Aradhana
English curriculum studies 1 - Lecture 1
 
Main Bitiya Rani Hoon Na
Tori.fi ostoaie ja big data
Ad

Similar to Regular Expressions Demystified (20)

PPT
RegEx Parsing
PDF
Week-2: Theory & Practice of Data Cleaning: Regular Expressions in Theory
PPTX
NUS_NLP__Foundations_-_Section_2_-_Words.pptx
PPT
Introduction to regular expressions
ODP
OISF: Regular Expressions (Regex) Overview
KEY
Andrei's Regex Clinic
PPT
Introduction to Regular Expressions RootsTech 2013
ODP
DerbyCon 7.0 Legacy: Regular Expressions (Regex) Overview
ODP
CiNPA Security SIG - Regex Presentation
PPTX
Regular Expressions
PPTX
Regular Expressions(Theory of programming languages))
PDF
Regular expressions-google-analytics
PPT
Chapter Two(1)
PPTX
Regular Expression Crash Course
KEY
Regular expressions
PDF
Regex - Regular Expression Basics
PPT
Regular Expression in Action
PDF
Regular Expressions 101
PPT
Perl Intro 5 Regex Matches And Substitutions
PPTX
Regular Expressions Introduction Anthony Rudd CS
RegEx Parsing
Week-2: Theory & Practice of Data Cleaning: Regular Expressions in Theory
NUS_NLP__Foundations_-_Section_2_-_Words.pptx
Introduction to regular expressions
OISF: Regular Expressions (Regex) Overview
Andrei's Regex Clinic
Introduction to Regular Expressions RootsTech 2013
DerbyCon 7.0 Legacy: Regular Expressions (Regex) Overview
CiNPA Security SIG - Regex Presentation
Regular Expressions
Regular Expressions(Theory of programming languages))
Regular expressions-google-analytics
Chapter Two(1)
Regular Expression Crash Course
Regular expressions
Regex - Regular Expression Basics
Regular Expression in Action
Regular Expressions 101
Perl Intro 5 Regex Matches And Substitutions
Regular Expressions Introduction Anthony Rudd CS
Ad

More from Philip Tellis (20)

PDF
Improving D3 Performance with CANVAS and other Hacks
PDF
Frontend Performance: Beginner to Expert to Crazy Person
PDF
Frontend Performance: De débutant à Expert à Fou Furieux
PDF
Frontend Performance: Expert to Crazy Person
PDF
Beyond Page Level Metrics
PDF
Frontend Performance: Beginner to Expert to Crazy Person (San Diego Web Perf ...
PDF
Frontend Performance: Beginner to Expert to Crazy Person
PDF
Frontend Performance: Beginner to Expert to Crazy Person
PDF
Frontend Performance: Beginner to Expert to Crazy Person
PDF
mmm... beacons
PDF
RUM Distillation 101 -- Part I
PDF
Improving 3rd Party Script Performance With IFrames
PDF
Extending Boomerang
PDF
Abusing JavaScript to measure Web Performance, or, "how does boomerang work?"
PDF
The Statistics of Web Performance Analysis
PDF
Abusing JavaScript to Measure Web Performance
PDF
Rum for Breakfast
PDF
Analysing network characteristics with JavaScript
PDF
A Node.JS bag of goodies for analyzing Web Traffic
PDF
Input sanitization
Improving D3 Performance with CANVAS and other Hacks
Frontend Performance: Beginner to Expert to Crazy Person
Frontend Performance: De débutant à Expert à Fou Furieux
Frontend Performance: Expert to Crazy Person
Beyond Page Level Metrics
Frontend Performance: Beginner to Expert to Crazy Person (San Diego Web Perf ...
Frontend Performance: Beginner to Expert to Crazy Person
Frontend Performance: Beginner to Expert to Crazy Person
Frontend Performance: Beginner to Expert to Crazy Person
mmm... beacons
RUM Distillation 101 -- Part I
Improving 3rd Party Script Performance With IFrames
Extending Boomerang
Abusing JavaScript to measure Web Performance, or, "how does boomerang work?"
The Statistics of Web Performance Analysis
Abusing JavaScript to Measure Web Performance
Rum for Breakfast
Analysing network characteristics with JavaScript
A Node.JS bag of goodies for analyzing Web Traffic
Input sanitization

Recently uploaded (20)

PDF
“A New Era of 3D Sensing: Transforming Industries and Creating Opportunities,...
PPTX
AI IN MARKETING- PRESENTED BY ANWAR KABIR 1st June 2025.pptx
PDF
Convolutional neural network based encoder-decoder for efficient real-time ob...
PDF
Early detection and classification of bone marrow changes in lumbar vertebrae...
PDF
Transform-Quality-Engineering-with-AI-A-60-Day-Blueprint-for-Digital-Success.pdf
PDF
Transform-Your-Streaming-Platform-with-AI-Driven-Quality-Engineering.pdf
PPT
Geologic Time for studying geology for geologist
PPTX
Module 1 Introduction to Web Programming .pptx
PDF
UiPath Agentic Automation session 1: RPA to Agents
PDF
5-Ways-AI-is-Revolutionizing-Telecom-Quality-Engineering.pdf
PDF
sustainability-14-14877-v2.pddhzftheheeeee
PPT
Galois Field Theory of Risk: A Perspective, Protocol, and Mathematical Backgr...
PPTX
Training Program for knowledge in solar cell and solar industry
PDF
Improvisation in detection of pomegranate leaf disease using transfer learni...
PPTX
MicrosoftCybserSecurityReferenceArchitecture-April-2025.pptx
PPTX
TEXTILE technology diploma scope and career opportunities
PDF
Transform-Your-Factory-with-AI-Driven-Quality-Engineering.pdf
PDF
STKI Israel Market Study 2025 version august
PDF
Consumable AI The What, Why & How for Small Teams.pdf
PPTX
Microsoft Excel 365/2024 Beginner's training
“A New Era of 3D Sensing: Transforming Industries and Creating Opportunities,...
AI IN MARKETING- PRESENTED BY ANWAR KABIR 1st June 2025.pptx
Convolutional neural network based encoder-decoder for efficient real-time ob...
Early detection and classification of bone marrow changes in lumbar vertebrae...
Transform-Quality-Engineering-with-AI-A-60-Day-Blueprint-for-Digital-Success.pdf
Transform-Your-Streaming-Platform-with-AI-Driven-Quality-Engineering.pdf
Geologic Time for studying geology for geologist
Module 1 Introduction to Web Programming .pptx
UiPath Agentic Automation session 1: RPA to Agents
5-Ways-AI-is-Revolutionizing-Telecom-Quality-Engineering.pdf
sustainability-14-14877-v2.pddhzftheheeeee
Galois Field Theory of Risk: A Perspective, Protocol, and Mathematical Backgr...
Training Program for knowledge in solar cell and solar industry
Improvisation in detection of pomegranate leaf disease using transfer learni...
MicrosoftCybserSecurityReferenceArchitecture-April-2025.pptx
TEXTILE technology diploma scope and career opportunities
Transform-Your-Factory-with-AI-Driven-Quality-Engineering.pdf
STKI Israel Market Study 2025 version august
Consumable AI The What, Why & How for Small Teams.pdf
Microsoft Excel 365/2024 Beginner's training

Regular Expressions Demystified

  • 1. Introduction Diving In Building up /regular expressions/demystified From deckhand to pirate in 30 minutes Philip Tellis / [email protected] Yahoo! /regular expressions/demystified
  • 2. Introduction Diving In Building up Outline 1 Introduction Who’s playing? Conventions 2 Diving In Starting Small Getting meta 3 Building up More or less Alternation Groups /regular expressions/demystified
  • 3. Introduction Who’s playing? Diving In Conventions Building up $ whoami? Philip Tellis [email protected] @bluesmoon yahoo geek /regular expressions/demystified
  • 4. Introduction Who’s playing? Diving In Conventions Building up Who are you? Developer Curious Interested in regular expressions You may or may not have used them before /regular expressions/demystified
  • 5. Introduction Who’s playing? Diving In Conventions Building up What is a regular expression? A pattern that can match multiple strings A pattern matching language A Finite Automaton /regular expressions/demystified
  • 6. Introduction Who’s playing? Diving In Conventions Building up What is a regular expression? But this is a hacker session, so let’s forget the theory. (You can read the book later.) /regular expressions/demystified
  • 7. Introduction Who’s playing? Diving In Conventions Building up What is a regular expression? But this is a hacker session, so let’s forget the theory. (You can read the book later.) /regular expressions/demystified
  • 8. Introduction Who’s playing? Diving In Conventions Building up Conventions used in this talk Text in ’single quotes’ denotes a literal string Text in /forward slashes/ denotes a regular expression The operator =∼ indicates that the string on the left matches the pattern on the right The operator !∼ indicates that the string on the left does not match the pattern on the right $string denotes a variable containing a string /regular expressions/demystified
  • 9. Match a single character: 'a' =~ /a/
  • 10. Let’s try a different character: 't' =~ /t/
  • 11. Building up: combine the previous two into a single regular expression: 'at' =~ /at/
  • 12. You now know regular expressions. To build a regular expression, break the pattern into small manageable pieces and incrementally combine them.
  • 13. Metacharacters
        - The regex language has its own syntax characters to do funky things
        - Some of these act as wild cards
        - Others act as modifiers to whatever comes before them
        - And some of them make your brain explode
        - We won’t be blowing up brains today
  • 16. The . metacharacter: matches ONE and ONLY ONE character
        'a' =~ /./
        'b' =~ /./
        'c' =~ /./
        '' !~ /./   (the empty string has fewer than ONE character)
        'abc' =~ /./   ('abc' has ONE character... three times)
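One way to see "ONE character, three times" is to list every non-overlapping match of the dot. This is a small sketch assuming Python's `re` module; `dot_matches` is a name invented here. In Python, as in most flavors, `.` does not match a newline unless a flag says otherwise.

```python
import re

def dot_matches(s):
    """Return every single-character match of /./ in s."""
    return re.findall(r".", s)

for s in ["a", "abc", ""]:
    print(repr(s), dot_matches(s))
```

An empty result list corresponds to `!~` in the talk's notation.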
  • 19. The fate of gate hate date: /.ate/
        Matches: aate, bate, cate, date, ...; crates, abates, dates, elated, ...; @ate, 9ate, ' ate'
        Does not match: ate, ates, ated
  • 21. Character classes: /[a-z]ate/
        Matches: aate, bate, cate, date, ...; crates, abates, dates, elated, ...
        Does not match: ate, ates, ated; @ate, 9ate, ' ate'
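The difference between /.ate/ and /[a-z]ate/ can be checked mechanically. The sketch below assumes Python's `re` module and uses a sample of the words from the slides; the helper name `matching` is hypothetical.

```python
import re

# A sample of the words used on the /.ate/ and /[a-z]ate/ slides.
words = ["aate", "date", "crates", "elated", "ate", "ates", "ated", "@ate", "9ate", " ate"]

def matching(pattern, candidates):
    """Return the candidates that contain a match for pattern."""
    return [w for w in candidates if re.search(pattern, w)]

any_char = matching(r".ate", words)
lowercase_only = matching(r"[a-z]ate", words)
print(any_char)
print(lowercase_only)
# Words accepted by /.ate/ but rejected by /[a-z]ate/:
print([w for w in any_char if w not in lowercase_only])
```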
  • 22. Character classes: to match a literal '-', make it the first or last character in the class
        /[+-*/]/   # Incorrect
        /[+*/-]/   # Correct
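A quick check of the hyphen rule, assuming Python's `re` module. Here the "incorrect" class is rejected outright because `+-*` is a reversed range; with two characters in ascending order, a hyphen in the middle would instead silently form a range, which is harder to spot.

```python
import re

# A hyphen between two characters is read as a range; '+' to '*' runs backwards.
try:
    re.compile(r"[+-*/]")
    incorrect_compiles = True
except re.error as exc:
    incorrect_compiles = False
    print("rejected:", exc)

# With the hyphen last, it is a literal '-'.
operators = re.compile(r"[+*/-]")
found = [c for c in "3+4-5*6/7" if operators.search(c)]
print(found)
```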
  • 23. Negated character classes: /[^a-z]ate/
        Matches: @ate, 9ate, ' ate'; g@ate, e9ated
        Does not match: ate, ates, ated; aate, bate, cate, date, ...; crates, abates, dates, elated, ...
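A negated class still has to consume one character, which is why plain 'ate' fails. The sketch below, assuming Python's `re` module, prints the text each match actually covers; `matched_text` is a made-up helper.

```python
import re

negated = re.compile(r"[^a-z]ate")

def matched_text(word):
    """Return the part of word matched by /[^a-z]ate/, or None."""
    m = negated.search(word)
    return m.group() if m else None

for word in ["@ate", "g@ate", "e9ated", "ate", "date"]:
    print(repr(word), "->", repr(matched_text(word)))
```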
  • 24. The late fate of gate hate date rate: /[df-hlr]ate/
        Matches: date, fate, gate, hate, late, rate; dates, fated, billgates, hated, ...
        Does not match: ate, aate, bate, cate, eate, iate, jate, kate, ...
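A class may mix single characters and ranges. To see exactly which letters [df-hlr] admits, one option (a sketch assuming Python's `re` module) is to test every lowercase letter against it:

```python
import re
import string

# Which single lowercase letters does [df-hlr] accept?
accepted = [c for c in string.ascii_lowercase if re.fullmatch(r"[df-hlr]", c)]
print(accepted)

# The pattern is unanchored, so it can match inside a longer word.
inner = re.search(r"[df-hlr]ate", "billgates")
print(inner.group())
```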
  • 26. Anchors: /^[df-hlr]ate$/
        Matches: date, fate, gate, hate, late, rate
        Does not match: ate, aate, bate, ...; dates, gated, berate, elated, ...
  • 27. Anchors
        - ^ matches the start of the string
        - $ matches the end of the string
        - Both are zero-width matches, i.e., they match a position and do not consume any character
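The zero-width nature of anchors shows up in the span of a match. This sketch assumes Python's `re` module, where `$` also matches just before a trailing newline; other flavors may differ on that detail.

```python
import re

# Anchors match positions, not characters: both spans are empty.
start = re.search(r"^", "date").span()
end = re.search(r"$", "date").span()
print(start, end)

# With both anchors, the pattern must account for the whole string.
anchored = re.compile(r"^[df-hlr]ate$")
whole_words = [w for w in ["date", "rate", "dates", "gated", "berate"] if anchored.search(w)]
print(whole_words)
```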
  • 28. Matching more than one of something
        - ? matches 0 or 1 of what comes before it
        - * matches 0 or more of what comes before it
        - + matches 1 or more of what comes before it
        - {n,m} matches between n and m of what comes before it
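To make the four quantifiers concrete, the sketch below (assuming Python's `re` module; `accepted_counts` is a hypothetical helper) asks how many repetitions of 'a' each pattern accepts when it must cover the whole string.

```python
import re

def accepted_counts(pattern, limit=5):
    """Return the n in 0..limit-1 for which 'a' * n matches pattern in full."""
    return [n for n in range(limit) if re.fullmatch(pattern, "a" * n)]

for pattern in [r"a?", r"a*", r"a+", r"a{2,3}"]:
    print(pattern, accepted_counts(pattern))
```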
  • 29. Aaargh! Everyone shout “Aaarrrgh!”
  • 30. How many ways can you say Aargh!? argh, aaaaaargh, aaaarrrrghhh, aaaaarrrrrggggghhhh, aaarrrrggggg, aaaaarrrrrhhhh
  • 31. Match ’em all
        /a+r+g+h+/   # aarrrrgggghhhh
        /a+r+g+h*/   # aarrgghh & aarrgg
        /a+r+g*h+/   # aarrgghh & aarrhh
        /a+r+g*h*/   # argh & arg & arh
        That last one also matches 'ar', which we don’t want
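The four candidates can be compared side by side. This sketch assumes Python's `re` module and uses `fullmatch` (an assumption made here; the slides use unanchored patterns) so that each pattern is judged against the whole shout rather than any substring.

```python
import re

shouts = ["argh", "aaaarrrrghhh", "aaarrrrggggg", "aaaaarrrrrhhhh", "ar"]
patterns = [r"a+r+g+h+", r"a+r+g+h*", r"a+r+g*h+", r"a+r+g*h*"]

# fullmatch is used so each pattern is judged against the whole shout.
accepted = {p: [s for s in shouts if re.fullmatch(p, s)] for p in patterns}
for p in patterns:
    print(p, accepted[p])
```

Only the last pattern accepts every shout, and it is also the only one that lets 'ar' through.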
  • 36. Alternation: match all this or all that. /ab|cd/ matches either 'ab' or 'cd'
  • 37. From here to eternity: | matches either everything on its left or everything on its right. (That’s a pipe character, not the letter I.)
  • 38. Back to aaargh: /g*h+|g+h*/ matches all the endings we want: ggggghhhhhh, ggggg, hhhhh
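Both alternations can be tried directly. The sketch assumes Python's `re` module; the extra test strings 'ad' and 'abcd' and the use of `fullmatch` for the endings are choices made here, not part of the talk.

```python
import re

either = re.compile(r"ab|cd")
either_hits = [s for s in ["ab", "cd", "ad", "abcd"] if either.search(s)]
print(either_hits)

# The endings from the Aargh example; "" stands for a bare 'ar' with no ending.
ending = re.compile(r"g*h+|g+h*")
endings = [e for e in ["ggggghhhhhh", "ggggg", "hhhhh", ""] if ending.fullmatch(e)]
print(endings)
```

Each branch requires at least one character, so the empty ending is rejected.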
  • 39. Back to aaargh: /a+r+g*h+|g+h*/
        This doesn’t quite work: | splits the whole pattern, so it reads as a+r+g*h+ or g+h*
        Matches: aaarrrhhh, aaarrrggg, aaarrrrggghhh, and also gggg, gggghhhh
        Does not match: none of these, so the unwanted gggg and gggghhhh slip through
  • 41. Group the subexpression: /a+r+(g*h+|g+h*)/
        Matches: aaarrrhhh, aaarrrggg, aaarrrrggghhh
  • 42. Grouping parentheses
        - ( and ) mark a group
        - | alternates within a group
        - Groups may be nested; each group is like a new regex inside
        - +, *, ? and {n,m} may apply to an entire group
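The effect of the parentheses is easiest to see by running the ungrouped and grouped patterns over the same inputs. This is a sketch assuming Python's `re` module; the `(ab)+` line is an extra illustration of a quantifier applied to a group.

```python
import re

ungrouped = re.compile(r"a+r+g*h+|g+h*")    # means (a+r+g*h+) or (g+h*)
grouped = re.compile(r"a+r+(g*h+|g+h*)")    # alternation confined to the ending

samples = ["aaarrrhhh", "aaarrrggg", "aaarrrrggghhh", "gggg", "gggghhhh", "ar"]
ungrouped_hits = [s for s in samples if ungrouped.search(s)]
grouped_hits = [s for s in samples if grouped.search(s)]
print(ungrouped_hits)
print(grouped_hits)

# A quantifier after a group applies to the whole group.
repeated_group = re.fullmatch(r"(ab)+", "ababab") is not None
print(repeated_group)
```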
  • 43. Stop
  • 44. Summary: start small and match the parts you understand; build up to more complex patterns; not all problems should be solved by regular expressions
  • 45. More info: “Mastering Regular Expressions” by Jeffrey Friedl; http://tech.bluesmoon.info/search/label/regex
  • 46. Contact me: Philip Tellis, philip@bluesmoon.info, @bluesmoon, bluesmoon.info
  • 47. Image credits: http://flickr.com/photos/practicalowl/3933514241/ http://flickr.com/photos/loozrboy/3908830690/ http://flickr.com/photos/thetruthabout/2680546103/ http://flickr.com/photos/donsolo/2136923757/
  • 48. Thank You
  • 49. Aargh with class: /a+r+g*[gh]h*/
        Matches: aaarrrhhh, aaarrrggg, aaarrrrggghhh
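The class-based pattern is meant to accept the same shouts as the grouped alternation /a+r+(g*h+|g+h*)/. A bounded brute-force comparison, sketched here with Python's `re` and `itertools`, gives some confidence in that; it is a spot check over short endings, not a proof.

```python
import re
from itertools import product

with_class = re.compile(r"a+r+g*[gh]h*")
with_group = re.compile(r"a+r+(g*h+|g+h*)")

# Try every ending built from 'g' and 'h' up to length 5 (including none).
endings = ["".join(p) for n in range(6) for p in product("gh", repeat=n)]
disagreements = [
    e for e in endings
    if bool(with_class.fullmatch("ar" + e)) != bool(with_group.fullmatch("ar" + e))
]
print(len(endings), "endings checked, disagreements:", disagreements)
```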
  • 50. Matching meta characters in a character class
        /[a-zA-Z0-9_-]/
        /[a-z^]/
        /[][]/
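What each of these classes accepts can be listed by testing a handful of single characters. The sketch assumes Python's `re` module, where a trailing `-`, a non-leading `^`, and a leading `]` are all taken literally; not every flavor agrees (JavaScript, for one, reads `[]` as an empty class). The sample characters are chosen here for illustration.

```python
import re

samples = ["q", "Q", "7", "_", "-", "^", "[", "]", "!"]

def members(char_class):
    """Return the sample characters that the class matches."""
    return [c for c in samples if re.fullmatch(char_class, c)]

for char_class in [r"[a-zA-Z0-9_-]", r"[a-z^]", r"[][]"]:
    print(char_class, members(char_class))
```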
  • 51. Alternating multiple items
        /apples|oranges|bananas/
        /buy some (apples|oranges|ba(na){2}s)/
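The second pattern combines alternation, nesting, and a counted repeat: `ba(na){2}s` spells 'bananas'. A small sketch, assuming Python's `re` module, with `fruit` as a made-up helper and 'banas' as a made-up near miss:

```python
import re

shopping = re.compile(r"buy some (apples|oranges|ba(na){2}s)")

def fruit(sentence):
    """Return the fruit captured by the outer group, or None."""
    m = shopping.search(sentence)
    return m.group(1) if m else None

print(fruit("buy some bananas"))
print(fruit("buy some oranges"))
print(fruit("buy some banas"))
```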