Chapter 5: Understanding Text Processing (The Complete Guide to Linux System Administration)
Objectives Use regular expressions in a variety of circumstances Manipulate text files in complex ways using multiple command-line utilities Use advanced features of the vi editor Use the sed and awk text processing utilities
Regular Expressions Flexible way to encode many types of complex patterns Use to define pattern in many situations Parameter to most Linux commands  Within vi editor  Within programming languages Including shell scripts Used for text
Regular Expressions (continued) Acceptable syntax varies in small but important ways, depending on where expression is used Examples: [Rr]eunion[0-9][0-9]\.jpg [Rr]eunion[0-9]{2}\.jpg Reunion-[^d]\.jpg (backslash makes the dot literal; an unescaped dot matches any single character)
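The three example patterns can be tried out with grep. This is a minimal sketch, assuming a made-up list of file names; the ^ and $ anchors and the escaped dot are additions for the illustration, and the {2} form is run with grep -E because the repetition count belongs to the extended syntax.

```shell
# Hypothetical file names, used only for illustration.
names=$(printf '%s\n' Reunion01.jpg reunion42.jpg Reunion7.jpg \
  reunion123.jpg Reunion-d.jpg Reunion-x.jpg)

# Two digits written out twice; works in basic regular expressions.
two_digits=$(printf '%s\n' "$names" | grep '^[Rr]eunion[0-9][0-9]\.jpg$')

# Same match using the {2} repetition count; needs extended syntax (-E).
two_digits_ere=$(printf '%s\n' "$names" | grep -E '^[Rr]eunion[0-9]{2}\.jpg$')

# [^d] matches any single character except d.
not_d=$(printf '%s\n' "$names" | grep '^Reunion-[^d]\.jpg$')

printf '%s\n' "$two_digits" "$not_d"
```

Both two-digit forms select the same names, which shows that the difference between them is one of syntax, not of meaning.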
Manipulating Files Command-line utilities useful for: Searching  Sorting Reorganizing Otherwise working with text files
Searching for Patterns with grep grep  Rapidly scan files for specified pattern  Print out lines of text that contain text matching pattern Take further action on matching lines of text  Using pipe to connect grep with other filtering commands
Searching for Patterns with grep (continued) Examples: grep wilson /etc/passwd grep 'thomas[Cc]orp' *txt (quotes keep the shell from expanding the brackets) Often used at end of pipe: locate tif | grep frame
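A short sketch of both uses of grep, directly on a file and at the end of a pipe. The sample file is hypothetical and stands in for /etc/passwd; the cut command and the -c option are choices made for this illustration.

```shell
# Hypothetical sample data standing in for /etc/passwd.
tmp=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'wilson:x:1001:1001:Pat Wilson:/home/wilson:/bin/bash' \
  'thomas:x:1002:1002:Lee Thomas:/home/thomas:/bin/sh' > "$tmp"

# Print the lines that contain the text "wilson" (matching is case-sensitive).
wilson_line=$(grep wilson "$tmp")

# grep at the end of a pipe: cut extracts the login-shell field,
# then grep -c counts the lines that mention bash.
bash_users=$(cut -d: -f7 "$tmp" | grep -c 'bash')

echo "$wilson_line"
echo "$bash_users"
rm -f "$tmp"
```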
Examining File Contents head and tail commands:  Display first few lines and last few lines of file By default include 10 lines -n option Specify number of lines Print output to STDOUT Redirect as needed
Examining File Contents (continued) tail -f option “Follows” file, printing new lines as they are added to file by other programs Very useful for tracking log files wc command Counts number of characters, words, and lines
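The head, tail, and wc commands can be sketched on a small generated file. The file contents are hypothetical; tail -f appears only in a comment because it keeps running until interrupted.

```shell
# Build a hypothetical 25-line file to examine.
log=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 25; i++) print "entry " i }' > "$log"

# With no option, head prints the first 10 lines.
first_default=$(head "$log" | wc -l | tr -d ' ')

# -n sets the number of lines for head and tail.
first_three=$(head -n 3 "$log")
last_two=$(tail -n 2 "$log")

# wc -l counts lines, wc -w counts words.
line_count=$(wc -l < "$log" | tr -d ' ')
word_count=$(wc -w < "$log" | tr -d ' ')

# tail -f "$log" would keep printing lines as they are appended (Ctrl+C stops it).

echo "$first_default $line_count $word_count"
rm -f "$log"
```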
Examining File Contents (continued) strings command  Extracts text strings from file that includes binary and other non-text data Provides convenient way to check for information that may not be otherwise available
Manipulating Text Files Filtering Modify part of text file by adding, removing, or altering data in file Based on complex rules or patterns Use command-line programs to filter text files sort command Sorts all lines in text file uniq command Removes adjacent duplicate lines in file (so input is normally sorted first)
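A minimal sketch of sort and uniq used together as filters. The word list is hypothetical, and the uniq -c counting step is an extra illustration of why the input is sorted first.

```shell
# Hypothetical unsorted list with repeated entries.
tmp=$(mktemp)
printf '%s\n' pear apple pear banana apple pear > "$tmp"

# uniq only collapses adjacent duplicates, so sort comes first.
unique_sorted=$(sort "$tmp" | uniq)

# uniq -c prefixes each line with its count; sort -rn puts the largest first.
most_common=$(sort "$tmp" | uniq -c | sort -rn | head -n 1 | awk '{ print $2 }')

echo "$unique_sorted"
echo "$most_common"
rm -f "$tmp"
```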
Manipulating Text Files (continued) diff command Displays differences between two files Output format: < indicates lines that were not found in second file > indicates lines that were not found in first file cmp command  Gives quick check of whether two files are identical
Manipulating Text Files (continued) comm command Compares two sorted files line by line Reports lines found only in first file, lines found only in second file, and lines common to both ispell spell checker Uses large dictionary to examine text file Prompts with suggestions
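The three comparison commands can be sketched side by side on two small sorted files. The file contents are hypothetical; diff exits with a nonzero status when the files differ, so the sketch guards that call.

```shell
# Two hypothetical sorted files that share some lines.
a=$(mktemp)
b=$(mktemp)
printf '%s\n' alpha bravo charlie > "$a"
printf '%s\n' alpha charlie delta > "$b"

# cmp -s prints nothing; its exit status says whether the files are identical.
if cmp -s "$a" "$b"; then same=yes; else same=no; fi

# diff marks lines missing from the second file with < and
# lines missing from the first file with >.
diff_out=$(diff "$a" "$b" || true)

# comm needs sorted input; -12 keeps only lines common to both files,
# -23 keeps lines unique to the first, -13 lines unique to the second.
common=$(comm -12 "$a" "$b")
only_a=$(comm -23 "$a" "$b")
only_b=$(comm -13 "$a" "$b")

echo "$same"
echo "$diff_out"
rm -f "$a" "$b"
```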
Using sed and awk sed Complex filtering program awk command  Generally used for formatting output
Filtering and Editing Text with sed sed command Processes each line in text file according to series of commands given on command line Example: sed -n '/lincoln/p' /tmp/names Prints to screen all lines of /tmp/names file that contain text “lincoln” By default, sed prints every line to STDOUT; the -n option suppresses this so that only lines selected by p are printed
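A small sketch of the effect of -n, assuming a hypothetical names file in place of /tmp/names.

```shell
# Hypothetical stand-in for /tmp/names.
names=$(mktemp)
printf '%s\n' 'abraham lincoln' 'george washington' 'mary todd lincoln' > "$names"

# With -n, only the lines selected by the p command are printed.
matches=$(sed -n '/lincoln/p' "$names")

# Without -n, every line is printed once by default and the matching
# lines are printed a second time by p: 3 + 2 = 5 lines.
doubled=$(sed '/lincoln/p' "$names" | wc -l | tr -d ' ')

echo "$matches"
echo "$doubled"
rm -f "$names"
```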
Filtering and Editing Text with sed (continued) Substitution command syntax: /pattern1/s/pattern2/pattern3/g Watches for lines containing pattern1 Replaces occurrences of pattern2 with pattern3 g option at end of command  Causes sed to replace all occurrences on each line Means global
Filtering and Editing Text with sed (continued) Can place operations in file and pass file name to sed command sed -f nolatin news-article > new_news-article & operator within sed command Refers to text that matches pattern2 s/[0-9]*\.[0-9][0-9]/\$&/g sed often useful as part of pipeline of Linux commands
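The substitution forms can be sketched on a few hypothetical lines. The price pattern is written here as s/[0-9]*\.[0-9][0-9]/$&/g, with a bare $ in the replacement (where it is a literal character); the sample text and the script file contents are assumptions made for the illustration.

```shell
# Hypothetical input lines.
prices=$(mktemp)
printf '%s\n' 'item coffee 3.50 and 12.00' 'note no prices here' \
  'item tea 2.25' > "$prices"

# & stands for whatever text the search pattern matched,
# so each price is reprinted with a dollar sign in front of it.
with_dollar=$(sed 's/[0-9]*\.[0-9][0-9]/$&/g' "$prices")

# /pattern1/s/pattern2/pattern3/g: only lines that begin with "item" are changed.
addressed=$(sed '/^item/s/e/E/g' "$prices")

# The same kind of commands can live in a file passed with -f.
script=$(mktemp)
printf '%s\n' 's/coffee/espresso/' '/^note/d' > "$script"
from_file=$(sed -f "$script" "$prices")

echo "$with_dollar"
rm -f "$prices" "$script"
```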
Formatting with awk Processes text  Extracts parts of file Formats text according to information you provide on command line or in script file Format output based on fields within line of text Often can perform same functions with sed or awk
Formatting with awk (continued) Each field on line is normally separated by whitespace Can change which character awk uses to separate fields First field is referred to by $1 second by $2, etc. Basic format: /pattern/ { actions } Example: ls -l | awk '{ print $3 $9  }'
Formatting with awk (continued) Can include regular expression to select which lines awk includes in output: ls -l | awk '/^l/ {print $3 $9  }' Use variable or comparison in awk command Put at beginning of command instead of pattern ls -l | awk ' $2 > 3 {print $0  }' Using awk script file: awk -f awk_command_list text_file
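The awk forms above can be sketched against a saved listing instead of live ls -l output, so the result does not depend on the current directory. The listing is hypothetical, and the comma between $3 and $9 is an addition: it makes awk put a space between the two fields, whereas print $3 $9 joins them with nothing in between.

```shell
# Hypothetical listing in ls -l layout (owner is field 3, name is field 9).
listing=$(mktemp)
cat > "$listing" <<'EOF'
-rw-r--r-- 1 alice staff 120 Jan  5 10:00 notes.txt
drwxr-xr-x 4 bob staff 4096 Jan  6 11:30 projects
lrwxrwxrwx 1 carol staff 9 Jan  7 09:15 latest -> notes.txt
EOF

# No pattern: the action runs on every line.
owners=$(awk '{ print $3, $9 }' "$listing")

# A regular expression selects lines; ^l picks symbolic links.
links_only=$(awk '/^l/ { print $3, $9 }' "$listing")

# A comparison can take the place of the pattern; $0 is the whole line.
many_links=$(awk '$2 > 3 { print $0 }' "$listing")

# -F changes the field separator, here to a colon.
first_field=$(printf 'root:x:0\n' | awk -F: '{ print $1 }')

echo "$owners"
rm -f "$listing"
```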
More Advanced Text Editing vi editor provides advanced text editing features
File Operations in vi :w command  Write file you are editing :r file name Insert another file into file you are editing :q command  Exit from vi :wq Save and exit
Screen Repositioning Line number and cursor position on line Shown at bottom right Use parentheses and curly braces  Move forward or backward by one sentence or paragraph at a time Ctrl+f and Ctrl+b key combinations  Move one screen forward and backward
Screen Repositioning (continued) Shift+G Takes you to any line in file Enter line number first, then Shift+G (with no number, goes to last line) Mark Like bookmark m command followed by name (a-z) Places mark ' command (apostrophe) followed by mark name Returns to mark
Screen Repositioning (continued) % Navigate between matching braces, parentheses, etc. in program source code Shift+J Joins current line with the line below it
More Line-Editing Commands :h View vi help file Ctrl+] Navigate to hyperlinks in help files Ctrl+t Navigate back from links in help files
More Line-Editing Commands (continued) Forward slash (/) Search forward from current cursor position Can use regular expression as search pattern n key Move to next occurrence of search pattern ? Search backwards N key Move to previous occurrence of pattern
More Line-Editing Commands (continued) Search-and-replace operations Format :line-number-range s/search-pattern/replacement text/flags Example :1,$ s/^configure/configure/ (range 1,$ covers first line through last line)
More Line-Editing Commands (continued) Shelling out Execute another Linux command  As if you were at shell prompt Type ! followed by command  Example: :!ls /etc/samba
Setting vi Options :set all View all options currently set in vi Press spacebar multiple times to see all screens of settings :set without the word all  Displays all options that current user has set :set followed by option To set option
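Setting vi Options (continued) Can automate settings Define environment variable called EXINIT that contains set command Executed each time vi started EXINIT='set nu nosmartindent' Or place settings in file called .exrc in home directory Which source wins when both exist depends on the vi implementation; in POSIX vi and in Vim, EXINIT, when set, is used instead of the home-directory .exrc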
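A minimal sketch of both ways of automating settings. The option names are the ones given on the slide; the temporary file stands in for a real .exrc so the sketch does not touch the home directory, and in practice the EXINIT lines would go in a shell startup file.

```shell
# Set and export EXINIT so that vi, started from this shell, sees it.
EXINIT='set nu nosmartindent'
export EXINIT

# Equivalent settings written as .exrc lines: plain ex commands,
# one per line, with no leading colon. A temporary file is used
# here as a stand-in for ~/.exrc.
exrc_example=$(mktemp)
printf '%s\n' 'set nu' 'set nosmartindent' > "$exrc_example"

cat "$exrc_example"
```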
Summary Regular expressions used in many places to define patterns of information grep command used to search for lines of text containing pattern defined using regular expression sed and awk commands support complex scripting language that includes regular expressions
Summary (continued) vi  Uses complex combinations of commands to reposition cursor within text Supports search-and-replace operations set command defines editor settings
