Linux File Processing Tools Explained

File processing tools

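① The grep tool

grep filters input line by line and prints the lines that match a pattern.
grep [options] 'pattern' filename
OPTIONS:
    -i: ignore case
    -v: invert the match; print the lines that do NOT match
    -w: match whole words only
    -o: print only the matched part of each line, one match per output line
    -c: print the number of matching lines (not the number of individual matches)
    -n: prefix each output line with its line number
    -r: search directories recursively
    -A N: print each matching line plus the N lines after it
    -B N: print each matching line plus the N lines before it
    -C N: print each matching line plus N lines before and after it
    -l: list only the names of files that contain a match
    -L: list only the names of files that contain no match
    -e: give the pattern explicitly; can be repeated to search for several patterns
    -E: interpret the pattern as an extended regular expression
    --color=auto: highlight the matched text when the output is a terminal
PATTERN ANCHORS (regular expression syntax, not options):
    ^key: lines that start with key
    key$: lines that end with key
    ^$: empty lines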

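Temporary setting:
# alias grep='grep --color=auto'			//applies only to the current shell session

Permanent setting:
1) Global (applies to all users; /etc/bashrc is the RHEL/CentOS path, Debian-based systems use /etc/bash.bashrc)
vim /etc/bashrc
alias grep='grep --color=auto'
source /etc/bashrc

2) Per user (applies to one specific user)
vim ~/.bashrc
alias grep='grep --color=auto'
source ~/.bashrc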

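Examples:
# grep -i root passwd                       match lines containing root, ignoring case
# grep -w ftp passwd                        match ftp as a whole word
# grep -w hello passwd                      match hello as a whole word; add a line containing hello to the file yourself first
# grep -wo ftp passwd                       print only the matched word ftp
# grep -n root passwd                       print lines containing root, with line numbers
# grep -ni root passwd                      print lines containing root, ignoring case, with line numbers
# grep -nic root passwd                     count the lines containing root, ignoring case (-n has no effect with -c)
# grep -i '^root' passwd                    match lines starting with root, ignoring case
# grep 'bash$' passwd                       match lines ending with bash
# grep -n '^$' passwd                       match empty lines and print their line numbers
# grep '^#' /etc/vsftpd/vsftpd.conf         match lines starting with #
# grep -v '^#' /etc/vsftpd/vsftpd.conf      match lines that do not start with #
# grep -A 5 mail passwd                     match lines containing mail, plus the 5 lines after each
# grep -B 5 mail passwd                     match lines containing mail, plus the 5 lines before each
# grep -C 5 mail passwd                     match lines containing mail, plus 5 lines before and after each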
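
The difference between counting lines (-c), counting occurrences (-o piped to wc -l), and whole-word matching (-w) is easy to get wrong. The following is a minimal sketch, assuming GNU grep; the sample file and its contents are invented for this illustration and are not part of the original examples.

```shell
# Sample data; the file contents are invented for this illustration.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.txt" <<'EOF'
root:x:0:0:root:/root:/bin/bash
ROOT:x:0:0:admin:/ROOT:/bin/sh

ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
sftp:x:15:51::/home/sftp:/bin/bash
EOF

# -c counts matching lines: only line 1 contains lowercase "root"
lines_root=$(grep -c 'root' "$tmpdir/sample.txt")
# adding -i also matches "ROOT" on line 2
lines_root_i=$(grep -ci 'root' "$tmpdir/sample.txt")
# -o prints every match on its own line, so wc -l counts occurrences
hits_root=$(grep -o 'root' "$tmpdir/sample.txt" | wc -l | tr -d ' ')
# without -w, "ftp" also matches inside "sftp"
lines_ftp=$(grep -c 'ftp' "$tmpdir/sample.txt")
lines_ftp_word=$(grep -cw 'ftp' "$tmpdir/sample.txt")
# -n shows which line is empty
empty_line=$(grep -n '^$' "$tmpdir/sample.txt")

echo "root: $lines_root line(s), $hits_root occurrence(s); ignoring case: $lines_root_i line(s)"
echo "ftp: $lines_ftp line(s) as a substring, $lines_ftp_word as a whole word"
echo "empty line: $empty_line"
rm -rf "$tmpdir"
```

Quoting the pattern, as in '^$', keeps the shell from interpreting any of its characters before grep sees them.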

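② The cut tool

cut extracts columns (fields or character positions) from each line.
-c:	select by character position.
-d:	set the field delimiter; the default is the tab character (\t).
-f:	select fields by number; usually combined with -d.

# cut -d: -f1 1.txt             split on colons and print field 1
# cut -d: -f1,6,7 1.txt         split on colons and print fields 1, 6 and 7
# cut -c4 1.txt                 print the 4th character of each line
# cut -c1-4 1.txt               print characters 1 through 4 of each line
# cut -c4-10 1.txt              print characters 4 through 10 of each line
# cut -c5- 1.txt                print everything from the 5th character to the end of the line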
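
One behaviour worth knowing when combining -d and -f: a line that does not contain the delimiter is printed unchanged unless -s is given. A small sketch on a single passwd-style line (the variable names are chosen for this illustration only):

```shell
line='root:x:0:0:root:/root:/bin/bash'

# field selection: selected fields are joined with the input delimiter
user=$(printf '%s\n' "$line" | cut -d: -f1)
user_and_shell=$(printf '%s\n' "$line" | cut -d: -f1,7)

# character selection
first_four=$(printf '%s\n' "$line" | cut -c1-4)
from_fifth=$(printf '%s\n' "$line" | cut -c5-)

# a line without the delimiter passes through whole ...
no_delim=$(printf '%s\n' 'no colon here' | cut -d: -f2)
# ... unless -s suppresses such lines
no_delim_s=$(printf '%s\n' 'no colon here' | cut -s -d: -f2)

echo "$user | $user_and_shell | $first_four | $from_fifth"
echo "without -s: '$no_delim', with -s: '$no_delim_s'"
```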

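③ The sort tool

sort treats each line as one unit and compares lines character by character, starting from the first character. The order follows the current locale; with LC_ALL=C it is plain byte (ASCII) order. Output is ascending by default.

-u: remove duplicate lines
-r: sort in descending order (ascending is the default)
-o: write the result to a file, similar to the > redirection; unlike >, the output file may be the same as the input file
-n: compare numerically (the default is character comparison)
-t: field separator
-k: sort key; -kN starts the key at field N
-b: ignore leading blanks
-R: shuffle by sorting on a random hash of the keys; the order changes from run to run, and identical lines stay next to each other

Examples:
# sort -n -t: -k3 1.txt         sort by UID (field 3) in ascending numeric order
# sort -nr -t: -k3 1.txt        sort by UID in descending numeric order
# sort -n 2.txt                 sort numerically
# sort -nu 2.txt                sort numerically and remove duplicates
# sort -nr 2.txt                sort numerically in descending order
# sort -nru 2.txt               sort numerically in descending order and remove duplicates
# sort -n 2.txt -o 3.txt        sort numerically and write the result to 3.txt
# sort -R 2.txt                 shuffle the lines
# sort -u 2.txt                 sort as text and remove duplicates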
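
The effect of -n is clearest when the same numbers are sorted both ways. A minimal sketch, assuming GNU sort; the numbers, the colon-separated records, and the join_lines helper are made up for this illustration.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 10 9 100 9 2 > "$tmpdir/nums.txt"

# helper (hypothetical name): join output lines with spaces so a result fits on one line
join_lines() { paste -s -d ' ' -; }

# character comparison in the C locale: "100" sorts before "2"
bytewise=$(LC_ALL=C sort "$tmpdir/nums.txt" | join_lines)
# numeric comparison
numeric=$(sort -n "$tmpdir/nums.txt" | join_lines)
numeric_unique=$(sort -nu "$tmpdir/nums.txt" | join_lines)
numeric_desc=$(sort -nr "$tmpdir/nums.txt" | join_lines)
# sort colon-separated records numerically on field 2 only
by_field=$(printf '%s\n' 'c:30' 'a:4' 'b:100' | sort -t: -k2,2n | join_lines)

# -o may name the input file itself; "sort file > file" would truncate it first
sort -n -o "$tmpdir/nums.txt" "$tmpdir/nums.txt"
smallest=$(head -n 1 "$tmpdir/nums.txt")

echo "bytewise: $bytewise"
echo "numeric:  $numeric"
echo "unique:   $numeric_unique"
echo "desc:     $numeric_desc"
echo "by field: $by_field"
rm -rf "$tmpdir"
```

Writing the key as -k2,2 limits it to field 2; a bare -k2 would extend the key from field 2 to the end of the line.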

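④ The uniq tool

uniq collapses adjacent duplicate lines into one; sort the input first if duplicates that are not adjacent should be removed as well.
-i: ignore case
-c: prefix each line with the number of times it occurred
-d: print only the lines that are repeated, once per group

# uniq 2.txt            collapse adjacent duplicate lines
# uniq -d 2.txt         print only the duplicated lines
# uniq -dc 2.txt        print only the duplicated lines, with their counts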
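
Because uniq only compares neighbouring lines, it is usually run after sort. A short sketch with an invented word list shows the difference and the common "sort | uniq -c | sort -rn" frequency count:

```shell
tmpdir=$(mktemp -d)
printf '%s\n' apple banana apple apple banana cherry > "$tmpdir/fruit.txt"

# uniq alone only merges the two adjacent "apple" lines: 5 lines remain
adjacent_only=$(uniq "$tmpdir/fruit.txt" | wc -l | tr -d ' ')
# sorting first brings all duplicates together: 3 distinct lines remain
sorted_first=$(sort "$tmpdir/fruit.txt" | uniq | wc -l | tr -d ' ')
# -d keeps only the values that occur more than once
duplicated=$(sort "$tmpdir/fruit.txt" | uniq -d | paste -s -d ' ' -)
# frequency count, most common first; tr -s and cut strip the count padding
most_common=$(sort "$tmpdir/fruit.txt" | uniq -c | sort -rn | head -n 1 | tr -s ' ' | cut -d' ' -f2,3)

echo "uniq only: $adjacent_only lines; sort | uniq: $sorted_first lines"
echo "duplicated: $duplicated; most common: $most_common"
rm -rf "$tmpdir"
```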

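⑤ The tee tool

tee reads standard input and writes it both to standard output (the screen) and to a file. By default the file is overwritten.
-a: append to the file instead of overwriting it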

[root@linux ~]# echo hello world
hello world
[root@linux ~]# echo hello world|tee file1
hello world
[root@linux ~]# cat file1 
hello world
[root@linux ~]# echo 999|tee -a file1
999
[root@linux ~]# cat file1 
hello world
999
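
The same behaviour can be checked in a script. The sketch below uses a temporary directory and invented file names; it also shows that tee accepts several output files at once.

```shell
tmpdir=$(mktemp -d)
log="$tmpdir/file1"

# tee copies its input to stdout (captured here) and to the file
shown=$(echo 'hello world' | tee "$log")
# -a appends, so the file now holds two lines
echo 999 | tee -a "$log" > /dev/null
after_append=$(wc -l < "$log" | tr -d ' ')
# without -a the file is overwritten; more than one file may be listed
echo 'fresh start' | tee "$log" "$tmpdir/file2" > /dev/null
after_overwrite=$(cat "$log")
second_copy=$(cat "$tmpdir/file2")

echo "stdout copy: $shown"
echo "lines after append: $after_append"
echo "after overwrite: $after_overwrite / $second_copy"
rm -rf "$tmpdir"
```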

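⑥ The paste tool

paste merges the corresponding lines of several files side by side.

-d: set the delimiter; the default is a tab
-s: serial mode; join all lines of each file into one line, one file at a time, instead of merging the files in parallel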

[root@server shell01]# cat a.txt 
hello
[root@server shell01]# cat b.txt 
hello world
888
999
[root@server shell01]# paste a.txt b.txt 
hello   hello world
        888
        999
[root@server shell01]# paste b.txt a.txt   
hello world     hello
888
999

[root@server shell01]# paste -d'@' b.txt a.txt 
hello world@hello
888@
999@

[root@server shell01]# paste -s b.txt a.txt 
hello world     888     999
hello
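
Combining -s with -d turns a column of values into a single delimited line, which is handy for building an arithmetic expression. A minimal sketch with two invented files:

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 1 2 3 > "$tmpdir/nums.txt"
printf '%s\n' a b c > "$tmpdir/letters.txt"

# parallel merge: line N of each file, joined with ":"
first_pair=$(paste -d: "$tmpdir/nums.txt" "$tmpdir/letters.txt" | head -n 1)
# serial merge: all lines of one file on a single line, joined with "+"
expression=$(paste -s -d+ "$tmpdir/nums.txt")
# the shell can then evaluate the joined expression
total=$(( $(paste -s -d+ "$tmpdir/nums.txt") ))

echo "first pair: $first_pair"
echo "$expression = $total"
rm -rf "$tmpdir"
```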

⑦ The tr tool

Character conversion: replace and delete

tr translates or deletes characters read from standard input. It is mainly used to strip control characters from a file or to convert characters.
For translation tr takes two strings: string1 lists the characters to look for, and string2 lists what each of them is replaced with.

Syntax:
commands | tr 'string1' 'string2'
tr 'string1' 'string2' < filename

tr options 'string1' < filename

-d: delete every input character that appears in string1.
-s: squeeze each run of a repeated character listed in string1 into a single occurrence.

Character ranges and classes:
  a-z             any lowercase letter
  A-Z             any uppercase letter
  0-9             any digit
  [:alnum:]       all letters and digits
  [:alpha:]       all letters
  [:blank:]       all horizontal whitespace
  [:cntrl:]       all control characters
  [:digit:]       all digits
  [:graph:]       all printable characters, not including space
  [:lower:]       all lowercase letters
  [:print:]       all printable characters, including space
  [:punct:]       all punctuation characters
  [:space:]       all horizontal or vertical whitespace
  [:upper:]       all uppercase letters
  [:xdigit:]      all hexadecimal digits
  [=CHAR=]        all characters that are equivalent to CHAR

Escape sequences:
  \b  Ctrl-H      backspace
  \f  Ctrl-L      form feed
  \n  Ctrl-J      newline
  \r  Ctrl-M      carriage return
  \t  Ctrl-I      horizontal tab

[root@server shell01]# cat 3.txt 	create this file yourself for testing
ROOT:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
boss02:x:516:511::/home/boss02:/bin/bash
vip:x:517:517::/home/vip:/bin/bash
stu1:x:518:518::/home/stu1:/bin/bash
mailnull:x:47:47::/var/spool/mqueue:/sbin/nologin
smmsp:x:51:51::/var/spool/mqueue:/sbin/nologin
aaaaaaaaaaaaaaaaaaaa
bbbbbb111111122222222222233333333cccccccc
hello world 888
666
777
999


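tr treats [ and ] as ordinary characters unless they belong to a class such as [:digit:], so plain ranges are written without brackets.

# tr -d ':/' < 3.txt                delete every : and / in the file
# cat 3.txt | tr -d ':/'            same, reading from a pipe
# tr '0-9' '@' < 3.txt              replace every digit with @
# tr 'a-z' 'A-Z' < 3.txt            convert lowercase letters to uppercase
# tr -s 'a-z' < 3.txt               squeeze runs of a repeated lowercase letter into one
# tr -s 'a-z0-9' < 3.txt            squeeze runs of a repeated lowercase letter or digit into one
# tr -d '[:digit:]' < 3.txt         delete all digits
# tr -d '[:blank:]' < 3.txt         delete horizontal whitespace
# tr -d '[:space:]' < 3.txt         delete all horizontal and vertical whitespace, including newlines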
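
A few one-line inputs make the effect of each form visible. This is a sketch assuming GNU tr; the input strings are invented. The last two commands show why a range such as 0-9 is better written without surrounding brackets: GNU tr takes the brackets as literal characters and translates them too.

```shell
upper=$(echo 'hello world 888' | tr 'a-z' 'A-Z')          # HELLO WORLD 888
letters_only=$(echo 'a1b2c3' | tr -d '[:digit:]')         # abc
squeezed=$(echo 'aaabbb   ccc' | tr -s 'a-z ')            # ab c
no_colon_slash=$(echo 'root:/bin/bash' | tr -d ':/')      # rootbinbash

# with GNU tr the brackets in '[0-9]' are part of the set, so they become @ as well
with_brackets=$(echo '[abc] 123' | tr '[0-9]' '@')        # @abc@ @@@
without_brackets=$(echo '[abc] 123' | tr '0-9' '@')       # [abc] @@@

echo "$upper"
echo "$letters_only | $squeezed | $no_colon_slash"
echo "$with_brackets vs $without_brackets"
```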

Case 1:

Use the tools above to extract the current host's IP address, netmask, broadcast address, and MAC address.

[root@linux ~]# ifconfig ens33|grep 'inet'|tr -d '[a-zA-Z ]'|cut -d: -f2,3,4
192.168.209.133255.255.255.0192.168.209.255

[root@linux ~]# ifconfig ens33|grep -w 'inet'|tr '[:space:]' '\n' | tr -d 'a-z' | grep -v '^$'
192.168.209.133
255.255.255.0
192.168.209.255

[root@linux ~]#  ifconfig ens33|grep -w inet|tr -d '[:a-zA-Z]'|tr ' ' '@'|tr -s '@'|tr '@' '\n'|grep -v ^$
192.168.209.133
255.255.255.0
192.168.209.255

[root@linux ~]#  ifconfig ens33|grep -w 'inet'|tr -d '[:alpha:]'|tr '[ :]' '\n'|grep -v '^$'
192.168.209.133
255.255.255.0
192.168.209.255
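
The pipelines above print all three addresses together. To pick out each value separately, and the MAC address as well, one option is to squeeze the spaces with tr -s and select a field with cut. The sketch below works on hard-coded sample lines rather than live ifconfig output: it assumes the net-tools 2.x layout ("inet ... netmask ... broadcast ..." and "ether ..."), the addresses are the ones from the transcript above, and the MAC address is a made-up placeholder.

```shell
# Sample lines in the assumed ifconfig layout; the MAC address is hypothetical.
inet_line='        inet 192.168.209.133  netmask 255.255.255.0  broadcast 192.168.209.255'
ether_line='        ether 00:0c:29:aa:bb:cc  txqueuelen 1000  (Ethernet)'

# After tr -s ' ' every run of spaces is a single space, so the fields are:
# 1 = empty (leading space), 2 = inet, 3 = IP, 4 = netmask, 5 = mask, 6 = broadcast, 7 = address
ip=$(printf '%s\n' "$inet_line" | tr -s ' ' | cut -d' ' -f3)
netmask=$(printf '%s\n' "$inet_line" | tr -s ' ' | cut -d' ' -f5)
broadcast=$(printf '%s\n' "$inet_line" | tr -s ' ' | cut -d' ' -f7)
mac=$(printf '%s\n' "$ether_line" | tr -s ' ' | cut -d' ' -f3)

echo "IP:        $ip"
echo "NETMASK:   $netmask"
echo "BROADCAST: $broadcast"
echo "MAC:       $mac"
```

On a real host the two sample variables would be replaced by something like ifconfig ens33 | grep -w inet and ifconfig ens33 | grep -w ether, provided the local ifconfig prints the same layout.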