
Sed - An Introduction and Tutorial by Bruce Barnett

Published: 2021-01-29 08:45:10 | Category: Linux | Source: compiled from the web
Original: http://www.grymoire.com/unix/sed.html

The string "abc" is unchanged, because it was not matched by the regular expression. If you wanted to eliminate "abc" from the output, you must expand the regular expression to match the rest of the line and explicitly exclude part of the expression using "\(", "\)" and "\1", which is the next topic.

A quick comment. The original sed did not support the "+" metacharacter. GNU sed does if you use the "-r" command line option, which enables extended regular expressions. The "+" means "one or more matches". So the above could also be written using

% echo "123 abc" | sed -r 's/[0-9]+/& &/'
123 123 abc
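
If your sed has no "-r" option, "one or more" can be spelled out as one character followed by "zero or more". The following is only a small sketch comparing the two forms; the variable names are mine, and the second command assumes GNU sed:

```shell
# "One or more digits" written two ways; both should give the same line.
basic=$(echo "123 abc" | sed 's/[0-9][0-9]*/& &/')
extended=$(echo "123 abc" | sed -r 's/[0-9]+/& &/')   # -r assumes GNU sed

echo "$basic"      # 123 123 abc
echo "$extended"   # 123 123 abc
```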

I have already described the use of "\(" "\)" and "\1" in my tutorial on regular expressions. To review, the escaped parentheses (that is, parentheses with backslashes before them) remember a substring of the characters matched by the regular expression. You can use this to exclude part of the characters matched by the regular expression. The "\1" is the first remembered pattern, and the "\2" is the second remembered pattern. Sed has up to nine remembered patterns.

If you wanted to keep the first word of a line, and delete the rest of the line, mark the important part with escaped parentheses:

sed 's/\([a-z]*\).*/\1/'

I should elaborate on this. Regular expressions are greedy, and try to match as much as possible. "[a-z]*" matches zero or more lower case letters, and tries to match as many characters as possible. The ".*" matches zero or more characters after the first match. Since the first one grabs all of the contiguous lower case letters, the second matches anything else. Therefore if you type

echo abcd123 | sed 's/\([a-z]*\).*/\1/'

This will output "abcd" and delete the numbers.
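
Because "[a-z]*" may also match zero letters, a line that does not start with a lower case letter ends up empty. Here is a short sketch of both cases; the sample inputs and variable names are made up for illustration:

```shell
# The remembered part is the leading run of lower case letters.
first=$(echo "abcd123" | sed 's/\([a-z]*\).*/\1/')
echo "$first"     # abcd

# No leading letters: \1 is empty and ".*" swallows the whole line.
none=$(echo "123abcd" | sed 's/\([a-z]*\).*/\1/')
echo "[$none]"    # []
```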

If you want to switch two words around, you can remember two patterns and change the order around:

sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/'

Note the space between the two remembered patterns. This is used to make sure two words are found. However, this will do nothing if only a single word is found, or on lines with no letters. You may want to insist that words have at least one letter by using

sed 's/\([a-z][a-z]*\) \([a-z][a-z]*\)/\2 \1/'

or by using extended regular expressions (note that '(' and ')' no longer need to have a backslash):

sed -r 's/([a-z]+) ([a-z]+)/\2 \1/' # Using GNU sed
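
To see the swap in action, here is a rough sketch with sample input of my own choosing. The variable names are illustrative, and the "-r" form assumes GNU sed:

```shell
# Two words: the remembered patterns are written back in reverse order.
swapped=$(echo "hello world" | sed 's/\([a-z][a-z]*\) \([a-z][a-z]*\)/\2 \1/')
echo "$swapped"       # world hello

# The same thing with extended regular expressions (GNU sed).
swapped_ere=$(echo "hello world" | sed -r 's/([a-z]+) ([a-z]+)/\2 \1/')
echo "$swapped_ere"   # world hello

# A single word: nothing matches, so the line passes through unchanged.
single=$(echo "hello" | sed 's/\([a-z][a-z]*\) \([a-z][a-z]*\)/\2 \1/')
echo "$single"        # hello
```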

The "\1" doesn't have to be in the replacement string (in the right hand side). It can be in the pattern you are searching for (in the left hand side). If you want to eliminate duplicated words, you can try:

sed 's/\([a-z]*\) \1/\1/'
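
One thing to watch: since "[a-z]*" can match an empty word, the command above also matches a lone space on lines with no duplicate, and removes it. The sketch below shows this with sample input of my own, along with a variant that insists on at least one letter:

```shell
# A doubled word collapses to a single copy.
dedup=$(echo "the the cat" | sed 's/\([a-z]*\) \1/\1/')
echo "$dedup"    # the cat

# With "*", an empty "word" also matches, so the space itself is removed.
eaten=$(echo "hello world" | sed 's/\([a-z]*\) \1/\1/')
echo "$eaten"    # helloworld

# Requiring at least one letter leaves such lines alone.
safe=$(echo "hello world" | sed 's/\([a-z][a-z]*\) \1/\1/')
echo "$safe"     # hello world
```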

If you want to detect duplicated words, you can use

sed -n '/\([a-z][a-z]*\) \1/p'

or with extended regular expressions

sed -rn '/([a-z]+) \1/p' # Using GNU sed

This, when used as a filter, will print lines with duplicated words.
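
A minimal sketch of the filter on three made-up lines follows. Note that the pattern is not tied to word boundaries, so it also fires whenever the end of one word repeats at the start of the next (as in "this is"). I picked the sample lines to avoid that:

```shell
# Three sample lines; only the second contains a doubled word.
input=$(printf '%s\n' "a quick test" "paris in the the spring" "no repeats here")

# -n suppresses normal output; the p flag prints only the matching lines.
found=$(printf '%s\n' "$input" | sed -n '/\([a-z][a-z]*\) \1/p')
echo "$found"    # paris in the the spring
```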

The numeric value can have up to nine values: "\1" thru "\9." If you wanted to reverse the first three characters on a line, you can use

sed 's/^\(.\)\(.\)\(.\)/\3\2\1/'
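
As a quick check of the command above, here is a sketch with two sample inputs of my own (the variable names are only for illustration):

```shell
# The first three characters are remembered one by one and written back reversed.
reversed=$(echo "abcdef" | sed 's/^\(.\)\(.\)\(.\)/\3\2\1/')
echo "$reversed"   # cbadef

# Fewer than three characters: the pattern cannot match, so nothing changes.
short=$(echo "ab" | sed 's/^\(.\)\(.\)\(.\)/\3\2\1/')
echo "$short"      # ab
```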

You can add additional flags after the last delimiter. You might have noticed I used a 'p' at the end of the previous substitute command. I also added the '-n' option. Let me first cover the 'p' and other pattern flags. These flags can specify what happens when a match is found. Let me describe them.

Most UNIX utilities work on files, reading a line at a time. Sed, by default, is the same way. If you tell it to change a word, it will only change the first occurrence of the word on a line. You may want to make the change on every word on the line instead of the first. For an example, let's place parentheses around words on a line. Instead of using a pattern like "[A-Za-z]*" which won't match words like "won't," we will use a pattern, "[^ ]*," that matches everything except a space. Well, this will also match anything because "*" means zero or more. The current version of Solaris's sed (as I wrote this) can get unhappy with patterns like this, and generate errors like "Output line too long" or even run forever. I consider this a bug, and have reported this to Sun. As a work-around, you must avoid matching the null string when using the "g" flag to sed. A work-around example is: "[^ ][^ ]*." The following will put parentheses around the first word:

sed 's/[^ ]*/(&)/' <old >new

If you want it to make changes for every word, add a "g" after the last delimiter and use the work-around:

sed 's/[^ ][^ ]*/(&)/g' <old >new
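
To make the difference concrete, here is a small sketch that feeds a sample line through a pipe instead of a file. The input text and variable names are my own:

```shell
# Without "g", only the first word on the line is changed.
first_only=$(echo "one two three" | sed 's/[^ ][^ ]*/(&)/')
echo "$first_only"   # (one) two three

# With "g", every run of non-space characters is wrapped.
every_word=$(echo "one two three" | sed 's/[^ ][^ ]*/(&)/g')
echo "$every_word"   # (one) (two) (three)

# "[^ ][^ ]*" also copes with words such as "won't".
quoted=$(echo "won't stop" | sed 's/[^ ][^ ]*/(&)/g')
echo "$quoted"       # (won't) (stop)
```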

Sed only operates on patterns found in the incoming data. That is, the input line is read, and when a pattern is matched, the modified output is generated, and the rest of the input line is scanned. The "s" command will not scan the newly created output. That is, you don't have to worry about expressions like:

sed 's/loop/loop the loop/g' <old >new
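
A quick sketch, using sample input of my own, shows that the replacement text is not searched again:

```shell
# Each "loop" in the input is replaced exactly once; the replacement is not rescanned.
once=$(echo "loop" | sed 's/loop/loop the loop/g')
echo "$once"     # loop the loop

twice=$(echo "loop loop" | sed 's/loop/loop the loop/g')
echo "$twice"    # loop the loop loop the loop
```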
