Sed - Strings
  • Date: 2024-12-22

Stream Editor - Strings



Substitute Command

Text substitution operations like "find and replace" are common in any text editor. In this section, we illustrate how SED performs text substitution. Given below is the syntax of the substitution command.

[address1[,address2]]s/pattern/replacement/[flags]

Here, address1 and address2 are the starting and ending addresses respectively, which can be either line numbers or pattern strings. Both addresses are optional. The pattern is the text we want to replace with the replacement string. Additionally, we can specify optional flags that modify how the substitution is performed.
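As a small illustration of the address part of this syntax, the sketch below restricts the substitution to lines 2 through 3. The three-line input is made up for this example and is not taken from books.txt.

```shell
# Hypothetical input: three lines, each containing one comma.
# The address range 2,3 restricts the s command to lines 2 and 3,
# so line 1 is printed unchanged.
result=$(printf 'a,b\nc,d\ne,f\n' | sed '2,3s/,/ | /')
printf '%s\n' "$result"
```

Dropping the 2,3 prefix applies the same substitution to every input line.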

In the books.txt file, we have used comma(,) to separate each column. Let us use vertical bar(|) to separate each column. To do this, replace comma(,) with vertical bar(|).

[jerry]$ sed 's/,/ | /' books.txt

On executing the above code, you get the following result:

1) A Storm of Swords | George R. R. Martin, 1216 
2) The Two Towers | J. R. R. Tolkien, 352 
3) The Alchemist | Paulo Coelho, 197 
4) The Fellowship of the Ring | J. R. R. Tolkien, 432 
5) The Pilgrimage | Paulo Coelho, 288 
6) A Game of Thrones | George R. R. Martin, 864 

If you observe carefully, only the first comma is replaced and the second remains as it is. Why? As soon as the pattern matches, SED replaces it with the replacement string and moves to the next line. By default, it replaces only the first occurrence on each line. To replace all occurrences, use the global flag (g) with SED as follows:

[jerry]$ sed 's/,/ | /g' books.txt

On executing the above code, you get the following result:

1) A Storm of Swords | George R. R. Martin | 1216 
2) The Two Towers | J. R. R. Tolkien | 352 
3) The Alchemist | Paulo Coelho | 197 
4) The Fellowship of the Ring | J. R. R. Tolkien | 432 
5) The Pilgrimage | Paulo Coelho | 288 
6) A Game of Thrones | George R. R. Martin | 864

Now all occurrences of commas(,) are replaced with vertical bar(|).

We can instruct SED to perform text substitution only when a pattern match succeeds. The following example replaces comma (,) with vertical bar (|) only when a line contains the pattern The Pilgrimage.

[jerry]$ sed '/The Pilgrimage/ s/,/ | /g' books.txt

On executing the above code, you get the following result:

1) A Storm of Swords, George R. R. Martin, 1216 
2) The Two Towers, J. R. R. Tolkien, 352 
3) The Alchemist, Paulo Coelho, 197 
4) The Fellowship of the Ring, J. R. R. Tolkien, 432 
5) The Pilgrimage | Paulo Coelho | 288 
6) A Game of Thrones, George R. R. Martin, 864
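Since both addresses may be patterns, the same idea extends to a pattern range. The following is a minimal sketch on made-up input; BEGIN and END are hypothetical marker lines chosen for this example only.

```shell
# Hypothetical input; BEGIN and END are made-up marker lines.
# The range /BEGIN/,/END/ starts at the first line matching BEGIN and
# ends at the next line matching END (both inclusive), so only the
# line between the markers has its comma replaced.
result=$(printf 'x,1\nBEGIN\ny,2\nEND\nz,3\n' | sed '/BEGIN/,/END/s/,/ | /')
printf '%s\n' "$result"
```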

In addition to this, SED can replace a specific occurrence of the pattern. Let us replace only the second instance of comma(,) with vertical bar(|).

[jerry]$ sed 's/,/ | /2' books.txt

On executing the above code, you get the following result:

1) A Storm of Swords, George R. R. Martin | 1216 
2) The Two Towers, J. R. R. Tolkien | 352 
3) The Alchemist, Paulo Coelho | 197 
4) The Fellowship of the Ring, J. R. R. Tolkien | 432 
5) The Pilgrimage, Paulo Coelho | 288 
6) A Game of Thrones, George R. R. Martin | 864

In the above example, the number at the end of the SED command (in the flag position) selects the 2nd occurrence.
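The number flag can be compared against a variant that combines it with g. This sketch assumes GNU SED, where a number followed by g replaces that occurrence and every later one; other SED variants may treat the combination differently. The input string is made up for illustration.

```shell
# Replace only the 2nd comma on the line.
second_only=$(echo 'a,b,c,d' | sed 's/,/|/2')
# GNU SED assumption: 2g replaces the 2nd comma and all commas after it.
second_onward=$(echo 'a,b,c,d' | sed 's/,/|/2g')
printf '%s\n%s\n' "$second_only" "$second_onward"
```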

SED provides an interesting feature. After performing substitution, SED provides an option to show only the changed lines. For this purpose, SED uses the p flag, which refers to print. The following example lists only changed lines.

[jerry]$ sed -n 's/Paulo Coelho/PAULO COELHO/p' books.txt

On executing the above code, you get the following result:

3) The Alchemist, PAULO COELHO, 197 
5) The Pilgrimage, PAULO COELHO, 288 
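The p flag is normally paired with -n, as above. A short sketch on made-up two-line input shows why: without -n, SED still prints every line automatically, so a changed line appears twice.

```shell
# Without -n: the changed line is printed by the p flag and again by
# SED's automatic printing; the unchanged line is printed once.
without_n=$(printf 'one\ntwo\n' | sed 's/one/ONE/p')
# With -n: automatic printing is suppressed, so only the changed line
# printed by the p flag remains.
with_n=$(printf 'one\ntwo\n' | sed -n 's/one/ONE/p')
printf '%s\n---\n%s\n' "$without_n" "$with_n"
```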

We can store changed lines in another file as well. To achieve this result, use the w flag. The following example shows how to do it.

[jerry]$ sed -n 's/Paulo Coelho/PAULO COELHO/w junk.txt' books.txt

We used the same SED command. Let us verify the contents of the junk.txt file.

[jerry]$ cat junk.txt

On executing the above code, you get the following result:

3) The Alchemist, PAULO COELHO, 197 
5) The Pilgrimage, PAULO COELHO, 288
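The same w flag can be tried without touching books.txt. This is a sketch only: the input lines and the temporary file created with mktemp are assumptions made for this example.

```shell
# Create a scratch file to receive the changed lines (illustrative only).
tmpfile=$(mktemp)
# Everything after "w " up to the end of the script is taken as the
# file name, so w must be the last flag. Double quotes let the shell
# expand $tmpfile before SED sees the script.
printf 'keep me\nchange me\n' | sed -n "s/change/CHANGE/w $tmpfile"
changed=$(cat "$tmpfile")
printf '%s\n' "$changed"
rm -f "$tmpfile"
```

Because -n is given and no p flag is used, nothing is written to standard output by SED itself; the changed line exists only in the file.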

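To perform case-insensitive substitution, use the i flag, which implies ignore case. This flag is a GNU extension (GNU SED also accepts it as I) and may not work with other variants of SED. The following example performs case-insensitive substitution.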

[jerry]$ sed -n 's/pAuLo CoElHo/PAULO COELHO/pi' books.txt

On executing the above code, you get the following result:

3) The Alchemist, PAULO COELHO, 197 
5) The Pilgrimage, PAULO COELHO, 288
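The following sketch, assuming GNU SED, contrasts a case-sensitive attempt with the case-insensitive form on a single made-up line.

```shell
line='The Alchemist, paulo coelho, 197'
# Case-sensitive: the pattern PAULO COELHO does not match the
# lowercase text, so the line is left unchanged.
sensitive=$(echo "$line" | sed 's/PAULO COELHO/Paulo Coelho/')
# Case-insensitive (GNU extension): the I flag lets the pattern match.
insensitive=$(echo "$line" | sed 's/PAULO COELHO/Paulo Coelho/I')
printf '%s\n%s\n' "$sensitive" "$insensitive"
```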

So far, we have used only the forward slash (/) character as a delimiter, but we can also use the vertical bar (|), at sign (@), caret (^), or exclamation mark (!) as a delimiter. The following example shows how to use other characters as a delimiter.

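Let us assume you need to replace the path /bin/sed with /home/jerry/src/sed/sed-4.2.2/sed. With the forward slash as the delimiter, every slash inside the paths must be escaped with a backslash. Hence, your SED command looks like this: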

[jerry]$ echo "/bin/sed" | sed 's/\/bin\/sed/\/home\/jerry\/src\/sed\/sed-4.2.2\/sed/'

On executing the above code, you get the following result:

/home/jerry/src/sed/sed-4.2.2/sed

We can make this command more readable and easier to understand. Let us use the vertical bar (|) as the delimiter and see the result.

[jerry]$ echo "/bin/sed" | sed 's|/bin/sed|/home/jerry/src/sed/sed-4.2.2/sed|'

On executing the above code, you get the following result:

/home/jerry/src/sed/sed-4.2.2/sed

Indeed! We got the same result and the syntax is more readable. Similarly, we can use the "at" sign (@) as a delimiter as follows:

[jerry]$ echo "/bin/sed" | sed 's@/bin/sed@/home/jerry/src/sed/sed-4.2.2/sed@'

On executing the above code, you get the following result:

/home/jerry/src/sed/sed-4.2.2/sed 

In addition to this, we can use the caret (^) as a delimiter.

[jerry]$ echo "/bin/sed" | sed 's^/bin/sed^/home/jerry/src/sed/sed-4.2.2/sed^'

On executing the above code, you get the following result:

/home/jerry/src/sed/sed-4.2.2/sed 

We can also use the exclamation mark (!) as a delimiter as follows:

[jerry]$ echo "/bin/sed" | sed 's!/bin/sed!/home/jerry/src/sed/sed-4.2.2/sed!'

On executing the above code, you get the following result:

/home/jerry/src/sed/sed-4.2.2/sed 

Generally, the forward slash (/) is used as a delimiter, but sometimes it is more convenient to use one of the other supported delimiters with SED.
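An alternative delimiter is especially handy when the paths come from shell variables. The following is a minimal sketch; the variable names and the /usr/local/bin/sed path are made up for this example, and it assumes neither value contains the chosen delimiter or regular expression metacharacters.

```shell
# Hypothetical paths held in shell variables.
old_path=/bin/sed
new_path=/usr/local/bin/sed
# Double quotes allow the shell to expand the variables inside the
# SED script. With | as the delimiter, the slashes in the paths need
# no escaping.
result=$(echo "$old_path --version" | sed "s|$old_path|$new_path|")
printf '%s\n' "$result"
```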

Creating a Substring

We learnt the powerful substitute command. Let us see if we can find a substring from a matched text. Let us understand how to do it with the help of an example.

Let us consider the following text:

[jerry]$ echo "Three One Two"

Suppose we have to arrange the words in sequence, that is, print One first, then Two, and finally Three. The following one-liner does this.

[jerry]$ echo "Three One Two" | sed 's|\(\w\+\) \(\w\+\) \(\w\+\)|\2 \3 \1|'

On executing the above code, you get the following result:

One Two Three

Note that in the above example, the vertical bar (|) is used as the delimiter.

In SED, substrings can be specified by using the grouping operator, whose parentheses must be prefixed with an escape character, i.e., \( and \).

\w is a regular expression that matches any letter, digit, or underscore, and \+ matches one or more occurrences of the preceding item (both are GNU extensions in basic regular expressions). In other words, the regular expression \(\w\+\) matches a single word from the input string.

In the input string, there are three words separated by spaces, hence there are three regular expressions separated by spaces. The first regular expression stores the first word, i.e., Three; the second stores the word One; and the third stores the word Two.

These substrings are referred to by \N, where N is the substring number. Hence, \2 prints the second substring, i.e., One; \3 prints the third substring, i.e., Two; and \1 prints the first substring, i.e., Three.

Let us separate these words by commas(,) and modify the regular expression accordingly.

[jerry]$ echo "Three,One,Two" | sed 's|\(\w\+\),\(\w\+\),\(\w\+\)|\2,\3,\1|'

On executing the above code, you get the following result:

One,Two,Three

Note that now there is comma(,) instead of space in the regular expression.
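The same reordering can be written with fewer backslashes. The sketch below assumes a SED that supports the -E option for extended regular expressions (GNU SED and BSD SED both do); it uses the bracket expression [^,]+ in place of \w\+ to avoid relying on the GNU-specific \w.

```shell
# With -E, parentheses group without backslashes and + needs no escape.
# [^,]+ matches a run of one or more characters that are not commas.
result=$(echo "Three,One,Two" | sed -E 's/([^,]+),([^,]+),([^,]+)/\2,\3,\1/')
printf '%s\n' "$result"
```

The back-references \1, \2, and \3 are written the same way in both regular expression flavors.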

String Replacement Flags (GNU SED only)

In the previous section, we saw some examples of the substitution command. The GNU SED provides some special escape sequences which can be used in the replacement string. Note that these string replacement flags are GNU specific and may not work with other variants of SED. Here we will discuss string replacement flags.

    \L: When \L is specified in the replacement string, SED converts all the characters that follow it in the replacement to lowercase, until a \U or \E is found. In the following example, the characters "ULO" are converted to lowercase.

[jerry]$ sed -n 's/Paulo/PA\LULO/p' books.txt

On executing the above code, you get the following result:

3) The Alchemist, PAulo Coelho, 197
5) The Pilgrimage, PAulo Coelho, 288

    \u: When \u is specified in the replacement string, SED converts the character immediately after \u to uppercase. In the following example, \u is used before the characters a and o. Hence SED converts these characters to uppercase.

[jerry]$ sed -n 's/Paulo/p\uaul\uo/p' books.txt

On executing the above code, you get the following result:

3) The Alchemist, pAulO Coelho, 197 
5) The Pilgrimage, pAulO Coelho, 288

    \U: When \U is specified in the replacement string, SED converts all the characters that follow it in the replacement to uppercase, until a \L or \E is found.

[jerry]$ sed -n 's/Paulo/\Upaulo/p' books.txt

On executing the above code, you get the following result:

3) The Alchemist, PAULO Coelho, 197 
5) The Pilgrimage, PAULO Coelho, 288

    \E: This flag is used with \L or \U. It stops the conversion started by \L or \U. In the following example, only the first word is converted to uppercase.

[jerry]$ sed -n 's/Paulo Coelho/\Upaulo \Ecoelho/p' books.txt

On executing the above code, you get the following result:

3) The Alchemist, PAULO coelho, 197 
5) The Pilgrimage, PAULO coelho, 288
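These case-conversion sequences can also be applied to captured substrings rather than literal text. The following is a sketch that assumes GNU SED and a made-up lowercase input: \u capitalizes the first letter of the first group, and \U converts the whole second group to uppercase.

```shell
# GNU SED only: \u affects just the next character of the replacement
# (the first letter of group 1); \U affects everything after it
# (all of group 2).
result=$(echo 'paulo coelho' | sed 's/\([a-z]*\) \([a-z]*\)/\u\1 \U\2/')
printf '%s\n' "$result"
```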