Starting with sed

by Mike on May 31, 2009

The stream editor, sed, is a filtering program that automates repetitive editing tasks and is often used to process output piped from other Linux commands.  In its simplest form, a sed invocation looks like this:

sed [options] 'command' [file]

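Note that the command you execute normally goes inside single quotes, which keep the shell from interpreting it.  A short command with no special characters, such as the one below, also works unquoted.  Here is an actual command for comparison.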

sed 10q /etc/httpd/conf/httpd.conf

On Ubuntu/Debian systems:

sed 10q /etc/apache2/apache2.conf

This will print the first 10 lines of the file.
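A quick way to try this without touching a real configuration file is a scratch file.  The sketch below assumes a hypothetical 20-line file created with mktemp (it is not part of the original article) and also shows sed reading from a pipe instead of a named file.

```shell
# Build a 20-line scratch file (hypothetical data standing in for httpd.conf).
scratch=$(mktemp)
i=1
while [ "$i" -le 20 ]; do
    echo "line $i"
    i=$((i + 1))
done > "$scratch"

# 10q: print each line as usual, then quit after line 10.
first_ten=$(sed 10q "$scratch")

# With no file named, sed reads whatever arrives on the pipe.
piped=$(cat "$scratch" | sed 10q)

echo "$first_ten"
rm -f "$scratch"
```

Both forms should print "line 1" through "line 10" and nothing else.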

To work with a real-life example, here is a sample of log entries from a web server.

85.27.31.159 - - [15/Apr/2009:10:42:31 -0600] "GET /blog/wp-content/uploads/2009/02/drive.gif HTTP/1.1" 200 66574 "http://example.com/blog/2009/02/performance-tuning-for-a-linux-web-server-part-1/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121621 Ubuntu/8.04 (hardy) Firefox/3.0.5"
85.27.31.159 - - [15/Apr/2009:10:42:31 -0600] "GET /blog/favicon.ico HTTP/1.1" 404 40584 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121621 Ubuntu/8.04 (hardy) Firefox/3.0.5"
85.27.31.159 - - [15/Apr/2009:10:42:31 -0600] "GET /blog/wp-content/uploads/2009/02/drive2.gif HTTP/1.1" 200 91945 "http://example.com/blog/2009/02/performance-tuning-for-a-linux-web-server-part-1/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121621 Ubuntu/8.04 (hardy) Firefox/3.0.5"
192.168.4.1 - - [15/Apr/2009:10:42:56 -0600] "GET /blog/feed/ HTTP/1.1" 200 11503 "http://example.com/blog/feed/" "SimplePie/1.0.1 (Feed Parser; http://simplepie.org/; Allow like Gecko) Build/20070719221955"
192.168.4.1 - - [15/Apr/2009:10:42:57 -0600] "GET /blog/feed/ HTTP/1.1" 200 11503 "http://example.com/blog/feed/" "SimplePie/1.0.1 (Feed Parser; http://simplepie.org/; Allow like Gecko) Build/20070719221955"

In this log you may decide to change all of the “example.com” references to “EXAMPLE.COM” for visual effect.  Save your log to a file called “web” so you can run sed commands on it.

Certain utilities that use regular expressions, such as sed, require that you use a delimiter to indicate the text patterns that you want to work with.  The default delimiter for sed is a forward slash.  So, if you wanted to search for the word “apache” and change it to “Apache”, you would delimit these two patterns like so:

/apache/Apache/

Other utilities, such as grep, don’t have this requirement.
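As a small illustration of delimiters, the sketch below runs the apache substitution with the default slashes and then uses a different delimiter, which sed accepts after "s".  The sample strings and the choice of "|" are assumptions made for this example only.

```shell
# Default delimiter: forward slashes separate the pattern and the replacement.
default_delim=$(echo 'apache is running' | sed 's/apache/Apache/')

# The character right after "s" becomes the delimiter, so "|" avoids
# having to escape the slashes in a URL path.
alt_delim=$(echo 'GET /blog/feed/ HTTP/1.1' | sed 's|/blog/|/BLOG/|')

echo "$default_delim"   # Apache is running
echo "$alt_delim"       # GET /BLOG/feed/ HTTP/1.1
```

An alternative delimiter is mostly a convenience when the pattern itself contains slashes, as the paths in the web log do.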

In the command below, the first thing within the single quotes is the letter “s”, which tells sed to perform a substitution.  Next come the two patterns that you’re working with: the first is the text that you’re replacing, and the second is the replacement.  After you close the expression with another single quote, list the text file that you want sed to read.

sed 's/example.com/EXAMPLE.COM/' web

89.150.141.189 - - [15/Apr/2009:10:43:00 -0600] "GET /images/apps/subrss.gif HTTP/1.1" 200 2978 "http://EXAMPLE.COM/server_training/server-managment-topics/" "Mozilla/5.0 (X11; U; Linux i686; da-DK; rv:1.9.0.8) Gecko/2009032711 Ubuntu/8.10 (intrepid) Firefox/3.0.8 GTB5"

By default, sed’s “s” command replaces only the first occurrence of a pattern in a given line.  To change this, add the global flag, “g”, to the “s” command.  That way, every occurrence of the pattern in each line is replaced.

sed 's/example.com/EXAMPLE.COM/g' web
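The difference is easiest to see on a line that contains the pattern twice.  The sketch below uses a made-up one-line input (not taken from the log) and runs the substitution with and without the "g" flag.

```shell
# Hypothetical input with two occurrences of the pattern on one line.
line='"http://example.com/a" "http://example.com/b"'

# Without "g": only the first occurrence on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/example.com/EXAMPLE.COM/')

# With "g": every occurrence on the line is replaced.
every_match=$(printf '%s\n' "$line" | sed 's/example.com/EXAMPLE.COM/g')

echo "$first_only"    # "http://EXAMPLE.COM/a" "http://example.com/b"
echo "$every_match"   # "http://EXAMPLE.COM/a" "http://EXAMPLE.COM/b"
```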

Now, if you wanted to highlight several browsers that accessed your site, you could run a command like this:

sed 's/Safari/SAFARI/g; s/Firefox/FIREFOX/g' web

15.219.153.79 - - [15/Apr/2009:10:42:52 -0600] "GET /server_training/ftp-server/987-build-an-ubuntu-ftp-server HTTP/1.1" 200 8861 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.19 (KHTML, like Gecko) Chrome/1.0.154.53 SAFARI/525.19"

89.150.141.189 - - [15/Apr/2009:10:42:59 -0600] "GET /favicon.ico HTTP/1.1" 200 1406 "-" "Mozilla/5.0 (X11; U; Linux i686; da-DK; rv:1.9.0.8) Gecko/2009032711 Ubuntu/8.10 (intrepid) FIREFOX/3.0.8 GTB5"

All you have to do is write two separate sed commands and join them with a semicolon.  Both commands can go inside a single set of single quotes.
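For comparison, here is a minimal sketch of the semicolon form next to the equivalent form that passes each command with its own -e option.  The two-line input is a shortened, hypothetical stand-in for the user-agent part of the log.

```shell
# Hypothetical two-line input: one Safari agent, one Firefox agent.
agents='Chrome/1.0.154.53 Safari/525.19
Ubuntu/8.10 (intrepid) Firefox/3.0.8'

# Two commands in one quoted script, joined with a semicolon.
combined=$(printf '%s\n' "$agents" | sed 's/Safari/SAFARI/g; s/Firefox/FIREFOX/g')

# The same two commands, each given with its own -e option.
separate=$(printf '%s\n' "$agents" | sed -e 's/Safari/SAFARI/g' -e 's/Firefox/FIREFOX/g')

echo "$combined"
```

Both variables should hold the same text, with SAFARI on the first line and FIREFOX on the second.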

By default, sed reads the text file that you name and sends the entire modified text to stdout; the file itself is not changed.  What if you only want to see the lines that actually got modified?  For that, use the “-n” option for sed together with the “p” flag for the “s” command.

sed -n 's/Safari/SAFARI/gp ; s/Firefox/FIREFOX/gp' web

Without the “p” flag on the “s” command, the “-n” option suppresses all output.

Without the “-n” option, the “p” flag causes sed to print its default output and, in addition, each modified line a second time.
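The three combinations can be checked by counting output lines.  The sketch below assumes a hypothetical three-line file in which only one line contains "Firefox"; the file name and contents are made up for the example.

```shell
# Hypothetical three-line stand-in for the "web" log; one line has Firefox.
web=$(mktemp)
printf '%s\n' \
    'Ubuntu/8.10 (intrepid) Firefox/3.0.8' \
    'Chrome/1.0.154.53 Safari/525.19' \
    'SimplePie/1.0.1 (Feed Parser)' > "$web"

# -n plus p: only the modified line is printed.
only_changed=$(sed -n 's/Firefox/FIREFOX/gp' "$web" | wc -l | tr -d ' ')

# -n without p: nothing is printed at all.
nothing=$(sed -n 's/Firefox/FIREFOX/g' "$web" | wc -l | tr -d ' ')

# p without -n: all three lines, plus the modified line a second time.
doubled=$(sed 's/Firefox/FIREFOX/gp' "$web" | wc -l | tr -d ' ')

echo "$only_changed $nothing $doubled"   # 1 0 4
rm -f "$web"
```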

There are still plenty of other ways to perform substitutions with sed.  Here is a way to select only the lines that start with a given IP address (in other words, the source) and highlight the browser used by that source.

sed '/^89.150.141.189/s/Firefox/FIREFOX/' web

89.150.141.189 - - [15/Apr/2009:10:43:00 -0600] "GET /images/banners/ub_85.gif HTTP/1.1" 200 7239 "http://example.com/server_training/server-managment-topics/1016-ldap-server-on-ubuntu-804" "Mozilla/5.0 (X11; U; Linux i686; da-DK; rv:1.9.0.8) Gecko/2009032711 Ubuntu/8.10 (intrepid) FIREFOX/3.0.8 GTB5"

In this example, a regular expression was used as an address.  That is, you placed the string “89.150.141.189” within the default sed delimiters (the forward slashes) and preceded it with the metacharacter “^”, so the substitute command runs only on lines where “89.150.141.189” occurs at the beginning.  Strictly speaking, an unescaped “.” matches any single character; write “89\.150\.141\.189” to match only literal dots.  Note how this address immediately precedes the “s” command.  (You don’t need the global flag for the “s” command here, because you already know that the word occurs only once in each line.)
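To see the address at work, the sketch below runs the command on two shortened, hypothetical log lines from different sources.  It escapes the dots and adds a trailing space to the address so that only this exact IP matches; both refinements are additions for the example, not part of the command above.

```shell
# Two hypothetical, shortened log lines; both mention Firefox.
web=$(mktemp)
printf '%s\n' \
    '89.150.141.189 - - "GET /favicon.ico" "Firefox/3.0.8"' \
    '85.27.31.159 - - "GET /blog/favicon.ico" "Firefox/3.0.5"' > "$web"

# The address limits the "s" command to lines starting with this IP.
# Escaped dots match literal dots; the trailing space ends the IP field.
addressed=$(sed '/^89\.150\.141\.189 /s/Firefox/FIREFOX/' "$web")

echo "$addressed"
rm -f "$web"
```

Only the first line should come back with FIREFOX; the line from 85.27.31.159 passes through unchanged.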
