Jud Dagnall Photography Blog

Photography, technology and occasional rants!

selecting ranges from a file with the perl .. operator

Posted on December 9th, 2004 by jud

Mike Schilli showed me a cool Perl operator today that I had never used. I often want to get the lines from a file that fall between two specific lines. For example, given a file like:

blah
blah
START
THIS IS
VERY IMPORTANT
END
blah
blah

I would like to easily extract everything between (and including) START and END. Using a Perl one-liner and the .. operator, this is very easy.


$ perl -ne'print $_ if /START/ .. /END/'

START
THIS IS
VERY IMPORTANT
END

The .. operator (and its cousin, the ... operator) is a flip-flop operator. In scalar context, it is false until the first expression becomes true, and then it stays true until the second expression becomes true, after which it goes back to false. (The ... variant differs only in that it does not test the second expression on the same line that made the first one true.) In this case, we want nothing to be printed until the current line matches /START/, and then we want printing to continue through the line that matches /END/. I’m using the implicit variable $_ (the current line) to make this more succinct.
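As a rough sketch of the same idea in script form (the helper name extract_range and the inline sample data are made up for illustration), the flip-flop can sit in the condition of a loop:

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines from START through END, inclusive.
sub extract_range {
    my @lines = @_;
    my @kept;
    for (@lines) {
        # The statement modifier gives the .. operator scalar (boolean)
        # context: false until a line matches /START/, then true up to
        # and including the line that matches /END/.
        push @kept, $_ if /START/ .. /END/;
    }
    return @kept;
}

my @input = ("blah", "blah", "START", "THIS IS", "VERY IMPORTANT", "END", "blah", "blah");
print "$_\n" for extract_range(@input);
```

One thing to keep in mind: each .. in the source keeps its own state between evaluations, even across calls to the sub that contains it, so if the input runs out before END is seen, the operator is still "on" the next time the sub is called.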

See perldoc perlop for more information.

Here’s a more complex example that excludes the boundary lines by making use of a hackish perlism:
When true, the .. operator returns a sequence number (starting with 1). However, on the last item (when the second expression becomes true), perl returns not simply a number like 24 but the string 24E0. This is 24 in scientific notation (24 * 10^0 = 24), so it evaluates to the correct value when treated as a number, but it can be identified with a regular expression in string context. So in this case, we skip the last line of the range by adding the condition ($i !~ /E0$/), and we skip the first line by adding the condition ($i > 1).


$ perl -ne'print $_ if ($i = /START/ .. /END/) and ($i > 1) and ($i !~ /E0$/)'

THIS IS
VERY IMPORTANT
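To see the sequence numbers for yourself, something like the following sketch may help. The helper name sequence_numbers and the sample lines are invented for illustration; it pairs each line inside the range with the value the flip-flop returned for it:

```perl
use strict;
use warnings;

# Hypothetical helper: label each in-range line with the flip-flop's value.
sub sequence_numbers {
    my @lines = @_;
    my @out;
    for (@lines) {
        # Assigning to a scalar puts the .. operator in scalar context.
        my $i = /START/ .. /END/;
        # Outside the range $i is the empty string, which is false.
        push @out, join(" ", $i, $_) if $i;
    }
    return @out;
}

print "$_\n" for sequence_numbers("blah", "START", "THIS IS", "VERY IMPORTANT", "END", "blah");
# prints:
# 1 START
# 2 THIS IS
# 3 VERY IMPORTANT
# 4E0 END
```

The E0 suffix on the final value is what the ($i !~ /E0$/) test in the one-liner keys on.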

Just to confuse matters, the .. operator builds a range when called in a list context, e.g.


@alphabet = ( 'A' .. 'Z');
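For contrast with the flip-flop, here is a small sketch of the list-context behaviour (the variable names are just for illustration):

```perl
use strict;
use warnings;

# In list context, .. expands to every value from the left operand to the right.
my @alphabet = ('A' .. 'Z');   # string range: A, B, C, ... Z
my @digits   = (1 .. 5);       # numeric range: 1, 2, 3, 4, 5

print scalar(@alphabet), " letters, first is $alphabet[0], last is $alphabet[-1]\n";
print "@digits\n";
# prints:
# 26 letters, first is A, last is Z
# 1 2 3 4 5
```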

Finally, remember to use a regular expression that is as specific as necessary, like /^START$/ instead of /START/, so you don’t match strings like RESTART.
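A quick way to see the difference is to run both patterns over input that contains a near-miss. This is only a sketch; the helper names loose and anchored and the sample lines are made up:

```perl
use strict;
use warnings;

# Hypothetical helper: unanchored patterns, so RESTART also opens the range.
sub loose {
    my @kept;
    for (@_) { push @kept, $_ if /START/ .. /END/ }
    return @kept;
}

# Hypothetical helper: anchored patterns, so only whole-line markers count.
sub anchored {
    my @kept;
    for (@_) { push @kept, $_ if /^START$/ .. /^END$/ }
    return @kept;
}

my @input = ("RESTART", "noise", "END", "START", "wanted", "END");

print "loose:    ", join(", ", loose(@input)), "\n";
print "anchored: ", join(", ", anchored(@input)), "\n";
# prints:
# loose:    RESTART, noise, END, START, wanted, END
# anchored: START, wanted, END
```

With lines read from a file, the trailing newline is still attached, but /^START$/ continues to match because $ also matches just before a final newline.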
