As well as identifying regular expressions, Perl can make substitutions based on those
matches.
This is done using the "s" function, which is designed to mimic the way
substitution is done in the vi text editor. Once again the match operator (=~) is used,
and once again, if it is omitted, the substitution is applied to
the variable $_.
To replace an occurrence of london by London in the string $sntnce we use
the following expression
$sntnce =~ s/london/London/
and to do the same thing with the $_ variable just
s/london/London/
Notice that the regular expression (london) and the replacement string (London) are surrounded by a total of
three slashes. The result of this expression is the number of substitutions made, so it is either
0 (false) or 1 (true) in this case.
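As a minimal sketch of the points above, the following applies the substitution first to a named variable and then to $_, and keeps the value the expression returns. The sample sentences and the variable names $count and $none are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen only for illustration.
my $sntnce = "I live in london, and london is big";

# Substitute in a named variable; the result is the number of substitutions.
my $count = ($sntnce =~ s/london/London/);
print "$sntnce\n";          # only the first london has been changed
print "count = $count\n";   # count = 1

# Without =~ the substitution works on $_.
$_ = "paris is nice";
my $none = s/london/London/;    # nothing matches, so $_ is left alone
print "no substitution was made\n" unless $none;
```

Because the result is false when nothing matched, the expression can be used directly as the condition of an if statement.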
Perl Options
The example above replaces only the first occurrence of the string, and there may be more
than one occurrence that we want to replace. To make a global substitution, the last slash is followed by "g"
as follows.
s/london/London/g
If we want to also replace occurrences of lOndon, lonDON, LoNDoN and so on then we could
use
s/[Ll][Oo][Nn][Dd][Oo][Nn]/London/g
but an easier way is to use the i option (for "ignore case"). The expression
s/london/London/gi
will make a global substitution ignoring case. The i option is also used in the
basic /.../ regular expression match.
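The effect of the two options can be compared side by side. This is only a sketch: the sample text and the variable names are invented for illustration, and each substitution works on its own copy of the text so the results can be compared.

```perl
use strict;
use warnings;

# Hypothetical sample text with several spellings of the same word.
my $text = "london, london, LONDON and lOndon";

# Copy the text, then substitute in the copy.
(my $first_only = $text) =~ s/london/London/;     # first match only
(my $global     = $text) =~ s/london/London/g;    # every exact-case match
(my $any_case   = $text) =~ s/london/London/gi;   # every match, any case

print "$first_only\n";   # London, london, LONDON and lOndon
print "$global\n";       # London, London, LONDON and lOndon
print "$any_case\n";     # London, London, London and London

# With g the result can be larger than 1.
my $copy = $text;
my $n = ($copy =~ s/london/London/gi);
print "$n substitutions\n";   # 4 substitutions
```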
Remembering patterns
It is often useful to remember patterns that have been matched so that they can be used
again. Anything that gets matched in parentheses is remembered in the
variables $1,...,$9. These strings can also be used in the same regular expression
(or substitution) by using the special RE codes \1,...,\9. Consider for example:
$_ = "Lord Whopper of Fibbing";
s/([A-Z])/:\1:/g;
print "$_\n";
This replaces each upper case letter with that letter surrounded by colons, and prints
:L:ord :W:hopper of :F:ibbing. The variables $1,...,$9 are read-only;
you cannot alter them yourself.
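The same substitution can be written with $1 in the replacement part. The sketch below assumes a Perl run with warnings enabled, where \1 on the replacement side produces a warning suggesting $1 instead. The variable names $title and $marked are made up for illustration.

```perl
use strict;
use warnings;

my $title = "Lord Whopper of Fibbing";

# $1 in the replacement refers to the letter captured by ([A-Z]).
(my $marked = $title) =~ s/([A-Z])/:$1:/g;

print "$marked\n";   # :L:ord :W:hopper of :F:ibbing
```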
Consider the following example.
if (/(\b.+\b) \1/)
{
print "Found $1 repeated\n";
}
This identifies any repeated words. Each \b represents a word boundary and the .+ matches
any non-empty string, so \b.+\b matches anything between two word boundaries. This is
then remembered by the parentheses and stored as \1 for use in regular expressions and as $1 for the rest of the program.
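To see the repeated-word test at work, here is a small sketch that applies the same pattern to a named variable instead of $_. The sample line and the variable $repeated are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical line containing a doubled word.
my $line = "Paris in the the spring";

my $repeated;
if ($line =~ /(\b.+\b) \1/)
{
    # \1 was used inside the pattern; $1 holds the same text afterwards.
    $repeated = $1;
    print "Found $repeated repeated\n";   # Found the repeated
}
```

Copying $1 into a variable of your own straight away is a sensible habit, since the next successful match will overwrite it.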
The following line swaps the first and last characters of a line in a variable $_ :
s/^(.)(.*)(.)$/\3\2\1/
The ^ and $ match the beginning and the end of the line. The first
character is stored in the \1 code, everything else up to the last character is stored in
the \2 code, and the last character is stored in the \3 code. Then the whole line is replaced with
\1 and \3 swapped round.
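A quick way to check the swap is to try it on a couple of short strings. This sketch uses $3$2$1 on the replacement side, which plays the same role there as \3\2\1. The sample strings and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Swap the first and last characters of a five-letter string.
my $line = "abcde";
(my $swapped = $line) =~ s/^(.)(.*)(.)$/$3$2$1/;
print "$swapped\n";   # ebcda

# The pattern needs at least two characters, so a one-character
# string does not match and is left unchanged.
my $short = "a";
my $result = ($short =~ s/^(.)(.*)(.)$/$3$2$1/);
print "unchanged: $short\n" unless $result;   # unchanged: a
```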
After a match is found, you can use the special read-only variables $` and $& and $'
to find out what was matched before, during and after the search. So after
$_ = "Lord Whopper of Fibbing";
/pp/;
all of the following statements are true. (Remember that eq is a string-equality test.)
$` eq "Lord Who";
$& eq "pp";
$' eq "er of Fibbing";
On the subject of remembering patterns, it is also worth knowing that variables are
interpolated inside the slashes of a match or a substitution.
$search = "the";
s/$search/xxx/g;
This line replaces every occurrence of "the" with xxx. If you wish to replace every
occurrence of "there" then you cannot do that using s/$searchre/xxx/, because this
will be interpolated as the variable $searchre. Instead you should put the variable
name in curly braces so that the code becomes:
$search = "the";
s/${search}re/xxx/;
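The difference between the two forms shows up clearly when both are run against the same text. This is a sketch only: the sample sentence and the variable names $text, $all and $there are invented for illustration.

```perl
use strict;
use warnings;

my $search = "the";

# Hypothetical sample sentence containing both "the" and "there".
my $text = "the cat sat there on the mat";

# $search is interpolated, so this replaces every "the",
# including the one at the start of "there".
(my $all = $text) =~ s/$search/xxx/g;
print "$all\n";     # xxx cat sat xxxre on xxx mat

# The braces end the variable name before "re", so this matches "there".
(my $there = $text) =~ s/${search}re/xxx/;
print "$there\n";   # the cat sat xxx on the mat
```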
Translation in Perl
Character-by-character translation is done by the tr function. The following
expression replaces each a with e, each b with d, and each c with f in the variable
$sntnce.
The expression returns the number of substitutions made.
$sntnce =~ tr/abc/edf/
Most of the special RE codes do not apply in the tr function. For example,
the following statement counts the number of asterisks in the variable
$sntnce and stores that in the variable $count.
$count = ($sntnce =~ tr/*/*/);
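Both uses of tr described above can be tried out in a few lines. The sketch below uses made-up sample strings, and the names $word and $changed are hypothetical.

```perl
use strict;
use warnings;

# Translate a->e, b->d, c->f; the result is the number of characters changed.
my $word = "a cab";
my $changed = ($word =~ tr/abc/edf/);
print "$word ($changed changed)\n";   # e fed (4 changed)

# Inside tr the * is an ordinary character, not an RE code.
# Translating * to itself leaves the string alone but still counts.
my $sntnce = "a * b ** c";
my $count = ($sntnce =~ tr/*/*/);
print "$count asterisks\n";           # 3 asterisks
```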
However, the "-" is still used to mean "between". This statement converts $_ to upper
case.