Perl's split() function breaks a string into a list of fields, using a pattern to find the
delimiters. The PATTERN is a regular expression, which may be as simple as a single character.
By default the STRING is split at every match of the PATTERN, but you can use LIMIT to cap
the number of fields produced.
split function
It splits the string into a list of strings and returns that list. By default, leading
empty fields are preserved and trailing empty ones are deleted.
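
To illustrate the default treatment of empty fields described above, here is a minimal sketch. The sample record and the variable names are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical record: one empty field at the front, three at the end.
my $record = ',a,b,,,';

my @fields = split /,/, $record;

# The leading empty field is kept; the trailing empty fields are dropped.
my $shown = join '|', map { "[$_]" } @fields;
print "$shown\n";               # prints: []|[a]|[b]
print scalar(@fields), "\n";    # prints: 3
```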
If not in list context, it returns the number of fields found. Older versions of Perl
also split into the @_ array in that case. (In list context, using ?? as the pattern
delimiters forced the split into @_ while still returning the list value.) The implicit
split to @_ was deprecated because it clobbers the arguments to your subroutine, and
current versions of Perl no longer perform it.
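
If all you need is the number of fields, one way that does not depend on the @_ behavior is to split into a named array and count its elements. This is only a sketch; the input string and variable names are assumptions.

```perl
use strict;
use warnings;

my $line = 'red,green,blue';    # hypothetical input

# Split into a named array, then evaluate the array in scalar context.
my @fields = split /,/, $line;
my $count  = @fields;

print "$count\n";               # prints: 3
```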
If EXPR is omitted, split() splits the string in $_. If PATTERN is also omitted, it
splits on whitespace (after skipping any leading whitespace). Anything matching the
PATTERN is taken to be a delimiter that separates the fields. (Note that the delimiter
can be longer than one character.)
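
The defaults described above can be sketched as follows. The sample strings are invented for illustration.

```perl
use strict;
use warnings;

# With no arguments, split works on $_ and splits on whitespace,
# skipping any leading whitespace first.
my @words;
for ("  alpha beta\tgamma ") {
    @words = split;
}
print join(',', @words), "\n";    # prints: alpha,beta,gamma

# A delimiter may be longer than one character.
my @parts = split /::/, 'File::Spec::Functions';
print join(',', @parts), "\n";    # prints: File,Spec,Functions
```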
If LIMIT is specified and positive, split() produces no more than that number of
fields (though it may produce fewer). If LIMIT is unspecified or zero, trailing null
fields are stripped. If LIMIT is negative, it is treated as if an arbitrarily large
LIMIT had been specified, so trailing null fields are kept.
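
A small sketch of the three LIMIT cases, using a made-up string with three trailing empty fields:

```perl
use strict;
use warnings;

my $str = 'a:b:c:::';    # hypothetical input

my @positive = split /:/, $str, 2;     # at most two fields
my @omitted  = split /:/, $str;        # trailing empty fields stripped
my @negative = split /:/, $str, -1;    # trailing empty fields kept

print join('|', @positive), "\n";      # prints: a|b:c:::
print scalar(@omitted), "\n";          # prints: 3
print scalar(@negative), "\n";         # prints: 6
```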
A pattern that matches the null string (not to be confused with the null pattern //,
which is just one member of the set of patterns that match a null string) will split
the value of EXPR into separate characters wherever it matches that way. For example:
print join(':', split(/ */, 'hi there'));
produces the output 'h:i:t:h:e:r:e'.
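
For comparison, here is a short sketch that runs the example above next to the null pattern //, which splits a string into its individual characters. Variable names are illustrative only.

```perl
use strict;
use warnings;

# / */ matches zero or more spaces, so it splits between every pair of
# characters and also swallows the space between the two words.
my @letters = split / */, 'hi there';
print join(':', @letters), "\n";    # prints: h:i:t:h:e:r:e

# The null pattern // splits between every character.
my @chars = split //, 'abc';
print join(':', @chars), "\n";      # prints: a:b:c
```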
The LIMIT parameter can be used to split a line partially:
($login, $passwd, $remainder) = split(/:/, $_, 3);
When assigning to a list, if LIMIT is omitted, Perl supplies a LIMIT one larger than
the number of variables in the list, to avoid unnecessary work. For the list above,
LIMIT would be 4 by default. In time-critical applications it behooves you not to
split into more fields than you really need.
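
The partial split and the implicit LIMIT can be sketched like this. The passwd-style line and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;

my $entry = 'root:x:0:0:root:/root:/bin/sh';    # hypothetical input line

# Explicit LIMIT of 3: the third variable receives the unsplit remainder.
my ($login, $passwd, $remainder) = split /:/, $entry, 3;
print "$remainder\n";    # prints: 0:0:root:/root:/bin/sh

# No LIMIT: with three variables Perl acts as if LIMIT were 4, so only
# the first three fields are fully separated and the rest is discarded.
my ($name, $pw, $uid) = split /:/, $entry;
print "$uid\n";          # prints: 0
```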
If the PATTERN contains parentheses, additional list elements are created from each
matching substring in the delimiter.
split(/([,-])/, "1-10,20", 3);
produces the list value
(1, '-', 10, ',', 20)
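
The effect of the parentheses is easiest to see side by side. This sketch reuses the string from the example above and adds a version without capturing parentheses.

```perl
use strict;
use warnings;

# With capturing parentheses the delimiters are returned as extra elements.
my @with = split /([,-])/, '1-10,20';
print join(' ', @with), "\n";       # prints: 1 - 10 , 20

# Without them only the fields themselves are returned.
my @without = split /[,-]/, '1-10,20';
print join(' ', @without), "\n";    # prints: 1 10 20
```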
If you have the entire header of a normal Unix email message in the variable $header,
you could split it into fields and their values this way:
$header =~ s/\n\s+/ /g; # fix continuation lines
%hdrs = (UNIX_FROM => split /^(\S*?):\s*/m, $header);
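
Here is a runnable sketch of the header technique. The header text is made up, and the final loop that strips newlines is an addition for display purposes, not part of the original recipe.

```perl
use strict;
use warnings;

# Hypothetical header: a Unix "From " line, a folded Subject, and a To line.
my $header = "From alice\@example.com Mon Jan  1 00:00:00 2024\n"
           . "Subject: a long\n    subject line\n"
           . "To: bob\@example.com\n";

$header =~ s/\n\s+/ /g;    # fix continuation lines

# Each captured field name becomes a key; the text that follows is its value.
# The text before the first "Name:" line is stored under UNIX_FROM.
my %hdrs = (UNIX_FROM => split /^(\S*?):\s*/m, $header);

# Values still end with the newline of their line; strip it for display.
s/\n\z// for values %hdrs;

print "$hdrs{Subject}\n";    # prints: a long subject line
print "$hdrs{To}\n";         # prints: bob@example.com
```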
The pattern /PATTERN/ may be replaced with an expression to specify patterns that vary
at runtime. (To compile the pattern only once at runtime, use /$variable/o.)
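
A sketch of a runtime pattern follows. The first form passes a string expression as the pattern. The second uses qr// with \Q...\E, which is not mentioned above but is a common way to precompile a pattern and quote metacharacters. The delimiter values are assumptions.

```perl
use strict;
use warnings;

# A string expression is interpreted as a pattern at runtime.
my $sep   = ',\s*';                     # hypothetical separator pattern
my @items = split $sep, 'x, y,z';
print join('|', @items), "\n";          # prints: x|y|z

# qr// compiles the pattern once; \Q...\E makes '|' match literally.
my $delim   = '|';                      # hypothetical delimiter
my $pattern = qr/\Q$delim\E/;
my @cols    = split $pattern, 'a|b|c';
print join(',', @cols), "\n";           # prints: a,b,c
```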
As a special case, specifying a PATTERN of a single space (' ') splits on whitespace
just as split() with no arguments does. Thus split(' ') can be used to emulate awk's
default behavior, whereas split(/ /) will give you as many null initial fields as
there are leading spaces. A split() on /\s+/ is like split(' ') except that any
leading whitespace produces a null first field. A split() with no arguments actually
does a split(' ', $_) internally.
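
The three behaviors can be compared with a short sketch. The input string, which has two leading spaces, is invented for the example.

```perl
use strict;
use warnings;

my $text = '  foo bar';    # two leading spaces

my @awk   = split ' ',   $text;    # leading whitespace skipped
my @space = split / /,   $text;    # one null field per leading space
my @ws    = split /\s+/, $text;    # a single null first field

print scalar(@awk), "\n";      # prints: 2
print scalar(@space), "\n";    # prints: 4
print scalar(@ws), "\n";       # prints: 3
```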