The &nested_quotewords() and "ewords() functions accept a delimiter
(which can be a regular expression)
and a list of lines and then breaks those lines up into a list of
words ignoring delimiters that appear inside quotes. "ewords()
returns all of the tokens in a single long list, while &nested_quotewords()
returns a list of token lists corresponding to the elements of @lines.
&parse_line() does tokenizing on a single string. The &*quotewords()
functions simply call &parse_line(), so if you're only splitting
one line you can call &parse_line() directly and save a function
call.
The $keep argument is a boolean flag. If true, then the tokens are
split on the specified delimiter, but all other characters (quotes,
backslashes, etc.) are kept in the tokens. If $keep is false then the
&*quotewords() functions remove all quotes and backslashes that are
not themselves backslash-escaped or inside of single quotes (i.e.,
"ewords() tries to interpret these characters just like the Bourne
shell). NB: these semantics are significantly different from the
original version of this module shipped with Perl 5.000 through 5.004.
As an additional feature, $keep may be the keyword ``delimiters'' which
causes the functions to preserve the delimiters in each string as
tokens in the token lists, in addition to preserving quote and
backslash characters.
&shellwords() is written as a special case of "ewords(), and it
does token parsing with whitespace as a delimiter-- similar to most
Unix shells.
Examples section another documentation provided by John Heidemann
<johnh@ISI.EDU>
Bug reports, patches, and nagging provided by lots of folks-- thanks
everybody! Special thanks to Michael Schwern <schwern@envirolink.org>
for assuring me that a &nested_quotewords() would be useful, and to
Jeff Friedl <jfriedl@yahoo-inc.com> for telling me not to worry about
error-checking (sort of-- you had to be there).