lua-l archive

2015-07-14 18:21 GMT+02:00 John Hind <john.hind@zen.co.uk>:

> If the start and end anchors behaved like character classes we could do
> better. For regularity they probably should be %^ and %$ rather than the
> existing special characters

We have had quite a pleasant little discussion in this thread, and not
much has been left unsaid, but I would like to return to the one point
that actually bears on the subject.

We are dealing with two different concepts here.

1. Anchors ^ and $, implemented. These tell the find/match/substitute
routines that a match must start at the beginning of the subject or
finish at its end, respectively.

2. Character classes %^ and %$, proposed. These would match an empty
string at the beginning or the end of the subject, respectively. This
breaks the rule that %x means literally x when x is non-alphanumeric,
which IMHO outweighs the regularity argument, so I would prefer to
denote these, too, by just ^ and $.
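
To make the distinction concrete, here is a small sketch of how both
notations behave in stock Lua today (the sample strings and variable
names are mine, chosen only for illustration). The anchors are special
only at the very start or very end of a pattern, while %^ and %$
currently mean the literal characters:

```lua
local s = "hello world"

-- Anchors as implemented: ^ pins the match to the start of the subject,
-- $ pins it to the end.
local a1, a2 = s:find("^hello")   -- found at the beginning
local b1     = s:find("^world")   -- nil: "world" is not at the beginning
local c1, c2 = s:find("world$")   -- found at the end
local d1     = s:find("hello$")   -- nil: "hello" is not at the end

-- %^ and %$ today: plain literal characters, per the "%x means x" rule.
local e = ("2^10"):match("%d%^%d+")    -- the literal caret is matched
local f = ("cost: $5"):match("%$%d")   -- the literal dollar is matched

print(a1, a2, b1, c1, c2, d1, e, f)
```

Giving %^ and %$ the proposed meaning would change what the last two
patterns match.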

It seems adequate to allow this usage of ^ and $ as set components only,
not as character classes that may appear anywhere. There is the difficulty
that ^, when it comes first in a set, is also the set complement character.
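
For reference, a short sketch of what ^ and $ mean inside a set in
stock Lua today (sample strings are mine). Any new set-component
meaning would have to coexist with, or replace, these behaviours:

```lua
-- ^ directly after [ complements the set.
local not_digits = ("abc123"):match("[^%d]+")   -- the run of non-digits

-- ^ anywhere else in a set is a literal caret.
local caret = ("a^b"):match("[b^]")             -- first hit is the "^"

-- %^ in a set is also a literal caret.
local escaped = ("a^b"):match("[%^]")

-- $ in a set is a literal dollar sign.
local dollar = ("a$b"):match("[$]")

print(not_digits, caret, escaped, dollar)
```
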
[As I type this, a new post by John Hind has arrived, and readers will have
read it before getting here. I shall finish this post without having done
more than glance at that, even though some points overlap.]

The proposal, whether with or without the %, does not seem terribly hard
to implement. We could get some hands-on experience with it
instead of making the capital mistake of theorizing without data.
Power patch, anyone?
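
Until such a patch exists, the effect of a set like "comma or end of
subject" can be approximated in unpatched Lua by appending a sentinel
separator. This is only a sketch of that workaround, not the proposed
feature; the helper name `fields` and the comma-separated use case are
my own assumptions:

```lua
-- Hypothetical helper: split on commas, treating the end of the subject
-- as if it were one more separator (done by appending a sentinel comma).
local function fields(s)
  local out = {}
  for field in (s .. ","):gmatch("([^,]*),") do
    out[#out + 1] = field
  end
  return out
end

local t = fields("a,b,,c")
print(#t, table.concat(t, "|"))
```

With the proposed set component, the sentinel would presumably become
unnecessary, since the set itself could accept the end of the subject.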