- Subject: Re: Patterns: Why are anchors not character classes?
- From: Dirk Laurie <dirk.laurie@...>
- Date: Thu, 16 Jul 2015 12:16:08 +0200
2015-07-14 18:21 GMT+02:00 John Hind <firstname.lastname@example.org>:
> If the start and end anchors behaved like character classes we could do
> better. For regularity they probably should be %^ and %$ rather than the
> existing special characters
We have had quite a pleasant little discussion in this thread, and not
much has been left unsaid, but I would like to return to the point that
the subject line actually raises.
We are dealing with two different concepts here.
1. Anchors ^ and $, implemented. These tell the find/match/substitute
routines that the pattern should be tested only at the beginning or the
end of the subject, respectively. They act as anchors only when ^ is the
first and $ the last character of the pattern.
2. Character classes %^ and %$, proposed. These would represent an empty
string at the beginning or the end of the subject, respectively. The %
notation breaks the rule that %x means literally x when x is
non-alphanumeric, which IMHO overrules the regularity argument, so I
would prefer to denote these, too, by just ^ and $.
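
To make the distinction concrete, here is a small sketch of how stock
Lua (5.1 to 5.3) treats these characters today. The subject strings are
made up for illustration; nothing here depends on the proposal.

```lua
local s = "key=value"

-- 1. Anchors: ^ first in the pattern and $ last in the pattern restrict
--    where the match may occur.
print(s:find("^key"))     --> 1 3
print(s:find("^value"))   --> nil
print(s:find("value$"))   --> 5 9
print(s:find("key$"))     --> nil

-- 2. %^ and %$ currently mean the literal characters ^ and $
--    (the "%x means literally x" rule).
print(("a^b"):find("%^"))     --> 2 2
print(("cost$"):find("%$"))   --> 5 5

-- Anywhere else in a pattern, unescaped ^ and $ are ordinary characters.
print(("a^b$c"):find("a^b$c"))   --> 1 5
```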
It seems adequate to allow this usage of ^ and $ as set components only,
not as character classes that may appear anywhere. There is the difficulty
that ^ is also the set complement character.
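
A sketch of what sets do with ^ and $ in stock Lua, and of the kind of
pattern the proposal would affect. The comma-separated subject and the
pattern "(%w+)[,$]" are my own illustration, not taken from the thread,
and the "proposed" reading in the comments is hypothetical.

```lua
-- Inside a set, a leading ^ complements the set; any other ^ or $ is
-- just a literal character.
print(("a,b"):match("[^,]+"))   --> a
print(("x^y"):find("[,^]"))     --> 2 2
print(("x$"):find("[,$]"))      --> 2 2

-- Illustrative use case: a field terminated by "," or by end of subject.
-- Under the proposal "(%w+)[,$]" might express that; today the $ in the
-- set is a literal dollar sign, so the last field is missed.
local fields = {}
for f in ("a,b,c"):gmatch("(%w+)[,$]") do fields[#fields + 1] = f end
print(table.concat(fields, " "))   --> a b

-- A common workaround today: append the delimiter to the subject.
fields = {}
for f in ("a,b,c" .. ","):gmatch("(%w+),") do fields[#fields + 1] = f end
print(table.concat(fields, " "))   --> a b c
```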
[As I type this, a new post by John Hind has arrived, and readers will have
read it before getting here. I shall finish this post without having done
more than glance at that, even though some points overlap.]
The proposal, with or without the %, does not seem terribly hard
to implement. We could get some hands-on experience with it
instead of making the capital mistake of theorizing without data.
Power patch, anyone?
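
Short of a patch to the C library, one way to get a feel for the
set-component semantics is to emulate it in plain Lua. The sketch below
wraps the subject in two sentinel bytes and uses those bytes as set
members. The sentinels "\1" and "\2" and the helper gmatch_delimited are
assumptions of mine; this only works if neither byte occurs in the
subject, and match positions would be shifted by one.

```lua
-- Assumed sentinels: two bytes that never occur in the subject.
local BOS, EOS = "\1", "\2"

-- Hypothetical helper: gmatch over the subject wrapped in sentinels.
local function gmatch_delimited(subject, pattern)
  return (BOS .. subject .. EOS):gmatch(pattern)
end

-- "A word followed by a comma or the end of the subject."
local out = {}
for f in gmatch_delimited("a,b,c", "(%w+)[," .. EOS .. "]") do
  out[#out + 1] = f
end
print(table.concat(out, " "))   --> a b c

-- "A word preceded by a comma or the start of the subject."
out = {}
for f in gmatch_delimited("a,b,c", "[," .. BOS .. "](%w+)") do
  out[#out + 1] = f
end
print(table.concat(out, " "))   --> a b c
```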