- Subject: Re: Patterns: Why are anchors not character classes?
- From: "John Hind" <john.hind@...>
- Date: Wed, 15 Jul 2015 10:26:55 +0100
Thanks everyone, this has advanced my understanding of patterns.
The [^,]+ pattern is good and I'll probably use that.
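For reference, a minimal sketch of how that pattern splits a comma-separated string. The helper name split_fields is only illustrative and is not from the thread. Because "+" needs at least one character, empty fields (as in "a,,b") are skipped rather than returned as empty strings.

```lua
-- Hypothetical helper: collect every run of non-comma characters.
local function split_fields(s)
  local fields = {}
  for field in s:gmatch("[^,]+") do
    fields[#fields + 1] = field
  end
  return fields
end

local t = split_fields("alpha,beta,gamma")
print(#t, t[1], t[3])            --> 3  alpha  gamma

-- Empty fields are dropped, not preserved.
print(#split_fields("a,,b"))     --> 2
```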
I had missed the frontier pattern completely, probably because it post-dates
my edition of "Programming in Lua"; as Dirk says, I should use the
electronic documentation. However, the documentation itself is interesting:
"%f[set], a frontier pattern; such item matches an empty string at any
position such that the next character belongs to set and the previous
character does not belong to set. The set set is interpreted as previously
described. The beginning and the end of the subject are handled as if they
were the character '\0'."
Here the beginning and end are not just character classes but are
(unnecessarily) given an explicit byte encoding, breaking the "8-bit clean"
rule for strings. There would be no need for them to have byte encodings if
beginning and end were separate character classes.
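A short sketch of what the quoted rule implies in practice, assuming Lua 5.2 or later (where a pattern may contain a literal "\0"; Lua 5.1 would need %z instead). The first part shows ordinary word boundaries, where the subject's start and end behave as if a '\0' sat just outside the string. The second part shows the overlap: with a set built around '\0', a frontier cannot tell the real start or end of the subject from an embedded zero byte.

```lua
-- Assumption: Lua 5.2+ so that "\0" may appear inside a pattern.

-- Ordinary use: mark where each word starts and ends.
-- The subject's start and end count as boundaries via the implied '\0'.
local starts = ("one two"):gsub("%f[%w]", "<")
local ends   = ("one two"):gsub("%f[%W]", ">")
print(starts)   --> <one <two
print(ends)     --> one> two>

-- The overlap with real data: the subject holds an embedded zero byte.
local subject = "ab\0cd"

-- "Next byte is '\0', previous is not": matches before the embedded
-- zero AND at the end of the subject.
local marked_end, n_end = subject:gsub("%f[\0]", "|")
print(n_end)    --> 2   (marked_end is "ab|\0cd|")

-- "Next byte is not '\0', previous is": matches at the start of the
-- subject AND just after the embedded zero.
local marked_start, n_start = subject:gsub("%f[^\0]", "|")
print(n_start)  --> 2   (marked_start is "|ab\0|cd")
```

In both cases one of the two matches comes from a byte in the data and the other from the subject boundary, and the pattern has no way to ask for only one of them.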
Having explicit and distinct character classes for "beginning of subject"
and "end of subject" would regularise and formalise the conceptual framework
as well as adding practical expressiveness.