On Sun, Jul 26, 2015 at 8:24 PM, Soni L. <fakedme@gmail.com> wrote:
> The "first meaning" is actually just a literal (, because () in sets are
> considered literals (thus special-cased)

OK, my mistake. I focused too much on the () and missed the [] (six
hours of sleep in the past two days can do that to you).

Still, Lua allows you to escape special characters inside a set
anyway, which I normally do for clarity. And in what major regex
flavor do special characters *not* lose their meaning inside sets?
(E.g., Python treats them as ordinary characters inside a set [1].)
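
To make the set behavior concrete, here is a small sketch in stock Lua (I'm assuming 5.1 or later; the sample string is made up for illustration). It only shows that the escaped and unescaped forms behave the same inside a set, and that the escape is needed outside one:

```lua
-- Sample input, invented for this illustration.
local s = "f(x)"

-- Inside a set, ( and ) are already literal, so the % escape is optional.
local plain   = s:match("[()]")    -- unescaped form
local escaped = s:match("[%(%)]")  -- escaped form, same meaning
print(plain, escaped)

-- Outside a set the escape matters: a bare "(" would open a capture.
local start = s:find("%(")
print(start)
```

Both matches return the same character, so escaping inside the set is purely a readability choice.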


> We don't need a zero-or-more. We could use : instead of - for ranges and
> it'd still make sense.
>
> Instead of having zero-or-more we could have ? modify + and - for 0 or 1
> e.g. +? will try to match with a + and if that fails will try a 0 or more.

> %s+? (where ? is the good old ? you're used to, applied to %s+, where + is
> the good old + you're used to)

I **strongly oppose** using any character or sequence of characters in
Lua patterns to mean one thing, when the same characters, in the same
context, used in the same way, would have a completely different
meaning in a normal regex. See my argument about keeping things
familiar for people used to full regex. Creating more chances for
confusion, when the same can be accomplished without said confusion,
is never a good thing.


> We also don't need ? if we allow empty alternations: the regex (t|) would be
> equivalent to t?, and (|t) would be a non-greedy t?

Again, this creates unnecessary confusion for the person used to full
regex. They may see a pattern with this, and assume that Lua offers
full alternation power, which it doesn't. And actually adding full
alternations would increase the complexity of the parser, which would
be counter to your apparent goal of reducing the complexity.

You're also proposing to take a feature (matching an optional
character) that currently uses the same syntax in both Lua and regex,
remove that familiar syntax, and force coders to use a totally
different one in Lua. Again, confusion.
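
For reference, a short sketch of how stock Lua patterns treat these two pieces of syntax today (again assuming Lua 5.1 or later; the test strings are invented):

```lua
-- ? makes the preceding single character optional, just as in full regex.
local pat = "colou?r"
local a = ("color"):match(pat)
local b = ("colour"):match(pat)
print(a, b)

-- | has no special meaning in Lua patterns; it is matched literally.
-- So "(t|)" captures the two-character text "t|", not "t or nothing".
local c = ("t"):match("(t|)")   -- no literal "|" in the subject, so no match
local d = ("t|"):match("(t|)")  -- matches the literal "t|"
print(c, d)
```

So a pattern like (t|) already has a meaning in current Lua, and it is not the one a regex user would expect.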


TL;DR: Let's keep features that are available in both Lua patterns and
full regex as close as possible to their traditional regex syntax.
Introducing many unnecessary differences, and the confusion that comes
with them, for the sake of a small performance gain is not worth the
frustration it would cause.