On 26/07/15 10:02 PM, Jonathan Goble wrote:
On Sun, Jul 26, 2015 at 8:24 PM, Soni L. <fakedme@gmail.com> wrote:
The "first meaning" is actually just a literal (, because () in sets are
considered literals (thus special-cased)
OK, my mistake. I focused too much on the () and missed the [] (six
hours of sleep in the past two days can do that to you).

Still, Lua allows you to escape special characters inside a set
anyway, which I normally do for clarity. And in what major regex
flavor do special characters *not* lose their meaning inside sets?
(E.g., Python treats them as ordinary characters when inside a set
[1].)
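
As a quick check of the Python behavior mentioned above, here is a minimal
sketch using the standard `re` module. The names `SET_PATTERN`,
`ESCAPED_PATTERN`, `matches_literal` and the result variables are made up for
illustration, and the sketch says nothing about Lua patterns themselves.

```python
import re

# Inside a set, Python's re treats these metacharacters as plain literals.
SET_PATTERN = re.compile(r"[(+*?.)]")

def matches_literal(ch):
    """Return True if the single character ch is matched by the set."""
    return SET_PATTERN.fullmatch(ch) is not None

# Each metacharacter in the set matches itself.
literal_results = {ch: matches_literal(ch) for ch in "(+*?.)"}

# A letter is not matched: "." inside the set does not mean "any character".
letter_result = matches_literal("a")

# Escaping inside the set is also accepted and matches the same characters.
ESCAPED_PATTERN = re.compile(r"[\(\+\*\?\.\)]")
escaped_results = {
    ch: ESCAPED_PATTERN.fullmatch(ch) is not None for ch in "(+*?.)"
}
```

Both spellings accept the same six characters, so escaping inside a set is a
readability choice there rather than a requirement.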


We don't need a zero-or-more. We could use : instead of - for ranges and
it'd still make sense.

Instead of having zero-or-more, we could have ? modify + and - for 0 or 1,
e.g. +? will try to match with a + and, if that fails, will try a 0 or more.
%s+? (where ? is the good old ? you're used to, applied to %s+, where + is
the good old + you're used to)
I **strongly oppose** using any character or sequence of characters in
Lua patterns to mean one thing, when the same characters, in the same
context, used in the same way, would have a completely different
meaning in a normal regex. See my argument about keeping things
familiar for people used to full regex. Creating more chances for
confusion, when the same can be accomplished without said confusion,
is never a good thing.
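
To make the clash concrete, here is a small sketch of what `+?` already means
in one full regex flavor (Python's `re`, where `\s` plays the role of Lua's
`%s`). The variable names and the sample string are invented for the example;
the "optional run" spelling with `(?:...)?` is simply how Python would write
the behavior the proposal wants from `%s+?`.

```python
import re

text = "   abc"  # three leading spaces

# In Python's re, "+?" is a lazy one-or-more: it takes as little as it can.
lazy = re.match(r"\s+?", text).group(0)

# "Optional greedy run" has to be written with an explicit group.
optional_run = re.match(r"(?:\s+)?", text).group(0)

# With no leading whitespace the two spellings differ again:
lazy_on_plain = re.match(r"\s+?", "abc")          # no match at all
optional_on_plain = re.match(r"(?:\s+)?", "abc")  # matches the empty string
```

Here `lazy` holds a single space while `optional_run` holds all three, so a
reader coming from full regex would read `%s+?` quite differently from what
the proposal intends.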


We also don't need ? if we allow empty alternations: the regex (t|) would be
equivalent to t?, and (|t) would be a non-greedy t?
Again, this creates unnecessary confusion for the person used to full
regex. They may see a pattern with this, and assume that Lua offers
full alternation power, which it doesn't. And actually adding full
alternations would increase the complexity of the parser, which would
be counter to your apparent goal of reducing the complexity.

You're also proposing to take a function (matching an optional
character) that currently uses the same syntax in both Lua and regex,
remove the familiar syntax, and force coders to use a totally
different syntax for Lua. Again, confusion.
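
For reference, the equivalence claimed for (t|) and (|t) does hold in a full
regex engine with ordered alternation. The sketch below checks it with
Python's `re`; the helper `first_match` and the variable names are
hypothetical, and nothing here applies to current Lua patterns, which have no
alternation.

```python
import re

def first_match(pattern, text):
    """Return the text matched at the start of `text`, or None."""
    m = re.match(pattern, text)
    return m.group(0) if m else None

# Alternatives are tried left to right, so the order sets the preference.
greedy_alt = first_match(r"(t|)", "tt")  # prefers "t", like t?
greedy_q = first_match(r"t?", "tt")
lazy_alt = first_match(r"(|t)", "tt")    # prefers empty, like t??
lazy_q = first_match(r"t??", "tt")

# Backtracking still reaches the other alternative when the rest requires it.
alt_then_x = first_match(r"(t|)x", "x")
lazy_then_x = first_match(r"(|t)x", "tx")
```

In each pair the alternation form and the `?` form return the same text, which
is why the empty-alternative spelling reads as full alternation to anyone used
to regex.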


TL;DR: Let's keep features that are available in both Lua patterns and
full regex as similar as possible to their traditional syntax in full
regex. Creating large amounts of unnecessary differences and confusion
for the sake of a small performance gain is not worth the frustration
it would cause.

There are a few added benefits. With ClEx, (test) would be a group (test) even when inserted inside a set [(test)]. Name me one regex engine where you can do that!

--
Disclaimer: these emails are public and can be accessed from <TODO: get a non-DHCP IP and put it here>. If you do not agree with this, DO NOT REPLY.