Hi:


On Wed, Mar 5, 2014 at 8:25 PM, Philipp Janda <siffiejoe@gmx.net> wrote:

> It seems that you want to write a pattern matching function that looks
> atomic on the outside but is implemented as a loop on the inside.

I wanted to be able to write Lua pattern code without having to look
at a manual page and carefully examine the pattern every time. Now I
know I need to do that, and also to refer to my notes extracted from
this thread.

> I agree
> that for this specific use case Lua's interpretation of anchors requires
> special care. I don't find it surprising, though.

Not just for this special case, IMO, but anywhere I use an offset match.
There is no problem though, I'll do it.

> I knew Perl before I came
> to Lua, but I can't think of a Perl function that applies a pattern at a
> given starting index, so I haven't had any expectations for this case.

Maybe you approached it with the wrong mindset. In Perl, for regexps,
you do not look for functions; they are native, and you look for
operators / regexp modifiers. You do not have index-oriented functions
like in Lua because you do not normally use indexes for matches in
Perl. If you look at my previous examples, there were some of them.
Here are a couple more:

folarte@paqueton:~$ perl -MData::Dumper -e 'print Dumper [ "12ABD12DEF" =~ /12./g ]'
$VAR1 = [
          '12A',
          '12D'
        ];
folarte@paqueton:~$ perl -MData::Dumper -e 'print Dumper [ "12345678" =~ /.(.)./g ]'
$VAR1 = [
          '2',
          '5'
        ];

And, if you want to do lua-style anchoring, you have a meta for this:

folarte@paqueton:~$ perl -MData::Dumper -e 'print Dumper [ "12ABD12DEF" =~ /\G12./g ]'
$VAR1 = [
          '12A'
        ];
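
For comparison, a rough Lua counterpart of the \G example could look
like the sketch below. It relies on string.find anchoring '^' at the
init position. match_run is a hypothetical helper name, not something
from the standard library, and it assumes the pattern passed in has no
leading '^' of its own.

```lua
-- Collect consecutive matches of p, each one required to start exactly
-- where the previous one ended (roughly what /\G.../g does in Perl).
-- Assumption: p does not already begin with '^'.
local function match_run(s, p)
  local out, pos = {}, 1
  while pos <= #s do
    -- '^' is tested at pos, not at the start of s
    local first, last = s:find("^" .. p, pos)
    -- stop on failure, or on an empty match to avoid looping forever
    if not first or last < first then break end
    out[#out + 1] = s:sub(first, last)
    pos = last + 1
  end
  return out
end

local found = match_run("12ABD12DEF", "12.")
print(#found, found[1])  --> 1   12A
```

The second "12D" is not reported because the run breaks at "BD1",
which is the same result the \G version gives.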

It's difficult to beat Perl for a sequence of pattern matches,
especially the kind coded into the program with lots of meta; this is
where it really shines. You mainly have problems if you think in a
C/Java/Lua way and try to work via indexes.

> If
> applying an anchored pattern to somewhere other than index 1 always failed,
> doing so would be foolish in the first place, and it probably only happens
> by accident.

Given your previous "always fail" comments and these, I do not know
whether you have read what I wrote, or chose to deliberately ignore it,
or what. I'll opt for thinking I do not explain myself well, and do it
again.

ANY of the examples we have been posting, with constant strings, is
foolish; we could replace each call with its constant return value.

What I was trying to illustrate is that a function which gets an input
pattern and tries to do an offset find will surprise anyone familiar
with how patterns work in a lot of other languages, especially with
some patterns, like '^ *#', which look the same as a regular
expression, and I fear they are the same by design ( I mean it seems
like the Lua team picked those chars because they knew them from
working with regexps ). And they work in a similar way in a lot of
languages for a good reason: patterns / regexps are a mini language of
their own, and having mostly similar but subtly different ones will
confuse the hell out of users.
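
A small illustration of the surprise, using only string.find. The
subject line and the offset 6 are made up for the example; the offset
is chosen so that it lands on the run of spaces before the '#'.

```lua
-- Lua tests '^' at the init position, not at the start of the subject.
local line = "x = 1   # trailing comment"
local init = 6  -- hypothetical offset: first of the spaces before '#'

-- No offset: the anchored pattern fails, as a regexp user would expect.
local no_offset = line:find("^ *#")

-- With an offset: '^' is anchored at position 6, so the match succeeds.
local s, e = line:find("^ *#", init)

print(no_offset, s, e)  --> nil   6   9
```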

I see how this may be useful; it's why the Perl folks put a \G in
their regexp spec. I do not have a problem with something different
enough, like lpeg, to force me to look at the docs. But I do have a
problem with surprising behaviour. It is as if someone made a language
where, addition being more common than subtraction, they used - to add
and + to subtract because - is statistically easier to type on most
keyboards; it would surprise the hell out of the rest of the
programmers ( Hint: this is an intentionally exaggerated and ridiculous
sample, not a proposal or a comment on your code or proposals ).

> We also seem to agree that Lua's behavior covers additional use
> cases. Given all that, the current implementation makes sense. A clarifying
> note in the reference manual wouldn't hurt, though, and there is a manual
> update in the pipeline anyway ...

No, Lua covers different use cases. If you read my texts, I consider,
unless someone proves me wrong, the other languages' behaviour
superior for Lua. If I have a Cfind function with the current
behaviour and an Ffind function with the other languages' behaviour, I
can obtain s:Cfind(p,i) as s:sub(i):Ffind(p) ( adding i-1 to the
returned indexes ). The other way round is a little more complex, as
you need a conditional to test whether the pattern starts with ^. Of
course, anyone needing this in a serious program will wrap everything
in a function. ( Hint: I consider, that means this is an opinion )
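
To make the two directions concrete, here is a minimal sketch. Cfind
and Ffind are the hypothetical names from the paragraph above, written
as plain functions rather than string methods; shift and
Cfind_via_Ffind are helper names invented for the example. It assumes
a positive init, and it does not adjust position captures "()".

```lua
-- Cfind: Lua's built-in behaviour ('^' anchors at init).
local function Cfind(s, p, i)
  return string.find(s, p, i or 1)
end

-- Ffind built on top of Cfind: needs a conditional on the pattern,
-- because '^' may only match at position 1 of the subject.
-- Assumption: i is a positive index.
local function Ffind(s, p, i)
  i = i or 1
  if i > 1 and p:sub(1, 1) == "^" then
    return nil
  end
  return string.find(s, p, i)
end

-- Move the two index results back into the coordinates of the
-- original string; captures are passed through untouched.
local function shift(offset, first, last, ...)
  if first == nil then return nil end
  return first + offset, last + offset, ...
end

-- Cfind recovered from Ffind: cut the subject at i, then shift.
local function Cfind_via_Ffind(s, p, i)
  i = i or 1
  return shift(i - 1, Ffind(s:sub(i), p))
end

print(Cfind("12ABD12DEF", "^12.", 6))            --> 6   8
print(Cfind_via_Ffind("12ABD12DEF", "^12.", 6))  --> 6   8
print(Ffind("12ABD12DEF", "^12.", 6))            --> nil
```

Note that s:sub(i) copies the tail of the string, so the first
direction is simple to write but not free.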

On the "makes sense" stuff, things make sense to some people and not
to others. Given the background of the Lua team and my study of
their work, I've always assumed it makes sense to at least one of
them.

And this is not going to hit me hard, as given my background Lua will
not even get into the top five languages I'd choose if I needed to
code a matching-heavy thing. The clarification would be great, as this
is not the first time I've found the manual slightly underspecified,
although this makes sense to me ( Lua is young, it is not that widely
used in places where that matters, the target userbase can live with
this and the manual is improving nicely ).


>> return nil, or that I sanitize every pattern I use to see wether it
>> starts with ^ or what?
> Yes, for your current use case (if I guessed correctly) that's one way to
> go.
Well, these lines alone prove ( to me ) I'm not able to convey
information to you.

Francisco Olarte.