On 05.03.2014 at 15:24, Francisco Olarte wrote:


I'm not discussing which is more useful, more what is least
surprising. After more than 25 years of working with regular
expressions this is the first time I've seen this behaviour. If you
resort to mutative operations, anchoring is trivial, just find on
substring. Using the customary character for start of string, telling
it matches on start of subject string without defining subject string
and then making it match on start of explored region is my problem. I
wouldn't care about Lua's behaviour if the manual defined subject string.
I would care a little less if it used a distinct charset than regular
expressions, although I would consider it should be defined. And the
surprise problem is not only for people like me who came to Lua after
using lots of regexes, it will hit people who start in Lua and go to
other languages too.

It seems that you want to write a pattern matching function that looks atomic on the outside but is implemented as a loop on the inside. I agree that for this specific use case Lua's interpretation of anchors requires special care. I don't find it surprising, though. I knew Perl before I came to Lua, but I can't think of a Perl function that applies a pattern at a given starting index, so I haven't had any expectations for this case. If applying an anchored pattern to somewhere other than index 1 always failed, doing so would be foolish in the first place, and it probably only happens by accident. We also seem to agree that Lua's behavior covers additional use cases. Given all that, the current implementation makes sense. A clarifying note in the reference manual wouldn't hurt, though, and there is a manual update in the pipeline anyway ...
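For readers following along, the behaviour under discussion can be reproduced with a few calls to string.find. This is only a small sketch; the subject string and the indices are made up for illustration:

```lua
local s = "hello world"

-- Unanchored: the search starts at index 7 and may match anywhere from there on.
print(string.find(s, "world", 7))    --> 7   11

-- Anchored with an init argument: "^" pins the match to index 7,
-- i.e. to the start of the explored region, not to index 1.
print(string.find(s, "^world", 7))   --> 7   11

-- Anchored without init: the match must start at index 1, so it fails.
print(string.find(s, "^world"))      --> nil

-- Anchored at index 3: no scanning forward, so it fails as well.
print(string.find(s, "^world", 3))   --> nil
```

The second call is the surprising one for people coming from other regex flavours: the anchor follows the init position rather than the start of the whole string.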

And your suggestion for the "quoty text" is that I substitute all my find
code with a return nil? It seems so. Are you suggesting I put in a
return nil, or that I sanitize every pattern I use to see whether it
starts with ^ or what?

Yes, for your current use case (if I guessed correctly) that's one way to go.
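A minimal sketch of what such a check could look like, assuming the goal is that "^" should only ever match at index 1 of the whole string. The name find_from_subject_start is hypothetical and not part of Lua's string library:

```lua
-- Hypothetical wrapper around string.find: an anchored pattern applied at
-- any position other than index 1 is treated as a guaranteed non-match.
local function find_from_subject_start(s, pattern, init, plain)
  init = init or 1
  if init < 0 then
    -- A negative init counts from the end of the string, as in string.find.
    init = #s + init + 1
    if init < 1 then init = 1 end
  elseif init == 0 then
    init = 1
  end
  -- With plain = true the "^" is an ordinary character, so nothing to check.
  if not plain and pattern:sub(1, 1) == "^" and init > 1 then
    return nil
  end
  return string.find(s, pattern, init, plain)
end

print(find_from_subject_start("hello world", "^world", 7))  --> nil
print(find_from_subject_start("hello world", "^hello"))     --> 1   5
print(find_from_subject_start("hello world", "world", 3))   --> 7   11
```

Everything else is passed straight through to string.find, so unanchored patterns and plain searches behave exactly as before.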

Francisco Olarte.