2015-07-27 10:52 GMT+02:00 Tim Hill <drtimhill@gmail.com>:
>
>> On Jul 27, 2015, at 12:28 AM, Sean Conner <sean@conman.org> wrote:
>>
>>>
>>>> ("d//"):gsub("/*$", "/")
>>>    d// 2
>>>
>>> This result suggests that there is an empty string being matched
>>> between the last "/" and the end of the string.
>>> It's matching the // and replacing that with "/"; but then it gets
>>> confused and matches the empty string at the end, and ends up
>>> inserting an extra /
>>> Using 'print' as the match confirms:
>>>
>>>> ("d//"):gsub("/*$", print)
>>>    //
>>>
>>>    d// 2
>>>
>>> Is this a bug in string.gsub?
>>
>>  No.  You're telling Lua that you want to match zero or more '/' followed
>> by the end of string.  Going through "d//", it first finds "//", which is
>> zero or more '/', and replaces it with a slash.  It then finds "END OF LINE"
>> [1], which is zero or more '/', and replaces it with a slash.  It's working
>> as intended.
>>
>>  -spc (You may have to switch to LPEG ... )
>>
>> [1]   Thus sayeth the Master Control Program.
>>
>

I raised a similar point some time ago [1,2]. Roberto made three
contributions to the thread (the quoted lines are me arguing futilely):

~~~
Please stop calling "bug" something that does not behave as you
wanted or imagined.
~~~
I may be wrong, but it seems that the two rules can be stated like that:

1) Do not match two empty strings in the same position. (current Lua rule)

2) Do not match an empty string in the same position of another match
(not necessarily empty). (sed rule)

Is rule 2 really more intuitive in general, or does it just happen to do
what you want in this particular case?
~~~
| It has the advantage of making `split` trivial instead of requiring the
| sort of thing that takes the Lua Wiki 300 lines to explain.

True, but that does not make it more intuitive; it makes it more useful
in one particular case. Are there other scenarios where it is more (or
less) useful?
~~~

So I can't see him agreeing this time round either.
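
For anyone who wants to compare the two rules directly, here is a minimal
sketch of a gsub-style loop with the rule as a parameter. The names
gsub_with_rule and rule are made up for illustration, and this is not how
lstrlib.c is written. It assumes a plain-string replacement (no %1
handling) and a pattern that does not itself start with "^".

```lua
-- Hypothetical helper: a gsub-like loop where the treatment of empty
-- matches is selected by `rule` (1 = Lua rule, 2 = sed rule, as stated
-- in the quoted text above).
local function gsub_with_rule(s, pat, repl, rule)
  local out, count = {}, 0
  local pos = 1                 -- where the next match attempt starts
  local last_end = nil          -- position just past the previous match
  local anchored = "^" .. pat   -- in string.find, "^" anchors at `init`
  while pos <= #s + 1 do
    local from, to = s:find(anchored, pos)
    local is_empty = from and to < from
    -- Rule 2: reject an empty match that starts where the previous
    -- match (empty or not) ended.
    if from and rule == 2 and is_empty and pos == last_end then
      from = nil
    end
    if from then
      count = count + 1
      out[#out + 1] = repl
      last_end = to + 1
    end
    if from and not is_empty then
      pos = to + 1                      -- skip over the matched text
    elseif pos <= #s then
      out[#out + 1] = s:sub(pos, pos)   -- copy one character, move on
      pos = pos + 1                     -- (this step also enforces rule 1)
    else
      break
    end
  end
  return table.concat(out), count
end

print(gsub_with_rule("d//", "/*$", "/", 1))  -- rule 1: d//  2
print(gsub_with_rule("d//", "/*$", "/", 2))  -- rule 2: d/   1
```

Under rule 1 the sketch reproduces the result the OP reported; under
rule 2 the empty match at the end is rejected because it starts exactly
where the "//" match ended.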

> Hmm .. my vote goes with the OP. Matches are greedy so the
> first match should be on "//" AND the end of the string. I found
> this interesting:
>
> ("d//"):gsub("/+$", "/")
>         d/ 1

That's what the OP should have written. Patterns built on * can match
the empty string, so they are almost always buggy. If you modify the
OP's example by removing the $, the mistake becomes glaringly obvious.

> ("d//"):gsub("/*", "/")
/d//    3
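
If the goal was "exactly one trailing slash" (my assumption about what
the OP wanted, including for input with no slash at all), one way to
sidestep empty matches entirely is to strip with "/+$" and then append.
The name with_trailing_slash is made up for this sketch.

```lua
-- Hypothetical helper: return `path` ending in exactly one "/".
local function with_trailing_slash(path)
  -- "/+$" needs at least one "/", so it can never match the empty
  -- string; the result does not depend on how gsub treats empty matches.
  local stripped = path:gsub("/+$", "")   -- keep only the first result
  return stripped .. "/"
end

print(with_trailing_slash("d//"))  -- d/
print(with_trailing_slash("d"))    -- d/
```

Note that this sketch maps both "/" and the empty string to "/", which
may or may not be what you want for those edge cases.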

[1] http://lua-users.org/lists/lua-l/2013-04/msg00812.html