On 27/07/15 12:05 AM, Soni L. wrote:


On 26/07/15 11:40 PM, Daurnimator wrote:
I want to ensure that a string always ends in a single "/".
If it has more than one, the extras should be removed.
If it has none, a "/" should be appended.

"/*$" should match all the '/' at the end of the string, and replace
them with a single "/".
I got an unexpected result:

     > ("d//"):gsub("/*$", "/")
     d// 2

This result suggests that there is an empty string being matched
between the last "/" and the end of the string.
It matches the "//" and replaces it with "/", but then it also
matches the empty string at the end, and ends up inserting an
extra "/".
Using 'print' as the replacement confirms:

     > ("d//"):gsub("/*$", print)
     //

     d// 2

Is this a bug in string.gsub?
It seems odd to me that you could get 2 replacements for an anchored match. Though as far as I can see, a strict reading of the manual doesn't disallow it.


Daurn.

$ doesn't consume the end of the string?

You'll probably find this behaviour in most pattern matchers.
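
A quick way to check that $ is zero-width is to start the search one position past the last "/". This is only a small sketch, assuming the stock string.find behaviour where the init argument may be #s + 1:

```lua
-- Start the search at position 4, one past the last character of "d//".
-- "/*" matches zero slashes there and "$" still succeeds, so string.find
-- reports an empty match (its end index is one less than its start index).
local s = "d//"
local first, last = s:find("/*$", #s + 1)
print(first, last)  --> 4   3
```

So after the "//" has been matched there is still an empty match available at the end of the string.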

I've been writing a pattern matcher lately, so let's look at what it'd do (a bit simplified to be easier to read):

Pattern: /*$ ->
Root[GreedyZeroOrMore["/"], EndOfString]

Matcher:

Cursor position: 0
d//
^
Matched /* (zero slashes), cursor position: 0
d//
^
Doesn't match $. Put char on buffer, increment cursor position and repeat.

Cursor position: 1
d//
 ^
Matched /*, cursor position: 3
d//
   ^
Matched $, cursor position: 3
End of pattern, put replacement on buffer (in this case "/"). Repeat.

Cursor position: 3
d//
   ^
Matched /* (zero slashes), cursor position: 3
d//
   ^
Matched $, cursor position: 3
End of pattern, put replacement on buffer (in this case "/"). Start cursor position == end cursor position, so advance cursor.

Cursor position: 4
d//
    ^
End of string, return buffer.
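
The trace above can be reproduced with a tiny hand-written loop. This is only a sketch: it hard-codes the pattern /*$ instead of parsing it, the names match_slashes_then_end and gsub_sketch are made up for illustration, and it is neither my actual matcher nor Lua's implementation. It also stops once the cursor reaches the end instead of advancing to position 4, which gives the same buffer.

```lua
-- Hypothetical helper: try to match "/*$" at the 0-based cursor `pos`.
-- Returns the cursor after the match, or nil when "$" fails.
local function match_slashes_then_end(s, pos)
  local e = pos
  while s:sub(e + 1, e + 1) == "/" do  -- greedy "/*"
    e = e + 1
  end
  if e == #s then return e end         -- "$" only succeeds at the end
  return nil
end

-- Simplified gsub-style loop following the trace.
local function gsub_sketch(s, repl)
  local buffer, count, pos = {}, 0, 0
  while true do
    local e = match_slashes_then_end(s, pos)
    if e then
      count = count + 1
      buffer[#buffer + 1] = repl       -- put replacement on buffer
    end
    if e and e > pos then
      pos = e                          -- non-empty match: jump past it
    elseif pos < #s then
      buffer[#buffer + 1] = s:sub(pos + 1, pos + 1)  -- copy one char
      pos = pos + 1
    else
      break                            -- cursor is at the end: stop
    end
  end
  return table.concat(buffer), count
end

print(gsub_sketch("d//", "/"))  --> d//   2
```

The second replacement comes from the iteration that starts at cursor position 3, where both /* and $ succeed without consuming anything.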

So you end up with 2 matches and "d//".
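
For the original goal (exactly one trailing "/"), one way to sidestep the second, empty match is to cap gsub at a single replacement using its optional fourth argument. A small sketch; ensure_trailing_slash is a name made up for this example:

```lua
-- Hypothetical helper: normalise a string so it ends in exactly one "/".
local function ensure_trailing_slash(s)
  -- The 4th argument limits gsub to one substitution, so no further match
  -- is attempted once the trailing run of slashes has been replaced.
  -- The extra parentheses drop gsub's second result (the match count).
  return (s:gsub("/*$", "/", 1))
end

print(ensure_trailing_slash("d//"))   --> d/
print(ensure_trailing_slash("d"))     --> d/
print(ensure_trailing_slash("a/b/"))  --> a/b/
```

The first successful match is always the one that covers the whole trailing run of slashes (or the empty string at the end if there are none), so one replacement is enough.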
