- Subject: Re: [Possibly Spam] Possible bug with non-greedy matching in gsub
- From: "Rici Lake" <lua@...>
- Date: Tue, 22 Feb 2005 13:53:07 -0000 (GMT)
crow said:
> DATA=string.gsub(DATA,"\r\n\r\n(Author:.-\r\nEmail:)","\r\n\r\n××%1")
> [/code]
>
> That looks like it will work, as it anchors the pattern to the double
> newline, then "Author:", then non-greedy match to end of line, then
> "Email:".
I don't think you have quite the correct interpretation of non-greedy.
Non-greedy is not a "fence" operator. In a Lua pattern, . matches any
character, including \r and \n, so the .- in that pattern will not stop
just because it has reached a \r. It means "the shortest match that lets
the rest of the pattern succeed", so if the line after Author: ... is not
an Email: line, it will keep matching across lines until it reaches one.
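
A small sketch of that behaviour, assuming Lua 5.1 or later (for
string.match) and some made-up sample data in which a Date: line sits
between Author: and Email:. The data and variable names are purely
illustrative:

```lua
-- Hypothetical record: the line after "Author:" is NOT "Email:".
local data = "\r\n\r\nAuthor: crow\r\nDate: today\r\nEmail: crow@example.com\r\n"

-- Same pattern as in the quoted gsub call. Since "." also matches
-- "\r" and "\n", ".-" grows across the Date: line until the rest of
-- the pattern ("\r\nEmail:") can finally match.
local captured = string.match(data, "\r\n\r\n(Author:.-\r\nEmail:)")

-- %q prints the capture with its line breaks escaped.
print(string.format("%q", captured))
```

The capture spans three lines here, not just the Author: line.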
Your "workaround" of changing . to [^\r] is actually the correct solution:
i.e., if you don't want to match newlines, you have to say that
explicitly.
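
For illustration, here is roughly how the [^\r] version behaves. This is
only a sketch: the two sample records and the "**" marker in the
replacement are invented for the example and are not taken from the
original data.

```lua
-- Hypothetical records: one with an extra line between Author: and
-- Email:, one where Email: follows Author: directly.
local skipped = "\r\n\r\nAuthor: crow\r\nDate: today\r\nEmail: a@example.com\r\n"
local wanted  = "\r\n\r\nAuthor: rici\r\nEmail: b@example.com\r\n"

-- [^\r]- cannot cross a "\r", so the match is confined to one line.
local pattern = "\r\n\r\n(Author:[^\r]-\r\nEmail:)"
local repl    = "\r\n\r\n**%1"   -- "**" is an assumed marker

local out1, n1 = string.gsub(skipped, pattern, repl)
local out2, n2 = string.gsub(wanted, pattern, repl)

print(n1)  -- number of substitutions in the first record
print(n2)  -- number of substitutions in the second record
```

The first record is left untouched because the Author: line is not
immediately followed by Email:, while the second one gets the marker.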
To put it another way, both "greedy" and "non-greedy" matches will find
the earliest match for the entire pattern; the difference is that "greedy"
takes the longest match from that starting point while "non-greedy" takes
the shortest one. In neither case is an earlier match skipped in favour of
a later one.
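
A minimal sketch of the difference, using string.find on two made-up
subject strings (the strings and variable names are just for
illustration):

```lua
local s = "x<a><b>y"

-- Both forms start at the same place: the first "<" at position 2.
local ls, le = string.find(s, "<.->")   -- non-greedy: shortest from there
local gs, ge = string.find(s, "<.*>")   -- greedy: longest from there

print(ls, le)   -- 2  4   (matches "<a>")
print(gs, ge)   -- 2  7   (matches "<a><b>")

-- Non-greedy does not go hunting for a shorter match further along:
-- the earliest possible match wins even though "<b>" is shorter.
local t = "<aaa> <b>"
local ts, te = string.find(t, "<.->")
print(ts, te)   -- 1  5   (matches "<aaa>")
```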
Hope that helps,
Rici