- Subject: Re: Pattern matching bug
- From: Roberto Ierusalimschy <roberto@...>
- Date: Wed, 8 Apr 2009 19:03:14 -0300
> Consider the following patterns:
> "(%w+)=([\"'])([^%2]-)%2"
> "([\"'])([^%1]-)%1"
>
> If the first pattern is applied against the following strings:
> 1) "key1='value1'"
> 2) "key2='value2'"
>
> It will match #1 and not #2. Similarly, for the 2nd pattern, if it is
> applied against those same two strings, it will match #2, and not #1.
>
> The trouble is in the sequence "([^%2]-)" which should be
> "non-greedily match anything except the characters contained in
> capture 2".
> It appears that is actually doing "non-greedily match anything except
> the characters contained in capture 2 and the actual number 2".
>
> Thoughts? I have already found a workaround, but this pattern is
> syntactically valid, it just doesn't work in the expected manner. I
> took a look through lstrlib.c, but it was not immediately obvious
> where the problem might be.
It does not work in the manner you expect, but it does what the manual
says it does. The "%2" you want is a 'pattern item', but there are no
pattern items inside character classes. Inside a class the '%' only
escapes characters, so "[^%2]" matches any character except a literal '2'.
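
A minimal sketch of this behavior, using the patterns and strings from the
report. The variable names and the final ".-" variant are illustrative
assumptions; the workaround mentioned in the report is not shown in the
thread, so this is only one possible alternative.

```lua
-- Subject strings from the report.
local s1 = "key1='value1'"
local s2 = "key2='value2'"

-- Inside [...], "%2" is an escaped literal '2', not a back-reference,
-- so [^%2] means "any character except '2'".
local p1 = "(%w+)=([\"'])([^%2]-)%2"
local k1, q1, v1 = s1:match(p1)   -- matches: the value contains no '2'
local r2 = s2:match(p1)           -- nil: the '2' in value2 is excluded

-- Likewise [^%1] means "any character except '1'".
local p2 = "([\"'])([^%1]-)%1"
local r1 = s1:match(p2)           -- nil: the '1' in value1 is excluded
local q2, v2 = s2:match(p2)       -- matches

-- Possible alternative (an assumption, not taken from the thread):
-- let a lazy ".-" run up to the back-reference, which is a pattern
-- item outside any character class.
local p3 = "(%w+)=([\"'])(.-)%2"
local k3, q3, v3 = s2:match(p3)

print(k1, v1, r2, r1, v2, k3, v3)
```

The lazy ".-" stops at the first occurrence of the captured quote, which is
what the "[^%2]-" form was intended to express.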
-- Roberto