On Sun, Oct 22, 2017 at 10:25 PM Sean Conner <> wrote:
It was thus said that the Great Sean Conner once stated:
> >
> > Also, unless I misunderstood something or something has changed since I
> > last looked at it, LPeg cannot do anything requiring backtracking, so there
> > are things matchable by patterns but unmatchable by LPeg. Possible example
> > being "capture the three characters between a question mark and a dollar
> > sign, immediately before the dollar sign", for which the (untested) pattern
> > "%?.-([^$][^$][^$])%$" should match easily, but AFAICT, there is no
> > equivalent LPeg pattern since the only way is to match up to the $ and then
> > back up three characters.
>   Yes it can:
>       qd = lpeg.P"?"    -- match the question mark
>          * lpeg.P(1)^3  -- at least three characters, could be more
>          * lpeg.Cmt(    -- at match time
>                   lpeg.P"$", -- when we hit the dollar sign, call this
>                   function(subject,position)
>                   -- position is one past the dollar sign,
>                   -- so we return the three characters prior to the
>                   -- dollar sign (position-4 to position-2 of subject)
>                   return position,subject:sub(position-4,position-2)
>                 end
>            )

  I realized the code has a bug in that the three characters matched should
*not* match the dollar sign.  Easiest fix is in the function:

        qd = lpeg.P"?"
           * lpeg.P(1)^3
           * lpeg.Cmt(lpeg.P"$",function(subject,position)
                 local ret = subject:sub(position-4,position-2)
                 -- returning nothing makes the match-time capture fail
                 if not ret:find("$",1,true) then
                   return position,ret
                 end
               end)

OR, you could fix it:

        no_dollar = lpeg.R("\0#","%\255") -- look ma!  No $
        qd        = lpeg.P"?"
                  * (lpeg.P(1) - (no_dollar * no_dollar * no_dollar * lpeg.P"$"))^0
                  * lpeg.C(no_dollar * no_dollar * no_dollar)
                  * lpeg.P"$"

and get rid of the call to lpeg.Cmt().  Okay, this one uses:

        lpeg.R()        - define ranges of characters to match
        lpeg.C()        - return pattern matches as a capture
        patt1 - patt2   - match pattern1 if pattern2 does not match

  That should take care of

        ?$$$$$123$      - returns 123
        ?$$$$           - returns nil
        ?123$           - returns 123
        ?987654123$     - returns 123
        ?12$            - returns nil
        ?12$$           - returns nil
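
  A minimal sketch for checking the cases above, assuming the lpeg module
is installed and loadable with require.  It runs each subject through the
LPeg pattern and through the Lua pattern from the quoted message and
asserts that both give the listed result.  The names `three`, `cases` and
`lua_pattern` are made up for this illustration.

```lua
local lpeg = require "lpeg"

-- every character except "$": \0 through "#", and "%" through \255
local no_dollar = lpeg.R("\0#", "%\255")
local three     = no_dollar * no_dollar * no_dollar

local qd = lpeg.P"?"
         * (lpeg.P(1) - (three * lpeg.P"$"))^0  -- skip until 3 non-$ then $
         * lpeg.C(three)                        -- capture those 3 characters
         * lpeg.P"$"

-- the Lua pattern from the quoted message, for comparison
local lua_pattern = "%?.-([^$][^$][^$])%$"

-- { subject, expected capture (nil means no match) }
local cases = {
  { "?$$$$$123$",  "123" },
  { "?$$$$",       nil   },
  { "?123$",       "123" },
  { "?987654123$", "123" },
  { "?12$",        nil   },
  { "?12$$",       nil   },
}

for _, case in ipairs(cases) do
  local subject, want = case[1], case[2]
  assert(lpeg.match(qd, subject) == want, "LPeg: " .. subject)
  assert(subject:match(lua_pattern) == want, "Lua pattern: " .. subject)
end
```

  Note that lpeg.match() is anchored at the start of the subject, while
string.match() searches; every subject here starts with "?", so the
difference does not show up in these cases.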

  -spc (Think of 'patt1 - patt2' as look ahead ... )
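
  To make the look-ahead remark concrete: the LPeg manual describes
`patt1 - patt2` as equivalent to `-patt2 * patt1`, that is, a negative
look-ahead for patt2 followed by patt1.  A small sketch, assuming lpeg is
installed; the names `by_difference` and `by_lookahead` are only for
illustration.

```lua
local lpeg = require "lpeg"

local any    = lpeg.P(1)
local dollar = lpeg.P"$"

-- two spellings of "any run of characters that are not a dollar sign"
local by_difference = (any - dollar)^0
local by_lookahead  = (-dollar * any)^0

-- with no captures, lpeg.match() returns the position just past the match
for _, subject in ipairs { "abc$def", "$abc", "abc", "" } do
  assert(lpeg.match(by_difference, subject) == lpeg.match(by_lookahead, subject))
end
```

  Neither form consumes the "$" itself; the look-ahead only checks that
it is not the next character.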

Okay, interesting. So that particular pattern, at least, can be converted to LPeg. However, that was used merely as an example of a more general concern, which is that there are things matchable with Lua patterns but not matchable with LPeg.

Is that true or not? If not, then that removes one of my objections to LPeg replacing standard patterns (though the first objection, regarding the difficulty of LPeg, would still stand).