- Subject: Re: how to translate lua pattern "(.*)and(.*)" with lpeg re ?
- From: Albert Chan <albertmcchan@...>
- Date: Fri, 16 Feb 2018 18:10:46 -0500
> On Feb 16, 2018, at 3:17 PM, Dirk Laurie <dirk.laurie@gmail.com> wrote:
>
> 2018-02-16 19:28 GMT+02:00 Albert Chan <albertmcchan@yahoo.com>:
>
>>>> But ... what if z = re.compile " 'and' %s+ ('possibly' / 'likely' / 'definitely') %s+ " ?
>>>> or, possibly even more complicated ?
>>>>
>>>> Above lpeg re pattern can handle it
>>>
>>> LPeg without `re` handles it too.
>>>
>>> C, P = lpeg.C, lpeg.P
>>> kw = P'possibly' + 'likely' + 'definitely'
>>> (C((1-kw)^0) * kw * (P(-1)/''+C(P(1)^0)))
>>
>> The first capture in your lpeg pattern is non-greedy:
>> it stops at the first kw match (also, P(-1) is unnecessary).
>> So, if kw = 'and', your lpeg pattern is the same as the Lua pattern "^(.-)and(.*)$"
>
> OK, so the assignment becomes "find the last occurrence of a keyword
> and return the strings before and after it." I'm not too proud to
> admit that I would do that with a mixture of Lua and LPeg. Too late at
> night in my timezone to work out the details.
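To make the greedy versus non-greedy distinction in the quoted exchange concrete, here is a small plain-Lua sketch. The sample string is made up for illustration and is not from the thread.

```lua
-- Made-up sample text with two occurrences of "and".
s = 'salt and pepper and sugar'

-- Greedy (.*) takes as much as it can, so the split happens at the LAST "and".
g_before, g_after = s:match('^(.*)and(.*)$')

-- Non-greedy (.-) takes as little as it can, so the split happens at the FIRST "and".
n_before, n_after = s:match('^(.-)and(.*)$')

print(g_before, g_after)
print(n_before, n_after)
```

The surrounding spaces stay in the captures, since neither pattern consumes them.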
-- You can do the above easily in Lua, with the help of xpattern.lua
-- I posted a copy at https://github.com/achan001/LPeg-anywhere
xpattern = require 'xpattern'
X = xpattern.P
p1 = X'(.*)and%s+' * (X'possibly' + 'likely' + 'definitely') * X'%s+(.*)'
p1_match = p1:compile()
t = 'lpeg and possibly xpattern will work, and definitely worth trying'
=p1_match(t)
lpeg and possibly xpattern will work, worth trying
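For comparison, the same split can probably be done with nothing but the standard string library. This is only a sketch: split_last is a hypothetical helper, not part of xpattern or lpeg, and it assumes the keywords contain no Lua pattern magic characters.

```lua
-- Hypothetical helper: split s around the LAST occurrence of
-- "and <keyword>", where <keyword> is any entry of keywords.
-- Assumes keywords contain no pattern magic characters.
function split_last(s, keywords)
  local best_pos, before, after = 0, nil, nil
  for _, kw in ipairs(keywords) do
    -- Greedy (.*) finds the last occurrence for this keyword;
    -- the () position capture records where the delimiter starts.
    local b, pos, a = s:match('^(.*)()and%s+' .. kw .. '%s+(.*)$')
    if pos and pos > best_pos then
      best_pos, before, after = pos, b, a
    end
  end
  return before, after   -- nil, nil when no keyword matched
end

t = 'lpeg and possibly xpattern will work, and definitely worth trying'
print(split_last(t, {'possibly', 'likely', 'definitely'}))
```

Each keyword costs one full greedy match, so this is a readability sketch rather than a fast solution.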
-- now, a plug for my patched lpeg-anywhere :-)
re = require 're'
z = re.compile[[ 'and' %s+ ('possibly' / 'likely' / 'definitely') %s+ ]]
p2 = re.compile("{(. >&%z)*} %z {.*}", {z=z})
=p2:match(t)
lpeg and possibly xpattern will work, worth trying
-- '>%pat' == '(g <- %pat / .[^%pat]* g)', and I mean that literally.
-- [^%pat] = [^a] here, found by actually examining the lpeg object %pat for its head chars
--> [^%pat]* guarantees that the text begins with 'a' at the next match attempt
--> fewer false starts mean better performance.
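The head-char skipping idea can be imitated in plain Lua, which may help show why it saves work. This is only an illustration of the search loop, not the code of the patch: search and try_z are hypothetical names, and the head char 'a' is passed in by hand instead of being derived from the pattern.

```lua
-- Hypothetical anchored matcher for: 'and' %s+ (keyword) %s+
-- Returns the index of the last matched char, or nil on failure.
function try_z(s, i)
  for _, kw in ipairs{'possibly', 'likely', 'definitely'} do
    local _, e = s:find('^and%s+' .. kw .. '%s+', i)
    if e then return e end
  end
  return nil
end

-- Imitates g <- %pat / .[^a]* g : try the matcher at the current position;
-- on failure consume one char, skip everything that is not the head char,
-- and try again.
function search(s, head, try, init)
  local i = init or 1
  while i and i <= #s do
    local e = try(s, i)
    if e then return i, e end
    i = s:find(head, i + 1, true)   -- plain find: jump to the next head char
  end
  return nil
end

t = 'lpeg and possibly xpattern will work, and definitely worth trying'
print(search(t, 'a', try_z))
```

With this text the matcher is attempted only at position 1 and at the first 'a', instead of at every character.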
On my old Pentium 3, 1 million iterations:
p1_match takes 8.903 sec
p2:match takes 6.209 sec
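Timings like these can be collected with a small loop around os.clock. The sketch below is an assumption about how one might measure it, not the harness used for the numbers above; time_it is a hypothetical name.

```lua
-- Hypothetical timing helper: call f(...) n times and
-- return the CPU seconds used, as reported by os.clock.
function time_it(f, n, ...)
  local start = os.clock()
  for _ = 1, n do
    f(...)
  end
  return os.clock() - start
end

-- Example with a standard library function standing in for a matcher.
elapsed = time_it(string.find, 100000, 'lpeg and possibly', 'and', 1, true)
print(string.format('%.3f sec', elapsed))
```

For a compiled matcher function this would be called as time_it(p1_match, 1e6, t); a method-style matcher needs the object passed first, as in time_it(p2.match, 1e6, p2, t).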