[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: lpeg could infer the validity of this grammar.
- From: Pierre-Yves Gérardy <pygy79@...>
- Date: Mon, 15 Dec 2014 19:56:50 +0100
Here's the code that detects left recursion:
https://github.com/lua/lpeg/blob/2960f1cf68af916a095625dfd3e39263dac5f38c/lptree.c#L934
https://github.com/lua/lpeg/blob/2960f1cf68af916a095625dfd3e39263dac5f38c/lptree.c#L915
The function walks the grammar tree and if it loops past a given
threshold, an error is raised.
LPeg could support left-recursion. Sérgio Medeiros, Fabio Mascarenhas
and Roberto Ierusalimschy wrote a paper that describes an extension to
the engine, but they did not publish the code, for some reason.
http://arxiv.org/abs/1207.0443
Rostislav Sacek added left recursion support in LPegLJ, but it only
works with LuaJIT.
https://github.com/sacek/LPegLJ
Regards,
—Pierre-Yves Gérardy
—Pierre-Yves
On Mon, Dec 15, 2014 at 4:53 PM, Sam Putman <atmanistan@gmail.com> wrote:
>
>
> On Sun, Dec 14, 2014 at 10:56 PM, Sean Conner <sean@conman.org> wrote:
>>
>>
>> LPeg is a Lua implementation of Parsing Expresson Grammars, which are
>> *NOT* (I repeat, *NOT*) the same as LR() or LL() parsers. There's a bit
>> more information about this on the Wikipedia:
>>
>>
>> http://en.wikipedia.org/wiki/Parsing_expression_grammar#Indirect_left_recursion
>>
>> -spc
>>
>
> Absolutely correct, now, let's focus.
>
> The lpeg engine has to detect potential left recursion, and reject it.
>
> How does it do this?
>
> It looks at a call to "patt", and sees whether another call to "patt" can be
> reached without consuming any input. We check the rules, one at a time, to
> assure that the second "patt" call either consumes or fails before it
> reaches a third.
>
> The only type of capture which can modify this *logic* in a way that *still
> prevents infinite loops* is #patt, and only under this deterministic
> circumstance: If we have a pattern 'token', a looping pattern "expr", and a
> calling pattern "factor", then if factor looks like "factor = #token * expr"
> and expr looks like "expr = token + factor", we're okay.
>
> Specifically, any #patt encountered before a potential left-recursion should
> be followed to see if a later call to `patt` is encountered. This must
> happen before the cursor moves, and before the recursive call to the
> "factor" type rule.
>
> Think of #patt as a promise to the grammar compiler that 'patt' will be
> encountered. The compiler may statically confirm that this happens.
> Currently, it treats #patt as any other pattern that doesn't consume input,
> but in fact, it adds a new constraint which may be propagated.
>
> One more time: I am making a logical claim about an algorithm. I would like
> to see demonstration code, and I'll tell you if the algorithm would say
> "cannot loop infinitely" or "might loop infinitely". I am especially
> interested in grammars for which these categories cannot be assigned: I
> haven't proven that it's impossible, because I don't have the time, but I
> suspect it is.