lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


It was thus said that the Great Albert Chan once stated:
> 
> > On Jan 22, 2018, at 12:33 PM, Sean Conner <sean@conman.org> wrote:
> > 
> > It was thus said that the Great albertmcchan once stated:
> >> during testing, lpeg backtracking can only handle max string length of 199:
> >> is this a lpeg bug ? only 199 chars limit ?
> > 
> >  Nope.  It's documented.  From the LPeg documentation:
> > 
> >    lpeg.setmaxstack (max)
> > 
> >        Sets a limit for the size of the backtrack stack used by
> >        LPeg to track calls and choices. (The default limit is 400.)
> >        Most well-written patterns need little backtrack levels and
> >        therefore you seldom need to change this limit; before
> >        changing it you should try to rewrite your pattern to avoid
> >        the need for extra space. Nevertheless, a few useful
> >        patterns may overflow. Also, with recursive grammars,
> >        subjects with deep recursion may also need larger limits.
> > 
> > And this is exactly what you are hitting:
> > 
> >    lua: a.lua:18: backtrack stack overflow (current limit is 400)
> 
> but my example only need 4 backtracks, not 400 !
> And it never say anything about max string length of 199

  But it needs more than 400 levels of stack, because  ...

> pat = re.compile "{g <- .g / &'and'} 'and' {.*}"

'g' calls itself, over and over again, exhausting the call stack (aka the
'backtrack') of the LPeg VM (yes, LPeg generates its own VM code).

  Annoying yes.  And yes, I've run into the problem.  My first JSON decoder
[1] would blow up if JSON objects (or arrays) were nested too deep.  Shame
too, because the code for that is a verbatim dump of the JSON BNF (it works
most of the time, except when it doesn't).  I then wrote a second version of
it [2] that's not quite as pretty, but handles deeeply nested JSON objects
(and it can handled data feed into it, whereas the original needed the
entire JSON document in memory).

  Could the documentation be a bit clearer?  Perhaps.  The LPeg documenation
does mention the limit when describing the lpeg.setmaxstack() function, and
re uses LPeg to generate LPeg code from the re code provided so it *is*
mentioned, but as usual, in the PUC terse style.

  -spc

[1]	https://github.com/spc476/LPeg-Parsers/blob/master/json.lua

[2]	https://github.com/spc476/LPeg-Parsers/blob/master/jsons.lua