Further (again) to my message about the re module self-test failure, I think I worked it out (this took a few days).

The self-test fails to parse any line containing "<-", which led me to ask why the check for "name S !arrow" failed.

The relevant part of the grammar is here:

  pattern         <- exp !.
  exp             <- S (alternative / grammar)

  alternative     <- seq ('/' S seq)*
  seq             <- prefix*

   ...

  grammar         <- definition+


An "exp" is either an "alternative" or a "grammar".

Assuming the alternative doesn't use the "/" symbol we effectively have this:

  pattern <- S (prefix* / definition+) !.


-----

We can make up a similar test case:

  local re = require "re"   -- bind the result: require is not guaranteed to create a global "re"
  local target = "foo"
  local grammar = " ('foo'* / 'bar'+) !."
  print (re.match (target, grammar))


That will match a target of "foo" but not "bar". Why? Because even zero instances of "foo" are acceptable as a match, so "'foo'*" succeeds on "bar" without consuming anything. A PEG choice is ordered and commits to the first alternative that succeeds, so the "'bar'+" alternative is never tried. Likewise, in the real grammar "alternative" can match the empty string. A line like this will still match "alternative" (without consuming any characters):

   pattern         <- exp !.


Now the final test, "!." (the check that we are at end-of-subject), fails, because the whole line is still unconsumed.
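
The two steps can be seen separately on the toy pattern. This is only a sketch, assuming LPeg and its re module are installed; the subject "bar" is the same made-up test case as above, not part of the real grammar:

```lua
local re = require "re"

-- 'foo'* succeeds on "bar" without consuming anything: the returned
-- position is 1, i.e. the match ended before the first character.
print(re.match("bar", "'foo'*"))                 --> 1

-- The choice settles on that empty match, so 'bar'+ is never tried,
-- and the end-of-subject check !. then fails at position 1.
print(re.match("bar", "('foo'* / 'bar'+) !."))   --> nil
```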


However, putting "grammar" first (i.e. "(grammar / alternative)" rather than "(alternative / grammar)") works, because "grammar" matches ONE or more definitions (not zero or more) and will fail on a non-grammar line, thus letting the PEG try the "alternative" route.
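
A minimal sketch of the reordering on the toy pattern, assuming LPeg's re module is installed. Here 'bar'+ stands in for "grammar" and 'foo'* for "alternative"; these stand-ins are for illustration only and are not the real re grammar:

```lua
local re = require "re"

-- The alternative that must consume something goes first; the one
-- that can match the empty string goes last.
local reordered = re.compile("('bar'+ / 'foo'*) !.")

print(reordered:match("bar"))  --> 4  ('bar'+ matches)
print(reordered:match("foo"))  --> 4  ('bar'+ fails, 'foo'* matches)
print(reordered:match(""))     --> 1  (empty subject is still accepted)
```

The empty subject still matches because the zero-or-more alternative remains available as the fallback.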

You could work around it as well by insisting that "seq" matches at least something:


   seq <- prefix+

However, a totally empty grammar then no longer parses.
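
The same trade-off shows up on the toy pattern if both alternatives are required to consume something. Again a sketch only, assuming LPeg's re module is installed, with 'foo'+ standing in for the stricter "seq":

```lua
local re = require "re"

-- Neither alternative can match the empty string any more.
local strict = re.compile("('foo'+ / 'bar'+) !.")

print(strict:match("bar"))  --> 4    ('foo'+ fails, so 'bar'+ is tried)
print(strict:match(""))     --> nil  (an empty subject is now rejected)
```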


Reference: http://www.inf.puc-rio.br/~roberto/lpeg/re.html