lua-l archive

Mike,

You've posed a good LPEG quiz about parsing Lua long strings and long comments. I'll work on it.

However, my first thought is that since PEGs (or grammars in general) are supersets of RegExps, it should not be a problem (in principle) to represent RegExps in a PEG. The question is whether Lua's LPEG version 0.2 is fully up to PEG & RegExp functionality yet.

In particular, whereas CFGs are (typically) expressed in Backus-Naur Form (BNF), PEGs extend the BNF-like semantics of grammar definition languages by *adding* regular expressions as first-class pattern constructs directly to the BNF rules themselves. Thus PEGs, by definition, include RegExp facilities, as well as capture facilities (which provide additional/local context and virtual backtracking not found in conventional CFGs, though these are often added as auxiliary functions in CFG parsers).

Whether Lua's LPEG has arrived "there" yet in terms of fully incorporating RegExps into its Grammar/Rule syntax is another matter. LPEG's lpeg.P(pattern), R(range), S(set), T(table), C(capture), and I(index) constructs are the starting point for RegExp construction. The other semantics provided by LPEG then extend the RegExps and provide the rich PEG grammar behavior.
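
As a rough illustration of how these constructs combine into something RegExp-like, here is a minimal sketch. It assumes a recent LPeg release loadable as the `lpeg` module (function names and behavior may differ in 0.2), and it uses only P, R, S, and C. The pattern names `sign`, `digits`, and `number` are made up for this example.

```lua
local lpeg = require("lpeg")
local P, R, S, C = lpeg.P, lpeg.R, lpeg.S, lpeg.C

-- Roughly the regular expression [+-]?([0-9]+)
local sign   = S("+-")^-1       -- optional sign: at most one of '+' or '-'
local digits = R("09")^1        -- one or more characters in the range 0-9
local number = sign * C(digits) -- sequence; C() captures the digit run

-- When a pattern contains captures, a successful match returns the
-- captured values; a failed match returns nil.
local captured = lpeg.match(number, "-42")
local failed   = lpeg.match(number, "abc")
```

Repetition (`^1`, `^-1`), sequence (`*`), and sets/ranges cover the usual RegExp operators; ordered choice (`+`) would play the role of alternation.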

I think I'll wait a little while longer, however, until LPEG is closer to 0.9 than 0.2 to test the limits of its PEG & RegExp expressivity!

Cheers,

// Brian


Mike Pall wrote:

Maybe it's also interesting to show what cannot be expressed with
PEGs. Apart from the textbook examples (see the Wikipedia links),
I think I found one with practical relevance:

I found no closed PEG for properly parsing Lua 5.1 long strings
and long comments. This is easy for any finite subset:

  "[==[" * (P(1)-"]==]")^0 * "]==]"
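
For reference, the fixed-level pattern above can be tried out with a minimal sketch like the following. It assumes a recent LPeg release loadable as the `lpeg` module; the helper `long_string` and the name `upto3` are hypothetical and only illustrate the "finite subset" idea, not a general solution.

```lua
local lpeg = require("lpeg")
local P = lpeg.P

-- Long string with exactly two '=' signs, as in the pattern above
local level2 = P("[==[") * (P(1) - P("]==]"))^0 * P("]==]")

-- Hypothetical helper: build the same pattern for one fixed level n
local function long_string(n)
  local eq = string.rep("=", n)
  local open, close = P("[" .. eq .. "["), P("]" .. eq .. "]")
  -- consume any character that does not start the closing bracket
  return open * (P(1) - close)^0 * close
end

-- A finite subset: levels 0 through 3 as an ordered choice
local upto3 = long_string(0) + long_string(1)
            + long_string(2) + long_string(3)

-- Without captures, a successful match returns the index just past
-- the matched text; a failed match returns nil.
local s = "[==[a ]=] b]==]"
local pos = lpeg.match(level2, s)
```

Each level needs its own alternative, so this only covers as many levels as are written out.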