Roberto Ierusalimschy wrote:
I am releasing a prototype of (yet) another pattern-matching
library for Lua, called LPeg:

Very interesting. I printed out your page and the Wikipedia entry and read them twice to absorb the concepts...

Funnily enough, I recently thought it would be nice to have a real pattern-matching language, one that would be more flexible and more readable than regular expressions, with a gentler learning curve. I am not too sure about the readability of LPeg... But it is close to what I had in mind.

I didn't know about PEGs; the technique looks very promising.
Now that I have mastered regular expressions, I have to learn a new syntax and even more new concepts and ways of thinking...


Some remarks (based on the doc; I haven't tried the library yet):

- I understand that the LPeg syntax leverages metamethods, and is thus limited by what they can do. It is a bit annoying that this forces a departure from the "traditional" PEG syntax, although the result is still logical. Too bad that Lua's operator precedence rules prevent you from keeping / as ordered choice and .. as sequence.
And by using #, you limit the library to Lua 5.1.

- I wondered how to emulate the {n,m} syntax of REs. It seems that patt^n is close to that. Perhaps for newbies like me, you should provide more examples, with both bounds if possible. A classical example in REs is [a-z]{2,6} for TLDs (perhaps with a higher upper bound).
I understand that, per the doc, we have:
^n  -> {n,}
^-n -> {0,n}
^0  -> {0,}  -> *
^1  -> {1,}  -> +
^-1 -> {0,1} -> ?
so, to have patt{n,m}, can we write
  patt^n * patt^(n - m)
? Is it possible to express this without writing the pattern twice? (or is it a non-issue?)

- I suppose you plan to allow names (keys) as grammar table indexes, or is it really hard to implement? Or is it again a non-issue? Oh, I now see that the last example uses variables as rule names, so it is indeed a non-issue.

- Looking at the CSV example, is (lpeg.P(1) - '"') the same as the (more commonly used elsewhere) (1 - lpeg.P"'")? I suppose the rule here is to have at least one pattern in the expression to trigger the metamethods.
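
To make the metamethod remarks above concrete, here is a minimal sketch in plain Lua. It does not use LPeg at all: P below is a hypothetical toy stand-in that only records how Lua groups the operators. It is meant to show why / and .. would group the wrong way round for choice and sequence, and that a single pattern operand on either side seems to be enough to trigger a metamethod.

```lua
-- Toy expression node: NOT LPeg's implementation, only an illustration.
local Node = {}

-- Hypothetical stand-in for lpeg.P: wraps any value in a Node.
local function P(v)
  if getmetatable(v) == Node then return v end
  return setmetatable({ text = tostring(v) }, Node)
end

-- Build a metamethod that records the grouping chosen by Lua's parser.
local function binary(symbol)
  return function(a, b)
    a, b = P(a), P(b)   -- either operand may be a plain number or string
    return P("(" .. a.text .. " " .. symbol .. " " .. b.text .. ")")
  end
end

Node.__concat = binary("..")
Node.__div    = binary("/")
Node.__mul    = binary("*")
Node.__add    = binary("+")
Node.__sub    = binary("-")

local a, b, c, d = P("a"), P("b"), P("c"), P("d")

-- "Traditional" spelling: / binds tighter than .., so the choice is
-- evaluated before the sequences around it.
local traditional = a .. b / c .. d
print(traditional.text)   --> (a .. ((b / c) .. d))

-- LPeg-style spelling: * binds tighter than +, as sequence and choice need.
local lpeg_style = a * b + c * d
print(lpeg_style.text)    --> ((a * b) + (c * d))

-- One Node operand on either side is enough to reach __sub.
local left  = P(1) - '"'
local right = 1 - P('"')
print(left.text, right.text)   --> (1 - ")   (1 - ")

-- With no Node operand, Lua falls back to ordinary arithmetic and fails.
print(pcall(function() return 1 - '"' end))   --> false, plus an error message
```

With / as choice and .. as sequence, every sequence inside a choice would need explicit parentheses, whereas * and + already have the relative precedence that sequence and choice require.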

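On the {n,m} remark above: as far as I can tell from the documentation, patt^n matches n or more occurrences and takes as many as it can, so following it with a bounded repetition would not cap the total. A possible sketch is to concatenate n mandatory copies and then allow at most m - n more. The helper name between is hypothetical (it is not part of LPeg), and the code assumes the LPeg module is installed and loadable as "lpeg".

```lua
-- Assumption: LPeg is installed and can be loaded with require("lpeg").
local ok, lpeg = pcall(require, "lpeg")

-- Hypothetical helper (not part of LPeg): match patt between n and m times.
local function between(patt, n, m)
  assert(0 <= n and n <= m, "expected 0 <= n <= m")
  patt = lpeg.P(patt)
  local result = lpeg.P(true)          -- always succeeds, consumes nothing
  for _ = 1, n do
    result = result * patt             -- n mandatory copies
  end
  if m > n then
    result = result * patt^-(m - n)    -- then at most m - n optional copies
  end
  return result
end

if ok then
  local letter = lpeg.R("az")
  -- Roughly [a-z]{2,6} anchored at the end: -1 succeeds only at end of input.
  local tld = between(letter, 2, 6) * -1
  print(lpeg.match(tld, "com"))       --> 4   (position just after the match)
  print(lpeg.match(tld, "a"))         --> nil (too short)
  print(lpeg.match(tld, "abcdefg"))   --> nil (too long)
else
  print("lpeg is not installed; nothing to demonstrate")
end
```

The pattern is written only once at the call site; the repetition happens inside the helper, which builds the combined pattern a single time.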

Typo alert!
- where the /math/ occurs.
- we can use /to/ following transformer:
- the use of a dot to /denote/ concatenation.
(not sure about this one: the verb seems to be used in a strange way, per the WordReference translation/definition, but it may just be a limitation of my own understanding of English...)


Overall, it is an interesting technique. I am starting to fantasize about using such a parser to describe flexible but fast and powerful lexers for Scintilla, which would allow adding (or changing) lexers without recompiling the component... I suppose your library (parser) uses the Lua license, like most others. It could be a good starting point; I have seen no reference to a packrat parser written in C...

--
Philippe Lhoste
--  (near) Paris -- France
--  http://Phi.Lho.free.fr
--  --  --  --  --  --  --  --  --  --  --  --  --  --