It was thus said that the Great Lorenzo Donati once stated:
> On 21/07/2020 10:51, Sean Conner wrote:
> >[1]	I recently wrote an HTML parser using LPEG.  I started out with a
> >	"quick-n-dirty" one but quickly realized I was going to be worse off
> >	than with a proper parser.  So I broke out the DTD [2] for the
> >	version of HTML I had to parse, and wrote one [3].  Works perfectly,
> >	handles the optional closing tags (and the one opening tag).  It
> >	helped that all the HTML I need to parse is well formed and
> >	validated.
> I wish I had time to learn to use LPEG. I gave it a go a couple of times
> in the past decade, but its theoretical background is way over my head to
> be "grokked" in a couple of days.

  I'm sad to hear that, because I found LPEG to be *way* easier to use than
the old lex and yacc (or flex and bison, the modern replacements), which
were even *more* theoretical in nature (I always hated those "shift/reduce
conflicts" from yacc).

> I have little formal education in 
> compiler and grammar theory, and I realize having a firm understanding 
> of how a formal grammar "behaves" really would help understanding LPEG 
> and how to use it for practical tasks.

  If you can use Lua patterns (or general regex) I think you can learn LPEG. 
Yes, there's a bit of a learning curve, but I don't think it's that big, and
I don't think you need any formal education to understand it.

> I /can/ read the EBNF form of a grammar and reason about it in a 
> practical way, but I really can't /design/ a grammar to do what I want, 
> and that would help a lot to use LPEG effectively, I guess.

  Well, the RFCs do give BNF for the headers, so you aren't entirely left to
your own devices.  Most will even collect all the BNF in a section at the
end so you don't have to page around the document trying to find all the

> So every time I gave up for lack of time and I forgot almost everything 
> I learned. I found it has quite a steep learning curve, alas. I also 
> tried a small tutorial written by Gavin Wright (IIRC), but it wasn't 
> enough to bring me to that "AHA!" moment when you really grasp how to 
> use the tool effectively.

  Anything you can do with Lua patterns you can do with LPEG (there was a
thread on this mailing list a few years ago about that).  But the neat thing
about LPEG is that you can construct a "pattern" from smaller pieces.
You can see all that if you check out my LPEG parsers repo:

but you can also check out some simplified examples in another repo I have:

  For that one, I would go through the examples in the order listed. 
date1.lua is about as simple as they come, a pattern to match a date like
"Wed, 2 Dec 2015 20:51:17 +0100".
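
  To give a flavor of what "constructing a pattern from smaller pieces"
looks like, here is a rough sketch of a date matcher along those lines.  It
is NOT the actual date1.lua from the repo, just an illustration written for
this message; the names (digit, wday, month, date, and the field names in
the resulting table) are my own choices here, and it assumes LPEG is
installed and loadable as "lpeg".

```lua
-- Sketch only: assumes the LPEG library is installed (e.g. via LuaRocks).
local lpeg = require "lpeg"
local P, R, S, C, Cg, Ct = lpeg.P, lpeg.R, lpeg.S, lpeg.C, lpeg.Cg, lpeg.Ct

-- Small building blocks.
local digit = R"09"
local sp    = P" "^1                    -- one or more spaces
local wday  = P"Mon" + P"Tue" + P"Wed" + P"Thu" + P"Fri" + P"Sat" + P"Sun"
local month = P"Jan" + P"Feb" + P"Mar" + P"Apr" + P"May" + P"Jun"
            + P"Jul" + P"Aug" + P"Sep" + P"Oct" + P"Nov" + P"Dec"

-- Helper (hypothetical name): convert the matched text to a number.
local function num(patt) return patt / tonumber end

local day  = num(digit * digit^-1)      -- one or two digits
local year = num(digit * digit * digit * digit)
local two  = num(digit * digit)         -- exactly two digits
local zone = C(S"+-" * digit * digit * digit * digit)

-- Bigger pieces are just the smaller ones multiplied together.
local time = Cg(two, "hour") * P":" * Cg(two, "min") * P":" * Cg(two, "sec")

local date = Ct(
    Cg(C(wday), "weekday") * P"," * sp
  * Cg(day, "day")         * sp
  * Cg(C(month), "month")  * sp
  * Cg(year, "year")       * sp
  * time                   * sp
  * Cg(zone, "zone")
) * -P(1)                               -- nothing may follow the zone

-- Usage: returns a table of fields on success, nil on failure.
local t = date:match("Wed, 2 Dec 2015 20:51:17 +0100")
if t then
  print(t.weekday, t.day, t.month, t.year, t.hour, t.min, t.sec, t.zone)
end
```

  Each local is a pattern in its own right, so you can test "time" or
"month" alone before gluing them into "date", which is something you can't
really do with a single Lua pattern string.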

  Anyway, I blather ...