- Subject: Re: Request for advice: pure Lua Library to parse mail messages.
- From: Sean Conner <sean@...>
- Date: Tue, 21 Jul 2020 17:47:03 -0400
It was thus said that the Great Lorenzo Donati once stated:
> On 21/07/2020 10:51, Sean Conner wrote:
> >[1] I recently wrote an HTML parser using LPEG. I started out with a
> > "quick-n-dirty" one but quickly realized I was going to be worse off
> > than with a proper parser. So I broke out the DTD [2] for the
> > version of HTML I had to parse, and wrote one [3]. Works perfectly,
> > handles the optional closing tags (and the one opening tag). It
> > helped that all the HTML I need to parse is well formed and
> > validated.
>
> I wish I had time to learn to use LPEG. I gave it a go a couple of times
> in the past decade, but its theoretical background is way over my head to
> be "grokked" in a couple of days.
I'm sad to hear that, because I found LPEG to be *way* easier to use than
the old lex and yacc (or flex and bison as the modern replacements) which
were even *more* theoretical in nature (I always hated those "shift and
reduce errors" from yacc).
> I have little formal education in
> compiler and grammar theory, and I realize having a firm understanding
> of how a formal grammar "behaves" really would help understanding LPEG
> and how to use it for practical tasks.
If you can use Lua patterns (or general regex) I think you can learn LPEG.
Yes, there's a bit of a learning curve, but I don't think it's that big, and
I don't think you need any formal education to understand it.
> I /can/ read the EBNF form of a grammar and reason about it in a
> practical way, but I really can't /design/ a grammar to do what I want,
> and that would help a lot to use LPEG effectively, I guess.
Well, the RFCs do give BNF for the headers, so you aren't entirely left to
your own devices. Most will even collect all the BNF in a section at the
end so you don't have to page around the document trying to find all the
rules.
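As a rough illustration of how directly BNF can map onto LPEG, here is a minimal sketch based on the field-name rule from RFC 5322 section 3.6.8. The `header` pattern is a simplification assumed for this example (it ignores folded lines and most of the real grammar), and the variable names are made up for illustration.

```lua
local lpeg = require "lpeg"
local P, R, C = lpeg.P, lpeg.R, lpeg.C

-- RFC 5322, section 3.6.8:
--   field-name = 1*ftext
--   ftext      = %d33-57 / %d59-126   ; printable US-ASCII except ":"
local ftext      = R("\33\57", "\59\126") -- two byte ranges, one per alternative
local field_name = ftext^1                -- "1*" in BNF becomes "^1" in LPEG

-- Simplified assumption: name, colon, optional spaces, then everything
-- up to the CRLF.  Real headers may be folded across several lines.
local header = C(field_name) * P":" * P" "^0 * C((P(1) - P"\r\n")^0)

local name, value = header:match("Subject: Re: Request for advice\r\n")
print(name, value)
```

Each BNF rule becomes one local variable, and alternation and repetition carry over almost symbol for symbol.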
> So every time I gave up for lack of time and I forgot almost everything
> I learned. I found it has quite a steep learning curve, alas. I also
> tried a small tutorial written by Gavin Wright (IIRC), but it wasn't
> enough to bring me to that "AHA!" moment when you really grasp how to
> use the tool effectively.
Anything you can do with Lua patterns you can do with LPEG (there was a
thread on this mailing list a few years ago about that). But the neat thing
about LPEG is that you can construct a "pattern" from smaller pieces.
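A small sketch of what "smaller pieces" can look like, using a time of day as the thing to match. The first version is an ordinary Lua pattern, the second builds the same match out of named LPEG pieces. The names `digit`, `two` and `time` are chosen for this example only.

```lua
local lpeg = require "lpeg"
local P, R, C = lpeg.P, lpeg.R, lpeg.C

-- Lua pattern: the whole thing is one opaque string.
local h, m, s = ("20:51:17"):match("^(%d%d):(%d%d):(%d%d)$")

-- LPEG: the same match, assembled from reusable parts.
local digit = R"09"                 -- any single digit
local two   = C(digit * digit)      -- two digits, captured
local time  = two * P":" * two * P":" * two * -P(1) -- -P(1) means "end of input"

local H, M, S = time:match("20:51:17")
print(h, m, s)
print(H, M, S)
```

Since `two` and `time` are plain Lua values, they can be reused inside a larger pattern (a full date, say) without copying any pattern text around.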
You can see all that if you check out my LPEG parsers repo:
https://github.com/spc476/LPeg-Parsers
but you can also check out some simplified examples in another repo I have:
https://github.com/spc476/LPeg-talk
For that one, I would go through the examples in the order listed.
date1.lua is about as simple as they come, a pattern to match a date like
"Wed, 2 Dec 2015 20:51:17 +0100".
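To give a flavor of what such a pattern can look like, here is a minimal sketch that matches that date format. This is NOT the contents of date1.lua; it is written from scratch for illustration, the field names in the result table are assumptions, and it does no validation of weekday or month names.

```lua
local lpeg = require "lpeg"
local P, R, S, C, Cg, Ct = lpeg.P, lpeg.R, lpeg.S, lpeg.C, lpeg.Cg, lpeg.Ct

local digit = R"09"
local alpha = R("AZ", "az")
local sp    = P" "^1

-- Capture the text matched by patt and convert it to a number.
local function num(patt)
  return C(patt) / function(text) return tonumber(text) end
end

local abbrev = C(alpha * alpha * alpha)        -- "Wed", "Dec" (not validated)
local two    = digit * digit
local zone   = C(S"+-" * two * two)            -- "+0100"

-- Cg(..., name) inside Ct stores each capture under that key.
local date = Ct(
    Cg(abbrev, "weekday") * P"," * sp
  * Cg(num(digit * digit^-1), "day") * sp      -- one or two digits
  * Cg(abbrev, "month") * sp
  * Cg(num(two * two), "year") * sp
  * Cg(num(two), "hour") * P":"
  * Cg(num(two), "min")  * P":"
  * Cg(num(two), "sec")  * sp
  * Cg(zone, "zone")
) * -P(1)                                      -- nothing may follow

local t = date:match("Wed, 2 Dec 2015 20:51:17 +0100")
print(t.weekday, t.day, t.month, t.year, t.hour, t.min, t.sec, t.zone)
```

The result is an ordinary Lua table, so the rest of the program never has to look at the original string again.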
Anyway, I blather ...
-spc