- Subject: Request for advice: pure Lua Library to parse mail messages.
- From: Lorenzo Donati <lorenzodonatibz@...>
- Date: Mon, 20 Jul 2020 17:19:07 +0200
Hi list!
I need to extract some information from some mail messages. Is there
some pure Lua library that can help me in the process?
The requirements are more or less the following:
* Pure Lua. Preferably simple and lightweight. Maybe short enough to be
embedded in a Lua script, or at least to reside in a single file side by
side with my script.
* Reliable, well-tested and foolproof. I don't know much about all the
RFCs that comprise the mail message format, but the library API should
be easy enough to let me extract the content of any header field and any
text part of the message. I have little time and expertise to cope with
corner cases where the library could fail because of bugs.
* It should handle quoted-printable encoding. In particular, it should
be able to convert from quoted-printable to UTF-8 automatically. I don't
strictly need other encodings, but converting to Windows CP-1252 as well
would be a bonus.
* It doesn't need to be able to handle all MIME types. Just text (both
plain and HTML).
* It should be able to enumerate every part of a multipart message with
its "local header" fields.
* MIT or similar license (no copyleft hassle). I don't mean to publish
the code, but I wouldn't want to have something in my code base that
needs tracking in the future (besides having an MIT license boilerplate
text inside, of course).
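
As a rough illustration of the quoted-printable requirement above, a minimal pure-Lua decoder might look like the sketch below. The function name `qp_decode` is made up for this example and does not come from any existing library. It only undoes the transfer encoding: the resulting bytes are in whatever charset the part declares, so the output is UTF-8 only if the part was UTF-8 to begin with, and converting from or to CP-1252 would need a separate mapping step that is not shown here.

```lua
-- Minimal sketch of a quoted-printable decoder (see RFC 2045, section 6.7).
-- "qp_decode" is a hypothetical name, not part of any existing library.
local function qp_decode(s)
  -- Normalise CRLF to LF so that soft line breaks are matched uniformly.
  s = s:gsub("\r\n", "\n")
  -- Drop trailing spaces/tabs before a line end (transport padding).
  s = s:gsub("[ \t]+\n", "\n")
  -- Remove soft line breaks: an "=" immediately before the line end.
  s = s:gsub("=\n", "")
  -- Replace each "=XX" hex escape with the byte it encodes.
  s = s:gsub("=(%x%x)", function(hex)
    return string.char(tonumber(hex, 16))
  end)
  return s
end
```

For example, `qp_decode("caf=C3=A9")` yields the five bytes of "café" encoded as UTF-8. Malformed escapes (an "=" not followed by two hex digits) are left untouched by this sketch.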
Ideally what I'd like to do is this:
- Read the message source saved manually from the Thunderbird mail client.
- Use the library to enumerate all the header fields and choose the ones
I need. Ideally I would get a Lua table mapping header field names to
their text content.
- Find the message part(s) I need (in a multipart message). The
selection would be made primarily using the MIME type of the part (I
just need access to the text/plain and text/html parts) and their
position in the multipart message.
- Get the decoded UTF-8 content of the message part(s) as a Lua string,
on which I would perform custom processing.
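
The workflow above could be sketched in plain Lua roughly as follows. This is an illustration under simplifying assumptions, not a robust parser: the names `split_message`, `get_boundary` and `split_parts` are hypothetical, repeated header fields (such as Received) overwrite each other, nested multiparts are not descended into, and encoded words in headers are not decoded. The returned part bodies are still transfer-encoded, so quoted-printable parts would need a separate decoding step.

```lua
-- Split a raw message (or a single MIME part) into a header table and a body.
-- Header names are lower-cased; if a field repeats, the last one wins.
local function split_message(raw)
  raw = raw:gsub("\r\n", "\n")
  local head, body
  if raw:sub(1, 1) == "\n" then
    head, body = "", raw:sub(2)          -- no header fields at all
  else
    head, body = raw:match("^(.-)\n\n(.*)$")
    if not head then head, body = raw, "" end
  end
  -- Unfold continuation lines: a line break followed by spaces or tabs.
  head = head:gsub("\n[ \t]+", " ")
  local headers = {}
  for name, value in (head .. "\n"):gmatch("([^:\n]+):[ \t]*([^\n]*)\n") do
    headers[name:lower()] = value
  end
  return headers, body
end

-- Extract the boundary parameter from a Content-Type value, if any.
local function get_boundary(content_type)
  if not content_type then return nil end
  return content_type:match('[Bb][Oo][Uu][Nn][Dd][Aa][Rr][Yy]="([^"]+)"')
      or content_type:match('[Bb][Oo][Uu][Nn][Dd][Aa][Rr][Yy]=([^;%s]+)')
end

-- Split a multipart body (already LF-normalised) into its parts, in order.
-- Each part is a table { headers = {...}, body = "..." }.
local function split_parts(body, boundary)
  local parts = {}
  local delim = "--" .. boundary
  local current  -- lines of the part being collected; nil outside any part
  for line in (body .. "\n"):gmatch("([^\n]*)\n") do
    if line == delim or line == delim .. "--" then
      if current then
        local headers, text = split_message(table.concat(current, "\n"))
        parts[#parts + 1] = { headers = headers, body = text }
      end
      -- A plain delimiter opens the next part; the closing one ends the list.
      current = (line == delim) and {} or nil
    elseif current then
      current[#current + 1] = line
    end
  end
  return parts
end
```

A typical use would be to call `split_message` on the whole file, pass `headers["content-type"]` to `get_boundary`, and then loop over the result of `split_parts`, picking the parts whose `headers["content-type"]` starts with `text/plain` or `text/html`.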
I think I could easily implement what I want directly, without a
library, except for the quoted-printable decoding part. But I know little
about the mail format beyond a quick glance at the related Wikipedia
articles, so I fear I could botch something obvious by simply creating
an ad-hoc "parser", and I don't have much time for this little project.
TIA for any useful advice and hints.
Cheers!
-- Lorenzo.