Hi list!

I need to extract some information from some mail messages. Is there some pure Lua library that can help me in the process?

The requirements are more or less the following:

* Pure Lua. Preferably simple and lightweight. Maybe short enough to be embedded in a Lua script, or at least to reside in a single file side by side with my script.

* Reliable, well-tested and foolproof. I don't know much about all the RFCs that comprise the mail message format, but the library API should be easy enough to let me extract the content of any header field and any text part of the message. I have little time and expertise to cope with corner cases where the library could fail because of bugs.

* It should handle quoted-printable encoding. In particular, it should be able to convert from quoted-printable to UTF-8 automatically. I don't strictly need other encodings, but converting to Windows CP-1252 as well would be a bonus.

* It doesn't need to be able to handle all MIME types. Just text (both plain and HTML).

* It should be able to enumerate every part of a multipart message with its "local header" fields.

* MIT or similar license (no copyleft hassle). I don't mean to publish the code, but I wouldn't want to have something in my code base that needs tracking in the future (besides having the MIT license boilerplate text inside, of course).
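
For reference, the quoted-printable requirement on its own is small enough to sketch in plain Lua. The following is only a rough sketch based on the decoding rules of RFC 2045, not code from any existing library; the names `qp_decode` and `latin1_to_utf8` are made up for illustration. The decoder returns raw bytes in whatever charset the part declares, so a part declared as UTF-8 needs no further conversion. The second helper assumes ISO-8859-1 input; CP-1252 differs in the 0x80-0x9F range and would need an extra lookup table.

```lua
-- Decode a quoted-printable string into raw bytes (sketch, RFC 2045 rules).
local function qp_decode(s)
  s = s:gsub("\r\n", "\n")        -- normalize line endings to LF
  s = s:gsub("[ \t]+\n", "\n")    -- drop trailing whitespace added in transport
  s = s:gsub("=\n", "")           -- remove soft line breaks
  -- Replace every =XX escape with the byte it encodes.
  s = s:gsub("=(%x%x)", function(hex)
    return string.char(tonumber(hex, 16))
  end)
  return s
end

-- Convert ISO-8859-1 bytes to UTF-8 (assumption: input really is Latin-1).
local function latin1_to_utf8(s)
  s = s:gsub("[\128-\255]", function(c)
    local b = c:byte()
    -- Code points 0x80-0xFF become a two-byte UTF-8 sequence.
    return string.char(0xC0 + math.floor(b / 64), 0x80 + b % 64)
  end)
  return s
end

print(qp_decode("caf=C3=A9 =\r\nau lait"))
```

Malformed escapes (an `=` not followed by two hex digits) are left untouched here; a robust library would have to decide how to report them.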


Ideally what I'd like to do is this:

- Read the message source saved manually from the Thunderbird mail client.

- Use the library to enumerate all the header fields and choose the ones I need. Ideally I would need a Lua table of header field names vs. their text content.

- Find the message part(s) I need (in a multipart message). The selection would be made primarily using the MIME type of the part (I just need access to the text/plain and text/html parts) and their position in the multipart message.

- Get the decoded UTF-8 content of the message part(s) as a Lua string, on which I would perform custom processing.
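
The steps above can be sketched roughly as follows in plain Lua. This is a naive illustration, not code from any existing library: the names `parse_entity`, `boundary_of`, `split_parts` and `leaf_parts` are hypothetical, only the first occurrence of a repeated header field is kept, and corner cases such as RFC 2047 encoded words in headers or boundaries that are prefixes of one another are ignored. Bodies are returned still encoded, as found in the message.

```lua
-- Split a message (or a MIME part) into a header table and a body string.
local function parse_entity(raw)
  raw = raw:gsub("\r\n", "\n")
  local head, body
  if raw:sub(1, 1) == "\n" then          -- part without any header fields
    head, body = "", raw:sub(2)
  else
    head, body = raw:match("^(.-)\n\n(.*)$")
    if not head then head, body = raw, "" end
  end
  head = head:gsub("\n[ \t]+", " ")      -- unfold continuation lines
  local headers = {}
  for line in head:gmatch("[^\n]+") do
    local name, value = line:match("^([^:%s]+)%s*:%s*(.*)$")
    if name and headers[name:lower()] == nil then
      headers[name:lower()] = value      -- keys are lower-cased field names
    end
  end
  return headers, body
end

-- Extract the boundary parameter from a Content-Type value, if any.
local function boundary_of(ctype)
  local key = "[Bb][Oo][Uu][Nn][Dd][Aa][Rr][Yy]%s*=%s*"
  return ctype:match(key .. '"([^"]+)"') or ctype:match(key .. "([^;%s]+)")
end

-- Cut a multipart body into the raw text of its parts.
local function split_parts(body, boundary)
  local parts, text = {}, "\n" .. body
  local delim = "\n--" .. boundary
  local pos, start = 1, nil
  while true do
    local s, e = text:find(delim, pos, true)   -- plain find, no patterns
    if not s then break end
    if start then parts[#parts + 1] = text:sub(start, s - 1) end
    if text:sub(e + 1, e + 2) == "--" then break end  -- closing delimiter
    local eol = text:find("\n", e + 1, true)
    if not eol then break end
    start, pos = eol + 1, eol
  end
  return parts
end

-- Collect all non-multipart parts, in order, descending into nested multiparts.
local function leaf_parts(raw, out)
  out = out or {}
  local headers, body = parse_entity(raw)
  local ctype = headers["content-type"] or "text/plain"
  local mime = (ctype:match("^%s*([^;%s]+)") or "text/plain"):lower()
  local boundary = mime:find("^multipart/") and boundary_of(ctype)
  if boundary then
    for _, part in ipairs(split_parts(body, boundary)) do
      leaf_parts(part, out)
    end
  else
    out[#out + 1] = { type = mime, headers = headers, body = body }
  end
  return out
end
```

With this, `parse_entity(source)` gives the top-level header table, and looping over `leaf_parts(source)` while checking `part.type` for `text/plain` or `text/html` selects the parts of interest; each `part.headers["content-transfer-encoding"]` then says whether the body still needs quoted-printable decoding.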

I think I could easily implement most of this directly, without a library, except for the quoted-printable decoding part. But I know little about the mail format beyond a quick look at the related Wikipedia articles, so I fear I could botch something obvious by simply creating an ad-hoc "parser", and I don't have much time for this little project.

TIA for any useful advice and hints.

Cheers!

-- Lorenzo.