- Subject: Re: LPEG vs. PCRE
- From: Florian Weimer <fw@...>
- Date: Sun, 22 Nov 2009 13:11:36 +0100
* Roberto Ierusalimschy:
> Even for regular languages, LPEG may be better than PCRE. As an extreme
> example, we have email addresses as defined in RFC 822. There is a real
> Perl module that validates such addresses using a regular expression
> that starts like this:
>
> (?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
> )+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:
> \r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(
> ?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[
> \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\0
> 31]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\
>
> and goes on for a total of 75 lines like those.
> (Mail::RFC822::Address: regexp-based address validation)
It's still incomplete because email addresses do not form a regular
language: comments can nest to arbitrary depth. See
<http://tools.ietf.org/html/rfc5322#page-11> and the mutually
recursive ccontent and comment productions.
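
To make the point about the ccontent and comment productions concrete, here is a rough sketch of a recursive matcher for RFC 5322 comments. It is only an illustration under simplifying assumptions: folding whitespace is reduced to spaces and tabs, the obsolete (obs-*) forms are ignored, and the names match_comment and is_ctext are made up for this example. The recursion is the part a true regular expression cannot express; a grammar-based tool such as LPEG can state the same rule directly.

```python
def is_ctext(c):
    # ctext = %d33-39 / %d42-91 / %d93-126
    # (printable ASCII except "(", ")" and "\")
    n = ord(c)
    return 33 <= n <= 39 or 42 <= n <= 91 or 93 <= n <= 126


def match_comment(s, i=0):
    """Return the index just past the comment starting at s[i], or None.

    comment  = "(" *([FWS] ccontent) [FWS] ")"
    ccontent = ctext / quoted-pair / comment
    Assumption: FWS is simplified to single spaces and tabs.
    """
    if i >= len(s) or s[i] != "(":
        return None
    i += 1
    while i < len(s):
        c = s[i]
        if c == ")":
            return i + 1
        if c == "(":
            # Nested comment: this recursive call is what makes the
            # language non-regular.
            j = match_comment(s, i)
            if j is None:
                return None
            i = j
        elif c == "\\":
            # quoted-pair = "\" (VCHAR / WSP)
            if i + 1 >= len(s):
                return None
            nxt = s[i + 1]
            if not (nxt in " \t" or 33 <= ord(nxt) <= 126):
                return None
            i += 2
        elif c in " \t" or is_ctext(c):
            i += 1
        else:
            return None
    return None  # ran out of input before the closing ")"


if __name__ == "__main__":
    for text in ["(a (nested (comment)) here)", "(unbalanced (comment)"]:
        print(repr(text), "->", match_comment(text))
```

A pattern without recursion can only handle comments up to some fixed nesting depth, which is why a regexp built that way stays incomplete no matter how long it gets.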