lua-users home
lua-l archive


> Hi Roberto,
> 
> > A quick survey, for those who care:
> > - should LPeg support utf-8?
> > - If so, what would that mean?
> 
> I love LPeg but don't see anything useful it could do with UTF-8 that it
> doesn't already do. LPeg already handles parsing UTF-8 fine (for those
> who don't know: UTF-8 is a superset of ASCII). Any built-in "magic"
> would only reduce the flexibility for users of LPeg, unless you're
> considering an add-on module like "re". That would be fine of course but
> I don't really see the need for it, given that modules like slnunicode
> are available.

The support for UTF-8 would not change the current "byte-oriented"
behavior. I am thinking more in terms of extra built-in patterns.
So, for instance, lpeg.utf8.point(n) would match n UTF-8 code points,
lpeg.utf8.set("...") would match any point present in the given string,
and lpeg.utf8.range(v1,v2) would match any point with a code between v1
and v2.

-- Roberto
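
To make the proposal concrete, here is a rough sketch of how the three
patterns could be approximated today on top of LPeg's existing
byte-oriented primitives. It is an illustration only, not the proposed
implementation: the table name utf8p and the helper decode are
hypothetical, and the character pattern is simplified (it checks lead
and continuation bytes but does not reject overlong encodings or
surrogates).

```lua
local lpeg = require "lpeg"
local P, R, C, Ct, Cmt = lpeg.P, lpeg.R, lpeg.C, lpeg.Ct, lpeg.Cmt

-- One UTF-8 encoded code point (simplified: no overlong/surrogate checks).
local cont = R("\128\191")                       -- continuation byte
local utf8char = R("\0\127")                     -- 1 byte (ASCII)
               + R("\194\223") * cont            -- 2 bytes
               + R("\224\239") * cont * cont     -- 3 bytes
               + R("\240\244") * cont * cont * cont  -- 4 bytes

-- Hypothetical helper: turn one encoded character into its code point.
local function decode(ch)
  local b1, b2, b3, b4 = ch:byte(1, -1)
  if not b2 then return b1 end
  if not b3 then return (b1 - 192) * 64 + (b2 - 128) end
  if not b4 then
    return (b1 - 224) * 4096 + (b2 - 128) * 64 + (b3 - 128)
  end
  return (b1 - 240) * 262144 + (b2 - 128) * 4096
       + (b3 - 128) * 64 + (b4 - 128)
end

local utf8p = {}  -- hypothetical stand-in for the proposed lpeg.utf8

-- Matches exactly n code points.
function utf8p.point(n)
  local p = P(true)
  for _ = 1, n do p = p * utf8char end
  return p
end

-- Matches any single code point that occurs in str.
function utf8p.set(str)
  local chars = Ct(C(utf8char)^0):match(str)
  local p = P(false)
  for _, ch in ipairs(chars) do p = p + P(ch) end
  return p
end

-- Matches any single code point whose code lies in [v1, v2].
function utf8p.range(v1, v2)
  return Cmt(C(utf8char), function(_, _, ch)
    local code = decode(ch)
    return code >= v1 and code <= v2
  end)
end
```

With these definitions, utf8p.point(2):match("h\195\169llo") should
skip "h" and the two-byte "é" and return position 4, while
utf8p.range(0x80, 0x7FF):match("a") should fail. A built-in version
could presumably avoid the match-time capture used in range, which is
the slow part of this sketch.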