Re: LPEG - next version

Subject: Re: LPEG - next version
From: Florian Weimer <fw@...>
Date: Fri, 12 Jun 2009 21:16:21 +0200

* Miles Bader:

> Florian Weimer <fw@deneb.enyo.de> writes:
>>> I must confess I am currently stuck. I think LPEG should support Unicode
>>> (through UTF-8), but I have no idea what "to support Unicode" means :)
>>
>> P(1) needs to turn into
>
> It seems there needs to be a clear distinction between "raw char" (given
> that lpeg is quite usable for binary data) and "unicode char".
>
> Making P(x) count utf8 chars would certainly be convenient for people
> reading utf8 files, but... it doesn't seem the cleanest thing in
> general....

Sure, this has to be optional.
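
To illustrate the distinction, here is a minimal sketch of how an
opt-in "UTF-8 char" pattern could sit next to the raw P(1). The names
raw_char, utf8_char and count are made up for this example and are not
part of LPeg. The pattern is deliberately simplified: it checks only
lead and continuation byte ranges, so it still accepts some overlong
and surrogate encodings.

```lua
local lpeg = require "lpeg"
local P, R, C, Ct = lpeg.P, lpeg.R, lpeg.C, lpeg.Ct

-- "Raw char": exactly one byte, which is what P(1) matches today.
local raw_char = P(1)

-- "UTF-8 char" (hypothetical helper): one encoded code point.
local cont = R("\128\191")                 -- continuation byte
local utf8_char = R("\0\127")              -- 1-byte sequence (ASCII)
  + R("\194\223") * cont                   -- 2-byte sequence
  + R("\224\239") * cont * cont            -- 3-byte sequence
  + R("\240\244") * cont * cont * cont     -- 4-byte sequence

-- Count how many `unit`s cover the whole subject string.
-- Returns nil if the subject cannot be split into such units.
local function count(unit, s)
  local t = lpeg.match(Ct(C(unit)^0) * -1, s)
  return t and #t
end

local s = "na\195\175ve"        -- "naive" with U+00EF, encoded as UTF-8
print(count(raw_char, s))       --> 6 (bytes)
print(count(utf8_char, s))      --> 5 (code points)
```

Binary data keeps working with raw_char, and only callers that ask
for utf8_char pay for the extra choices.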

By the way, I'm not sure whether something like grapheme cluster
matching can reasonably be implemented without special bytecode
support.  Right now, I fear the compiled program would be fairly
large and would contain rather long sequences of choices.
