- Subject: Re: UTF-8 patterns in Lua 5.3
- From: Hisham <h@...>
- Date: Sun, 20 Apr 2014 00:29:46 -0300
On 19 April 2014 23:00, Keith Matthews <keith.l.matthews@gmail.com> wrote:
> On Sat, Apr 19, 2014 at 1:00 PM, Jay Carlson <nop@nop.com> wrote:
>> As an aside, I like the demarcation point of "Lua does UTF-8, but it does
>> not know Unicode." It is always good to be clear what you are *not* trying
>> to do.
>
> I'm no Unicode expert, but this doesn't make sense to me. UTF-8 is
> merely a Unicode encoding, so of course Lua 5.3 work 2 "knows"
> Unicode. With the utf8 library, Lua can count, index and iterate over
> Unicode code points in UTF-8-encoded strings, and convert sequences of
> Unicode code points to UTF-8.
>
> Of course, it provides no Unicode algorithms since they typically need
> lookup tables larger than Lua itself, but these algorithms are now
> much easier to implement thanks to the utf8 library.

Which brings two honest questions to my mind:

1) Would one want to implement those algorithms in Lua? (Which carries
the sub-question 1.1: doesn't that lead back to "if you open that can
of worms you have to do the whole thing", in which case one would end
up just reimplementing something like ICU in Lua?)

2) If the answer to 1 is yes, would UTF-8 patterns help?

My hunch is to answer "1) some, sometimes (1.1: no); 2) yes", but I'm
far from sure about that.
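
For concreteness, here is a rough sketch of the utf8 operations Keith
lists, plus one case that shows where "UTF-8 but not Unicode" bites.
It assumes the function names from the final Lua 5.3 release (utf8.char,
utf8.len, utf8.offset, utf8.codepoint, utf8.codes); the work 2 snapshot
may differ in details.

```lua
-- Sketch only: assumes the utf8 library as shipped in Lua 5.3 final.

-- Convert a sequence of code points to a UTF-8 string ("h", U+00E9, "llo").
local s = utf8.char(0x68, 0xE9, 0x6C, 0x6C, 0x6F)

-- Count: code points versus bytes (U+00E9 takes two bytes in UTF-8).
local n_codepoints = utf8.len(s)
local n_bytes = #s

-- Index: byte position where the 3rd code point starts, then decode it.
local pos = utf8.offset(s, 3)
local third = utf8.codepoint(s, pos)

-- Iterate: collect every code point in order.
local seen = {}
for _, cp in utf8.codes(s) do
  seen[#seen + 1] = cp
end

-- Where the library stops: precomposed U+00E9 and "e" + U+0301
-- (combining acute) are canonically equivalent in Unicode, but the
-- utf8 library just sees one code point versus two. No normalization.
local precomposed = utf8.char(0xE9)
local decomposed = utf8.char(0x65, 0x301)
local same_bytes = (precomposed == decomposed)
```

Comparing the last two strings properly needs normalization tables,
which is exactly the kind of algorithm the library leaves out.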
-- Hisham