Re: Native unicode support?

lua-l archive

Subject: Re: Native unicode support?
From: Björn De Meyer <bjorn.demeyer@...>
Date: Thu, 27 Jun 2002 21:42:39 +0200

Peter Loveday wrote:
>
> I don't see how this deals with UTF-8 at all ?
> 
> Surely to do so you need to combine characters that
> are multi-byte prefixes, otherwise its just 8 bit
> ASCII ?
> 
> Love, Light and Peace,
> 
> - Peter Loveday
> Director of Development, eyeon Software

Well, Lua need neither validate nor interpret
UTF-8 multibyte character sequences.
It only has to /detect/ them, and that
is the beauty of UTF-8: any byte that has the
eighth (high) bit set, apart from 0xFE and 0xFF,
is part of a multibyte encoding of a Unicode character.
Furthermore, UTF-8 is compatible with 7-bit ASCII,
so we can be sure that these multibyte encodings
never contain a byte that looks like a 7-bit
whitespace, digit, or nonprinting character.
So any byte with the high bit set belongs to a multibyte
sequence representing a Unicode character, which for all
practical purposes is valid for use in an identifier.
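
To make the detection rule concrete, here is a minimal sketch in
Python (not Lua's actual lexer, which is written in C). The names
is_identifier_byte and scan_identifier are hypothetical, invented
for illustration. It assumes the usual ASCII identifier rules
(letters, digits, underscore, no leading digit) and simply accepts
every high-bit byte except 0xFE and 0xFF, without validating the
sequences:

```python
def is_identifier_byte(b: int, first: bool = False) -> bool:
    """Return True if byte value b may appear in an identifier.

    Hypothetical helper: high-bit bytes are accepted without any
    validation of the UTF-8 sequence they belong to.
    """
    if b >= 0x80:
        # High bit set: treat as part of a UTF-8 multibyte sequence,
        # except 0xFE and 0xFF, which never occur in UTF-8.
        return b not in (0xFE, 0xFF)
    ch = chr(b)
    if ch == "_" or ch.isalpha():
        return True
    # ASCII digits are allowed, but not as the first character.
    return ch.isdigit() and not first


def scan_identifier(data: bytes, start: int = 0) -> bytes:
    """Return the longest identifier-like byte run beginning at start."""
    end = start
    while end < len(data) and is_identifier_byte(data[end], first=(end == start)):
        end += 1
    return data[start:end]
```

For example, scanning the UTF-8 bytes of "naïve = 1" should stop at
the space, because no byte of a multibyte sequence can be mistaken
for ASCII whitespace.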

I assume you know the UTF-8 specs. If not,
take a gander here: http://czyborra.com/utf/#UTF-8

-- 
"No one knows true heroes, for they speak not of their greatness." -- 
Daniel Remar.
Björn De Meyer 
bjorn.demeyer@pandora.be
