lua-users home
lua-l archive


John Belmonte wrote:
> 
> This seems like misinformation.  From what I understand, unicode has a
> 31 bit space, and only about 21 bits of that is required to cover all
> characters in use today.  As for using unicode for Japanese, certainly
> it is possible and works well, as I've personally deployed unicode with
> UTF-8 encoding at Japanese websites.
> 
> Perhaps your experience is with some naive *encoding* of unicode that
> tries to stuff 21 bits into 16?  ;-)
> 

Yes, I do see that point. However, Unicode started out as 
just such a 16-bit-wide encoding, which Microsoft still uses
these days. So, historically speaking, Unicode is laden with the 
stigma of being too restrictive.

The other, more important problem which I mention here is CJK 
unification. If I am not mistaken, even in the Unicode of today, 
many Chinese, Japanese and Korean ideographs have not been 
included on the grounds of being historical forms, being 
"too similar" to other ideographs, being 
uncommon in usage, or being writable with
other characters.

Think about it. That's like saying that the q is not needed
in the roman alphabet because it's so similar to an o (just
one extra line), not used very commonly, and besides we could 
replace all q's by, for instance "kw". The kwestion is of 
course whether we would agree to such an intrusion into our 
Western languages. And people named Quinten might object 
to having to write their name as Kwinten.

I have heard from some Japanese people that they find Unicode 
culturally unacceptable exactly for these reasons. Maybe something
has changed at the Unicode Consortium, but I can't forget it's 
still Microsoft's brainchild, so I am wary of it.

95000 characters are now in Unicode, but by my estimate, 
there are probably a million different characters 
in all human languages of the past and the present. Lacking
are historical forms, rare scripts, regional variants, etc.
Maybe I have been misinformed. Maybe someone will be able to
reassure me with regard to my doubts.

More importantly, I am still interested in doing a 
small utf8-lib for Lua.
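
To give a rough idea of what the core of such a library would have to
do, here is a minimal sketch, written in Python rather than Lua purely
for illustration. The names utf8_encode and utf8_len are hypothetical,
and the sketch assumes the range 0..0x10FFFF (about 21 bits) rather
than the full 31-bit space; a real library would probably also want
decoding and validation.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point as UTF-8 (assumes the range 0..0x10FFFF)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp < 0x80:
        # Up to 7 bits: one byte, 0xxxxxxx (plain ASCII).
        return bytes([cp])
    if cp < 0x800:
        # Up to 11 bits: two bytes, 110xxxxx 10xxxxxx.
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # Up to 16 bits: three bytes, 1110xxxx 10xxxxxx 10xxxxxx.
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # Up to 21 bits: four bytes, 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


def utf8_len(data: bytes) -> int:
    """Count characters by skipping continuation bytes (10xxxxxx).

    Assumes the input is well-formed UTF-8; no validation is done.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)
```

Since every continuation byte starts with the bits 10, counting
characters needs no decoding at all, which is one reason a small
library seems feasible.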


-- 
"No one knows true heroes, for they speak not of their greatness." -- 
Daniel Remar.
Björn De Meyer 
bjorn.demeyer@pandora.be