Roberto Ierusalimschy <roberto@inf.puc-rio.br> wrote:

[UTF-8]
>But, then, my other question: what is the relationship between Windows CE 
>and Unicode? Why did everybody who tried to port Lua to Windows CE come up 
>with this subject? Why can't they just use this approach (UTF-8)?
>(this is pure ignorance on my part; I know nothing about Windows CE...) 

The machine uses wchar_t as the way to pass strings in and out, so
people tend to think their own code needs to work that way internally as
well.  IMO, this is not the case - it's a conversion issue, just like
converting numbers to/from printable form.  Or closer to home: just like
Lua converts numbers to and from doubles whenever it needs to interface
with things outside it.  If conversions are used only for information
going to/from the user interface, and for things like file names, then
they need not become a bottleneck.  As I said before, storing data in
files as anything other than UTF-8 would IMHO be a mistake.
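
To make the boundary idea concrete, here is a minimal sketch.  It is
written in Python purely for brevity, and it assumes the platform's
wchar_t strings are UTF-16 little-endian.  The names to_wide and
from_wide are hypothetical stand-ins for a thin conversion layer; they
are not taken from Lua or from any WinCE port.

```python
# Sketch: keep strings as UTF-8 bytes internally and convert only where
# the platform demands wide characters.
# Assumption: the wide-char API expects UTF-16 little-endian code units.

def to_wide(utf8_bytes):
    """Internal UTF-8 bytes -> UTF-16LE bytes for a wide-char API call."""
    return utf8_bytes.decode("utf-8").encode("utf-16-le")


def from_wide(wide_bytes):
    """UTF-16LE bytes returned by a wide-char API -> internal UTF-8 bytes."""
    return wide_bytes.decode("utf-16-le").encode("utf-8")


# A file name stays UTF-8 everywhere except at the call boundary.
name = "résumé.lua".encode("utf-8")
wide_name = to_wide(name)             # what would be handed to the OS
assert from_wide(wide_name) == name   # the round trip loses nothing
```

Only file names and user-interface text would pass through such a layer;
everything else stays in UTF-8 and never pays for the conversion.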

I'd say that if WinCE is considered the main universe, then wchar_t makes
sense, but from a broader perspective it makes less sense.  Encoding
characters as 16-bit shorts already causes trouble with character codes
above 65,535 (0xFFFF), which do not fit in a single 16-bit unit.  UTF-8
is compact, portable, endian-neutral, and capable of encoding every
Unicode character.  It's the equivalent of people writing words by
stringing characters together.
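
As a small illustration of both points (character codes beyond the 16-bit
range, and endianness), the sketch below encodes U+1D11E, a character
chosen here only as an example.  It is in Python for brevity, and the
helper utf16_units is a hypothetical name introduced for this example.

```python
# Sketch: one character above U+FFFF, encoded three ways.

def utf16_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, code point 0x1D11E

# One character, but it does not fit in a single 16-bit unit:
# UTF-16 has to split it into a surrogate pair (0xD834, 0xDD1E).
assert utf16_units(clef) == 2

# The 16-bit form depends on byte order...
assert clef.encode("utf-16-le") == b"\x34\xd8\x1e\xdd"
assert clef.encode("utf-16-be") == b"\xd8\x34\xdd\x1e"

# ...while UTF-8 is the same byte sequence on every machine.
assert clef.encode("utf-8") == b"\xf0\x9d\x84\x9e"
```

Code that assumes one 16-bit unit per character miscounts the first
case, and data written in one byte order has to be swapped on a machine
that uses the other; neither issue arises with the UTF-8 bytes.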

-jcw