A tip: using a real name on technical lists will tend to get you
a better response.

On Fri, Feb 18, 2005 at 11:53:03AM +0100, PA wrote:
> >For a truly i18n app it's probably easiest to always use UTF-8
> >internally.
> 
> This is what I would like to do, yes. How do I achieve that today with 
> the stock Lua distribution?

Probably by having the application that embeds Lua do the appropriate
conversion to UTF-8 at the entry/exit points, and by using setlocale()
to switch to a UTF-8 locale.
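
For the setlocale() half, stock Lua exposes the C call as os.setlocale, which
returns the name of the new locale, or nil when the request cannot be honored.
A minimal sketch, assuming at least one of the listed locale names is
installed (the names are guesses and vary by system, and the helper name is
made up for illustration):

```lua
-- Try each candidate locale in order; return the name the C library reports
-- for the first one it accepts, or nil if none is available.
local function set_first_available_locale(candidates, category)
  for _, name in ipairs(candidates) do
    local accepted = os.setlocale(name, category)
    if accepted then
      return accepted
    end
  end
  return nil
end

-- Hypothetical candidate list; which names exist depends on the system.
-- Only the "ctype" category is changed, so number formatting (LC_NUMERIC)
-- is left alone.
local utf8_locale = set_first_available_locale(
  { "en_US.UTF-8", "C.UTF-8", "en_US.utf8" }, "ctype")

if utf8_locale then
  print("LC_CTYPE is now " .. utf8_locale)
else
  print("no UTF-8 locale found; still using " .. os.setlocale(nil, "ctype"))
end
```

The conversion at the entry and exit points still has to happen in the host
application; this call only changes how the C library classifies characters.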

> >>My OS does have a default character set encoding, but how do I
> >>know about it?
> >It is ISO-8859-1 (Latin-1).
> 
> Always? ISO-8859-1? This is useless for three fourths of the world.

No, it's not always ISO-8859-1.  (I don't know where he got that answer.)
On Unix, it depends on the locale; on Windows, it's the ANSI codepage.  My
encoding is UTF-8, not ISO-8859-1.  The encoding on my Windows machine is
currently CP932 (Shift-JIS).
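
On Unix-like systems, one way to see this from stock Lua is to ask the C
library for the environment's locale and read the codeset out of its name.
A rough sketch only: the name format is platform-specific (typically
language_TERRITORY.codeset@modifier on Unix, and something like
Japanese_Japan.932 on Windows), and codeset_of is a made-up helper:

```lua
-- Extract the codeset part of a locale name: the text after the first "."
-- up to an optional "@modifier". Returns nil when the name has no codeset.
local function codeset_of(locale_name)
  local _, _, codeset = string.find(locale_name, "%.([^@]+)")
  return codeset
end

-- os.setlocale("", "ctype") switches LC_CTYPE to the locale named by the
-- environment and returns its name (or nil if the C library rejects it).
local native = os.setlocale("", "ctype")
print("native LC_CTYPE locale:", native)
print("codeset:", native and codeset_of(native) or "unknown")
```

A name such as "C" or "POSIX" carries no codeset at all, so the helper
returns nil there and the caller has to pick a default.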

> >>How do I tell Lua that everything I want to
> >>deal with is UTF-8 encoded and that is it?!?!
> >you don't - Lua doesn't care. Use the extension.
> 
> You mean your recently mentioned UTF-8 library? This is the extension 
> you are talking about? Is Lua itself going to ever support Unicode 
> directly one way or another?

"Support" in what way?  What operations do you want to do?  You can do things
like substring searches on UTF-8 without any extra code.  Regexes/gsub are
harder (e.g. "[äï]" should be a set of two letters, regardless of encoding).
I don't know if that's supported--it would come at a performance and
portability cost.
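
To make the difference concrete, here is a small sketch using only the stock
string library.  It assumes the text is valid UTF-8 with no embedded NUL
bytes; replace_chars is a made-up helper, not something Lua provides:

```lua
-- UTF-8 bytes are spelled as decimal escapes so the example does not depend
-- on the encoding this file is saved in.
local a_uml = "\195\164"                 -- U+00E4, a with diaeresis
local i_uml = "\195\175"                 -- U+00EF, i with diaeresis
local text  = "na" .. i_uml .. "ve M" .. a_uml .. "dchen"

-- Plain substring search (4th argument true) is a byte comparison, and that
-- is enough for UTF-8. The results are byte offsets, not character offsets.
local first, last = string.find(text, "M" .. a_uml .. "d", 1, true)
print(first, last)

-- A character class is a set of bytes, so "[äï]" becomes the byte set
-- {195, 164, 175} and matches each byte separately: 4 hits, not 2.
local _, byte_hits = string.gsub(text, "[" .. a_uml .. i_uml .. "]", "?")
print(byte_hits)

-- Workaround: walk the string one UTF-8 sequence at a time and test each
-- whole sequence against a table used as a set.
local function replace_chars(s, set, replacement)
  local count = 0
  local out = string.gsub(s, "[\1-\127\194-\244][\128-\191]*", function(ch)
    if set[ch] then
      count = count + 1
      return replacement
    end
    return ch
  end)
  return out, count
end

local out, char_hits = replace_chars(text, { [a_uml] = true, [i_uml] = true }, "?")
print(out, char_hits)
```

The workaround calls a Lua function for every character, which is a small
instance of the performance cost mentioned above.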

> >>For instance, let's
> >>assume that my application displays its data according to HTTP's
> >>Accept-Language header. One request is in de_DE, the next one in fr_FR
> >>and so on, while the application default language is en_US. How does
> >>all this fit together?
> >It doesn't - you just don't care.
> 
> Hmmm... what if I do care?

In this case, you'd probably want an interface to iconv.  Accepting multiple
tagged character sets in a single application is fairly atypical for i18n
support: most applications only need to be able to deal with the language of
the user.  Web browsers, mailers, etc. need more than that.

(If I were using Lua to extend a web browser, though, I'd probably convert the
text to UTF-8 before giving it to Lua, and not make Lua do the conversion.)
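
Stock Lua has no iconv binding, so general conversion needs either the host
application or a third-party module.  For the single case of ISO-8859-1 input,
though, the conversion is simple enough to sketch in plain Lua, because each
Latin-1 byte value is also its Unicode code point.  This only illustrates
converting at the entry point and is no replacement for iconv;
latin1_to_utf8 is a made-up name:

```lua
-- Convert an ISO-8859-1 string to UTF-8. Bytes below 128 are ASCII and pass
-- through unchanged; bytes 128..255 map to code points U+0080..U+00FF, which
-- UTF-8 encodes as two bytes each.
local function latin1_to_utf8(s)
  return (string.gsub(s, "[\128-\255]", function(c)
    local b = string.byte(c)
    if b < 192 then
      return string.char(194, b)        -- U+0080..U+00BF -> C2 80..BF
    else
      return string.char(195, b - 64)   -- U+00C0..U+00FF -> C3 80..BF
    end
  end))
end

-- "Mädchen" as it would arrive in ISO-8859-1 (ä is the single byte 228).
local incoming = "M\228dchen"
local internal = latin1_to_utf8(incoming)
print(string.len(incoming), string.len(internal))   -- byte lengths in and out
```

Anything tagged with another character set (Shift-JIS, say) needs real
conversion tables, which is where iconv comes in.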

> >Pick one of "ja" and "oui" and "yerpo".
> >Second, set the document's content type to "text/html; 
> >charset=ISO-8859-1"
> >in your webserver config and, better, also in the document's header.
> 
> My webserver is my application. There is no Deus ex Machina.

One generally extends a webserver with Lua--one doesn't write one *in* Lua.  :)

> >http://www.i18nguy.com/
> 
> I'm not interested in i18n in general. Only specifics on how to deal 
> with it in Lua.

If you don't understand i18n in general, you can't really understand
how to deal with it in Lua.

-- 
Glenn Maynard