- Subject: Re: The World According to Lua: How To?
- From: Glenn Maynard <glenn@...>
- Date: Fri, 18 Feb 2005 06:36:53 -0500
A tip: using a real name on technical lists will tend to get you
a better response.
On Fri, Feb 18, 2005 at 11:53:03AM +0100, PA wrote:
> >For a truly i18n app it's probably easiest to always use UTF-8
> >internally.
>
> This is what I would like to do, yes. How do I achieve that today with
> the stock Lua distribution?
Probably by having the application that embeds Lua convert to and from
UTF-8 at its entry/exit points, and calling setlocale() to select a
UTF-8 locale.
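A minimal sketch of that boundary conversion, written in Python purely for illustration (a C host embedding Lua would do the equivalent, e.g. through iconv). The helper names and the CP932 example input are assumptions, not part of any Lua API.

```python
# Hypothetical helpers: keep UTF-8 inside the application and convert
# only at the points where text enters or leaves it.

def to_internal(raw: bytes, external_encoding: str) -> bytes:
    """Entry point: decode bytes from the outside world, re-encode as UTF-8."""
    return raw.decode(external_encoding).encode("utf-8")

def from_internal(utf8: bytes, external_encoding: str) -> bytes:
    """Exit point: convert internal UTF-8 back to the external encoding."""
    return utf8.decode("utf-8").encode(external_encoding)

# Example: text arriving in CP932 (an assumed external encoding).
incoming = "日本語".encode("cp932")
internal = to_internal(incoming, "cp932")    # what the scripts would see
outgoing = from_internal(internal, "cp932")  # what leaves the application
```

Everything between the two calls only ever sees UTF-8, so the scripting side needs no knowledge of the external encoding.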
> >>My OS do have a default character set encoding, but how do I
> >>know about it?
> >It is ISO-8859-1 (Latin-1).
>
> Always? ISO-8859-1? This is useless for three fourth of world.
No, it's not always ISO-8859-1. (Don't know where he got that answer from.)
On Unix, it depends on the locale; on Windows, it's the ANSI code page. My
encoding is UTF-8, not ISO-8859-1. The encoding on my Windows machine is
currently CP932 (Shift-JIS).
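As a rough illustration of asking the platform instead of assuming an answer, here is a small Python sketch. The function name is hypothetical; as far as I know, `locale.getpreferredencoding()` consults the locale's codeset on Unix and the ANSI code page on Windows, but treat that as an assumption to verify on your platform.

```python
import locale

def default_encoding() -> str:
    """Return the encoding name the platform reports for the user's locale."""
    try:
        # Adopt the user's environment settings instead of the "C" locale.
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The environment names a locale this system lacks; keep the current one.
        pass
    return locale.getpreferredencoding(False)

print(default_encoding())
```

The result differs from machine to machine, which is the point: it is a property of the user's configuration, not a constant.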
> >>How do I tell Lua that everything I want to
> >>deal with is UTF-8 encoded and that is it?!?!
> >you don't - Lua doesn't care. Use the extension.
>
> You mean your recently mentioned UTF-8 library? This is the extension
> you are talking about? Is Lua itself going to ever support Unicode
> directly one way or another?
"Support" in what way? What operations do you want to do? You can do things
like substring searches on UTF-8 without any extra code. Patterns and gsub are
harder (e.g. "[äï]" should be a set of two letters, regardless of encoding).
I don't know if that's supported--it would come at a performance and
portability cost.
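To make the difference concrete, here is a small Python sketch that treats UTF-8 text as raw bytes, the way a byte-oriented string library would. It is only an illustration of the byte-level behaviour, not a statement about what Lua's pattern matcher does internally.

```python
import re

text = "a naïve café".encode("utf-8")

# Substring search works on raw UTF-8 bytes with no extra code.
found = "naïve".encode("utf-8") in text

# A byte-wise character class is different: in UTF-8, "ä" is C3 A4 and
# "ï" is C3 AF, so the class really contains the bytes C3, A4 and AF.
byte_class = re.compile("[äï]".encode("utf-8"))

# "ö" is C3 B6, so its lead byte C3 matches even though "ö" is not in the set.
false_hits = byte_class.findall("ö".encode("utf-8"))

# Matching on decoded characters gives the intended two-letter set.
char_class = re.compile("[äï]")
true_hits = char_class.findall("ö")
```

Getting the second behaviour requires the matcher to decode multi-byte sequences, which is where the performance and portability cost comes from.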
> >>For instance, lets
> >>assume that my application display its data according to HTTP's
> >>Accept-Language header. One request is in de_DE, the next one in fr_FR
> >>and so on, while the application default language is en_US. How does
> >>all this fit together?
> >It doesn't - you just don't care.
>
> Hmmm... what if I do care?
In this case, you'd probably want an interface to iconv. Accepting multiple
tagged character sets in a single application is fairly atypical for i18n
support: most applications only need to be able to deal with the language of
the user. Web browsers, mailers, etc. need more than that.
(If I were using Lua to extend a web browser, though, I'd probably convert the
text to UTF-8 before giving it to Lua, and not make Lua do the conversion.)
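A hedged sketch of what "convert before handing it to the script" could look like when each piece of input carries its own charset tag. It is in Python for brevity; the function name, the fallback choice and the sample inputs are all assumptions, and a C host would call iconv instead.

```python
import codecs

def to_utf8(body: bytes, declared_charset: str,
            fallback: str = "iso-8859-1") -> bytes:
    """Convert a body tagged with a charset label to UTF-8."""
    try:
        codec = codecs.lookup(declared_charset).name
    except LookupError:
        # Unknown label: falling back is a policy choice made for this
        # sketch, not a rule taken from any standard.
        codec = fallback
    return body.decode(codec, errors="replace").encode("utf-8")

# Two inputs with different tags, normalised before the script sees them.
tagged_inputs = [
    (b"Gr\xfc\xdfe", "ISO-8859-1"),          # "Grüße" in Latin-1
    ("Grüße".encode("utf-8"), "utf-8"),      # the same text in UTF-8
]
converted = [to_utf8(body, charset) for body, charset in tagged_inputs]
```

After this step both inputs are byte-for-byte identical, so the scripting side can stay encoding-agnostic.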
> >Pick one of "ja" and "oui" and "yerpo".
> >Second, set the document's content type to "text/html;
> >charset=ISO-8859-1"
> >in your webserver config and better also in the documents header.
>
> My webserver is my application. There is no Deus ex Machina.
One generally extends a webserver with Lua--one doesn't write one *in* Lua. :)
> >http://www.i18nguy.com/
>
> I'm not interested in i18n in general. Only specifics on how to deal
> with it in Lua.
If you don't understand i18n in general, you can't really understand
how to deal with it in Lua.
--
Glenn Maynard