Hi,

Matt Campbell wrote:
> However, it's problematic on Windows; 
> someone please correct me if I'm wrong, but I believe that UTF-8 is 
> never (or rarely) the encoding associated with the system locale on 
> Windows.

The standard codepage is WINDOWS-1252 in most western countries.
It's unlikely to be set to UTF-8 because this codepage mapping
exists only for compatibility with MS-DOS, Win16 and old Win32 apps.

E.g. a filename passed to CreateFileA() is converted to UTF-16
using the codepage before being handed over to CreateFileW(). Since
fopen() is just a wrapper around CreateFileA(), you won't get very
far using UTF-8 file names with Lua's io.* library ...
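
To illustrate why this goes wrong, here is a minimal sketch of what an
8-bit codepage conversion does to UTF-8 input. The function name
widen_as_latin1() is hypothetical, and it simplifies by treating the
codepage as ISO-8859-1, which matches WINDOWS-1252 only outside the
0x80-0x9F range. It is not the actual Windows conversion routine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: widen each byte to one UTF-16 code unit, the way
 * a single-byte codepage conversion would. Simplification: the codepage
 * is treated as ISO-8859-1 (real WINDOWS-1252 differs for 0x80-0x9F).
 * Returns the number of code units written, excluding the terminator. */
size_t widen_as_latin1(const char *in, uint16_t *out, size_t cap)
{
    size_t n = 0;
    if (cap == 0)
        return 0;
    for (; in[n] != '\0' && n + 1 < cap; n++)
        out[n] = (unsigned char)in[n];  /* one byte in, one unit out */
    out[n] = 0;
    return n;
}
```

Feeding it the two UTF-8 bytes for U+00E9 (0xC3 0xA9) yields the two
code units 0x00C3 and 0x00A9 instead of the single unit 0x00E9, so the
file system sees a different name than the one intended.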

Most Windows-centric applications and libraries only use the
UTF-16 APIs and don't care about the codepage at all. Most
applications ported from POSIX systems use the ISO/ANSI 8-bit
APIs. They'll get into trouble when they do anything other than
1:1 forwarding of file names, resource names and the like.

The recipe for portability is to ignore the locale and do
everything in UTF-8 internally (e.g. for string handling). Only
convert at the boundaries to/from the 'widest' API available
(e.g. for file I/O or GUIs).
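
A boundary conversion along these lines could be sketched as follows.
The function name utf8_to_utf16() and its error convention are my own
assumptions for illustration; on Windows, MultiByteToWideChar() with
CP_UTF8 does this job, and the result would then go to a wide API such
as CreateFileW() or _wfopen().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode NUL-terminated UTF-8 into UTF-16.
 * Returns the number of code units written (excluding the terminator),
 * or (size_t)-1 on malformed input or insufficient space. */
size_t utf8_to_utf16(const char *s, uint16_t *out, size_t cap)
{
    /* Smallest code point allowed per sequence length (rejects overlongs). */
    static const uint32_t min_cp[4] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *)s;
    size_t n = 0;

    while (*p) {
        uint32_t cp;
        int extra, i;

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return (size_t)-1;       /* stray continuation or invalid lead */
        p++;

        for (i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)  /* also catches a premature NUL */
                return (size_t)-1;
            cp = (cp << 6) | (*p & 0x3F);
        }
        if (cp < min_cp[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1;

        if (cp < 0x10000) {
            if (n + 1 >= cap) return (size_t)-1;
            out[n++] = (uint16_t)cp;
        } else {                      /* encode as a surrogate pair */
            if (n + 2 >= cap) return (size_t)-1;
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    if (n >= cap) return (size_t)-1;
    out[n] = 0;
    return n;
}
```

The application keeps plain char* UTF-8 strings everywhere else and
calls something like this only right before a wide file or GUI call.
The system codepage never enters the picture.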

Bye,
     Mike