- Subject: Re: question about Unicode
- From: Mike Pall <mikelu-0612@...>
- Date: Mon, 4 Dec 2006 20:09:01 +0100
Matt Campbell wrote:
> However, it's problematic on Windows;
> someone please correct me if I'm wrong, but I believe that UTF-8 is
> never (or rarely) the encoding associated with the system locale on
The standard codepage is WINDOWS-1252 in most western countries.
It's unlikely to be set to UTF-8 because this codepage mapping is
only for compatibility with MS-DOS, Win16 and old Win32 apps.
E.g. a filename passed to CreateFileA() is converted to UTF-16
using the codepage before it is handed over to CreateFileW(). Since
fopen() is just a wrapper around CreateFileA(), you won't get very
far using UTF-8 file names with Lua's io.* library ...
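
To make the failure concrete, here is a rough sketch in portable C of what that codepage step does to UTF-8 bytes when the codepage is WINDOWS-1252. The function name cp1252_to_utf16() is made up for this illustration; it is not a Windows API. Mapping the five unassigned bytes to the C1 control of the same value is an assumption about the system's best-fit behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* WINDOWS-1252 code points for bytes 0x80..0x9F. The five unassigned
   bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) are mapped to the C1 control of
   the same value; that choice is an assumption for this sketch. */
static const uint16_t cp1252_high[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
};

/* Hypothetical helper: widen n bytes the way a WINDOWS-1252 "A" call
   would. Every byte becomes exactly one UTF-16 code unit, so `out`
   must have room for n units. */
void cp1252_to_utf16(const unsigned char *in, size_t n, uint16_t *out)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned char b = in[i];
        /* Outside 0x80..0x9F the codepage matches ISO-8859-1. */
        out[i] = (b >= 0x80 && b <= 0x9F) ? cp1252_high[b - 0x80] : b;
    }
}
```

Feeding it the two UTF-8 bytes of "é" (0xC3 0xA9) yields the two code units U+00C3 U+00A9, i.e. "Ã©", so the file ends up with a different name than the one intended.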
Most Windows-centric applications and libraries only use the
UTF-16 APIs and don't care about the codepage at all. Most
applications ported from POSIX systems use the ISO/ANSI 8-bit
APIs. They'll get into trouble when they do anything other than
1:1 forwarding of file names, resource names and the like.
The recipe for portability is to ignore the locale and do
everything in UTF-8 internally (e.g. for string handling). Only
convert at the boundaries to/from the 'widest' API available
(e.g. for file I/O or GUIs).
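
A minimal sketch of such a boundary conversion, written as portable C so it can be tried anywhere. The name utf8_to_utf16() and its signature are invented for this example. On Windows itself one would more likely call MultiByteToWideChar() with CP_UTF8 and pass the result to CreateFileW() or _wfopen().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert n bytes of UTF-8 to UTF-16 code units.
   Returns the number of units written to `out` (capacity `cap`), or
   (size_t)-1 on malformed input or insufficient space. */
size_t utf8_to_utf16(const unsigned char *in, size_t n,
                     uint16_t *out, size_t cap)
{
    size_t i = 0, o = 0;
    assert(in != NULL && out != NULL);
    while (i < n) {
        unsigned char b = in[i++];
        uint32_t cp;
        size_t need;  /* number of continuation bytes expected */

        if (b < 0x80)                    { cp = b;        need = 0; }
        else if (b >= 0xC2 && b <= 0xDF) { cp = b & 0x1F; need = 1; }
        else if (b >= 0xE0 && b <= 0xEF) { cp = b & 0x0F; need = 2; }
        else if (b >= 0xF0 && b <= 0xF4) { cp = b & 0x07; need = 3; }
        else return (size_t)-1;          /* stray or invalid lead byte */

        if (n - i < need) return (size_t)-1;       /* truncated */
        for (size_t k = 0; k < need; k++) {
            unsigned char c = in[i++];
            if ((c & 0xC0) != 0x80) return (size_t)-1;
            cp = (cp << 6) | (c & 0x3F);
        }
        /* Reject overlong forms, surrogates and values past U+10FFFF. */
        if ((need == 2 && cp < 0x800) || (need == 3 && cp < 0x10000) ||
            (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            return (size_t)-1;

        if (cp < 0x10000) {
            if (cap - o < 1) return (size_t)-1;
            out[o++] = (uint16_t)cp;
        } else {                         /* encode as a surrogate pair */
            if (cap - o < 2) return (size_t)-1;
            cp -= 0x10000;
            out[o++] = (uint16_t)(0xD800 | (cp >> 10));
            out[o++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    return o;
}
```

With this, the UTF-8 bytes 0xC3 0xA9 come out as the single unit U+00E9, and anything outside the BMP becomes a surrogate pair. The rest of the program never needs to see UTF-16 or the codepage.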