- Subject: RE: newbie - Lua and unicode
- From: Phil Teschner <philt@...>
- Date: Thu, 14 Sep 2006 10:42:24 -0700
> I agree with Lisa, but on Windows, UTF-16 is almost unavoidable (even if
> MS provide functions to convert to UTF-8, which could be useful for
> processing in Lua).
It is not just Windows that uses UTF-16. The Wikipedia article on UTF-16 lists some common operating systems and applications that use it:
* Everything Microsoft - Windows (including Pocket PC) and applications
* MacOS X and applications
* Symbian (phone/mobile OS) [Symbian]
* Qualcomm BREW (phone/mobile OS)
* SAP [SAP]
* Sybase [Sybase]
* International Components for Unicode [ICU]
* Rosette Core Library for Unicode [Rosette]
* Modern, widespread browsers: IE, Mozilla, Opera
* XML DOM 2.0 API and popular parsers (e.g. Apache Xerces)
* KDE/Qt and applications
* Modern programming languages
  * All .Net languages (C#, J#, VB.Net, etc.)
  * Python 1.6 (see Unicode in Python [Python])
  * Ada 95 [Ada95]
  * Enterprise Cobol [Cobol]
If you really don't want to use UTF-16 on Windows, you can use MultiByteToWideChar and WideCharToMultiByte to convert between the different representations (see MSDN: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_9i79.asp).
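To make the conversion concrete, here is a minimal, portable sketch in Python of the same kind of round trip between UTF-16 and UTF-8. It does not call the Win32 functions named above; it only illustrates what such a conversion does to the bytes. The function names utf16_to_utf8 and utf8_to_utf16le are made up for this example, and treating input without a BOM as little-endian is an assumption based on the usual Windows byte order.

```python
import codecs


def utf16_to_utf8(raw: bytes, default: str = "utf-16-le") -> bytes:
    """Convert UTF-16 bytes (with or without a BOM) to UTF-8 bytes.

    Hypothetical helper for illustration only.
    """
    if raw.startswith(codecs.BOM_UTF16_LE):
        # Little-endian BOM (FF FE): strip it and decode as UTF-16LE.
        text = raw[2:].decode("utf-16-le")
    elif raw.startswith(codecs.BOM_UTF16_BE):
        # Big-endian BOM (FE FF): strip it and decode as UTF-16BE.
        text = raw[2:].decode("utf-16-be")
    else:
        # No BOM: assume little-endian, the usual byte order on Windows.
        text = raw.decode(default)
    return text.encode("utf-8")


def utf8_to_utf16le(raw: bytes) -> bytes:
    """Convert UTF-8 bytes to UTF-16LE bytes without a BOM (hypothetical helper)."""
    return raw.decode("utf-8").encode("utf-16-le")
```

A log file read in binary mode could be passed through utf16_to_utf8 once, after which the resulting UTF-8 bytes can be handled by byte-oriented string code such as Lua's.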
From: firstname.lastname@example.org [mailto:email@example.com] On Behalf Of Philippe Lhoste
Sent: Thursday, September 14, 2006 9:49 AM
Subject: Re: newbie - Lua and unicode
Klaus Ripke wrote:
> On Wed, Sep 13, 2006 at 06:24:17PM +0300, Theodor-Iulian Ciobanu wrote:
>> What modules do I need to be able to use unicode with Lua? (especially parsing of logs).
>> And is there a way to use both ANSI and Unicode?
> Yes, the slnunicode package provides two single-byte modules (ascii and latin1)
> as well as two multi-byte modules (utf8 and grapheme) with full support
> for all Unicode character classes, upper/lower etc in UTF-8.
> Conversion between other Unicode encodings like UTF-16 native/BE/LE/BOM
> and UTF-8 is trivial.
> As Lisa pointed out, you should avoid UTF-16 like the plague.
Well, he is on Windows (XP or 2k, I suppose) and he is parsing log files
which might be generated by some Windows tools, and so use the native
encoding.
I agree with Lisa, but on Windows, UTF-16 is almost unavoidable (even if
MS provide functions to convert to UTF-8, which could be useful for
processing in Lua).
The resource you gave is interesting, thanks.
-- (near) Paris -- France
-- -- -- -- -- -- -- -- -- -- -- -- -- --