- Subject: RE: newbie - Lua and unicode
- From: Phil Teschner <philt@...>
- Date: Thu, 14 Sep 2006 10:42:24 -0700
> I agree with Lisa, but on Windows, UTF-16 is almost unavoidable (even if
> MS provide functions to convert to UTF-8, which could be useful for
> processing in Lua).
It is not just Windows that uses UTF-16. The Wikipedia article on UTF-16 lists some common operating systems and applications that use it:
* Everything Microsoft - Windows (including Pocket PC) and applications
* MacOS X and applications
* Symbian (phone/mobile OS) [Symbian]
* Qualcomm BREW (phone/mobile OS)
* SAP [SAP]
* Sybase [Sybase]
* International Components for Unicode [ICU]
* Rosette Core Library for Unicode [Rosette]
* Modern, widespread browsers: IE, Mozilla, Opera
* XML DOM 2.0 API and popular parsers (e.g. Apache Xerces)
* KDE/Qt and applications
* OpenOffice
* Modern programming languages:
  * Java
  * ECMAScript (JavaScript/JScript)
  * All .Net languages (C#, J#, VB.Net, etc.)
  * Python 1.6 (see Unicode in Python [Python])
  * Ada 95 [Ada95]
  * Enterprise Cobol [Cobol]
If you really don't want to use UTF-16 on Windows, you can use MultiByteToWideChar and WideCharToMultiByte to convert between UTF-16 and other encodings, including UTF-8 via the CP_UTF8 code page (see MSDN: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_9i79.asp).
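For log files already on disk, the same re-encoding can be done without the Win32 API. Here is a minimal sketch in Python, not taken from this thread. The function name utf16_to_utf8 and the little-endian fallback for input with no BOM are assumptions made for illustration.

```python
import codecs


def utf16_to_utf8(raw: bytes) -> bytes:
    """Re-encode UTF-16 bytes as UTF-8, honouring a BOM if one is present."""
    if raw.startswith(codecs.BOM_UTF16_LE):
        text = raw[2:].decode("utf-16-le")
    elif raw.startswith(codecs.BOM_UTF16_BE):
        text = raw[2:].decode("utf-16-be")
    else:
        # Assumption: no BOM means little-endian, the usual order on Windows.
        text = raw.decode("utf-16-le")
    # Surrogate pairs are combined by the decoder, so characters outside
    # the Basic Multilingual Plane come out as 4-byte UTF-8 sequences.
    return text.encode("utf-8")


if __name__ == "__main__":
    sample = codecs.BOM_UTF16_LE + "log entry: caf\u00e9".encode("utf-16-le")
    print(utf16_to_utf8(sample))
```

The resulting bytes can then be handed to Lua, which treats strings as plain byte sequences, and processed with a UTF-8 aware module.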
Phil
-----Original Message-----
From: lua-bounces@bazar2.conectiva.com.br [mailto:lua-bounces@bazar2.conectiva.com.br] On Behalf Of Philippe Lhoste
Sent: Thursday, September 14, 2006 9:49 AM
To: lua@bazar2.conectiva.com.br
Subject: Re: newbie - Lua and unicode
Klaus Ripke wrote:
> On Wed, Sep 13, 2006 at 06:24:17PM +0300, Theodor-Iulian Ciobanu wrote:
>> What modules do I need to be able to use unicode with Lua? (especially parsing of logs).
> http://lua-users.org/wiki/LuaUnicode
>> And is there a way to use both ANSI and Unicode?
> Yes, the snlunicode package provides two single-byte modules (ascii and latin1)
> as well as two multi-byte modules (utf8 and grapheme) with full support
> for all Unicode character classes, upper/lower etc in UTF-8.
> Conversion between other Unicode encodings like UTF-16 native/BE/LE/BOM
> and UTF-8 is trivial.
>
> As Lisa pointed out, you should avoid UTF-16 like the plague.
Well, he is on Windows (XP or 2k, I suppose) and he is parsing log files
which might be generated by some Windows tools, and so use the native
encoding: UTF-16.
I agree with Lisa, but on Windows, UTF-16 is almost unavoidable (even if
MS provide functions to convert to UTF-8, which could be useful for
processing in Lua).
The resource you gave is interesting, thanks.
--
Philippe Lhoste
-- (near) Paris -- France
-- http://Phi.Lho.free.fr
-- -- -- -- -- -- -- -- -- -- -- -- -- --