- Subject: Re: Chinese characters in a string
- From: Rob Kendrick <rjek@...>
- Date: Tue, 4 Dec 2012 13:13:37 +0000
On Tue, Dec 04, 2012 at 02:10:43PM +0100, alessandro codenotti wrote:
>
> Hello, I moved to China some 3 months ago, and now that I'm starting to speak the language I'm also starting to write programs that have to operate on strings containing Chinese characters. I noticed that the string functions behave in a strange way:
>
> a="我叫李乐"
> print(string.sub(a,1,4)) -->我叫
> print(string.sub(a,1,5)) -->我叫?
> print(string.len(a)) -->8
Lua's string functions operate on bytes, not characters, so they are
only really useful for ASCII text. Lua strings themselves can contain
arbitrary bytes, though, and are thus quite safe for holding UTF-8 data.
To manipulate such strings usefully you need a Unicode-aware string
library.
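
As a rough illustration of what "Unicode-aware" means here, the following is a minimal pure-Lua sketch that counts and slices by character rather than by byte. It assumes the string is valid UTF-8 and that the source file is saved as UTF-8. The byte counts quoted above (8 bytes for four characters) suggest a two-byte encoding such as GBK, in which case this sketch would not apply as written. The names UTF8_CHAR, utf8_len and utf8_sub are made up for this example and are not part of Lua's standard string library.

```lua
-- One UTF-8 character: a lead byte (ASCII or 0xC2-0xF4) followed by
-- zero or more continuation bytes (0x80-0xBF).
-- Assumption: input is valid UTF-8 with no embedded NUL bytes; nothing is validated.
local UTF8_CHAR = "[\1-\127\194-\244][\128-\191]*"

-- Count characters instead of bytes.
local function utf8_len(s)
  local count = 0
  for _ in s:gmatch(UTF8_CHAR) do
    count = count + 1
  end
  return count
end

-- Return characters i..j (1-based, inclusive), counted in characters.
local function utf8_sub(s, i, j)
  local chars = {}
  local n = 0
  for ch in s:gmatch(UTF8_CHAR) do
    n = n + 1
    if n >= i and n <= j then
      chars[#chars + 1] = ch
    end
  end
  return table.concat(chars)
end

local a = "我叫李乐"
print(#a)                 --> 12 (bytes, when the file is UTF-8)
print(utf8_len(a))        --> 4
print(utf8_sub(a, 1, 2))  --> 我叫
```

This walks the whole string on every call, which is fine for short strings; a real library would also validate the input and handle negative indices the way string.sub does.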
B.