- Subject: Re: Unicode and UTF-8 the Lua way, mid-discussion (was Re: What do you miss most in Lua)
- From: Patrick Rapin <toupie300@...>
- Date: Thu, 9 Feb 2012 22:21:35 +0100
> Then if you have an ichars() method that returns a codepoint iterator,
> and you have the same API for other encodings, converting from one
> encoding to another becomes pretty easy:
>
> u8encoded = utf8.char(utf16.ichars(u16encoded))
Note that even without iterators it can be very easy (provided a utf16
module works like utf8):
u8encoded = utf8.char{ utf16.codepoint(u16encoded, 1, -1) }
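To make the one-liner above concrete, here is a minimal sketch. It assumes a Lua 5.3 or later interpreter, whose standard utf8 library and string.unpack postdate this message. No utf16 module exists in standard Lua, so utf16le_codepoints and utf16le_to_utf8 are hypothetical helpers written for illustration; they assume well-formed little-endian input and do no validation.

```lua
-- Hypothetical helper (not a standard function): decode a well-formed
-- UTF-16LE byte string into a list of code points.
local function utf16le_codepoints(s)
  local cps = {}
  local i = 1
  while i <= #s do
    -- Read one 16-bit little-endian code unit.
    local unit = string.unpack("<I2", s, i)
    i = i + 2
    if unit >= 0xD800 and unit <= 0xDBFF then
      -- High surrogate: combine it with the low surrogate that follows.
      local low = string.unpack("<I2", s, i)
      i = i + 2
      unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
    end
    cps[#cps + 1] = unit
  end
  return cps
end

-- Hypothetical helper: re-encode as UTF-8 with the standard utf8.char,
-- which takes code points as separate arguments in Lua 5.3+.
local function utf16le_to_utf8(s)
  return utf8.char(table.unpack(utf16le_codepoints(s)))
end
```

Because table.unpack pushes every code point onto the stack, this form only suits short strings; a longer input would need to be converted in chunks.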
I like the utf8.ichars iterator idea. But thinking about it, I can't
find a situation where it would be better suited than the utf8.codepoint
function.
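For comparison, a small sketch of the two styles. It assumes Lua 5.3 or later, where utf8.codes plays roughly the role of the ichars iterator proposed in this thread (the name ichars itself is not part of any standard library). Both forms collect the same code points.

```lua
-- Build a short test string from three code points: "h", "é", "€".
local sample = utf8.char(0x68, 0xE9, 0x20AC)

-- Iterator style: utf8.codes yields (byte position, code point) pairs.
local from_iter = {}
for _, cp in utf8.codes(sample) do
  from_iter[#from_iter + 1] = cp
end

-- Direct style: utf8.codepoint returns all code points in the byte range
-- as multiple values, collected here into a table.
local from_call = { utf8.codepoint(sample, 1, -1) }
```

The iterator avoids materializing every code point at once, which matters mostly for long strings; for short ones the two are interchangeable.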
The utf8.char function could do it; still, the next requested feature
will be the ability to write HAKṢHMALAWARAYAṀ as
"\uf67\uf90\ufb5\ufa8\ufb3\ufba\ufbc\ufbb\uf82" ...
- References:
- Unicode and UTF-8 the Lua way, mid-discussion (was Re: What do you miss most in Lua), Jay Carlson
- Re: Unicode and UTF-8 the Lua way, mid-discussion (was Re: What do you miss most in Lua), Dirk Laurie
- Re: Unicode and UTF-8 the Lua way, mid-discussion (was Re: What do you miss most in Lua), Rob Hoelz
- Re: Unicode and UTF-8 the Lua way, mid-discussion (was Re: What do you miss most in Lua), Sam Roberts
- Re: Unicode and UTF-8 the Lua way, mid-discussion (was Re: What do you miss most in Lua), Roberto Ierusalimschy
- Re: Unicode and UTF-8 the Lua way, mid-discussion (was Re: What do you miss most in Lua), Duncan Cross