- Subject: Re: Unicode and UTF-8 the Lua way, mid-discussion (was Re: What do you miss most in Lua)
- From: Miles Bader <miles@...>
- Date: Fri, 10 Feb 2012 11:21:25 +0900
Miles Bader <miles@gnu.org> writes:
> -- iterate over STRING, outputting a single character at a time
> while start < end do
> local codepoint = utf8.codepoints (string, start)
> output_unicode_codepoint_to_mumble (mumble, codepoint)
> start = utf8.byteoffset (string, 1, start) -- increment START
> end
Hmm, maybe that's wrong; maybe it should be
utf8.byteoffset (string, 2, start), if character 1 counted from START
is the character at START itself?
This is a case where 1-based indexing is a little confusing, since
the operation I really want is "give me a byteoffset which is +N utf8
characters from a start byteoffset" -- so I expect +1 to be "next
character", 0 to be "no change", and maybe -1 "previous character".
I dunno, maybe it's better to have a separate function for this?
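
A rough sketch of what such a separate function might look like, in
plain Lua with no library support. The name utf8_advance and its
nil-on-out-of-range behaviour are made up for illustration and are not
part of any proposed API; it also assumes the string is valid UTF-8
and that START points at the first byte of a character:

```lua
-- Hypothetical helper, not part of any proposed utf8 library.
-- Returns the byte offset N characters away from byte offset START:
--   n =  0  -> START unchanged
--   n =  1  -> start of the next character (may be #s + 1, one past the end)
--   n = -1  -> start of the previous character
-- Returns nil if the move would leave the string.
-- Assumes S is valid UTF-8 and START is the first byte of a character.

-- UTF-8 continuation bytes have the form 10xxxxxx (0x80..0xBF).
local function is_continuation (byte)
   return byte ~= nil and byte >= 0x80 and byte < 0xC0
end

local function utf8_advance (s, start, n)
   local pos = start
   while n > 0 do
      if pos > #s then return nil end   -- already past the last character
      pos = pos + 1
      while is_continuation (s:byte (pos)) do pos = pos + 1 end
      n = n - 1
   end
   while n < 0 do
      if pos <= 1 then return nil end   -- already at the first character
      pos = pos - 1
      while pos > 1 and is_continuation (s:byte (pos)) do pos = pos - 1 end
      n = n + 1
   end
   return pos
end

-- Usage: collect the starting byte offset of every character.
local s = "a\195\169b"        -- "a", U+00E9 (two bytes), "b"
local starts = {}
local pos = 1
while pos <= #s do
   starts[#starts + 1] = pos
   pos = utf8_advance (s, pos, 1)   -- +1 means "next character"
end
```

With relative counts like this, the loop increment reads as "+1
character" and there is no question of whether 1 or 2 is meant.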
-Miles
--
Monday, n. In Christian countries, the day after the baseball game.