Re: What do you miss most in Lua


Subject: Re: What do you miss most in Lua
From: Tim Mensch <tim-lua-l@...>
Date: Tue, 07 Feb 2012 10:55:51 -0700

On 2/7/2012 4:13 AM, David Given wrote:

What do you mean by a 'character'? A Unicode code point?

I can't speak to what HyperHacker means, but when I manipulated UTF-8 for handling internationalization on over 50 games, I never had to deal with a SINGLE instance of grapheme clusters breaking the code. I ignored them, and those 50 games were translated into at least 3 languages each, and some as many as 9 languages (including Chinese, Japanese, and Arabic).

I would suggest that a LOT of UTF-8 usage in the real world follows that pattern; not everyone is writing text entry fields.

If you're having to deal with completely arbitrary Unicode, then yes, you need to deal with grapheme clusters. An optimization we added for UTF-8 code points would actually be useful there, though: In each string we cached the last code point offset requested. If you asked for s[i], it would find the i'th code point, and remember the binary offset at that code point, so loops like:


for (int i=0; i<s.length(); ++i)
{
    // do something with s[i]
}

...would only need to "walk" the string from the last code point, which is an O(1) operation up or down. You could use the same cache to "walk" grapheme clusters, and then it's mostly O(1), unless you have crazy large clusters making up the text, at which point it's O(M) where M is cluster length.
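
To make the caching idea concrete, here is a minimal sketch of how such a string might look. It is not the code from those games: the class name CachedUtf8, the method offset_of, and the member names are all hypothetical, and it assumes the bytes are already valid UTF-8. It remembers one index/offset pair and walks forward or backward from it, covering code points only, not grapheme clusters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical wrapper (names are illustrative, not from the original code).
// Assumes bytes_ holds valid UTF-8; no validation is performed.
class CachedUtf8 {
public:
    explicit CachedUtf8(std::string bytes) : bytes_(std::move(bytes)) {
        // Count code points once: every byte that is not a continuation byte.
        for (unsigned char c : bytes_) length_ += !is_continuation(c);
    }

    // Number of code points, not bytes.
    std::size_t length() const { return length_; }

    // Byte offset of the i'th code point, walking from the cached position.
    std::size_t offset_of(std::size_t i) const {
        assert(i < length_);
        while (index_ < i) {  // step forward one code point
            ++offset_;
            while (is_continuation(bytes_[offset_])) ++offset_;
            ++index_;
        }
        while (index_ > i) {  // step backward one code point
            --offset_;
            while (is_continuation(bytes_[offset_])) --offset_;
            --index_;
        }
        return offset_;
    }

    // Decode the i'th code point.
    char32_t operator[](std::size_t i) const {
        std::size_t p = offset_of(i);
        unsigned char b0 = static_cast<unsigned char>(bytes_[p]);
        if (b0 < 0x80) return b0;  // single-byte (ASCII)
        int extra = (b0 >= 0xF0) ? 3 : (b0 >= 0xE0) ? 2 : 1;
        char32_t cp = b0 & (0x3F >> extra);  // payload bits of the lead byte
        for (int k = 1; k <= extra; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(bytes_[p + k]) & 0x3F);
        return cp;
    }

private:
    static bool is_continuation(unsigned char c) { return (c & 0xC0) == 0x80; }

    std::string bytes_;
    std::size_t length_ = 0;
    // The cache: index of the last code point requested and its byte offset.
    mutable std::size_t index_ = 0;
    mutable std::size_t offset_ = 0;
};
```

With this, a sequential loop over s[i] moves the cache by one code point per call, while a random jump costs time proportional to its distance from the cached position. The per-string overhead is the index/offset pair (plus, in this sketch, a stored length).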

In neither case does it necessarily make the library hideously heavyweight, though, unless adding an index/offset pair to each UTF-8 string is hideously overweight. The only obvious change is that, instead of returning a code point, which can fit in an int, the s[i] above would return a binary blob (effectively, an opaque string).

Tim

