On Tue, Dec 05, 2006 at 04:33:48PM +0000, David Given wrote:
> In fact, when dealing with UTF-8 strings, all text should be normalised so you
> *don't* get the issue you mention above. Multiple-character graphemes should
> be collapsed down into a single character wherever possible (I believe that
> it is possible for all Romance languages, but I could be wrong).

But with general combining, there will always be combinations that
don't collapse into a single character.  If you're writing a good
UTF-8 editor, it also seems like good manners not to normalize the
text file the user is editing without being asked (even if new text
is created in, e.g., NFC).
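
To illustrate the point about combinations that don't collapse, here is
a minimal Python sketch (Python rather than Lua only because its
standard library ships a normalizer; the variable names are mine).  It
assumes that Unicode has a precomposed "e with acute" but no
precomposed "q with acute", so NFC can merge the first pair and has to
leave the second as two code points:

```python
import unicodedata

# "e" + COMBINING ACUTE ACCENT has a precomposed form (U+00E9),
# so NFC collapses it into one code point.
composed = unicodedata.normalize("NFC", "e\u0301")

# "q" + COMBINING ACUTE ACCENT has no precomposed form (assumption
# stated above), so NFC leaves it as base letter + combining mark.
uncomposed = unicodedata.normalize("NFC", "q\u0301")

print(len(composed), len(uncomposed))
```

So even after normalizing, anything that counts "characters" still has
to cope with a grapheme that spans more than one code point.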

> I think that's all I need. I should be able to do the rest with just those
> three, and conventional string munging tools. Hmm...

I'd recommend something along those lines--keep the core string handling
using bytes, and have the rendering-based stuff that deals in "columns"
at a higher level.
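
A rough sketch of that split, again in Python for the sake of its
stdlib Unicode tables: the text stays a plain byte string, and a
separate helper (display_columns is a hypothetical name, not an
existing API) works out how many terminal columns it would occupy.
The width rule here is a simplification I'm assuming for illustration:
combining marks and format characters take zero columns, East Asian
wide/fullwidth characters take two, everything else takes one.  Control
characters and terminal quirks are not handled.

```python
import unicodedata

def display_columns(raw: bytes) -> int:
    """Approximate column width of UTF-8 encoded bytes (simplified rule)."""
    cols = 0
    for ch in raw.decode("utf-8"):
        # Combining marks and format characters occupy no column of their own.
        if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue
        # East Asian wide and fullwidth characters occupy two columns.
        cols += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return cols

text = "e\u0301".encode("utf-8")   # core storage: just bytes
print(len(text), display_columns(text))
```

The core code only ever sees len(text) and byte offsets; only the
rendering layer needs to call something like display_columns.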

Glenn Maynard