Re: question about Unicode

lua-l archive

Subject: Re: question about Unicode
From: Glenn Maynard <glenn@...>
Date: Tue, 5 Dec 2006 21:27:04 -0500

On Tue, Dec 05, 2006 at 04:33:48PM +0000, David Given wrote:
> In fact, when dealing with UTF-8 strings, all text should be normalised so you
> *don't* get the issue you mention above. Multiple-character graphemes should
> be collapsed down into a single character wherever possible (I believe that
> it is possible for all Romance languages, but I could be wrong).

But with general combining, there will always be combinations that
don't collapse into a single precomposed character.  If you're writing
a good UTF-8 editor, it also seems like good manners not to normalize
the text file the user is editing without being asked (even if new
text is created in e.g. NFC).
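
To make the combining point concrete, here is a small sketch in Python
(chosen only for illustration; the helper name nfc_length is made up
for this example).  It assumes nothing beyond the standard unicodedata
module: NFC folds "e" plus a combining acute into one code point, but
"q" plus the same accent has no precomposed form and stays as two.

```python
import unicodedata

def nfc_length(text):
    """Return the number of code points left after NFC normalization."""
    return len(unicodedata.normalize("NFC", text))

# e + COMBINING ACUTE ACCENT: a precomposed form (U+00E9) exists.
composable = "e\u0301"
# q + COMBINING ACUTE ACCENT: Unicode has no precomposed character for this.
uncomposable = "q\u0301"

composed = unicodedata.normalize("NFC", composable)
left_alone = unicodedata.normalize("NFC", uncomposable)

print(nfc_length(composable), nfc_length(uncomposable))
```

So code that assumes "one grapheme equals one code point after
normalization" still breaks on the second case.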

> I think that's all I need. I should be able to do the rest with just those
> three, and conventional string munging tools. Hmm...

I'd recommend something along those lines--keep the core string handling
using bytes, and have the rendering-based stuff that deals in "columns"
at a higher level.
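
A rough sketch of that split, again in Python purely for illustration:
the text stays a plain UTF-8 byte string, and only a hypothetical
display_columns helper knows about columns.  The width rule here is an
assumption and an approximation (combining marks take no column, East
Asian wide and fullwidth characters take two); a real editor would need
more care than this.

```python
import unicodedata

def display_columns(raw: bytes) -> int:
    """Approximate the number of terminal columns a UTF-8 byte string needs."""
    columns = 0
    for ch in raw.decode("utf-8"):
        if unicodedata.combining(ch):
            # Combining marks are drawn on top of the previous character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            columns += 2  # wide and fullwidth characters occupy two cells
        else:
            columns += 1
    return columns

# Core storage is just bytes; ordinary byte-oriented tools keep working.
# Contents: e + acute, t, e + acute, space, two CJK ideographs.
line = "e\u0301te\u0301 \u65e5\u672c".encode("utf-8")

print(len(line), display_columns(line))
```

The byte length and the column count differ, which is why only the
rendering layer should ever reason about columns.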

-- 
Glenn Maynard

References:
- RE: question about Unicode, Jerome Vuarand
- Re: question about Unicode, David Given
