RE: question about Unicode

lua-l archive

Subject: RE: question about Unicode
From: "Jerome Vuarand" <jerome.vuarand@...>
Date: Tue, 5 Dec 2006 10:23:24 -0500

David Given wrote:
> I want to write a text editor, and so there'll be lots of 
> nasty fetch-the-character-from-column-Z issues. Assuming each 
> grapheme cluster renders into a single character cell --- 
> which I know is not strictly valid, as some clusters will 
> occupy multiple cells --- then dealing with character offsets 
> instead of byte offsets will make life much easier.

Also keep in mind that many Unicode characters are meant to be combined with others (E followed by a combining grave accent, U+0300, gives È, for example), so a single grapheme (and a single character cell) can consist of multiple Unicode code points. Character offsets in a Unicode string therefore don't correspond to grapheme offsets in the string's graphical representation, even with fixed-width fonts.
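
To make the point concrete, here is a small Python sketch (Python only because its standard library ships the Unicode character database; the same reasoning applies to any language). The helper name grapheme_count_approx is made up for illustration. It only skips code points with a non-zero combining class, which is a rough approximation: proper grapheme cluster segmentation follows the rules in Unicode Standard Annex #29, and an editor should not rely on this shortcut.

```python
import unicodedata


def grapheme_count_approx(text: str) -> int:
    """Count code points that are not combining marks.

    Hypothetical helper: a rough stand-in for grapheme cluster counting,
    not a full UAX #29 implementation.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


# LATIN CAPITAL LETTER E followed by COMBINING GRAVE ACCENT
decomposed = "E\u0300"
# NFC normalization folds the pair into the precomposed U+00C8
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed.encode("utf-8")))    # byte offset view: 3 bytes
print(len(decomposed))                    # code point view: 2 code points
print(grapheme_count_approx(decomposed))  # approximate grapheme view: 1
print(len(composed))                      # after NFC: 1 code point
```

All three counts describe the same single character cell on screen. Normalizing to NFC helps for pairs that have a precomposed form, but not every base-plus-mark sequence has one, so the code point count and the grapheme count can still differ after normalization.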
