Glenn Maynard wrote:
[...]
Out of curiosity, what use is that? In particular, if a function
returns a character offset, and you want to use it to address the
string, you have to convert it to a byte offset--which is an expensive
operation.
I've used UTF-8 for years, and I can't remember the last time I wanted
a character offset. (Even if you use wide strings, you still don't
get those directly, due to combining characters.)

I want to write a text editor, and so there'll be lots of nasty
fetch-the-character-from-column-Z issues. Assuming each grapheme
cluster renders into a single character cell --- which I know is not
strictly valid, as some clusters will occupy multiple cells --- then
dealing with character offsets instead of byte offsets will make life
much easier.
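
For concreteness, here is a rough sketch of the conversion being discussed: mapping a character offset to a byte offset in a UTF-8 string. The function name utf8_byte_offset is made up for illustration, and the sketch assumes valid, NUL-terminated UTF-8. It also counts code points rather than grapheme clusters, so a base letter followed by a combining mark counts as two "characters" here. The linear scan is what makes the conversion expensive.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the byte offset of the code point at index
 * cp_index in the NUL-terminated UTF-8 string s. If the string has fewer
 * code points than cp_index + 1, return the string's length in bytes.
 * Assumes s is valid UTF-8; counts code points, not grapheme clusters.
 */
size_t utf8_byte_offset(const char *s, size_t cp_index)
{
    size_t i = 0;

    while (s[i] != '\0') {
        /* Any byte that is not a continuation byte (10xxxxxx)
         * starts a new code point. */
        if (((unsigned char)s[i] & 0xC0) != 0x80) {
            if (cp_index == 0)
                return i;
            cp_index--;
        }
        i++;
    }
    return i;
}
```

With the string "a\xC3\xA9z" (an "a", an e-acute encoded as two bytes, and a "z"), character offset 2 maps to byte offset 3. Every lookup walks the string from the start, so an editor would probably want to cache offsets per line rather than call this for every column lookup.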