On Thu, 12 Jun 2003 18:18:43 -0500, <RLake@oxfam.org.pe> wrote:

Suppose I have three strings: "Ångstrom", "Ångstrom", and "Ångstrom".
[...]
identical, but they don't, at least on this machine, with this mail client and this font (Windows NT / Lotus Notes / Lucida Sans Unicode 10 pt, as it happens), where they look slightly different.

Same here with Windows XP Pro / Opera M2 7.11 RU / Courier New 10 pt. There are
no ideal Unicode fonts yet... or is it the font rendering engines?

Well, OK, that is a bit of a cheat because I think they actually turn into the same string if you apply any Unicode Normalisation transformation. But what about Cyrillic? (Or Greek, for that matter.) Do the identifiers "A", "А", and "Α" refer to the same object or not? (That was U+0041, U+0410 and U+0391, respectively.) What is the general case in which this is not a Bad Thing? If you are referring to display of text, I would say that was a pretty specific case.
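
The normalisation point in the quoted paragraph can be checked with a short Python sketch using the standard `unicodedata` module. The message does not say which code points the three "Ångstrom" spellings used, so the three below (U+00C5, U+212B, and "A" followed by U+030A) are an assumption chosen for illustration. The variable names are likewise made up for this example.

```python
import unicodedata

# Assumed spellings: the original message does not state the code points,
# so these three variants are illustrative only.
precomposed = "\u00C5ngstrom"    # LATIN CAPITAL LETTER A WITH RING ABOVE
angstrom_sign = "\u212Bngstrom"  # ANGSTROM SIGN
combining = "A\u030Angstrom"     # "A" followed by COMBINING RING ABOVE

variants = [precomposed, angstrom_sign, combining]

# Compared as raw code-point sequences, all three strings differ.
distinct_raw = len(set(variants))

# After NFC normalisation they collapse into a single string.
distinct_nfc = len({unicodedata.normalize("NFC", s) for s in variants})

# The look-alike capitals from the quoted text: Latin A, Cyrillic А, Greek Α.
lookalikes = ["\u0041", "\u0410", "\u0391"]

# Count how many distinct characters remain under each normalisation form.
distinct_lookalikes = {
    form: len({unicodedata.normalize(form, ch) for ch in lookalikes})
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}

print("raw variants:", distinct_raw)
print("NFC variants:", distinct_nfc)
print("look-alikes per form:", distinct_lookalikes)
```

Under these assumptions the three spellings become one string after normalisation, while the Latin, Cyrillic and Greek capitals stay three separate characters under every form, so normalisation alone does not answer the identifier question.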

Not so specific, really :) Let's take "B", "C", "E" (Latin) and
"В", "С", "Е" (Russian). They look identical, but... their alphabetic
positions are different (2, 3, 5 and 3, 19, 6 resp.). So these letters _must_
be different for correct sorting etc.
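
The sorting argument above can be sketched in Python. This is only an illustration: it sorts by raw code point, which is a stand-in for real locale-aware collation, and the variable names are invented for the example.

```python
import unicodedata

latin = ["B", "C", "E"]                    # U+0042, U+0043, U+0045
cyrillic = ["\u0412", "\u0421", "\u0415"]  # Cyrillic В, С, Е (same shapes)

# Print each look-alike pair with its code point and Unicode name.
for lat, cyr in zip(latin, cyrillic):
    print(f"U+{ord(lat):04X} {unicodedata.name(lat)}"
          f"  vs  U+{ord(cyr):04X} {unicodedata.name(cyr)}")

# None of the look-alike pairs compares equal as a string.
pairs_equal = [lat == cyr for lat, cyr in zip(latin, cyrillic)]

# Plain code-point sort (not full collation) within each script.
sorted_latin = sorted(latin)
sorted_cyrillic = sorted(cyrillic)

print(pairs_equal)
print(sorted_latin)
print(sorted_cyrillic)
```

With separate codes, the Latin letters sort as B, C, E while the Cyrillic ones sort as В, Е, С, which follows their order in the Russian alphabet. If each shape had a single code, one of the two orderings would necessarily be lost.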

It could have been otherwise with a simple rule: 1 glyph == 1 code.

Simple but wrong...

P.S. Sorry for bad English ;)

--
WBR, AD