lua-users home
lua-l archive




On 7-Dec-06, at 5:55 PM, Mike Pall wrote:

Well, then there are also distinct characters that have the same
glyph shape, like 'a' and '\u0430' (Cyrillic a). Normalization
won't help you here ... There is no perfect solution.

Absolutely, but one can minimize confusion.

I'm unlikely to accidentally type a Cyrillic a when I meant 'a', but it is very easy to accidentally have the wrong character encoding, or to be using an input method that normalizes to decomposed form (NFD) rather than composed form (NFC).
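
To make the distinction concrete, here is a minimal sketch in Python (chosen only for illustration; the thread itself is about Lua source files). The helper name `same_identifier` is hypothetical. It shows that composed and decomposed spellings of the same character differ as raw strings but compare equal after NFC normalization, while Latin 'a' and Cyrillic '\u0430' stay distinct either way.

```python
import unicodedata

# Two spellings of the same visible character "é".
composed = "\u00e9"     # single precomposed code point (NFC form)
decomposed = "e\u0301"  # 'e' followed by a combining acute accent (NFD form)

def same_identifier(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# Raw comparison sees two different strings.
raw_equal = composed == decomposed

# After normalization the two spellings are the same name.
normalized_equal = same_identifier(composed, decomposed)

# Normalization does not merge look-alike characters from different scripts.
latin_vs_cyrillic = same_identifier("a", "\u0430")
```

Normalizing before comparison addresses the input-method problem, but as the last line illustrates, it leaves the look-alike problem untouched.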

Protecting against the wrong character encoding is easy, though: just insist that the source file be valid UTF-8, which is a very fast test.
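
As a rough sketch of such a test, again in Python for illustration only: the function name `is_valid_utf8` is hypothetical, and the check simply relies on Python's strict UTF-8 decoder rather than on anything Lua itself provides.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical check: True if the bytes decode as strict UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

# The same text saved in two different encodings.
as_utf8 = "caf\u00e9".encode("utf-8")      # b'caf\xc3\xa9'
as_latin1 = "caf\u00e9".encode("latin-1")  # b'caf\xe9'

utf8_ok = is_valid_utf8(as_utf8)
latin1_ok = is_valid_utf8(as_latin1)
```

A file saved as Latin-1 with any non-ASCII character is very likely to fail this check, because a lone byte such as 0xE9 is a UTF-8 lead byte with no continuation bytes after it. Pure ASCII files pass, since ASCII is a subset of UTF-8.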