- Subject: Re: Lost in Unicode
- From: Enrico Colombini <erix@...>
- Date: Mon, 20 Oct 2003 22:18:38 +0200
On Monday 20 October 2003 18:15, RLake@oxfam.org.pe wrote:
> Roberto's function will also fail, possibly more seriously, on characters
> outside of the ISO-8859-1 range; in particular, the code page typically
> used by non-Unicode OS's uses high-control characters (in the range 0x80
> to 0x9F) for additional graphics characters whose Unicode code points are
> outside of the two-byte UTF-8 range. In particular, typographic single and
> double quotes will not translate properly, nor will typographic em dashes,
> and those are characters typically inserted by editors (or at least by MS
> Word).
You're right, I'd better treat 0x80..0x9f as 'invalid'. That should not be
a problem in this case, as I was thinking of an Italian-only application
where the only characters really needed are accented letters (thankfully, as
the Euro sign, at 0xA4 in ISO-8859-15, would be another problem: it maps to
U+20AC, outside the two-byte UTF-8 range). For the same reason,
I do not expect my users to employ ISO-8859-2.
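To make the idea concrete, here is a minimal sketch (in Python, for illustration only; it is not Roberto's function) of an ISO-8859-1 to UTF-8 conversion that rejects 0x80..0x9f. The name latin1_to_utf8 and the choice to raise an error rather than substitute a placeholder are my own assumptions.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Convert ISO-8859-1 bytes to UTF-8, rejecting 0x80..0x9F.

    Hypothetical helper: the name and the raise-on-invalid policy are
    assumptions made for this sketch.
    """
    out = bytearray()
    for b in data:
        if b < 0x80:
            # ASCII passes through unchanged.
            out.append(b)
        elif b <= 0x9F:
            # High-control range: in ISO-8859-1 these are C1 controls, but
            # text from a Windows code page puts quotes and dashes here,
            # so refuse instead of silently producing the wrong character.
            raise ValueError(f"byte 0x{b:02X} is in 0x80..0x9F, treated as invalid")
        else:
            # 0xA0..0xFF map to U+00A0..U+00FF: always two UTF-8 bytes,
            # 110xxxxx 10xxxxxx.
            out.append(0xC0 | (b >> 6))
            out.append(0x80 | (b & 0x3F))
    return bytes(out)
```

With this, the accented letters needed for Italian convert as expected, while a byte such as 0x93 (a typographic double quote in the usual Windows code page) is flagged instead of being mistranslated.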
i18n in the real world seems to be a big mess... give me back my Apple ][ :-)
Enrico