lua-users home
lua-l archive



On Thu, Jun 9, 2011 at 2:16 PM, steve donovan <steve.j.donovan@gmail.com> wrote:
> Ah, but any plain ASCII is a degenerate (and valid) kind of UTF-8, so
> I have the old problem of how to decide:
>
> http://stackoverflow.com/questions/1031645/how-to-detect-utf-8-in-plain-c

I don't follow your issue here. As clearly explained on Wikipedia [1],
not all byte sequences are valid UTF-8. Byte sequences consisting
entirely of values between 0 and 127 are fine, as they have the same
meaning in UTF-8 as in ASCII. The assumption people make is that if
text is in an 8-bit encoding such as Latin-1 and uses bytes between
128 and 255, then at least once a high byte will appear without the
continuation bytes UTF-8 requires after a lead byte, and thus the text
will be an invalid UTF-8 byte sequence. Obviously there are exotic
Latin-1 strings which *are* valid UTF-8 byte streams and have a
different meaning when interpreted as UTF-8, but they are generally
ignored as being uncommon in real-world usage.

[1] http://en.wikipedia.org/wiki/UTF-8#Invalid_byte_sequences