Re: UTF-8 testing

Subject: Re: UTF-8 testing
From: Henning Diedrich <hd2010@ ... >
Date: Thu, 06 Jan 2011 23:26:57 +0100

lua-l archive

Hi Sean,

On 1/6/11 10:59 PM, Sean Conner wrote:

> it assumes a valid UTF-8 string to begin with

I think this is the main reason the tests do not give consistent results: the 'broken' chars. Each counting method handles malformed bytes differently.
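To make the point concrete, here is a minimal sketch (in Python, only for illustration; the function names and sample strings are my own assumptions, not anything from the tests under discussion). It counts characters two ways: by skipping continuation bytes, which assumes valid input, and by decoding with replacement of malformed bytes. The two agree on valid UTF-8 and disagree on a string with 'broken' chars.

```python
def count_lead_bytes(data: bytes) -> int:
    """Count every byte that is not a continuation byte (10xxxxxx).

    This is only a correct character count if the input is valid UTF-8.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


def count_replacing(data: bytes) -> int:
    """Decode, substituting U+FFFD for malformed bytes, then count code points."""
    return len(data.decode("utf-8", errors="replace"))


def count_strict(data: bytes) -> int:
    """Decode strictly; raises UnicodeDecodeError on malformed input."""
    return len(data.decode("utf-8"))


# Hypothetical sample data: one valid string, one deliberately broken one.
valid = "h\u00e4\u00dflich".encode("utf-8")  # 7 characters, 9 bytes
broken = b"a\x80\x80b\xc3"  # two stray continuation bytes, then a truncated lead byte

for name, data in (("valid", valid), ("broken", broken)):
    print(name, count_lead_bytes(data), count_replacing(data))
```

On the valid string both counters return the same number. On the broken one they differ, and the strict variant refuses to count at all, so which answer is 'right' depends on how a test decides to treat malformed bytes.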

>   -Been doing a lot of UTF-8 wrangling recently

Can you tell me an official count of:

https://gist.github.com/768309 ('tamed' version of the last)
http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
http://www.columbia.edu/kermit/utf8.html

Thanks,
Henning

References:
- UTF-8 testing, Henning Diedrich
- Re: UTF-8 testing, Sean Conner