- Subject: Re: utf8 library may cause heap corruption
- From: Kim Alvefur <zash@...>
- Date: Thu, 9 Feb 2017 15:08:03 +0100
On Thu, Feb 09, 2017 at 02:30:39PM +0200, Dirk Laurie wrote:
> 2017-02-09 14:05 GMT+02:00 云风 Cloud Wu <email@example.com>:
> > But there is another problem.
> > local s = "\xE4\xBA"
> > assert(utf8.len(s, 1, 2) == utf8.len(s .. "\x91",1,2)) -- failed
> Why is this a problem? It should fail. s is not a valid UTF8 codepoint
> ("\xE4" promises three bytes, but there are only two). When you
> supply the extra byte, there is one valid codepoint, starting between
> characters 1 and 2.
The manual says:
> Returns the number of UTF-8 characters in string s that **start**
> between positions i and j (both inclusive).
Extra emphasis on **start**. The 3-byte sequence does start within the
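
For anyone who wants to try the two calls from the quoted example side by side, here is a minimal sketch. It assumes Lua 5.3 or later, where the manual says utf8.len returns a false value plus the position of the first invalid byte when it meets an invalid sequence. The variable names are only illustrative.

```lua
-- Sketch only: assumes Lua 5.3+ (the utf8 library is not in 5.2 or earlier).
local s = "\xE4\xBA"       -- lead byte of a 3-byte sequence plus one continuation byte
local full = s .. "\x91"   -- the complete 3-byte sequence, U+4E91

-- Truncated string: no complete sequence is available at position 1, so the
-- documented result is a false value plus the position of the first invalid byte.
local n1, bad = utf8.len(s, 1, 2)
print(n1, bad)

-- Complete string: the character starts at byte 1, which is inside [1, 2],
-- so it is counted even though its last byte sits at position 3.
local n2 = utf8.len(full, 1, 2)
print(n2)
```

The second call is the one that the word "start" in the manual covers: the range only limits where a character may begin, not where it ends.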