- Subject: Re: UTF-8 validation
- From: Jonathan Goble <jcgoble3@...>
- Date: Wed, 9 Dec 2015 21:32:51 -0500
On Wed, Dec 9, 2015 at 9:29 PM, Jay Carlson <nop@nop.com> wrote:
> Given a string where is_utf8(s) is false, it might be nice to be able to find the byte offset of the first non-UTF-8 sequence.
utf8.len() already does this. On an invalid sequence, it returns two
values: nil plus the byte position of the first invalid sequence. I
believe this was also mentioned earlier in the thread.
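
For illustration, a minimal sketch of how that could be wrapped up, assuming Lua 5.3 or later (where the utf8 library is available). The helper name first_invalid_offset is made up for this example and is not part of the standard library:

```lua
-- Hypothetical helper: returns nil if s is valid UTF-8,
-- otherwise the 1-based byte position reported by utf8.len().
local function first_invalid_offset(s)
  local len, pos = utf8.len(s)
  if len then
    return nil   -- utf8.len() returned a count, so the whole string is valid
  end
  return pos     -- second return value: position of the first invalid byte
end

print(first_invalid_offset("h\xC3\xA9llo"))  -- valid: "h", U+00E9, "llo"
print(first_invalid_offset("abc\xFFdef"))    -- 0xFF can never appear in UTF-8
```

The first call should print nil and the second should print 4, since the stray 0xFF byte is the fourth byte of the string.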