Re: UTF-8 validation

lua-l archive


Subject: Re: UTF-8 validation
From: Jonathan Goble <jcgoble3@...>
Date: Wed, 9 Dec 2015 21:32:51 -0500

On Wed, Dec 9, 2015 at 9:29 PM, Jay Carlson <nop@nop.com> wrote:
> Given a string where is_utf8(s) is false, it might be nice to be able to find the byte offset of the first non-UTF-8 sequence.

utf8.len() already does this. On an invalid sequence, it returns two
values: nil plus the byte position of the first invalid byte. I
believe this was also mentioned earlier in the thread.
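
For illustration, a minimal sketch of that usage, assuming Lua 5.3 or
later (where the utf8 library is available). The helper name
first_invalid_offset is hypothetical and not part of the standard
library; it only wraps the two return values of utf8.len().

```lua
-- Hypothetical helper: returns nil if s is valid UTF-8, otherwise the
-- 1-based byte position of the first invalid byte, as reported by utf8.len().
local function first_invalid_offset(s)
  local len, pos = utf8.len(s)
  if len then
    return nil   -- utf8.len() returned a character count: no invalid bytes
  end
  return pos     -- utf8.len() returned nil plus the offending byte position
end

print(first_invalid_offset("hello"))        -- valid ASCII, so nil
print(first_invalid_offset("abc\xFFdef"))   -- 0xFF can never appear in UTF-8: 4
```

A caller that only needs a yes/no answer can test the first return
value of utf8.len() directly; the second value is only meaningful when
the first is nil.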

