Re: Feature request: "u" option to file:read

lua-l archive


Subject: Re: Feature request: "u" option to file:read
From: Coda Highland <chighland@...>
Date: Fri, 23 Feb 2018 09:08:27 -0600

If we guarantee that "u" always consumes the character, and it just
returns nil if that character happens to be invalid, then it works. If
the current UTF-8 sequence is invalid (and therefore the read returns
nil), then the byte that you thought was going to be a continuation
byte but in fact was not is the only one that needs to be pushed back.
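
To illustrate, here is a minimal C sketch of that "always consume" behavior. It is not Lua's actual implementation: the name read_utf8_char and its return convention (length, 0 standing in for nil, -1 for end of file) are made up for this example, and it only checks the lead-byte/continuation-byte structure, not every UTF-8 validity rule.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of Lua.
 * Reads one UTF-8 character from f into buf (room for 4 bytes).
 * Returns its length in bytes, 0 if the sequence is invalid (the "nil"
 * case), or -1 at end of file. Simplification: only the lead/continuation
 * structure is checked; overlong 3- and 4-byte forms and surrogates are
 * not rejected. */
int read_utf8_char(FILE *f, unsigned char *buf) {
    assert(f != NULL && buf != NULL);

    int c = getc(f);
    if (c == EOF) return -1;

    int len;
    if (c < 0x80) len = 1;
    else if (c >= 0xC2 && c <= 0xDF) len = 2;
    else if (c >= 0xE0 && c <= 0xEF) len = 3;
    else if (c >= 0xF0 && c <= 0xF4) len = 4;
    else return 0;                 /* bad lead byte: consumed, nothing to unget */

    buf[0] = (unsigned char)c;
    for (int i = 1; i < len; i++) {
        c = getc(f);
        if (c == EOF) return 0;    /* truncated sequence: bytes stay consumed */
        if ((c & 0xC0) != 0x80) {
            /* Expected a continuation byte but got something else.
             * This is the only byte that has to go back. */
            ungetc(c, f);
            return 0;
        }
        buf[i] = (unsigned char)c;
    }
    return len;
}
```

With the bytes E2 82 41 in the stream, the first call returns 0 and pushes back only 0x41, so the next call returns "A".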

If we have to be able to put the read pointer back where it was before
the read in case the character is invalid (e.g. so a different
function could read out the raw bytes) then that couldn't be done with
a single unget.
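
For contrast, here is a rough C sketch of the "restore the position" variant. It swaps in ftell/fseek instead of unget, so it assumes a seekable stream (it would not work on a pipe), and the name read_utf8_or_rewind is hypothetical. Standard C only guarantees one byte of pushback from ungetc, while a failed read here may need to undo up to three bytes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of Lua.
 * Returns the length of the UTF-8 character at the current position and
 * stores it in buf (room for 4 bytes), or returns 0 if there is no valid
 * character there. On 0, the read position is put back where it was
 * before the call. Assumes f is seekable; only the lead/continuation
 * structure is checked. */
int read_utf8_or_rewind(FILE *f, unsigned char *buf) {
    assert(f != NULL && buf != NULL);

    long start = ftell(f);
    if (start < 0) return 0;       /* not seekable: this approach does not apply */

    int len = 0;
    int c = getc(f);
    if (c != EOF) {
        if (c < 0x80) len = 1;
        else if (c >= 0xC2 && c <= 0xDF) len = 2;
        else if (c >= 0xE0 && c <= 0xEF) len = 3;
        else if (c >= 0xF0 && c <= 0xF4) len = 4;
    }

    int ok = (len > 0);
    if (ok) {
        buf[0] = (unsigned char)c;
        for (int i = 1; i < len; i++) {
            c = getc(f);
            if (c == EOF || (c & 0xC0) != 0x80) { ok = 0; break; }
            buf[i] = (unsigned char)c;
        }
    }

    if (!ok) {
        /* May rewind several bytes, which a single ungetc cannot do. */
        fseek(f, start, SEEK_SET);
        return 0;
    }
    return len;
}
```

With the bytes E2 82 41, the call returns 0 and the next raw read sees 0xE2 again, which means three bytes were rewound.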

/s/ Adam

On Fri, Feb 23, 2018 at 8:48 AM, Charles Heywood <vandor2012@gmail.com> wrote:
> No, because a valid UTF-8 sequence can be invalidated multiple bytes in.
>
> On Fri, Feb 23, 2018 at 8:46 AM Luiz Henrique de Figueiredo
> <lhf@tecgraf.puc-rio.br> wrote:
>>
>> > "u": reads one or more bytes forming one UTF-8 character, and returns
>> > that character as a string. Returns nil if the file at the current
>> > position does not start with a valid UTF-8 sequence.
>>
>> Can this be done without having to unget more than one byte from the
>> stream?
>>
> --
> Ryan | Charles <vandor2012@gmail.com>
> Software Developer / System Administrator
> https://hashbang.sh

Follow-Ups:
- Re: Feature request: "u" option to file:read, Roberto Ierusalimschy

References:
- Feature request: "u" option to file:read, Dirk Laurie
- Re: Feature request: "u" option to file:read, Luiz Henrique de Figueiredo
- Re: Feature request: "u" option to file:read, Charles Heywood
