Re: io:lines() and \0

Subject: Re: io:lines() and \0
From: René Rebe <rene@ ... >
Date: Thu, 20 Feb 2014 21:15:55 +0100

On Feb 20, 2014, at 21:03 , Dirk Laurie wrote:

2014-02-20 21:44 GMT+02:00 René Rebe <rene@exactcode.de>:

The discussion is about lines(); that it uses fgets is just an
implementation detail.
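
To make the fgets point concrete, here is a minimal C sketch of the general effect, not Lua's actual source. fgets itself reads the whole line, but it does not report how many bytes it stored, so a caller that measures the result with strlen stops at the first zero. The function name fgets_visible_length and the sample data are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the 6 bytes "ab\0cd\n" to a temporary file,
   reads them back with fgets, and returns the length that strlen reports
   for the line. Returns -1 on any I/O failure. */
long fgets_visible_length(void)
{
    static const char data[] = { 'a', 'b', '\0', 'c', 'd', '\n' };
    char buf[64];
    FILE *f = tmpfile();

    if (f == NULL)
        return -1;
    if (fwrite(data, 1, sizeof data, f) != sizeof data) {
        fclose(f);
        return -1;
    }
    rewind(f);

    /* fgets copies all 6 bytes into buf and appends its own terminator,
       but gives the caller no byte count. */
    if (fgets(buf, sizeof buf, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    /* strlen stops at the embedded zero, so "cd\n" is silently dropped. */
    return (long)strlen(buf);
}
```

Six bytes go in, and the caller sees a line of length 2.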

If Roberto had not more or less implied, with his Bible test case, that a
performance loss is not acceptable, then an fgetc() loop without all these
troubles would have been perfectly fine for me, too.
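
For reference, such an fgetc() loop could look roughly like the sketch below. It is an illustration under my own assumptions, not the patch discussed in this thread: the name read_line_fgetc, the fixed caller-supplied buffer, and the -1 return at end of file are all hypothetical choices. The point is only that counting bytes explicitly keeps embedded zeros.

```c
#include <stdio.h>

/* Hypothetical helper: reads one line from f into buf (capacity cap, which
   must be greater than 0). Stops at '\n' (consumed, not stored), at EOF, or
   when buf is full. Returns the number of bytes stored, embedded zeros
   included, or -1 if EOF was hit before any byte was read. */
long read_line_fgetc(FILE *f, char *buf, size_t cap)
{
    size_t n = 0;
    int c = EOF;

    while (n < cap && (c = fgetc(f)) != EOF && c != '\n')
        buf[n++] = (char)c;   /* a zero byte is stored like any other */

    if (n == 0 && c == EOF)
        return -1;            /* nothing left to read */
    return (long)n;
}
```

A line longer than cap is simply returned in pieces by successive calls; a real implementation would grow the buffer instead. The cost is one library call per byte, which is the performance concern mentioned above.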

I can certainly give up improving vanilla Lua and convincing some that
random data loss is usually considered a bug, and live very happily with the
fix that works for me just fine.

Have fun parsing MIME, CGI data, or financial program exports using \0
field delimiters. Or wherever a zero comes along.

It is useful to look again at the start of the post where it all started.

I just noticed that io:lines() does not cope with \0 in the lines

Allow me to summarize the facts.

1. io.lines operates on text files.

lines operates on streams, which on most platforms these days only operate in binary mode anyway.

2. Text files may not contain any control character except whitespace.

My popular system Linux (running most internet servers and such, you know) does not know about text files and treats all bytes equally. So do all the BSDs, Mac OS X and whatnot.

3. \0 is not whitespace.

Is there an official standard for this?

In other words, the behaviour complained of is that a standard library
routine when given data that does not conform to specification gives
undefined results.