Hi:


On Thu, Feb 20, 2014 at 2:48 PM, Enrico Colombini <erix@erix.it> wrote:
> I'll add a note: Lua is also used on embedded systems, whose compilers
> sometimes have proprietary C libraries; it should work everywhere.

Well, everywhere you have an ANSI C lib. I do not know whether that
means a conforming hosted implementation, a freestanding one, or
whatever. I think the problem will be more with mainframes than with
embedded.

The current implementation is good enough for this. I would prefer my
getc version because, due to my background, losing data is a capital
sin. I would tolerate Lua core-dumping on me if I feed it nuls, or any
other not strictly text character, but I would never tolerate a
library where I read a file and it happily discards parts of it due to
embedded nuls, especially in a language which has nul-safe strings,
says so in several places, and runs on a libc which treats nuls like
any other char. That said, nuls are hairy stuff in *ix, as they are
treated specially by the operating system too (IIRC you can use every
byte except \0 and / in a file name).

> [*] K&R 2nd ed. says:
> << fgets reads at most the next n-1 characters into the array s, stopping if
> a newline is encountered; the newline is included in the array, which is
> terminated by '\0'. [etc.]. >>

> A library implementor could be forgiven here for thinking that a '\0' in the
> file should not happen, or that it could be considered a terminator (either
> while reading the file or, more correctly, when copying the line from a
> system buffer to the 's' array): after all, copying data beyond the
> destination end ('\0') would be a waste of CPU cycles.

As I hinted above, if in a concrete implementation fgetc, or its macro
cousin getc, can return a nul byte from a text-opened (that is,
non-binary opened) FILE, then to me that implementation defines nuls
as chars, so fgets should handle them the same way; the programmer can
be understood, but not forgiven. For me fgets(buf, size, file) should
be equivalent to a getc loop with some checks for size, \n and EOF.
And, from what we've seen on this thread, it seems the libc
implementations do it that way. It is the Lua library which does not.
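
To make the "getc loop" concrete, here is a minimal sketch of what I
mean. The name getc_line and the extra len parameter are my own
invention for illustration, not anything from Lua or the C library; the
point is only that the length comes from counting bytes instead of
looking for a nul afterwards.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not part of Lua or libc): reads one line with
   getc and reports how many bytes were stored, so embedded '\0' bytes
   are kept. Stores at most size-1 bytes plus a terminating '\0'.
   Returns 1 if at least one byte was read, 0 on EOF or error before
   any byte (or if size is too small to hold any data). */
static int getc_line(char *buf, size_t size, FILE *f, size_t *len)
{
    size_t n = 0;
    int c;

    if (size == 0)
        return 0;
    while (n + 1 < size && (c = getc(f)) != EOF) {
        buf[n++] = (char)c;
        if (c == '\n')          /* newline is kept, as fgets does */
            break;
    }
    buf[n] = '\0';              /* terminator is a courtesy; *len is the truth */
    *len = n;
    return n > 0;
}
```

A caller would use *len, never strlen(buf), to decide how many bytes to
keep from each call.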

I could tolerate if it interpreted '\0' as '\n', heck, I did tolerate
MSC discarding \015 ( which is not the same as mapping '\015\012' to
'\n' ), but reading past the null and then discarding the chars is too
much.

I think the main problem with Lua now is that it does not clearly
specify that files with embedded nuls are not safe to read by lines.
By not safe I mean that reading with *L and concatenating does not
seem to yield the same result as reading with *a (which, if I
interpret correctly, would yield the same result as reading with any
number and concatenating), because read_chars() and read_all use fread
while read_lines uses fgets, which cannot be used to read lines with
nuls.
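
A rough way to see the mismatch in plain C, assuming the line reader
takes the length of each chunk with strlen after fgets (that is my
reading of this thread, not something I am quoting from the Lua
sources). Both function names below are made up for the comparison.

```c
#include <stdio.h>
#include <string.h>

/* Sums strlen() over every fgets() chunk: the number of bytes a
   strlen-based line reader would keep. Hypothetical helper. */
static size_t bytes_via_fgets(FILE *f)
{
    char buf[64];
    size_t total = 0;

    while (fgets(buf, sizeof buf, f) != NULL)
        total += strlen(buf);   /* stops at the first '\0', data or not */
    return total;
}

/* Sums the counts returned by fread(): every byte is accounted for,
   nuls included. Hypothetical helper. */
static size_t bytes_via_fread(FILE *f)
{
    char buf[64];
    size_t total = 0, n;

    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        total += n;
    return total;
}
```

Run both over the same file (with a rewind in between): if the file has
a nul inside a line, the fgets total cannot reach the fread total,
because strlen never counts the nul or, typically, what follows it on
that line.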

And it is a shame the C standard does not say anything about whether
fgets() modifies any part of buf PAST the nul it inserted. If it
guaranteed that it does not, we could memset the buffer to anything
non-zero and then search for the nul from the end of the buffer:

memset(buf, 'x', size);
if (fgets(buf, size, file) != NULL) {
   /* the last '\0' in buf is the terminator fgets wrote */
   len = (char *)memrchr(buf, 0, size) - buf;
}

(I know memrchr is not ANSI, but it can be trivially written. Even
simpler: knowing that if fgets returns non-NULL there must be a zero
somewhere, we could do 'l=size; while(buf[--l]);')
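
Spelled out as a function, the trick would look roughly like this. The
name fgets_len is hypothetical, and the whole thing rests on the
ASSUMPTION that fgets leaves the bytes after its terminating nul
untouched, which, as said, nothing in the standard promises.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: fgets plus a length recovered by scanning
   backwards for the terminator. ASSUMES fgets does not write past the
   '\0' it appends; the C standard does not guarantee this. */
static char *fgets_len(char *buf, size_t size, FILE *f, size_t *len)
{
    size_t l = size;

    if (size == 0 || size > INT_MAX)
        return NULL;
    memset(buf, 'x', size);         /* any non-zero fill byte works */
    if (fgets(buf, (int)size, f) == NULL)
        return NULL;
    while (buf[--l] != '\0')        /* fgets succeeded, so a '\0' exists */
        ;
    *len = l;                       /* index of the last '\0' = bytes read */
    return buf;
}
```

The backward scan finds the terminator even when the line itself
contains nuls, since the terminator is always the last zero in the
buffer under that assumption.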

But I would bet that one day after putting this out in the wild
someone feeds it to a library which, say, helpfully zeroes the whole
buf before reading to aid debugging.

Francisco Olarte.