Hi,

On Feb 21, 2014, at 9:06, Tom N Harris wrote:

> There isn't a reliable way to recover a complete "line" that may contain any
> byte using fgets. Any trick with padding will always run into the corner case
> of a file ending with the padding character and no EOL. The replacements such

There is no corner case left. At EOF without EOL the string is still
NUL-terminated. My last patch checks whether the stream has reached EOF and
removes the superfluous \0\n.

> as getline are not part of C89. So to make the io library read those lines
> (not only io.lines but file:read"*l") it would have to forego fgets and read
> the file a byte at a time. Reading individual characters from a stream is
> notoriously slow.

No, it works just fine with fgets. See my last proposed patch. Everything is
covered, without much noticeable performance loss on the bible test case on
an Intel Core.
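
For readers without the patch at hand, here is a minimal sketch of the general padding idea. It is not the patch itself. The helper name read_line_part is made up for illustration, and the sketch assumes that fgets leaves the bytes after its terminating \0 untouched, which common C libraries do but the C standard does not spell out. The buffer is pre-filled with '\n', the one byte fgets never copies past, so the position of the first '\n' afterwards tells us how many bytes were really stored, embedded \0 bytes included.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (illustration only, not the patch from this thread).
 * Reads at most size-1 bytes of one line into buf using fgets and returns
 * the number of bytes stored, counting embedded NUL bytes and the trailing
 * '\n' if one was read. Returns 0 at EOF or on a read error.
 * Assumption: fgets does not modify buf beyond its terminating '\0'. */
size_t read_line_part(FILE *f, char *buf, size_t size)
{
    assert(size >= 2 && size <= INT_MAX);

    /* Pad with '\n': line data stored by fgets can only contain it as
     * the very last byte. */
    memset(buf, '\n', size);
    if (fgets(buf, (int)size, f) == NULL)
        return 0;

    char *nl = memchr(buf, '\n', size);
    if (nl == NULL)
        return size - 1;        /* buffer filled, no newline seen yet */

    size_t i = (size_t)(nl - buf);
    if (i + 1 < size && buf[i + 1] == '\0')
        return i + 1;           /* real newline, followed by the terminator */

    /* The first '\n' is padding, so the terminator sits right before it:
     * the stream ended without an EOL. */
    return i - 1;
}
```

A caller would loop while the returned chunk has length size-1 and does not end in '\n', appending each chunk to the line being built.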

> So if you don't know that your file contains only text in a
> simple encoding, you should treat it like arbitrary data and read large chunks
> into Lua then split them.

Make it complex and error prone when it could just work?

> The "fix", as was mentioned some days ago, is to add a note to the manual that
> the line reading functions don't work if the line to be read may contain
> non-text characters such as NUL, CR, or Ctrl+Z. In other words:
>
>    A man said to the doctor, "It hurts when I move my arm like this."
>    Said the doctor, "Then don't do that."

And not cure AIDS or cancer either? I would rather not have my arm hurt.

-- 
 ExactCODE GmbH, Jaegerstr. 67, DE-10117 Berlin
 http://exactcode.com | http://exactscan.com | http://ocrkit.com | http://t2-project.org | http://rene.rebe.de