- Subject: Re: io:lines() and \0
- From: Ulrich Schmidt <u.sch.zw@...>
- Date: Fri, 21 Feb 2014 08:34:43 +0100
Hi all.
I have read the entire thread so far and (I am sorry) I can't find any
good idea in it.
What are we discussing? We are talking about 8-bit charset text
streams. Everyone who has dealt with them, including me, knows: 8-bit
charsets are .... outdated (to put it very kindly). If you receive an
8-bit text file, you probably know nothing about it.
- What codepage was used?
- Maybe it is an old CP/M text file where ^Z marks the end of the
text. (A CP/M file size is a multiple of 128.)
- Are UTF8 extensions in use?
... and many more questions about how to read the text that I can't answer.
There is no fire-and-forget solution for reading 8-bit text streams,
and there never will be.
I would like to see a Lua version working with UTF16. If someone
wants to read 8-bit text, he can convert it to UTF16, using his
knowledge of the text's history. And please don't blame Lua for this
8-bit mess.
m2c.
Ulrich.
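To make the conversion idea concrete, here is a minimal sketch in C, assuming the reader already knows the input is ISO-8859-1 (Latin-1) and that it may carry CP/M-style ^Z padding. Both function names are hypothetical and appear nowhere in Lua. Latin-1 is the easy case, because each byte value equals its Unicode code point; any other codepage would need a lookup table instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the length of the text part of a buffer,
   treating the first ^Z (0x1A) as the end-of-text marker used by CP/M.
   If no ^Z is present, the whole buffer counts as text. */
size_t cpm_text_length(const unsigned char *in, size_t n) {
    assert(in != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (in[i] == 0x1A) {
            return i;
        }
    }
    return n;
}

/* Hypothetical helper: converts n bytes of Latin-1 text to UTF-16 code
   units. ASSUMPTION: the caller knows the input really is Latin-1.
   The out buffer must have room for n code units. */
void latin1_to_utf16(const unsigned char *in, size_t n, uint16_t *out) {
    assert((in != NULL && out != NULL) || n == 0);
    for (size_t i = 0; i < n; i++) {
        out[i] = in[i];  /* Latin-1 byte value == Unicode code point */
    }
}
```

A caller would first trim the buffer with cpm_text_length and then pass the resulting length to latin1_to_utf16. The point of the sketch is that the codepage and the end-of-text convention are both decisions made by the caller, not something the code can detect by itself.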
On 17.02.2014 16:51, René Rebe wrote:
Hi all,
I just noticed that io:lines() does not cope with \0 in the lines, and
thus just returns truncated lines (lua-5.2.3, but legacy 5.1 likewise).
May I suggest replacing the call to fgets in src/liolib.c so that we can
read lines with \0 data?
René
--
ExactCODE GmbH, Jaegerstr. 67, DE-10117 Berlin
http://exactcode.com | http://exactscan.com | http://ocrkit.com |
http://t2-project.org | http://rene.rebe.de
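
The truncation René describes comes from fgets returning a NUL-terminated string with no length, so a caller that uses strlen on the result stops at the first embedded \0. The following is a minimal sketch of a length-aware alternative built on getc. The function read_line_binary is hypothetical and is not the code in src/liolib.c; it only illustrates the technique.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, NOT part of liolib.c: reads one line from fp and
   keeps embedded '\0' bytes. Returns a malloc'd buffer that the caller
   must free, and stores the byte count in *len. The trailing '\n' is
   not stored. Returns NULL at end of file with no data read, or when
   memory allocation fails. */
char *read_line_binary(FILE *fp, size_t *len) {
    size_t cap = 128, n = 0;
    char *buf;
    int c;

    assert(fp != NULL && len != NULL);
    buf = malloc(cap);
    if (buf == NULL) {
        return NULL;
    }
    while ((c = getc(fp)) != EOF && c != '\n') {
        if (n == cap) {                      /* buffer full: double it */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[n++] = (char)c;                  /* '\0' is stored like any byte */
    }
    if (c == EOF && n == 0) {                /* nothing left to read */
        free(buf);
        return NULL;
    }
    *len = n;
    return buf;
}
```

Because the length is returned separately, the result could be handed to Lua with lua_pushlstring, which takes an explicit length and accepts embedded zeros. Reading byte by byte is simple but may be slower than fgets on some platforms, so this is a trade-off rather than a drop-in improvement.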