On Feb 21, 2014, at 8:34, Ulrich Schmidt wrote:

Hi all.

I have read the entire thread so far and (I am sorry) I can't find any good idea in it.

What are we discussing? We are talking about 8-bit charset text streams. Everyone who has dealt with them, including me, knows that 8-bit charsets are ... outdated (to put it very kindly). If you receive an 8-bit text file, you probably know nothing about it.

? UTF-8 is the current hotness, and actually my 8-bit streams are usually UTF-8.

- Which codepage was used?
- Maybe it is an old CP/M text file where ^Z marks the end of the text. (A CP/M file size is a multiple of 128.)
- Are UTF-8 extensions in use?
... and many more questions about how to read the text that I can't answer.

There is no fire-and-forget solution for reading 8-bit text streams, and there never will be.

I would like to see a Lua version that works with UTF-16. If someone wants to read 8-bit text, they can convert it to UTF-16, using their knowledge of the text's history. And please don't blame Lua for this 8-bit mess.

Please open a separate thread for your UTF-16 vote ;-) (Many consider UTF-8 superior, though: its storage is more efficient, and UTF-16 needs surrogate pairs as well, so it is likewise a variable-length, multi-unit encoding.)
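
To make the surrogate-pair point concrete, here is a small sketch in C. The function names `utf16_encode` and `utf8_length` are made up for this illustration and are not part of Lua or any library; both assume the input is a valid Unicode scalar value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point as UTF-16 code units.
   Returns the number of units written (1 or 2).
   Assumes cp is a valid scalar value (<= 0x10FFFF, not a surrogate). */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;                      /* BMP: a single unit */
        return 1;
    }
    cp -= 0x10000;                                  /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));       /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));     /* low surrogate */
    return 2;
}

/* Hypothetical helper: number of bytes the same code point takes in UTF-8. */
size_t utf8_length(uint32_t cp)
{
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}
```

For example, U+0041 is one UTF-16 unit and one UTF-8 byte, while U+1F600 becomes the two units 0xD83D 0xDE00 in UTF-16 and four bytes in UTF-8, so neither encoding is fixed width.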


On 17.02.2014 at 16:51, René Rebe wrote:
Hi all,

I just noticed that io.lines() does not cope with \0 in the lines, and
thus just returns truncated lines (lua-5.2.3, but legacy 5.1 likewise).

May I suggest replacing the call to fgets in src/liolib.c so that we can
read lines with \0 data?
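
A minimal sketch of the idea, not the actual liolib.c code: fgets() does not report how many bytes it stored, so callers typically measure the result with strlen(), which stops at the first \0. Reading byte by byte with getc() and tracking the length explicitly avoids that. The helper name `read_line_binary` is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of Lua): read one line from fp, keeping
   embedded '\0' bytes. Returns a malloc'd buffer that the caller frees;
   *len receives the byte count, excluding the newline. The buffer is NOT
   NUL-terminated. Returns NULL at end of file or on allocation failure. */
char *read_line_binary(FILE *fp, size_t *len)
{
    size_t cap = 128, n = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = getc(fp)) != EOF && c != '\n') {
        if (n == cap) {                     /* buffer full: double it */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[n++] = (char)c;                 /* '\0' is stored like any byte */
    }
    if (c == EOF && n == 0) {               /* nothing left to read */
        free(buf);
        return NULL;
    }
    *len = n;
    return buf;
}
```

On the Lua side, the bytes and their length could then be handed over with lua_pushlstring(), which takes an explicit length and therefore accepts embedded zeros.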


 ExactCODE GmbH, Jaegerstr. 67, DE-10117 Berlin