>>
>> Perhaps you're using a file created on a different system and not
>> translated as a text file during transfer.

> This is text typed in by the user directly from TextEdit (I know, don't
> cringe, Mac users, that I'm still using TextEdit; it's on my list to
> change! :)

> TE always has had \r at new lines...

You should change text editors. :) But I don't think that will solve your 
problems.

I'm assuming you're using Mac OSX -- my new iMac showed up last week so 
now I'm learning how to deal with its quirks, too.

The problem is that OSX is schizophrenic about line endings; the 
underlying BSD environment uses line-feed and the compatibility Mac 
environment uses carriage-return. So either could show up in any file. 
Furthermore, TextEdit is liberal about what it will accept as a 
line-ending (it accepts line-feed, carriage-return, or 
carriage-return/line-feed (DOS style), and I think it also accepts the 
Unicode line end character, which no OS that I know of uses), but it does 
not force all line-endings in a given file to be consistent, so if you 
edit a file with TextEdit you can end up with inconsistent line endings 
and no way of knowing.
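
One way to find out whether a file has ended up with mixed line endings is to read it in binary mode (so the C library does no translation) and count the raw octets. The following is only a minimal sketch; the names eol_counts, count_eols and eols_are_mixed are made up for illustration and are not part of Lua or any Mac API. It compares against 0x0A and 0x0D directly rather than '\n' and '\r', since the compiler's idea of those escapes is exactly what is in doubt.

```c
#include <assert.h>
#include <stddef.h>

/* Tally of each line-ending style found in a buffer (hypothetical type). */
typedef struct {
    size_t lf;    /* bare line-feed, octet 0x0A (Unix style) */
    size_t cr;    /* bare carriage-return, octet 0x0D (classic Mac style) */
    size_t crlf;  /* 0x0D followed by 0x0A (DOS style) */
} eol_counts;

/* Count the line endings in buf[0..len). A 0x0D immediately followed
   by 0x0A is counted once, as a CRLF pair. */
static eol_counts count_eols(const char *buf, size_t len)
{
    eol_counts c = {0, 0, 0};
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] == 0x0D) {
            if (i + 1 < len && buf[i + 1] == 0x0A) {
                c.crlf++;
                i++;            /* skip the 0x0A half of the pair */
            } else {
                c.cr++;
            }
        } else if (buf[i] == 0x0A) {
            c.lf++;
        }
    }
    return c;
}

/* Nonzero if more than one line-ending style appears in the tally. */
static int eols_are_mixed(eol_counts c)
{
    return (c.lf != 0) + (c.cr != 0) + (c.crlf != 0) > 1;
}
```

The buffer would come from a file opened with "rb" and read with fread; that part is omitted here.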

Lua uses standard C; it opens files in text mode (i.e. without the "b" 
attribute) and assumes that the standard C library will convert line 
endings to whatever byte the compiler translates "\n" into, which is 
what the standard says it should do. Unfortunately, MacOSX doesn't have a 
consistent line ending character. So the included C library does no 
translation, but different development tools available on the Mac might 
compile "\n" in different ways. Irritating, isn't it?

<Extremely technical detail>
There is a widespread myth that \n in a C string means hex 0A. It doesn't. 
(Although lots of C compilers use hex 0A to replace \n.) The ANSI/ISO 
standard simply says that there is some "new line" character which fits 
into a single char, and which is represented in a string or character 
literal by \n. The standard also says that the standard IO library must 
translate text files on input and output so that they appear to use \n 
(whatever it is) as a line ending character. Since the traditional 
Apple/Mac operating systems actually used hex 0D as a line ending 
character, Mac-oriented development environments tended to translate \n 
into hex 0D; this is the representation of \r on most DOS/Unix development 
environments. Note that in DOS/Windows environments, the external 
representation of the file uses the two-octet sequence 0x0D 0x0A to 
represent a line end; however, the C library distributed in those 
environments replaces this with 0x0A, which the C compiler uses for \n.

Actually, neither Unix nor DOS really has a line ending character, but the 
C behaviour effectively creates a protocol. Unix and DOS file systems 
simply represent text files as octet-sequences; the OS itself doesn't have 
a concept of lines. There are other operating systems which actually 
represent text files as a list of lines, often of fixed lengths; the C 
standard allows the standard library to delete trailing blanks in a line 
(and then later pad the line back out with blanks) in order to maintain 
the fiction that such a file is actually a Unix-like stream. (The C 
standard does not guarantee that you can even write a line longer than 254 
characters into a file, by the way.)
</Extremely technical detail>
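
If you want to check what \n actually is on a given toolchain, you can simply ask the compiler. This is a small sketch with made-up helper names (newline_code, carriage_return_code, newline_is_hex_0a); it relies only on the standard's requirement that each escape produce a distinct value that fits in a char.

```c
#include <assert.h>

/* Numeric value this compiler assigns to the '\n' escape. */
static int newline_code(void)
{
    return (unsigned char)'\n';
}

/* Numeric value this compiler assigns to the '\r' escape. */
static int carriage_return_code(void)
{
    return (unsigned char)'\r';
}

/* Nonzero if '\n' happens to be hex 0A with this compiler.
   That is common, but it is the compiler's choice, not the standard's. */
static int newline_is_hex_0a(void)
{
    /* The standard requires the two escapes to have different values. */
    assert(newline_code() != carriage_return_code());
    return newline_code() == 0x0A;
}
```

Printing newline_code() with printf("%02X\n", ...) under each compiler you have installed shows whether they agree with one another.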

None of this helps you much, Ando, sorry. I think that to properly support 
MacOSX, we would need at least to write a filter function for 
loadfile. I'm sure I'll be forced to do that soon if no one else gets 
around to it. (Perhaps CFStringGetLineBounds is the appropriate MacOSX API 
function.) Meanwhile, your best bet is probably to download BBEdit Light 
or buy the full version, or use the Project Builder text editor, or one of 
the utilities kicking around to normalise line endings. 
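
For what it's worth, the core of such a filter is small. The sketch below is not an existing Lua function; normalize_eols is a hypothetical name, and wiring it into loadfile is left out. It assumes the chunk has been read in binary mode into a buffer, and it rewrites the external octets 0x0D, 0x0A and the pair 0x0D 0x0A into whatever the compiler uses for '\n', so the rest of the code sees one consistent line ending.

```c
#include <assert.h>
#include <stddef.h>

/* Rewrite buf[0..len) in place so that a bare 0x0D, a bare 0x0A, and
   the pair 0x0D 0x0A each become a single '\n' (whatever value this
   compiler gives '\n'). Returns the new length, which is never
   greater than len. */
static size_t normalize_eols(char *buf, size_t len)
{
    size_t in, out = 0;

    assert(buf != NULL || len == 0);
    for (in = 0; in < len; in++) {
        char ch = buf[in];
        if (ch == 0x0D) {
            /* Swallow the 0x0A of a DOS-style pair. */
            if (in + 1 < len && buf[in + 1] == 0x0A)
                in++;
            ch = '\n';
        } else if (ch == 0x0A) {
            ch = '\n';
        }
        buf[out++] = ch;   /* out <= in, so this never overwrites unread input */
    }
    return out;
}
```

Because the output is never longer than the input, the conversion can be done in place without allocating a second buffer. The Unicode line end character is not handled here.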

R.