If anyone has a reasonable solution for this problem that
does not involve a look-ahead, please feel free to propose.
What I can't do is see a <CR> and return a line, then see a
<LF> on the next read and return an empty line. Or worse,
wait for a <LF> after a <CR> that would never come if I
accepted <CR> as a terminator.

I've suggested "=x" as a receive() parameter that says "x is a single byte value which is my EOL; now read me a line and return everything up to but not including x". This will work everywhere that "*l" works, with a little extra effort by the caller to look for other control characters, which may or may not be ignored depending on settings elsewhere in the caller's code. And since x is a single byte, no look-ahead is required and the received data can be exactly reconstructed, which handles two issues nicely.
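
To make the "=x" idea concrete, here is a minimal sketch in plain Lua. It is not LuaSocket code: receive_until and byte_reader are hypothetical names invented for this illustration, and byte_reader merely stands in for a socket by serving a string one byte at a time. The nil, "closed", partial return at end of stream is likewise an assumption of the sketch.

```lua
-- Hypothetical helper, not part of LuaSocket.
-- read_byte: function returning the next byte as a 1-char string, or nil at end of stream.
-- eol: the single byte to treat as the line terminator.
local function receive_until(read_byte, eol)
  assert(#eol == 1, "eol must be a single byte")
  local parts = {}
  while true do
    local c = read_byte()
    if c == nil then
      -- Stream ended before the terminator: hand back what was read so far.
      return nil, "closed", table.concat(parts)
    end
    if c == eol then
      -- Terminator consumed but not returned; no byte beyond it is touched.
      return table.concat(parts)
    end
    parts[#parts + 1] = c
  end
end

-- Stand-in for a socket (assumption for this example only).
local function byte_reader(data)
  local pos = 0
  return function()
    pos = pos + 1
    if pos > #data then return nil end
    return data:sub(pos, pos)
  end
end

-- SMTP-style: LF is the terminator, and the CR is left for the caller to inspect.
local rd = byte_reader("HELO example\r\nQUIT\n")
local line1 = receive_until(rd, "\n")
local line2 = receive_until(rd, "\n")

-- Flash-style: NUL is the terminator.
local msg = receive_until(byte_reader("<msg/>\0next"), "\0")
```

Because exactly one terminator byte is consumed per call, the original stream is always line .. x, so the caller can rebuild it byte for byte.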

"=x" will solve the SMTP-style problem (where LF is always the _final_ part of an EOL marker, but where sometimes you might want to check up on the CRs). "=x" also solves the Flash-style problem (where you have lines ended with NUL markers).

"=x" will only solve part of the HTTP-style problem, where the EOL marker can be x, xy or y. That at least needs look-behind: if the previous line was terminated by x rather than y, and the first byte of the next line is y, then that first-byte y should be ignored. It also leaves the issue of what to do with yx, which RFC 2616 implies is two consecutive EOLs if xy is the combination considered a single EOL. Confused? Well, RFC 2616 calmly also reminds you that the values of x and y may change between documents, and won't always be 13 and 10, due to encoding issues :-)
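
For what it's worth, the look-behind itself could be sketched as below. This is plain Lua and only an illustration: line_reader and byte_reader are hypothetical names, nothing here is LuaSocket API, and x and y are passed in because they won't always be 13 and 10. It treats yx as two EOLs, which is one reading of the RFC and not the only defensible one.

```lua
-- Stand-in for a socket (assumption for this example only).
local function byte_reader(data)
  local pos = 0
  return function()
    pos = pos + 1
    if pos > #data then return nil end
    return data:sub(pos, pos)
  end
end

-- Hypothetical helper: returns a function that yields one line per call,
-- accepting x, xy or y as the EOL marker. It never reads ahead; it only
-- remembers whether the previous line was terminated by x.
local function line_reader(read_byte, x, y)
  local last_was_x = false
  return function()
    local parts = {}
    while true do
      local c = read_byte()
      if c == nil then
        -- End of stream: return a final unterminated line, if there is one.
        if #parts > 0 then return table.concat(parts) end
        return nil
      end
      if last_was_x and c == y then
        -- First byte after an x terminator is y: second half of "xy", ignore it.
        last_was_x = false
      elseif c == x then
        last_was_x = true
        return table.concat(parts)
      elseif c == y then
        last_was_x = false
        return table.concat(parts)
      else
        last_was_x = false
        parts[#parts + 1] = c
      end
    end
  end
end

-- "a" ends with xy, "b" ends with y, then x alone ends an empty line, then "c".
local next_line = line_reader(byte_reader("a\r\nb\n\rc"), "\r", "\n")
local lines = {}
for line in next_line do
  lines[#lines + 1] = line
end
```

Note what is lost: once the lines come back, the caller can no longer tell which of x, xy or y ended each one, so the stream cannot be reconstructed exactly the way it can with "=x".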

So if it's felt that being able to change the "one true" EOL char (usually LF) without losing CRs is a feature common enough to be useful, and if we ignore the HTTP problem because it's too complex for LuaSocket to want to sort out, then "=x" is IMO worth it.

Otherwise, the only other easy solution I can think of that is still KISS but does solve the "how many CRs did I eat" issue is to continue to use "*l" for what it currently does, and have (say) "*L" for 'do exactly what "*l" does, except return _everything_ (CRs and final LF included) in the line, and leave it to me to decide what to do with the characters I don't want or need.' Like the "=x" change, this will affect about five lines of code.
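
A rough sketch of the difference, in plain Lua and not taken from the LuaSocket source. It assumes, as the discussion above implies, that "*l" drops CRs and strips the final LF. receive_line, its keep_eol flag and byte_reader are hypothetical names for this illustration; keep_eol = true plays the part of the proposed "*L".

```lua
-- Stand-in for a socket (assumption for this example only).
local function byte_reader(data)
  local pos = 0
  return function()
    pos = pos + 1
    if pos > #data then return nil end
    return data:sub(pos, pos)
  end
end

-- Hypothetical helper: reads up to and including the next LF.
-- keep_eol = false: drop every CR and the final LF (assumed "*l" behaviour).
-- keep_eol = true:  return every byte read, CRs and final LF included ("*L").
local function receive_line(read_byte, keep_eol)
  local parts = {}
  while true do
    local c = read_byte()
    if c == nil then
      return nil, "closed", table.concat(parts)
    end
    if c == "\n" then
      if keep_eol then parts[#parts + 1] = c end
      return table.concat(parts)
    end
    -- The only other difference between the two modes is this one test.
    if keep_eol or c ~= "\r" then
      parts[#parts + 1] = c
    end
  end
end

local stripped = receive_line(byte_reader("a\rb\r\nrest"), false)
local verbatim = receive_line(byte_reader("a\rb\r\nrest"), true)
```

With the verbatim form the caller can count or strip the CRs itself, for example with verbatim:gsub("\r", ""), which also returns the number of CRs removed.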

Failing that, RFC2616 has made me accept that doing nothing here is a perfectly good solution. If you want to be clever, do your own buffering higher up!