- Subject: Re: io:lines() and \0
- From: Francisco Olarte <folarte@...>
- Date: Sat, 22 Feb 2014 20:50:56 +0100
Hi René:
On Sat, Feb 22, 2014 at 7:51 PM, René Rebe <rene@exactcode.de> wrote:
> Yeah, fugly, ... just trying to make the best out of the C function while
> trying to preserve the performance.
I totally understood and respected that. For me the worst thing is
that fgets gives no guarantee that it leaves the rest of the buffer
untouched beyond the \0 it inserts. The rest is quite normal.
> While I do not really like to beat this dead horse even further, ... I
> still find this "undefined" behavior not really elegant and matching
> to Lua's otherwise quite high standards of elegance and cleanness.
+1
> Although by now Roberto is probably already shivering with each
> email with this subject in his inbox, one more proposal to fix this
> in a clean way could be: your gets loop but not using the buffer
> management advance function in "luaL_addchar(buffer, c);", but
> call fgetc in a tight loop with the existing buffer resizing with
> LUAL_BUFFERSIZE. That could potentially be much faster.
> However, as most people are against any improvement here I did
> not yet benchmark that, ... yet.
NONONO. Or yes, but with some explanation.
First, in tight loops you do not use fgetc; always go for getc.
Second, I doubt the problem is due to luaL_addchar. If you remember my
benchmarks, getc, which is normally very similar to addchar, managed
to beat fgets by 10% easily. The real problem is the LOCKING. Modern
stdios are thread safe in a perverse way: they lock on every op. The
fact that I managed to do 80M getc calls in 0.8 secs, which is 10 ns
per getc, is a testament to how fast they do it, but this kind of
thing would kill performance.
addchar should be quite fast:
#define luaL_addchar(B,c) \
((void)((B)->n < (B)->size || luaL_prepbuffsize((B), 1)), \
((B)->b[(B)->n++] = (c)))
Although my personal style would be to define the buffer using the
base-current-end pointer trio plus an overflow function, which lends
itself to the usual getc() trick:
inline X get(buf) { return buf->curr==buf->end ? outlined_get(buf) :
*(buf->curr++); }
( I normally compile my C with a C++ compiler, just to be able to use
inlines, declare variables anywhere and all these things )
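A minimal sketch of that pointer-trio idea in plain C99, for illustration only. The names reader, reader_init, reader_get and reader_refill are hypothetical and not taken from Lua or any library; the fast path is a single pointer comparison, and the out-of-line refill is only called when the block is exhausted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical reader built on the base/curr/end pointer trio. */
typedef struct reader {
    unsigned char *base; /* start of the storage block */
    unsigned char *curr; /* next unread byte */
    unsigned char *end;  /* one past the last valid byte */
    size_t cap;          /* size of the storage block */
    FILE *fp;            /* underlying stream */
} reader;

static void reader_init(reader *r, FILE *fp, unsigned char *storage, size_t cap) {
    assert(cap > 0);
    r->base = r->curr = r->end = storage; /* empty: first get refills */
    r->cap = cap;
    r->fp = fp;
}

/* Slow path, kept out of line: refill the block with one fread call. */
static int reader_refill(reader *r) {
    size_t n = fread(r->base, 1, r->cap, r->fp);
    if (n == 0)
        return EOF;
    r->curr = r->base;
    r->end = r->base + n;
    return *r->curr++;
}

/* Fast path: one comparison, then a load and an increment. */
static inline int reader_get(reader *r) {
    return r->curr == r->end ? reader_refill(r) : *r->curr++;
}
```

Since the stream is only touched once per block, any per-call stdio locking is paid per fread rather than per character, and embedded \0 bytes pass through like any other byte.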
Your suggestion, in this case, would be correct, but IMO for a
different reason: you would need to conditionally test for
getc_unlocked plus flockfile/funlockfile, and then use the prepbuff
version ( possibly using insider knowledge of the buffer-size-doubling
policy of luaL_prepbuffer ) to lock and unlock the file around the
read loop ( because you cannot leave the file locked across addchar /
prepbuff calls, as they may longjmp out ).
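A rough sketch of the locked read loop, assuming a POSIX system that provides flockfile, funlockfile and getc_unlocked. The helper name read_line_locked is hypothetical, and the buffer is grown with plain realloc instead of luaL_Buffer, so nothing here can longjmp and every exit path unlocks the stream. A version built on the Lua buffer API would have to reserve space before taking the lock, or unlock before any call that may raise an error.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line, keeping embedded '\0' bytes.
 * Returns a malloc'd buffer (without the '\n') and stores its length
 * in *len. Returns NULL at end of file with no data, or when an
 * allocation fails. The caller frees the result. */
static char *read_line_locked(FILE *fp, size_t *len) {
    size_t cap = 64, n = 0;
    char *buf;
    int c;

    assert(fp != NULL && len != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    flockfile(fp); /* take the stream lock once for the whole line */
    while ((c = getc_unlocked(fp)) != EOF && c != '\n') {
        if (n == cap) { /* double the buffer when it is full */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                funlockfile(fp); /* never return with the lock held */
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[n++] = (char)c;
    }
    funlockfile(fp);

    if (c == EOF && n == 0) { /* nothing left to read */
        free(buf);
        return NULL;
    }
    *len = n;
    return buf;
}
```

The length is returned separately from the data, which is what lets a line containing \0 survive intact.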
Of course, if there is a way to pcall a C function ( apart from
wrapping it as a closure ), you could make a wrapper which locks,
calls the worker, and unlocks. I miss some kind of pcall for C, as it
is not the first time I've hit a resource management problem. Maybe
there is one and I should dig more.
> I do not really see why others have such reservations against
> improving Lua to better handle this kind of binary files. Seriously,
> separating a stream at '\n' is not that difficult, nor should it be that
> 'verboten'.
I see it. I would switch to a getc() loop in a jiffy, and eat the
potential speed problem. But I'm not going to push a solution further.
I have deeper problems with Lua which make it unsuitable for normal
usage for me; I just use it embedded without the standard libraries,
so this is no biggie.
Regards.
Francisco Olarte.