- Subject: Re: Speed of Lua's immutable strings vs buffers (split from Re: Pooling of strings is good)
- From: Sean Conner <sean@...>
- Date: Mon, 25 Aug 2014 19:38:46 -0400
It was thus said that the Great Coroutines once stated:
> On Mon, Aug 25, 2014 at 3:02 PM, Sean Conner <sean@conman.org> wrote:
>
> > So, the code. It's all here:
> >
> > https://github.com/spc476/lua-net-speed
>
> Hmm.
>
> Okay, first I want to say that I really appreciate you going through
> all the effort to bootstrap this comparison.
>
> I think we may have a misunderstanding though -- my objective was
> mostly to save on reallocating the same buffer, as I cannot mutate it
> once it becomes a string. I think there might be a potential for this
> to be faster by always having a fixed-sized buffer allocated to copy
> out to a string (if needed), but mostly this was about avoiding
> duplicating the buffer that gets recv()'d into :> I am still
> interested in this comparison on speed, however.
The suckb.lua script does allocate the buffer once, and it is of a fixed
size. The __len() function returns the currently used amount of the buffer.
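
To make "allocated once, fixed size, with a separate used length" concrete, here is a minimal sketch in plain C. It is not the code from the repository: the names fixed_buffer, buffer_fill(), buffer_len() and the 64 KiB capacity are made up for illustration, and the Lua userdata/metatable wiring is left out entirely.

```c
#include <stddef.h>
#include <string.h>

#define BUF_CAPACITY 65536   /* hypothetical size, chosen for illustration */

/* One fixed-size buffer, allocated once and reused for every read. */
typedef struct {
    size_t used;                 /* bytes currently valid in data[] */
    char   data[BUF_CAPACITY];   /* storage never grows or moves    */
} fixed_buffer;

/* Overwrite the buffer contents in place; returns the bytes stored. */
size_t buffer_fill(fixed_buffer *b, const void *src, size_t n)
{
    if (n > BUF_CAPACITY)
        n = BUF_CAPACITY;        /* truncate rather than reallocate */
    memcpy(b->data, src, n);
    b->used = n;
    return n;
}

/* What a __len() metamethod would report: the used amount, not the capacity. */
size_t buffer_len(const fixed_buffer *b)
{
    return b->used;
}
```

A receive call would write straight into data[] and set used to its return value; buffer_fill() stands in for that here so the sketch has no socket dependency.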
> > I am willing to concede an error in my methodology, but the ball is
> > now in Coroutines' court to show where I am in error.
>
> I *want* to disagree with how this benchmark is done because even if
> the networking portion of this comparison is done correctly it's using
> your socket library and I have my own -- and others have luasocket.
> It would take some time to properly critique all 3 and even set up all
> 3 for this benchmark to get a fairer view of things.
>
> I'm going to try to take a day to look at all of this but I want to
> point out that in suck.lua `data' is a local defined within the block
> -- it is created from the :recv() call on every iteration. In
> suckb.lua `data' is a global declared on line 61 -- I think the global
> lookup as `data' is passed to :recv() might have some cost in this. I
> don't believe this should make very much of a difference -- if these
> results are to be believed I'm shocked you can get away with
> allocating such a large buffer for the string-returning :recv() so
> often.
Fair criticism, and fixed in the repo. I hoisted the definitions out of
the main loop, and for suckb.lua, I made sure that data was declared local.
I reran the tests, but the results were similar to before.
> Besides that, I think your recv()/recvb() functions could be "faster"
> for both/overall if they avoided allocating for the peer address
> userdata/sockaddr -- you can get the remote peer information with
> getpeername(). As I'm sure you know, recv/recvfrom/recvmsg are
> functions that are expected to be called quite often -- allocating for
> the peer sockaddr is bad (imo) on every call to them. I also think
> it's odd that recv() calls recvfrom(), but there should be no
> difference in the results.
There are four functions you can use to read data from a socket:
	read()
	recv()
	recvfrom()
	recvmsg()
read() is fine for TCP and for "connected" UDP sockets [1], but for
unconnected UDP (or even packets for other IP protocols like OSPF) you can't
use read(), as it doesn't return the sender's address. That's one reason I
rejected using read().
Of the three left, recv() is similar to recvfrom(), but drops the remote
address, and recvfrom() can be written in terms of recvmsg() but with a bit
more work, so from that standpoint, I decided to just use recvfrom(), but
name the routine "sock:recv()" on the Lua side. I mostly work with
unconnected UDP, so having sock:recv() push the address is required. It
would be easy enough to comment out that part of the code to test, but I
doubt it'll give any significant performance boost (but I'm willing to be
surprised).
But that's the reasoning behind the way I wrote the code.
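
For readers who have not used recvfrom() on an unconnected UDP socket, the following is a small sketch of the pattern described above. It is not the code from my socket library: udp_bind_loopback() and recv_with_peer() are hypothetical helper names, and the example is limited to IPv4 on the loopback interface to stay short.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create an unconnected UDP socket bound to 127.0.0.1 on an ephemeral
 * port. Returns the descriptor (or -1) and stores the bound address. */
int udp_bind_loopback(struct sockaddr_in *addr)
{
    socklen_t len = sizeof *addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof *addr);
    addr->sin_family      = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port        = 0;   /* let the kernel pick a port */
    if (bind(fd, (struct sockaddr *)addr, sizeof *addr) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Receive one datagram and report who sent it. recv() would hand back
 * the same bytes but discard the sender's address. */
ssize_t recv_with_peer(int fd, void *buf, size_t size,
                       struct sockaddr_in *peer)
{
    socklen_t len = sizeof *peer;
    return recvfrom(fd, buf, size, 0, (struct sockaddr *)peer, &len);
}
```

Note that getpeername() is not a substitute here: on a socket that has never been connect()ed it fails with ENOTCONN, so the per-packet address from recvfrom() (or recvmsg()) is the only way to learn the sender.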
> You did a fair benchmark iterating over every :byte() in the buffer
> and string -- I think this is also a fair benchmark in terms of
> "real-world" usage even if I feel it should have just been focused on
> duplicating and running the string library over a mutating userdata.
The data has to come from somewhere. Another approach would be to read
"/dev/urandom", but I felt that doing this via the network would be more in
tune with the type of work you are doing with Lua.
> > On a related note---for drastic changes like introducing mutable strings
> > to Lua, it should be up to the presenter to do the work, to do the
> > implementation (even if it does duplicate code) to show the idea is sound.
> > Working code trumps wishes and laments. I applaud Jan Behrens' work on
> > ipairs because he's taking the time to implement his ideas.
>
> Twist that dagger a little deeper, eh? :p
It's a fair cop 8-P
> Introducing a buffer type and metatable interface to it was not my idea. I
> was hoping to learn from upstream how I might trick Lua's typechecking
> functions into believing userdata were strings and getting it to safely
> access the ->buf portion of the userdata (rather than the similar data
> member in the TString struct).
Harder than it sounds. One thought I did have was to "tweak" the type byte
in the common header from LUA_TUSERDATA to LUA_TSTRING, but then I looked at
the definitions of both strings and userdata:

	typedef union TString {
	  L_Umaxalign dummy;  /* ensures maximum alignment for strings */
	  struct {
	    CommonHeader;
	    lu_byte reserved;
	    unsigned int hash;
	    size_t len;
	  } tsv;
	} TString;

	typedef union Udata {
	  L_Umaxalign dummy;  /* ensures maximum alignment for `local' udata */
	  struct {
	    CommonHeader;
	    struct Table *metatable;
	    struct Table *env;
	    size_t len;
	  } uv;
	} Udata;

You *MIGHT* get away with that on a 32-bit system where sizeof(pointer) ==
sizeof(int) == sizeof(lu_byte + padding) and nothing touches the hash,
metatable or env fields of either structure. But on a 64-bit system, I
wouldn't count on sizeof(tsv) (from TString) being equal to sizeof(uv) (from
Udata).
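
Rather than guess, the layout question can be checked on whatever target you care about. The sketch below repeats the two unions with stand-in definitions for lu_byte, GCObject, L_Umaxalign and CommonHeader. Those stand-ins are my assumption of what the Lua 5.1 headers expand to, not something taken from an actual build, and the helper function names are made up for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-ins modeled on Lua 5.1's headers (assumed, not copied from a build). */
typedef unsigned char lu_byte;
typedef union GCObject GCObject;   /* opaque: only pointers are needed */
struct Table;                      /* opaque: only pointers are needed */
typedef union { double u; void *s; long l; } L_Umaxalign;
#define CommonHeader GCObject *next; lu_byte tt; lu_byte marked

typedef union TString {
  L_Umaxalign dummy;
  struct {
    CommonHeader;
    lu_byte reserved;
    unsigned int hash;
    size_t len;
  } tsv;
} TString;

typedef union Udata {
  L_Umaxalign dummy;
  struct {
    CommonHeader;
    struct Table *metatable;
    struct Table *env;
    size_t len;
  } uv;
} Udata;

/* Where the length field lives in each header. */
size_t tstring_len_offset(void) { return offsetof(TString, tsv.len); }
size_t udata_len_offset(void)   { return offsetof(Udata, uv.len); }

/* The bytes follow the header, so the payload starts at sizeof(header). */
size_t tstring_payload_offset(void) { return sizeof(TString); }
size_t udata_payload_offset(void)   { return sizeof(Udata); }

/* Print the numbers for the machine this was compiled for. */
void print_layout(void)
{
    printf("TString: len at %zu, payload at %zu\n",
           tstring_len_offset(), tstring_payload_offset());
    printf("Udata:   len at %zu, payload at %zu\n",
           udata_len_offset(), udata_payload_offset());
}
```

Relabeling a userdata as a string could only work if both lines print the same numbers, and even then only if nothing ever reads hash, metatable or env. Calling print_layout() on each target settles the first condition.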
-spc
[1] A call to connect() on a UDP socket associates a remote/local pair
of addresses to the socket. The downside is that you can only
receive packets from the remote address you called connect() on.