I have lots of experience with Python, and am just beginning to use Lua in earnest.
I'm trying to get some benchmarks on the sizes of function closures vs tables. Not
to optimize anything specific, just to know which coding style to
default to as I go forward. I realize that Lua handles both pretty
efficiently already.

From what I can tell, using a function closure and accessing the
variables you need as "upvalues" is faster, but takes more memory, than
wrapping the variables you need in a table and accessing them there.
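
For concreteness, the two styles being compared might look like the following minimal sketch. The names new_counter_closure, new_counter_table and bump are made up for illustration and don't come from any library.

```lua
-- Closure style: the state lives in an upvalue.
function new_counter_closure(start)
    local count = start or 0
    return function()
        count = count + 1
        return count
    end
end

-- Table style: the state lives in a table field.
function new_counter_table(start)
    return { count = start or 0 }
end

function bump(counter)
    counter.count = counter.count + 1
    return counter.count
end

local c = new_counter_closure(10)
local t = new_counter_table(10)
print(c())      -- 11
print(c())      -- 12
print(bump(t))  -- 11
print(bump(t))  -- 12
```

Both counters behave the same from the caller's point of view; the question is only where the state is stored and what that costs.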

But how much more memory? This WoW wiki page says that closures take
20 bytes, plus 4 bytes per upvalue, plus 32 bytes per upvalue that ever
changes (whether in the closure or outside it).

That is, if I understand rightly, the return value of this would be 24 bytes:

function make_closure1(x)
    return function(y)
        return x+y
    end
end

and the return values of these would be 56 bytes:

function make_closure2(x)
    x = x==nil and 0 or x
    return function(y)
        return x+y
    end
end

function make_closure3(x)
    return function(y)
        x = x+1
        return x+y
    end
end

Aha! I thought. Some functions I'm writing for my personal collection of
utility libraries would benefit from being as lean as possible. (I know,
I know, premature optimization. But humor me.) Some of them are
currently in the make_closure2 style but I could rewrite them like this:

function make_closure4(x)
    local function static_closure(x) -- this is in make_closure1 style
        return function(y)
            return x+y
        end
    end
    x = x==nil and 0 or x
    return static_closure(x)
end

There's an extra tail call in make_closure4 as opposed to make_closure2,
but we end up returning a "static" closure in the style of
make_closure1, which, if the WoW page is right, will in general take up less heap space. (With
these trivial examples the difference may be minuscule, but in real
cases it might be worthwhile.) 

But I thought to check myself about these size claims. Here's the
function I'm using to do that:

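function sizeof(maker, ...)
    local siz=0
    local ret
    if type(maker) == "string" then
        maker=assert(loadstring(maker), "couldn't compile maker")
    end
    collectgarbage()                      -- settle the heap before measuring
    siz=collectgarbage("count")*1024      -- baseline heap size, in bytes
    ret=maker(...)                        -- keep the result alive while we measure
    collectgarbage()
    return collectgarbage("count")*1024-siz
end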

Now, if Lua allocates heap space in blocks, then my sizeof function may be less fine-grained than I'd like. But the results are still unexpected.

If the size figures from the WoW page are accurate, I expected to see:
make_closure1 = 20+4 = 24 bytes
make_closure2 = 20+4+32 = 56 bytes
make_closure3 = 20+4+32 = 56 bytes
make_closure4 = 20+4 = 24 bytes
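
The arithmetic behind these expectations, taking the wiki page's figures at face value, can be written as a small function. The name expected_closure_size is made up for illustration, and the formula is only the wiki page's claim, not a verified description of how Lua allocates closures.

```lua
-- Size model as claimed by the wiki page (unverified):
-- 20 bytes per closure, 4 bytes per upvalue,
-- 32 more bytes per upvalue that is ever modified.
function expected_closure_size(nupvalues, nmodified)
    return 20 + 4 * nupvalues + 32 * nmodified
end

print(expected_closure_size(1, 0))  -- 24  (make_closure1, make_closure4)
print(expected_closure_size(1, 1))  -- 56  (make_closure2, make_closure3)
```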

Here's what I'm actually seeing:
> return sizeof(function() return function(y) return 10+y end end, 10)
> return sizeof(make_closure1,10)
> return sizeof(make_closure2,10)
> return sizeof(make_closure3,10)
> return sizeof(make_closure4,10)

From the first line, it looks like a simple function (no upvalues) is taking 40 bytes, not 20. Unless, as I said, heap space is being allocated in blocks and that's affecting our results.
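
One way to reduce the effect of allocation granularity on the measurement is to create many values and average the heap growth, rather than measuring a single one. The following is only a sketch under that assumption. The helper name average_size is made up, and the result still includes whatever per-object overhead the allocator adds.

```lua
-- Hypothetical helper: average heap growth per value, over n values
-- created by maker(...). Uses only collectgarbage from the standard library.
function average_size(maker, n, ...)
    local keep = {}
    for i = 1, n do keep[i] = false end   -- grow the table before measuring
    collectgarbage()
    collectgarbage()
    local before = collectgarbage("count")
    for i = 1, n do
        keep[i] = maker(...)              -- keep every value alive
    end
    collectgarbage()
    collectgarbage()
    local after = collectgarbage("count")
    return (after - before) * 1024 / n    -- bytes per value
end

-- Same shape as make_closure1 above, redefined so the sketch runs on its own.
function make_closure1(x)
    return function(y)
        return x + y
    end
end

print(average_size(make_closure1, 10000, 10))
```

The table of results is filled with placeholders before the first measurement so that its own growth is not counted against the closures.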

But for make_closure1, I expected to see 24 bytes, which should still fit in the 40 byte block. Why 88 bytes?

Or is what's going on that a simple function value is 40 bytes, and the 24, 56, 56, 24 figures are what we should expect _in addition_, as the overhead of tracking upvalues? In that case, I'd expect the result of make_closure1 to be 40+24 = 64 bytes, which may be enough to prompt an 88 byte allocation. But then we'd also expect the results of make_closure2 and make_closure3 to be 40+56 = 96 bytes, which exceeds what we're actually seeing. So I don't think this is the explanation.

It's unclear to me then that there is any size advantage to doing a make_closure1-type closure. It's also unclear why our first three closures are taking up the sizes they do (88 bytes allocated) and how to reconcile that with what's reported on the WoW wiki page.

Furthermore, I'd expect the make_closure4 call to be the same as the make_closure1 call, but in fact it's the worst of the four! This really puzzles me.
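
One thing worth noting about make_closure4 as written: the "local function static_closure" definition is evaluated on every call, so each call may create an extra function object in addition to the closure it returns. Whether that object shows up in a measurement depends on when garbage is collected. A possible variant, offered only as a sketch, defines the helper once, outside the maker. The name make_closure4b is made up here.

```lua
-- Defined once, when the chunk is loaded, rather than on every call.
local function static_closure(x)
    return function(y)
        return x + y
    end
end

function make_closure4b(x)
    x = x == nil and 0 or x
    return static_closure(x)
end

print(make_closure4b(10)(5))  -- 15
print(make_closure4b()(5))    -- 5
```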

The make_* functions were all defined before calling the sizeof function on any of them. So if heap space is needed to hold the code, that shouldn't be affecting the results.

Can anyone illuminate?

Jim Pryor