lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Hoy.

Writing this because I have seen a lot of inflexible attempts at memoizing function calls out there. :>

Mostly I wanted to tackle these two problems:
- preserving nil in a call with multiple arguments
- preserving nil in a function returning multiple values

This is what I came up with (to be used as a module):

-- a table to track all objects passed to any memoized functions
local guids = setmetatable({}, { __mode = 'k' }) 

-- note: this could *eventually* overflow -- unlikely for practical use
local counter = 0

-- the memoized functions are the weak-keys
local funcs = setmetatable({}, { __mode = 'k' }) 

-- example: (1, nil, 'cat', '', function() end) -> '3||7|38|27'
local args_to_str =
    function (...)
        local ids = {} 

        -- use of select() is important here
        for i = 1, select('#', ...) do 
            local v = select(i, ...)

            if v ~= nil and not guids[v] then
                counter  = counter + 1
                guids[v] = counter
            end

            -- nil becomes empty-string,
            -- everything else has a number identifier
            ids[i] = guids[v] or '' 
        end

        -- the separator is important, but can be anything
        return table.concat(ids, '|')
    end

local call =
    function (f, ...)
        if not funcs[f] then funcs[f] = {} end

        local call    = args_to_str(...)
        local returns = funcs[f]

        if not returns[call] then
            funcs[f][call] = table.pack(f(...))
            print(('call signature: %q \t calling: %s'):format(call, f))
        else
            print(('call signature: %q \t not calling: %s'):format(call, f))
        end

        return table.unpack(funcs[f][call])
    end

local clear =
    function (f)
        -- no function specified, reset memoize module (essentially)
        if not f then
            -- replace/clear guid table, guid counter, drop function call caches
            guids   = setmetatable({}, { __mode = 'k' }) 
            counter = 0
            funcs   = {} 
        else
            funcs[f] = nil
        end
    end

return setmetatable({ call = call, forget = clear }, { __call = function (_, ...) return call(...) end })

----

Of the memoize attempts I've seen supporting multiple-argument calls, I usually see people converting each argument to a string and just concatenating them -- but this can lead to collisions if you had calls like:

f(true, true) -> 'truetrue'
f(true, 'true') -> 'truetrue'

The better ones use pretty-printers to more uniquely represent string values (such as single-quoting strings and escaping characters within them).  The problem then turns into how to uniquely represent objects from third-party libraries that you can't easily introspect.  The best you can do is type(io.stdout) -> 'userdata' if they don't have a __tostring() defined -- but you're not guaranteed a unique string representation.

To this end I thought it would be "cleaner" to collect every object passed to a memoized function (any of them) in a shared table called `guids' that is weakly-keyed.  I increment `counter' to associate a global identifier (a number).  This way I can do: table.concat({ guids[arg[1]] or '', guids[arg[2]] or '', etc... }, '|')  It's very important that a separator exists in the call to concat() so the object '1' and '11' don't become '111'.  Only numbers or '|' exist in this string, nils become a vacancy between ||.

Anyway, the last odd/clever part is using table.pack() to preserve nils that are returned from functions.  table.pack() (which is in 5.2) constructs a table as if it were written: { 'a', nil, 'c', nil, nil, 'f', n = 6 } -- which is not quite the same as: { [1] = 'a', [3] = 'c', [6] = 'f', n = 6 }

This is the part I am fuzzy on, but somehow the table constructions are different and nils are preserved in the sequence portion of the table "invisibly" when you go to table.unpack().

Moist people like doing:

local max = memoize(math.max)

To get this, write:

function memoize_me(f)
    return function (...) return memoize.call(f, ...) end
end

local max = memoize_me(math.max)

Same thing.  I didn't do that because I thought memoize.call(f, ...) would be more flexible somehow?  Also you can clear the "call cache" of an individual memoized function with memoize.forget(math.max) or you can clear all of the call caches for all functions with just memoize.forget().

Final thoughts:

1) `counter' can potentially reach the limits of a lua_Number as its incremented

2) if used in sandboxed code, getting the `guids' upvalue of memoize.call() could be an issue, but you'd need debug.getupvalue() which should not be in a sandbox.  Just be conscious of references to ~things in greater scopes~ ?

3) Remove the print()'s in call() if used, those are to show it works

4) You can add :gsub('|+$', '') to the return line in args_to_str() to "trim" trailing nils and combine some calls like math.max(1, 2) & math.max(1, 2, nil) to save space.  Some functions might rely on select('#', ...) so I wouldn't advise it.

5) This is still a fair amount of work to convert a call (signature?) to a string to find the associated cached return values, I would only recommend using this on functions that do much more than math.max().

(If you're not a muppet you wouldn't memoize math.max anyway...)

Toodles :-)