[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: Soliciting criticism of memoize function
- From: Norman Ramsey <nr@...>
- Date: Sat, 17 May 2008 13:54:16 -0400
> Some issues to deal with in getting a generic memoize to work correctly:
>
> 1. Do you need to support memoization for f( nil )? If so, you
> potentially need to handle this separately (but see below).
>
> 2. Do you handle nil results from f correctly? Efficiently?
Nasty questions, both of them. Well done! The answers are that
1. memoize(f)(nil) causes a run-time error
2. For every x such that f(x) is nil,
memoize(f)(x) == f(x) but is more expensive than f(x).
Because I'm not willing to muck up the common case for something I
expect never to occur, I will add to the specification of memoize that
it is a checked run-time error to pass nil to memoize(f).
> The easiest way to deal with nil is to use a magic value in its place.
> For example, a custom table. We also need to cope with NaN as a
> parameter since it can't be used as a table key.
Argh! That must have happened when I wasn't paying attention :-(
> This leads to something like the following (untested):
>
> ... return decodeValue( cache[ encodeKey( x ) ] ) ...
I'm not willing to pay this additional overhead on the fast path.
Background for those who are curious: we're most interested in passing
tables to f, and we had been memoizing results of f by using an
__index metamethod that would 'create fields on demand' by calling f.
Switching to memoize simplifies the interface and metamethod
significantly, and equally important, it makes it easier to for us to
experiment with different functions f. But we want not to take a big
hit in performance. Our old cost was
- field access
Our new cost is
- function call
- check select('#', ...) -- for safety
- field access
I'm reluctant to add something like encode and decode, which do actual
work, and moreover which have to be explained.
> (Hey. If you are going to check the number of arguments, you ought to
> handle nil and NaN correctly as well...)
Our 'checking' amounts to an assertion failure. What we want is for
the program to halt if memoize is misused. This will happen if nil or
NaN is used as an argument to memoize(f), so it's OK...
> 3. Cache mode gets tricky for a couple of reasons.
>
> The first is that semi-weak tables can lead to cyclic situations since
> all of the values get marked. For example, if we memoize:
>
> function( x ) return { x } end
>
> then nothing will ever get collected.
>
> The second is that strings are considered values and hence will tend
> to get marked anyway as keys even if they aren't referenced elsewhere.
> Hence, a safer choice though it results in a memoization cache that
> doesn't last as long is to use a fully weak table.
This is an area where I really don't understand the issues. It sounds
like what you're saying is that it's bad to have string keys because
strings are not likely to be reused, yet the associations will clutter
up the table. But we have weak keys, so the strings and values can
be collected, no? What am I missing here? Are you thinking that
having results alive will keep the string arguments alive and in the
table unnecessarily?
Norman