• Subject: Re: Why Lua table pairs iteration doesn't hold the last key's position in Node[] ?
• From: Philippe Verdy <verdyp@...>
• Date: Fri, 9 Oct 2020 19:55:58 +0200

Le jeu. 8 oct. 2020 à 22:57, William Ahern <william@25thandclement.com> a écrit :
On Fri, Oct 09, 2020 at 02:41:07AM +0800, 张伟智 wrote:
> for k,v in pairs(tbl) do end -->
> detail:
> 1 local tbl = {}
> 2 local iter_func, tbl, init = pairs(tbl)
> 3 local k, v = iter_func(tbl, init) --iter_func is next(luaB_next)
> 4 while k do
> 5     k, v = iter_func(tbl, k)
> 6 end
>
> tbl = {A1=1, A2=1, A3=1, B1=1, B2=1}
> struct Table Node[]: |... |A1|...|B1| ...|B2|...|A3|...|A2|...|
>                                              i        j        k
> Suppose hash(Ai)、hash(Bi) are equal.
>
> luaB_next is a normal C function, it is stateless and doesn't hold
> so pairs(tbl) needs to find out all key's position in Node[].
> Take next(tbl, "B2") as an example:
> 1. find the mainpostion of B2: i
> 2. Node[i] is B1, B1 != B2, try Node[i]->next: j;
> 3. Node[j] == B2, look for the first not null value from j+1,
>     which is A3,so next(tbl, "B2") returns "A3".
> The overhead of next(tbl, "B2") equals to tbl["B2"] (retrieve "B2").
> table.len() is widely used in our project:
> function table.len(t)
>     local cnt = 0
>     for k,v in pairs(t) do
>         cnt = cnt + 1
>     end
>     return cnt
> end
> and the overhead of table.len() equals to retrieving all keys from table.
> why pairs doesn't return a C closure with an integer upvalue to hold
> the last key's position in Node[] ?
> I think this can be improved.

A closure requires heap allocation, though. I often make use of closures to
write iterators, but in heavily used functions I try to avoid that.

Closures are usually very small and have a static size, this means you can allocate/free/reallocate them very fast with a pool-based allocator, and there's not much overhead. I thing that any decent implementation of Lua engines should have a pool for small objects of static sizes
And I think it can also use some statistic metrics to know the most frequent sizes in order to tune the number of the pools and the number of free slots to keep for each pool.
For small sizes, the slots in each pool can be preallocated in pages with minimum size, and the garbage collector can more easily group together the pages that are insufficient used if it needs to free up some pages (according to its collected usage statistics for each pool).
Basically the default implementation Lua just trusts the standard C library for malloc/calloc/realloc/free, but you can link your C program with a more convenient memory allocator without really needing to modify the Lua sources. And there are many excellent replacement libraries (including some with additional security features, like randomization of addresses, or debugging, or that can also monitor external metrics from the OS) that will implement the allocation strategy that fits your needs in your app, or the restriction requirements for your device, or that will trade between reduction of memory footprint, data locality, speed/real time constraints, or extreme security with constant allocation time for known sizes (managing constraints is where pools become effective, and pools are a native feature of C++ with its "placement" feature).

• Follow-Ups:
• References: