On Fri, Oct 09, 2020 at 07:55:58PM +0200, Philippe Verdy wrote:
> On Thu, Oct 8, 2020 at 22:57, William Ahern <william@25thandclement.com>
> wrote:
> 
> > On Fri, Oct 09, 2020 at 02:41:07AM +0800, 张伟智 wrote:
> > > for k,v in pairs(tbl) do end
> > > in detail, this expands to:
> > > local tbl = {}
> > > local iter_func, tbl, init = pairs(tbl)
> > > local k, v = iter_func(tbl, init)  -- iter_func is next (luaB_next in C)
> > > while k do
> > >     k, v = iter_func(tbl, k)
> > > end
> > >
> > > tbl = {A1=1, A2=1, A3=1, B1=1, B2=1}
> > > struct Table Node[]: |...|A1|...|B1|...|B2|...|A3|...|A2|...|
> > >                      (i, j, k mark the slots holding B1, B2, and A3)
> > > Suppose all the Ai keys hash to the same main position, and likewise
> > > all the Bi keys.
> > > Collision chains: A1->A2->A3, B1->B2
> > >
> > > luaB_next is an ordinary C function; it is stateless and keeps no
> > > information about the iteration, so a full pairs(tbl) traversal has to
> > > locate every key's position in Node[] again, one call at a time.
> > > Take next(tbl, "B2") as an example:
> > > 1. find the main position of B2: i
> > > 2. Node[i] holds B1, and B1 != B2, so follow Node[i]->next to slot j
> > > 3. Node[j] == B2, so look for the first non-nil entry from j+1 onward,
> > >     which is A3; next(tbl, "B2") therefore returns "A3".
> > > The overhead of next(tbl, "B2") is thus the same as that of tbl["B2"]
> > > (a full lookup of "B2").
> > > table.len() is widely used in our project:
> > > function table.len(t)
> > >     local cnt = 0
> > >     for k,v in pairs(t) do
> > >         cnt = cnt + 1
> > >     end
> > >     return cnt
> > > end
> > > so the overhead of table.len() is equivalent to looking up every key
> > > in the table.
> > > Why doesn't pairs return a C closure with an integer upvalue to hold
> > > the last key's position in Node[]?
> > > I think this can be improved.
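
(For what it's worth, the mechanism being proposed would look roughly like
the sketch below, written only against the public C API; the names
statefulpairs and cached_next are made up for illustration. Because the
public API can only cache the previous key, not the internal Node[] slot,
this version still pays the same re-lookup cost as the stock next; caching
the raw slot index would mean reaching into the table internals in
ltable.c.)

    #include "lua.h"
    #include "lauxlib.h"

    /* Iterator body: upvalue 1 is the table, upvalue 2 is the previous key
     * (nil before the first call). This only demonstrates the
     * closure+upvalue shape; lua_next() still re-hashes the previous key. */
    static int cached_next (lua_State *L) {
      lua_settop(L, 0);                        /* ignore state/control args */
      lua_pushvalue(L, lua_upvalueindex(1));   /* the table */
      lua_pushvalue(L, lua_upvalueindex(2));   /* previous key (or nil) */
      if (lua_next(L, 1) == 0)
        return 0;                              /* iteration finished */
      lua_pushvalue(L, -2);                    /* copy the new key ... */
      lua_replace(L, lua_upvalueindex(2));     /* ... into the upvalue */
      return 2;                                /* key, value */
    }

    /* statefulpairs(t): for k, v in statefulpairs(t) do ... end */
    static int statefulpairs (lua_State *L) {
      luaL_checktype(L, 1, LUA_TTABLE);
      lua_pushvalue(L, 1);                     /* upvalue 1: the table */
      lua_pushnil(L);                          /* upvalue 2: previous key */
      lua_pushcclosure(L, cached_next, 2);
      return 1;
    }
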
> >
> > A closure requires heap allocation, though. I often make use of closures to
> > write iterators, but in heavily used functions I try to avoid that.
> 
> 
> Closures are usually very small and have a static size, which means you can
> allocate/free/reallocate them very quickly with a pool-based allocator, so
> there's not much overhead. I think any decent implementation of a Lua
> engine should have a pool for small objects of static size.
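
A pool of that sort is basically a free list threaded through fixed-size
slots. A minimal sketch (the names here are made up, nothing is taken from
any actual Lua implementation):

    #include <stdlib.h>

    /* Fixed-size object pool: free slots are chained through their own
     * storage, so both allocation and release are a pointer swap. */
    struct pool {
      size_t objsize;   /* size of each object; must be >= sizeof(void *) */
      void *free;       /* head of the free list */
    };

    static void *pool_alloc (struct pool *p) {
      void *obj = p->free;
      if (obj != NULL)
        p->free = *(void **)obj;    /* pop a recycled slot */
      else
        obj = malloc(p->objsize);   /* fall back to the system allocator */
      return obj;
    }

    static void pool_free (struct pool *p, void *obj) {
      *(void **)obj = p->free;      /* push the slot back onto the list */
      p->free = obj;
    }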

Most system allocators already use fast pools for small object sizes, so I
don't bother.

The problem with dynamic allocation in the general case is 1) it doesn't
play nice with CPU pipelines and caches, and 2) all the dynamic memory
creates more garbage for the GC to chew through, which likewise increases the
amount of the worst kind of work for the CPU.

In general you can't really optimize general-purpose allocations very well.
I do often use special-purpose allocators in my C libraries, but
specifically for FIFO buffers and so-called object stacks. They work
exceptionally well, both in terms of convenience and performance, because
they're specialized to the lifetime and relational semantics of the objects
being allocated. And because they're specialized to specific use cases the
various functions are short and simple and can be optimized well, whether
compiled inline (i.e. as header files) or inlined through the link-time
optimizer.
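
To make "object stack" concrete, the core of one is just a bump allocator
with mark/release semantics, along the lines of the simplified sketch below
(illustrative only, not the actual library code):

    #include <stddef.h>

    /* Object stack: allocation bumps a pointer inside one region, and a
     * whole batch of objects is released by rolling back to a mark. */
    struct objstack {
      unsigned char *base;   /* start of the backing region */
      size_t size, top;      /* region size and current high-water mark */
    };

    static void *objstack_alloc (struct objstack *s, size_t n) {
      n = (n + 15u) & ~(size_t)15u;   /* keep allocations 16-byte aligned */
      if (s->size - s->top < n)
        return NULL;                  /* region exhausted */
      void *p = s->base + s->top;
      s->top += n;
      return p;
    }

    static size_t objstack_mark (struct objstack *s) { return s->top; }

    static void objstack_release (struct objstack *s, size_t mark) {
      s->top = mark;   /* frees everything allocated after the mark */
    }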

If you're smart about how you use Lua in critical paths, you don't need to
incur the cost of sub-optimal memory and object semantics. You can use Lua
to stitch together bits of C code (leaves in the codeflow graph) without the
Lua VM or its library calls blowing your pipelines, prediction buffers, etc.,
through unnecessary memory churn and code paths.
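
As a sketch of what I mean by stitching (a hypothetical example, not from
any particular project): the C side does the scalar work as a leaf, and the
Lua side is nothing but a tight loop that calls it.

    #include "lua.h"
    #include "lauxlib.h"
    #include "lualib.h"

    /* A leaf: pure scalar work, no allocation. */
    static int l_sumsq (lua_State *L) {
      double x = luaL_checknumber(L, 1);
      double y = luaL_checknumber(L, 2);
      lua_pushnumber(L, x * x + y * y);
      return 1;
    }

    int main (void) {
      lua_State *L = luaL_newstate();
      luaL_openlibs(L);
      lua_register(L, "sumsq", l_sumsq);
      /* The Lua side is a tight bytecode loop over scalars; nothing in it
       * forces a heap allocation per iteration. */
      luaL_dostring(L,
        "local acc = 0\n"
        "for i = 1, 1000000 do acc = acc + sumsq(i, i + 1) end\n"
        "print(acc)");
      lua_close(L);
      return 0;
    }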

That said, I don't normally worry about this. My point is just that C++ and
Rust aren't the only ones providing so-called "zero-cost" abstractions. In
its own way Lua does as well. If all I want out of Lua in a particular
scenario is a tight bytecode loop operating on scalar values (like Forth or
K), then I can have that. Introducing dynamic allocations in simple
iterators breaks that model.