I tend to disagree. The "sequence" meaning does not imply that ipairs() enumerates only the array part. It has to consider both parts to find the keys that are integers in the sequence range 1 to N (until it finds a nil).
So the parts are implementation details.
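
To make this concrete, here is a small sketch in plain Lua. The keys are inserted in descending order, so a given implementation may or may not store them in its array part (that placement is an assumption about internals and cannot be observed from Lua). Either way, ipairs() walks 1, 2, 3, ... until the first nil.

```lua
local t = {}
-- Descending insertion: where these keys end up internally is up to the
-- implementation, and the program cannot tell the difference.
for i = 5, 1, -1 do t[i] = i * 10 end
t.name = "not part of the sequence"

local seen = {}
for i, v in ipairs(t) do seen[i] = v end

-- A nil value ends the enumeration, wherever later keys are stored.
local holed = { 1, 2, nil, 4 }
local count = 0
for _ in ipairs(holed) do count = count + 1 end
```

Here `seen` receives the five values for keys 1 to 5, and the second loop stops after key 2.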

As I said, nothing prohibits another change in the implementation of tables. I will soon experiment with integrating a variant of my old code using B-trees (written for an unrelated project) to see how it behaves in Lua. There will NOT be any separation between an "array part" and a "hash part", as there will be a single part for everything: the B-tree. It will still support ipairs() as a selective enumerator for sequences and pairs() as the unselective full enumerator. And it will be efficient in both cases for fast enumeration of sequences, and efficient in terms of memory allocation and data locality.
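
As a rough illustration of the "single part" idea (not the B-tree code mentioned above, which is not shown in this message), here is a sketch where one sorted list of integer keys stands in for the tree. The names `Store`, `set`, `sequence` and `all` are hypothetical, and only integer keys with non-nil values are handled.

```lua
-- Hypothetical single-part store: one sorted key list plus a parallel
-- value list. A real B-tree would replace these two lists.
local Store = {}
Store.__index = Store

function Store.new()
  return setmetatable({ keys = {}, vals = {} }, Store)
end

-- Binary search: returns the position of key (or its insertion point)
-- and a flag telling whether the key was found.
local function find(keys, key)
  local lo, hi = 1, #keys
  while lo <= hi do
    local mid = math.floor((lo + hi) / 2)
    if keys[mid] == key then return mid, true end
    if keys[mid] < key then lo = mid + 1 else hi = mid - 1 end
  end
  return lo, false
end

function Store:set(key, value)
  local pos, found = find(self.keys, key)
  if found then
    self.vals[pos] = value
  else
    table.insert(self.keys, pos, key)
    table.insert(self.vals, pos, value)
  end
end

-- Selective enumerator (like ipairs): keys 1, 2, 3, ... until a gap.
function Store:sequence()
  local pos = find(self.keys, 1)
  local expected = 0
  return function()
    expected = expected + 1
    if self.keys[pos] == expected then
      local v = self.vals[pos]
      pos = pos + 1
      return expected, v
    end
  end
end

-- Unselective enumerator (like pairs): every key, in key order.
function Store:all()
  local pos = 0
  return function()
    pos = pos + 1
    if self.keys[pos] ~= nil then return self.keys[pos], self.vals[pos] end
  end
end

local s = Store.new()
for _, k in ipairs({ 3, 1, 2, 10, -4 }) do s:set(k, k * k) end

local seq_count, all_count = 0, 0
for _ in s:sequence() do seq_count = seq_count + 1 end
for _ in s:all() do all_count = all_count + 1 end
```

With the keys above, the sequence enumerator visits 1, 2 and 3 and stops at the gap before 10, while the full enumerator visits all five keys. Because the keys are ordered, the sequence walk is a contiguous scan, which is the data-locality argument in miniature.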

If you want to tune a specific implementation, a separate API should be used for that implementation, through which you could pass or query an array of parameters.

Likewise, the "length" (as returned by #t) should not be settable. The effective length should be a query-only property of the sequence part of the table, independent of its implementation.
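
For reference, this is already how standard Lua behaves today: `#t` can be read, but a statement such as `#t = 5` does not compile. A quick check, using `loadstring or load` so that it runs on both older and newer Lua versions:

```lua
local t = { "a", "b", "c" }
local len = #t  -- querying the length is fine

-- Compiling an assignment to #t fails with a syntax error.
local compile = loadstring or load
local chunk, err = compile("local t = {} #t = 5")
```

After this runs, `chunk` is nil and `err` holds the syntax error message.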

On Mon, 20 Jul 2020 at 11:40, Lorenzo Donati <lorenzodonatibz@tiscali.it> wrote:
On 16/07/2020 21:07, Mason Bogue wrote:
> We are all generally aware of the issues with the length operator on
> tables containing nil. There was an idea to fix this in 5.4 by
> creating a special t[x] = undef syntax, which was considered too ugly.
> Here I propose an alternative:
>
> #t = 5 -- sets the length of the array portion of t to 5
>
> Essentially we allow the length operator to be used in an lvalue (we
> can already invoke functions in lvalues, so this is not
> unprecedented). If I understand the implementation correctly, the
> length will not be altered unless the table is reindexed, which only
> happens if we make a lot of insertions to the hash-map. If this is
> still too volatile, there could be a flag similar to `isrealasize(t)`
> that doesn't allow the size to shrink if it was set this way. If
> necessary for semantic consistency, doing this could also set t[5] =
> false when otherwise t[5] = nil.
>
> This also speeds up the creation of large arrays. Previously you had to do this:
>
> t = {}
> for i = 1, 10000 do t[i] = false end
>
> During that loop the array would be reallocated several times. With
> length assignment (`#t = 10000`) you get *one* reindex. Much nicer.
>
Really, I can't understand why the simplest, easiest solution has not
been implemented yet: simply add a new function to the table library for
those efficiency-aware programs where preallocating space for a table is
paramount.

The function is already there in the C API (lua_createtable), so it can't
possibly be considered "adding bloat", since it would take just a few
bytes more to add the corresponding table library function.

All the other proposals are really "hacks" and complicate matters adding
conceptual overhead to the language and its syntax for a corner case.
This corner case is, IMO, important enough to warrant an added
`table.create` function, but it is not important enough to mess with the
syntax (how many times would a user need to write #T=maxlen ?). Not to
speak of the interactions with other mechanisms and syntax, as pointed
out by others in this thread.

Your proposal, moreover, will allocate array slots /after/ the table has
been created, so it is not as optimal as it could be.

Please, understand I'm not "ranting" against you. I know there is really
the need sometimes to allocate tables efficiently (I've been there and
I'd like to be able to do it in pure Lua). But the real (elegant,
no-bloat, easy, readable) solution is what I told you and "screams" to
be implemented.

Why the Lua team hasn't done it yet (while struggling with yet another GC
implementation), I can't really understand. IIRC years ago they replied
it would lead to "bad" programming practice, since it leaks an
implementation detail (array part vs. hash part). But really I don't
agree on that. That leak is already there in the whole "sequence vs.
non-sequence" thing and in the whole table library.

Moreover, if one wants to be abstract and non-leaky, a hypothetical
`table.create(narr, nhash)` can be defined at Lua level as just
"hinting" the Lua engine to create a table efficiently for a sequence of
`narr` elements that the user also means to fill with `nhash` other keys.
This doesn't mention "array parts" and "hash parts" whatsoever and uses
concepts well defined at Lua level. So it is non-leaky. The
implementation would be free to implement table.create by mapping it to
lua_createtable (straightforward implementation) or to whatever future
mechanism would be needed for efficient allocation, should the underlying
implementation of tables change.
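
A minimal sketch of the "hint only" contract, in pure Lua. The helper name `create_table` is hypothetical. It assumes that if the host happens to provide a function called `table.create`, that function accepts a size as its first argument. Hosts differ on what the second argument means, so check before relying on it. Where no such function exists, the hints are simply ignored, which is a legal way to honour a hint.

```lua
-- Hypothetical helper: narr and nhash are hints, never guarantees.
local function create_table(narr, nhash)
  -- Assumption: a host-provided table.create takes the sequence size
  -- first. If the host has no such function, ignore the hints.
  if type(table.create) == "function" then
    return table.create(narr or 0, nhash or 0)
  end
  return {}
end

-- The caller's code is the same whether or not the hint was honoured.
local t = create_table(1000, 0)
for i = 1, 1000 do t[i] = false end
```

The design point is that the program's observable behaviour does not depend on the hint: only allocation cost may change.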

So we can have our cake and eat it too, in this case. Easy-peasy.

Another, more elegant long-term solution (maybe), would be to introduce
a way to specify this kind of hint in any table ctor.

Just off the top of my head:

{:1000: ... --[[rest of the ctor, as usual]]}   -- preallocate 1000 array slots


{:1000,2000:...} -- preallocate 1000 array and 2000 hash slots


{:nil,2000:...} -- preallocate 2000 hash slots

I said "preallocate", so this /is/ implementation details leaking. If
you care, substitute it with "hinting" as I said above.

The syntax could be defined by stating that the engine would do its best
to optimize whatever allocation is needed to satisfy those parameters
(without mentioning hash/array parts).


That (or similar) syntax could be easily extended to other "hinting" or
table properties that could, in the future, lead to other optimizations
or useful features.

e.g.:

{:"const": ....} -- table that cannot be changed
{:"noresize":...} -- table whose number of entries cannot change

Heck, maybe some could be used to automatically give a table a
metatable with suitably defined metamethods (just brainstorming here),
as in:

{:"weak":...}
{:"ephemeron":...}
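
For comparison, here is roughly what such tags could expand to with today's metatables. The helper names `weak`, `ephemeron` and `const` are hypothetical, and the read-only proxy is only an approximation of a "const" table (rawset still bypasses it).

```lua
-- Weak keys and weak values.
local function weak(t) return setmetatable(t, { __mode = "kv" }) end

-- Weak keys with strong values: an ephemeron table in Lua 5.2 and later.
local function ephemeron(t) return setmetatable(t, { __mode = "k" }) end

-- Read-only view: reads go to the original, writes raise an error.
local function const(t)
  return setmetatable({}, {
    __index = t,
    __newindex = function()
      error("attempt to modify a const table", 2)
    end,
  })
end

local c = const({ x = 1 })
local ok = pcall(function() c.x = 2 end)  -- the write is rejected
```

After this runs, `c.x` is still 1 and `ok` is false.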


Cheers!

-- Lorenzo