That's what I defended. The internal optimization of moving some hashed integer keys and their values out of the hash and into an indexed array should never have any impact on what sequences enumerate in Lua, or on the value returned by #t, which should always be the smallest positive integer key whose value is nil or unset, minus 1. The optimization that uses an array should be consistent. #t can still just return an index computed and cached efficiently in the array part, provided that, if this cache does not find the relevant integer, a hashed search scans the hash store to find the relevant index and maintains it in a consistent cache within the hidden table storage structure. That's not what is done, so sequences are ill-behaved, with unpredictable results. To get predictable results, applications have to segregate the tables they use for sequences from those that are just random dictionaries of unordered properties for generic objects.
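
As a minimal sketch of the rule described above (the helper name first_gap_length is hypothetical and not part of Lua), the length can be found by a plain linear scan that ignores how the table is stored internally:

```lua
-- Hypothetical helper: returns the smallest positive integer key whose
-- value is nil, minus 1. It uses rawget so metamethods play no role.
local function first_gap_length(t)
  local n = 0
  while rawget(t, n + 1) ~= nil do
    n = n + 1
  end
  return n
end

local t = {10, 20, 30, nil, 50}
print(first_gap_length(t))  -- 3
-- The reference manual lets #t return any border for such a table,
-- so #t may legitimately be 3 or 5 here.
print(#t)
```

This scan is linear in the length of the sequence, so it only illustrates the semantics; it is not a proposal for how the cache should be implemented.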

Then comes the problem of objects that still have no synthetic prototypes that would allow their properties to be accessed faster using array indexing rather than slow hashing with collision handling in a loop. This is a serious security concern.

On Tue, 5 Nov 2019 at 16:24, Javier Guerra Giraldez <javier@guerrag.com> wrote:
On Tue, 5 Nov 2019 at 01:58, bil til <flyer31@googlemail.com> wrote:
> would be VERY nice, that currently it seems to be possible to iterate
> through the hash part of some larger ipairs-table with a lua for loop in
> some acceptable efficiency?

one little issue that isn't always evident:  the array part of Lua
tables is an implementation detail, a performance trick that is
supposed to be invisible to Lua code.

in principle, a table is a set of (key, value) pairs, unordered and
undistinguished.

Of course, small, dense integer keys are very useful, so several
functions help on dealing with the special case of sequences.

To make them faster and smaller, tables can hold some (but not
necessarily all) of those keys in an internal array.   Even so, all
these functions must handle the frequent case where integer keys are
stored in the "main" hash table.

Check the source code of ipairs or the # operator, you'll see they
don't go directly to the array.

So, what you're asking is not to "simply iterate the hash part".  It's
"iterate everything except sequence-keys".   consider:

for k, v in hpairs({'a', 'b', [1000]='c', y='d'}) do
   print (k, v)
end

It would obviously skip the (1, 'a') and (2, 'b') pairs, but what
about the (1000, 'c') pair?  it would probably appear in the hash.
ipairs() doesn't return it, since it's not part of a sequence.  So,
should it print?
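
One possible reading, sketched here under the assumption that hpairs (a hypothetical function, not part of Lua) skips exactly the keys that ipairs would visit and returns everything else, wherever it happens to be stored:

```lua
-- hpairs is hypothetical. This version skips the keys 1..n that ipairs
-- would visit and yields every other pair, in the order given by next().
local function hpairs(t)
  -- n is the last key ipairs would reach (computed once, up front).
  local n = 0
  while t[n + 1] ~= nil do
    n = n + 1
  end

  local function iter(tbl, k)
    local v
    repeat
      k, v = next(tbl, k)
    until k == nil
      or not (type(k) == "number" and k >= 1 and k <= n and k % 1 == 0)
    return k, v
  end

  return iter, t, nil
end

for k, v in hpairs({'a', 'b', [1000]='c', y='d'}) do
  print(k, v)  -- visits (1000, 'c') and ('y', 'd'), in unspecified order
end
```

Under this reading the (1000, 'c') pair is printed, but that is a design choice of the sketch, not something the table's internal layout decides.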

what about the "nasty" tables, like that one shown by Roberto?
{x=1,y=2,[1]=3,[2]=4,[3]=5}  ipairs does show all three integer keys,
but they're likely stored on the hash part.  should hpairs() just
ignore any integer key?   now it seems it would have to first check if
they form a contiguous sequence or not....
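
A quick check of that table, as a sketch: where the keys are physically stored is not observable from Lua code, so this only shows the visible behaviour of ipairs and #.

```lua
local t = {x = 1, y = 2, [1] = 3, [2] = 4, [3] = 5}

local count = 0
for i, v in ipairs(t) do
  count = count + 1
  print(i, v)  -- keys 1, 2 and 3, in order
end

print(count)  -- 3
print(#t)     -- 3: the keys 1..3 are contiguous, so 3 is the only border
```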

the key point is that the array optimization is just a performance
optimization, not an explicit behaviour.  to a Lua program it should
always behave as a homogeneous set of (key, value) pairs.

--
Javier