lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


On 06/11/2019 07.21, bil til wrote:
Hi Tim, thank you for the proposals, but I think this is quite an effort. Hanging such hash parts just as "addition" to the ipairs
part is just too nice and lazy, so proposing to avoid this is a nice
recommendation, but lazyness usually always wins over good recommendations.. .

The base cost of an empty table is 56 bytes (or something like that).
If the array-like part of your data is large enough that quickly
skipping it matters, then those few bytes don't matter.  Using a second
table will be A LOT simpler.

While there may be a "hash part", there is no "ipairs part".  ipairs
IGNORES the internal structure and does a manual linear traversal.  (Is
there something at `1`?  Is there something at `2`? …)  Due to internal
optimizations of tables, usually most of the array-like data ends up
being stored as an actual array in memory, which means that ipairs'
accesses don't need to (re-)traverse the hash chains at every step,
which means it's usually fast for most of the indices.  But if you
create a table, add (2^k)+1 entries with non-integer keys, and then add
array-like data ([1]=…, [2]=…, …) with up to (2^k)-2 entries, ALL OF
THEM will be stored in the hash, in a completely RANDOM order.  (If
you'd add a single extra array element at that point, Lua will allocate
an actual array and move all of those fields over there.)

Here's a sample "weird array":

    t = { }
    for i = 32, 65 do  t[string.char( i )] = i-1  end
    for i = 1, 30 do  t[i] = i  end
    for k, v in pairs( t ) do  print( k, v )  end -- fun!

I think to integrate such an operator like "hashpairs" or "hpairs" into lua would be practically NO effort (it is a VERY VERY simple iterator, just starting at Table struct element "node", and then walking through all next's of the hashed table keys...). I do not really expect this to be more than 10 C lines of code... . (estimate of a newbee ... so beware :) ).

So how should that work?  You don't know the size of the "array part",
because there is no such thing.  (Even if you knew the size of the
allocated array, that'd only tell you that at least 50% of its fields
are in use.  These do not have to be contiguous – there may be lots of
`nil`s at the end or anywhere in between.  Also, all of the "array-like"
data may actually be in the "hash part" (see example above) – should
"hpairs" also return all of the "array data" in that case?)

Besides, if you know exactly how your tables are created[1], and are
willing to depend on specifics of the implementation (i.e. your code
will likely NOT be portable across Lua implementations / versions and
can break in HORRIBLE ways), there are already sufficient tools
accessible to manually skip the array-like part.

    function nonipairs( t )  return next, t, #t  end

(If all the array-like data is stored as a contiguous array, #t will
find the correct length.  As this will start the search from the point
after that, it'll start with the first element stored in the hash part.)

Yep… that's it.  Look how easy it is to shoot yourself in the foot!
Now go and look what that does with the table constructed above, then
think about it for a bit, and then go and use two separate tables.
(Or go and have !!FUN!! with this, if you think you know better…)

-- nobody


[1]  With the current behavior, if you first add a contiguous range of
     integer keys starting from 1, they will always be stored as an
     array.  Even if you later add non-array data, that won't move
     array data over to the hash part.