lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


On 8/17/2016 11:17 AM, Pedro Tammela wrote:
    I have a small question regarding the Lua C API's "lua_next" function when
    used on a certain subset of objects: tables that are made like arrays, with
    sequential integer keys. Is the order of iteration guaranteed to go from
    '0' to 'N' in this case, or am I still subject to jumping around in some
    implementation-defined order of traversal?

Any perceived order from lua_next() is an artifact of an implementation detail. The language promises that lua_next() (and next() from the Lua side) will eventually traverse all keys of the table. It does not promise any particular order.

As an implementation detail, certain keys are likely to be stored outside of the hash table, and lua_next() will exhaust those first before traversing the hash table. In "current" versions of Lua, those keys happen to be integers from 1 to n which makes common usage of tables as arrays with integer indices starting at 1 more efficient.

But note that it is possible to construct a table in such a way that it has only integer keys, and yet nearly all are still stored in the hash table and not in the efficient array part. A lot of effort has gone towards making such a pathological case difficult to achieve, but there is nothing in the spec that forbids that. Such a table would iterate keys in a completely arbitrary order with lua_next().

The far more likely case is that most small integer keys got optimized, while some at the large end of the sequence remain hashed.

To guarantee the order of integer keys, use a loop over the index similar to:

  for i = 1,#t do something(t[i]) end

The (IMHO unfortunate) choice of naming the optimized part of the internals of the table structure "the array part" has provided a fair bit of confusion to new (and old) users. Best practice is to forget that distinction exists, and trust that the implementation of tables balances the common use cases of t[1] and t.name for best possible efficiency, while continuing to support t[any_type_of_value] efficiently as well.

--
Ross Berteig                               Ross@CheshireEng.com
Cheshire Engineering Corp.           http://www.CheshireEng.com/
+1 626 303 1602