lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


I spoke about null nodes meaning nodes whose key part is not set (so the associated-value part is also not set, both can be will be null/nil pointers or values): if you don't like it, I cabn speak about free nodes, unused nodes, empty nodes. This is independant of the fact these nodes (also called buckets) may also contain internal chaining pointer (I think there's no need at all for them and their usage in Lua is based on a broken assumption that it could generate shorter chains, while in fact it creates much longer chains and worst cases are in fact easier to reach).

Using linear probes with a constant step (1 or a large prime) *warranties* there's a limit to the worst case for long chains.

That warranty does not exist with the current Lua's implementation using these "next" chaining pointers added in nodes; and this does not save memory at all for hash table, but increases the storage requirement by more than 50% (including alignment constraints). My opinion is that these chaining pointer should be removed completely, using linear probing instead (I also advocate for using a large prime rather than 1, notably because the hashing functions are extremely weak and do not offer correct dispersion of hash values, and when you combine this set hash values to a smaller set of node positions, there are too many hash values colliding to the same chain, so worst cases are very frequent in Lua:

hash tables have then a very poor performance for lookups if most chains are overlong even when the hash table is far from being filled at 50% at least). In addition the Lua's strategy for frowing table sizes is poor and wastes a lot of memory (using a x2.0 growth factor is very wasteful for tables that need to store many distinct keys associated with non-null values, i.e. more than a few dozens keys). And finally Lua's hash table are in fact even slower than using a linear global scan of the full table !



Le ven. 15 mai 2020 à 11:20, 云风 Cloud Wu <cloudwu@gmail.com> a écrit :
Philippe Verdy <verdyp@gmail.com> 于2020年5月12日周二 下午5:58写道:
>
> As I replied in another message, it is absolutely NOT necessary to have ANY chaining pointer per node: collision lists can just add (resp. substrict) a constant prime to the current node index (and take the result module the hash table size), to find the next (resp.) previous node in the chain, until you find the node with the correct hash or a null node.

I think we should distinguish a null node and a node with value nil,
because set a node to nil may break the chain. And the terminal
condition should including back to the main position, because at the
worst case , the chain may not end by a null node.
_______________________________________________
lua-l mailing list -- lua-l@lists.lua.org
To unsubscribe send an email to lua-l-leave@lists.lua.org
_______________________________________________
lua-l mailing list -- lua-l@lists.lua.org
To unsubscribe send an email to lua-l-leave@lists.lua.org