lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Changing keys (replacing non-nil values by non-nil values) should not affect the has table at all (so it cannot cause a "stop the world" issue, which is a critical security issue if the keys are private data).

But if you change them first to nil, then back to a non-nil value, this should not cause "stop the world" issue caused by reallocation (because the has table will remain almost constant and reallocation should occur only if
* the table fill level would exceed some threshold (e.g. over 85%) for adding new keys
* if you hit a collision chain exhausting a maximum number of steps to find a new free position for a new key, or
* the table fill level would fall below some threshold (e.g. below 25%) for removing some keys, and the reducing the size would make it smaller than a minimum size (e.g. 16 nodes)

The fact that the "lastfree" node moves in one way is clearly a bug (a bad choice in the initial design of the implementation) that can be corrected. I indicated the solution:
* stop doing that, you don't need such "last free node" pointer (but as long as you have nodes chained by "next" pointers, you can keep them unchanged so that setting the key again to a non-nil value will restore it in its chain)
* remove the unnecessary "next" pointer from nodes, instead use static advances by a prime contant modulo the hash table size (to be used both when adding new keys by following the implicit chain with this constant step to match the key (if matched, replace the value of the node if the value is not nil, or set both the key and value of the node to nil) or until you hit a free node (with nil key) where you can place the key/value pair to add.

Bonus:
* you get smaller size in memory for nodes, that only store key/value pairs (both non-nil or both nil) and no extra pointers.
* you have a wider control on the dynamic size of the hashtables (with minimum/maximum fill levels like 25%/85%, plus the "minimum table size" like 16) to limit the frequency of rehashings needed
* invariants are simpler to maintain
* worst case ("stop the world") are avoided alsmot always when the hash table is stable or does not change significantly very often.
* you can instantiate a marge table with a "minimum table size" at start that can be used for fast filling (without needing any reallocation with rehashings in the middle of the fill)
* you may reduce the minimum size later back to a small value when the table has been fed with all the initial data you want and that should be quite stable for long period of time even if the contents evolves slightly (in which case only the minimum/maximum values will be considered: rehashing of all keys may occur only at this time but only if the initial "minimum size was not correctly estimated and was much too large for the final fill level).


Le mar. 28 avr. 2020 à 04:21, 云风 Cloud Wu <cloudwu@gmail.com> a écrit :
Philippe Verdy <verdyp@gmail.com> 于2020年4月28日周二 上午6:09写道:

> The basic function used by Lua for its hash table hwoever is not very strong (it is made to be fast): one way to still use Lua tables for critical data is to first create your own strong digest for any private key that you want to secure against malicious collisions, and then append this digest to the unique key to generate the actual key to be used in table, instead of indexing the table directly by this private key (such appending will still preserve the uniqueness). I wonder however if Lua allows an application to specify its own secure hashing function for storing/reading (key=value} pairs in its tables: is there a metamethod we can use in tables? (setting this methamethod to a different hashing function should instantly cause the reashing of all keys already present in that table).

My experience is that don't use a huge table if you changes it's keys
frequently. It will cause stop-the-world problem.

I found this issue years ago. There is a huge table (nearly 1M nodes)
in our online game server. We add some keys and remove keys
frequently, the total number of the nodes not changed but `lastfree`
moves in one way. So rehasing occured periodic , it stopped the world
for a long time.

My solution is spliting this huge table into multiple smaller one.

I think an user-defined hashing function may not be helpful in small
tables, and metamethod could be much slower than native one.

--
http://blog.codingnow.com
_______________________________________________
lua-l mailing list -- lua-l@lists.lua.org
To unsubscribe send an email to lua-l-leave@lists.lua.org
_______________________________________________
lua-l mailing list -- lua-l@lists.lua.org
To unsubscribe send an email to lua-l-leave@lists.lua.org