lua-users home
lua-l archive


Note also that if you need to reallocate the pool of pages in a B-tree, you NEVER need to rehash all the keys: you can freely move used pages around inside the pool without changing the hashed placement of keys within each page (all you need to change is the pointers to pages for subtrees).

It's then easy to reallocate a larger pool, or to reduce it in order to compact it. But such events (changing the size of the pool, or reorganizing used pages into a contiguous range at the start of the pool before reducing the pool size) require updating the page pointers stored in parent pages, so you have to scan all used pages to locate those pointers.

Note that subpages are always allocated *after* their parent pages (the root page stays permanently in the first page of the pool), so you only need to scan subtree pointers in pages whose numbers are below the page number you want to move down: the invariant is that every subtree pointer stored in a page has a pool page number strictly higher than the page number of the page containing it.

So compacting the pool of pages of the B-tree, to group all used pages together at the start of the pool (before reducing the pool size), is easy:
* it must just respect their relative order, so compaction has to be done sequentially from page 1 to page N;
* free pages are easy to distinguish from used pages: all their nodes contain nil keys (only the first subtree pointer is non-nil, used for the chain of free pages starting from one reserved pointer in the root page).
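The compaction pass described above can be sketched in C (the `Page` layout, `FANOUT` and `POOL_MAX` values are hypothetical illustration choices, not Lua's actual code; child pointers are pool indices, -1 meaning none):

```c
#include <assert.h>

#define FANOUT 4          /* N+1 child pointers per page (small for the example) */
#define POOL_MAX 16

typedef struct Page {
    int used;             /* 1 if the page holds live nodes */
    int child[FANOUT];    /* subtree pages as pool indices, -1 if none */
} Page;

/* Compact used pages toward the start of the pool, scanning sequentially
 * from page 0 upward so relative order is preserved.  Overwriting pool[next]
 * is safe because next <= i at all times, so the slot is either free or has
 * already been moved.  A second pass fixes child indices via a remap table. */
static int compact(Page *pool, int n) {
    int remap[POOL_MAX];
    int next = 0;
    for (int i = 0; i < n; i++) {
        if (pool[i].used) {
            remap[i] = next;
            pool[next] = pool[i];
            next++;
        } else {
            remap[i] = -1;
        }
    }
    for (int i = 0; i < next; i++)
        for (int c = 0; c < FANOUT; c++)
            if (pool[i].child[c] >= 0)
                pool[i].child[c] = remap[pool[i].child[c]];
    for (int i = next; i < n; i++)    /* mark the tail free */
        pool[i].used = 0;
    return next;                      /* new number of used pages */
}
```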

Growing the size of the pool to allow more used pages (when there is no free page left and you need one more) is easy: you don't have to rehash any key; just add more free pages at the end of the pool and preserve the contents of all existing used pages (their N key/value nodes and N+1 subtree pointers).
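A minimal sketch of such a grow operation. The key assumption (mine, for illustration) is that subtree pointers are stored as pool *indices* rather than raw addresses, which is exactly what makes a plain realloc safe without touching any key:

```c
#include <stdlib.h>
#include <assert.h>

#define FANOUT 4

typedef struct Page {
    int used;
    int key[FANOUT - 1];   /* N key slots (values elided in this sketch) */
    int child[FANOUT];     /* N+1 subtree pointers, stored as pool indices */
} Page;

/* Grow the pool without rehashing: page contents are copied bit-for-bit by
 * realloc, and since subtree pointers are indices they stay valid after the
 * move.  Only the new tail has to be initialized as free. */
static Page *grow_pool(Page *pool, int *cap, int newcap) {
    Page *p = realloc(pool, (size_t)newcap * sizeof *p);
    if (p == NULL) return NULL;      /* old pool is still valid on failure */
    for (int i = *cap; i < newcap; i++) {
        p[i].used = 0;
        for (int c = 0; c < FANOUT; c++) p[i].child[c] = -1;
    }
    *cap = newcap;
    return p;
}
```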

Note also that a B-tree has all leaf pages at the same level, so leaf pages don't need the N+1 subtree pointers: you can keep a separate pool for leaf pages and another for parent pages. Distinguishing them is easy: use a sign bit, so that leaf pages (without pointers) get negative page numbers (relative to the pool of leaf pages), while the root page and all other parent pages get non-negative ones. When visiting the B-tree you then have two distinct methods to enumerate nodes, depending on the type of page (leaf page or parent/root page).
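One possible encoding of that sign-bit convention (an illustrative choice of mine, not anything from Lua's sources); encoding a leaf index i as -(i+1) keeps leaf index 0 representable:

```c
#include <assert.h>

/* Page references: a non-negative value indexes the parent-page pool,
 * a negative value indexes the leaf-page pool, encoded as -(i + 1). */
static int leaf_ref(int i)    { return -(i + 1); }
static int is_leaf_ref(int r) { return r < 0; }
static int leaf_index(int r)  { return -r - 1; }
```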

On Tue, Apr 28, 2020 at 05:13, Philippe Verdy <> wrote:
Also the solution for large hash tables could be to use subtables, by splitting the keys into subkeys.

But in that case, for large hashes, why not simply use a B-tree?

A B-tree avoids most reallocations, because pages (of key/value pairs) in a B-tree have a constant size, and can be allocated/freed from a large pool containing ALL pages, free or used. That pool itself can be allocated or reallocated using minimum/maximum thresholds for used pages (the maximum threshold may be 100%; the minimum can range from 0%, if you want no deallocation, up to 90%, if you want the most compact representation and don't expect rapid growth of the size).

Free pages in a B-tree can be chained (each page of N nodes also contains pointers to N+1 other pages for subtrees; you can use one of these pointers to form a forward-only chain, so allocation of a new page is immediate).
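The free-page chain can be sketched like this, threading the list through the first subtree pointer as suggested (page layout is hypothetical); both allocation and release are O(1) with no separate bitmap:

```c
#include <assert.h>

#define FANOUT 4
#define POOL_N 8

typedef struct Page {
    int used;
    int child[FANOUT];   /* subtree pointers; child[0] doubles as free-list link */
} Page;

/* Build the initial chain: every page free, each linking to the next. */
static void init_free_list(Page *pool, int n, int *head) {
    for (int i = 0; i < n; i++) {
        pool[i].used = 0;
        pool[i].child[0] = (i + 1 < n) ? i + 1 : -1;
    }
    *head = 0;
}

static int page_alloc(Page *pool, int *head) {
    int p = *head;
    if (p < 0) return -1;            /* no free page: grow the pool first */
    *head = pool[p].child[0];        /* unlink from the chain */
    pool[p].used = 1;
    return p;
}

static void page_free(Page *pool, int *head, int p) {
    pool[p].used = 0;
    pool[p].child[0] = *head;        /* push back on the chain */
    *head = p;
}
```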

A "B-tree" is not necessarily fully ordered: to use it as a hash store, you only need to compare keys for equality, so each small page can be a hash table. Collisions use subtrees if needed, i.e. if there's no more space in the current page after following its local implicit chain for lookups (or simply after a sequential lookup). As pages in this B-tree are small (e.g. up to 16 nodes), lookup of hash chains within a page is guaranteed to be fast. And the B-tree can be kept balanced so that all leaf pages are at the same depth (nodes can still exist at the parent levels), and so that only the root page may have a low fill level. All other pages, including leaf pages, keep a high fill level, which you can guarantee by setting their minimum fill threshold higher than the root's; the root page can remain permanently allocated, e.g. with up to 16 key/value nodes and 17 subtree pointers, keeping only the last pointer of the root page for the chain of free pages.
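One way to read this hash-store lookup, as a sketch (the `HPage` layout and the tiny page size are my illustration choices): each page is a tiny hash table, a key hashes to a slot, and on a collision with a *different* key we descend into the subtree hanging under that slot. Only equality is compared; no ordering is needed.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_N 4     /* nodes per page; the mail suggests ~16, kept small here */

typedef struct HPage {
    int key[PAGE_N];
    int used[PAGE_N];
    struct HPage *child[PAGE_N + 1];   /* subtree pointers */
} HPage;

/* Descend through pages: at each page, hash the key to a slot; stop on an
 * exact match, otherwise follow the subtree under that slot.  A NULL child
 * means the chain is exhausted and the key is absent. */
static int hpage_contains(const HPage *p, int key) {
    while (p != NULL) {
        int slot = (int)((unsigned)key % PAGE_N);
        if (p->used[slot] && p->key[slot] == key) return 1;
        p = p->child[slot];
    }
    return 0;
}
```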

The page size (managed as a local hash table for the key/value nodes it contains) is left implementation dependent, but it is also tunable: this minimum should be about 16, but it can be a bit larger if you want less frequent reorganisations of nearby branches (needed to maintain the fill-level invariant in each page and the constant depth of all leaf pages). Reorganisation operations in a B-tree are not very costly: they are O(depth) for a B-tree storing O(pagesize^depth) nodes, i.e. O(log N) per reorganisation for a B-tree storing N nodes, and with reasonable values for the minimum fill level of each page they will not occur very often.

On Tue, Apr 28, 2020 at 04:47, Philippe Verdy <> wrote:
Changing keys (replacing non-nil values by non-nil values) should not affect the hash table at all (so it cannot cause a "stop the world" issue, which is a critical security issue if the keys are private data).

But if you change them first to nil, then back to a non-nil value, this should not cause the "stop the world" issue caused by reallocation, because the hash table will remain almost constant and reallocation should occur only if:
* the table fill level would exceed some threshold (e.g. over 85%) for adding new keys
* if you hit a collision chain exhausting a maximum number of steps to find a new free position for a new key, or
* the table fill level would fall below some threshold (e.g. below 25%) after removing some keys, and reducing the size would not make it smaller than a minimum size (e.g. 16 nodes)
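The three conditions above can be condensed into one resize-decision function (the 85%/25%/16 numbers are the example values from the text, not Lua's actual thresholds; the collision-chain trigger is handled separately at probe time):

```c
#include <assert.h>

/* Returns the new table size, or the old size unchanged when no rehash is
 * needed -- the common, stable case that avoids stopping the world. */
static int next_table_size(int used, int size) {
    if (used * 100 > size * 85)               /* too full: grow */
        return size * 2;
    if (used * 100 < size * 25 && size > 16)  /* too empty: shrink, floor 16 */
        return size / 2 >= 16 ? size / 2 : 16;
    return size;
}
```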

The fact that the "lastfree" node moves in one direction only is clearly a bug (a bad choice in the initial design of the implementation) that can be corrected. I indicated the solution:
* stop doing that: you don't need such a "last free node" pointer (but as long as you have nodes chained by "next" pointers, you can keep them unchanged, so that setting the key again to a non-nil value will restore it in its chain);
* remove the unnecessary "next" pointer from nodes; instead use static advances by a prime constant modulo the hash table size. This constant step is used when adding a new key: follow the implicit chain until you match the key (if matched, replace the node's value if the new value is not nil, or set both the key and the value of the node to nil), or until you hit a free node (with a nil key) where the new key/value pair can be placed.
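A sketch of that prime-step probing (hypothetical node layout and constants; note that true deletion in an open-addressed table generally needs tombstones or chain repair, a point the mail glosses over, so this sketch covers insert and lookup only):

```c
#include <assert.h>
#include <string.h>

#define TSIZE 16   /* table size */
#define STEP  7    /* fixed prime step, coprime with TSIZE, so the implicit
                      chain visits every slot exactly once before wrapping */

typedef struct { int key, val, used; } Node;   /* no "next" pointer needed */

static unsigned hash_int(int k) { return (unsigned)k * 2654435761u; }

/* Follow the implicit chain: stop on the matching key or the first free slot. */
static int find_slot(const Node *t, int key) {
    unsigned i = hash_int(key) % TSIZE;
    for (int probes = 0; probes < TSIZE; probes++) {
        if (!t[i].used || t[i].key == key) return (int)i;
        i = (i + STEP) % TSIZE;
    }
    return -1;   /* chain exhausted: the "time to rehash" signal */
}

static int table_set(Node *t, int key, int val) {
    int s = find_slot(t, key);
    if (s < 0) return 0;
    t[s].key = key; t[s].val = val; t[s].used = 1;
    return 1;
}

static int table_get(const Node *t, int key, int *val) {
    int s = find_slot(t, key);
    if (s < 0 || !t[s].used) return 0;
    *val = t[s].val;
    return 1;
}
```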

* you get a smaller in-memory size for nodes, which only store key/value pairs (both non-nil or both nil) and no extra pointers
* you have wider control over the dynamic size of hash tables (with minimum/maximum fill levels like 25%/85%, plus a "minimum table size" like 16) to limit the frequency of the rehashing needed
* invariants are simpler to maintain
* worst cases ("stop the world") are almost always avoided when the hash table is stable or does not change significantly very often
* you can instantiate a large table with a "minimum table size" at start, which can then be filled fast (without needing any reallocation with rehashing in the middle of the fill)
* you may later reduce the minimum size back to a small value, once the table has been fed all the initial data you want and should remain quite stable for a long period of time even if its contents evolve slightly (in which case only the minimum/maximum thresholds are considered: a full rehash of all keys may occur only at that point, and only if the initial minimum size was badly estimated and much too large for the final fill level)

On Tue, Apr 28, 2020 at 04:21, 云风 Cloud Wu <> wrote:
On Tue, Apr 28, 2020 at 6:09 AM, Philippe Verdy <> wrote:

> The basic hash function used by Lua for its tables, however, is not very strong (it is made to be fast). One way to still use Lua tables for critical data is to first create your own strong digest for any private key you want to secure against malicious collisions, then append this digest to the unique key to generate the actual key used in the table, instead of indexing the table directly by this private key (such appending still preserves uniqueness). I wonder, however, if Lua allows an application to specify its own secure hashing function for storing/reading {key=value} pairs in its tables: is there a metamethod we can use on tables? (Setting this metamethod to a different hashing function should instantly cause the rehashing of all keys already present in that table.)
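The digest-prefixing idea in the quoted paragraph can be sketched as follows. The FNV-1a function here is only a toy stand-in for a genuinely strong keyed digest (a real deployment would use something like SipHash or an HMAC); the point is the key layout, where keeping the original key verbatim after the separator preserves uniqueness:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy FNV-1a hash -- NOT cryptographically strong, illustration only. */
static unsigned fnv1a(const char *s) {
    unsigned h = 2166136261u;
    while (*s) { h ^= (unsigned char)*s++; h *= 16777619u; }
    return h;
}

/* Build "digest:originalkey".  Table placement is then driven by digest
 * bits an attacker cannot predict (assuming a real keyed hash), while the
 * verbatim suffix keeps distinct keys distinct. */
static void digest_key(const char *key, char *out, size_t outsz) {
    snprintf(out, outsz, "%08x:%s", fnv1a(key), key);
}
```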

My experience is: don't use a huge table if you change its keys
frequently. It will cause a stop-the-world problem.

I found this issue years ago. There was a huge table (nearly 1M nodes)
in our online game server. We added and removed keys frequently;
the total number of nodes did not change, but `lastfree`
moved in one direction only. So rehashing occurred periodically, and it
stopped the world for a long time.

My solution was splitting this huge table into multiple smaller ones.

I think a user-defined hashing function may not be helpful for small
tables, and a metamethod could be much slower than the native one.

lua-l mailing list --
To unsubscribe send an email to