On 15.08.2014 14:39, Enrico Colombini wrote:
On 15/08/2014 12.23, Anton Titov wrote:
Anyway, the bottom line is that in the long run realloc's running time is
proportional to min(oldsize, newsize), hence log(n) reallocations cost
O(n*log(n)).

Interesting considerations, thanks (sometimes one tends not to think too much about the time spent in low-level system libraries).

A note: if needed, realloc time could be made O(1) by using a constant-time allocator such as TLSF (http://tlsf.baisoku.org/). It was suggested to me on this list and I used it with Lua with good results. Of course, the constant hidden in that O(1) could still make it slower than the system library's O(n*log(n)) behavior, but for a large enough 'n' it could make a difference. (The main use of TLSF is in real-time systems, where an upper bound on allocation call time is desirable.)
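
To make the quoted suggestion concrete: plugging TLSF into Lua means supplying a custom lua_Alloc function to lua_newstate. A minimal sketch, assuming the tlsf_create_with_pool/tlsf_malloc/tlsf_realloc/tlsf_free entry points of the implementation linked above (older TLSF versions name these differently), with error handling omitted:

  /* Route all of Lua's allocations through a TLSF pool. */
  #include <stdlib.h>
  #include <lua.h>
  #include <lauxlib.h>
  #include <lualib.h>
  #include "tlsf.h"

  static void *l_tlsf_alloc(void *ud, void *ptr, size_t osize, size_t nsize) {
    tlsf_t pool = (tlsf_t)ud;
    (void)osize;                 /* TLSF tracks block sizes itself */
    if (nsize == 0) {            /* Lua signals "free" with nsize == 0 */
      tlsf_free(pool, ptr);
      return NULL;
    }
    return tlsf_realloc(pool, ptr, nsize);  /* ptr == NULL acts as malloc */
  }

  int main(void) {
    size_t poolsize = 64u << 20;               /* 64 MB arena for the demo */
    void *mem = malloc(poolsize);
    tlsf_t pool = tlsf_create_with_pool(mem, poolsize);
    lua_State *L = lua_newstate(l_tlsf_alloc, pool);
    luaL_openlibs(L);
    luaL_dostring(L, "local t = {} for i = 1, 1e6 do t[i] = i end");
    lua_close(L);
    tlsf_destroy(pool);
    free(mem);
    return 0;
  }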


I just looked at the code: there is a memcpy in tlsf_realloc. There is also a non-memcpy path for the case where the memory right after the current block is free. Still, I easily created a test case in which reallocating a block from 256 to 512, 1024, 2048 and 4096 bytes returned a different pointer every time; of course, I had to malloc enough memory between the reallocs that the block could not grow into a previously freed fragment.
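
Roughly the kind of test I mean (a reconstructed sketch, not my original code; where the blocks actually land depends on allocator state):

  /* Grow a block by doubling; between steps, allocate a "blocker" so the
   * grown block cannot be extended in place and tlsf_realloc must take
   * its memcpy path and return a new address. */
  #include <stdio.h>
  #include <stdlib.h>
  #include "tlsf.h"

  int main(void) {
    size_t poolsize = 16u << 20;
    void *mem = malloc(poolsize);
    tlsf_t pool = tlsf_create_with_pool(mem, poolsize);
    void *p = tlsf_malloc(pool, 256);
    for (size_t sz = 512; sz <= 4096; sz *= 2) {
      void *blocker = tlsf_malloc(pool, 2 * sz);  /* occupy space after p */
      void *q = tlsf_realloc(pool, p, sz);
      printf("%zu bytes: %p -> %p%s\n", sz, p, q,
             p == q ? "" : " (moved, i.e. memcpy)");
      p = q;
      (void)blocker;  /* deliberately never freed */
    }
    tlsf_destroy(pool);
    free(mem);
    return 0;
  }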

In any case, in a long-running application with some memory fragmentation, you should assume realloc is not constant-time.

And to get back on topic to Lua: even if you assume realloc to be constant, Lua does not simply do t->sizearray *= 2 when there is no room for the next element in the array part. Instead it traverses all current integer keys, counts how many fall into each power-of-two range (i.e. how sparse the array is), and makes a quite efficient decision about how to balance the array and hash parts. So if you have a table with t[1], t[2], t[3], t[4], t[6], t[8], t[1000] and t[1001] set, chances are you will end up with keys 1 to 8 in the array part and 1000 and 1001 in the hash part.
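
For reference, this is approximately the function from ltable.c (Lua 5.2) that picks the split; details differ slightly between Lua versions:

  /* nums[i] holds the number of integer keys in the slice (2^(i-1), 2^i].
   * Choose the largest power-of-two array size that is more than half
   * full; keys above that size go to the hash part. */
  static int computesizes (int nums[], int *narray) {
    int i;
    int twotoi;   /* 2^i */
    int a = 0;    /* number of integer keys <= 2^i */
    int na = 0;   /* number of keys that will go to the array part */
    int n = 0;    /* best array size found so far */
    for (i = 0, twotoi = 1; twotoi/2 < *narray; i++, twotoi *= 2) {
      if (nums[i] > 0) {
        a += nums[i];
        if (a > twotoi/2) {   /* more than half of the 2^i slots used? */
          n = twotoi;         /* candidate array size */
          na = a;             /* keys 1..n will go to the array part */
        }
      }
      if (a == *narray) break;  /* all integer keys counted */
    }
    *narray = n;
    return na;
  }

For the example above, the slice counts are 1, 1, 2, 2 (keys up to 8) and 2 (keys 1000 and 1001, in the slice (512, 1024]): at size 8, six of eight slots are in use, which is more than half; at size 1024, only eight of 1024 would be, so the array part stays at 8 and the two big keys go to the hash part.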

By the way, I looked at the code of computesizes now: it chooses the maximum possible array size that is a power of two and has at least half of its slots filled. I think the array part is more than twice as space-efficient as the hash part, so that coefficient could be played with. I would instead pick the array size minimizing

  sizeof(TValue) * power_of_2_array_size
    + sizeof(Node) * (total_elements - elements_that_would_go_to_the_array_part)
      * SOME_HASH_LOAD_CONSTANT_LIKE_1_POINT_5

Given that sizeof(Node) is more than twice sizeof(TValue), and that you want more hash buckets than you have elements (though rounding up to a power of two might already give you that), it may turn out that a coefficient of 3 or even 4 is better. On the other hand, optimizing sparse tables may not be worth it.
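
A rough sketch of that alternative criterion (entirely hypothetical: the function, LOAD_FACTOR and the parameters are made up for illustration; nums[] means the same as in computesizes above):

  #include <stddef.h>

  #define LOAD_FACTOR 1.5   /* assumed hash overcommit, as suggested above */

  /* Return the power-of-two array size minimizing estimated total bytes:
   * each array slot costs sizeof(TValue); every remaining key costs
   * sizeof(Node) times the load factor in the hash part. */
  static size_t best_array_size(const int nums[], int nslices, int total,
                                size_t sizeof_tvalue, size_t sizeof_node) {
    double best_cost = (double)sizeof_node * total * LOAD_FACTOR;
    size_t best_size = 0;   /* baseline: no array part at all */
    size_t twotoi = 1;      /* 2^i */
    int a = 0;              /* integer keys <= 2^i */
    for (int i = 0; i < nslices; i++, twotoi *= 2) {
      a += nums[i];
      double cost = (double)(sizeof_tvalue * twotoi)
                  + (double)sizeof_node * (total - a) * LOAD_FACTOR;
      if (cost < best_cost) {
        best_cost = cost;
        best_size = twotoi;
      }
    }
    return best_size;
  }

This trades array slack directly against hash overhead instead of applying the fixed "more than half full" rule.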

Anyway, I guess that officially makes filling up the array part O(n*log(n)).

Anton