lua-users home
lua-l archive



On 15.08.2014 04:43, Jan Behrens wrote:
On Thu, 14 Aug 2014 22:36:05 +0200
Dirk Laurie <dirk.laurie@gmail.com> wrote:


Actually, filling a table is O(n). It is true that there may be
O(log n) reallocations, but these are all doublings, so the
total time is like 1+2+4+8+...+2^k = 2n where n=2^k.

Oh, I didn't know :-) Thanks for that information. I always thought
filling tables is O(n*log(n)).

But I'm still unsure about the # operator in that case (see below).


Filling a table is O(n*log(n)) once you count the reallocations, so with t[#t+1]=newval you are
only increasing the constant factor of your algorithm, not its complexity.

To fill a table with n elements you will, on average, move n/2 elements to new places
log(n) times, which is exactly O(n*log(n)).

Of course this is more visible with the hash part than the array part, as there you are truly moving each element to its new place: recalculating its hash position (constant time; at this point even long strings already have their hash values computed), handling collisions, and all the other
hash machinery. In any case, it is pretty obvious that you traverse all the elements.

In the array part it is not so obvious: resizing an array is a realloc plus setting the new elements to nil. If you write off the cost of setting the new elements to nil against their future use, then everything seems constant time (per resize, hence O(n) for filling the table). But "realloc" is not constant time. In many cases it in fact takes time proportional to the block size (with a small, highly optimized
constant).

Realloc can be constant time when the memory after your block happens to be free: it will simply mark that memory as used and return the same pointer. If you are filling your table with numbers and not allocating any dynamic objects (strings, tables, etc.) in your loop, you are likely to hit this case quite often, but not always, and as your block grows you are more likely to run into fragmentation of the address space. More often than not realloc will do malloc+memcpy+free, and as you can guess,
memcpy takes linear time (while being quite fast at it).

At some point, when your allocations become big enough to justify a round trip to the kernel for every allocation, most allocators will resort to mmap/mremap (or whatever your OS provides). Mremap can do tricky things: when it has to move your memory (because there is not enough address space after your old allocation for the new chunk), it will just remap the hardware pages holding your data to a new location. This is fast, but
still linearly proportional to the number of pages.

Anyway, the bottom line is that in the long run realloc's running time is proportional to min(oldsize, newsize), hence log(n)
reallocations cost O(n*log(n)).

Anton