lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Andrea:

On Tue, May 12, 2020 at 5:42 PM Andrea <andrea.l.vitali@gmail.com> wrote:
> has anyone thought about using crit-bit / patricia trees? they seem to give the same functionality of hash-tables, with guaranteed performance (no worst case), and the capability to support in-order scan of the tree; see: https://cr.yp.to/critbit.html

The problem with many data structure analysis is they tend to focus on
big-O and leave out the constant factors. You do not have to go to
fancy ( in my case read newish ) data structures ( although I think
patricias are in my very old copy of Knuths TAOCPC-3, but I seem to
have misplaced it ). Balanced trees have guaranteed logarithmic
performance for every op, but their overhead makes them much slower
and bigger. I'm nearly sure data structures other than hashes have
been considered by the lua team, and by many other scripting languages
implementers, but  I also think the fact that Lua, Perl, Python, JS
and many more use hash tables for their native mapping implementations
is because they are very, very good for them.

AAMOF, I do a lot of C++ ad Java. And I always end up using unordered
_map and HashMap by default, they tend to be the best ones for the
general case.

In a later post you cite quicksort versus heapsort. I love heapsort,
and used it a lot in my fortran66 and line number basic days because
it is simple to code, non recursive, no extra storage needed,
guaranteed performance. But in current machines it's much slower than
QS ( and, among other things, very cache unfriendly ). Mergesort is
also guaranteeed log, but has the problems of needing extra space. QS
tends to be used by default because it's very simple to code, very
difficult to atack ( get worst case ) with some minor checks, and we
do not program nuclear reactors or avionics controls. For those, or
for any system which needs hard realtimne, heapsort might be better,
just put enough CPU for the extra load ( although in many hard real
time problems the sets to sort are so small you can just do
insertion/selection/bubble sort if you want, it's happened to me ( and
sometimes is the correct one to choose, you save a lot of programmer
time with a super simple algorithm, and if the test shows you can do
it, realtime systems do not normally have other use for the saved cpu
cycles )).



> about the performance: worst case is proportional to the length of the key, not the number of elements in the tree; the key is checked bit-by-bit (or byte-by-byte); no expensive key comparisn (I am thinking about strcmp); there is no hash involved, so no risk of flooding.

You sould check average case. I've found many tree data structures, in
the typical use, are an order of magnitude or more slower than simple
hashes. I've tested this in C and Java, where changing mapping
implementations is particularly easy, with test tapes from real data,
and found normally hash is the way to go.


Francisco Olarte.
_______________________________________________
lua-l mailing list -- lua-l@lists.lua.org
To unsubscribe send an email to lua-l-leave@lists.lua.org