lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Axel Kittenberger wrote:
>  But how about when concating two strings the Lua VM stores a datatype
> that just says "this is str1 joined with str2". Only whenever you do
> an operation that cannot be used on this string of strings, like using
> it as a table key, it is joined and interned.

I know this is a ceterum censeo, but I have to repeat my warning:

Ropes are a good idea on paper, but they increase complexity and
rarely pay off outside targeted benchmarks and textbook examples.
Yes, there may be some specific applications where ropes are the
right data structure (*). However ropes are certainly not the data
structure of choice for a generic string type in a generic
programming language.

Don't fall into the premature optimization trap. Whatever code you
envision that might benefit from ropes is simply uncommon. But all
of your code will pay the price of managing linked, shared and
reference-counted data structures as well as iterating over trees
all the time. This is very cache-inefficient, pointer-chasing has
a high latency, your CPU hardware prefetcher doesn't handle this
very well, you pay a high cost for creation _and_ destruction, and
so on. Modern CPUs are crazy beasts and linear data structures
very often beat elaborate data structures, even for medium-sized
use cases.

There's even compelling evidence, including benchmarks, that
flattened, colocated strings (which is what Lua 5.1 uses) are the
best option: http://ssw.jku.at/Research/Papers/Haeubl08Master/

(*) I guess almost everyone would place a bet that ropes are the
perfect data structure for a text editor. But don't trust your
intuition: e.g. gap buffers are less complex and much faster.

--Mike