lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Someone mentioned ropes on this thread and it got me thinking...

The standard way to avoid the garbage collection overhead when
concatenating strings is to use a stack of strings as per Lua Technical
Note 9. Bearing in mind the relationship between stacks and trees, I
thought it might be possible to use a rope to get much the same effect as
the LTN 9 approach but without any explicit coding from the user.

The most obvious way to implement a rope is to create a new tree root when
a concatenation is performed. However, it could also be done by adding the
shorter of the two operands as a leaf of the larger one. So, if you coded
something like:-

    "Ierusalimschy" .. " knows" .. " best!"

...then you would end up with the equivalent of

    concat("Ierusalimschy", concat(" knows", " best!")
  
...rather than

    concat(concat("Ierusalimschy", " knows "), "best!")

...as you would get with Boehm et al's original description of the rope
structure. The benefit of this version of the rope is that you can easily
consolidate two leaves of the tree into a single char array when it is
advantageous to do so. Basically, the concatenation algorithm would check
if the two branches of a node were equal in length, plus or minus a few
percent. So, in the example above, " knows" and " best!" would be
consolidated immediately because they are the same length. Then, the new "
knows best!" is about the same length as "Ierusalimschy", so these two
strings would be consolidated too. Of course, adding to the leaves of a
tree would be done using a recursive algorithm, so it would be easy to
check/consolidate the nodes on the way back up the call stack after the
new leaf was added.

I think this is conceptually almost the same as the LTN 9 algorithm but it
works automatically and could be implemented in the Lua interpreter itself
without the user coding it explicitly. It also has the advantage of
handling the case where short pieces are added to the start of a string
rather than the end. Where performance is a pressing issue, you could also
have a consolidation method for strings that explicitly flattens them out
into a char array at the user's request.

&.

Lua list <lua@bazar2.conectiva.com.br> writes:
>David Given wrote:
>> Having a quick shufty through the source code reveals that
>SpiderMonkey
>> distinguishes between mutable and immutable strings. Mutable strings
>can have
>> a preallocated buffer.
>> 
>> So what's probably going on here is that the concatenation is mostly
>being
>> done as a data copy, with no extra allocations needed. A neat trick.
>> 
>> Lua could do something similar, but this would require a rather more
>> complicated compiler and garbage collector, to intelligently decide
>whether to
>> use mutable or immutable strings...
>
>Yes, and consider that it's the complexity of SpiderMonkey which makes
>it intolerably slow, have a hideous C API, and be (historically)
>bug-ridden.
>
>--John





#####################################################################################
This e-mail message has been scanned for Viruses and Content and cleared 
by MailMarshal.
The Blackpool Sixth Form College.
#####################################################################################