[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: Alternative string concat (was "out of memory ")
- From: "Thomas Wrensch" <twrensch@...>
- Date: Tue, 08 Oct 2002 13:55:29 -0700
Agreed that the concatination operation is not particularly slow (I ran
a few tests, its more fun to argue with real data). Also the way
multiple concatenation works makes the join function I suggested
completely unnecessary.
However, there will be cases where using concatenation is not
appropriate, and does not produce overall linear (as in O(N)) behavior.
This is really only noticable for large numbers of concatenations.
There are two reasons why a group of N iterations might run in greater
than O(N) time.
First, in a situation like this:
local str = ""
for i=1,100000 do
str = str.."x"
end
The size of the string that must be created and copied grows as N grows.
With a sufficiently large N this should approach quadratic (as in
O(N*N)) time.
Running the above code for different values of N gave me these times:
Iterations Time
---------- ------
10000 0.13
20000 0.84
30000 1.92
40000 7.95
50000 13.46
60000 22.13
70000 30.86
80000 41.75
90000 54.35
100000 68.43
(I thought the jump from 30000 to 40000 was particularly interesting. I
was able to repeat the same behavior, thought the times varied up to 10%
or so).
I wish the above case could be called pathological, but it is not
unreasonable to do something like this to build up a string from
individual characters being read from somewhere. Note that in the above
case my join()
function would have exactly the same problem, but a userdata that
implemented a string buffer could operate in linear time.
Another possible case (I didn't collect data on this one) is when each
new string is kept around. Such as:
names = {}
for i=1,10000 do
names[i] = names..tostring(i)
end
In this case all operations are (I think) linear, with the exception of
interning the string. If I understand the way it is done, that would be
O(N log N) where N is the number of strings in the system.
I don't think the above case can really be called pathological either,
though it seems less useful than the first case.
In the first case some kind of string buffering system would speed
things up considerably. In the second case I think you're stuck, since
Lua requires strings to be interned and by definition you're keeping
them all around and available.
- Tom Wrensch
>>> lhf@tecgraf.puc-rio.br 10/07/02 13:38 PM >>>
>Since the speed of the string .. operation is slow (a problem shared by
>several languages) you could fix the problem by increasing the speed of
>the existing .. operation or you could use some other, faster
technique.
The string .. operation is not slow in Lua. It's actually very fast,
even
for multiple concatenations. The virtual machine operation is actually
for
multiple concatenation. Here is the code for
return "a" .."b" .."c" .."d"
1 [2] LOADK 0 0 ; "a"
2 [2] LOADK 1 1 ; "b"
3 [2] LOADK 2 2 ; "c"
4 [2] LOADK 3 3 ; "d"
5 [2] CONCAT 0 0 3
So, this is a linear (not quadratic) operation.
Bottom line: there's no reason not to use multiple concatenations for
long
strings. (Except that there is a limit on the number of concatenations.)
--lhf