Here's an approach that uses a table, but the table does not get very big. To concatenate n strings, it performs O(n log n) concatenations.
An implementation in C would of course be faster, but appending 2 million characters one at a time took about 0.4 seconds on my laptop, so it's not horrible.

I believe Lua does something similar internally, as does the Java StringBuilder class.

Ge'


local buffer_class = {}

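function buffer_class:append(a)
    if #a > 0 then
        local size = #self
        -- Merge trailing slots into 'a' while 'a' is more than half the
        -- length of the last slot, so each slot stays at least twice as
        -- long as the one after it.
        while size > 0 and 2*#a > #self[size] do
            a, self[size] = self[size] .. a, nil
            size = size - 1
        end
        self[size+1] = a
    end
end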

function buffer_class:get()
    return table.concat(self)
end

local buffer_mt = { __index = buffer_class }

local function new_buffer()
    return setmetatable({}, buffer_mt)
end

local b = new_buffer()
for i = 1, 1000000 do
    b:append("x")
    b:append("y")
end
local s = b:get()
print("# of slots:", #b, #s)
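
To see why the table stays small, here is a rough sketch that repeats the buffer in a self-contained form and counts the slots left after n single-character appends. The helper name slots_after is made up for this illustration. With equal-sized pieces the merge rule behaves like carries in a binary counter, so the slot count should grow at most logarithmically in n.

```lua
-- Self-contained copy of the buffer, so this snippet runs on its own.
local buffer_class = {}

function buffer_class:append(a)
    if #a > 0 then
        local size = #self
        while size > 0 and 2*#a > #self[size] do
            a, self[size] = self[size] .. a, nil
            size = size - 1
        end
        self[size+1] = a
    end
end

function buffer_class:get()
    return table.concat(self)
end

local function new_buffer()
    return setmetatable({}, { __index = buffer_class })
end

-- Hypothetical helper (not part of the original post): append n single
-- characters, check the result length, and return the number of slots.
local function slots_after(n)
    local b = new_buffer()
    for i = 1, n do
        b:append("x")
    end
    assert(#b:get() == n)
    return #b
end

for _, n in ipairs({ 1, 7, 8, 1000, 100000 }) do
    print(n, slots_after(n))
end
```

For example, 7 appends leave three slots (lengths 4, 2, 1), and the 8th append collapses them into a single slot of length 8.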

On Thu, Jan 12, 2023 at 9:12 AM Lars Müller <appgurulars@gmx.de> wrote:
Both string.gsub and table.concat are internally efficiently implemented
using an exponentially grown string buffer. To efficiently "transform" a
string, you can usually use gsub; to concatenate strings, you can build
a table and concat it. Unfortunately the string buffer remains
unexposed, so you always have to use either function. gsub is often
unsuitable - in particular when you are "generating" a string rather
than "transforming" one, or if your "transformations" are too
sophisticated for gsub. In these cases, you will usually have to resort
to concat, which forces you to build a table. If you are building
strings character-wise, this will use 8 - 4 times the memory of the
internal string buffer on a 64-bit system. Additionally, it requires you
to build a table, even if everything could be streamed directly into the
buffer.
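
As a minimal illustration of the two standard idioms described above (the variable names are arbitrary and the data is made up for the example): gsub when an existing string is being transformed, and a table plus table.concat when a string is being generated piece by piece.

```lua
-- "Transforming": gsub rewrites each match of the pattern.
-- The extra parentheses drop gsub's second result (the match count).
local upper = (("hello world"):gsub("%w+", string.upper))

-- "Generating": collect the pieces in a table, then join them once.
local parts = {}
for i = 1, 5 do
    parts[#parts + 1] = tostring(i * i)
end
local s = table.concat(parts, ",")

print(upper)  --> HELLO WORLD
print(s)      --> 1,4,9,16,25
```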

Thus I suggest a new string library function called
string.from_iterator(...) or the like; it would take as parameter(s) a
for-loop iterator func, state, initvar, closevar and would return a
string, internally using a buffer to build it.

This could e.g. be used to conveniently build a string using
string.gmatch: string.from_iterator(("abba aba aa bb"):gmatch"ab*a")
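
A rough pure-Lua sketch of how such a function might behave is shown here. The name from_iterator and its signature are assumptions taken from the proposal; nothing like it exists in the standard string library. This version still collects the pieces in a table and calls table.concat, so it only shows the intended interface, not the memory saving a built-in with direct buffer access could offer.

```lua
-- Hypothetical stand-in for the proposed string.from_iterator.
-- Takes the usual generic-for values: iterator function, state,
-- initial control value, and (Lua 5.4 only) a closing value.
local function from_iterator(f, s, var, close)
    local parts, n = {}, 0
    -- Only the first value produced by each iteration step is used;
    -- it must be a string or a number for table.concat to accept it.
    for piece in f, s, var, close do
        n = n + 1
        parts[n] = piece
    end
    return table.concat(parts)
end

local joined = from_iterator(("abba aba aa bb"):gmatch("ab*a"))
print(joined)  --> abbaabaaa
```

Here gmatch yields "abba", "aba" and "aa" ("bb" contains no "a", so it does not match), and the three matches are joined without a separator.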

- Lars

