Utility functions to manage sets

lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]

Subject: Utility functions to manage sets
From: Duck <duck@...>
Date: Tue, 3 Jul 2007 10:59:20 +1000 (EST)

G'day everybody.

This is a naive question which has doubtless been answered many times.Recent postings about table.sort() imply but don't explain the answer.

In code in LuaSocket and COPAS, you can find a utility "set" class such asthis:


-- simple set implementation
-- the select function doesn't care about what is passed to it as long as
-- it behaves like a table
-- creates a new set data structure
function newset()
    local reverse = {}
    local set = {}
    return setmetatable(set, {__index = {
        insert = function(set, value)
            if not reverse[value] then
                table.insert(set, value)
                reverse[value] = table.getn(set)
            end
        end,
        remove = function(set, value)
            local index = reverse[value]
            if index then
                reverse[value] = nil
                local top = table.remove(set)
                if top ~= value then
                    reverse[top] = index
                    set[index] = top
                end
            end
        end
    }})
end

This code is very cool, and the "add at end" and "delete by removing thelast element and writing it over the one you want to get rid of" isclearly more time efficient than just using set[value] = true to add andset[value] = nil to delete items.

My question is, how much more efficient (compared to the overhead of usingtwo tables where one is enough)? If I add items with random indices to atable and then randomly remove them, where is the inefficiency? Is it inthe garbage collection of the removed items? Is it a raft of O(n)memcpy()s when compacting metadata (e.g. hashes) for tables which havescattered bunches of removed items?

Where does the time go in starting with an empty (non-array) table andadding and removing entries arbitrarily, where the the adds and removesare paired, but only in the long run. (E.g. you might add 50 items, thenremove a random 10 of them, then add 16 more, then remove 30 of all thatare left, then add 6, then remove the lot, then add 66, rinse, repeatfor a long time.)

(Apologies if this is answered in detail in PiL, which it probably is. Ifanyone has hard measurements of the differences in behaviour I'd be mostinterested. Any OS at all from the set {Windows, Linux} :-)

Follow-Ups:
- Re: Utility functions to manage sets, Javier Guerra Giraldez

Prev by Date: Overloading Concatenate operator
Next by Date: Re: Overloading Concatenate operator
Previous by thread: Re: Overloading Concatenate operator
Next by thread: Re: Utility functions to manage sets
Index(es):
- Date
- Thread