lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]



On May 30, 2015, at 9:28 PM, Brigham Toskin <brighamtoskin@gmail.com> wrote:

This is kind of an interesting point, actually. Does anyone know the rationale for not just automatically tracking the number of entries in a table? Similar data structures in many other languages do this. I know that's not in an of itself a reason to do anything, but it does seem useful. If you know you're dealing with a sequence a sequence, then #t effectively becomes constant instead of logarithmic. And if you don't know it's a sequence (or you know it's not) then you have a convenient counting method that doesn't involve an iterator.

It’s the overhead involved. As Roberto has noted in the past, the code to do the checking would be a performance tax on *all* tables, even those for which sequence checking is irrelevant. Given the current definition of a sequence, it’s quite hard (from an efficiency standpoint) to determine the truth of the statement “table t is a valid sequence” (it’s *not* just a count of elements, even numeric ones).

Here is one (not very efficient) way to do it :

function isSequence(t)
    keys = {}
    for k, _ in pairs(t) do
        if (type(k) == "number") and (k > 0) then table.insert(keys, k) end
    end
    table.sort(keys)
    for k, v in ipairs(keys) do if k ~= v then return false end end
    return true
end

t1 = { "a", "b", "c" }
t2 = { [1] = "one", [3] = "three", [4] = "four" }
t3 = { "a", "b", "c", [1.2] = "d" }
t4 = { "a", "b", "c", "d" }
t4[3] = nil

print(isSequence(t1))
print(isSequence(t2))
print(isSequence(t3))
print(isSequence(t4))

(Note that this code returns true for an empty sequence, which is my interpretation of the Lua definition, which states "for some non-negative integer n”.)

I’m sure others will jump in with more efficient ways to do the same check. If I was writing a C version, I would certainly do the following:

— Use a simple C 64-bit integer array for “keys” (and enlarge it dynamically in power-of-two steps)
— Scan for non-integral positive keys as they are iterated, and fail at once if one was found
— When inserting items into the “keys” array, keep track if the items arrive in order (in which case no sort is needed)
— When inserting items into the “keys” array, keep track if the table is a sequence so far (that is, item[n] == n during the insert)
— Explore using a bitmap to represent the keys array while the maximum entry is less than some reasonable N, and switch to an array only if a larger key is encountered.

However, it’s Sunday, and my optimizer is off for the day :)

—Tim