Lua Table Size



[!] VersionNotice: The problem described in this article has been addressed. Starting in version 5.0, the value n is no longer referenced in size calculations of the list component of a table.

The problem

Tables contain the member "n" as an optimisation for table insertions, i.e., from the manual:

getn (table)
Returns the size of a table, when seen as a list. If the table has an n field with a numeric value, this value is the size of the table. Otherwise, the size is the largest numerical index with a non-nil value in the table. This function could be defined in Lua:
function getn (t)
  if type(t.n) == "number" then return t.n end
  local max = 0
  for i, _ in t do
    if type(i) == "number" and i>max then max=i end
  end
  return max
end

Due to the dual nature of tables, i.e. they can be lists and dictionaries at the same time, "n" can conflict with user data in a table. The section below lists a number of solutions to the problem. Please feel free to comment next to a solution or put forward your own. Please leave your name or initials so "votes" can be counted.
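The clash can be reproduced with a small sketch. It is written for Lua 5.1 or later (it iterates with pairs, whereas the manual's version uses the Lua 4.0 form of for), and getn_compat is a hypothetical name for a reimplementation of the manual's definition, not a library function.

```lua
-- Hypothetical reimplementation of the manual's getn for Lua 5.1 or later.
local function getn_compat(t)
  if type(t.n) == "number" then return t.n end
  local max = 0
  for i in pairs(t) do
    if type(i) == "number" and i > max then max = i end
  end
  return max
end

-- A table used as a list; it has no n field yet.
local items = { "a", "b", "c" }
local size_before = getn_compat(items)  -- largest numeric index: 3

-- The same table is now also used as a dictionary,
-- and the user happens to pick the key "n" for unrelated data.
items.n = 10

-- The reported size follows the user data, not the list contents.
local size_after = getn_compat(items)   -- 10
```

Nothing in the list part changed between the two calls; only the user's "n" entry did.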

This could be defined better. The n field is not the "table size" and is not for "table insertions". (Granted the Lua API function names and documentation confuse this matter.) Tables are a pure data type in Lua, while lists are not. There are several ways to implement lists on top of the table data type. Since lists are an important data type and required even within the VM itself (for varargs), the standard library provides an implementation including use of an n field for the list size and an algorithm for determining the "list component" of an arbitrary table (that's what getn does when there is no n field).

Alternative solutions

It's not a problem, leave it.

Your programming style may not necessitate the mixing of lists and dictionaries in the same table so it may not be a problem.

So, basically you don't want to break backwards compatibility, you find that because the table size is named "n" any conflict bugs are easy to find, and you don't mind always having your table size called "n". Why not remove the problem, since, as you state, "sooner or later it will bite you", and you can be flexible about naming? I don't remember seeing any code which sets a table size using n (so no backward problem?). --NDT

I think that summarises it :-). If I'm using a table as a vector, I don't use it as a dictionary although I might put my own keys in it. In that case, I simply don't use the key n or any numeric key. That's no different from the case of stashing keys in any object implemented as a table; you have to avoid using the object's defined keys, which hopefully are documented. As it happens, I do have code which sets table size using n -- unless you have seen every line of Lua code in existence, I don't think you can blithely make the claim that there is no backward problem. In any event, I think it's pretty common to retrieve the table size using vec.n rather than getn(vec) because that is significantly faster if you can be assured that the key exists (or even vec.n or getn(vec)). -- RiciLake

"which hopefully are documented" :-) It's this kind of confusion that can be avoided. The fact that tables are multipurpose and can be used as lists or dictionaries means that this type of restriction should not be applied. Personally I'd rather put up with fixing code which may be broken by this. I consider Lua still in its infancy as a language, with its roots in convenient embedding and configuration. It still has to have little quirks, such as this, ironed out before it is considered a serious scripting language. I'm not sure becoming a "Python-beater" is one of its design goals, but I'm sure the authors are keen to see the language and its user base develop. This will continue to happen as Lua improves. :-) --NDT

To be fair, there are a couple of other places, one being call. However, it is certainly easy enough to write replacement functions and no-one is forcing you to use tinsert and tremove -- RiciLake

It should be renamed

The "n" variable should be renamed to something less likely to clash, eg. "__n__"
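As a rough illustration of the renaming idea, the sketch below keeps the size under "__n__" and leaves "n" free for user data. It assumes Lua 5.1 or later; SIZE_KEY and getn_renamed are hypothetical names introduced here, not part of any Lua release.

```lua
-- Assumption: the list size is stored under "__n__" rather than "n".
local SIZE_KEY = "__n__"

-- Hypothetical getn variant that consults the renamed field only.
local function getn_renamed(t)
  if type(t[SIZE_KEY]) == "number" then return t[SIZE_KEY] end
  local max = 0
  for i in pairs(t) do
    if type(i) == "number" and i > max then max = i end
  end
  return max
end

-- "n" is ordinary user data here and no longer affects the size.
local t = { "x", "y", n = 99 }
local size_scanned = getn_renamed(t)   -- largest numeric index: 2

-- An explicit size can still be recorded under the renamed key.
t[SIZE_KEY] = 5
local size_explicit = getn_renamed(t)  -- 5
```

Renaming only makes a clash less likely; a table that uses "__n__" as a dictionary key would hit the same problem.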

setn()

setn() would be a better solution to complement getn().

len[t]

        settagmethod(tag({}), "index",
              function(v, k) if k == "n" then return len[v] end end)

So I'm willing to change my vote --RiciLake

This is also compatible with getn and setn; if that's what you want, just include:

        function getn(v) return len[v] end
        function setn(v, n) len[v] = n end
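The definitions above rely on a table called len that is not shown here. One way to read it, offered only as an assumption, is a side table that maps each list to its size so that nothing is stored in the list itself. The sketch below uses a weak-keyed table from Lua 5.1 or later instead of the tag methods shown above.

```lua
-- Assumption: len maps each list to its size, outside the list itself.
-- Weak keys allow a list to be collected once nothing else refers to it.
local len = setmetatable({}, { __mode = "k" })

local function getn(v) return len[v] end
local function setn(v, n) len[v] = n end

-- The list keeps its own "n" entry as plain user data.
local vec = { "a", "b", "c", n = "user data" }
setn(vec, 3)

local size = getn(vec)  -- 3, independent of vec.n
```

Because the size lives in len, the key "n" inside vec is never read or written by getn and setn.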

Please add any other solutions...

Implement a non-overridable getnEx() function

function getn (t)
  if type(t.n) == "number" then return t.n end
  return getnEx(t) -- internal function
end
People could then ignore the getn() function if it causes them grief, and just use the getnEx() function, which doesn't require workarounds. getnEx() would be essentially equivalent to:
function getnEx (t)
  local max = 0
  for i, _ in t do
    if type(i) == "number" and i>max then max=i end
  end
  return max
end

--Paul Hsieh
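The difference between the two functions can be seen in a short sketch for Lua 5.1 or later. Both functions here are plain Lua stand-ins: the proposal has getnEx as an internal function, which this example only approximates.

```lua
-- Stand-in for the proposed internal function: it never looks at t.n.
local function getnEx(t)
  local max = 0
  for i in pairs(t) do
    if type(i) == "number" and i > max then max = i end
  end
  return max
end

-- getn keeps its documented behaviour and defers to getnEx otherwise.
local function getn(t)
  if type(t.n) == "number" then return t.n end
  return getnEx(t)
end

local t = { 10, 20, 30, n = 7 }
local via_getn = getn(t)      -- honours the n field: 7
local via_getnEx = getnEx(t)  -- ignores the n field: 3
```

Code that stores its own data under "n" would call getnEx directly, while existing code that sets n on purpose keeps working through getn.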

Votes cast

Please update the list below. If you would prefer to vote anonymously just add a vote below (but it would be nice to hear your opinion :-).


Last edited October 10, 2009 1:32 pm GMT