lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Hi,

I have been using Lua a lot over the past year, and I have always been
puzzled by the table length operator. I understand what it does, but I
don't find it particularly useful.

Lately, there have been many proposals to change the definition of the
length operator, but I feel that most of them either have a high
algorithmic cost or are not much more useful than the current
definition.

I think that a big part of the problem is that people feel 'nil'
should somehow be treated differently from other values when storing
it in a table  used as an array. I don't see why. Being free to store
nil anywhere in any array could be very useful in Lua.

So here's yet another proposal for the table length operator. It is
simple to understand, simple to implement, useful, and with a low
algorithmic cost (i.e. O(1) to update or to query the length of a
table.) Even better, arrays with holes and sparse arrays would not be
a special case anymore.

Given a table t, #t would be defined by those two rules:

1 - #t is the highest positive integer index to which a value was
assigned in t (including nil)

2 - table.remove(t, i) decrements #t by 1 if i <= #t.

The ipairs function should also be changed so that it doesn't stop
when it encounters a nil value. In other words, it would iterate over
every pair (i, t[i]) where 1 <= i <= #t.

Here are some examples to illustrate how it would work:

> t = { 1, nil, 3, nil }
#t is 4 (in Lua 5.1, #t is be 1)

> t = { 1, nil, 3 }
#t is 3 (in Lua 5.1, #t is also 3)

> t = { [1000] = true }
#t is 1000 (in Lua 5.1, #t is 0)

> t = { [1000] = true }
> table.remove(t)
#t is 999 and t[1000] is nil (in Lua 5.1, #t is 0, and t[1000] is true)

> t = { [1000] = true }
> table.remove(t, 1)
#t is 999, and t[999] == true (in Lua 5.1, #t would be 0 and t[999]
would be nil)

This definition has many advantages over the current one:

- Marshalling variable argument lists would work properly (this makes
table.pack redundant in Lua 5.2):

>  function bind(f, ...)
>>   local parameters = { ... }
>>   return function() f(unpack(parameters)) end
>> end
>
> f = bind(print, 1, nil, 3, nil)
> f()
Output: 1   nil 3   nil
In Lua 5.1, the output is: 1

Many people,including me, have been bitten by this behaviour. The code
above deceptively looks like it should work in Lua 5.1, and it does
work in most situations. And then, of course, it fails at the worst
possible moment.

- The t[#t+1] idiom for inserting values into tables becomes much more
efficient for large tables, since the algorithmic cost of #t is O(1).

- The behaviour of the length operator more closely follows the
programmer's intent. The effect of table assignments on the length and
the contents of the table are easier to understand:

> t = {}
> t[#t+1] = GetValue() -- this call to GetValue() returns 1
> t[#t+1] = GetValue() -- this call to GetValue() returns nil
> t[#t+1] = GetValue() -- this call to GetValue() returns 3
> t[#t+1] = GetValue() -- this call to GetValue() returns nil
> print(unpack(t))
Output: 1   nil 3   nil

In that example, no matter what GetValue() returns, #t will be 4, and
t will contain the values returned. In Lua 5.1, #t would depend on the
values returned by GetValue(). In this particular case, #t would be 2,
and this code would print: 1   3

Of course, nothing is perfect. The biggest inconvenience is that the
"t[#t] = nil" idiom to remove the last element from the array would
not work anymore:

> t = {1, 2, 3}
> t[#t] = nil
> print(#t)
3
> print(unpack(t))
1   2   nil

To remove the last element, table.remove(t) must be called:

> t = {1, 2, 3}
> table.remove(t)
#t is 2

However, when you think about it, "t[#t] = nil" reads as "assign nil
to the last index in t", so it doesn't seem right that it reduces the
length of the array. Personally, I see the new behaviour as an
improvement, but I know many will disagree on that point (especially
those who wrote a lot of code using that idiom!) This would be a big
change, but no worse than the function environment changes in Lua 5.2.

The other inconvenient is that tables would have to store their length
and incur a small cost for updating it.  While the Table structure in
lobject.h already has many fields, I understand that people using Lua
on embedded systems may object to adding one more.

I have been testing a modified version of Lua with this change, and I
find it surprisingly natural (I am obviously very biased!) It
generally works the same, but it seems that there are fewer surprises.
I think the hardest thing was to stop considering that a 'nil' in an
array is a hole; it's a value like any other. Whether it is actually
stored in the table or not is an implementation detail that shouldn't
leak in Lua.

- Keith