• Subject: Re: Bug report in length of simple table
• From: Andrew Starks <andrew@...>
• Date: Wed, 14 Sep 2016 13:28:30 -0500

```On Wed, Sep 14, 2016 at 11:28 AM, Dirk Laurie <dirk.laurie@gmail.com> wrote:
> 2016-09-14 17:35 GMT+02:00 Roberto Ierusalimschy <roberto@inf.puc-rio.br>:
>>> "Frontier" feels more appropriate, but if we follow Dirk's suggestion
>>> to avoid collision with %f then perhaps "edge" or "threshold" might be
>>> a reasonable term.
>>
>> "Edge" sounds good, too.
>
>
> 1. Short.
> 2. Similar to image processing usage. One can say that
> the default length calculation is an edge-detection algorithm.
>

No matter what, there seems to be a great need for a definition that is:

1: concise (but not overly!)
2: understandable by laypeople
3: not confusing to people with math or CSCI backgrounds[1]

Here is my attempt. It does not satisfy the first condition and I use
"array" instead of "sequence". But, it might be useful and so I'll
share it:

Lua does not have array structures, only tables. However, we can
emulate the behavior of an array, including the ability to get its
length, by following some simple rules:

1: An array cannot store `nil` values.
2: An array can be empty. That is, there are no integral keys with
non-nil values.
3: A table may have any number of non-integral keys.
4: The first element of an array is at index `1`.
5: There can be no integral keys beyond the end of the array. The end
is the integral key that has a non-`nil` value where the next integral
key has `nil` as its value.
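
As a minimal sketch of a table that follows these rules (the names
`t` and `count` are only for illustration), `#` and `ipairs` agree on
the three array elements, and the non-integral key `name` affects
neither:

```lua
-- Three array elements plus one non-integral key.
local t = {10, 20, 30, name = "example"}

print(#t)
--> 3

-- ipairs visits indices 1, 2, 3 in order and ignores `name`.
local count = 0
for i, v in ipairs(t) do
  count = count + 1
end
print(count)
--> 3

-- Appending at #t + 1 keeps the table a valid array.
t[#t + 1] = 40
print(#t)
--> 4
```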

If your table conforms to these rules, then you may use Lua's `ipairs`
and length operator `#` without any surprises. If your table does not
conform, then the result of `#` is not defined, and that is for a very
good reason: validating a sequence is slow and, as the programmer, you
can decide whether you need the validation or not. We'll show how we
can validate a sequence later on.

First, let's look at some invalid results:

```
local t1 = {10, 20, nil, 40}
print(#t1)
--> 4
local t2 = {10, 20, nil, [4] = 40}
print(#t2)
--> 2
```

Which answer from Lua's `#` operator is correct? Neither. If your
table does not contain a valid array, then the value returned by Lua's
`#` operator is invalid, and it cannot be depended upon. One more
example:

```
local t = {1, 2, 3, [100] = 100}
local prevpos, pos = #t, #t + 1
while true do
  if prevpos + 1 ~= pos then
    print("skipped! #t:", #t, "prevpos:", prevpos, "pos:", pos)
    break
  end
  t[pos] = pos
  prevpos, pos = pos, #t + 1
end
--> skipped! #t: 100 prevpos: 96 pos: 101
```

If you need to store `nil` values in your "array" or need some other
sort of fanciness, you can define a custom `__len` metamethod, which
allows you to override Lua's result for the `#` operator. Here is a
very basic and potentially slow example:

```
local mt = {
  -- Return the largest positive integer key, or nil if there is none.
  __len = function(t)
    local len
    for i in next, t do
      -- math.type also rejects string keys such as "10"
      if math.type(i) == "integer" and i >= 1 and (not len or len < i) then
        len = i
      end
    end
    return len
  end
}

local t = setmetatable({1, 2, 3, [100] = 100}, mt)
local t2 = setmetatable({[-100] = -100, [-84] = -84, [101.3] = 101.3}, mt)
local t3 = setmetatable({foo = 'bar', [10] = 10}, mt)
print("#t:", #t, "#t2:", #t2, "#t3:", #t3)
--> #t: 100 #t2: nil #t3: 10
```

If you are writing software that receives tables from the outside
world, then you might need something like the above, since you can't
trust that the table will be a valid array, as far as Lua is
concerned.
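
One possible shape for a validity check, as a sketch only: the
hypothetical helper `is_array` below counts the positive integer keys
and compares that count with the largest one. It assumes Lua 5.3 or
later for `math.type`, follows the rules above in ignoring
non-integral keys, and has to walk the whole table, which is exactly
the cost that `#` avoids.

```lua
-- Hypothetical helper, not part of Lua's standard library.
-- Returns true and the length if the positive integer keys of `t`
-- are exactly 1..n; otherwise returns false.
local function is_array(t)
  local count, max = 0, 0
  for k in pairs(t) do
    if math.type(k) == "integer" and k >= 1 then
      count = count + 1
      if k > max then max = k end
    end
  end
  if count == max then
    return true, max
  end
  return false
end

print(is_array({1, 2, 3}))
--> true 3
print(is_array({1, 2, 3, [100] = 100}))
--> false
print(is_array({foo = 'bar'}))
--> true 0
```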

Sometimes it is better to avoid the use of a metatable and store the
length in a field of your table, such as the `n` key. This is what
the `table.pack` function does.

```
local t = table.pack(1, 2, nil, 4)
print('t.n', t.n)
--> t.n 4
```
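
Because `n` counts the `nil` slot as well, a numeric `for` loop up to
`t.n` visits every position, and `table.unpack` can be given the same
bound explicitly. A short sketch (the variable names are only
illustrative):

```lua
local t = table.pack(1, 2, nil, 4)

-- Visit every slot, including the nil at index 3.
for i = 1, t.n do
  print(i, t[i])
end

-- Explicit bounds make table.unpack return all four values,
-- nil included, without relying on #t.
local count = select('#', table.unpack(t, 1, t.n))
print(count)
--> 4
```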

In summary, follow the rules and you'll never notice anything weird
about Lua's `#` operator. If you can't trust the validity of your
arrays or want to achieve a different result, then override `__len` or
use a different structure.

----------

- Andrew Starks

[1] It seems like most people with CSCI backgrounds think of a
sequence as a structure of commands that execute in a predetermined
order. Lua is using the meaning found in math circles which is a
potential confusion that could be avoided. However, it may be too late
to start using a different word and I'm sure that the topic has been