lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]



-----Original Message-----
From: lua-l-bounces@lists.lua.org [mailto:lua-l-bounces@lists.lua.org] On Behalf Of Jonathan Goble
Sent: Wednesday, September 14, 2016 9:06
To: Lua mailing list
Subject: Re: Bug report in length of simple table

On Wed, Sep 14, 2016 at 2:32 AM,  <Tomas.Lavicka@tieto.com> wrote:
> I am new to lua and I want to understand the language. According book (Programming in Lua, 3rd edition):
> "More precisely, a sequence is a table where the numeric keys comprise 
> a set 1, . . . , n for some n." (page 24) So table {10, nil, 30, 40} contains sequence {10} (n=1), which length is 1. Is there any other definition?
> It is very confusing and operator # can be unusable for tables, because in many cases I need to check sequences and I cannot trust this operator. It should return or 1 (book definition) or if holes in sequence break the sequence, it should return 0. Otherwise I will be pushed for every sequence check to check all array fields manually in some kind of loop.

It behaves the way it does for efficiency.

The length algorithm, as I understand it (from previous explanations on this list), does not and should not check every key-value pair in the table, as you suggest, to determine whether or not a valid sequence is present. That would be an O(n) operation, and nobody wants that. The actual algorithm is (I think) closer to O(log n), and checks pairs of positive integer keys according to a sensible pattern until it finds an `n` where `t[n]` is not nil and `t[n+1]` is nil. It then returns `n` as the length.

Thus, the length of a table is defined if and only it is a sequence, since with a non-sequence, getting the length could result in any non-negative integer `n` where `t[n]` is non-nil and `t[n+1]` is nil.
The length operator guarantees only that.

So what exactly is a sequence? A sequence, as you quoted from PiL, is "a table where the numeric keys comprise a set 1, . . . , n for some n." "Comprise" means that all of the numeric keys have to participate in that, but even that is not perfectly accurate. Let me tweak and expand the wording of PiL to be even more precise:

"A sequence is a table where the set of all positive integer keys that do not correspond to a nil value is exactly equal to the set {1..n} where n is the highest positive integer key that does not correspond with a nil value."

So what is the table {10, nil, 30, 40} (to use your example)? It cannot be a sequence of length 4, because the key 2 is nil. It also cannot be a sequence of length 1 because keys 3 and 4 have values.
Specifically, the set of all positive integer keys is {1, 3, 4}, but the set of all integers {1..n}, where n is the highest integer key in the table, is {1, 2, 3, 4}. As those set are not identical, it is therefore not a sequence at all, and the length of the table is undefined.

The bottom line is that if you want to work with sequences, you CANNOT use nil as a value in those sequences.


P.S. In the explanation above, I repeatedly stated "positive integer keys" rather than "numeric keys" or simply "keys". Why? Because (and this is an important point) zero, negative, non-integer (float) and non-numeric keys are completely irrelevant to whether a table is a sequence. When determining whether a table is a sequence, ONLY positive integer keys matter. All other keys are ignored for that purpose. So, for example, the table

{10, 20, 30, 40, [math.pi] = 3.14, [0] = 'foo', [-4] = 'bar', ignore = 'me'}

is a valid sequence of length 4.
---------------------------------------------------------------------------------------------------------------------------------------

OK, I get it. Thx 4 explanation. But can you also explain to me next behavior?
> table.unpack{"a",nil,"c"}
a       nil     c
> table.unpack{[1]="a",[2]=nil,[3]="c"}
a

I am still confused with sequences, but I hope I will understand it soon.

Tomas Lavicka