lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


On Fri, Dec 3, 2010 at 10:39 AM, Roberto Ierusalimschy
<roberto@inf.puc-rio.br> wrote:
>> [*] Of course, this is a minor problem compared to explaining the
>> weird behavior of the # operator. I predict it will be changed to
>> return the actual number of elements of a table sometime around Lua
>> 7.0 (when the argument that not maintaining a counter saves precious
>> memory and processing won't be as compelling), [...]
>
> The problem with # returning the actual number of elements of a table
> is not space, but that it does not solve most of the real problems that
> the current # have. It sure becomes deterministic, but mainly useless
> nonetheless. #{1,nil,1,nil,1} being 3 is not very helpful when you think
> about a list with holes. Doing "a[#a + 1] = nil" will still not work
> with that scheme (in the same way it does not work with the current
> one).

I think the notion of length for a "list with nils as holes" mapped
into Lua tables is simply ill-defined anyway, unless one considers the
list to be infinite or that holes are allowed anywhere but the end of
the list -- which is, arguably, a contrived definition.

>From my understanding, the Lua standard libraries acknowledge two main
uses for Lua tables: generic key-value mapping and as contiguous
sequences of elements indexed from 1 to n (ipairs, the sane uses for
default position arguments in table.insert, table.remove, etc.).

>From those uses, I see two sensible deterministic definitions for #,
which would be either:

* the number of elements in the table
   # pros:
      - deterministic;
      - simplest possible definition;
      - same as the number of entries iterated by pairs;
      - works fine as array length for (proper, non-"holed") arrays;
      - can serve as the number of elements for other data structures
a table may represent, such as sets;
      - allows the intuitive "if #t == 0" to check if a table is empty;
   # cons:
      - doesn't work as array length for tables that have both an
"array part" of numeric indices and a "hash part" with other keys;
      - won't serve as length for tables used as "lists with holes
anywhere except the end" (I don't think this is really a con, but
since it was brought up...)

* "array length": the number of continuous integer keys starting from 1
   # pros:
      - deterministic;
      - same as the number of entries iterated by ipairs;
      - works fine as array length whether they have a "hash part" or not.
   # cons:
      - no utility for non-array tables;
      - won't serve as length for tables used as "lists with holes
anywhere except the end";.

Javier's definition, where he argues maintaining the value # could
become linear, is a different one:

* number of the highest numeric key.
   # pros:
      - deterministic;
      - works fine as array length whether they have a "hash part" or not.
      - serves as length for tables used as "lists with holes anywhere
except the end";
   # cons:
      - no utility for non-array tables;
      - doesn't match either pairs or ipairs.
      - it would degrade complexity of operations (showstopper)

For completeness, the current definition of #:

* some number n for which there is a t[n] but not a t[n+1] or perhaps
zero if t[1] is nil
   # pros:
      - lowest memory and processing cost;
      - works fine as array length whether they have a "hash part" or not.
   # cons:
      - contrived, non-deterministic definition;
      - no utility for non-array tables;
      - doesn't match either pairs or ipairs.
      - won't serve as length for tables used as "lists with holes
anywhere except the end"

I'm sorry about bringing this subject on the mailing list again[*] and
sorry about the long reply (I just wanted to let Javier know that I
did think this through before 'complaining'. :) )

-- Hisham
[*] Roberto, in case you want to hit me with a stick for bringing this
up on list, I'll be in Rio this weekend. :)