On Fri, Jul 5, 2013 at 4:48 PM, Tim Hill wrote:
> So getting back to my 3 questions…
> 1. Is there a need for an "empty" element within an array (where "empty" as
> a concept is tbd)?
> 2. Assuming #1 is "yes", would this be useful as a standardized technique so
> that everyone uses the same convention?
> 3. Assuming #2 is "yes", what form should this standard technique take?
> You are arguing that the answer to #1 is "no"?
> One thing that I find interesting is the amount of discussion here .. this
> is suggestive of *something* to be sure. Perhaps it's a misunderstanding of
> the problem, or perhaps there is *some* kind of problem here but people
> differ as to what it is.
> One thing I like about Lua is the intuitive and clean nature of the
> language. But the Lua concept of a "sequence" to my mind is a bit odd .. it
> feels more like something that fell out of an internal optimization of table
> implementation (compact storage for integer keys and fast indexing) rather
> than a designed feature. This makes the # operator fragile; any 3rd party
> code or library can "corrupt" an array and make # return arbitrary invalid
> values and there is no way to discover this that I am aware of. I find this
> behavior a bit odd, and clearly some of the other posters do as well, as
> many suggestions here make # more robust (or provide a similar mechanism
> that is itself more robust).
> Why is this relevant? Because my "empty" design (flawed as it might be)
> makes # more robust by providing a way to have them sparse but still a
> defined size, and the various other suggestions also do the same by other
> means.
> --Tim
> On Jul 5, 2013, at 1:24 PM, Andrew Starks wrote:
> Would never be seen by the garbage collector because nil isn't added to its
> list of stuff to track and 1 isn't explicitly constructed. [obj] = nil,
> because nil is now seen by gc as a weak object.
> If I'm saying this clearly, that behavior would also explain locals well,
> too.
> This behavior would, to my understanding, have avoided the need for another
> value and only require the need to know if something was set to nil or
> whether it was absent.
> That seemed cheaper than a new type and simpler to add. Since that's not how
> things are, I think that the way it works is fine, because everything else
> seems to make Lua "ever so slightly bigger."

I do understand that everyone is more-or-less focused on the idea of
an `empty` type and that I've been speaking to mechanisms to know if
something is "there but nil" or simply "never there".

In case it is not obvious, I do this because the "there and nil/not
there at all" approach can be made to accomplish the goal and it is
painfully close to already being within Lua, today.
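
To make the distinction concrete, here is a minimal sketch of how stock
Lua behaves today: a key that was assigned nil and a key that was never
touched look the same through `rawget` and `pairs`. The table and key
names are made up for illustration.

```lua
local t = {}
t.was_set = 1
t.was_set = nil      -- "there but nil" in the sense of this thread

-- Both lookups yield nil; nothing tells the two cases apart.
local a = rawget(t, "was_set")
local b = rawget(t, "never_set")
print(a, b)          -- nil   nil

-- Traversal also treats both keys as absent.
local count = 0
for _ in pairs(t) do count = count + 1 end
print(count)         -- 0
```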

First, I re-looked at tables. I found that when I was playing with
collectgarbage, if I set an index to something and then to nil, the
array seemed to still hold the key. If I simply set a previously
undefined index to nil, Lua seemed to wisely optimize that away.

Step 1: Do not optimize away the setting of a key value to nil,
provided that the key is of a type that does not have explicit
construction (strings and numbers). [This must be how lua treats local
assignment to nil.]

Step 2: Modify `rawget` so that it returns 0 values (a bare `return`)
when a key does not exist.

rawget(t, bar) -- bar is a nonexistent key
--> (0 values, not nil)

It therefore follows (and this is maybe the biggest egg to crack
using this approach) that:

type(rawget(t, bar))
-->  bad argument #1 to 'type' (value expected)
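
For comparison, this is what "0 values" already means for ordinary
functions in current Lua, and why `type` would complain. `no_values` is
a hypothetical stand-in for the proposed `rawget` on an absent key; the
stock `rawget` always returns exactly one value.

```lua
-- Hypothetical stand-in for the proposed rawget on an absent key.
local function no_values()
  return
end

print(select('#', no_values()))        -- 0
print(select('#', rawget({}, "bar")))  -- 1 (stock rawget returns nil)

-- type() needs at least one argument, so passing zero values is an error.
local ok, err = pcall(function() return type(no_values()) end)
print(ok, err)  -- ok is false; err is the "value expected" message
```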

OPTIONAL STEP 3: Allow `__index` to return 0 values as well. This may
also require a change to the way Lua treats indexed tables as
arguments, though I don't know that for sure.
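
The reason Step 3 might need such a change: in current Lua an index
expression such as `t[k]` is always adjusted to exactly one value,
whatever the `__index` function returns. A small check, with made-up
names:

```lua
local t = setmetatable({}, {
  __index = function(tbl, key)
    return            -- returns zero values
  end,
})

-- The index expression is still adjusted to one value (nil).
print(select('#', t.missing))   -- 1
print(t.missing)                -- nil
```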

I would posit that these changes are pretty mild, with the possible
exception of erroring on an index that was absent. I wonder if anyone
can see any bugs that would come up, due to these changes?

If the above were true, then consider this code:

printf = function(...) print(string.format(...)) end
local array_stop  = 10000
local function array_iter (a, i)
     i = i + 1
     local v = a[i]
     --!!!! Today, the next test is always true (select always returns 1),
     --!!!! no matter what. That's because rawget returns 1 value even if
     --!!!! the slot was never defined (not even nil is there).
     --!!!! Under this proposal, a key set to nil would still pass the test.
     if select('#', rawget(a, i)) == 1 then
          return i, v
     end
end

local t = setmetatable({},{
     --!! __index doesn't work today. It always yields 1 value (nil), even
     --!! with a bare `return` or no return at all. Either the result is
     --!! always nil, or `t[i]` in an argument list behaves like access to
     --!! an undefined variable, which is always promoted to nil.
     __index = function(a, i)
          if select('#', rawget(a, i)) == 1 then
               return nil
          else
               -- If this were possible, and select('#', a[i]) gave 0 when
               -- the slot was never set, we'd be in business without
               -- changing rawget.
               return
          end
     end,
     __ipairs = function(a)
          return array_iter, a, 0
     end,
     --What redefining __len would look like:
     __len = function(a)
          local i = 1
          -- ick. Right now this pegs the processor, obviously.
          while select('#', rawget(a, i)) > 0 do
               i = i + 1
          end
          return i - 1
     end,
})
-- Here are the results that lead me to believe that Lua is storing
-- nils in tables:
local k1= collectgarbage("count")
printf("Start bytes:\t\t%10.2f",k1)

--> If we do not do this first...
for x = 1,  array_stop do
     t[x] = tostring(x)
end

--Then this gets optimized away.
for x = 1, array_stop do
     t[x] = nil
end

-- As it stands, according to collectgarbage("count"), there seem to
-- be nils assigned to the numeric keys, so far as I can tell.
-- Obviously, the optimization of leaving `t[x] = nil` empty would
-- need to be removed in order for this approach to be considered
-- viable. That is, as when you declare `local foo`, Lua needs to
-- actually put nil into the value spot at `t[x]`. Also, if `x` were a
-- table, then our nil would obviously go away and all would be truly
-- empty.

print( select('#', rawget(t, 100)), select('#', rawget(t, array_stop + 1)))
--Today, this is always "1, 1"
--> 1, 0 would open up many opportunities.

local k2 = collectgarbage("count")
printf("Before collection:\t%10.2f%10.2f",  k2, k2 -k1 )
print( select('#', rawget(t, 100)), select('#', rawget(t, array_stop + 1)))
--> 1, 1 :(

local k2 = collectgarbage("count")
printf("Before setting table to nil:\t%10.2f%10.2f",  k2, k2 -k1 )
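t = nil
collectgarbage() -- assumed: run a full collection so the reading below is really "after collection"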

local k3 = collectgarbage("count")
printf("After collection:\t%10.2f%10.2f", k3, k2 - k3)


So, is this completely awesome sauce? No. But, it doesn't add anything
terribly new and it doesn't change behavior in a big bad way.

But with these changes, we can get at the true source of the "nil"
that we got at a given table index. And that gives us a pretty solid
way to make sparse arrays.
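
Until something like this exists, the same distinction can be
approximated in plain Lua by tracking assigned keys on the side. This is
only a sketch of a workaround, not the proposal itself: `new_sparse`,
`has`, and `length` are hypothetical names, and the proxy keeps the real
data in hidden tables so that `__newindex` fires on every assignment.

```lua
local function new_sparse()
  local values, present, max = {}, {}, 0
  -- The proxy itself stays empty, so both metamethods always fire.
  local proxy = setmetatable({}, {
    __index = function(_, k) return values[k] end,
    __newindex = function(_, k, v)
      values[k] = v
      present[k] = true            -- remember the key even when v is nil
      if type(k) == "number" and k % 1 == 0 and k > max then
        max = k                    -- largest integer key ever assigned
      end
    end,
  })
  local function has(k) return present[k] == true end
  local function length() return max end
  return proxy, has, length
end

local arr, has, length = new_sparse()
arr[1] = "a"
arr[2] = nil                       -- "there but nil"
arr[3] = "c"
print(has(2), has(4))   -- true   false
print(length())         -- 3
```

The cost is an extra table and a function call per access, which is
part of why having the distinction inside `rawget` looks attractive.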

I can understand if there is hate for this approach. I just wanted to
be sure I articulated the reasoning behind my talking about this, as
opposed to a new type.

- Andrew