Storing Nils In Tables

lua-users home
wiki

Lua tables make no distinction between a table value being nil and the corresponding key not existing in the table. t = {[k] = nil} is identical to t = {} , and t[k] evaluates to nil when k is not a table key. In fact, you may think of {} as a table with all possible keys set to nil, and this still takes only a small finite amount of memory because all those keys having nil values are not explicitly stored. Furthermore, attempting to set a table key as nil, e.g. {[nil] = true} , raises a run-time error. This is unlike various other common languages. [1]

Occasionally we find ourselves in a situation where we really do want to distinguish between table values that are nil v.s. those that are undefined. Consider this function that reverses its arguments by storing and manipulating the arguments in a table:

function reverse(...)
    local t = {...}
    for i = 1,#t/2 do
      local j = #t - i + 1
      t[i],t[j] = t[j],t[i]  -- swap
    end
    return unpack(t)
end

This usually works, but not necessarily when one of the arguments is nil:

print(reverse(10,20,30)) --> 30 20 10
print(reverse(10,nil,20,nil)) --> 10

Solution: explicit 'n' field

A solution we can use here is to store the table length as a key n in the table. In fact, this is how Lua versions prior to 5.1 implemented array length:

function reverse(...)
    local t = {n=select('#', ...), ...}
    for i = 1,t.n/2 do
      local j = t.n - i + 1
      t[i],t[j] = t[j],t[i]  -- swap
    end
    return unpack(t, 1, t.n)
end

(It's been noted by RiciLake that the extra copy of ... into a function argument list for determining its length involves needless overhead. In fact, it would also be nice to avoid the overhead of constructing a table to perform this operation, but data in ... is cumbersome to manipulate unless copied into a table, though there are some slightly awkward patterns to avoid the table--see "Vararg Saving" in VarargTheSecondClassCitizen.)

Solution: nil Placeholder

The above approach works in the special cases where a table is used as an array containing nil's. In the general case, the table might contain arbitrary (not necessarily numeric) values. The following example expresses a mathematical set by storing its elements as keys in a Lua table. However, since table keys cannot be nil, this presents a challenge to storing nil in the set. The solution is to create a placeholder object NIL that is stored in place of nil in the table:

do
  local NIL = {}
  function set_create(...)
    local t = {}
    for n=1,select('#', ...) do
      local v = select(n, ...)
      t[v == nil and NIL or v] = true
    end
    return t
  end
  function set_exists(t, v)
    return t[v == nil and NIL or v]
  end
end

local t = set_create(10, nil, false)
assert(set_exists(t, 10))
assert(set_exists(t, nil))
assert(set_exists(t, false))
assert(not set_exists(t, "hello"))

There is little possibility for conflict using NIL as a place-holder. NIL, as a table, is an object, and objects have unique identities. The table NIL is lexically scoped in the do block and is not visible anywhere else in the program--except, well, in the table. The user could grab NIL out of the table and attempt to add it to another set, and in this case NIL will be treated as nil rather than NIL, and this is probably what the user intended.

There are other alternatives to using NIL, such as using some other value that is unique to the domain of a particular table (possibly false or the table itself), but this will not work in the general case. You might alternately use a second table enumerating the defined keys:

local t = {red = 1, green = 2}
local is_key = {"red", "green", "blue"}
for _,k in ipairs(is_key) do print(k, t[k]) end

It should be noted that nil is not the only value that cannot be stored in tables. The "not-a-number" value (NAN) defined by 0/0 cannot be stored as a key (but can be stored as a value). There are some potential issues with NAN being a table key: LuaList:2005-11/msg00214.html . This limitation can be worked around in a similar way. However, note that NAN is the only value that doesn't equal itself (NAN ~= NAN), and this is the way to test for NAN.

Hack: Distinguishing UNDEF and nil via Metafunctions

One possible solution would be to define a metatable on a given table so that the table returns nil if a key exists and the value is nil but returns a new unique value UNDEF = {} if a key does not exist. However, this has some fairly serious problems. First, UNDEF logically evaluates to true, so we can't use the idiom if t[k] then ... end since it will execute the branch if k is undefined in the table. More importantly, the programmer might attempt to store these UNDEF values in a table, leading to a similar problem that UNDEF's can't be stored in tables.

That approach is implemented below by making the table be a proxy to another private table that maintains the nil v.s. UNDEF information, and the __newindex metfunction records the setting of nil. It is partly based on the "Tables with default values" example in Programming In Lua, 2nd ed., 13.4.

-- NiledTable.lua
local M = {}

-- weak table for representing nils.
local nils = setmetatable({}, {__mode = 'k'})

-- undefined value
local UNDEF = setmetatable({},
  {__tostring = function() return "undef" end})
M.UNDEF = UNDEF

-- metatable for NiledTable's.
local mt = {}
function mt.__index(t,k)
  local n = nils[t]
  return not (n and n[k]) and UNDEF or nil
end
function mt.__newindex(t,k,v)
  if v == nil then
    local u = nils[t]
    if not u then
      u = {}
      nils[t] = u
    end
    u[k] = true
  else
    rawset(t,k,v)
  end
end

-- constructor
setmetatable(M, {__call = function(class, t)
  return setmetatable(t, mt)
end})

local function exipairs_iter(t, i)
  i = i + 1
  local v = t[i]
  if v ~= UNDEF then
    return i, v
  end
end

-- ipairs replacement that handles nil values in tables.
function M.exipairs(t, i)
  return exipairs_iter, t, 0
end

-- next replacement that handles nil values in tables
function M.exnext(t, k)
  if k == nil or rawget(t,k) ~= nil then
    k = next(t,k)
    if k == nil then
      t = nils[t]
      if t then k = next(t) end
    end
  else
    t = nils[t]
    k = next(t, k)
  end
  return k
end
local exnext = M.exnext

-- pairs replacement that handles nil values in tables.
function M.expairs(t, i)
  return exnext, t, nil
end

-- Undefine key in table.  This is used since t[k] = UNDEF doesn't work
-- as is.
function M.undefine(t, k)
  rawset(t, k, nil)
end

return M

Examples/test:

-- test_nil.lua - test of NiledTable.lua

local NiledTable = require "NiledTable"

local UNDEF    = NiledTable.UNDEF
local exipairs = NiledTable.exipairs
local expairs  = NiledTable.expairs
local exnext   = NiledTable.exnext

local t = NiledTable { }

t[1] = 3
t[2] = nil
t.x = 4
t.y = nil
assert(t[1] == 3)
assert(t[2] == nil)
assert(t[3] == UNDEF)
assert(t.x == 4)
assert(t.y == nil)
assert(t.z == UNDEF)

-- UNDEF is true.  This is possible undesirable since
-- "if t[3] then ... end" syntax cannot be used as before.
assert(UNDEF)

-- nils don't count.  __len cannot be overriden in 5.1 without special
-- userdata tricks.
assert(#t == 1)

-- constructor syntax doesn't work.  The construction is done
-- before the metatable is set, so the nils are discarded before
-- NiledTable can see them.
local t2 = NiledTable {nil, nil}
assert(t2[1] == UNDEF)

-- nils don't work with standard iterators
local s = ""; local n=0
for k,v in pairs(t)  do print("pairs:", k, v); n=n+1 end
assert(n == 2)
for i,v in ipairs(t) do print("ipairs:", i, v); n=n+1 end
assert(n == 3)

-- replacement iterators that do work
for i,v in exipairs(t) do print("exipairs:", i, v); n=n+1 end
assert(n == 5)
for k,v in expairs(t) do print("expairs:", k, v); n=n+1 end
assert(n == 9)
for k,v in exnext, t do print("next:", k, v); n=n+1 end
assert(n == 13)

-- This does not do what you might expect.  The __newindex
-- metamethod is not called.  We might resolve that by making
-- the table be a proxy table to allow __newindex to handle this.
t[1] = UNDEF
assert(t[1] == UNDEF)
for k,v in expairs(t) do print("expairs2:", k, v); n=n+1 end
assert(n == 17) --opps

-- Alternative
undefine(t, 1)
for k,v in expairs(t) do print("expairs3:", k, v); n=n+1 end
assert(n == 20) --ok

-- Now that we can store nil's in tables, we might now ask
-- whether it's possible to store UNDEF in tables.  That's
-- probably not a good idea, and I don't know why you would
-- even want to do that.  It leads to a similar problem.
--
--   Here is why:  any value in Lua can potentially be used as input or
--   output to a function, and any function input or output can potentially
--   be captured as a Lua list, and Lua lists are implemented with tables...

print "done"

Solution: Existence Maintained Internally in a Table

The problems in the hack above can be solved by avoiding exposure of any new values outside the module. Instead, t[k] will return nil both if the k is not in the table or if the corresponding value is nil. We'll distinguish between those two conditions with a new function exists(t,k), which returns a Boolean indicating whether the key k exists in the table t. (This behavior is similar to that in the Perl language.)

In this way, we can still use the idiom if t[k] then ... end when appropriate or use the new idiom if exists(t, k) then ... end. Furthermore, no new values are introduced into the language, thereby avoiding the "storing UNDEF's in tables" problem above.

-- NiledTable.lua
local M = {}

-- weak table for representing proxied storage tables.
local data = setmetatable({}, {__mode = 'k'})

-- nil placeholder.
-- Note: this value is not exposed outside this module, so
-- there's typically no possibility that a user could attempt
-- to store a "nil placeholder" in a table, leading to the
-- same problem as storing nils in tables.
local NIL = {__tostring = function() return "NIL" end}
setmetatable(NIL, NIL)

-- metatable for NiledTable's.
local mt = {}
function mt.__index(t,k)
  local d = data[t]
  local v = d and d[k]
  if v == NIL then v = nil end
  return v
end
function mt.__newindex(t,k,v)
  if v == nil then v = NIL end
  local d = data[t]
  if not d then
    d = {}
    data[t] = d
  end
  d[k] = v
end
function mt.__len(t)  -- note: ignored by Lua but used by exlen below
  local d = data[t]
  return d and #d or 0
end

-- constructor
setmetatable(M, {__call = function(class, t)
  return setmetatable(t, mt)
end})

function M.exists(t, k)
  local d = data[t]
  return (d and d[k]) ~= nil
end
local exists = M.exists

function M.exlen(t)
  local mt = getmetatable(t)
  local len = mt.__len
  return len and len(t) or #t
end

local function exipairs_iter(t, i)
  i = i + 1
  if exists(t, i) then
    local v = t[i]
    return i, v
  end
end

-- ipairs replacement that handles nil values in tables.
function M.exipairs(t, i)
  return exipairs_iter, t, 0
end

-- next replacement that handles nil values in tables
function M.exnext(t, k)
  local d = data[t]
  if not d then return end
  k = next(d, k)
  return k
end
local exnext = M.exnext

-- pairs replacement that handles nil values in tables.
function M.expairs(t, i)
  return exnext, t, nil
end

-- Remove key in table.  This is used since there is no
-- value v such that t[k] = v will remove k from the table.
function M.delete(t, k)
  local d = data[t]
  if d then d[k] = nil end
end

-- array constructor replacement.  used since {...} discards nils.
function M.niledarray(...)
  local n = select('#', ...)
  local d = {...}
  local t = setmetatable({}, mt)
  for i=1,n do
    if d[i] == nil then d[i] = NIL end
  end
  data[t] = d
  return t
end

-- table constructor replacement.  used since {...} discards nils.
function M.niledtablekv(...)
  -- possibly more optimally implemented in C.
  local n = select('#', ...)
  local tmp = {...} -- it would be nice to avoid this
  local t = setmetatable({}, mt)
  for i=1,n,2 do t[tmp[i]] = tmp[i+1] end
  return t
end

return M

Examples/test:

-- test_nil.lua - test of NiledTable.lua

local NiledTable = require "NiledTable"

local exlen      = NiledTable.exlen
local exipairs   = NiledTable.exipairs
local expairs    = NiledTable.expairs
local exnext     = NiledTable.exnext
local exists     = NiledTable.exists
local delete     = NiledTable.delete
local niledarray = NiledTable.niledarray
local niledtablekv = NiledTable.niledtablekv

local t = NiledTable { }

t[1] = 3
t[2] = nil
t.x = 4
t.y = nil
assert(t[1] == 3    and exists(t, 1))
assert(t[2] == nil  and exists(t, 2))
assert(t[3] == nil  and not exists(t, 3))
assert(t.x  == 4    and exists(t, 'x'))
assert(t.y  == nil  and exists(t, 'y'))
assert(t.z  == nil  and not exists(t, 'z'))

-- non-existant and nil values are both returned as nil and
-- therefore both are logically false.
-- allows "if t[3] then ... end" usage.
assert(not t[2] and not t[3])

-- nils don't count in #t since __len cannot be overriden in
-- 5.1 without special userdata tricks.
assert(#t == 0)
assert(exlen(t) == 2) -- workaround function

-- constructor syntax doesn't work.  The construction is done
-- before the metatable is set, so the nils are discarded before
-- NiledTable can see them.
local t2 = NiledTable {nil, nil}
assert(t2[1] == nil)

-- alternate array constructor syntax (value list) that does work
local t2 = niledarray(nil,nil)
assert(t2[1] == nil and exists(t2, 1))
assert(t2[2] == nil and exists(t2, 2))
assert(t2[3] == nil and not exists(t2, 3))

--- more tests of niledarray
local t2 = niledarray(1,nil,nil)
assert(t2[1] == 1 and exists(t2, 1))
assert(t2[2] == nil and exists(t2, 2))
assert(t2[3] == nil and exists(t2, 3))
assert(t2[4] == nil and not exists(t2, 4))
t2[4]=4
assert(t2[4] == 4 and exists(t2, 4))

-- alternate table constructor syntax (key-value pair list) that does work
local t2 = niledtablekv(1,nil, 2,nil) -- {[1]=nil, [2]=nill}
assert(t2[1] == nil and exists(t2, 1))
assert(t2[2] == nil and exists(t2, 2))
assert(t2[3] == nil and not exists(t2, 3))

-- nils don't work with standard iterators
local s = ""; local n=0
for k,v in pairs(t)  do print("pairs:", k, v); n=n+1 end
assert(n == 0)
for i,v in ipairs(t) do print("ipairs:", i, v); n=n+1 end
assert(n == 0)

-- replacement iterators that do work
for i,v in exipairs(t) do print("exipairs:", i, v); n=n+1 end
n = n - 2; assert(n == 0)
for k,v in expairs(t) do print("expairs:", k, v); n=n+1 end
n = n - 4; assert(n == 0)
for k,v in exnext, t do print("next:", k, v); n=n+1 end
n = n - 4; assert(n == 0)

-- Setting an existing element to nil, makes it nil and existant
t[1] = nil
assert(t[1] == nil and exists(t, 1))
for k,v in expairs(t) do print("expairs2:", k, v); n=n+1 end
n = n - 4; assert(n == 0)

-- Calling delete makes an element non-existant
delete(t, 1)
for k,v in expairs(t) do print("expairs3:", k, v); n=n+1 end
n = n - 3; assert(n == 0)

-- nil's still can't be used as keys though (and neither can NaN)
assert(not pcall(function() t[nil] = 10 end))
assert(not pcall(function() t[0/0] = 10 end))

print "done"

Footnotes

[1] Including C, Java, Python, and Perl. In Perl, [exists and defined] differentiate these two cases: exists $v{x} v.s. defined $v{x} .

-- DavidManura

See Also


RecentChanges · preferences
edit · history
Last edited January 13, 2011 6:36 am GMT (diff)