lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]




On Fri, Feb 28, 2020 at 11:59 AM Anton Jordaan <anton.jordaan@sylvean.com> wrote:
Lua is memory efficient for huge but sparse multi-dimensional arrays,
since nil values aren't stored in the tables.

However, reading and writing to such sparse multi-dimensional arrays can
be quite a hassle.  If I want to access the value of
t[a][b][c][d][e], I first have to check whether t[a] is a table, t[a][b]
is a table, t[a][b][c] is a table etc, otherwise Lua throws an error
"Attempt to index a nil value".


For example, to read the value at t[a][b][c][d][e], I cannot simply use:
v = t[a][b][c][d][e]
Instead, the code must look something like:
v = t and t[a] and t[a][b] and t[a][b][c] and t[a][b][c][d] and
t[a][b][c][d][e]


Or, to write a value to t[a][b][c][d][e], I cannot simply use:
t[a][b][c][d][e] = v

Instead, the code must look something like:

if not t then t = {[a] = {[b] = {[c] = {[d] = {[e] = v}}}}}
elseif not t[a] then t[a] = {[b] = {[c] = {[d] = {[e] = v}}}}
elseif not t[a][b] then t[a][b] = {[c] = {[d] = {[e] = v}}}
elseif not t[a][b][c] then t[a][b][c] = {[d] = {[e] = v}}
elseif not t[a][b][c][d] then t[a][b][c][d] = {[e] = v}
else t[a][b][c][d][e] = v
end


I like these sorts of brain-teasers, so I went ahead and implemented something which does most of what you want.

Features:

- For a 3 dimensional table, a[1][2][3] returns nil, and doesn't create values in the process.

- a[1][2][3] = "value" creates the intermediate tables and assigns "value".

b = a[3][4] returns a 1-dimensional table, such that

b[5] = "another value"

assigns "another value" to both b[5] and a[3][4][5].


Limitations:

- The dimension of the sparse table is fixed and cannot be grown.  This is a big one, but I don't know how much of a problem it is in practice.

- You can't have fields called `_count`, `_dimension`, `_idx`, or `_back`. This is minor. You could replace them with UUIDs such that a user would have to go out of their way to collide with them.

- Assignment to a non-existent index of lower dimension than the table is illegal.  This was a judgement call, because it's not truly multidimensional if some tables have fewer dimensions than others.  It *should* be possible to splice in a sparse table of the appropriate dimension, and I left a note where some enterprising programmer could put that in.

- Assignment of any value to an existing multidimensional table succeeds.

This *can* be worked around in Lua, but only with another level of indirection, and I'm out of patience. What I mean is that:

a[1][2][3] = "value"

creates a[1][2], so

a[1][2] = "oops"

is a simple assignment which doesn't trigger metamethods.

- Missed indexing to a 'detached' subtable removes the subtable from the master table

This is a weird thing, but it's a consequence of how we can do lookups without creating garbage.

With

b = a[1][2]

There are 'provisional' tables at a[1] and a[1][2], the latter of which is equal to b.

if we say

b[3] = "value", great; the tables are reified and everything is where it belongs.

If instead, we say

if b[4] then ...

the index on b will return nil, but in the process, a[1][2] is erased. b still exists, and it's a valid 1-dimensional table, but the connection to a is broken.  Note that assignment to b[3] makes the nil-indexing of b[anything_else] safe.

I put this up as a gist: https://gist.github.com/mnemnion/1e73d2c1c4c20bba6269f2ede7d5bcdc

And reproduce the whole 57 lines below, in the interests of posterity.
local Sparse = {}

local function new(dimension)
   local sparse = {}
   sparse._dimension = dimension
   sparse._count = 1
   return setmetatable(sparse, Sparse)
end

function Sparse.__index(sparse, key)
   if sparse._count == sparse._dimension then
      -- clean up and return nil
      local back, idx = sparse._back, sparse._idx
      rawset(sparse, "_back", nil)
      rawset(sparse, "_idx", nil)
      repeat
         rawset(back, idx, nil)
         back, idx = rawget(back, "_back"), rawget(back, "_idx")
      until back == nil
      return nil
   else
      -- provisionally create a Sparse table
      local new_sparse = new(sparse._dimension - sparse._count)
      rawset(new_sparse,"_back", sparse)
      rawset(new_sparse, "_idx", key)
      rawset(sparse, key, new_sparse)
      return new_sparse
   end
end

function Sparse.__newindex(sparse, key, value)
   if sparse._count < sparse._dimension then
      -- as a refinement, this could be allowed for sparse
      -- tables of the correct dimension
      -- an interesting exercise!
      error("attempt to assign value to dimension " .. sparse._count
            .. " of " .. sparse._dimension .. "-dimensional sparse table")
   end
   -- reify the intermediate tables
   local back, idx = rawget(sparse, "_back"), rawget(sparse, "_idx")
   rawset(sparse, "_back", nil)
   rawset(sparse, "_idx", nil)
   local count = 1
   repeat
      count = count + 1
      local back_back, back_idx = rawget(back,"_back"), rawget(back, "_idx")
      rawset(back, "_back", nil)
      rawset(back, "_idx", nil)
      back, idx = back_back, back_idx
   until (back == nil) or (count > 100)
   if value == nil then return nil end
   rawset(sparse, key, value)
   if count > 100 then return "oops2" end
end

return new