Disclaimer: I do not expect the following to be adopted by the Lua
team, but for those of us implementing it for ourselves, I would like
to initiate discussion on what the best semantics are.

I often find myself writing Lua code in which closure creation or
table creation appear within a loop, and I know that there doesn't
need to be a new instance for every iteration. Closure caching in 5.2
alleviates part of the pain with regard to closures, but in general,
there is still a problem. At this point, I'm presented with a
trade-off between readability and efficiency. I could leave the code
as-is, which is readable but inefficient, or I could lift the
expression, which is more efficient but less readable. I'd like to
entertain the idea of adding a new keyword which causes the compiler
to lift an expression for me.

As a concrete example, I recently wrote code like the following:
function f(...)
  -- ...
  for widget_type in set {"ListBox", "Label", ...} do
    -- ...
  end
  -- ...
end

I know that it would be more efficient to do the following:
local widget_set = set {"ListBox", "Label", ...}
function f(...)
  -- ...
  for widget_type in widget_set do
    -- ...
  end
  -- ...
end

However, the lifted version is harder to read (especially if there is a
lot of other code between the function head and the for loop), so I'd
like to propose that the following be equivalent to it:
function f(...)
  -- ...
  for widget_type in const set {"ListBox", "Label", ...} do
    -- ...
  end
  -- ...
end

Formally, the grammar is extended by an extra clause in the exp rule:
exp ::= (existing stuff) | *const* exp

I know roughly what I'd like this new clause to mean, but there are
several ideas floating around in my head, and I'm not sure which of
them is the most intuitive or most useful:
1) "const E" is syntactic sugar for "const_E" where "local const_E =
E" is inserted at the earliest point in the source code such that the
bindings of the identifiers in E are unchanged, and the name "const_E"
is for exposition only.
2) As 1), but with the scope of the new local restricted to as small a
region as possible. This is harder to explain, but reduces the
pressure on the maximum number of simultaneously active locals, and
also reduces the amount of time for which the debug library can see
it.
3) "const E" causes a new entry to be made in the prototype's constant
table, the value of which is computed the first time "const E" is
evaluated, and then re-used thereafter. The behavioural semantics
of this are different to 1) and 2), but pressure on locals and
upvalues is abolished.
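
The behavioural difference between 1)/2) and 3) can be sketched by hand
in plain Lua. This is only an emulation under stated assumptions: "set"
is a hypothetical stand-in that counts its own calls, the names
const_E, f_lifted, f_lazy, slot and slot_filled are for exposition only,
and the lazy variant uses upvalues where 3) would use a constant-table
entry.

```lua
-- Hypothetical stand-in for the post's `set`; it counts its own calls
-- so that the evaluation time of each variant is observable.
local calls = 0
local function set(list)
  calls = calls + 1
  local s = {}
  for _, name in ipairs(list) do s[name] = true end
  return s
end

-- Semantics 1) and 2): the expression is lifted into a local, so it is
-- evaluated when the enclosing chunk runs, even if f_lifted is never called.
local const_E = set {"ListBox", "Label"}
local function f_lifted()
  return const_E
end
local calls_before_any_use = calls

-- Semantics 3): the expression is evaluated on first use and then cached.
-- Upvalues emulate the constant-table slot here (assumption: a real
-- implementation would not spend an upvalue or a local on it).
local slot, slot_filled = nil, false
local function f_lazy()
  if not slot_filled then
    slot, slot_filled = set {"ListBox", "Label"}, true
  end
  return slot
end
```

In this sketch the lifted form pays for "set" once up front whether or
not the function is ever called, while the lazy form pays only on the
first call. Any side effects of E therefore happen at different times
under the two readings.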

Beyond the basic definition from the above list, there are also further options:
A) The compiler is allowed to disregard the const keyword, and
therefore treat "const E" as "E". This allows for a conforming
compiler with minimal changes.
B) If "const E1" and "const E2" are such that E1 and E2 are textually
equal, and equal in terms of bindings of identifiers, then they refer
to the same "const slot" (be that a lifted local, or a once-evaluated
constant table entry). This is a nice further optimisation, although
it comes at a very high cost to the compiler.
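
Option B) is observable whenever the const value is mutable, since
tables in Lua compare by identity. A hand-desugared sketch, assuming two
functions that each contain const {"ListBox", "Label"} (all names below
are hypothetical and for exposition only):

```lua
-- With option B: both textually equal occurrences share one slot.
local shared = {"ListBox", "Label"}
local function g_shared() return shared end
local function h_shared() return shared end

-- Without option B: each occurrence gets its own slot.
local slot_g = {"ListBox", "Label"}
local slot_h = {"ListBox", "Label"}
local function g_separate() return slot_g end
local function h_separate() return slot_h end
```

Here g_shared() == h_shared() holds and a write through one is visible
through the other, whereas g_separate() and h_separate() return
distinct tables. So B) is an optimisation only if code never mutates or
compares the values produced by const expressions.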

Do any of you have other ideas for semantics, or views on which
collection of the above semantics is best?