On 29-Nov-05, at 11:34 AM, Lisa Parratt wrote:

Surely this makes the handling of + and - metamethods an absolute nightmare?

That depends on the semantics, which I think is the core problem. There is also the fact that introducing side effects into an expression makes evaluation order critical. And finally, there is the whole knotty question of what mutation means, which I won't go into again :) -- just a few remarks on semantics:

Lua doesn't have any equivalent to C's comma operator (or gcc's extensions to that) but it is trivial to define with a macro (of course, Lua doesn't have macros either :( ). Let's say that:

  begin <block> end ==>  (function() <block> end)()

just to make the exposition a little nicer.
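
For concreteness, here is what that expansion looks like when written out by hand in plain Lua. This is only a sketch; the variable names are made up for the illustration.

```lua
-- Hand expansion of:  local x = begin local t = 2; return t * 21 end
-- The block becomes an anonymous function that is called immediately,
-- so it can be used wherever an expression is expected.
local x = (function()
  local t = 2
  return t * 21
end)()

print(x)  -- prints 42
```

Locals of the enclosing function remain visible inside the block as upvalues, which is what lets the expansions below read and assign <lvalue>.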

Now, it's pretty clear how to macro-expand <lvalue>++ :

  <lvalue>++ ==>
     begin local v = <lvalue>'; <lvalue>' = v + 1; return v end

  where <lvalue>' is:
     <lvalue>
        if <lvalue> is a variable (whether local or global)
     temp1[temp2] where temp1 = <table> and temp2 = <key>
        if <lvalue> is of the form <table>[<key>]
     and otherwise an error

(The intent is to only evaluate the table and key expressions once.)
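
Written out by hand for the <table>[<key>] case, the expansion might look like the following sketch. The names counters and next_key are hypothetical; next_key stands in for a key expression with a side effect, so that the number of evaluations can be counted.

```lua
local evaluations = 0
local counters = { hits = 10 }

-- Hypothetical key expression with a visible side effect.
local function next_key()
  evaluations = evaluations + 1
  return "hits"
end

-- Hand expansion of:  local old = counters[next_key()]++
local old = (function()
  local temp1, temp2 = counters, next_key()  -- table and key evaluated once
  local v = temp1[temp2]
  temp1[temp2] = v + 1
  return v                                   -- post-increment yields the old value
end)()

print(old, counters.hits, evaluations)  -- prints 10 11 1 (tab-separated)
```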

However, ++<lvalue> and (<lvalue> += <expr>) allow for two possible expansions. In both cases, we assume:

   ++<lvalue> ==> (<lvalue> += 1)

1) <lvalue> += <expr>  ==>
     begin local v = <lvalue>' + <expr>
           <lvalue>' = v
           return v
     end

2) <lvalue> += <expr>  ==>
     begin <lvalue>' = <lvalue>' + <expr>
           return <lvalue>'
     end

These differ in that the second one does the lookup of <lvalue>' twice; they are identical in the case where <lvalue> is a local, but may differ if <lvalue> is a gettable operation (or a global variable, which is the same thing) and the indexed object (that is, the table or whatever) has an __index metamethod.

Which of these expansions you prefer depends on how you view metamethods. Option 1 is, on the face of it, more "efficient" but it has the disadvantage that it may return a value which the __index metamethod would never return.
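
A small sketch of how the two options can diverge. The clamping proxy here is invented purely for illustration (it is not one of the objects discussed in this message): writes are capped at 100 by __newindex, and reads go through __index. The two expansions are written as ordinary functions taking the table, key and increment.

```lua
-- Hypothetical proxy: stores at most 100, reads go through __index.
local function clamped()
  local store = {}
  return setmetatable({}, {
    __index = function(_, k) return store[k] end,
    __newindex = function(_, k, v) store[k] = math.min(v, 100) end,
  })
end

-- Option 1: a single lookup; returns the computed value.
local function add_assign_1(t, k, e)
  local v = t[k] + e
  t[k] = v
  return v
end

-- Option 2: looks the value up again after the store.
local function add_assign_2(t, k, e)
  t[k] = t[k] + e
  return t[k]
end

local a, b = clamped(), clamped()
a.x, b.x = 95, 95
local r1 = add_assign_1(a, "x", 10)
local r2 = add_assign_2(b, "x", 10)
print(r1, a.x)  -- prints 105 100: a value __index would never return
print(r2, b.x)  -- prints 100 100
```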

For example, suppose I have an object which represents an array of vectors, and furthermore that __add is overridden for vectors to do something sensible. No problems yet, but now I make my array-of-vectors object automatically intern [Note 1] any inserted vector. Now, formulation (1) returns an uninterned vector, whereas formulation (2) correctly returns the interned vector.
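
Here is a minimal sketch of that situation. Everything in it (Vec, vec, intern, vector_array, and the string key used for interning) is a hypothetical stand-in chosen for brevity, not an actual implementation; identity is checked with rawequal.

```lua
-- Hypothetical 2-D vector type with an overridden __add.
local Vec = {}
Vec.__index = Vec
Vec.__add = function(a, b)
  return setmetatable({ x = a.x + b.x, y = a.y + b.y }, Vec)
end
local function vec(x, y) return setmetatable({ x = x, y = y }, Vec) end

-- Interning: equal vectors map to one canonical object.
local interned = {}
local function intern(v)
  local key = v.x .. "," .. v.y
  local canon = interned[key]
  if not canon then
    interned[key] = v
    canon = v
  end
  return canon
end

-- Array of vectors that interns on insertion (in __newindex).
local function vector_array()
  local store = {}
  return setmetatable({}, {
    __index = function(_, i) return store[i] end,
    __newindex = function(_, i, v) store[i] = intern(v) end,
  })
end

local arr = vector_array()
arr[1] = vec(1, 2)
arr[2] = vec(0, 1)

-- Formulation (1) by hand:  arr[2] += vec(1, 1)
local v1 = arr[2] + vec(1, 1)
arr[2] = v1
local uninterned = not rawequal(v1, arr[2])  -- v1 is a fresh temporary

-- Formulation (2) by hand, after resetting arr[2]
arr[2] = vec(0, 1)
arr[2] = arr[2] + vec(1, 1)
local v2 = arr[2]

print(uninterned)            -- prints true
print(rawequal(v2, arr[1]))  -- prints true: v2 is the interned (1,2)
```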

As another example, I have an implementation of multi-valued tables (a version of it can be found in my recent message on the iteration protocol) which has the following semantics:

  mvtab[key]          ==>  returns the first pair <key, value> which matches key
  mvtab[key] = value  ==>  adds the pair <key, value> to the table
  mvtab[key] = nil    ==>  removes all pairs <key, *> from the table
  for k, v in mvtab:_pairs(key)
                      ==>  iterates over all pairs <key, *>
  for k, v in mvtab:_pairs(nil)
                      ==>  iterates over all pairs <*, *>

This might be considered an unfortunate interface, but it "works for me" :) The intent of this object-type is to be able to handle things like HTTP headers and LDAP entries which have multivalued attributes.

Now, one of the disadvantages of this object type is that you cannot just write:

  mvtab[key] = mvtab[key] + 1

unless you really wanted to add a new <key, value> pair to the table. The following works, but it takes advantage of an unspecified evaluation order:

  mvtab[key], mvtab[key] = mvtab[key] + 1, nil

Furthermore, the two formulations of += would provide different results; option 1 would return the value of the new <key, value> pair whereas option 2 would return the value of the first pair in the table whose key was key.
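
To make the difference concrete, here is a deliberately tiny stand-in for such a table. It is not the implementation mentioned above: new_mvtab is a hypothetical name, only indexing and assignment are modelled (no _pairs), indexing returns just the value of the first pair, and Lua 5.1 or later is assumed for the # operator.

```lua
-- Hypothetical minimal multi-valued table.
local function new_mvtab()
  local values_by_key = {}  -- key -> list of values, in insertion order
  return setmetatable({}, {
    __index = function(_, k)
      local values = values_by_key[k]
      return values and values[1]       -- value of the first matching pair
    end,
    __newindex = function(_, k, v)
      if v == nil then
        values_by_key[k] = nil          -- remove all pairs <k, *>
      else
        local values = values_by_key[k]
        if not values then
          values = {}
          values_by_key[k] = values
        end
        values[#values + 1] = v         -- add the pair <k, v>
      end
    end,
  })
end

-- Option 1 by hand:  m1.n += 1
local m1 = new_mvtab()
m1.n = 1
local v = m1.n + 1
m1.n = v
local r1 = v        -- value of the newly added pair

-- Option 2 by hand:  m2.n += 1
local m2 = new_mvtab()
m2.n = 1
m2.n = m2.n + 1
local r2 = m2.n     -- value of the first pair with key n

print(r1, r2)  -- prints 2 1 (tab-separated)
```

In both cases the table ends up holding the pairs <n, 1> and <n, 2>; only the value of the expression differs.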

An intermediate proposal with simpler semantics (although it's arguably a lot uglier) is to allow pseudovariables of the form $<integer> to appear in assignment statements; $<i> refers to the rvalue corresponding to the i'th lvalue, and, for convenience, the <integer> defaults to the current position in the expression list. This would allow things like:

  a[i] =$+ 1  -- increment a[i], or whatever a's metamethods make of it.
  a[i], a[j] = $2, $1 -- swap a[i] and a[j]

The first example is only one keystroke longer than += (and two keystrokes longer than ++ if you leave out the space), so it might satisfy the need. The second example could be handy if i and j were actually expressions.
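
One plausible reading of the swap, expanded by hand into current Lua, is sketched below. This is an assumption about how such a feature could desugar, not a specification; pick is a hypothetical function standing in for an index expression with a side effect.

```lua
local a = { 10, 20, 30 }
local calls = 0

-- Hypothetical index expression with a visible side effect.
local function pick(n)
  calls = calls + 1
  return n
end

-- Hand expansion of:  a[pick(1)], a[pick(3)] = $2, $1
do
  local t1, k1 = a, pick(1)           -- each table and key evaluated once
  local t2, k2 = a, pick(3)
  t1[k1], t2[k2] = t2[k2], t1[k1]     -- $2 and $1 read through the saved temporaries
end

print(a[1], a[3], calls)  -- prints 30 10 2 (tab-separated)
```

Without the pseudovariables, writing the swap directly would mention each index expression twice and so evaluate it twice.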

[Note 1]:
Interning refers to the action of ensuring that equal objects are unique in memory, which is what Lua does with strings. It would be possible for the vector's __add operator to do the interning but this might be horribly inefficient for temporary results; moving the interning operation to the __newindex metamethod of the array of vectors could be considered an optimization. Or it could be considered a hack.