I know this has been brought up before (by me and, I think, others), but I
wonder whether it would be possible to reopen the debate about whether the
automatic variables in for loops should be fresh bindings on each iteration.

It seems to me that constructions like this:

for k, v in pairs(t) do
  g_bindings[k] = function(x) return g(v, x) end
end

ought to "work", i.e. each function inserted into g_bindings ought to be
bound to a different instance of v.
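
The difference between one shared binding and a fresh binding per iteration
can be modelled outside Lua. What follows is only a toy sketch in C, not the
VM's actual upvalue mechanism: Closure, call_closure, bind_shared and
bind_fresh are hypothetical names, and an integer addition stands in for
g(v, x).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a Lua closure: it captures the loop variable by
   reference, as a pointer to the cell that holds the value. */
typedef struct {
  const int *v;  /* captured variable (hypothetical model of an upvalue) */
} Closure;

/* Plays the role of g(v, x); addition is an arbitrary choice. */
static int call_closure(const Closure *c, int x) {
  assert(c != NULL && c->v != NULL);
  return *c->v + x;
}

/* Single binding: every closure points at the same cell, which the
   loop overwrites on each iteration. */
static void bind_shared(const int *values, size_t n, int *cell, Closure *out) {
  for (size_t i = 0; i < n; i++) {
    *cell = values[i];
    out[i].v = cell;
  }
}

/* Fresh binding: each iteration gets a cell of its own. */
static void bind_fresh(const int *values, size_t n, int *cells, Closure *out) {
  for (size_t i = 0; i < n; i++) {
    cells[i] = values[i];
    out[i].v = &cells[i];
  }
}
```

With the values {10, 20, 30}, every closure built by bind_shared sees
whatever the shared cell held last, while each closure built by bind_fresh
keeps the value from its own iteration.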

I can think of a number of cases in which having separate bindings of the
automatic variables makes sense, but I have not yet been able to think of a
good example where it makes sense to have a single binding ... maybe that
is just my own prejudice at work, so I would welcome any counter-examples.

Furthermore, it looks to me like the VM is doing quite a lot of work to
avoid creating a separate binding: it ends up moving the results of the
call to "next" (or whatever generator is being used) twice; once implicitly
in the function call, and once explicitly in the code of OP_TFORLOOP. It
would be simpler and presumably faster to keep the current copy of the
iteration key in a hidden variable along with the iteration function and
state, and simply leave it on the stack in the first position for the
iterated block. This would require one extra stack position, of course, but
that does not seem an enormous price to pay.

The payoff is that the following code:

      case OP_TFORLOOP: {
        int nvar = GETARG_C(i) + 1;
        StkId cb = ra + nvar + 2;  /* call base */
        setobjs2s(cb, ra);
        setobjs2s(cb+1, ra+1);
        setobjs2s(cb+2, ra+2);
        L->top = cb+3;  /* func. + 2 args (state and index) */
        luaD_call(L, cb, nvar);
        L->top = L->ci->top;
        ra = XRA(i) + 2;  /* final position of first result */
        cb = ra + nvar;
        do {  /* move results to proper positions */
          nvar--;
          setobjs2s(ra+nvar, cb+nvar);
        } while (nvar > 0);
        if (ttisnil(ra))  /* break loop? */
          pc++;  /* skip jump (break loop) */
        else
          dojump(pc, GETARG_sBx(*pc) + 1);  /* jump back */
        break;
      }

becomes

      case OP_TFORLOOP: {
        int nvar = GETARG_C(i) + 1;
        setobjs2s(ra+3, ra);  /* copy function + 2 args */
        setobjs2s(ra+4, ra+1);
        setobjs2s(ra+5, ra+2);
        L->top = ra+6;
        luaD_call(L, ra+3, nvar);  /* call function to create all vars */
        L->top = L->ci->top;
        ra = XRA(i) + 3;  /* first result == key */
        if (ttisnil(ra))  /* break loop? */
          pc++;  /* skip jump (break loop) */
        else {
          setobjs2s(ra-1, ra);  /* save key in the hidden control variable */
          dojump(pc, GETARG_sBx(*pc) + 1);  /* jump back */
        }
        break;
      }

(The result relocation loop is eliminated because the function returns its
results into the right place. By the way, in the current implementation, I
believe that the test "if (ttisnil(ra))" ought to come before the "do {}"
loop, because that loop is irrelevant if the jump back is not taken.)

The one thing that would "break" with this change is the possibility of
setting the value of the control variable within the for loop, something
the manual describes as "undefined" (and with good reason). With the
current implementation it is possible (although "undefined") to advance or
retract the control variable while the for loop is in progress. The
suggested implementation makes the control variable fresh each time through
the loop so that assigning to it is only visible during the current
iteration and has no side effects. I admit that the current side effect is
potentially useful, but I trust that no-one actually uses it, as one ought
not to use language behaviour documented as having undefined effect.
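
The effect of assigning to the loop variable under the two schemes can be
sketched with a toy model in C. This is an illustration under stated
assumptions, not VM code: the Frame struct, next_key, step and
count_iterations are hypothetical names, keys are positive integers with 0
standing in for nil, and the generator simply counts upward.

```c
#include <assert.h>

enum { NIL = 0 };  /* toy "nil": this sketch only visits positive keys */

/* Hypothetical frame layout for a generic for loop. */
typedef struct {
  int state;    /* iteration state: here, the largest key to visit */
  int control;  /* hidden copy of the last key (proposed scheme only) */
  int key;      /* loop variable visible to the body */
} Frame;

/* Toy generator: yields control + 1 until state has been reached. */
static int next_key(int state, int control) {
  return control < state ? control + 1 : NIL;
}

/* One loop step. With hidden_control == 0 the visible key doubles as the
   control variable (single binding); otherwise the generator is fed from
   the hidden copy. Returns 1 if the body should run. */
static int step(Frame *f, int hidden_control) {
  int from = hidden_control ? f->control : f->key;
  f->key = next_key(f->state, from);
  if (f->key == NIL) return 0;
  f->control = f->key;  /* remember the key for the next call */
  return 1;
}

/* Runs a loop whose body overwrites the loop variable with clobber and
   returns the number of iterations performed. */
static int count_iterations(int state, int clobber, int hidden_control) {
  Frame f = { state, NIL, NIL };
  int n = 0;
  assert(state >= 0);
  while (step(&f, hidden_control)) {
    n++;
    f.key = clobber;  /* the body assigns to its loop variable */
  }
  return n;
}
```

In this model, count_iterations(5, 5, 0) stops after one iteration because
the assignment advances the control variable, whereas
count_iterations(5, 5, 1) visits all five keys because the assignment is
only visible during the current iteration.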

Am I (once again) missing something here?