lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]

On 11-Aug-05, at 10:52 PM, Diego Nehab wrote:

Don't you think it can get quite complicated to trace where these blocks reside once you start passing them around?

No, I don't.

In fact, in the course of responding to that question, I wrote almost the whole implementation except for the code generation. It's some 75 lines or so, including comments and blanks (although there are a few missing details here: the macros to access and set upvals and all the bookkeeping stuff to deal with the two new opcodes, so in the end it might be slightly more than 80 lines.) So I wouldn't call that 'quite complicated'.

I warn you all: The following was written while I was thinking through the code (which is at the end) between midnight and 4 a.m. I take full responsibility for any incoherence. :) But I think there's some useful stuff here.

I originally proposed storing a hidden one on the stack to avoid the tracking problem, but I realize that it is not necessary. Non-local-exits (NLEs) can just be stored in the upvalue chain just like upvalues. Note that luaF_close ensures that upvalues are heap-evicted before they go out of scope (even if they go out of scope via a call to lua_error()) so it is exactly the mechanism necessary to support invalidating non-local-exits and running finalizers. In fact, the following implementation simply uses UpVals to represent NLEs, by squeezing the necessary information into unused bytes in the UpVal structure.

If we were going to support non-local-exits through a C callback, we'd need a more complicated mechanism than the one I'm about to sketch out, so I'll make the simplifying assumption that we're not going to allow that. It would be possible through a mechanism like the one Mike Pall uses in the RVM patch, I think.

The code generation is not dealt with in this message; as I said before, it requires a small reengineering of the way locals are dealt with in the code generator in order to allow locals to be created in the middle of an expression. It's not difficult; I've done it (although I'm not yet 100% satisfied, so I haven't posted the patch yet) but it's really orthogonal to this proposal.

The basics, though, are as follows. We start with an expression that looks like this:

block (non_local_exit) -- the binding is optional
  -- chunk
finally  -- Dylan uses the word 'cleanup' instead of 'finally'
  -- chunk

The code between the block and the finally is executed with the local 'non-local-exit' bound to a non-local-exit function, as I proposed earlier. If that function is called, its arguments become the value(s) of the block expression; otherwise, the block expression returns no values. It might be useful to have an explicit statement to return values from the block, but it is always possible to just call the non-local-exit.

The finally clause is optional, but if it exists the chunk is executed just before the block finishes. This happens whether or not the non-local-exit function is called, and it happens even if an error is thrown.

The implementation below assumes the finally clause has been turned into a function of no arguments and no results, which is pretty straightforward to do in the code generator; there may be another way of doing it, but this way seemed like the easiest. The consequence of that is that it cannot itself call any non-local-exit function (since that would cross a C call boundary). I'm not 100% sure what happens if it throws an error, but it won't be a disaster :)

The block expression reserves two stack slots. The first one is used to hold the non-local-exit local, which is just an ordinary local binding. Indeed, the variable can be reassigned without causing any grief, and the non-local-exit function can be arbitrarily copied. The restrictions are just the ones mentioned above: you can't use it after the block exits, and you can't use it to jump through a C-call.

The second slot is used to hold the finalization function closure, which is a very ordinary function closure. Unfortunately, the finally clause comes at the end of the block, so it's not actually available when we're generating the prelude code, and we can't slot it in because we don't know how many upvalues it has until we reach it. So when we encounter a 'block' we issue a JMP which is later filled in with the address of the end of the block, at which point we create the finalizer closure, and then create the non-local-exit object and JMP back to the block code. That's not ideal, but it seems unnatural to put the finally clause at the beginning, and one-pass compilers do sometimes have their limitations. (Maybe there's a better way, though...)

So if we have:

local a = get_connection()
local x, y, z = 1, -- This is just to prove a point
    block (nle)
      local v = call_if_false(nle, use(a))

the code might look like this:

   GETGLOBAL 1, "get_connection"
   LOADK 2, #1
   JMP  b
a: GETGLOBAL 5, "call_if_false"
   MOV 6, 2      ; 2 = nle
   GETGLOBAL 6, "use"
   MOV 7, 1      ; 1 = a
   CALL 6, 2, 2
   CALL 5, 3, 1
   CLOSE 2       ; 2 = nle
   JMP c
b: CLOSURE 4, func1
   MOV 0, 1      ; 1 = a
   NLE 3, 0, 3   ; 2 results
   JMP a
c: ...

stack mapping:
 a 1
 x 2
 y 3  nle  3
 z 4  (fn) 4
      v    5

func1 (transformed finally clause)

    GETUPVAL 0, 0  ; a
    SELF     0, "close"
    CALL     0, 3, 1
    RETURN   1


I haven't tested or compiled this code, but I have thought about it :) I think it's pretty close.

First, we'll deal with the non-local-exit object itself. It actually looks a lot like an UpVal, so we can just alter the UpVal structure a little bit to handle it. (No bits are wasted during this alteration, at least on 32-bit architectures.)


  typedef struct UpVal {
+ lu_byte nres; /* 0 (UpVal); 1 (Open); 2+ number of retvals expected */ TValue *v; /* stack pointer when open; otherwise &u.value or NULL */
    union {
      TValue value; /* UpVal's value when closed */
      struct {  /* doubly linked list (when open) */
        struct UpVal *prev;  /* really struct NLE */
        struct UpVal *next;  /* ditto */
+       Instruction *pc;     /* index of next instruction */
      } l;
    } u;

We can divide luaF_findupval into two pieces in order to use it to create blocks:


+ #define luaF_newupval(L,lvl) luaF_newnle(L, lvl, 0, NULL);
  UpVal *luaF_findupval (lua_State *L, StkId level) {
-   global_State *g = G(L);
    GCObject **pp = &L->openupval;
    UpVal *p;
-   UpVal *uv;
    while ((p = ngcotouv(*pp)) != NULL && p->v >= level) {
      lua_assert(p->v != &p->u.value);
      if (p->v == level) {  /* found a corresponding upvalue? */
        if (isdead(g, obj2gco(p)))  /* is it dead? */
          changewhite(obj2gco(p));  /* ressurect it */
        return p;
      pp = &p->next;
+   return luaF_newupval(L, level);
+ }
+ UpVal *luaF_newnle (lua_State *L, StkId level, int nres, Instruction *pc) {
+   global_State *g = G(L);
+   UpVal *uv;
    uv = luaM_new(L, UpVal);  /* not found: create a new one */
    uv->tt = LUA_TUPVAL;
    uv->marked = luaC_white(g);
+   uv->nres = nres;
    uv->v = level;  /* current value lives in the stack */
    uv->next = *pp;  /* chain it in the proper position */
    *pp = obj2gco(uv);
    uv->u.l.prev = &g->uvhead;  /* double link it in `uvhead' list */
    uv-> = g->;
    uv->>u.l.prev = uv;
    g-> = uv;
lua_assert(uv->>u.l.prev == uv && uv->u.l.prev-> == uv);
+   uv->pc = pc;
    return uv;

When a Block gets closed, we need to invalidate it, which we do by setting its v field to NULL. We also need to call any finalizers we run into along the way. CLOSE is called at the end of any 'block' construction, which has the interesting side-effect of calling the block's finalizer.

  void luaF_close (lua_State *L, StkId level) {
    UpVal *uv;
    global_State *g = G(L);
    while ((uv = ngcotouv(L->openupval)) != NULL && uv->v >= level) {
      GCObject *o = obj2gco(uv);
      lua_assert(!isblack(o) && uv->v != &uv->u.value);
      L->openupval = uv->next;  /* remove from `open' list */
      if (isdead(g, o))
        luaF_freeupval(L, uv);  /* free upvalue */
      else {
+       if (uv->nres > 0) {
+         StkId finalfunc = uv->v + 1;
+         uv->v = NULL;
+         if (!ttisnil(finalfunc)) {
+           ptrdiff_t savelevel = savestack(L, level);
+           setobjs2s(L->top, finalfunc);
+           L->top++;
+           luaD_call(L, L->top - 1, 0);
+         }
+       }
+       else {
          setobj(L, &uv->u.value, uv->v);
          uv->v = &uv->u.value;  /* now current value lives here */
          luaC_linkupval(L, uv);  /* link upvalue into `gcroot' list */
+       }

Obviously, we need to be able to call the nle. Since we've said that we can only call it from Lua without crossing any C boundary, we'll just do it directly in lvm.c without worrying about integrating it into ldo.c.

For simplicity, I'm adjusting nexeccalls here; I honestly don't see why nexeccalls is necessary, because we ought to be able to tell from the CallInfo structure which is being returned into whether it is a Lua return or a C return. But it's late and I want to get this thing sent off.

(new function in ldo.c)

+ int luaV_nle (lua_State *L, StkId func) {
+   UpVal *nle = upvalue(func);
+   StkId level = nle->v;
+   CallInfo *ci = L->ci;
+   int wanted = nle->nres - 2;
+   int ciadjust = 0;
+   if (level == NULL)
+     luaG_runerror(L, "non-local-exit called after scope exit");
+   /* Find the right ci */
+   while (ci->base > level) {
+     if (!f_isLua(ci))
+       luaG_runerror(L, "non-local-exit crosses C call");
+     --ci;
+     ++ciadjust;
+   }
+   /* Close out upvals and nles down to the return point */
+   /* (Including this nle)
+   luaF_close(L, level);
+   /* Move to the new callframe */
+   L->ci = ci;
+   L->base = ci->base;
+   /* Move the desired number of results onto the right place */
+   while (wanted != 0 && ++func < L->top) {
+     setobjs2s(L, level++, func);
+     --wanted;
+   }
+   while (wanted-- > 0)
+     setnilvalue(level++);
+   /* Set things up for luaV_execute */
+   if (nle->nres > 1)
+     L->top = ci->top;
+   else
+     L->top = level;
+   L->savedpc = nle->u.l.pc;
+   return ciadjust;
+ }

And then we need to call that from the relevant opcodes (OP_CALL and OP_TAILCALL, the code is the same):

      case OP_CALL: {
        int b = GETARG_B(i);
        int nresults = GETARG_C(i) - 1;
if (b != 0) L->top = ra+b; /* else previous instruction set top */
+       if (ttisupval(ra) && upvalue(ra)->nres > 0) {
+         nexeccalls -= luaV_nle(L, ra);
+         goto retentry;  /* No hooks, I'm afraid */
+       }

      case OP_TAILCALL: {
        int b = GETARG_B(i);
if (b != 0) L->top = ra+b; /* else previous instruction set top */
+       if (ttisupval(ra) && upvalue(ra)->nres > 0) {
+         nexeccalls -= luaV_nle(L, ra);
+         goto retentry;  /* No hooks, I'm afraid */
+       }

Finally, we need a new VM opcode:

Create the non-local-exit.
   OP_NLE a, 0, c
   a is the stack position at which to create the nle.
   c is the number of results + 1, or 0 for open (as usual)
The OP_NLE must be followed by a jump to the beginning of the block, and the opcode following the jump must be the end of the block (it is where
   the nle will resume execution.)

+     case OP_NLE: {
+       int nres = GETARG_C(i) + 1;
+       setupvalue(ra, luaF_newnle(L, ra, nres, pc + 1);
+       dojump(L, pc, GETARG_sBx(*pc) + 1);
+       Protect(luaC_checkGC(L));
+       continue;
+     }