Engram Proposal

Runtime serialisation for messaging and persistence using the dump format

It has been noticed at least once [1] that the format used by string.dump to serialise Lua code at the virtual machine instruction level has all the features needed to serve as a serialisation format for runtime Lua objects. However, because string.dump serialises functions rather than closures, it does not capture upvalues, so there has been no way to get runtime values into a dump. This proposal suggests an extension to the Lua core to achieve this.
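
The limitation can be seen in stock Lua without any core changes. The following is a minimal sketch, not part of the proposal: the names getter and restored are illustrative, and the loadstring-or-load line is only there so the snippet also runs on Lua versions later than 5.1.

```lua
-- Portable loader: loadstring in Lua 5.1, load in later versions.
local loadstring = loadstring or load

local data = "testit"                      -- runtime value held in an upvalue
local getter = function() return data end  -- closure over `data`

-- string.dump writes only the function prototype, not the captured value.
local restored = loadstring(string.dump(getter))

print(getter())                -- the original closure still sees "testit"
print(restored() == "testit")  -- false: the captured value did not travel
```

What the reloaded function returns instead depends on the Lua version, which is why the sketch only checks that it is not the original string.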

The term engram is hereby coined (or borrowed from neuroscience) to refer to a function which, when executed, returns the data object encapsulated within it. An engram generator is a library function which receives a data object and returns its engram. The engram generator creates a Lua function directly in virtual machine instruction format. This function is equivalent to the result of compiling a constructor for the object in its run-time state at the time of generation.
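
To make "equivalent to compiling a constructor" concrete, here is a hand-written stand-in that runs in stock Lua. It is only a sketch of the idea: the constructors are typed in by hand rather than generated from a runtime value, which is exactly the step the proposed engram generator would automate. The names en, en_table, bytes and restored are illustrative.

```lua
local loadstring = loadstring or load  -- loadstring in 5.1, load later

-- Hand-written stand-in for an engram of the string "testit":
-- the value lives in the function's constant table, not in an upvalue.
local en = function() return "testit" end

-- Serialise and reload, as a receiving Lua instance would.
local bytes = string.dump(en)
local restored = loadstring(bytes)
print(restored())  -- testit

-- The same idea for a nested table: the constructor runs on each call.
local en_table = function()
  return { name = "testit", list = { 1, 2, true } }
end
local t = loadstring(string.dump(en_table))()
print(t.name, t.list[2], t.list[3])  -- testit, 2 and true, tab-separated
```

Because the data is held as constants inside the function prototype, it survives string.dump, which a value captured in a closure does not.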

Lua Code

data = "testit"
en = engram(data)
en = loadstring(string.dump(en))
data = nil
print(en())
This example prints "testit". engram is the engram generator library function, and it is the only additional code required for a complete serialisation round trip. The string generated by string.dump can be saved to a file or database, or transmitted through a pipe or a network connection to another Lua machine or instance. No code is needed to read the serialised format: the existing loadstring library function is fine as it is. Under the proposal, data can be a number, a boolean, a string, or a table whose keys are of these types and whose values are of these types or nested tables meeting the same criteria.
func = function(x) print(x) end
data = "testit"
en = engram(data, func)
en = loadstring(string.dump(en))
data = nil; func = nil
en()
This example also prints "testit". The concept is extended to encapsulate an inner Lua function along with the data in the engram function. When the engram is executed, it tail-calls the inner function, passing it the reconstructed data object. Both the inner function and the data are encapsulated within the engram function and are serialised by string.dump.
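
The shape of such an engram can again be approximated by hand in stock Lua. This is a sketch under the assumption that the inner function has no upvalues of its own; the inner function here returns a string instead of printing, purely so the result is easy to check.

```lua
local loadstring = loadstring or load  -- loadstring in 5.1, load later

-- Hand-written stand-in for engram("testit", func):
-- the inner function is a nested prototype, the data is a constant,
-- and the outer function tail-calls the inner one with the data.
local en = function()
  return (function(x) return "received " .. x end)("testit")
end

-- Both the inner function and the data travel inside the dump.
local restored = loadstring(string.dump(en))
print(restored())  -- received testit
```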

C Code

Unfortunately it is not possible to write the engram generator function using the published Lua C API. It is necessary to compile the function into the Lua core. The following code is highly experimental and is presented here as proof of concept and for expert code review. It is my first attempt at modifying the Lua core and I may well have misunderstood some details or missed some essential step. In particular I am concerned that I may not be 'playing nice' with the Lua Garbage Collector. Critique by Lua experts will be much appreciated!

Also note that tables are not supported in this version. I expect to remedy this over the next few days.

Header File (lengram.h):

/*
** $Id: lengram.h,v 1.0.0.0 2009/02/02 John Hind $
** Engram add-on for Lua **EXPERIMENTAL**
** See Copyright Notice in lua.h
*/

#ifndef lengram_h
#define lengram_h

LUA_API int luaX_engram (lua_State *L, int usesfunc);
LUALIB_API int luaopen_engram (lua_State *L);

#endif

Code File (lengram.c):

/*
** $Id: lengram.c,v 1.0.0.0 2009/02/02 John Hind $
** Engram add-on for Lua **EXPERIMENTAL**
** See Copyright Notice in lua.h
*/

#include <assert.h>
#include <math.h>
#include <stdarg.h>
#include <string.h>

#define lengram_c
#define LUA_CORE

#include "lua.h"
#include "lauxlib.h"

#include "lengram.h"

#include "ldo.h"
#include "lfunc.h"
#include "lmem.h"
#include "lobject.h"
#include "lopcodes.h"
#include "lstring.h"

//On entry:
// top - 1 : Lua Data object (number, boolean, string or table)
// top - 2 : Inner Lua Function (if (usesfunc))
//On exit:
// top - 1 : Engram Lua Function
LUA_API int luaX_engram (lua_State *L, int usesfunc) {

  int pc; int kc; int ms; int tab; Proto* f; Closure* cl;

  if (lua_istable(L, -1))
  {
    // TODO: Check table and evaluate resources needed
    tab = 1;
    luaL_error(L, "Engram: Unsupported type (for now)");
  }
  else
  {
    tab = 0;
    pc = (usesfunc)? 4 : 2;  //Number of opcodes
    kc = 1;                  //Number of constants
    ms = 2;                  //Number of registers
  }

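  f = luaF_newproto(L);
  // L->top is the first free slot, not a live value, so take the environment
  // from the global table, as ldo.c does when it loads a chunk.
  cl = luaF_newLclosure(L, 0, hvalue(gt(L)));
  cl->l.p = f;
  setclvalue(L, L->top, cl); incr_top(L);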
  f->source = luaS_newliteral(L,"=(ENGRAM)");
  f->maxstacksize = ms;

  if (usesfunc)
  {
    f->p = luaM_newvector(L, 1, Proto*);  //Space for one inner function
    f->p[0] = clvalue(L->top - 3)->l.p;	  //Insert the inner function
    f->sizep = 1;                         //Number of functions
  }

  f->k = luaM_newvector(L, kc, TValue);   //Space for constants
  kc = 0;
  if (tab)
  {
    //TODO: Define the table constants
  }
  else
  {
    f->k[kc++] = *(L->top - 2);
  }
  f->sizek = kc;

  f->code = luaM_newvector(L, pc, Instruction);	//Space for opcodes
  pc=0;
  if (tab)
  {
    // TODO: Table code generator
  }
  else
  {
    f->code[pc++] = CREATE_ABx(OP_LOADK, 1, 0);
  }
  if (usesfunc)
  {
    f->code[pc++] = CREATE_ABx(OP_CLOSURE, 0, 0);
    f->code[pc++] = CREATE_ABC(OP_TAILCALL, 0, 2, 0);
    f->code[pc++] = CREATE_ABC(OP_RETURN, 0, 0, 0);
  }
  else
  {
    f->code[pc++] = CREATE_ABC(OP_RETURN, 1, 2, 0);
  }
  f->sizecode = pc;
	
  return 0;
}

// The C function published to Lua
static int LuaEngram (lua_State *L) {

  if (!lua_isfunction(L, -1)) lua_pushnil(L);
  lua_settop(L, 2);
  lua_insert(L, 1);
  switch (lua_type(L, 2)) {case LUA_TBOOLEAN: case LUA_TNUMBER: case LUA_TSTRING: case LUA_TTABLE: break;
    default: luaL_error(L, "Engram: Unsupported data type"); break;}
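  switch (lua_type(L, 1)) {case LUA_TFUNCTION: case LUA_TNIL: break;
    default: luaL_error(L, "Engram: Second parameter must be a function"); break;}
  // A C function has no Proto to embed, so only Lua functions are accepted
  if (lua_iscfunction(L, 1)) luaL_error(L, "Engram: Second parameter must be a Lua function");
  luaX_engram(L, lua_isfunction(L, 1));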
  return 1;
}

// Open the engram library
LUALIB_API int luaopen_engram (lua_State *L) {

  lua_pushcfunction(L, LuaEngram);
  lua_setglobal(L, "engram");
  return 0;
}

-- JohnHind

Please add any comments or queries below, and tag them with your name:


Last edited February 5, 2009 11:56 am GMT