GCC on MacOS X does align doubles to 8-byte boundaries -- at least by
default. I haven't yet looked into whether there is an easy way to change
that locally (since I don't want to do it to my whole program).

With that alignment and the default declarations in lobject.h, Node weighs
in at 40 bytes, most of which is wasted space. I reworked things as I'll
describe below and got Node down to 24 bytes (which is better than the
default headers would have allowed for even with 4-byte alignment).

I have not addressed the array section, which is definitely worth trimming:
if one can accept 4-byte alignment for doubles, each entry shrinks from
16 bytes to 12 bytes.
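
As a rough illustration of the array-section saving, here is a minimal
sketch. It assumes a compiler that honours #pragma pack (GCC, Clang and
MSVC do), which is only one possible way to get 4-byte alignment locally.
Value, Slot and Slot4 are simplified stand-ins, not the declarations from
lobject.h.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for Lua's Value union (assumption). */
typedef union {
  void *p;
  double n;
  int b;
} Value;

/* Default alignment: where doubles are 8-byte aligned, the struct is
   padded up to a multiple of 8 (typically 16 bytes). */
typedef struct {
  Value value;
  int tt;
} Slot;

/* Same fields with member alignment capped at 4 bytes, so no padding
   follows tt (12 bytes when int is 4 bytes and double is 8). */
#pragma pack(push, 4)
typedef struct {
  Value value;
  int tt;
} Slot4;
#pragma pack(pop)

/* The capped layout can never be larger than the default one. */
static_assert(sizeof(Slot4) <= sizeof(Slot), "packing must not grow the slot");

size_t slot_size(void)  { return sizeof(Slot); }
size_t slot4_size(void) { return sizeof(Slot4); }
```

Whether doubles that are only 4-byte aligned are acceptable (or fast) on
a given target is a separate question that this sketch does not answer.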

Mark

----

A summary of what I did. This may evolve into a patch, but I haven't made
one yet. I'd also like feedback on the changes.

1. Split out the definition of the contents of TObject into a TObjectHeader
macro a la GCHeader, etc.

2. Reorder the fields in TObjectHeader and change the type of tt to a byte.
So, it now looks like:

#define TObjectHeader Value value; lu_byte tt

typedef struct lua_TObject {
  TObjectHeader;
} TObject;


3. Redefine Node as follows:

typedef union Node {
  TObject i_val;
  struct {
    TObjectHeader; /* Should match the layout of the i_val portion. */
    lu_byte tt_key;
    union Node *next;  /* for chaining */
    Value v_key;
  };
} Node;
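
To check the layout claims, here is a minimal sketch that puts the stock
layout next to the reworked one. It assumes a C11 compiler (anonymous
structs, static_assert). Value is a simplified stand-in, and OldTObject
and OldNode are hypothetical names for declarations modeled on the
default lobject.h.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char lu_byte;

/* Simplified stand-in for Lua's Value union (assumption). */
typedef union {
  void *p;
  double n;
  int b;
} Value;

/* Stock layout: the int tag comes first, so padding is added before
   each Value and again after the trailing pointer. */
typedef struct { int tt; Value value; } OldTObject;
typedef struct OldNode {
  OldTObject i_key;
  OldTObject i_val;
  struct OldNode *next;
} OldNode;

/* Reworked layout from steps 2 and 3. */
#define TObjectHeader Value value; lu_byte tt

typedef struct lua_TObject {
  TObjectHeader;
} TObject;

typedef union Node {
  TObject i_val;
  struct {                  /* anonymous struct: C11 or GNU C */
    TObjectHeader;          /* overlays i_val */
    lu_byte tt_key;
    union Node *next;
    Value v_key;
  };
} Node;

/* The value header sits at the same offsets in both views. */
static_assert(offsetof(Node, value) == offsetof(Node, i_val.value), "value");
static_assert(offsetof(Node, tt) == offsetof(Node, i_val.tt), "tt");
/* tt_key lands in what would otherwise be padding after tt. */
static_assert(offsetof(Node, tt_key) == offsetof(Node, tt) + 1, "tt_key");
/* The reworked node is smaller than the stock one. */
static_assert(sizeof(Node) < sizeof(OldNode), "Node should shrink");
```

With 4-byte pointers and 8-byte double alignment this should give the 40
and 24 bytes quoted above. With 8-byte pointers the sizes are typically
40 and 32.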


4. Introduce a bunch of macros that combine gkey with other operations (word
wrap may mangle these in e-mail):

#define gkey_ttype(n) ((n)->tt_key)
#define gkey_value(n) ((n)->v_key)
#define gkey_setnilvalue(n) (gkey_ttype(n) = LUA_TNIL)
#define gkey_ttisnil(n) (gkey_ttype(n) == LUA_TNIL)
#define gkey_ttisnumber(n) (gkey_ttype(n) == LUA_TNUMBER)
#define gkey_nvalue(nd) \
  check_exp((gkey_ttype(nd) == LUA_TNUMBER), gkey_value(nd).n)
#define gkey_bvalue(n) \
  check_exp((gkey_ttype(n) == LUA_TBOOLEAN), gkey_value(n).b)
#define gkey_ttisstring(n) (gkey_ttype(n) == LUA_TSTRING)
#define gkey_tsvalue(n) \
  check_exp((gkey_ttype(n) == LUA_TSTRING), &(gkey_value(n).gc->ts))
#define gkey_iscollectable(n) (gkey_ttype(n) >= LUA_TSTRING)
#define gkey_setttype(n, tt) (gkey_ttype(n) = (tt))
#define gkey_gcvalue(n) \
  check_exp(gkey_iscollectable(n), gkey_value(n).gc)
#define gkey_pvalue(n) \
  check_exp(gkey_ttype(n) == LUA_TLIGHTUSERDATA, gkey_value(n).p)

#define gkey_checkconsistency(n) \
  lua_assert(!gkey_iscollectable(n) || \
             (gkey_ttype(n) == gkey_value(n).gc->gch.tt))


And some versions of the set routines that expect to work with the key field
in nodes:

#define setobjk2(obj1,node2) \
  { const Node *n2=(node2); TObject *o1=(obj1); \
    gkey_checkconsistency(n2); \
    o1->tt=gkey_ttype(n2); o1->value = gkey_value(n2); }

#define setobjk2s setobjk2

#define setobj2k(node1,obj2) \
  { const TObject *o2=(obj2); Node *n1=(node1); \
    checkconsistency(o2); \
    gkey_ttype(n1) = o2->tt; gkey_value(n1) = o2->value; }
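
A minimal sketch of how the two set macros round-trip a key through a
node. The types are simplified stand-ins, the consistency checks are
stubbed out (they only matter for collectable values), and
roundtrip_number_key is a hypothetical helper that exists only for this
illustration.

```c
#include <assert.h>

#define LUA_TNUMBER 3

typedef unsigned char lu_byte;
typedef union { void *p; double n; int b; } Value;  /* simplified stand-in */

#define TObjectHeader Value value; lu_byte tt
typedef struct lua_TObject { TObjectHeader; } TObject;

typedef union Node {
  TObject i_val;
  struct { TObjectHeader; lu_byte tt_key; union Node *next; Value v_key; };
} Node;

#define gkey_ttype(n) ((n)->tt_key)
#define gkey_value(n) ((n)->v_key)

/* Stubs (assumption): no collectable values appear in this sketch. */
#define checkconsistency(o)      ((void)0)
#define gkey_checkconsistency(n) ((void)0)

/* Copy a node's key into a TObject. */
#define setobjk2(obj1,node2) \
  { const Node *n2=(node2); TObject *o1=(obj1); \
    gkey_checkconsistency(n2); \
    o1->tt=gkey_ttype(n2); o1->value = gkey_value(n2); }

/* Copy a TObject into a node's key. */
#define setobj2k(node1,obj2) \
  { const TObject *o2=(obj2); Node *n1=(node1); \
    checkconsistency(o2); \
    gkey_ttype(n1) = o2->tt; gkey_value(n1) = o2->value; }

/* Hypothetical helper: store a number as a node key, then read it back. */
double roundtrip_number_key(double d) {
  Node node;
  TObject in, out;
  in.value.n = d;
  in.tt = LUA_TNUMBER;
  setobj2k(&node, &in);    /* TObject -> key fields of the node */
  setobjk2(&out, &node);   /* key fields of the node -> TObject */
  assert(out.tt == LUA_TNUMBER);
  return out.value.n;
}
```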


5. In lgc.c, I had to introduce gkey_condmarkobject, which parallels
condmarkobject but takes a node.

6. Introduce versions of luaO_rawequalObj, valismarked, arrayindex, and
luaH_mainposition that take nodes instead of objects and then use the gkey_
macros to access the appropriate data.
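
To give a flavour of what such a node-taking variant might look like,
here is a minimal sketch of a raw-equality check between a node's key and
a TObject. The name gkey_rawequal, the simplified Value union and the
demo_rawequal driver are assumptions made for illustration. They are not
taken from the actual changes.

```c
#include <assert.h>

#define LUA_TNIL            0
#define LUA_TBOOLEAN        1
#define LUA_TLIGHTUSERDATA  2
#define LUA_TNUMBER         3

typedef unsigned char lu_byte;
/* Simplified stand-in: gc is a plain pointer here (assumption). */
typedef union { void *gc; void *p; double n; int b; } Value;

#define TObjectHeader Value value; lu_byte tt
typedef struct lua_TObject { TObjectHeader; } TObject;

typedef union Node {
  TObject i_val;
  struct { TObjectHeader; lu_byte tt_key; union Node *next; Value v_key; };
} Node;

#define gkey_ttype(n) ((n)->tt_key)
#define gkey_value(n) ((n)->v_key)

/* Hypothetical node-taking counterpart of luaO_rawequalObj: compares
   the key stored in node n against object o. */
int gkey_rawequal(const Node *n, const TObject *o) {
  if (gkey_ttype(n) != o->tt) return 0;
  switch (gkey_ttype(n)) {
    case LUA_TNIL:           return 1;
    case LUA_TBOOLEAN:       return gkey_value(n).b == o->value.b;
    case LUA_TLIGHTUSERDATA: return gkey_value(n).p == o->value.p;
    case LUA_TNUMBER:        return gkey_value(n).n == o->value.n;
    default:                 return gkey_value(n).gc == o->value.gc;
  }
}

/* Hypothetical driver: a number key equals the same number and differs
   from a boolean. Returns 1 when both expectations hold. */
int demo_rawequal(void) {
  Node node;
  TObject same, other;
  gkey_ttype(&node) = LUA_TNUMBER;
  gkey_value(&node).n = 42.0;
  same.tt = LUA_TNUMBER;
  same.value.n = 42.0;
  other.tt = LUA_TBOOLEAN;
  other.value.b = 1;
  return gkey_rawequal(&node, &same) && !gkey_rawequal(&node, &other);
}
```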

7. In resize in ltable.c, we have to build a TObject for the old key by
copying the appropriate fields.

8. Update all code that uses the gkey macro. The compiler is helpful in
finding these since it will complain about the absence of i_key. In fact,
most of my other changes were driven by making the initial structure
redefinitions and then working through the compilation errors.

Summary: These changes improve the compiler's ability to pack the Node
structure. They come at the expense of some of the clarity in the existing
code, but the changes are relatively consistent and the altered code doesn't
seem that much harder to understand than the original code. The only
operation whose performance should suffer is resize, and the impact there
should be very minor. The code growth is also minimal.