steve donovan <steve.j.donovan <at> gmail.com> writes:

> 
> Hi all,
> 
> We all know that global variables can be a pain [1] and should be
> avoided.  The 'strict struct' pattern brings these benefits to tables
> with named keys.
> 
> A 'struct' can be declared so:
> 
> struct.Alice {
> 	x = 1;
> 	y = 2;
> }
> 
> And instantiated like so:
> 
> a = Alice {x = 10, y = 20}
> 
> or
> 
> b = Alice {x = 10}  -- y will be set to 2
> 
> Any attempt to access an unknown field of a and b will be an error,
> like a.z = 1 or print(b.zz), or even Alice{z = 4}.
> 
> So this brings two things to the party:
> (1)  typos in fieldnames are errors, not silent problems.
> (2) such tables now have an _identity_, and this in particular helps
> when trying to write more self-documenting code. In LuaDoc, you can
> then confidently give the type of a parameter as Alice, rather than 'a
> table with x and y being numbers'
> 
> Stronger typing also means that type-specific assertions can be thrown.
> 
> A simple overload of __tostring would also give you type-specific
> string representations like 'Alice #23' for debugging purposes.
> 
> It would be possible (using a suitable proxy table) to enforce dynamic
> type checking on field assignments, but of course this would incur a
> run-time cost.
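
For anyone skimming, the pattern quoted above could be sketched in plain
Lua roughly as follows.  This is only an illustration under my own
assumptions, not steve's actual code: strict_struct is a made-up helper
name, and it returns a constructor directly instead of supporting the
struct.Alice {...} declaration syntax.

```lua
-- Hypothetical helper (not from the original post): builds a constructor
-- for tables that reject unknown field names.
local function strict_struct(name, defaults)
  local function unknown(key)
    error(("%s has no field '%s'"):format(name, tostring(key)), 3)
  end

  local mt = {}
  -- __index and __newindex only fire for keys missing from the table,
  -- so known fields (always present with a default) are read and
  -- written at normal table speed.
  mt.__index = function(_, key) unknown(key) end
  mt.__newindex = function(_, key) unknown(key) end
  mt.__tostring = function() return name end

  return function(init)
    local obj = {}
    for k, v in pairs(defaults) do obj[k] = v end
    for k, v in pairs(init or {}) do
      if defaults[k] == nil then
        error(("%s has no field '%s'"):format(name, tostring(k)), 2)
      end
      obj[k] = v
    end
    return setmetatable(obj, mt)
  end
end

local Alice = strict_struct("Alice", {x = 1, y = 2})
local b = Alice {x = 10}   -- y falls back to the default, 2
```

One limitation of this sketch: assigning nil to a known field removes it
from the table, after which that field is treated as unknown.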

So I have been working on a project that does something a lot like this.
It's a little bit bigger in scope than what you describe.  It's based on
Protocol Buffers, the serialization format Google released last year.

The idea is this: you define your structure type as a protocol buffer
type.  So instead of writing this in your .lua file:

struct.Alice {
  x = 1;
  y = 2;
}

...you can write this in a .proto file:

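message Alice {
  // "= 1" and "= 2" are field numbers, not values; defaults are separate.
  optional int32 x = 1 [default = 1];
  optional int32 y = 2 [default = 2];
}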

or you could write this in a .lua file:

Alice = upb.new_type([[
  message Alice {
    optional int32 x = 1 [default = 1];
    optional int32 y = 2 [default = 2];
  }
]])

This may seem a bit roundabout to you, but here's what it buys you:

- instances of Alice have a more efficient representation in memory.
  Since the set of all possible fields is known, each instance can be
  stored as an array with known offsets for each member instead of as
  a table keyed by "x" and "y".

- since "x" and "y" are strongly typed, we can throw an error if you
  put the wrong types in them.  There might be cases where you don't
  want this -- for those cases, your solution would be more appropriate.

- you get a very efficient, backward compatible, language-independent
  serialization format for free (protocol buffers).  You can serialize
  it to disk and load it into Python, Ruby, C++, etc.

- to get even slightly crazier, you can pass an in-memory "Alice"
  instance to any other language that has support for my pb system.
  You can pass it to C++ and get full reflection over the fields
  and values, without copying.
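
To make the first two points a bit more concrete, here is a rough
pure-Lua sketch of the idea: each field name maps to a fixed slot
number, values live in the array part of the instance, and assignments
are checked against int32.  This is not upb's API (the Lua bindings
don't exist yet); new_type and is_int32 here are names made up for the
illustration, and the real thing would keep the data in C.

```lua
local INT32_MIN, INT32_MAX = -2^31, 2^31 - 1

-- True for whole numbers that fit in a signed 32-bit integer.
local function is_int32(v)
  return type(v) == "number" and v == math.floor(v)
     and v >= INT32_MIN and v <= INT32_MAX
end

-- Hypothetical helper: field_names is an ordered list; the position of
-- each name becomes its slot (offset) in every instance.
local function new_type(name, field_names)
  local slot = {}
  for i, fname in ipairs(field_names) do slot[fname] = i end

  local function slot_of(key)
    local i = slot[key]
    if not i then
      error(("%s has no field '%s'"):format(name, tostring(key)), 3)
    end
    return i
  end

  local mt = {}
  -- Named keys never exist in the raw table, so every named access
  -- goes through these metamethods and lands on a numeric slot.
  function mt.__index(self, key)
    return rawget(self, slot_of(key))
  end
  function mt.__newindex(self, key, value)
    local i = slot_of(key)
    if not is_int32(value) then
      error(("%s.%s expects an int32"):format(name, tostring(key)), 2)
    end
    rawset(self, i, value)
  end

  return function(init)
    local obj = setmetatable({}, mt)
    for k, v in pairs(init or {}) do obj[k] = v end
    return obj
  end
end

local Alice = new_type("Alice", {"x", "y"})
local a = Alice {x = 10, y = 20}
```

In this sketch an unset field simply reads as nil, and a.x = "ten" or
a.z = 1 raises an error.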

If this all sounds a bit heavyweight, let me reassure you: my library
is currently ~2300 SLOC of C and compiles to <30 KB of object code on
x86.  You can check it out here (the Lua parts aren't written yet):

http://github.com/haberman/upb/tree/master

I won't argue that people should use this in every situation.  But if
you're in the market for a strongly-typed, compact, language-independent, 
serializable struct-like object, I think you might like it.
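
To give a feel for how compact the serialized form is, here is a small
pure-Lua sketch that encodes an Alice by hand using the protocol buffer
wire format: each field is a tag (field number * 8 + wire type, with
wire type 0 for varints) followed by the value as a varint.  The
function names are made up for the illustration, it only handles
non-negative values, and it has nothing to do with how upb does it
internally.

```lua
-- Encode a non-negative integer as a base-128 varint:
-- 7 bits per byte, low bits first, high bit set on all but the last byte.
local function varint(n)
  assert(type(n) == "number" and n >= 0 and n == math.floor(n),
         "this sketch only handles non-negative integers")
  local bytes = {}
  repeat
    local low = n % 128
    n = math.floor(n / 128)
    if n > 0 then low = low + 128 end  -- more bytes follow
    bytes[#bytes + 1] = string.char(low)
  until n == 0
  return table.concat(bytes)
end

-- Hypothetical encoder for "message Alice" (x is field 1, y is field 2).
-- Unset (nil) fields are simply left out of the output.
local function encode_alice(msg)
  local out = {}
  if msg.x ~= nil then out[#out + 1] = varint(1 * 8 + 0) .. varint(msg.x) end
  if msg.y ~= nil then out[#out + 1] = varint(2 * 8 + 0) .. varint(msg.y) end
  return table.concat(out)
end

local wire = encode_alice {x = 10, y = 20}
-- wire is the 4 bytes 08 0A 10 14
```

Any other protocol buffer implementation should be able to read those
four bytes back as an Alice with x = 10 and y = 20.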

ETA: 2 weeks to 2 months (I'm bad at estimating).

Josh