Robert G. Jakabosky <bobby <at> sharedrealm.com> writes:
> I am very interested to see how you implement the interface to nested C data 
> structures.  I have done this for a private project and it is an interesting 
> problem (reference counting, nested structure, arrays of structures).

Yes, I have agonized for many many hours over this question.
I want upb to be easy to integrate into any VM/interpreter in
as "native" a way as possible, regardless of its memory
management scheme or concurrency model.  This is very
tricky to do well.  If you define a C-level "message" type that
each language wraps, you have to define a "one size fits all"
set of semantics that is both compatible with any conceivable
VM and efficient enough that people actually want to use it -- a
daunting task.  Your definition will also impose API decisions
on your clients.  It's also slightly awkward to have this parallel
graph of structures in C++ space and Lua space.

After a lot of thought, I decided that this was not the best
approach.  Instead, the encoders and decoders are pure
streaming (like SAX, or YAJL) where they just call callbacks
and leave the in-memory representation to clients.
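
As a rough illustration of the streaming idea (not upb's actual
API), here is a minimal sketch of a decoder that only handles
varint-typed fields and hands each one to a client callback.  The
names `handlers`, `decode`, `my_msg` and `store_field` are all
hypothetical; the point is only that the decoder never allocates
or owns a message, and the client decides where values live.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback set; illustrative only, not upb's interface. */
typedef struct {
  void (*on_varint)(void *closure, uint32_t field_number, uint64_t value);
} handlers;

/* Reads one base-128 varint.  Returns the number of bytes consumed,
 * or 0 if the input is truncated or longer than 10 bytes. */
size_t read_varint(const uint8_t *buf, size_t len, uint64_t *out) {
  uint64_t result = 0;
  for (size_t i = 0; i < len && i < 10; i++) {
    result |= (uint64_t)(buf[i] & 0x7f) << (7 * i);
    if (!(buf[i] & 0x80)) {
      *out = result;
      return i + 1;
    }
  }
  return 0;
}

/* Streams every field to the callback.  Returns 0 on success, -1 on
 * malformed input or on any wire type other than 0 (varint), which
 * this sketch does not support. */
int decode(const uint8_t *buf, size_t len, const handlers *h, void *closure) {
  assert(h != NULL && h->on_varint != NULL);
  size_t pos = 0;
  while (pos < len) {
    uint64_t tag, value;
    size_t n = read_varint(buf + pos, len - pos, &tag);
    if (n == 0) return -1;
    pos += n;
    if ((tag & 7) != 0) return -1;
    n = read_varint(buf + pos, len - pos, &value);
    if (n == 0) return -1;
    pos += n;
    h->on_varint(closure, (uint32_t)(tag >> 3), value);
  }
  return 0;
}

/* Example client: it keeps its own in-memory representation. */
typedef struct {
  uint64_t id;
  uint64_t count;
} my_msg;

void store_field(void *closure, uint32_t field_number, uint64_t value) {
  my_msg *m = closure;
  if (field_number == 1) m->id = value;
  else if (field_number == 2) m->count = value;
}
```

A Lua binding could pass a lua_State-backed closure instead of
`my_msg`, and a C++ client could pass an object pointer, without
the decoder changing at all.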

However, I *do* define an "accessor" type which is essentially
a vtable of function pointers that clients can use to read or write
a field, given a pointer to the message.  So there is still a standard way
to read and write the fields of a proto, but there is no standard
memory management scheme or concurrency model.  For more
info, see:
  https://github.com/haberman/upb/blob/master/src/upb_msg.h
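
To make the "vtable of function pointers" idea concrete, here is a
minimal sketch.  It is not the contents of upb_msg.h: the type
`field_accessor`, the `struct_*` functions and the `point` struct
are hypothetical names, and the assumption is a client that stores
its message as a plain C struct and locates fields by byte offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-field accessor: function pointers plus whatever
 * data the client's implementation needs to find the field. */
typedef struct field_accessor {
  int32_t (*get_int32)(const struct field_accessor *self, const void *msg);
  void (*set_int32)(const struct field_accessor *self, void *msg,
                    int32_t value);
  size_t offset; /* client-specific: byte offset of the field */
} field_accessor;

/* One possible implementation, for clients that use flat structs. */
int32_t struct_get_int32(const field_accessor *self, const void *msg) {
  int32_t value;
  assert(msg != NULL);
  memcpy(&value, (const char *)msg + self->offset, sizeof value);
  return value;
}

void struct_set_int32(const field_accessor *self, void *msg, int32_t value) {
  assert(msg != NULL);
  memcpy((char *)msg + self->offset, &value, sizeof value);
}

/* The client's own message layout; generic code never sees this type. */
typedef struct {
  int32_t x;
  int32_t y;
} point;

/* Accessor describing how to reach field "y" of a point. */
const field_accessor point_y = {struct_get_int32, struct_set_int32,
                                offsetof(point, y)};
```

Generic code only ever calls `acc->get_int32(acc, msg)`, so a
different client could supply accessors backed by a hash table or
a VM object without sharing a memory management scheme.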

My plan with Lua is to make each message a userdata.  The
integers and bools will go in the userdata's memory.  The
references to strings and submessages will go in the
userdata's environment table.  Then Lua's GC is in charge
and upb doesn't have to do any memory management.

> > Don't worry, I don't plan to use the "pb" namespace --
> > I'm planning to put everything under "upb", since that's
> > the name of my project:
> >   https://github.com/haberman/upb/wiki
> 
> I have been keeping an eye on your 'upb' project for a long time (I think for 
> more than a year) waiting for it to become usable.  I still can't wait to see 
> it finished.

Yes, I can't either.  :)  Sorry it's taken so long.  Much of that
time went into the agonizing I mentioned earlier.  I want to get
the core interfaces right before I have users, because I've been
doing very extensive refactoring and don't want to be burdened
with legacy support of bad interfaces.

> One of the reasons I started lua-pb is that I wanted to see how 
> close LuaJIT could get to the speed of your JIT'ed decoder (haven't optimized 
> the project for LuaJIT yet, so it is not even close right now).

I'll be interested to see this too!

> > Some other things to consider:
> > 
> > - do you plan to allow reparenting of nested messages?  eg.
> >   msg.foo = Foo().  I ask because you say you're emulating
> >   Python proto, which does not allow this AFAIK and instead
> >   uses the C++ convention of: msg.mutable_foo.  I've always
> >   thought this was awkward for a dynamic language, so plan
> >   to allow reparenting, but in that case you have to watch
> >   out for cycles that the user may create.
> 
> I don't plan on emulating everything from the Python/C++ interface, and 
> msg.mutable_foo is something I don't want to do.
> 
> I was planning on allowing a message to be referenced by multiple parent 
> messages, but restricting messages to one parent will allow invalidating the 
> cached "byte size" of the parent message when a field is changed.  Maybe I 
> will just add a "msg:Duplicate()" method.

Personally I wouldn't worry about putting cached byte sizes in the
message.  Just put them in a separate array.  I don't think caching
them in the message really needs to be done in the C++
implementation either (I'm not planning to do it).
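
One way the separate-array idea could look, as a sketch only: a
sizing pass walks the message tree in pre-order and writes each
message's size into a side array, and a serializer would then read
the sizes back in the same order.  The `node` type and
`compute_sizes` are hypothetical, and the framing is a toy (each
submessage costs a 1-byte length prefix, so sizes must stay below
128), not real protobuf field encoding.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical message tree; note there is no cached size field. */
typedef struct node {
  size_t payload_len; /* bytes of this message's own scalar data */
  const struct node *children[MAX_CHILDREN];
  size_t n_children;
} node;

/* Fills sizes[] in pre-order and returns the size of n.  *next is
 * the next free slot; cap is the capacity of sizes[]. */
size_t compute_sizes(const node *n, size_t *sizes, size_t cap, size_t *next) {
  assert(*next < cap);
  size_t slot = (*next)++; /* reserve this message's slot first */
  size_t total = n->payload_len;
  for (size_t i = 0; i < n->n_children; i++) {
    /* 1 byte for the toy length prefix, plus the child's body */
    total += 1 + compute_sizes(n->children[i], sizes, cap, next);
  }
  sizes[slot] = total;
  return total;
}
```

Because the sizes live outside the messages and are recomputed for
each serialization, nothing has to be invalidated when a field
changes, and a message reachable from two parents simply gets two
slots.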

Josh