> -----Original Message-----
> From: lua-l-bounces@lists.lua.org [mailto:lua-l-bounces@lists.lua.org] On
> Behalf Of Coda Highland
> Sent: Tuesday, 6 October 2015 21:31
> To: Lua mailing list
> Subject: Re: The hypothetical __serialize metamethod (was Re: [ANN] luaproc
> 1.0-4)
> 
> On Tue, Oct 6, 2015 at 12:07 PM, Thijs Schreijer
> <thijs@thijsschreijer.nl> wrote:
> >
> >> >
> >> >   But wouldn't there have to be a consensus as to *what* __serialize
> >> > returns?  I mean, obviously, a string of byte values (variation on a
> >> > string) but the actual contents can vary widely.  Given a simple Lua
> >> > table:
> >> >
> >> >         { 1 , "two" , true }
> >> >
> >> >   One person might want to serialize that as JSON:
> >> >
> >> >         [ 1 , "two" , true ]
> >> >
> >> >   Someone else might want BSON [1] (hex dump follows):
> >> >
> >> >         13 00 00 00 04 00 0D 00
> >> >         00 00 30 00 01 00 00 00
> >> >         31 00 74 77 6F 32 01
> >> >
> >> >   Another one (like me) might want to serialize to CBOR [2] (hex dump
> >> > follows):
> >> >
> >> >         83 01 63 74 77 6F F5
> >> >
> >> > and yet another might want straight up Lua:
> >> >
> >> >         { 1 , "two" , true }
> >> >
> >> >   Is it better to perhaps just reserve "__serialize" for serialization
> >> > and leave it up to modules to flesh it out?  Or do we need to actually
> >> > define the output format?
> >> >
> >> >   -spc (Not a proposal, just something to talk about ... )
> >> >
> >> > [1]     http://bsonspec.org/spec.html
> >> >
> >> > [2]     RFC-7049
> >> >
> >>
> >> Solution: Don't call it __serialize! Call it __json, __bson, __cbor,
> >> or whatever's actually appropriate for the format.
> >>
> >> You could borrow a Pythonism and use __repr for Lua syntax.
> >>
> >> /s/ Adam
> >
> > If that were implemented, now how would my code know which of those it
> > would need to call? Just `__serialize` should do.
> >
> > It should either return plain Lua values with no recursion (as mentioned
> > earlier), or simply a string. Though the latter option might have everybody
> > reinventing the serialization, whilst some good libraries are available, so
> > I would prefer the former.
> >
> > A second return value can be added to identify the type. Even if that type
> > might be prone to collisions, a recent remark in the thread about the
> > usefulness of the registry asked for any cases where there were collisions
> > in the registry. I didn't see any response on that. So I don't think it to
> > be such a big issue.
> > Just setting some proper examples with namespacing should set people off in
> > the right direction.
> >
> > So for the LuaDate library I'm maintaining, something like
> > "lua:thijsschreijer.nl/luadate/1.0" would be a fine type name I guess.
> >
> > Thijs
> 
> Your question illustrates exactly why it's relevant: You would only
> ask that question if you aren't actually thinking about serialization.
> If all you want is just "shove this into a byte array and get it back
> out later" then any one of them COULD work, but one thing you
> definitely DON'T want to do is to mix metaphors -- and that's exactly
> the kind of trouble you'd get in if you aren't specifically asking for
> a particular format, because different library maintainers might
> integrate with different serialization modules.

This is Lua; different library maintainers WILL use different serializers. No doubt.

So trying to force a single way upon everyone will not work. Hence my two-step proposal.
The `__serialize` metamethod would deliver a non-recursive plain Lua value, and the consumer of the module can then apply their own format on top. I like `serpent` for the `__repr` format you mentioned, or I might use `dkjson` if I needed your `__json` format.

So the `__serialize` method should only simplify the application-, module-, or object-specific structures. The consumer can then pack the result up in whatever way it needs.
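
To make the first step concrete, here is a rough sketch. Nothing in it is an existing API: `__serialize` is the hypothetical metamethod under discussion, `Date` is a made-up stand-in for a library object, the type name uses a placeholder namespace, and wrapping the result as `{ type = ..., value = ... }` is just one way the consumer could keep the second return value around.

```lua
-- Made-up date class standing in for a library object.
local Date = {}
Date.__index = Date

function Date.new(y, m, d)
  return setmetatable({ year = y, month = m, day = d }, Date)
end

-- Hypothetical metamethod: returns a plain Lua value (no metatables)
-- and, as a second return value, a namespaced type name.
function Date:__serialize()
  return { self.year, self.month, self.day }, "lua:example.org/date/1.0"
end

-- Consumer side: replace every object that offers __serialize by its
-- plain form. Assumes the data has no cycles; there is no cycle check.
local function simplify(value)
  local mt = getmetatable(value)
  local hook = type(mt) == "table" and mt.__serialize
  if hook then
    local plain, typename = hook(value)
    return { type = typename, value = simplify(plain) }
  end
  if type(value) ~= "table" then
    return value
  end
  local copy = {}
  for k, v in pairs(value) do
    copy[k] = simplify(v)
  end
  return copy
end

local event = { name = "release", on = Date.new(2015, 10, 6) }
local plain = simplify(event)
-- plain.on is now a plain table:
-- { type = "lua:example.org/date/1.0", value = { 2015, 10, 6 } }
```

The result contains only tables, numbers and strings, so it can be handed to whichever serializer the consumer prefers.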

> 
> That said: I would imagine you'd probably want to use __repr most
> times, since the deserializer would be load(). In this format, types
> could be serialized as "setmetatable({}, require('LuaDate').Date)" or
> something like that, unless you as the maintainer added a __repr that
> would return something more like "require('LuaDate').Date(m,d,y)".
> (NB: This is just me throwing something out there as a casual example,
> not fully thought out or fleshed out.)
> 
> Meanwhile, someone writing a JSON library would use __json instead,
> and anyone wishing to integrate with that library would offer __json.
> No conflict, no problem, everyone's happy. (And if you as the LuaDate
> maintainer didn't offer __json and someone really wanted to they could
> monkey-patch in a __json in their own code, again without stepping on
> anyone's toes.)

The two-step approach is generic enough to not require any monkey-patching at all. Consider using 3 libraries, each using one of `__repr`, `__json` or `__bson`. Your way would force me to pick one format and monkey-patch the other two.
In my proposal, I would simply call `__serialize` on each and throw those results at `serpent` or `dkjson` and be done with it. (The performance of those two libraries will probably always be better than anything I would come up with for a monkey-patch anyway.)
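
As a sketch of that scenario: three made-up objects, imagined as coming from three different libraries, each offering the hypothetical `__serialize`. The consumer does the same thing for all of them and only then picks a format. The two tiny encoders are stand-ins I wrote for illustration; they handle only scalars and flat arrays, and real code would pass the plain values to `serpent` or `dkjson` instead.

```lua
-- Helper for this example only: attach a hypothetical __serialize.
local function make(fields, to_plain)
  return setmetatable(fields, { __serialize = to_plain })
end

local date  = make({ y = 2015, m = 10, d = 6 },
                   function(self) return { self.y, self.m, self.d } end)
local point = make({ x = 1, y = 2 },
                   function(self) return { self.x, self.y } end)
local flag  = make({ on = true },
                   function(self) return self.on end)

-- Step one, identical for every object: ask it for its plain form.
local function plain_of(obj)
  return getmetatable(obj).__serialize(obj)
end

-- Step two, chosen by the consumer. Toy encoder for scalars and flat
-- arrays; strings are quoted with %q, which is only safe for simple text.
local function encode(value, open, close)
  if type(value) == "string" then return string.format("%q", value) end
  if type(value) ~= "table" then return tostring(value) end
  local parts = {}
  for i, v in ipairs(value) do
    parts[i] = encode(v, open, close)
  end
  return open .. table.concat(parts, ",") .. close
end

local function to_lua(value)  return encode(value, "{", "}") end
local function to_json(value) return encode(value, "[", "]") end

local as_lua, as_json = {}, {}
for i, obj in ipairs({ date, point, flag }) do
  local plain = plain_of(obj)
  as_lua[i]  = to_lua(plain)
  as_json[i] = to_json(plain)
end
-- as_lua  holds "{2015,10,6}", "{1,2}", "true"
-- as_json holds "[2015,10,6]", "[1,2]", "true"
```

None of the three objects had to know about either output format, and nothing was patched.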

Thijs

> 
> /s/ Adam