On Tue, Oct 6, 2015 at 12:07 PM, Thijs Schreijer
<thijs@thijsschreijer.nl> wrote:
>
>> >
>> >   But wouldn't there have to be a consensus as to *what* __serialize
>> > returns?  I mean, obviously, a string of byte values (variation on a
>> > string) but the actual contents can vary widely.  Given a simple Lua
>> > table:
>> >
>> >         { 1 , "two" , true }
>> >
>> >   One person might want to serialize that as JSON:
>> >
>> >         [ 1 , "two" , true ]
>> >
>> >   Someone else might want BSON [1] (hex dump follows):
>> >
>> >         13 00 00 00 04 00 0D 00
>> >         00 00 30 00 01 00 00 00
>> >         31 00 74 77 6F 32 01
>> >
>> >   Another one (like me) might want to serialize to CBOR [2] (hex dump
>> > follows):
>> >
>> >         83 01 63 74 77 6F F5
>> >
>> > and yet another might want straight up Lua:
>> >
>> >         { 1 , "two" , true }
>> >
>> >   Is it better to perhaps just reserve "__serialize" for serialization and
>> > leave it up to modules to flesh it out?  Or do we need to actually define
>> > the output format?
>> >
>> >   -spc (Not a proposal, just something to talk about ... )
>> >
>> > [1]     http://bsonspec.org/spec.html
>> >
>> > [2]     RFC-7049
>> >
>>
>> Solution: Don't call it __serialize! Call it __json, __bson, __cbor,
>> or whatever's actually appropriate for the format.
>>
>> You could borrow a Pythonism and use __repr for Lua syntax.
>>
>> /s/ Adam
>
> If that were implemented, now how would my code know which of those it would need to call? Just `__serialize` should do.
>
> It should either return plain Lua values with no recursion (as mentioned earlier), or simply a string. Though the latter option might have everybody reinventing the serialization, whilst some good libraries are available, so I would prefer the former.
>
> A second return value can be added to identify the type. Even if that type might be prone to collisions, a recent remark in the thread about the usefulness of the registry asked for any cases where there were collisions in the registry. I didn't see any response on that. So I don't think it to be such a big issue.
> Just setting some proper examples with namespacing should set people off in the right direction.
>
> So for the LuaDate library I'm maintaining, something like "lua:thijsschreijer.nl/luadate/1.0" would be a fine type name I guess.
>
> Thijs

Your question illustrates exactly why it's relevant: You would only
ask that question if you aren't actually thinking about serialization.
If all you want is just "shove this into a byte array and get it back
out later" then any one of them COULD work, but one thing you
definitely DON'T want to do is to mix metaphors -- and that's exactly
the kind of trouble you'd get into if you aren't specifically asking for
a particular format, because different library maintainers might
integrate with different serialization modules.

That said: I would imagine you'd want to use __repr most of the
time, since the deserializer would be load(). In this format, types
could be serialized as "setmetatable({}, require('LuaDate').Date)" or
something like that, unless you as the maintainer added a __repr that
would return something more like "require('LuaDate').Date(m,d,y)".
(NB: This is just me throwing something out there as a casual example,
not fully thought out or fleshed out.)
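
To make the __repr idea a bit more concrete, here is a rough sketch under
stated assumptions. The Date type, new_date(), repr() and unrepr() are all
made up for illustration; none of this is LuaDate's actual API, and __repr
is not a metamethod Lua itself knows about. The chunk is loaded into a
restricted environment so that the text can only call the constructor we
hand it.

```lua
-- Hypothetical Date type standing in for a real library (not LuaDate's API).
local Date = {}
Date.__index = Date

local function new_date(m, d, y)
  return setmetatable({ m = m, d = d, y = y }, Date)
end

-- __repr (an assumed convention) returns Lua source that rebuilds the value.
Date.__repr = function(self)
  return string.format("Date(%d,%d,%d)", self.m, self.d, self.y)
end

-- Generic repr(): defer to __repr when present, else handle plain values.
-- Only array-like tables and scalars are covered in this sketch.
local function repr(v)
  local mt = getmetatable(v)
  if type(mt) == "table" and mt.__repr then return mt.__repr(v) end
  if type(v) == "string" then return string.format("%q", v) end
  if type(v) == "table" then
    local parts = {}
    for i, item in ipairs(v) do parts[i] = repr(item) end
    return "{" .. table.concat(parts, ",") .. "}"
  end
  return tostring(v)
end

-- The deserializer is just load(), run in an environment that exposes
-- nothing except the Date constructor.
local function unrepr(text)
  local env = { Date = new_date }
  local chunk
  if setfenv then
    -- Lua 5.1 / LuaJIT: compile the string, then swap the environment.
    chunk = assert(loadstring("return " .. text))
    setfenv(chunk, env)
  else
    -- Lua 5.2+: load() takes the environment directly; "t" = text only.
    chunk = assert(load("return " .. text, "=repr", "t", env))
  end
  return chunk()
end

local text = repr({ 1, "two", true, new_date(10, 6, 2015) })
local copy = unrepr(text)
print(text)
print(copy[4].m, copy[4].d, copy[4].y)
```

The round trip works only because the reader and the writer agree that the
text is Lua source, which is the point of asking for a specific format.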

Meanwhile, someone writing a JSON library would use __json instead,
and anyone wishing to integrate with that library would offer __json.
No conflict, no problem, everyone's happy. (And if you as the LuaDate
maintainer didn't offer __json and someone really wanted one, they
could monkey-patch a __json in from their own code, again without
stepping on anyone's toes.)
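
A sketch of how that could look, again with everything hypothetical: the
to_json() encoder below is a toy that handles only arrays, strings,
numbers and booleans (and escapes only quotes and backslashes), the Date
type is a stand-in rather than LuaDate, and __json is an assumed
convention between the encoder and its users.

```lua
-- Toy JSON encoder (hypothetical): checks for a __json metamethod first.
local function to_json(v)
  local mt = getmetatable(v)
  if type(mt) == "table" and mt.__json then return mt.__json(v) end
  local t = type(v)
  if t == "table" then
    local parts = {}
    for i, item in ipairs(v) do parts[i] = to_json(item) end
    return "[" .. table.concat(parts, ",") .. "]"
  elseif t == "string" then
    -- Escapes only '"' and '\'; a real encoder must do more.
    return '"' .. v:gsub('[\\"]', "\\%0") .. '"'
  elseif t == "number" or t == "boolean" then
    return tostring(v)
  end
  error("cannot encode a value of type " .. t)
end

-- Stand-in for a library type whose maintainer did not provide __json.
local Date = {}
Date.__index = Date
local d = setmetatable({ m = 10, d = 6, y = 2015 }, Date)

-- The user monkey-patches __json in from their own code.
Date.__json = function(self)
  return string.format('"%04d-%02d-%02d"', self.y, self.m, self.d)
end

local out = to_json({ 1, "two", true, d })
print(out)
```

A CBOR or BSON library could look for __cbor or __bson in the same way,
and none of them would ever see output meant for another format.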

/s/ Adam