> In my own embedded target arena, a universal bytecode format would be
> very beneficial since I could then exclude the parser. But embedded
> targets are also the ones that require the most varied implementations
> of the loader. Let those implementations modify lundump.c to their own
> benefit, rather than force the SEALs to maintain packing and unpacking
> code for a bunch of implementations.

As has already been discussed several times, there are many issues here.
I'm sorry for the long post, but please bear with me if you're interested
in these issues.

0. Our philosophy regarding precompiled scripts is that loading them
   should be as fast as possible. That goal dictates the current
   implementation, which only works for native formats. Also, if the
   bytecode loader is to have a size advantage over the full parser,
   then the loader must be quite small.

1. It is very convenient to precompile scripts once and load them on
   multiple platforms. The solution is a universal bytecode loader,
   not a cross-compiler. However, such a loader is bound to be complex
   and possibly hard to test and maintain, if it caters for many platforms
   (but that's the whole point).

2. The endianness issue. If platforms differ *only* on endianness, then
   the modified lundump.c that I have posted is the solution. This goes
   against simplicity of the loader, but the convenience may offset this.
   Perhaps we could make it more official and distribute it in etc/.

3. A cross-compiler can be useful if you cannot run luac on the target
   platform. The solution is a modified ldump.c suited to the target;
   this will keep the loader as simple as possible, which is the original
   goal.

4. Lua does not depend on what ldump.c and lundump.c do, as long as the
   loader builds the correct internal data structures. You can replace
   ldump.c and lundump.c by anything you want, as long as they agree on
   the external format.

5. In the current implementation, this format has two levels: a
   structural level and a physical level. This should make it simple to
   replace the physical level, which only contains a handful of simple
   routines. Take ldump.c. The lowest physical level is implemented
   by DumpBlock and DumpMem. DumpMem is for data that could depend on
   endianness. So you can, say, write a dumper that saves files in a
   fixed endianness by simply rewriting DumpMem (which is currently
   a macro over DumpBlock). The other half of the physical level is
   implemented by DumpChar, DumpInt, DumpNumber, DumpVector, and
   DumpString. You can use a different external representation for
   integers, Lua numbers, and strings by rewriting some of those. The
   rest of the code in ldump.c implements the structural level. You can
   use a different structure by rewriting it. The loader is implemented
   in a similar way. LoadMem is for data that depend on endianness. The
   swap-aware loader that I posted earlier implements byte swapping
   in LoadMem, and that's the only real change (except for testing the
   endianness of the file being loaded). Again, you can cater for
   different external representations by rewriting the corresponding
   routines in the loader.
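
To make the physical-level idea in point 5 concrete, here is a minimal
sketch of a fixed-endianness dumper routine and a swap-aware loader routine.
It is not the code from ldump.c or lundump.c: the names dump_mem_fixed,
load_mem_swapped, copy_reversed and host_is_little_endian are hypothetical,
and the routines work on plain memory buffers instead of the real dump and
load state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not in lundump.c): 1 on a little-endian host. */
static int host_is_little_endian(void) {
  const unsigned int probe = 1;
  return *(const unsigned char *)&probe == 1;
}

/* Copy `size` bytes from src to dst in reverse order.
   The two buffers must be distinct. */
static void copy_reversed(void *dst, const void *src, size_t size) {
  unsigned char *d = (unsigned char *)dst;
  const unsigned char *s = (const unsigned char *)src;
  size_t i;
  assert(dst != src);
  for (i = 0; i < size; i++) d[i] = s[size - 1 - i];
}

/* Dumper side: write one scalar in a fixed (here little-endian) order,
   whatever the byte order of the host doing the dumping. */
static void dump_mem_fixed(unsigned char *out, const void *value, size_t size) {
  if (host_is_little_endian()) memcpy(out, value, size);
  else copy_reversed(out, value, size);
}

/* Loader side: read one scalar, swapping bytes only when the byte order
   recorded for the file differs from the host's. */
static void load_mem_swapped(void *value, const unsigned char *in,
                             size_t size, int file_is_little_endian) {
  if ((file_is_little_endian != 0) == host_is_little_endian())
    memcpy(value, in, size);
  else
    copy_reversed(value, in, size);
}
```

Everything above this level (integers, numbers, strings, and the structure
of a function prototype) can keep calling these two routines unchanged,
which is why only the lowest layer needs to be touched.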

Finally, the header of precompiled scripts contains information about the
internal format. This can be used to make the necessary decisions for
complicated loaders and bytecode transformers. The header also contains
a format number, which can and should be used by anyone writing a different
external format.
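
As an illustration of how a loader might act on the header, here is a small
sketch. The ChunkHeader layout, the MY_FORMAT value and the plan_load
function are all assumptions made up for this example; the real header
written by ldump.c has its own fields and differs between Lua versions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header layout, for illustration only. */
typedef struct {
  unsigned char format;         /* format number chosen by the dumper */
  unsigned char little_endian;  /* 1 if the file is little-endian */
  unsigned char sizeof_int;     /* sizeof(int) on the dumping platform */
  unsigned char sizeof_number;  /* sizeof(lua_Number) on that platform */
} ChunkHeader;

typedef enum { LOAD_NATIVE, LOAD_SWAPPED, LOAD_REJECT } LoadPlan;

/* Assumed format number understood by this loader. */
#define MY_FORMAT 0

/* Decide how to load a chunk from its header. This sketch assumes that
   lua_Number is double and that only the byte order may differ. */
static LoadPlan plan_load(const ChunkHeader *h, int host_little_endian) {
  assert(h != NULL);
  if (h->format != MY_FORMAT) return LOAD_REJECT;   /* unknown format */
  if (h->sizeof_int != sizeof(int) || h->sizeof_number != sizeof(double))
    return LOAD_REJECT;                             /* sizes do not match */
  return (h->little_endian == (host_little_endian != 0))
           ? LOAD_NATIVE : LOAD_SWAPPED;
}
```

A loader for a different external format would pick its own format number,
so that files written by one dumper are rejected cleanly by a loader that
does not understand them.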

All the modifications discussed above are simple to make, given a specific
goal or target platform. It's just that making all of them is too much to
include in the core Lua distribution.

If you want to modify ldump.c or lundump.c for a specific task, please feel
free to contact me (or post here) if you have any questions.
