- Subject: Re: Lua bytecodes and endian-ness
- From: Luiz Henrique de Figueiredo <lhf@...>
- Date: Mon, 12 Jun 2006 13:41:17 -0300
> In my own embedded target arena, a universal bytecode format would be
> very beneficial since I could then exclude the parser. But embedded
> targets are also the ones that require the most varied implementations
> of the loader. Let those implementations modify lundump.c to their own
> benefit, rather than force the SEALs to maintain packing and unpacking
> code for a bunch of implementations.
As has already been discussed several times, there are many issues here.
I'm sorry for the long post, but please bear with me if you're interested
in these issues.
0. Our philosophy regarding precompiled scripts is that loading them
should be as fast as possible. That goal dictates the current
implementation, which only works for native formats. Also, if the
bytecode loader is to have a size advantage over the full parser,
then the loader must be quite small.
1. It is very convenient to precompile scripts once and load them on
multiple platforms. The solution is a universal bytecode loader,
not a cross-compiler. However, such a loader is bound to be complex
and possibly hard to test and maintain, if it caters for many platforms
(but that's the whole point).
2. The endianness issue. If platforms differ *only* on endianness, then
the modified lundump.c that I have posted is the solution. This goes
against simplicity of the loader, but the convenience may offset this.
Perhaps we could make it more official and distribute it in etc/.
3. A cross-compiler can be useful if you cannot run luac on the target
platform. The solution is a modified ldump.c suited to the target;
this will keep the loader as simple as possible, which is the original goal.
4. Lua does not depend on what ldump.c and lundump.c do, as long as the
loader builds the correct internal data structures. You can replace
ldump.c and lundump.c by anything you want, as long as they agree on
the external format.
5. In the current implementation, this format has two levels: a
structural level and a physical level. This should make it simple to
replace the physical level, which only contains a handful of simple
routines. Take ldump.c. The lowest physical level is implemented
by DumpBlock and DumpMem. DumpMem is for data that could depend on
endianness. So you can, say, write a dumper that saves files in a
fixed endianness by simply rewriting DumpMem (which is currently
a macro over DumpBlock). The other half of the physical level is
implemented by DumpChar, DumpInt, DumpNumber, DumpVector, and
DumpString. You can use a different external representation for
integers, Lua numbers, and strings by rewriting some of those. The
rest of the code in ldump.c implements the structural level. You can
use a different structure by rewriting it. The loader is implemented
in a similar way. LoadMem is for data that depend on endianness. The
swap-aware loader that I posted earlier implements byte swapping
in LoadMem, and that's the only real change (except for testing the
endianness of the file being loaded). Again, you can cater for
different external representations by rewriting the corresponding
routines in the loader.
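
To make the LoadMem idea in point 5 concrete, here is a minimal standalone sketch of a swap-aware physical level. It is not the code from lundump.c or from the modified loader mentioned above: the Reader struct and the names load_block, swap_bytes, and load_mem are hypothetical, and the sketch reads from a memory buffer rather than through Lua's own input streams.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reader state: a memory buffer plus a swap flag.
   (The real loader reads through Lua's stream layer instead.) */
typedef struct {
    const unsigned char *data;  /* precompiled chunk bytes */
    size_t pos;                 /* current read offset */
    size_t len;                 /* total bytes available */
    int swap;                   /* nonzero if file and host endianness differ */
} Reader;

/* Copy raw bytes with no interpretation (endianness-neutral data). */
static int load_block(Reader *r, void *dst, size_t size) {
    if (size > r->len - r->pos) return -1;   /* truncated input */
    memcpy(dst, r->data + r->pos, size);
    r->pos += size;
    return 0;
}

/* Reverse the bytes of one element in place. */
static void swap_bytes(unsigned char *p, size_t size) {
    size_t i, j;
    for (i = 0, j = size - 1; i < j; i++, j--) {
        unsigned char t = p[i];
        p[i] = p[j];
        p[j] = t;
    }
}

/* Load n elements of `size` bytes each; swap each element if required. */
static int load_mem(Reader *r, void *dst, size_t n, size_t size) {
    unsigned char *p = (unsigned char *)dst;
    size_t k;
    assert(size > 0);
    if (load_block(r, dst, n * size) != 0) return -1;
    if (r->swap && size > 1)
        for (k = 0; k < n; k++) swap_bytes(p + k * size, size);
    return 0;
}
```

In this sketch only load_mem knows about byte order, so the routines that read integers, numbers, and vectors can sit on top of it unchanged. Strings and other byte-oriented data would go through load_block directly.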
Finally, the header of precompiled scripts contains information about the
internal format. This can be used to make the necessary decisions for
complicated loaders and bytecode transformers. The header also contains
a format number, which can and should be used by anyone writing a different
All the modifications discussed above are simple to make, given a specific
goal or target platform. It's just that making all of them is too much to
include in the core Lua distribution.
If you want to modify ldump.c or lundump.c for a specific task, please feel
free to contact me (or post here) if you have any questions.