|
Hi - for reasons discussed earlier (see
"Future Plans for Lua and Unicode"), I want to allow Lua scripts to be encoded
in either ANSI or UTF-8. Legacy scripts are currently ANSI, but
for future ones, script authors will be able to choose between ANSI and
UTF-8 (for what it's worth, I plan to specify that strings passed to, or
returned from, my app's API use the same encoding as the script
itself). I use a UTF-8 BOM to distinguish between ANSI and
UTF-8 scripts. This seems to work fine for a simple script. I use
lua_load with my own reader to load each script, check for the BOM (which I
need to do anyway, so I know how to handle strings), and then skip past the BOM
if there is one.
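
For illustration, the BOM check I describe can be sketched roughly as below. This is only a minimal sketch in plain C with no Lua headers; the names utf8_bom_length, ScriptSource and classify_script are made up for this example and are not part of Lua or of my app. It assumes the whole script is already in memory, whereas a real lua_Reader would do the same check on the first block it returns.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the number of leading bytes to skip,
   i.e. 3 if the buffer starts with the UTF-8 BOM (EF BB BF), else 0. */
static size_t utf8_bom_length(const char *buf, size_t len)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };
    if (len >= 3 && memcmp(buf, bom, 3) == 0)
        return 3;
    return 0;
}

/* Hypothetical result type: records the detected encoding and where
   the actual Lua source starts once any BOM has been skipped. */
typedef struct {
    int is_utf8;          /* 1 if a UTF-8 BOM was found, 0 means ANSI */
    const char *chunk;    /* first byte of the Lua source proper */
    size_t chunk_len;     /* number of bytes remaining after the BOM */
} ScriptSource;

/* Classify a script held in memory and step past the BOM if present. */
static ScriptSource classify_script(const char *buf, size_t len)
{
    ScriptSource s;
    size_t skip = utf8_bom_length(buf, len);
    s.is_utf8 = (skip != 0);
    s.chunk = buf + skip;
    s.chunk_len = len - skip;
    return s;
}
```

The chunk and chunk_len fields are what would then be handed to Lua, for example through the reader given to lua_load or through luaL_loadbuffer, while is_utf8 is kept so the app knows how to treat strings for that script.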
Script authors can also write Lua modules, however,
and by default these are loaded by luaL_loadfile, which doesn't accept the
UTF-8 BOM and raises an error. This isn't actually that serious a
problem, because one simple solution would be for me to specify that
modules must always be encoded in ANSI. However, if possible, I would
prefer to allow modules to be encoded in either UTF-8 or ANSI too (the rule
about string encoding matching script encoding would not apply to
modules). Can anyone suggest a way that I can do this, while still
retaining the UTF-8 BOM? I had a look at the package.loaders section in
the manual, but this seems to only provide a way to have a module-specific
loader, whereas my requirement applies to all modules.
Simon
|