Hi - for reasons discussed earlier (see "Future Plans for Lua and Unicode"), I want to
allow Lua scripts to be encoded in either ANSI or UTF-8. Legacy scripts are
currently ANSI, but for future ones, script authors will be able to choose between ANSI
and UTF-8 (for interest, I plan to say that the encoding of strings passed to/returned from
my app's API must match/will match the encoding of the script itself). I use a UTF-8 BOM
to allow me to distinguish between ANSI and UTF-8 scripts. This seems to work fine for a
simple script. I use lua_load with my own supplied reader to load each script, check the
BOM (which I need to do anyway, so I know how to handle strings), and then jump past the
BOM if there is one.
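
Roughly, the BOM check looks something like the sketch below. The helper name
utf8_bom_length is made up for illustration, and buf is assumed to hold the first
bytes of the script as they are about to be handed back by the lua_load reader.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the Lua API): returns how many leading
   bytes to skip. That is 3 if buf starts with the UTF-8 BOM (EF BB BF),
   otherwise 0, in which case the script is treated as ANSI. */
size_t utf8_bom_length(const char *buf, size_t len)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };

    if (len >= sizeof bom && memcmp(buf, bom, sizeof bom) == 0)
        return sizeof bom;
    return 0;
}
```

On its first call the reader would then return buf + n with a size of len - n,
where n is the value returned above, and remember whether n was nonzero so that
strings can be handled in the matching encoding later.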
Script authors can also write Lua modules, however, and by default these are loaded by
luaL_loadfile, which doesn't accept the UTF-8 BOM and fails with an error. This isn't actually
that serious a problem, because one simple solution would be for me to specify that
modules must always be encoded in ANSI. However, if possible, I would prefer to allow
modules to be encoded in either UTF-8 or ANSI too (the rule about string encoding matching
script encoding would not apply to modules). Can anyone suggest a way that I can do this,
while still retaining the UTF-8 BOM? I had a look at the package.loaders section in the
manual, but this seems to only provide a way to have a module-specific loader, whereas my
requirement applies to all modules.
Simon