


Hi. For reasons discussed earlier (see "Future Plans for Lua and Unicode"), I want to allow Lua scripts to be encoded in either ANSI or UTF-8. Legacy scripts are currently ANSI, but for future ones script authors will be able to choose between ANSI and UTF-8. (For interest, I plan to say that the encoding of strings passed to or returned from my app's API must match the encoding of the script itself.) I use a UTF-8 BOM to distinguish between ANSI and UTF-8 scripts. This seems to work fine for a simple script: I load each script with lua_load and my own reader, check for the BOM (which I need to do anyway, so I know how to handle strings), and then skip past it if there is one.
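
For illustration, here is a minimal sketch of the kind of BOM-skipping reader described above. It is not the actual code from my app: the ScriptSource struct and the script_reader name are hypothetical, and it assumes the whole script has already been read into memory. Only the function signature (the lua_Reader shape expected by lua_load) comes from the Lua API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same forward declaration that lua.h uses; repeated here only so the
   sketch compiles on its own. When lua.h is included, drop this line. */
typedef struct lua_State lua_State;

/* Hypothetical reader state; not part of the Lua API. */
typedef struct ScriptSource {
    const char *bytes;    /* whole script, exactly as read from disk */
    size_t      len;      /* number of bytes in the script */
    int         consumed; /* nonzero once the chunk has been handed over */
    int         is_utf8;  /* set to 1 if a UTF-8 BOM was found */
} ScriptSource;

/* Has the lua_Reader shape: returns the next piece of the chunk and its
   size, or NULL once there is nothing left. */
static const char *script_reader(lua_State *L, void *ud, size_t *size)
{
    ScriptSource *src = (ScriptSource *)ud;
    size_t skip = 0;

    (void)L; /* unused in this sketch */
    assert(src != NULL && size != NULL);

    if (src->consumed) {
        *size = 0;
        return NULL;
    }
    src->consumed = 1;

    /* A UTF-8 BOM is the byte sequence EF BB BF at the start of the file. */
    if (src->len >= 3 && memcmp(src->bytes, "\xEF\xBB\xBF", 3) == 0) {
        src->is_utf8 = 1;
        skip = 3;
    }

    *size = src->len - skip;
    return *size > 0 ? src->bytes + skip : NULL;
}
```

With lua.h included, a reader like this would be passed as lua_load(L, script_reader, &src, chunkname), and src.is_utf8 could be checked afterwards to decide how strings for that script are handled.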
 
Script authors can also write Lua modules, however, and by default these are loaded by luaL_loadfile, which doesn't like the UTF-8 BOM and raises an error. This isn't that serious a problem, because one simple solution would be to specify that modules must always be encoded in ANSI. However, if possible, I would prefer to allow modules to be encoded in either UTF-8 or ANSI too (the rule about string encoding matching script encoding would not apply to modules). Can anyone suggest a way to do this while still retaining the UTF-8 BOM? I had a look at the package.loaders section in the manual, but this seems to provide only a way to have a module-specific loader, whereas my requirement applies to all modules.
 
Simon