On 2023-11-28 18:00, Federico Ferri wrote:
> If one wants to execute a piece of lua code in a "protected" environment (so that functions, globals, etc... are not messed up afterwards), it seems the way to go is load(), with an environment param:
>
>     x = 0
>     env = {}
>     setmetatable(env, {__index = _G})
>     assert(load('x = 999', 'foo.lua', 't', env))()
>     print(x) -- x is still 0, env.x is 999
>
> however, if the code to run calls require, then it escapes the protected environment:
>
>     x = 0
>     env = {}
>     setmetatable(env, {__index = _G})
>     assert(load('require "ext"', 'foo.lua', 't', env))() -- ext.lua contains `x = 999`
>     print(x) -- x is 999! (so is env.x...)
>
> is this intended, and how to properly protect in the second case?
`require` caches modules (in `package.loaded`) so that each one is loaded only once. That means it can't special-case specific `load`s. It also doesn't look at the calling function's environment: module files are loaded with the global environment.
`ext.lua` directly modifies the global environment, which is bad style (and breaks with the caching – after `require "ext" ; x = 1 ; require "ext"`, x will still be 1.) If you trust and control the code, the easiest way is probably to just adjust your modules to not modify the global environment. (E.g. in modules, make a module table and return that, instead of directly messing with the global environment.)
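A minimal sketch of that module-table style, assuming Lua 5.2 or later. The contents of `ext` are made up for illustration, and `package.preload` is used only so the example fits in one file instead of needing a separate `ext.lua`:

```lua
-- Stand-in for a file ext.lua; package.preload keeps the sketch in one file.
package.preload["ext"] = function()
  local M = {}      -- the module table
  M.x = 999         -- state lives in the module table, not in a global
  return M          -- require caches and returns this value
end

x = 0
local ext = require "ext"
print(x, ext.x)               -- x is still 0, ext.x is 999
print(require "ext" == ext)   -- true: the second call returns the cached table
```

Because nothing is written to the global table, it no longer matters which environment the requiring code runs in.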
If you trust the code but don't control it, you can replace `require` in the "protected" environment with a self-built version (using `package.*` and `load`, and making sure to set the same `env` environment for required modules). Make sure you make a separate module cache that you can discard at the end. (An idiom that some people use in modules is setting `package.loaded[modname]` to the module table, instead of returning it at the end, so you probably want to also make `env.package` a new table that `__index`-es back to `_G.package`, and then point `env.package.loaded` at your new module cache, possibly with another `__index` to `_G.package.loaded`…)
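A rough sketch of such a replacement `require`, assuming Lua 5.2 or later and pure-Lua modules only (C modules and custom searchers are not handled). The names `make_env` and `sources` are hypothetical; `sources` is an in-memory stand-in for module files so that the example runs without touching the disk:

```lua
-- Hypothetical in-memory stand-in for module files on disk.
local sources = { ext = "x = 999" }

local function make_env()
  local env = setmetatable({}, { __index = _G })
  -- Separate module cache; falls back to modules that are already loaded.
  local loaded = setmetatable({}, { __index = package.loaded })
  env.package = setmetatable({ loaded = loaded }, { __index = package })

  function env.require(name)
    if loaded[name] ~= nil then return loaded[name] end
    local chunk
    if sources[name] then
      chunk = assert(load(sources[name], "=" .. name, "t", env))
    else
      local path, err = package.searchpath(name, package.path)
      if not path then error("module '" .. name .. "' not found:" .. err, 2) end
      chunk = assert(loadfile(path, "t", env))  -- same env for required modules
    end
    local result = chunk(name)
    if result ~= nil then
      loaded[name] = result
    elseif loaded[name] == nil then
      loaded[name] = true   -- same convention as the stock require
    end
    return loaded[name]
  end
  return env
end

x = 0
local env = make_env()
assert(load('require "ext"', "foo.lua", "t", env))()
print(x, env.x)   -- x is still 0, env.x is 999
```

Dropping `env` afterwards also drops the private module cache, so the real `package.loaded` never sees `ext`.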
Your "restricted" code can also still say `_G.x = 123` (since _G._G == _G), so maybe you want `env._G = env`.
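To illustrate that escape and the fix, here is a small sketch assuming Lua 5.2 or later, with the same `env` setup as in the question:

```lua
x = 0
local env = setmetatable({}, { __index = _G })

-- env._G is found through __index, so it is the real global table.
assert(load("_G.x = 123", "foo.lua", "t", env))()
print(x, rawget(env, "x"))   -- the write went to the real globals: x is 123

-- Point _G back at env to close this particular hole.
x = 0
env._G = env
assert(load("_G.x = 123", "foo.lua", "t", env))()
print(x, rawget(env, "x"))   -- the write stayed inside env: x is 0, env.x is 123
```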
A full checklist of the above: (x --> y here means x __indexes to y, i.e. `setmetatable( x, { __index = y } )`)
You currently have:
* env --> _G
but you want all of:
* env --> _G
* env._G = env (unless you want to allow easy *explicit* escapes)
* env.package --> package
* env.package.loaded --> package.loaded
* env.require = (custom require variant)
* env.load = (thin wrapper that defaults to env, not _G, for the environment)
That still leaves plenty of ways out. To name just the easiest one: `require "_G"` still gets you `_G`. So this is all if you FULLY TRUST the code that's being run. Otherwise, the debug library, the ability to set metatables, some string functions, `load` on bytecode, etc. etc. are all ways to cause lots of trouble and/or escape the "protected" environment, and then you'll want a very different approach to fully sandboxing the code.
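For the `load` wrapper from the checklist, one possible shape is sketched below, assuming Lua 5.2 or later. It only covers `load`; `loadfile` and `dofile` would need the same treatment, and none of this helps against untrusted code:

```lua
local env = setmetatable({}, { __index = _G })
env._G = env

-- Thin wrapper: same arguments as load, but an omitted env defaults to `env`.
function env.load(chunk, chunkname, mode, ...)
  if select("#", ...) == 0 then
    return load(chunk, chunkname, mode, env)
  end
  return load(chunk, chunkname, mode, ...)
end

x = 0
-- Without the wrapper, the nested load would compile its chunk against the real globals.
assert(load([[ assert(load("x = 999"))() ]], "foo.lua", "t", env))()
print(x, env.x)   -- x is still 0, env.x is 999
```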
-- nobody