On Tue, Apr 30, 2013 at 1:52 AM, Philipp Janda <siffiejoe@gmx.net> wrote:
I propose the following definition of "globals" in the context of static global checkers:
*   Any access to a chunk's _ENV upvalue (not a local variable) is a globals access, unless the chunk itself or any function sharing the same _ENV upvalue potentially assigns to the _ENV upvalue.
*   Any access to a function's _ENV upvalue (not a local variable) is a globals access, if the _ENV upvalue of the chunk was the only _ENV in scope during the function's definition *and* unless any function sharing the same _ENV upvalue potentially assigns to the _ENV upvalue.
*   Anything else not covered above is not a globals access.

OK, I had to read that several times ;)

Right, a global access is usually compiled as a field lookup on the special upvalue _ENV.  It's not guaranteed that this upvalue actually points to _G, of course, and _ENV need not be an upvalue at all if it is declared as a local (look at the code generated for print(boo()) here):

local print = print
local _ENV = {X = 'hoo'}

function boo() return X end

print(boo())

Static checkers _could_ be taught to handle this case, but in general _ENV might be assigned to something dynamically.
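
To make the dynamic case concrete, here is a minimal sketch, assuming Lua 5.2 or later. The names pick_env and lookup are made up for this example. The table that X resolves against is chosen at run time, so a checker that only reads the compiled code cannot say whether X exists:

```lua
-- Hypothetical helper: chooses an environment table at run time.
local function pick_env(flag)
  if flag then
    return { X = 'from A' }
  else
    return { Y = 'from B' }
  end
end

-- 'X' below compiles to a field lookup on the local _ENV, whose value
-- is only known once pick_env has run.
local function lookup(flag)
  local _ENV = pick_env(flag)
  return X
end

print(lookup(true))   -- X is defined in this environment
print(lookup(false))  -- X is nil in this one
```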

As Luiz says, luac -l -l will give you the local and upvalue tables, _after_ the function listing. So lglob caches the function code and then reads the symbol tables; the checker then runs over the cached function code and can tell which registers hold named locals rather than anonymous stack slots. We need this (for instance) to tell whether the following code is bad:

local L = require 'lfs'
print(L.getcwd())

(By default lglob actually does do the require(), unless there is a -wl whitelist that has an lfs entry.)

So we're going beyond plain 'global' access here.  Just finding globals is fine and dandy, but David M's insight was that we could track _fields_ of known globals as well.  Further, lglob tracks _aliases_ to known globals and imported modules.
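
As a rough illustration of field and alias tracking, here is a sketch that is not lglob's actual implementation: lglob works from compiled code, while this toy version checks dotted names as strings against the running _G. The helpers known_global and check, and the aliases table, are assumptions made up for the example.

```lua
-- Hypothetical helper: walks a dotted name such as 'table.concat'
-- through _G and reports whether every step resolves.
local function known_global(path)
  local value = _G
  for name in path:gmatch('[^.]+') do
    if type(value) ~= 'table' then return false end
    value = rawget(value, name)
    if value == nil then return false end
  end
  return true
end

-- An alias remembers which global it was copied from, so 't.concat'
-- can be checked as 'table.concat'.
local aliases = { t = 'table' }   -- as if the source said: local t = table

local function check(expr)
  local head, rest = expr:match('^([%a_][%w_]*)%.?(.*)$')
  local root = aliases[head] or head
  local path = rest ~= '' and (root .. '.' .. rest) or root
  return known_global(path)
end

print(check('t.concat'))    -- true
print(check('t.concatt'))   -- false: misspelled field
print(check('string.rep'))  -- true
```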

lglob does get plain module() right (and likewise its 5.2 friend '_ENV = {}') because it regards everything after module() or the _ENV assignment as a separate scope, and then tracks accesses in that scope specially.  So 'Answer' here is considered a problem:

_ENV = {}  -- or spell it 'module(...)' ;)

function answer() return 42 end

function life() return Answer() end

return _ENV
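
For comparison, this is what the same mistake costs without a checker. The sketch below assumes Lua 5.2 or later and wraps the module body in a made-up function, load_module, with a local _ENV so that the rest of the script keeps its normal globals. The typo only shows up when life() is finally called:

```lua
-- Stand-in for a module file: code inside sees only the fresh table.
local function load_module()
  local _ENV = {}
  function answer() return 42 end
  function life() return Answer() end   -- typo: 'Answer' is never defined
  return _ENV
end

local M = load_module()
print(M.answer())               -- 42

-- The misspelled name is a nil field of the module table, so the call fails.
local ok, err = pcall(M.life)
print(ok)                       -- false
```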

Now PA's case involves tracking multiple scopes. This is a silly example, but it shows the issue.

local function private_business(val)
   local _ENV = {}
   X = val + 1
   Y = val - 1
   return X + Y
end

Again, this can be done, by tracking the scope of local _ENV in functions, but it seemed a lot of work for a case I did not particularly find interesting.
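
A very rough sketch of what that tracking might look like, under simplifying assumptions: each function gets a scope record saying whether it declares its own local _ENV, and a free name counts as a real global access only if no enclosing scope does. The names new_scope and is_global_access are hypothetical, and the sketch ignores both the point in a function where the local _ENV comes into scope and any assignment to an _ENV upvalue.

```lua
-- Hypothetical scope records; a real checker would build these from the
-- local-variable tables that luac prints for each function.
local function new_scope(parent, has_local_env)
  return { parent = parent, has_local_env = has_local_env }
end

-- A free name is a true global access only if no enclosing scope
-- declares its own local _ENV.
local function is_global_access(scope)
  while scope do
    if scope.has_local_env then return false end
    scope = scope.parent
  end
  return true
end

local chunk = new_scope(nil, false)
local private_business = new_scope(chunk, true)   -- has 'local _ENV = {}'
local helper = new_scope(chunk, false)            -- an ordinary function

print(is_global_access(private_business))  -- false: X and Y go to the local table
print(is_global_access(helper))            -- true: free names here are globals
```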

And as for Tim Hill's point - yes, Lua is the best parser, and that's exactly why we're using the output of the Lua compiler for checking.

steve d.