- Subject: Re: Newbe question on function environment
- From: Peng Zhicheng <pengzhicheng1986@...>
- Date: Mon, 02 Apr 2012 02:01:53 +0800
On 2012-4-1 22:13, Jose Torre-Bueno wrote:
I'm still curious about how closures are actually handled.  Function parameters and function locals are not members of _ENV, I understand function parameters live on the stack and I assume space for function locals is allocated on a heap of some kind maintained by the interpreter. It must have the following properties:
1) When a closure occurs some of these objects are given persistence because they are referred to by the newly defined function.
2)  Further these upvalues can grow since a table can be an upvalue and a function with a table as an upvalue can add to it.
3) If two closures refer to the same local variable or function parameter they each have a reference to the same storage location.
My question is when a function is defined does the interpreter parse the new function to determine what values of the defining function become persistent as part of the closure or does all the local storage of the defining function get preserved as a package?  To clarify: in this bit of code:
function f(fpar)
	local f1 = 1111111111
	local f2 = 22222
	local f3 = 33333
	local f4 = {'imagine this is a table that uses lots of storage'}
	function g() print('in g', fpar, f1, f2) end
	function h() print('in h', fpar, f1, f3) end
end
g() and h() both have shared access to fpar and f1 (and if one of them changed one of these upvalues the other would see it.)
g() has access to f2 but not f3 and vice versa for h()
I assume that f4 went out of scope when f() exits and will be collected but if a closure saves all of the local storage of a defining function this might not be the case and f4 might persist until both g and h went out of scope.
That is what I mean by parsing the defined functions.  When g() and h() were defined did the interpreter look to see which local variables of f() they used and mark each variable appropriately?
If so I assume the usual rules about garbage collection of any memory that no longer has a reference to it apply.  That is g = nil will cause collection of f2 but not f1 or f3.
Further if g() had a statement like f1 = f2 then h() would gain access to the location pointed to by f2 and g = nil would no longer cause collection of that memory.
On Apr 1, 2012, at 2:58 AM, Dirk Laurie wrote:
1)  Globals live in a table called _G, any function has a variable called _ENV which is the same table unless I explicitly redefine it in which case for that function's life globals are whatever is in the table pointed to by _ENV.  (_G and DEFAULT_ENVIRONMENT you  mentioned are the same thing?)
I'm personally not a fan of referring to _G as a specific object, but yes.
Not quite.
1. Globals live in whatever table is currently associated with the name
_ENV.
2. Whenever you enter a function, Lua has pre-initialized _ENV
for you, so that _ENV and _ENV._ENV are the same.
3. _G is a key in the original _ENV.  The only special thing
about _G is that Lua initializes _ENV._G to _ENV right at the
start.
4.  Lua never again changes or refers to _G.  It's there for
your convenience only.
5. If you never muck about with _ENV or _G, Lua automatically
provides an _ENV in which _ENV._G (i.e. _G) still refers to the
original global environment
Let's execute some code in which _ENV is modified.
Lua 5.2.0 (alpha)  Copyright (C) 1994-2010 Lua.org, PUC-Rio
print(_ENV,_G); _ENV=nil; print(_ENV,_G)
table: 0x9fb13b0	    table: 0x9fb13b0
stdin:1: attempt to index upvalue '_ENV' (a nil value)
After `_ENV=nil`, there is no _G anymore until _ENV is redefined,
for example by returning from the current function.
_G=nil; f=function(_ENV) print(_G) end;
f(_ENV)
nil
f{print=print,_G="ha-ha!"}
ha-ha!
If you supply your own _ENV, you must do something like that
`print=print`, or provide a suitable __index metamethod, if
you need access to routines in the standard library.
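A minimal sketch of the __index approach mentioned above, assuming Lua 5.2 or later. The names `sandbox`, `sandboxed` and `greeting` are made up for illustration. Reads of unknown names fall through to the original environment, while writes stay in the sandbox table.

```lua
-- Assumes Lua 5.2 or later. All names here are hypothetical.
-- The sandbox table is empty, but missing keys are looked up in the
-- chunk's own environment through the __index metamethod.
local sandbox = setmetatable({}, { __index = _ENV })

local function sandboxed(_ENV)
  greeting = "hi"            -- compiled as _ENV.greeting: stored in the sandbox
  return tostring(greeting)  -- tostring is found through __index
end

local result = sandboxed(sandbox)
```

After the call, `greeting` exists only as a field of `sandbox`; the original environment has no such key.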
First, you should understand that Lua is **lexically scoped**.
Lexical scoping means that all variable references are resolved during compilation: the compiler decides whether each name is a local, an upvalue
or a global, and generates the corresponding opcodes.
A local variable is translated by the compiler into a position (a slot number) in the function's stack frame (register allocation),
and only this slot number is needed to generate the VM instructions. The name
of a local variable is retained only for debugging purposes and can be stripped out completely.
The fixed parameters of a function are treated the same way as local variables: they are allocated in the stack frame.
The varargs are placed below the stack frame when the function is called; they are not allocated at compile time,
and the VM uses a special opcode to load them onto the stack when they are accessed.
Upvalues are similar, in that they are translated into slot numbers in the closure's upvalue array (or vector).
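To connect this to the original question, here is a small sketch that lists which locals each closure actually captures. It uses the standard `debug.getupvalue` and needs debug information (i.e. code that has not been stripped). Unlike the code in the question, g and h are made local and returned here, and the helper `upvalue_names` is made up for illustration.

```lua
-- A variation of the example from the question; g and h are returned
-- instead of being stored in globals (an assumption of this sketch).
local function f(fpar)
  local f1, f2, f3 = 1, 2, 3
  local f4 = { "imagine this is a table that uses lots of storage" }
  local function g() return fpar, f1, f2 end
  local function h() return fpar, f1, f3 end
  return g, h
end

-- Hypothetical helper: collect the upvalue names of a Lua function.
local function upvalue_names(fn)
  local names = {}
  local i = 1
  while true do
    local name = debug.getupvalue(fn, i)
    if not name then break end
    names[#names + 1] = name
    i = i + 1
  end
  return table.concat(names, ",")
end

local g, h = f("param")
print(upvalue_names(g))  -- fpar,f1,f2
print(upvalue_names(h))  -- fpar,f1,f3
```

Neither list contains f4: the compiler only creates upvalues for the names a function actually refers to, so a closure does not keep the whole local storage of its defining function alive.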
A `global' variable is quite another thing. A global variable is translated into a string-indexed **field**
of some predefined table, i.e. the environment of the function (which is nothing but a normal Lua table).
Since global variables are in fact **fields** of the environment table, their names are stored in the constants
array/vector of the function prototype (i.e. they are just string literals) and can't be stripped out.
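A short sketch of the point that globals are plain table fields, assuming Lua 5.2 or later (where the environment table is reachable by the name _ENV). The variable names are made up.

```lua
-- Assumes Lua 5.2 or later. Names are hypothetical.
counter = 10                            -- compiled as _ENV["counter"] = 10
local same = (_ENV.counter == counter)  -- both sides read the same field

_ENV["counter"] = 11                    -- an ordinary table write ...
local seen = counter                    -- ... is visible as the `global'
```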
The difference in the semantics of the environment between Lua 5.1 and Lua 5.2 is this:
in Lua 5.1 the environment is stored specially in the closure object, and you must access it with
getfenv and setfenv (or their C API equivalents);
in Lua 5.2 the environment is just a **lexical** symbol, and the compiler uses the normal scope rules to resolve it
instead of using some dedicated API to access it.
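As a rough illustration of that difference, here is a sketch of a helper that runs a piece of source code in a given environment under either version. The helper name `run_in_env` and the chunk name are made up; `loadstring`/`setfenv` are the Lua 5.1 functions and the four-argument `load` is the Lua 5.2 one.

```lua
-- Hypothetical helper; a sketch, not a complete sandbox.
local function run_in_env(source, env)
  local chunk
  if setfenv then
    -- Lua 5.1: compile first, then attach the environment to the closure
    chunk = assert(loadstring(source))
    setfenv(chunk, env)
  else
    -- Lua 5.2 and later: the 4th argument of load becomes the value of
    -- the chunk's first upvalue, i.e. its _ENV
    chunk = assert(load(source, "=sketch", "t", env))
  end
  return chunk()
end

local result = run_in_env("return x + 1", { x = 41 })
```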
In Lua 5.2, for the outermost function (i.e. the chunk), the environment is predefined by the parser as the first upvalue.
Its name is stored in the lexer; the default is `_ENV', but it can be changed if you build your own Lua.
Because it is a normal lexical symbol, it can be redefined to hide the previous definition, per Lua's scope rules.
So the environment might be an upvalue, or it might be a local variable.
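A small sketch of the three cases (upvalue, local, parameter), assuming Lua 5.2 or later. All names are made up for illustration.

```lua
-- Assumes Lua 5.2 or later. Names are hypothetical.
x = "global x"                        -- stored as _ENV.x in the chunk's environment

-- Inside read_x, _ENV is an upvalue (the one belonging to the chunk).
local function read_x() return x end

local shadowed
do
  local _ENV = { x = "shadowed x" }   -- from here to `end', _ENV is a local
  shadowed = x                        -- compiled as _ENV.x, using the local table
end

-- Here _ENV is a parameter, which is just another local.
local function with_param(_ENV) return x end
```

The same source text `x` ends up reading three different tables, depending only on which _ENV is in scope at that point.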
One more word about _G:
_G actually has nothing to do with the environment of a closure.
It is just a global variable that initially refers to the default environment.
You can only access the environment through the API (Lua 5.1) or through the environment table's name (Lua 5.2).
Changing _G won't affect the environment in either Lua 5.1 or Lua 5.2.
The following examples are for illustration only.
I've commented them in detail as notes to check my own understanding of Lua,
and hopefully they will be helpful to others too.
-------------code 1 begin --------------------
-- Lua 5.2
-- _ENV is predefined as an upvalue for the outermost chunk
local foo = {"foo"} -- local variable allocated into stack slot 1
local bar = "bar" -- stack slot 2
do -- a block: opens a new lexical scope (the chunk's stack frame is reused, no new frame is created)
local bar = 2 -- a new local that shadows the outer bar; it takes the next free slot (slot 3)
baz = bar -- bar resolves to the inner local, not the outer one
-- since baz can't be resolved to any lexically defined symbol, it is a global
-- and is translated into _ENV.baz
-- _ENV in turn resolves to the chunk's upvalue
end -- leave the block; slot 3 is free again
-- since no closure was created inside this block, there are no upvalues to `close'
function foo (n) -- this is syntax sugar for foo = function (n) ... end, and foo resolves to the local in stack slot 1
-- a new inner lexical level (a new function) starts here
-- parameter `n' is the first local variable and is allocated stack slot 1 of the new frame
bar = baz -n -- bar resolves to the outer local variable, slot 2 of the chunk's stack frame,
-- so bar becomes an upvalue of this closure
-- baz is global, translated into _ENV.baz;
-- _ENV resolves to the chunk's upvalue, so _ENV becomes an upvalue of this closure too
-- n is local, in stack slot 1
return function (_ENV) -- create a closure and return it
-- new lexical level
-- parameter _ENV is defined as the first local variable and allocated slot 1 of the stack frame
print (n) -- print is global, translated into _ENV.print
-- here _ENV resolves to the local in stack slot 1
-- n resolves to a local of the outer function foo,
-- so n is allocated upvalue slot 1 and points to stack slot 1 of function foo's frame
-- n is an `open' upvalue at this point
end -- leave level
end -- end of the scope of n, but n is an upvalue of the inner function,
-- a closure of which is returned,
-- so n must be closed: its value is copied out of the stack frame into the upvalue itself
local foo = foo(100) -- this is a bit complicated
-- the `local' statement defines a new local variable called foo, which is allocated
-- stack slot 3 of the outermost chunk. but its scope begins only after this statement,
-- so the call still resolves to the former local variable, which is in stack slot 1
-- then the constant number 100 is passed as the argument, bound to n of function foo
-- the returned value is a closure with one `closed' upvalue of value 100, which takes one parameter, _ENV
-- this closure is then assigned to the newly allocated stack slot 3 for the new local variable `foo'
-- note that from here on the old foo in stack slot 1 is no longer visible.
-- foo() -- we call this closure, but we don't pass the _ENV parameter,
-- so this should generate an error: "attempt to index local '_ENV' (a nil value)"
-- pcall here is global, and is looked up in the upvalue _ENV of the chunk, which is assumed to be
-- the default environment of the interpreter
-- foo {} -- if we call it passing an empty table as the _ENV parameter,
-- the error would be: "attempt to call global 'print' (a nil value)"
-- foo{print = 2} -- this would generate the error "attempt to call global 'print' (a number value)"
-- before the end of the chunk, the local variable `bar' in stack slot 2 needs to be closed,
-- because it is an upvalue of a closure created in this chunk (the first function foo).
-- but the _ENV upvalue of the same closure is not closed here, because it is not allocated in
-- the current stack frame as a **local**. only locals of this level which are also upvalues of inner closures
-- will be closed.
-- the created closure is not referenced any more when this chunk ends, so it will be collected at some point,
-- and the closed upvalue is collected with it.
------------- code 1 end ------------------------
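The last remark (a closed upvalue lives and dies with the closures that refer to it) can be observed with a weak table. This is only a sketch: the names are made up, and exactly when an object is collected is an implementation detail, so it forces full collections with `collectgarbage()`.

```lua
-- Names are hypothetical. A weak-valued table lets us observe whether
-- an object is still alive without keeping it alive ourselves.
local weak = setmetatable({}, { __mode = "v" })

local function make()
  local payload = { "lots of storage" }
  weak.payload = payload
  return function() return payload end  -- payload becomes an upvalue
end

local getter = make()   -- make has returned, so the upvalue is now closed
collectgarbage()
local alive_while_referenced = (weak.payload ~= nil)

getter = nil            -- drop the only closure that refers to payload
collectgarbage()
local alive_after_release = (weak.payload ~= nil)
```

With the reference interpreters I would expect the first flag to be true and the second to be false.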
------------ code 2 begin ----------------------
assert(_G._G == _G)
foo = {}
assert(_G.foo == foo)
_G = nil
if getfenv then
assert(getfenv(1)._G == nil) -- Lua 5.1
else
assert(_ENV._G == nil) -- Lua 5.2
end
_G = { foo = "bar"} -- setting the variable _G won't affect the environment
print (foo) -- still the table assigned above (prints "table: 0x..."), not "bar"
print (_G.foo) -- "bar"
------------ code 2 end ----------------