- Subject: Re: Feature request: hiding upvalues
- From: Gabriel Bertilson <arboreous.philologist@...>
- Date: Thu, 15 Nov 2018 17:22:01 -0600
It's not true that there is a chain of environment tables and that
local variables are looked up in a table by name (if I'm reading you
right), at least not in Lua 5.3. (I think this is also true of Lua
5.1, though some of the opcodes are different.)
Global and local variables are implemented differently. Global
variables are treated as fields in the _ENV table, while local
variables are assigned to registers by the compiler. So setting and
getting a global variable uses different opcodes than setting and
getting a local variable:
--------
$ luac -l -l -
x = 10
print(x)
main <stdin:0,0> (5 instructions at 0x55c499b0bde0)
0+ params, 2 slots, 1 upvalue, 0 locals, 3 constants, 0 functions
1 [1] SETTABUP 0 -1 -2 ; _ENV "x" 10
2 [2] GETTABUP 0 0 -3 ; _ENV "print"
3 [2] GETTABUP 1 0 -1 ; _ENV "x"
4 [2] CALL 0 2 1
5 [2] RETURN 0 1
constants (3) for 0x55c499b0bde0:
1 "x"
2 10
3 "print"
locals (0) for 0x55c499b0bde0:
upvalues (1) for 0x55c499b0bde0:
0 _ENV 1 0
--------
$ luac -l -l -
local x = 10
print(x)
main <stdin:0,0> (5 instructions at 0x55a6a4b9bde0)
0+ params, 3 slots, 1 upvalue, 1 local, 2 constants, 0 functions
1 [1] LOADK 0 -1 ; 10
2 [2] GETTABUP 1 0 -2 ; _ENV "print"
3 [2] MOVE 2 0
4 [2] CALL 1 2 1
5 [2] RETURN 0 1
constants (2) for 0x55a6a4b9bde0:
1 10
2 "print"
locals (1) for 0x55a6a4b9bde0:
0 x 2 6
upvalues (1) for 0x55a6a4b9bde0:
0 _ENV 1 0
--------
Here the global variable x is set with SETTABUP and gotten with
GETTABUP, using the constant "x", but the local variable x is set with
LOADK and gotten with MOVE, using the register index 0. The name of a
local variable isn't used by the bytecode instructions, though it is
stored elsewhere in the bytecode (so error messages can mention local
variables by name).
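To check that the name survives only as debug information, here is a
minimal sketch using the standard debug library. The function name
`demo` is made up for illustration, and it assumes the chunk has not
been stripped of debug info (for example with `luac -s`).

```lua
-- Hypothetical helper, for illustration only.
local function demo()
  local x = 10
  -- Level 1 is the running function (demo); index 1 is its first local.
  -- The name comes from the function's debug info, not from a table lookup.
  local name, value = debug.getlocal(1, 1)
  -- The local never becomes a field of the globals table.
  return name, value, rawget(_G, "x")
end

print(demo())
```

Assuming no global x has been defined elsewhere, this should print the
name "x", the value 10, and nil for the global lookup.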
And upvalues have a different implementation from both locals and
globals; they are set with SETUPVAL and gotten with GETUPVAL.
So hash tables are not involved in the implementation of local
variables or upvalues at all.
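A small closure makes the upvalue case concrete. This is only a
sketch: `make_counter` and `count` are names invented for the example,
and the opcode remarks in the comments describe what I'd expect from
Lua 5.3's luac.

```lua
-- Hypothetical example names; not from the thread.
local function make_counter()
  local count = 0            -- a local of make_counter (a register)
  return function()
    -- Inside the closure, count is an upvalue: read with GETUPVAL and
    -- written with SETUPVAL in Lua 5.3, with no table access involved.
    count = count + 1
    return count
  end
end

local counter = make_counter()
counter()
counter()

-- The debug library exposes the upvalue by index; the name is debug info.
local name, value = debug.getupvalue(counter, 1)
print(name, value)
```

Running the inner function through `luac -l -l` should show the
GETUPVAL/SETUPVAL pair in place of the GETTABUP/SETTABUP seen for globals.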
There is also no chain of environment tables by default in Lua 5.3.
The metatable of _ENV is nil if it hasn't been modified.
$ lua -e 'print(getmetatable(_ENV))'
nil
But I guess you can make a chain of environment tables by doing `local
_ENV = setmetatable({}, { __index = _ENV })`.
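Spelled out a little more, a hand-made chain could look like the
sketch below. It needs Lua 5.2 or later (for _ENV), and the names
`outer` and `inner` are made up for the example.

```lua
local outer = _ENV                                  -- the original globals table
local inner = setmetatable({}, { __index = outer }) -- child environment

do
  local _ENV = inner      -- global names in this block now go through inner
  x = 10                  -- creates inner.x; outer is left untouched
  y = tostring(x)         -- tostring is found in outer via __index
end

-- Outside the block, global names resolve through the original _ENV again.
print(rawget(inner, "x"), rawget(outer, "x"), rawget(inner, "y"))
```

The chain exists only because the metatable was set explicitly; the
lookup through `__index` applies to global names, never to locals or
upvalues.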
Because locals, globals, and upvalues are implemented differently, the
compiler must determine whether a variable is local, global or an
upvalue when a chunk is compiled and the nature of a variable in a
chunk cannot be changed after that point. So the function expression
`function () return x end` either returns an upvalue or a global, and
the assignment `x = 10` is resolved to an assignment to a local, an
upvalue, or a global depending on context.
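Here is a short sketch of that context dependence. The function names
`read_outer` and `read_inner` are invented for the example; the two
function bodies are textually identical but are resolved differently
by the compiler.

```lua
x = "global"                              -- no local x in scope: a field of _ENV

local function read_outer() return x end  -- compiled before any local x exists,
                                          -- so x here is a global

local x = "local"                         -- from here on, x names a local

local function read_inner() return x end  -- same source text, but x is an upvalue

x = "changed"                             -- assigns the local, not the global

print(read_outer(), read_inner(), _G.x)
```

The first function keeps reading the global even after the local is
declared, because the decision was fixed when it was compiled.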
— Gabriel
On Tue, Nov 13, 2018 at 3:25 PM Philippe Verdy <verdy_p@wanadoo.fr> wrote:
>
> I don't think so; within the same block of statements, all variables are automatically bound to the same environment (i.e. a table), and the compiler does not need to know if it's local or external: all of them are local and accessed by the "__index" meta entry of the environment table, which is always used as first level of indirection before performing an actual lookup to the environment table itself (not its metatable).
> Unlike tables in Lua, all environments must have a metatable associated to their table, so there's always an "__index" entry in it (it also has a "__newindex" for assignments). A compiler may want to perform some optimizations for not creating a metatable with "__index" and "__newindex", but it cannot safely know if these two entries are set or not (they may be set by the block of instructions by using the setfenv function, possibly by calling external functions which will execute with the parent element in their own environment linked to the parent environment, and so can also modify the parent environment).
> So all names are local. The fact that when assigning a variable or reading it has an external effect comes only from the fact that the default "__index" function will lookup in parent environments in a chain to see if there's a matching name: if no such name is found in the chain, then the effect of reading the variable will return "nil"; the same occurs for "__newindex" which also tries to lookup the local table, then if not found performs a lookup in the parent environment, and if not found it will then create a new variable in the initial environment.
> All you want is to stop the recursive lookup of variable names in the chain of environment, so that all variables behave as pure local variables (creating as many new variables as needed).
> It's not really possible to block the recursion: your code even needs the chain for all basic operations (including operators like "+"). If you break the lookup, then your local code can simply do nothing at all!
> Remember that the environment does not include only local variables, it also includes all functions and operators your code can use.
> So your proposed "blind" keyword in:
> function (_ENV,c,s) blind
> return {x=c*x-s*y,y=s*x+c*y}
> end
> would have the effect of leaving only three names accessible: _ENV, c and s, but operations like "=" (assignment made via "__newindex" function call), "*", "-", and "+" would also have no defined function (their lookup would return nil, and you'd then get errors: cannot call a function referenced by nil)!
>
> The only way to do that is to allow passing selected properties you need for your function to run, by creating a restrictive environment, in which the function:
> function (c,s)
> return {x=c*x-s*y,y=s*x+c*y}
> end
> now can run in perfect isolation: it is effectively the case that variable names "x" and "y" are not defined locally, but you have to force them to use the local environment and not any parent environment, but you still need the function references for the 3 arithmetic operators. Note that for function calls (including operator evaluations) there's also a "__call" entry in the environment to find matching function names: functions are not called directly.
>
> An interesting reading:
>
> http://lua-users.org/wiki/DetectingUndefinedVariables
> or more generally
> http://lua-users.org/wiki/LuaScoping
>
> and the manual of course (which details all "__" prefixed functions needed in valid environment and that allow your code to be really executable) :
>
> http://www.lua.org/manual/5.2/manual.html#2.4
>
>
>
> Le mar. 13 nov. 2018 à 14:23, Dirk Laurie <dirk.laurie@gmail.com> a écrit :
>>
>> Op Di., 13 Nov. 2018 om 14:04 het Philippe Verdy <verdy_p@wanadoo.fr> geskryf:
>> >>
>> >> I'm not too sure how one could implement hiding of upvalues at the
>> >> language level. (At the implementation level, it's obvious. Just skip
>> >> the phase that looks for them.)
>> >
>> > This is not so obvious because Lua highly depends on this; the "phrase" that looks for it is exactly the one that looks up variables in the environment using its "__index" meta-entry, which is where the environment is already stated: so the first level of lookup would be required (otherwise the function itself would not have access to its own local variables) but you want to avoid the recursion of the lookup to the next level to look for upvalues.
>> > Note that this recursion is a trailing recursion (so Lua optimizes it natively as a loop: the "phrase" you want to hide would be a statement within that loop, and you want it to be used only on a specific loop number to break that loop by returning early a "nil" value so that an "undefined variable error" can be stated). The difficulty is that there's no loop number which is accessible. So all I see you can do is to set the "__index" meta entry specifically to your need.
>>
>> I think we are talking at cross-purposes.
>>
>> Whether a name is recognized as an upvalue happens at compile time. No
>> metatable is involved. It's a question of what is in scope.
>>
>> Do 'luac -l' for my two examples. The one without "x=1" generates the
>> instruction GETTABLE 4 0 -1 ; "x"
>> but the one with "x=1" generates GETUPVAL 4 0 ; x
>>
>> The scope of a name is lexical. That means there is a sequence of
>> local scopes with the entire chunk outermost, each containing a
>> smaller scope until we get to the innermost scope. The compiler does
>> this when one refers to 'x':
>>
>> 1. Is there a local variable named 'x' in the innermost scope? If so,
>> it does not need to be loaded: the VM instruction can access it
>> directly.
>> 2. For each containing scope working outwards, the question
>> is asked again. If a local variable named 'x' is found in that scope,
>> a GETUPVAL instruction is generated to load the variable via the
>> upvalue list that sits in the function's closure.
>> 3. If no containing scope has 'x', a GETTABLE instruction is issued to
>> load the value as a table access from _ENV.
>>
>> The requested "blind" keyword would merely tell the compiler to treat
>> the current innermost scope, from that point onwards, as not having a
>> containing scope, so that step 2 is an empty loop.
>>
>> You may have been thinking of what happens in case 3: the GETTABLE from
>> _ENV could trigger a whole chain of __index metamethods, depending on
>> what you have done with _ENV (in fact, since this idiom is used in an
>> object-oriented paradigm, your _ENV is an object which may well have a
>> complicated metatable).