- Subject: Re: Suggestion: names-isolation-statement
- From: nobody <nobody+lua-list@...>
- Date: Mon, 15 Jul 2019 10:55:28 +0200
On 11/07/2019 23.15, Egor Skriptunoff wrote:
> A new syntax:
>     with white-list-of-names do
>        ....
>     end
> This statement makes all names from the outer scope (with exception of
> white-listed names) invisible in the inner block.
Interesting idea, might be fun.
But these two things in combination create a problem:
> The statement works for all names independently of what a name is:
> local/upvalue/global.
> The white-list must contain only visible names, otherwise
> compilation error is raised.
What about to-be-defined "globals"? (Will you have to "pre-declare" them?)
And what about this?
    Point2D.length =
      function( _ENV ) with x, y do return (x^2+y^2)^0.5 end end
(which could (more or less) equivalently be written as
    function Point2D:length( ) return (self.x^2+self.y^2)^0.5 end
but is shorter / more readable this way and already (without with)
gives the guarantee that you'd at most pollute the object you're
working on, not the outer environment.)
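For reference, the `_ENV`-as-parameter trick already runs in stock Lua 5.2 and later without any `with`. The following is only a minimal sketch: the metatable setup for `Point2D`, the `scale` method and the `scaled` field are made up here for illustration and are not part of the proposal.

```lua
-- Requires Lua 5.2+ (relies on _ENV).
-- The Point2D setup, `scale` and `scaled` are illustrative assumptions.
local Point2D = {}
Point2D.__index = Point2D

-- Naming the first parameter _ENV makes every free name resolve in the object.
Point2D.length = function( _ENV ) return (x^2 + y^2)^0.5 end

-- A stray "global" assignment lands in the object, not in the real globals.
Point2D.scale = function( _ENV, k )
  x, y = x * k, y * k
  scaled = true   -- becomes a field of the point
end

local p = setmetatable( { x = 3, y = 4 }, Point2D )
local len = p:length( )
p:scale( 2 )
```

After this runs, `scaled` is a field of `p` and the real global table is untouched, which is the "at most pollute the object" guarantee mentioned above.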
Or even
    do
      local print, _G = print, _G  -- keep both reachable once _ENV is replaced
      local _ENV = { x = 23 }
      function _G.foo( ) with print, x do print( x ) end end
      -- up to this point you cannot even tell that _ENV might change
      -- again (so this would break Lua's single-pass compilation)
      function _G.bar( newenv ) _ENV = newenv end
      -- only now it's clear that _ENV might be changed
    end
    bar { y = 5 }
    foo ()
Because _ENV can change dynamically, you cannot determine at compile
time what names will be valid, so you cannot always raise a compile time
error.
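The same effect can be observed today without the proposed syntax. This is a rough sketch for Lua 5.2 or later; `get_x` and `set_env` are names invented for this illustration.

```lua
-- Requires Lua 5.2+ (relies on _ENV). get_x / set_env are illustrative names.
local get_x, set_env
do
  local _ENV = { x = 23 }
  function get_x( ) return x end                -- compiled as _ENV.x
  function set_env( newenv ) _ENV = newenv end  -- rebinds the shared upvalue
end

local before = get_x( )   -- x exists in the current environment
set_env { y = 5 }
local after = get_x( )    -- same code, but x is no longer there
```

Whether `x` "exists" is only known when `get_x` actually runs, so a check of the white-list against the environment could at best happen at run time.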
> Sergey Kovalev complained recently about lack of syntax for defining
> "pure functions" in Lua.
> Due to "with" statement we could define them easily (3 variants):
> Variant #1
>     local function pure_func(x, y)
>        with x, y do  -- if the function must be recursive, add its
>                      -- name to this white-list
>           -- all upvalues and globals are inaccessible here
>           ...
>        end
>     end
This isn't ideal for "syntactical sandboxing" (you have to look inside
the function to check that it's safe) and you have to write all
arguments twice…
> Variant #2
>     local pure_func
>     with pure_func do  -- we need to include something in the
>                        -- white-list to be able to pass the constructed
>                        -- function value outside the block
>        function pure_func(x, y)
>           -- all upvalues and globals are inaccessible here
>           ...
>        end
>     end
This is awfully verbose. (You have to give the name THREE times!?!)
> Variant #3
>     with do  -- white-list is empty, the function value is passed
>              -- outside the block by "return" statement
>        local function pure_func(x, y)
>           -- all upvalues and globals are inaccessible here
>           ...
>        end
>        return pure_func
>     end
This needs an outer (function() … end)() wrapper or has to live in its
own file, again awfully clunky.
I'd prefer that to work like
    local x, y, z with a, b, c do
      … (both x,y,z AND a,b,c are visible here) …
    end
(but I already have `local x,y,z do … end` blocks in my code.) Because it
wouldn't be possible to disambiguate that from
    local x, y, z;
    with a, b, c do … end
without an explicit semicolon, I'd restrict it to always have the
`'local' <NAMELIST> 'with' <NAMELIST> 'do' <BLOCK> 'end'` form. That
would force blocks with purely external side-effects to take the form
    local _ with io, self, … do …block… end
which I personally wouldn't mind… but at that point I'm left wondering:
What exactly is gained over the current possibilities of _ENV?
I start my modules (whether they're files or blocks) with something like
    -- (_M is the exposed module, _MENV / _ENV is the environment)
    local _ENV = setmetatable( { _M = { } }, { __index = _ENV or _G } )
    if setfenv then setfenv( 1, _ENV ) end
    _M._MENV = _ENV -- be REPL-/testing-/monkey-patching-friendly
    package.loaded[...] = _M -- no `return _M` needed
which is 5.1-thru-(probably)-5.4-compatible. Because everything has
_ENV, there's essentially no place where actual global variables can
accidentally be created – everything ends up in a "local environment" at
worst. Also, (unless I need that tiny bit of extra speed for
recursion,) I don't `local` my internal functions, I intentionally let
them go to the module's environment (for easier monkey patching,
testing, etc.). Tests can easily check whether any "local global" was
accidentally changed (per module, snapshot the initial foo._MENV after
loading, run all the tests, compare with final state & flag any changes
not explicitly whitelisted – this also catches updates to existing
fields, unlike __newindex-based stuff.)
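The snapshot-and-compare check could look roughly like the following. This is a sketch only, not the actual test harness: `snapshot`, `changed_keys` and the sample `env` table are hypothetical, and the keys are assumed to be strings so that they can be sorted.

```lua
-- Hypothetical helpers; works on Lua 5.1 and later.

-- Shallow copy of an environment table (raw fields only).
local function snapshot( env )
  local copy = { }
  for k, v in pairs( env ) do copy[k] = v end
  return copy
end

-- Names that were added, updated or removed since `before` was taken,
-- skipping anything listed in `allowed`. Assumes string keys.
local function changed_keys( before, env, allowed )
  allowed = allowed or { }
  local changed = { }
  for k, v in pairs( env ) do          -- new or updated fields
    if before[k] ~= v and not allowed[k] then changed[#changed+1] = k end
  end
  for k in pairs( before ) do          -- removed fields
    if rawget( env, k ) == nil and not allowed[k] then
      changed[#changed+1] = k
    end
  end
  table.sort( changed )
  return changed
end

-- Stand-in for a module environment such as foo._MENV.
local env = { counter = 0, helper = print }
local before = snapshot( env )
env.counter = 1       -- explicitly whitelisted below
env.leaked = true     -- accidental new "local global"
env.helper = nil      -- accidental removal
local changed = changed_keys( before, env, { counter = true } )
```

Here `changed` ends up listing `helper` and `leaked`, while the whitelisted update to `counter` is ignored.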
Instead of going { __index = _ENV or _G } for a new _ENV, I could be
more specific, which would get me the white-listing of "globals". And
because I don't have to `local` the module-internal state (as I have an
actual environment), I actually don't have that many local variables
outside of functions (and those that exist are `do`…`end`-scoped), so I
don't need the hiding of locals at all. (Stuff that I want to sandbox
goes into its own file anyway and gets its _ENV set via load(), so I
have the guarantee that there won't be any accidental upvalues / locals.)
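As a small illustration of white-listing "globals" through the environment passed to load(), here is a sketch for Lua 5.2 or later. The contents of `whitelist` and the source string are invented for this example.

```lua
-- Requires Lua 5.2+ (load with an env argument).
-- The whitelist contents and the source string are illustrative.
local whitelist = { math = math }
local source = "leak = 1  return math.floor( 2.5 ), os"

-- The fourth argument becomes the chunk's _ENV, so the chunk sees
-- nothing but the white-listed names and has no upvalues or locals.
local chunk = assert( load( source, "=sandboxed", "t", whitelist ) )
local floored, os_seen = chunk( )
```

The chunk can reach `math` but not `os`, and its stray assignment to `leak` ends up in `whitelist` rather than in the real globals. Note that handing out the real `math` table lets the chunk modify it; a stricter setup would pass a copy.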
So… I guess… have you tried working with _ENV before? Why was it
insufficient for your purposes?
> How "with" statement should work with "_ENV" name:
> 1) If "_ENV" is written explicitly (user wants to directly access the
> upvalue "_ENV") then such "_ENV" is visible only if it's in white-list.
> 2) If "_ENV" is used implicitly (user accesses global variable "var",
> and the preprocessor replaces "var" with "_ENV.var") then this implicit
> "_ENV" is always accessible, even when "_ENV" is not white-listed.
This just sounds scary.
> I believe the identifier "with" is used not very frequently in Lua
> programs because it is not a noun/adjective/verb,
> so there would be not very much old Lua code broken by introducing this
> new keyword.
Seems correct, I have an awful lot of zipWith, withOpenFile and other
withFOOs… but only two places where a function parameter is called
'with'… (not related to the above functions).
-- nobody