- Subject: Globals (more ruminations)
- From: Mark Hamburg <mark@...>
- Date: Thu, 8 Jul 2010 09:45:38 -0700
As I think someone pointed out, the reason the globals issue keeps coming up is that globals are viewed as "bad" and Lua makes them easy to create. In addition, that ease of creation can hide errors caused by typos.
Why are globals "bad"?
The primary reason is the same reason they are "bad" in other languages: They create opportunities for unexpected coupling. Code in one script can unexpectedly interact with code in another script through shared global names. This problem can be mitigated by assigning each script its own environment table when it is loaded. These tables can use a metatable with an __index entry to access an intentionally shared namespace of values. It is also worth noting that this sort of coupling is exactly why globals are useful in interactive mode where we need to tie together a series of separately compiled chunks.
(Side note: Unexpected coupling is also why having the module function add the module to the global namespace is "bad".)
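The per-script environment idea above can be sketched roughly as follows. This is a minimal illustration, not how any particular host does it: `load_in_env`, `new_script_env`, and `shared` are hypothetical names, and the helper picks between `loadstring`/`setfenv` (Lua 5.1, LuaJIT) and the four-argument `load` (Lua 5.2 and later).

```lua
-- Hypothetical helper: compile a source string with its own environment.
-- Lua 5.1 / LuaJIT: loadstring + setfenv. Lua 5.2+: pass env to load.
local function load_in_env(source, name, env)
  if setfenv then
    local chunk, err = loadstring(source, name)
    if not chunk then return nil, err end
    return setfenv(chunk, env)
  end
  return load(source, name, "t", env)
end

-- Intentionally shared namespace (assumption: simply the standard globals).
local shared = _G

-- Each script gets a fresh table: reads fall through to `shared`,
-- writes land in the script's own table.
local function new_script_env()
  return setmetatable({}, { __index = shared })
end

local env_a, env_b = new_script_env(), new_script_env()

-- Both scripts assign the same "global" name without seeing each other.
local script_a = assert(load_in_env("counter = 1; return type(print)", "a", env_a))
local script_b = assert(load_in_env("counter = 100; return counter", "b", env_b))

local a_result = script_a()
local b_result = script_b()
```

After both scripts run, each `counter` lives only in its own environment table and the shared namespace is untouched, while reads such as `print` still resolve through `__index`.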
The secondary reason is that globals are slower than locals or upvalues. How often does this speed difference matter? Probably not all that often. Still, it may be more often than one might think: Lightroom uses a chain of environments for exactly the reasons cited above, and profiling showed a surprising amount of time spent on global lookups until we started getting more aggressive about caching standard library functions into locals. And if you post a benchmark result asserting that "Lua is slow" and use globals rather than locals, someone from the Lua community is likely to tell you that you aren't using the language properly, thereby implying that the easy path is not the proper path. So, performance may or may not be a reason to view globals as problematic. That said, I believe LuaJIT more or less eliminates the issue, so this may fade as a concern over time.
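For readers who have not seen it, the caching idiom mentioned above looks something like this. The function name `hypot_floor` is made up for illustration; only the pattern of binding library functions to locals matters.

```lua
-- Cache library functions in locals once, when the chunk is loaded.
local sqrt = math.sqrt
local floor = math.floor

-- Inside the function these are upvalues, so each call avoids looking up
-- `math` in the environment (possibly through a chain of __index
-- metatables) and then indexing it.
local function hypot_floor(x, y)
  return floor(sqrt(x * x + y * y))
end

local result = hypot_floor(3, 4)
```

One trade-off worth keeping in mind: the locals capture whatever `math.sqrt` was at load time, so later replacements of the library function are not seen by this code.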
Why are globals "good"?
As noted above, they provide essential coupling between chunks in interactive mode. The alternative would be some way to pre-populate chunks with a set of pre-bound upvalues. One could then harvest the upvalues from one chunk and pre-populate them into the next chunk. If you thought global environments were hard to reason about, this seems much harder (though perhaps useful for experts).
Globals also provide ways to do interesting things by using special environments. The module function switches the environment so that assignments go into the module table. Some class systems do similar things for class definition. LuaGravity redefines the global environment so that function declarations turn into reactors. That said, this is a case where "in env do ... end" made a certain amount of sense. These uses also tend to run into trouble: they are intended to redirect writes, but in doing so they also change how reads behave, unless one uses a custom environment that balances the two needs.
How to balance between these two?
As a starting point, I think the _ENV approach in 5.2work3 provides an interesting opportunity by allowing one to distinguish between chunk-level global accesses and global accesses that go through an explicitly created _ENV variable. This makes some of the special environment tricks work better, though they are a bit uglier syntactically unless we also re-introduce "in env do ... end" as sugar for "do local _ENV = env; ... end". That, however, leads to a bunch of other issues around what one is trying to accomplish and around expectations about how environments work.
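To make the "do local _ENV = env; ... end" form concrete, here is a small sketch. It assumes Lua 5.2 or later (where free names compile to `_ENV.name`); it will not behave this way on 5.1 or LuaJIT. The table `M` and the names `version` and `greet` are hypothetical.

```lua
-- Assumes Lua 5.2+: a free name `x` is compiled as `_ENV.x`.

-- Hypothetical module table; reads that miss fall back to the chunk's globals.
local M = setmetatable({}, { __index = _ENV })

do
  local _ENV = M
  -- Free names in this block now resolve through M.
  version = "0.1"                       -- stored as M.version
  function greet(name)                  -- stored as M.greet
    return "hello, " .. tostring(name)  -- tostring is read via M's __index
  end
end

-- Outside the block, free names use the chunk's own _ENV again.
local leaked = version   -- the assignment above did not touch the chunk's globals
```

This also shows the read/write tension: the `__index` fallback is what keeps `tostring` readable inside the block, and it is a choice the author of the environment has to make explicitly.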
That then leaves the question of whether it needs to be harder to create chunk-level globals, whether it needs to be easier to create chunk-level locals, whether we just need better performance analysis tools, or some combination of these.
For example, the vast majority of the code I write outside of interactive mode would be simpler with the addition of a simple import statement:
import foo
which translates into:
local foo = require "foo"
Additional syntax could provide name overrides or member import, but I would be cautious here. Given this, I could then see banning both reading and writing of globals at the chunk level outside of interactive mode. This makes a common case easier to write, but it would also slap me and other developers on the team whenever we slipped onto the seemingly easy path.
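Short of a compile-time ban, a runtime approximation of "no chunk-level globals" can be sketched with an environment whose metatable rejects every global read and write. This is only an illustration under stated assumptions: `load_in_env` and `strict_env` are hypothetical names, and passing the library in through `...` merely stands in for what an import statement would bind to a local.

```lua
-- Hypothetical helper: compile a source string with its own environment.
-- Lua 5.1 / LuaJIT: loadstring + setfenv. Lua 5.2+: pass env to load.
local function load_in_env(source, name, env)
  if setfenv then
    local chunk, err = loadstring(source, name)
    if not chunk then return nil, err end
    return setfenv(chunk, env)
  end
  return load(source, name, "t", env)
end

-- An empty environment: any global read or write raises an error.
local strict_env = setmetatable({}, {
  __index = function(_, key)
    error("read of global '" .. tostring(key) .. "'", 2)
  end,
  __newindex = function(_, key)
    error("write to global '" .. tostring(key) .. "'", 2)
  end,
})

-- Names reach the chunk only as explicit values bound to locals.
local good = assert(load_in_env([[
  local string = ...          -- stands in for an imported module
  return string.upper("ok")
]], "good", strict_env))

-- A typo that would normally create a global silently.
local typo = assert(load_in_env([[
  local total = 0
  totl = total + 1
  return total
]], "typo", strict_env))

local good_ok, good_value = pcall(good, string)
local typo_ok, typo_err = pcall(typo)
```

Unlike a token filter or byte-code check, this only catches the offending access when that line actually executes, so it complements static analysis and does not replace it.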
This could probably be handled via a combination of a token filter and a byte-code analyzer, but I haven't looked closely into the former and the latter is going to take a bit more thought in 5.2work3.
P.S. With regard to typos, in addition to detecting inadvertent globals, a lint should perhaps also check the names of messages used in message sends...