On Fri, Jul 9, 2010 at 12:32 PM, Mark Hamburg <mark@grubmah.com> wrote:
> Say there were a new statement of the form:
>         global var1, var2, var3
> ...So, now I can write:
>
>        local _ENV = module()
>        function say_hello()
>                global print
>                print( "Hello" )
>        end
>
> This seems like it points toward something useful, but I'm not sure it's there yet.

Let me refine that...

A core aspect of your idea is to allow more than one environment to
be active simultaneously.  So, code like this:

  foo(bar)

might be compiled into bytecode equivalent to

  _ENV1.foo(_ENV2.bar)

The interesting thing is that the Lua 5.2.0work3 VM doesn't treat _ENV
specially, nor for that matter _ENV1 or _ENV2.  It doesn't care.  We
are free to support this multiple environment concept in the language
and implement it in the code generator without modifying the VM or
runtime.
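
As a rough illustration of why no VM change is needed, here is a sketch
in plain Lua 5.2 or later.  The names env1 and env2 are hypothetical
stand-ins for _ENV1 and _ENV2; the first function is what the code
generator would effectively emit for `foo(bar)`, the second is what
stock Lua does today with a single _ENV.

```lua
-- Requires Lua 5.2 or later (relies on _ENV being an ordinary name).
-- env1 and env2 are hypothetical stand-ins for _ENV1 and _ENV2.
local env1 = { foo = function(x) return "foo got " .. tostring(x) end }
local env2 = { bar = 42 }

-- Source under the proposal: foo(bar)
-- Rendered form: each free name is indexed in its own table.
local function rendered()
  return env1.foo(env2.bar)
end

-- Stock Lua 5.2: every free name is indexed in the single _ENV.
local function stock()
  local _ENV = { foo = env1.foo, bar = env2.bar }
  return foo(bar)  -- compiled as _ENV.foo(_ENV.bar)
end
```

Both functions do only ordinary table indexing on locals or upvalues,
which the VM already supports.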

The question then is how the code generator should know whether to
render `foo` as `_ENV1.foo` or `_ENV2.foo`.  It needs to decide prior
to serializing the chunk into bytecode.  To make this more concrete,
consider this example:

  -- baz.lua
  local require = require
  _ENV = module()
  local _ENV1 = require "_G"
  local _ENV2 = require "mathx"
  local lfs = require "lfs"
  function hello()
    print(acosh(0), foo, lfs.currentdir())
  end

One answer is that the compiler (luac) should load the modules "_G"
and "mathx" into memory before compiling baz.lua.  When the code
generator comes across the identifier `print`, it checks for a non-nil
value in `require"mathx".print` or `require"_G".print`.  For the first
one found, the corresponding environment table (_ENV1 or _ENV2) is
selected.  If none is found, it defaults to the environment of the
chunk (_ENV), which may be initialized to an empty table.  So, the
above code gets rendered to bytecode equivalent to

  -- baz.lua
  local require = require
  _ENV = module()
  local _ENV1 = require "_G"
  local _ENV2 = require "mathx"
  local lfs = require "lfs"
  function _ENV.hello()
    _ENV1.print(_ENV2.acosh(0), _ENV.foo, lfs.currentdir())
  end
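
The lookup rule could be sketched roughly as below.  This is only an
assumption about how a code generator might do it: the candidates list,
the resolve function, and the stub tables are all hypothetical, and the
search order shown (_ENV1 before _ENV2) is my choice, not something
fixed above.  A real tool would load the tables with `require` rather
than use stubs.

```lua
-- Stub tables standing in for require "_G" and require "mathx".
local g_stub = { print = print, tostring = tostring }
local mathx_stub = {
  acosh = function(x) return math.log(x + math.sqrt(x * x - 1)) end,
}

-- Environments to search, in order (the order is an assumption).
local candidates = {
  { name = "_ENV1", tbl = g_stub },
  { name = "_ENV2", tbl = mathx_stub },
}

-- Return the name of the environment an identifier should bind to.
-- Falls back to the chunk's own _ENV when no candidate defines it.
local function resolve(identifier)
  for _, c in ipairs(candidates) do
    if c.tbl[identifier] ~= nil then
      return c.name
    end
  end
  return "_ENV"
end
```

With these stubs, `print` resolves to _ENV1, `acosh` to _ENV2, and
`foo` to the default _ENV, matching the rendering above.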

Note that the binding of a global variable reference to a module table
is now a compile-time rather than a run-time decision.  There is no
longer an __index=_G fallback (package.seeall) on _ENV to be queried
at run-time.  The _ENV table is also clean, containing only the
module's public API, so there is no more need for the package.clean
workaround for package.seeall.  The actual retrieval of the variable
from the module table remains a run-time operation.  An optimizing
compiler may localize these variables if it is safe to do so, but that
is now an implementation decision.
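
To make the "clean table" point concrete, here is a hand-written
approximation of the rendered code in Lua 5.2 or later.  The names G, M
and mathx_stub are hypothetical (M stands in for the table returned by
module(), and mathx_stub for the real mathx), and the leaky table at
the end shows the package.seeall style for contrast.

```lua
-- Requires Lua 5.2+.  mathx_stub is a stand-in for the real mathx module.
local G = _G
local mathx_stub = {
  acosh = function(x) return math.log(x + math.sqrt(x * x - 1)) end,
}

local M = {}            -- plays the role of the table from module()
do
  local _ENV = M        -- free names in this block now index M
  function hello()      -- stored as M.hello
    -- foo is M.foo (nil here); everything else is reached explicitly.
    return mathx_stub.acosh(1), foo, G.type(G.print)
  end
end

-- package.seeall style, for contrast: _G leaks through __index.
local leaky = setmetatable({}, { __index = _G })
```

Here M holds only `hello`, and `M.print` is nil, whereas `leaky.print`
resolves to the global print at run-time.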

Most likely, there would be a convenience syntax for the requires, so
the code would really be written like

  -- baz.lua
  _ENV = module()
  using "_G.*"
  using "mathx.*"
  using "lfs"
  function hello()
    print(acosh(0), foo, lfs.currentdir())
  end

where "using" is perhaps a keyword that internally invokes `require`.
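
Until such a keyword exists, the bookkeeping behind it can be
approximated with an ordinary function.  The sketch below is purely
hypothetical: parse_using, new_scope and the loader parameter are names
I made up, and the loader defaults to `require` but can be swapped for
a stub when the modules aren't installed.

```lua
-- Split a spec such as "mathx.*" into a module name and a wildcard flag.
local function parse_using(spec)
  local modname = spec:match("^(.-)%.%*$")
  if modname then
    return modname, true    -- "mathx.*": import all names
  end
  return spec, false        -- "lfs": import the module under its own name
end

-- Collect imports the way a code generator might record them.
local function new_scope(loader)
  loader = loader or require
  local scope = { wildcards = {}, named = {} }
  function scope.using(spec)
    local modname, wildcard = parse_using(spec)
    local mod = loader(modname)
    if wildcard then
      -- searched in order when resolving free names
      scope.wildcards[#scope.wildcards + 1] = mod
    else
      -- becomes a local named after the module
      scope.named[modname] = mod
    end
  end
  return scope
end
```

Wildcard imports end up in an ordered search list (the _ENV1, _ENV2
tables above), while a plain `using "lfs"` behaves like
`local lfs = require "lfs"`.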

It should be stated, however, that "using 'foo.*'" amounts to, in Java
terminology, a "static import" and should be used sparingly [1], such
as for common top-level standard library functions.  My usual
preference is to prefix variables by module name, like
"mathx.acosh(0)", so that it's clear to the reader which package a
function comes from.  However, it's commonly recognized that there are
limited cases where static import is justified, and _G and mathx
roughly fit the bill.

[1] http://download.oracle.com/docs/cd/E17476_01/javase/1.5.0/docs/guide/language/static-import.html