lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Asko Kauppi wrote:
Imho, this is not a question of Turing-to-or-not but of proper sandboxing, right? I mean, if there's absolutely no way the program can contact outside world (read files etc.) then what harm could turingness possibly do?

Well, if I understood Ben correctly, the problem is less a question of security (hanging host program, doing illegal operations on client computer) than of having all the relevant data without the original interpreter.

He has both concerns ("reliability and trust") but I feel the accent is on the first.

The problem is the same for TeX, but also for PostScript, perhaps PDF, and other scripted data description languages like SVG: it is hard to extract pure data (like text) without having a full blown interpreter. In case of Lua, the syntax is simple and the interpreter easy to get, in case of PS or JS, it is a bit harder...

OTOH, ability to read files etc. would be beneficial in the configuration files, so.. it's about where you draw the line. Perhaps just disallow any writes? Sandboxing per se is relatively easy with Lua.

I beleive that "where [to] draw the line" is really the hard part of the question...

-ak

1.12.2004 kello 02:05, Ben Crowell kirjoitti:

 Is there any reasonably easy way to use Lua as a data-
description language, while disallowing the features
that make it a Turing-complete language? For instance,
Unix and the apps that run on it have traditionally had
lots and lots of configuration files, each of which had
its own idiosyncratic syntax. It would be nice to be able
to standardize them in terms of syntax. I believe MacOS X
uses XML for a lot of its config files, for instance.
However, sometimes it's nice to have a real programming
language, and XML isn't a programming language per se.
For instance, it would be nice to be able to use conditional
constructs in a config file: if the operating system is
FreeBSD, then X, else if it's Linux, ... But you probably
don't want a Turing-complete language for this purpose,
for reasons of reliability and trust; it's fundamentally
impossible for a computer program to verify or manipulate
another program written in a Turing-complete language.
This is why I'm much more willing to accept an
e-mail written in HTML (a Turing-incomplete language) than
one that contains an executable program.

Well, HTML *can* contains an executable program: JavaScript (or VBScript...). That's why spider task can be hard, become some links are hidden in JS code (navigation menu, scrambled e-mail addresses...).

Another interesting
example is TeX, which is Turing-complete; spell-checkers for
TeX are always full of weird heuristics for attempting to
determine what should be spell-checked and what shouldn't,
and the problem is fundamentally impossible to solve because
of the Turing-complete nature of the language.

So if you can see what I'm getting at --- is there a way
to read a data description written in Lua, but make sure it's
sufficiently limited in what it does that we know it's
limited to a Turing-incomplete subset of Lua? A couple of
possibilities present themselves to my mind:

  1. Have a separate parser that verifies that the program is
  really written only in the Turing-incomplete subset. Once
  this has been verified, feed the program to Lua.

  2. Make a crippled version of the Lua interpreter that doesn't
  support anything outside the Turing-incomplete subset.

Both of these have obvious drawbacks.

It's also not obvious to me whether there is a unique or optimal
subset of the language that should be defined. For instance,
any language is probably Turing-complete if it has (a) looping
constructs, and (b) the ability to break out of the loop based on
condition that's not known at compile time. Well, which should be
disallowed, the loops or the conditionals? Or is it possible to
allow some loops and some conditionals, but not all combinations
of them?

Looping can be useful, but perhaps too sophisticated for most needs.
Perhaps you are actually looking for something like C's pre-processor...

Maybe I should just be looking at some other language rather than Lua,
one that's purely a data description language, and Turing-incomplete
by design. The thing is, it would be really sweet IMO to have
both available: a nice Turing-complete language *and* a Turing-
incomplete subset. Working with a language that's purely meant
to be Turing-incomplete, there would always be cases where I would
find myself at a dead end: unable to do something I want to do,
because the language is Turing-incomplete. Using the e-mail analogy,
it's not that I'm *never* willing to run software written by someone
else. It's just that I have to make that decision on a case-by-case
basis.

On the other hand, complex code can be full part of data description.
I already gave an example here of the power of Lua as complex data description:

I had a "database" of street names with postal areas, ie. town is divided in numbered areas, with sometime simple affectation (this street is in the area 7) and sometime more complex setting (odd numbers of this street are in area 8, even numbers in area 102, or numbers below 77 are in area 9, and over in area 333).

So my data looks like this:

["argenteuil"] =
{
  ru = { liasse = "25", voie = "rue du Pont d'Argenteuil" },
  av1 = { liasse = "637", voie = "av. d'Argenteuil" },
  av2 = { liasse = "0000", voie = "av. du Vx Ch. d'Argenteuil" },
  special =
    function(voie)
      local f = string.find(string.lower(voie), "rue")
      if f ~= nil then
        return "ru"
      else
        -- Avenue
        f = string.find(string.lower(voie), "ch")
        if f == nil then
          return "av1"
        else
          return "av2"
        end
      end
    end
},
["barbusse"] =
{
  liasse =
    function(n)
      if EstPair(n) then -- IsOdd(n)
        if n <= 10 then
          return "11"
        elseif n <= 24 then
          return "07"
        else
          return "14"
        end
      else  -- impair
        if n <= 15 then
          return "07"
        else
          return "14"
        end
      end
    end,
  voie = "rue H. Barbusse"
},
["bas"] = { liasse = "01", voie = "rue des Bas" },
["basly"] =
{
  ru = { liasse = "04", voie = "rue Basly" },
  al = { liasse = "04", voie = "allée Basly" },
  pref = "ru"
},

Without Lua and its functions as first-class object, I would have to put a lot of conditional code in the data management code itself... Here, I just had to test whether liasse was a function or special was non nil, and run these functions to get the value.

--
Philippe Lhoste
--  (near) Paris -- France
--  Professional programmer and amateur artist
--  http://Phi.Lho.free.fr
--  --  --  --  --  --  --  --  --  --  --  --  --  --