Re: Lua and undefined behaviour: A bullet-proof (sandboxed?) secure environment is an impossible dream

A maximum string size is not enough. What matters is the total memory used by all strings (when they are distinct and not canonicalized to the same immutable copy) and by all other objects.

You could just as well run code that builds a giant table full of distinct strings:

t = {} for i=1,1E100 do t[i] = tostring(i) end
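One partial mitigation that stays inside Lua is to watch total heap usage rather than individual string sizes. The following is only a minimal sketch, assuming the debug library is available to the host; run_with_memory_budget, budget_kb and check_every are hypothetical names, not part of any Lua API:

```lua
-- Hypothetical helper: run fn under a rough total-memory budget (in KiB).
-- A count hook fires every `check_every` VM instructions and compares the
-- heap size reported by collectgarbage("count") against a baseline.
local function run_with_memory_budget(fn, budget_kb, check_every)
  local baseline = collectgarbage("count")
  debug.sethook(function()
    if collectgarbage("count") - baseline > budget_kb then
      -- Level 0: no position information is added to the message.
      error("memory budget exceeded", 0)
    end
  end, "", check_every or 1000)
  local ok, err = pcall(fn)
  debug.sethook() -- remove the hook again
  return ok, err
end

-- Usage: the table-filling loop is stopped once about 1 MiB is in use.
local ok, err = run_with_memory_budget(function()
  local t = {}
  for i = 1, 1e7 do t[i] = tostring(i) end
end, 1024)
print(ok, err)
```

This is not a hard guarantee: the hook only runs between VM instructions, so a single C call such as string.rep can allocate far beyond the budget before the next check. A strict cap needs a custom allocator installed by the host through the C API.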

You could just as easily create infinite recursion with a broken implementation of the factorial function:

function fac(n) return n < 2 and n or n * fac(n+1) end

(with the typo changing the correct - into a +)

Here you will exhaust the stack limit, and what happens at that point is unpredictable. It may be difficult even to raise an error, or to force the current function to return nil, if doing so requires an additional stack frame (for example to build an error object or to call debug.traceback) before unwinding the stack up to the topmost pcall() error handler.
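For comparison, here is the common case only. This sketch assumes the reference PUC-Rio interpreter built with its default limits, where runaway Lua-to-Lua recursion is usually reported as an ordinary error that pcall() can catch. It says nothing about embedded hosts with tighter native stacks, or about recursion that passes through C functions:

```lua
-- The broken factorial from above: n + 1 instead of n - 1.
local function fac(n) return n < 2 and n or n * fac(n + 1) end

-- In the stock interpreter the runaway recursion is normally turned into
-- a Lua error, so pcall returns false plus a message.
local ok, err = pcall(fac, 2)
print(ok, err)
```

With default settings the message normally mentions "stack overflow"; the exact wording and the depth reached depend on the Lua version and build options.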

pcall() may also be called recursively without bound:

function fac(n)
  if n < 2 then return n end
  local ok, ret = pcall(fac, n+1)
  if ok then return n * ret else return nil end
end

This code still looks correct because it handles the error, but unwinding is limited, and nothing prevents the function from retrying the failed pcall (which will fail again):

function fac(n)
  if n < 2 then return n end
  repeat
    local ok, ret = pcall(fac, n+1)
    if ok then return n * ret end
  until ok
  return nil
end

Until the incorrect + is changed back to a -, the program keeps failing indefinitely and insists on using all the memory allowed for the standard call stack (and the pcall stack).
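If a retry is wanted at all, it can at least be bounded. A minimal sketch, with retry and max_attempts as hypothetical names introduced only for illustration:

```lua
-- Hypothetical helper: call fn(...) in protected mode at most
-- max_attempts times, then give up and report the last error.
-- Only the first return value of fn is passed through.
local function retry(max_attempts, fn, ...)
  local last_err
  for _ = 1, max_attempts do
    local ok, ret = pcall(fn, ...)
    if ok then return true, ret end
    last_err = ret
  end
  return false, last_err
end

-- Usage: a function that always fails is tried three times, not forever.
local calls = 0
local ok, err = retry(3, function()
  calls = calls + 1
  error("still broken", 0)
end)
print(ok, err, calls)
```

This only limits the number of attempts. Each attempt can still consume the whole stack before it fails, so the bug itself has to be fixed.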

This does not seem to be a case of UB in itself, but what happens when the stack cannot grow is UB. Lua has a lot of difficulty tracking and checking stack usage, because the tracking code itself depends on the correct behavior of the internal C/C++ error handlers used by the Lua engine implementation, which in turn depend on error handlers in the OS itself.

Verifying that quotas cannot be exhausted, and that failure recovery is possible and safe, requires writing and running stress tests as a routine task for programmers. This is a classic programming problem, and such tests are not always easy to write for complex programs that fail only under very specific conditions that are hard to reproduce.

And many programmers do not take the time to write correct stress tests, just as they often insert assertion checks only in rare places. They think first in terms of practical performance for the most common cases they want or have to handle, and taking time to write a program that fails on purpose is considered a waste of work time... until someone later reports a severe bug that is much harder to isolate.

On Tue, Sep 7, 2021 at 15:29, Coda Highland <chighland@gmail.com> wrote:

On Mon, Sep 6, 2021 at 1:17 AM sur-behoffski <sur_behoffski@grouse.com.au> wrote:
G'day,

Some of the things that the C standard(s) leave undefined are left
for the OS/runtime libraries to decide. It is simply impossible to
stub off every loose end at the Lua-to-C interface, so please give
up hoping.

Here is a variant of the "billion laughs" attack that I saw Roberto
post many years ago... extended by a significant number of steps.

As the subject line of this message says:

A bullet-proof (sandboxed?) secure environment
is an impossible dream.

I have a fairly hefty machine that can run over ten virtual machines
simultaneously.... I use this as a test-bed for my "lglicua" project
so I can expose a single tarball to multiple Ubuntu/LinuxMint/CentOS
releases, and can check that very simple sanity tests pass.

https://sourceforge.net/projects/lglicua/

At the time of posting, the billion-laughs-attack script below is
running at 100% CPU; "top" shows the resident memory for the Lua
script as having just passed the 61GiB mark... and we've just
reached the point to print "lol11"...

... What will happen at some point, is that the kernel's
out-of-memory killer will be invoked, as it sees the precious
resource run out, and the Lua script will be killed mid-operation.

cheers,

s-b etc etc

----- (cut here) -----

#!/usr/bin/env lua

-- Demonstration of billion-laughs attack
-- https://en.m.wikipedia.org/wiki/Billion_laughs_attack

lol1 = "lol" print("lol1")
lol2 = string.rep(lol1, 10) print("lol2")
lol3 = string.rep(lol2, 10) print("lol3")
lol4 = string.rep(lol3, 10) print("lol4")
lol5 = string.rep(lol4, 10) print("lol5")
lol6 = string.rep(lol5, 10) print("lol6")
lol7 = string.rep(lol6, 10) print("lol7")
lol8 = string.rep(lol7, 10) print("lol8")
lol9 = string.rep(lol8, 10) print("lol9")
lol10 = string.rep(lol9, 10) print("lol10")
lol11 = string.rep(lol10, 10) print("lol11")
lol12 = string.rep(lol11, 10) print("lol12")
lol13 = string.rep(lol12, 10) print("lol13")
lol14 = string.rep(lol13, 10) print("lol14")
lol15 = string.rep(lol14, 10) print("lol15")
lol16 = string.rep(lol15, 10) print("lol16")

Avoiding undefined behavior is ONE part of writing secure software. Well-formed code that does not invoke undefined behavior can still be a security risk. Case in point, consider the function "os.execute()". Its behavior is well-defined. It is also horrifically dangerous to call with user-supplied input.
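One common way to keep functions like os.execute() away from user-supplied code is to load it with an explicit environment that simply does not contain them. A minimal sketch, assuming Lua 5.2 or later (where load() accepts a mode and an environment); run_untrusted and the whitelist contents are illustrative choices, not a vetted sandbox:

```lua
-- Hypothetical helper: compile a source string as a text-only chunk with a
-- small whitelist as its global environment, then run it in protected mode.
local function run_untrusted(source)
  -- Only what is listed here is visible as a global inside the chunk.
  local env = {
    tostring = tostring,
    tonumber = tonumber,
    pairs = pairs,
    ipairs = ipairs,
  }
  -- Mode "t" rejects precompiled bytecode.
  local chunk, err = load(source, "=untrusted", "t", env)
  if not chunk then return false, err end
  return pcall(chunk)
end

print(run_untrusted("return tostring(1 + 1)"))
print(run_untrusted("return os.execute('true')")) -- fails: os is nil here
```

String methods remain reachable through the string metatable (for example ("x"):rep(n)), so this removes os.execute() from reach but does nothing about memory or CPU exhaustion.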

In this particular case, this attack is trivially defeated by applying a maximum buffer size, which is a sensible thing to do when operating on user-supplied input of unknown size.
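As an illustration of such a cap, string.rep can be wrapped so that the result size is checked before anything is allocated. This is a minimal sketch; limited_rep and MAX_LEN are hypothetical names and the 1 MiB limit is an arbitrary choice:

```lua
-- Arbitrary cap on the size of a single result string (1 MiB).
local MAX_LEN = 1024 * 1024

-- Hypothetical wrapper: refuse to build a string longer than MAX_LEN.
-- The check uses a division so that huge counts cannot overflow.
local function limited_rep(s, n)
  if n > 0 and #s > 0 and n > MAX_LEN / #s then
    error("string.rep result would exceed " .. MAX_LEN .. " bytes", 2)
  end
  return string.rep(s, n)
end

print(#limited_rep("lol", 10))                         -- small requests pass
print(pcall(limited_rep, string.rep("x", 1024), 2048)) -- 2 MiB is refused
```

A cap like this bounds one string at a time. It does not bound the total memory held by many strings that each stay under the limit.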

Just because you can't stop all security issues just by avoiding undefined behavior doesn't mean it isn't appropriate to avoid undefined behavior.

/s/ Adam