On 04/09/2021 15:53, Egor Skriptunoff wrote:
On Fri, Aug 27, 2021 at 10:31 PM Lorenzo Donati wrote:

UB is just dragons waiting to wreak havoc
on your machine.

How dangerous are string.format() dragons?
Should string.format() be inaccessible to untrusted Lua code?

Or, maybe Lua protects safely from the main threat (buffer overflow),
and all other dragons are small and acceptable?

It all depends on the implementation of the C runtime library Lua is linked against. UB means what it means. There is no "small" dragon. The term "dragon" doesn't denote a bug, but the possibility that a bug turns out to be catastrophic. That's the whole point of the standard washing its hands of it. And that's why there are so many tools for analyzing C source code to avoid UB.

Who knows what a printf implementation does when passed something the standard says is UB? You cannot rule out a buffer overflow when passing "%#c" to printf. The standard says it's UB and *anything can happen*. You have NO guarantee at all, even if Lua checked every other thing. Once the call crosses the Lua/C barrier, Lua has no control over it.
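To make the "%#c" case concrete, here is a minimal sketch of a check that could run before a spec crosses the barrier. The helper name spec_is_defined is hypothetical (it is not part of Lua or of any C library), and it covers only the combinations ISO C explicitly calls undefined for flags and precision; length modifiers, "*", "%n" and "%p" are simply rejected to keep the sketch short.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* True if c is a non-NUL character contained in set. */
static bool in_set(char c, const char *set)
{
    return c != '\0' && strchr(set, c) != NULL;
}

/* Hypothetical helper: accepts exactly one spec of the form
   %[flags][width][.precision]conv and returns true only if the
   combination is one ISO C defines. Deliberately conservative. */
bool spec_is_defined(const char *spec)
{
    bool alt = false, zero = false, prec = false;
    const char *p = spec;

    if (*p++ != '%') return false;
    for (; in_set(*p, "-+ #0"); p++) {          /* flags */
        if (*p == '#') alt = true;
        if (*p == '0') zero = true;
    }
    while (isdigit((unsigned char)*p)) p++;      /* width */
    if (*p == '.') {                             /* precision */
        prec = true;
        p++;
        while (isdigit((unsigned char)*p)) p++;
    }

    char conv = *p;
    if (!in_set(conv, "diouxXaAeEfFgGcs") || p[1] != '\0') return false;
    /* '#' and '0' are undefined outside these conversions. */
    if (alt && !in_set(conv, "oxXaAeEfFgG")) return false;
    if (zero && !in_set(conv, "diouxXaAeEfFgG")) return false;
    /* A precision on 'c' is undefined as well. */
    if (prec && conv == 'c') return false;
    return true;
}
```

With this sketch, "%#x" and "%05d" pass, while "%#c", "%0s" and "%.3c" are rejected before any printf-family function sees them.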

Remember that an implementation is allowed to take it for granted that a C program doesn't contain code that triggers UB. This means that an implementation can legitimately lack any code to even check whether the "#" flag is present for a "c" conversion. So, for example, imagine a big switch statement where every branch handles one of the allowed flags: when "#" appears it may force execution of the default branch, if present, or execution may fall through to the end of the switch without handling the format code properly, because that "#" should not be there. You could end up in a segment of code never meant to be executed under those conditions. You cannot assume the state of the program is necessarily consistent at that point, and resources, such as buffers, may be leaked. An implementation is not "wrong" or "buggy" if it doesn't handle that case; it simply takes advantage of the leeway afforded by the standard.
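The switch scenario can be modelled with a toy, fully defined C function. This is purely illustrative: the names toy_c_flag and pad_mode are made up, and no real C library is claimed to look like this. The point is only that a flag with no branch leaves the state in a value the rest of the code never planned for.

```c
/* Padding state a toy formatter would set up before emitting a char. */
enum pad_mode { PAD_UNSET, PAD_RIGHT, PAD_LEFT };

/* Hypothetical flag handler for a "%c" conversion. Only the cases
   the author of this toy expected get a branch. */
enum pad_mode toy_c_flag(char flag)
{
    enum pad_mode mode = PAD_UNSET;

    switch (flag) {
    case '\0': mode = PAD_RIGHT; break;   /* no flag: right-justify */
    case '-':  mode = PAD_LEFT;  break;   /* '-': left-justify */
    /* No case for '#' and no default: control just leaves the switch. */
    }
    return mode;   /* PAD_UNSET here means later code runs on a state
                      it was never written to handle. */
}
```

In this model toy_c_flag('#') returns PAD_UNSET. A real implementation has no obligation to carry even such a sentinel, which is exactly the leeway described above.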

If an implementation gave you some *guarantees* about the handling of printf format strings beyond what the standard states (e.g. no buffer overflows, regardless of what the format string is, even in UB cases), that would be an extension. No implementation is required to do so. And if your program relied on those guarantees, it would be non-portable *by definition*. That may be fine if you don't mean to port your code to other platforms, but that's all.

The whole mess of UB is just that: people think "most implementations won't do something silly in this case", then you find the "right" compiler switch, the "right" compiler version, the "right" DLL linked in, the "right" C library version, and some years down the road something goes horribly wrong.

Of course there are applications where a catastrophic crash can be tolerated (by some definition of "tolerance"), especially when you run your application under an OS (although losing days of work because, for example, an engineering simulation program crashed unexpectedly because someone produced a badly formatted log message is not a "small" dragon, IMO).

But at system level, perhaps on an MCU running Lua embedded in a C application, maybe with no OS, where "stdio.h" is not even a required library (and, if provided, is often dumbed down), a "dragon" may mean a physical system goes down hard. Even if we don't consider safety-critical applications (Lua in automotive? Probably not, since I don't think its code was written with MISRA in mind), a printer suddenly resetting to factory defaults or losing control of a bunch of sensors, leading to massive paper jams, is not something I would dismiss.

FWIW, IMO the absence of "C-type UB" in a programming language is a very big plus. If something is wrong, a program should fail early and loudly, possibly with a clear error message. Lua is advertised as a "safe" language as long as one doesn't use the C API or the debug library. I consider it a bug to have corner cases where this is not true, especially because a Lua programmer doesn't expect a "C-type UB" dragon lurking in their code.
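As a sketch of what "fail early and loudly" could look like on the C side: a wrapper that accepts only a narrow whitelist of specs and reports anything else with a message instead of handing it to snprintf. The function format_int_checked and its whitelist (one optional flag, a width of at most two digits, then "d") are assumptions made for this example, not how Lua does it.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: formats one int, but only for specs of the form
   %[flag][width]d, with at most one flag from "-+ 0" and at most two
   width digits. Returns the snprintf result on success, or -1 with a
   human-readable message stored in err. */
int format_int_checked(char *out, size_t cap, const char *spec, int value,
                       char *err, size_t errcap)
{
    const char *p = spec;
    int digits = 0;

    if (*p == '%') {
        p++;
        if (*p != '\0' && strchr("-+ 0", *p) != NULL) p++;   /* one flag */
        while (isdigit((unsigned char)*p) && digits < 2) {   /* width */
            p++;
            digits++;
        }
        if (p[0] == 'd' && p[1] == '\0')
            return snprintf(out, cap, spec, value);   /* known-good spec */
    }

    /* Anything else is refused before it reaches the C library. */
    snprintf(err, errcap,
             "invalid format spec '%s' (expected %%[flag][width]d)", spec);
    return -1;
}
```

The whitelist is stricter than the standard requires. That is the design choice: refusing a few valid specs is cheap, while letting one undefined spec through is not.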


-- Lorenzo