On 29/08/2021 07:19, Flyer31 Test wrote:
But what is your problem with this result "1" or better "01" for

str= "1"
string.format("%02c", str)


I wasn't criticizing your statements, but I was replying to @nobody, who seemed to imply that UB was fine if you could test that the executable consistently provides a sane result on a given platform.

That is a fallacious approach, since a compiler can produce whatever code it sees fit when the source code triggers UB. So observing the apparent behavior of the executable is not enough to state something like "yes, the code has UB, but in this case the executable is ok and doesn't do anything nasty".

The *executable* code could hide a segfault (or worse) waiting to happen if its environment changed, for example.

The only guaranteed safety when you write code with any UB described in the standard is when an implementation chooses to *define* and *document* that case (the standard allows that as an extension). However, the source code then becomes non-portable, because it relies on that one implementation's guarantee.

Relying on observable behavior from an executable generated by an implementation that doesn't provide that guarantee is wrong.

UB doesn't mean "the executable generates an error" or "the executable (always) behaves in obviously erratic ways". It means "you can say ABSOLUTELY NOTHING about the behavior of the executable". It could behave nicely for a million executions on the same machine with the same configuration and then format your hard disk when executed on the 1st of July of a leap year. Yes, an implementation could also produce sane (and safe) machine code, but the point is: you cannot tell for sure by simply observing some output.

The only way to be absolutely sure that the compiler generated safe executable code when compiling a source with UB would be to actually analyze the machine code produced, which is ridiculous in practice (you could do that for research or curiosity, but not in a production environment).
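
A practical way to sidestep the whole question is to never hand the undefined format to the C library in the first place. The following is only a minimal sketch under my own assumptions, not what Lua does or should do: the names `is_portable_char_spec` and `zero_padded_char` are hypothetical, and the check is deliberately conservative (it accepts only the `-` flag and a field width for `%c`, and rejects the `0` flag, the `#` flag and any precision, which the standard leaves undefined for that conversion).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true only if `spec` is a "%c" conversion
 * that uses nothing beyond the '-' flag and a decimal field width.
 * Anything else (the '0' flag, '#', a precision, a length modifier)
 * is rejected, which is stricter than the standard requires. */
bool is_portable_char_spec(const char *spec)
{
    assert(spec != NULL);
    if (*spec++ != '%')
        return false;
    while (*spec == '-')                    /* the only flag accepted here */
        spec++;
    if (*spec == '0')                       /* a leading zero is the '0' flag */
        return false;
    while (isdigit((unsigned char)*spec))   /* optional field width */
        spec++;
    return spec[0] == 'c' && spec[1] == '\0';
}

/* Hypothetical helper: writes `ch` right-aligned in a field of `width`
 * characters, padded with '0', without ever passing "%0Nc" to printf.
 * Returns the number of characters written (excluding the terminator),
 * or 0 if the buffer is too small. */
size_t zero_padded_char(char *out, size_t out_size, char ch, size_t width)
{
    assert(out != NULL);
    size_t len = width > 1 ? width : 1;     /* at least the character itself */
    if (out_size < len + 1)
        return 0;
    memset(out, '0', len - 1);              /* explicit, well-defined padding */
    out[len - 1] = ch;
    out[len] = '\0';
    return len;
}
```

With this, `is_portable_char_spec("%02c")` is false, and a caller that really wants "01" can build it with `zero_padded_char(buf, sizeof buf, '1', 2)`, whose result does not depend on how a given libc happens to treat the `0` flag.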

Isn't this exactly what would be expected, or am I standing on the line somehow?

On Fri, Aug 27, 2021 at 9:31 PM Lorenzo Donati wrote:

On 27/08/2021 20:00, nobody wrote:
On 27/08/2021 16.28, Roberto Ierusalimschy wrote:
Thanks for the report. Do you have any real case where this is causing
problems? (e.g., a platform with a weird behavior for these uses, a tool
that complains about Lua source code.)

For reference, the standard *explicitly* says "behavior is undefined"
and not just unspecified, but I tried both gcc and clang in gnu99 and
C11 modes with -fsanitize=undefined and neither of them produced any
warnings when formatting "%02c" or "%#02.4c" and other nonsense.  (That
said, as far as I understand UBSAN isn't supposed to catch _everything_,
just a subset of all undefined stuff.)

And from testing them on my machine,
string.format("%02c",string.byte"1") results in "01", so it isn't

On my machine, "%02c" produces " 1" so while it seems to be not truly
undefined "in practice", at least the behavior can't be relied on.

Just for the record, as always in C, undefined behavior is to be avoided
at all costs, since the *actual behavior* of the specific platform can
change unexpectedly even with the same executable on the same machine.

What appears to be consistent and sane behavior (e.g. no segfault) after
compilation could even change if the OS runs low on memory (for
example). The only way to tell if the behavior is sane is to check the
disassembled compiler output and see if the machine code behaves sanely
*in every possible case*, which is of course ridiculous.

Unless an implementation chooses to define a behavior that the standard
leaves undefined (and specifies that in the docs), UB is just dragons
waiting to wreak havoc on your machine.

Such bugs can lie dormant for years, until someone changes a compiler
setting or switches to a new compiler or, even worse, changes the
runtime environment of the same old executable. Then Murphy will grant
you hours and hours of fun! <grin>

-- nobody


-- Lorenzo