

On 31/08/2021 19:45, Roberto Ierusalimschy wrote:
> To define a behavior as undefined sounds like those "this page
> intentionally left blank" pages :-)


C is a real mess! It took me years to really understand what undefined behavior meant. I was never a C programmer, and my knowledge of C has progressed in jumps, as the need to improve my C skills arose from time to time.

It *could* be an enjoyable language, although a hard one, if it weren't for all the warts that the standards committee has never removed (or has removed painfully slowly).

I particularly hate the uselessness of bit fields, the absence of a namespacing facility, and the fact that the standard reserves a whole bunch of unrelated identifiers (mem*, str*, *_t and whatnot).

And most of the time you end up with UB somewhere in your code.

Anyway, it is as widespread and as standardized as a systems language can get, so that's a big deal.

> In the particular case of 'printf', the format "%#c" is defined as
> undefined, while the format "%+c" is literally undefined. Should we
> consider "%+c" as undefined behavior(™)?


I just checked the C99 draft standard N1256 (section 7.19.6.1, page 274 onward).
Indeed, paragraph 6 states explicitly, in its description of the "#" flag, that "For other conversions, the behavior is undefined.", and "c" is one of those other conversions.

OTOH, nothing is said about the "+" flag combined with the "c" conversion, as you point out.
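
For contrast, here is a small sketch of flag/conversion pairs whose results that same paragraph does define: "#" with "x" and "o", and "+" with "d". The function name `alt_forms` and its buffer parameters are made up for illustration; nothing here comes from Lua's sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical demo: fill three caller-supplied buffers (each of `size`
   bytes) using only flag/conversion pairs that C99 7.19.6.1 defines. */
void alt_forms(char *hex, char *oct, char *dec, size_t size) {
    assert(hex != NULL && oct != NULL && dec != NULL && size > 0);
    snprintf(hex, size, "%#x", 255u); /* "#" with "x": prefixes 0x   -> "0xff" */
    snprintf(oct, size, "%#o", 8u);   /* "#" with "o": leading zero  -> "010"  */
    snprintf(dec, size, "%+d", 5);    /* "+" with "d": explicit sign -> "+5"   */
}
```

Neither "%#c" nor "%+c" appears in it on purpose: the first is explicitly undefined, and the standard is silent about the second.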

Anyway, IMO, Lua should avoid leaking "C undefined behavior" (except perhaps through the debug library), and it should also avoid other confusing unspecified behavior inherited from C. So anything that doesn't make sense or is confusing should raise an error.

"%+c" doesn't really make sense, since "c" means "print the character whose code is specified" and a character code has no sign to show, so an error is in order, IMO.

As absurd as it may seem, "%#c" could be given a reasonable meaning in Lua: since "#" means "use an alternate form", one could "define" what is "defined as undefined" in C (ugh!). For example, one could force interpreting the argument as UTF-8 encoded.

Not that I'm actually proposing this. Just pointing out that what makes sense in one language does not necessarily make sense in another. :-)
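
Just to make the thought experiment concrete, here is one possible reading of it, assuming the argument is a Unicode code point that should be written out as UTF-8 bytes. The helper `utf8_encode` is a hypothetical name for this sketch, not something taken from Lua's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: encode code point `cp` as UTF-8 into `out`, which
   must have room for at least 4 bytes. Returns the number of bytes
   written, or 0 if `cp` is a surrogate or lies above U+10FFFF. */
size_t utf8_encode(unsigned long cp, unsigned char *out) {
    assert(out != NULL);
    if (cp < 0x80) {                        /* 1 byte: plain ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                       /* 2 bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                     /* 3 bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                       /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                   /* 4 bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                               /* beyond the Unicode range */
}
```

With a definition like this, a code such as 0x20AC would produce the three bytes E2 82 AC instead of being truncated to a single char.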

Anyway, I think "%#c" should also raise an error.
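
To illustrate the kind of check I have in mind, here is a minimal sketch of a validator that accepts a flag only when it has a defined meaning for the conversion that follows. Everything in it is an assumption made for the example: the name `spec_is_valid`, the small set of conversions it knows about, and the choice to ignore width and precision. It is not how string.format is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check: returns 1 if `spec` is "%", zero or more flags, and
   exactly one conversion character, with every flag meaningful for that
   conversion; returns 0 otherwise. Width and precision are not handled. */
int spec_is_valid(const char *spec) {
    assert(spec != NULL);
    if (spec[0] != '%')
        return 0;
    const char *flags = spec + 1;
    const char *p = flags;
    /* Skip over the flag characters (stop at the terminator explicitly,
       because strchr would otherwise match it). */
    while (*p != '\0' && strchr("-+ #0", *p) != NULL)
        p++;
    size_t nflags = (size_t)(p - flags);
    if (*p == '\0' || p[1] != '\0')
        return 0;                       /* need exactly one conversion char */
    const char *allowed;
    switch (*p) {
        case 'c':                     allowed = "-";    break; /* no sign, no alternate form */
        case 'd': case 'i':           allowed = "-+ 0"; break; /* "#" is undefined in C */
        case 'o': case 'x': case 'X': allowed = "-#0";  break; /* unsigned: no sign flags */
        default:                      return 0;         /* not covered by this sketch */
    }
    for (size_t i = 0; i < nflags; i++)
        if (strchr(allowed, flags[i]) == NULL)
            return 0;
    return 1;
}
```

Under these rules `spec_is_valid("%+c")` and `spec_is_valid("%#c")` both return 0, which is where an error would be raised, while `"%c"`, `"%+d"` and `"%#x"` are accepted.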

In the long run, IMHO, maybe Lua should really specify the string.format format string syntax itself. I understand that this would make the manual somewhat bigger (and increase the implementation size), but I don't think referring readers to the C printf is friendly to pure-Lua (non-C) programmers. Finding the details of the printf mini-language syntax is not trivial at all for someone who doesn't know C. Even if they find a (reliable) reference, they have to parse it while ignoring all the C-specific stuff, which is not easy at all if they don't know jack about C.


> -- Roberto



Cheers!

-- Lorenzo