> But there are other nonportabilities too, more severe than just
> performance issues; examples include log2maxs() and point2int()....

I've received enough off-list mail to prompt me to expand a little on this.

Specifically, I'm working with 5.4.4 source.

log2maxs is in llimits.h, which says

/*
** floor of the log2 of the maximum signed value for integral type 't'.
** (That is, maximum 'n' such that '2^n' fits in the given signed type.)
*/
#define log2maxs(t)	(sizeof(t) * 8 - 2)

This amounts to assuming CHAR_BIT is 8, which is common but (in my
opinion) not so common that it's appropriate to hardwire it into
supposedly-portable software.  (It also includes an implicit assumption
that t's representation includes no padding bits, which is also
commonly true, probably more common than CHAR_BIT==8, but also
something that really shouldn't be assumed.)
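
To make the point concrete, here is a minimal sketch of two
alternatives.  Neither is from the Lua source; the names
log2maxs_charbit and floor_log2_max are hypothetical.  The first merely
swaps the literal 8 for CHAR_BIT and so still assumes no padding bits;
the second derives the answer from the type's actual maximum value
(INT_MAX and friends), which sidesteps both assumptions.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical variant of log2maxs: uses CHAR_BIT rather than a literal 8,
   but still assumes that type 't' has no padding bits. */
#define log2maxs_charbit(t)  (sizeof(t) * CHAR_BIT - 2)

/* Hypothetical helper: floor(log2(maxval)), computed from a type's actual
   maximum value (INT_MAX, LONG_MAX, ...), so neither CHAR_BIT nor padding
   bits enter into it. */
static unsigned floor_log2_max(uintmax_t maxval)
{
    unsigned n = 0;
    assert(maxval > 0);     /* log2 of 0 is undefined */
    while (maxval >>= 1)    /* count how many times maxval can be halved */
        n++;
    return n;
}
```

For example, floor_log2_max(INT_MAX) gives 30 when int has 31 value
bits, which is what log2maxs(int) yields on a system with 8-bit chars
and no padding.  Unlike the macro, though, it is not an integer
constant expression.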

point2int is a mistake on my part.  It is actually point2uint.  It too
is from llimits.h:

/*
** conversion of pointer to unsigned integer:
** this is for hashing only; there is no problem if the integer
** cannot hold the whole pointer value
*/
#define point2uint(p)	((unsigned int)((size_t)(p) & UINT_MAX))

Converting pointers to integers is extremely nonportable, unless the C
implementation provides intptr_t and uintptr_t (C99 specifies that they
are optional).  The comment ("no problem if...") is also wrong; C99
6.3.2.3 #6 says, of converting pointers to integers, that "[i]f the
result cannot be represented in the integer type, the behavior is
undefined".  This code thus invokes undefined behaviour as soon as a
pointer is passed to it which cannot be represented in size_t.  That
may not be possible on a particular system and, if it is, the actual
behaviour might not be undesirable, but it's still a portability risk.
(intptr_t/uintptr_t alleviate this in that they are types for which the
result of converting any valid pointer to void not only can always be
represented, but can also be converted back to a pointer that compares
equal to the original.)
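
A hedged sketch of what a more careful hashing conversion might look
like follows.  It is not Lua's code; point2uint_sketch is a
hypothetical name.  It goes through uintptr_t when <stdint.h> provides
it (detected here by UINTPTR_MAX being defined), and otherwise falls
back to mixing the bytes of the pointer's object representation.  The
fallback is always defined, but it assumes that pointers which compare
equal also have identical representations, which C does not promise.

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

#ifdef UINTPTR_MAX
/* uintptr_t exists: any valid pointer to void converts to it without loss,
   and reducing the result to unsigned int is well defined. */
static unsigned int point2uint_sketch(void *p)
{
    return (unsigned int)((uintptr_t)p & UINT_MAX);
}
#else
/* No uintptr_t: hash the bytes of the pointer's representation instead.
   ASSUMPTION: equal pointers have equal representations. */
static unsigned int point2uint_sketch(void *p)
{
    unsigned char bytes[sizeof p];
    unsigned int h = 0;
    size_t i;
    memcpy(bytes, &p, sizeof p);
    for (i = 0; i < sizeof p; i++)
        h = h * 31u + bytes[i];   /* unsigned arithmetic wraps, never overflows */
    return h;
}
#endif
```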

Unless intptr_t/uintptr_t are provided and one of them is the integer
type being converted to, I can't see any promise in C99 that _any_
pointer value can be represented in _any_ integer type.  As a quality
point rather than a correctness point, even assuming pointer->integer
conversions work "the obvious way"[%], there also is no guarantee that
size_t isn't smaller than unsigned int, meaning that the cast to size_t
could be throwing away potentially useful information.  (An example
might be some MS-DOS memory models, where 32-bit int with 16-bit size_t
is a reasonably plausible combination.)

[%] What "the obvious way" is depends on the system.  For example, I've
read of a C compiler for Lisp Machines which represents C pointers as
an <array,subscript> pair.  "The obvious way" to convert to integer
there would probably be to just return the subscript value.

Glancing over other things....

The comment on l_castU2S (in llimits.h) is a little misleading.  The
cast is actually fine from a C perspective, _unless_ it is used to
convert a value that is not within range of both the unsigned and
signed types involved.  And it's actually not "two['s]-complement" that
matters; it's that the compiler does something close enough to what the
code author intended when it's applied to an out-of-range value.  I
think, for example, that it would be conformant for an implementation,
even on a two's-complement architecture, to implement casting unsigned
to signed by clearing the high bit.
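
For reference, C99 6.3.1.3 #3 says that an out-of-range conversion to a
signed type either produces an implementation-defined result or raises
an implementation-defined signal.  If one wanted to avoid leaning on
that, something along the following lines would do.  This is only a
sketch, not what Lua does: u2s_sketch is a hypothetical name, long long
stands in for lua_Integer, and the code assumes the signed type has
exactly one more negative value than positive (checked by the #if).

```c
#include <limits.h>

#if LLONG_MIN + LLONG_MAX != -1 || ULLONG_MAX / 2 != LLONG_MAX
#error "sketch assumes LLONG_MIN == -LLONG_MAX - 1 and ULLONG_MAX == 2*LLONG_MAX + 1"
#endif

/* Hypothetical replacement for a bare (long long) cast: maps u to the value
   congruent to it modulo 2^N without any out-of-range conversion. */
static long long u2s_sketch(unsigned long long u)
{
    if (u <= (unsigned long long)LLONG_MAX)
        return (long long)u;    /* in range: the value is preserved */
    /* Here u is in (LLONG_MAX, ULLONG_MAX], so ULLONG_MAX - u lies in
       [0, LLONG_MAX] and converts safely; the result is u - 2^N. */
    return -(long long)(ULLONG_MAX - u) - 1;
}
```

With this, u2s_sketch(ULLONG_MAX) is -1 and the smallest out-of-range
input, (unsigned long long)LLONG_MAX + 1, maps to LLONG_MIN, without
depending on what the compiler does for an out-of-range cast.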

In lstate.h:

/*
** About 'nCcalls':  This count has two parts: the lower 16 bits counts
** the number of recursive invocations in the C stack; the higher
** 16 bits counts the number of non-yieldable calls in the stack.
** (They are together so that we can change and save both with one
** instruction.)
*/

Nothing says that 32-bit values can be loaded, stored, or changed
atomically, nor in "one instruction".  If atomicity is important, it
needs to be ensured properly; if not, I'd say the comment should be
changed, either to remove the parenthetical comment or to clarify that
this is a desirable property but not one the code depends on for
correctness.
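
To illustrate what the comment describes, here is a minimal sketch of
such a packed counter.  The names are hypothetical and this is not the
Lua source; uint_least32_t stands in for whatever type nCcalls actually
has.  Note that every operation is an ordinary read-modify-write:
nothing here, or in C99, promises that it compiles to one instruction
or that it is atomic.

```c
#include <stdint.h>

/* Hypothetical stand-in for the type of 'nCcalls'. */
typedef uint_least32_t packed_calls;

/* Lower 16 bits: number of recursive invocations in the C stack. */
static unsigned c_calls(packed_calls n)
{
    return (unsigned)(n & 0xffffu);
}

/* Higher 16 bits: number of non-yieldable calls in the stack. */
static unsigned nny_calls(packed_calls n)
{
    return (unsigned)((n >> 16) & 0xffffu);
}

/* Bump one half or the other; plain arithmetic, no atomicity implied. */
static packed_calls inc_c(packed_calls n)   { return n + 1; }
static packed_calls inc_nny(packed_calls n) { return n + 0x10000UL; }
```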

There may well be others.  This is just what I happen to have noticed,
not a full code review.

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse@rodents-montreal.org
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B