lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]

On Tue, Jan 13, 2015 at 09:39:28AM -0200, Roberto Ierusalimschy wrote:
> > Traditionally you allocate memory using malloc() and Lua environments you might use lua_newuserdata() to get garbage collection. Now, when you allocate memory for more than one element, usually the idiom malloc(nelem * size) or lua_newuserdata(nelem * size) is used.
> > 
> > The integer multiplication, however, can overflow and lead to buffer overflows.  Try e.g. malloc(65536 * 65536). In C libraries a function calloc(nelem, size) exists, but unfortunately it does not guarantee to not overflow either.  On some operating systems, e.g. FreeBSD, it detects overflow and returns NULL.
> > 
> > I am suggesting to add a function to the Lua C API that is like lua_newuserdata(), but takes two parameters, a size and a number of elements, and that checks for overflow and returns NULL in this case:
> > 
> > lua_newuserdatas(size_t count, size_t size)
> > 
> > Thoughts on this?
> Overflows in C can occurr for any arithmetic operation. Usually we use
> the program logic to avoid that, not explicit checks for each operation.
> Moreover, the only portable way to check multiplication overflow is by
> doing a division, which is not very efficient.

I express no opinion regarding the addition of a new allocation function.
But the following is a routine I wrote for use in some of my libraries,
usually for checking overflow when allocating memory. It's portable C (well,
see inline comment regarding lobits on how to make it portable) and it was
about 5x faster than using division on the Intel Core chip I tested it on. I
believe the treatment of signed integers would be nearly identical and
equally portable C code, although I can't speak to the additional cost of
handling signedness.

#define nbits(t) (sizeof (t) * CHAR_BIT)
#define lshift(a) ((a) << (nbits(a) / 2))
#define rshift(a) ((a) >> (nbits(a) / 2))
// NOTE: lobits needs typeof operator! Or (a^a)+1. Or shift left then right.
// Or just (size_t)1 if we don't care about being generic.
#define lobits(a) (((1UL << (nbits(a) / 2)) - 1) & (a))
#define hibits(a) rshift(a)

static inline _Bool addz(size_t *, size_t, size_t);

 * implement multiplication using a polynomial with four multiplications and
 * three additions, except we can optimize out some operations.
static inline _Bool mulz(size_t *_r, size_t _a, size_t _b) {
	size_t a[2] = { lobits(_a), hibits(_a) }, b[2] = { lobits(_b), hibits(_b) };
	size_t r[2];

	/* if both are non-0, we'd always overflow */
	if (a[1] && b[1])
		return 0;

	/* either a[1] or b[1] must be 0 here, so no intermediate overflow */
	r[1] = (a[1] * b[0]) + (a[0] * b[1]);

	/* if the result has MSW bits set, we'd overflow */
	if (rshift(r[1]))
		return 0;

	r[0] = a[0] * b[0];

	return addz(_r, r[0], lshift(r[1]));

static inline _Bool addz(size_t *r, size_t a, size_t b) {
	if (~a < b)
		return 0;

	*r = a + b;

	return 1;