Hi,

64 bit machines are becoming a commodity now (a German grocery
store is selling them over the counter ;-)). So I think it's high time
to clean up the 64 bit support in Lua. But some care is required to avoid
spoiling the high portability of the Lua core from 16 bit up to 64 bit
platforms ...

Here is a table of the common data models that Lua supports:

Data  |                long |       |                  | Platform
model | short int long long | void* | size_t ptrdiff_t | examples 
------+---------------------+-------+------------------+-------------------
LP32  | 16    16   32    -  |  32   |  16       32     | WIN16, MSDOS-large
------+---------------------+-------+------------------+-------------------
ILP32 | 16    32   32  [64] |  32   |  32       32     | WIN32, Unix-32bit
L64   | 16    32   64    -  |  32   |  32       32     | PS2
------+---------------------+-------+------------------+-------------------
LP64  | 16    32   64   64  |  64   |  64       64     | Unix-64bit
LLP64 | 16    32   32   64  |  64   |  64       64     | WIN64
ILP64 | 16    64   64   64  |  64   |  64       64     | (abandoned ~'95)
------+---------------------+-------+------------------+-------------------

[BTW: the choice for lua_Number is a separate issue I won't discuss here.]
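
To check which row of the table a given compiler falls into, something
like the following may help. It is only a sketch: data_model() and
host_data_model() are hypothetical helper names, not part of Lua, and
only the widths of int, long and void* are examined.

```c
#include <limits.h>

/* Hypothetical helper: map the widths (in bits) of int, long and void*
** to the data model names used in the table above. */
static const char *data_model(int int_bits, int long_bits, int ptr_bits) {
  if (int_bits == 16 && long_bits == 32 && ptr_bits == 32) return "LP32";
  if (int_bits == 32 && long_bits == 32 && ptr_bits == 32) return "ILP32";
  if (int_bits == 32 && long_bits == 64 && ptr_bits == 32) return "L64";
  if (int_bits == 32 && long_bits == 64 && ptr_bits == 64) return "LP64";
  if (int_bits == 32 && long_bits == 32 && ptr_bits == 64) return "LLP64";
  if (int_bits == 64 && long_bits == 64 && ptr_bits == 64) return "ILP64";
  return "unknown";
}

/* Data model of the platform this file is compiled for. */
static const char *host_data_model(void) {
  return data_model((int)(sizeof(int) * CHAR_BIT),
                    (int)(sizeof(long) * CHAR_BIT),
                    (int)(sizeof(void *) * CHAR_BIT));
}
```

Printing host_data_model() from a small test program should report LP64
on a typical 64 bit Unix and LLP64 on WIN64.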

The Lua core has been carefully coded to use chars, shorts, ints,
size_t and pointers almost exclusively. This already avoids a large number
of portability problems. Kudos to the Lua authors.

However, there are a few requirements for specific data type lengths:
- A type to hold 32 bit VM instructions (Instruction).
- A type to count/compare memory usage (lu_mem, l_mem).
- A type to count the number of strings (uses lu_int32).

The current definition sets all of these to 'long' which is ok for LP32
and ILP32 platforms, but is a waste on most others and an error on LLP64
platforms (WIN64 has sizeof(size_t) > sizeof(long)).
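
The LLP64 problem can be illustrated without a WIN64 machine. The sketch
below assumes a C99 <stdint.h> (which the Lua core itself does not rely
on) and uses uint32_t as a stand-in for a 32 bit 'unsigned long';
counter32 and count_in_32_bits() are made-up names for illustration.

```c
#include <stdint.h>

/* Stand-in for a memory counter typed as 'unsigned long' on an LLP64
** platform, where long has only 32 bits (assumption: uint32_t models it). */
typedef uint32_t counter32;

/* Store a byte count in the 32 bit counter and read it back;
** the high bits are silently lost. */
static uint64_t count_in_32_bits(uint64_t bytes) {
  counter32 c = (counter32)bytes;   /* wraps modulo 2^32 */
  return (uint64_t)c;
}
```

With this, a total of 5 GB comes back as 1 GB, while anything below 4 GB
is preserved. A counter typed as size_t does not have this problem on
LLP64, since size_t has 64 bits there.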

I have attached a patch to hopefully clean up this issue for all supported
data models.

It drops the use of 'long' everywhere except on 16 bit platforms and,
implicitly through size_t, on 64 bit platforms. PlayStation 2 developers
in particular should be pleased about this, since they have been
complaining a lot on this list (but they still need to change lua_Number
to float).

I have tested this patch on a dozen combinations of architectures (i386,
x86_64, alpha, ia64, hppa), operating systems (Linux, FreeBSD, NetBSD,
HP-UX, OSF/1) and compilers (native vs. gcc). I do not expect any surprises
for any other 32 bit Unix platforms or the remaining two 64 bit platforms
with any market share (Sparc64 and PPC64).

I've successfully allocated and used ~2.8 GB of memory on 32 bit platforms
and >10 GB of memory on 64 bit platforms.

[Thanks go to the HP/Compaq Testdrive Program for providing access to
 such a wide variety of machines.]

I have only cross-compiled for WIN16 and WIN32. I do not have access to
a native Windows or PS2 development environment.

Feedback from developers with access to 16 bit embedded development
environments or to WIN64 would be helpful.

There are some outstanding issues:

- I changed a line in lgc.c:atomic() that has a comparison 'x <= 4*y/2'
  to 'x/2 <= y'. I kindly ask the Lua authors to verify that this still
  does what they intended it to do. With a 32 bit lu_mem the product 4*y
  overflows once y reaches 1G, so the original version may affect proper
  garbage collection when more than 1G of memory is in use (I haven't
  verified this and maybe I misunderstood the code).

- Since this patch already improves bytecode portability it would be nice
  to go one step further and drop the dependency on size_t. While a single
  string with >4 GB size is imaginable (but a bad idea), I cannot imagine
  a use for a >4 GB string *constant* embedded into bytecode.

  When this is done, the bytecode will be portable across all 32 bit
  and 64 bit platforms that are in common use (inefficiencies due to
  byte-swapping upon loading aside). I guess cross-platform developers
  will like the idea. Opinions?

- There is a hard limit of 2^24 on the size of the array portion of a table.
  I'm not sure about the rationale, though. TValue has 12 or 16 bytes and
  the array is indexed with 32 bit ints. This implies a limit of 2^27
  (or realistically 2^26) on 32 bit platforms and 2^31 on 64 bit platforms.

  Although I do not have a use for such large arrays, somebody else might
  (and may not want the array objects to overflow to hash slots instead).
  Should this hard limit be modified for 64 bit platforms?

  [While we are at it: how difficult is it to enhance the GC to avoid the
   atomic traversetable() and split it across GC steps?]
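
Regarding the first item above (the comparison in lgc.c:atomic()), here
is a small sketch of the suspected overflow. It assumes a C99 <stdint.h>
and uses uint32_t as a stand-in for a 32 bit lu_mem; the function names
are made up for illustration and do not appear in the Lua sources.

```c
#include <stdint.h>

/* uint32_t stands in for a 32 bit lu_mem (assumption for illustration). */
typedef uint32_t mem32;

/* Original form: 4*prev is computed first and wraps modulo 2^32. */
static int stay_generational_old(mem32 estimate, mem32 prev) {
  return estimate <= (mem32)(4u * prev) / 2u;
}

/* Patched form: no multiplication, so nothing can overflow. */
static int stay_generational_new(mem32 estimate, mem32 prev) {
  return estimate / 2u <= prev;
}
```

With prev at 1 GB and estimate at 1.5 GB, the old form computes 4*prev
as 0 and returns false, while the new form returns true. For small
values both agree, except that integer division makes the new form
accept one extra byte for odd estimates (estimate == 2*prev + 1), which
is presumably harmless for a heuristic.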

Bye,
     Mike
diff -ur lua-5.1-work2/include/luaconf.h lua-5.1-work2-64bit/include/luaconf.h
--- lua-5.1-work2/include/luaconf.h	2004-11-02 15:56:56.000000000 +0100
+++ lua-5.1-work2-64bit/include/luaconf.h	2004-11-02 17:31:37.000000000 +0100
@@ -30,11 +30,25 @@
 ** =======================================================
 */
 
+
+/* number of bits in an `int' */
+/* avoid overflows in comparison */
+#if INT_MAX-20 < 32760
+#define LUA_BITSINT	16
+#elif INT_MAX > 2147483640L
+/* machine has at least 32 bits */
+#define LUA_BITSINT	32
+#else
+#error "you must set LUA_BITSINT to the number of bits in an integer"
+#endif
+
+
 /* default path */
 #define LUA_PATH_DEFAULT	"?;?.lua"
 
 
 /* type of numbers in Lua */
+/* you need to change more stuff below if you change this */
 #define LUA_NUMBER	double
 
 /* formats for Lua numbers */
@@ -42,8 +56,12 @@
 #define LUA_NUMBER_FMT		"%.14g"
 
 
-/* type for integer functions */
+/* type for integer API functions */
+#if LUA_BITSINT >= 32
+#define LUA_INTEGER	int
+#else
 #define LUA_INTEGER	long
+#endif
 
 
 /* mark for all API functions */
@@ -126,12 +144,19 @@
 #define api_check(L, o)		lua_assert(o)
 
 
-/* an unsigned integer with at least 32 bits */
-#define LUA_UINT32	unsigned long
-
-/* a signed integer with at least 32 bits */
+/* define integers with at least 32 bits (but not more, if possible) */
+#if LUA_BITSINT >= 32
+/* 32 and 64 bit platforms */
+/* note: most 64 bit platforms have 32 bit ints (LP64 or LLP64 data model) */
+#define LUA_INT32	int
+#define LUA_UINT32	unsigned int
+#define LUA_MAXINT32	INT_MAX
+#else
+/* 16 bit platforms */
 #define LUA_INT32	long
+#define LUA_UINT32	unsigned long
 #define LUA_MAXINT32	LONG_MAX
+#endif
 
 
 /* maximum depth for calls (unsigned short) */
@@ -170,7 +195,7 @@
 /* function to convert a lua_Number to int (with any rounding method) */
 #if defined(__GNUC__) && defined(__i386)
 #define lua_number2int(i,d)	__asm__ ("fistpl %0":"=m"(i):"t"(d):"st")
-#elif 0
+#elif defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199900L)
 /* on machines compliant with C99, you can try `lrint' */
 #include <math.h>
 #define lua_number2int(i,d)	((i)=lrint(d))
@@ -195,18 +220,6 @@
 #define LUA_UACNUMBER	double
 
 
-/* number of bits in an `int' */
-/* avoid overflows in comparison */
-#if INT_MAX-20 < 32760
-#define LUA_BITSINT	16
-#elif INT_MAX > 2147483640L
-/* machine has at least 32 bits */
-#define LUA_BITSINT	32
-#else
-#error "you must define LUA_BITSINT with number of bits in an integer"
-#endif
-
-
 /* type to ensure maximum alignment */
 #define LUSER_ALIGNMENT_T	union { double u; void *s; long l; }
 
diff -ur lua-5.1-work2/src/lgc.c lua-5.1-work2-64bit/src/lgc.c
--- lua-5.1-work2/src/lgc.c	2004-09-15 22:38:15.000000000 +0200
+++ lua-5.1-work2-64bit/src/lgc.c	2004-11-02 20:31:58.000000000 +0100
@@ -554,7 +554,7 @@
   g->sweepgc = &g->rootgc;
   g->gcstate = GCSsweepstring;
   aux = g->gcgenerational;
-  g->gcgenerational = (g->estimate <= 4*g->prevestimate/2);
+  g->gcgenerational = (g->estimate/2 <= g->prevestimate);
   if (!aux)  /* last collection was full? */
     g->prevestimate = g->estimate;  /* keep estimate of last full collection */
   g->estimate = g->totalbytes;  /* first estimate */
diff -ur lua-5.1-work2/src/llimits.h lua-5.1-work2-64bit/src/llimits.h
--- lua-5.1-work2/src/llimits.h	2004-09-10 19:30:46.000000000 +0200
+++ lua-5.1-work2-64bit/src/llimits.h	2004-11-02 16:56:48.000000000 +0100
@@ -22,19 +22,25 @@
 
 
 /*
-** an unsigned integer big enough to count the total memory used by Lua;
-** it should be at least as large as `size_t'
+** signed/unsigned integer big enough to count the total memory used by Lua;
+** it should be at least as large as `size_t'.
+** Note: size_t is defined to be the result of sizeof(), i.e. the size of
+** a single memory object; but we need to account for multiple memory objects.
+** Alas, this is only an issue on 16 bit platforms with the LP32 data model.
 */
+#if LUA_BITSINT >= 32
+/* 32 and 64 bit platforms */
+/* using size_t is safe. using ptrdiff_t is lame, but ssize_t is not in C89 */
+/* do not use long here as this will break on LLP64 platforms (WIN64) */
+typedef size_t lu_mem;
+typedef ptrdiff_t l_mem;
+#define MAXLMEM	((~(lu_mem)0)>>1)
+#else
+/* 16 bit platforms */
 typedef lu_int32 lu_mem;
-
-
-/*
-** a signed integer big enough to count the total memory used by Lua;
-** it should be at least as large as `size_t'
-*/
 typedef l_int32 l_mem;
 #define MAXLMEM	LUA_MAXINT32
-
+#endif
 
 
 /* chars used as small naturals (so that `char' is reserved for characters) */
@@ -51,7 +57,7 @@
 ** this is for hashing only; there is no problem if the integer
 ** cannot hold the whole pointer value
 */
-#define IntPoint(p)  ((unsigned int)(p))
+#define IntPoint(p)  ((unsigned int)(lu_mem)(p))
 
 
 
diff -ur lua-5.1-work2/src/lundump.c lua-5.1-work2-64bit/src/lundump.c
--- lua-5.1-work2/src/lundump.c	2004-09-01 23:22:34.000000000 +0200
+++ lua-5.1-work2-64bit/src/lundump.c	2004-11-02 17:17:18.000000000 +0100
@@ -255,7 +255,7 @@
  TestSize(S,sizeof(Instruction),"instruction");
  TestSize(S,sizeof(lua_Number),"number");
  x=LoadNumber(S);
- if ((long)x!=(long)tx)		/* disregard errors in last bits of fraction */
+ if ((l_int32)x!=(l_int32)tx)	/* disregard errors in last bits of fraction */
   error(S,"unknown number format");
 }