- Subject: Re: [ANN] LuaJIT x64 port sponsorship program
- From: "Robert G. Jakabosky" <bobby@...>
- Date: Wed, 9 Dec 2009 18:56:56 -0700
On Wednesday 09, Mike Pall wrote:
> Alexander Nasonov wrote:
> > 09.12.09, 00:40, "Mike Pall" <mikelu-0912@mike.de>:
> > > But LJ2 uses 32 bit pointers for its GC objects on both platforms.
> >
> > Does this mean that 4G is a limits? If so, make it clear to your sponsors
> > ;)
>
> http://lua-users.org/lists/lua-l/2009-08/msg00202.html
>
> --Mike
Quote from:
http://lua-users.org/lists/lua-l/2009-08/msg00202.html
> 32 bit pointers on a 64 bit system. :-)
>
> They just have to reside in the lowest 4 GB (actually it's more
> efficient to only use the lowest 2 GB). That's easy to guarantee
> since LJ2 already uses a (much faster) bundled memory allocator.
> You can turn it off in 32 bit mode, but not in 64 bit mode.
Will LJ2 try to reserve the lower 2-4 GB of the process's address space when
the lua_State is created? If it doesn't, it will be competing with the other
64-bit C code for that range. The lower range might not even be available by
the time the first lua_State is created, if LJ2 is embedded in a host
application that mmaps a large data file or loads a large dataset into memory
before initializing the lua_State. In some cases it wouldn't be difficult to
keep the other 64-bit C code from allocating or mmapping memory in the lower
4 GB, but in most cases this kind of change will not be easy.
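
To make the concern concrete, here is a minimal sketch of how a host could
try to claim low address space early and detect that it is already gone. The
helper name reserve_low and the hint address are made up for illustration,
and this is not how LJ2's allocator works. The hint is only a request, so the
result has to be checked.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical helper: try to reserve `len` bytes entirely below 4 GB.
 * The kernel may ignore the hint and place the mapping elsewhere.
 * Returns NULL if the mapping failed or did not land below 4 GB. */
static void *reserve_low(size_t len)
{
    void *hint = (void *)(uintptr_t)0x10000000u;  /* arbitrary low hint */
    void *p = mmap(hint, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    assert(len > 0);
    if ((uint64_t)(uintptr_t)p + len > (UINT64_C(1) << 32)) {
        munmap(p, len);  /* landed too high: low range not available */
        return NULL;
    }
    return p;
}
```

A host that calls something like this late, after mapping large files, is
more likely to get NULL back, which is exactly the situation described above.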
Also, if a program needs to create many lua_States, that lower 2-4 GB address
range will get crowded fast. In programs that run for long periods of time,
the lower 4 GB address range will become fragmented, making it impossible to
allocate large blocks of memory (e.g. for large Lua tables, large strings or
the Lua stack).
Another quote from:
http://lua-users.org/lists/lua-l/2009-08/msg00202.html
> If someone, somehow eventually outgrows the 4G limit one could
> switch to compressed pointers. Shifting the pointer by 3 or 4 bits
> results in a 32 GB or 64 GB limit.
It might be good to do this even if each instance of LJ2 isn't going to use
more than 4 GB of memory, since the larger 32-64 GB window leaves more room
to share with the host's other allocations.
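
A minimal sketch of the shifted (compressed) pointer idea from the quote,
assuming 8-byte aligned objects and a shift of 3. The names gcref_t,
ref_compress and ref_expand are invented here and are not LJ2's. Addresses
are carried as uint64_t so the arithmetic is easy to check.

```c
#include <assert.h>
#include <stdint.h>

#define REF_SHIFT 3  /* 8-byte alignment: 2^32 * 8 bytes = 32 GB reachable */

typedef uint32_t gcref_t;  /* hypothetical 32-bit compressed reference */

/* Drop the low alignment bits so a 35-bit address fits in 32 bits. */
static gcref_t ref_compress(uint64_t addr)
{
    assert((addr & ((UINT64_C(1) << REF_SHIFT) - 1)) == 0); /* aligned */
    assert((addr >> REF_SHIFT) <= UINT32_MAX);              /* in range */
    return (gcref_t)(addr >> REF_SHIFT);
}

/* Restore the full address by shifting the alignment bits back in. */
static uint64_t ref_expand(gcref_t ref)
{
    return (uint64_t)ref << REF_SHIFT;
}
```

With REF_SHIFT set to 4 (16-byte alignment) the same code reaches 64 GB. The
cost is one extra shift on every pointer load and store.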
Another method would be to reserve one of the registers (x86_64 has more to
spare than x86, so this isn't a problem) as a base pointer for all 32-bit
pointers in LJ2 when compiled as 64-bit code. The 32-bit values then become
offsets from that base rather than absolute addresses. LJ2 could reserve a
full 1-4 GB block of the process's address space with read/write access
disabled (using mmap). Then, when LJ2's allocator needs more memory, it just
re-maps a few pages from its reserved block. That way all the memory it
allocates stays within the same 32-bit range, wherever in the 64-bit address
space the block happens to be.
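
Roughly what I have in mind, as a sketch only. The arena_t type and the
arena_* functions are names I made up, and the sketch uses mprotect to enable
pages inside the reservation rather than re-mapping them with MAP_FIXED,
which would be the other way to do it. In real code the base would live in a
dedicated register instead of a struct.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

typedef struct {
    char  *base;       /* start of the reserved block (the base pointer) */
    size_t reserved;   /* bytes of address space reserved, at most 4 GB */
    size_t committed;  /* bytes made readable/writable so far */
} arena_t;

/* Reserve address space only; no page is accessible yet. */
static int arena_init(arena_t *a, size_t reserve)
{
    if ((uint64_t)reserve > (UINT64_C(1) << 32))
        return -1;  /* offsets must fit in 32 bits */
    void *p = mmap(NULL, reserve, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    a->base = p;
    a->reserved = reserve;
    a->committed = 0;
    return 0;
}

/* Enable the next `len` bytes (rounded up to whole pages) and return
 * their 32-bit offset from the base, or UINT32_MAX on failure. */
static uint32_t arena_grow(arena_t *a, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    len = (len + page - 1) & ~(page - 1);
    if (len > a->reserved - a->committed)
        return UINT32_MAX;
    if (mprotect(a->base + a->committed, len, PROT_READ | PROT_WRITE) != 0)
        return UINT32_MAX;
    uint32_t off = (uint32_t)a->committed;
    a->committed += len;
    return off;
}

/* Turn a 32-bit offset back into a full 64-bit pointer. */
static void *arena_ptr(const arena_t *a, uint32_t off)
{
    assert(off < a->committed);
    return a->base + off;
}
```

Since every object is addressed as base + offset, the block can sit anywhere
in the 64-bit address space, so it no longer matters what the host has
already mapped below 4 GB.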
An example of this method is LLVM's experimental pool allocation optimization
pass, which can be applied to normal C/C++ code. It converts 64-bit pointers
into 32-bit offsets into different memory pools, chosen by the type of object
allocated. This decreases memory usage (smaller pointers, less memory
overhead per allocation) and improves performance by reducing CPU cache
misses.
The research paper and slides about this method can be found here:
http://llvm.org/pubs/2005-05-21-PLDI-PoolAlloc.html
--
Robert G. Jakabosky