On Thursday 02, Norman Ramsey wrote:
> For those interested I had several tables with 13M entries each, with
> string keys totalling about 330MB of data.  Apparently the overheads
> were enough to consume about 3.3GB of RAM, which pushes up against
> fundamental limits on a 32-bit machine.

These two patches could help you, since they reduce peak memory usage during 
resizing of the internal string hashtable and the hashpart of Lua tables:
http://lua-users.org/files/wiki_insecure/users/RobertGabrielJakabosky/stringtable_resize-5.1.3.patch
http://lua-users.org/files/wiki_insecure/users/RobertGabrielJakabosky/hashpart_resize-5.1.3.patch

Internalized strings are kept in a global hashtable.  When the Lua core 
needs to grow that hashtable, it doesn't just resize the memory block used by 
the old hashtable; it allocates a new hashtable and then copies the strings 
from the old hashtable, so both exist in memory during the copy.  The first 
patch above changes the Lua core to resize that hashtable in place.  The 
second patch does the same for the hashpart of all Lua tables.
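
To give a rough feel for the difference, here is a small C sketch of the 
peak size of the bucket array under each strategy.  It is only a model, not 
code from the patches: the function names and the slot size are made up for 
illustration, and the in-place case assumes the allocator can extend the 
block without moving it, which realloc does not guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Peak bytes held by the bucket array while growing from old_slots to
   new_slots when a fresh array is allocated and the entries are copied
   before the old array is freed: both arrays are live at once. */
static size_t peak_copy(size_t old_slots, size_t new_slots, size_t slot_size) {
    assert(new_slots >= old_slots);
    return (old_slots + new_slots) * slot_size;
}

/* Peak bytes when the existing block is grown in place.
   Assumption: the allocator extends the block without moving it. */
static size_t peak_in_place(size_t old_slots, size_t new_slots, size_t slot_size) {
    assert(new_slots >= old_slots);
    return new_slots * slot_size;
}
```

For example, doubling from 2^23 to 2^24 slots at a hypothetical 32 bytes per 
slot gives peak_copy = 768 MiB against peak_in_place = 512 MiB, which 
matters when the process is already close to its address-space limit.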

Those patches are included in the Emergency Garbage Collector (EGC) patch I 
wrote:
http://lua-users.org/files/wiki_insecure/power_patches/5.1/emergency_gc-5.1.4-r2.patch

If you only want minimal changes to the Lua core, use the first two patches 
instead of the EGC patch.  The EGC patch will force a full garbage collection 
when realloc returns NULL (i.e. when you hit the 3.3GB limit).
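
The idea behind the emergency collection can be sketched with a toy model in 
C.  This is not the EGC patch itself: ToyHeap and the toy_* functions are 
invented for illustration, the byte budget stands in for the address-space 
limit, and the single retry after the collection is an assumption of this 
sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy heap with a byte budget, standing in for the process memory limit. */
typedef struct {
    size_t limit;       /* maximum bytes that may be allocated at once */
    size_t live;        /* bytes still reachable */
    size_t garbage;     /* bytes allocated but no longer reachable */
    int    collections; /* number of full collections run so far */
} ToyHeap;

/* Plain allocation: returns 0 (failure) when the request does not fit. */
static int toy_reserve(ToyHeap *h, size_t n) {
    assert(h != NULL);
    if (h->live + h->garbage + n > h->limit) return 0;
    h->live += n;
    return 1;
}

/* Full collection: all unreachable bytes are released. */
static void toy_full_collect(ToyHeap *h) {
    assert(h != NULL);
    h->garbage = 0;
    h->collections++;
}

/* Emergency-GC style allocation: on failure, collect and retry once. */
static int toy_reserve_egc(ToyHeap *h, size_t n) {
    if (toy_reserve(h, n)) return 1;
    toy_full_collect(h);
    return toy_reserve(h, n);
}
```

With a limit of 100 bytes, 40 live and 50 garbage, a plain request for 20 
bytes fails, while toy_reserve_egc succeeds after one collection.  A request 
that does not fit even after collecting still fails, which is why this only 
helps when a good part of the heap is garbage.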

> 
> It looks as if a custom data structure is in the offing.

I would recommend trying the EGC patch first, since you wouldn't need to 
change your Lua code.  A custom data structure might still be best long term, 
since your dataset is very large.

-- 
Robert G. Jakabosky