On Sun, Oct 12, 2008 at 07:57:46PM -0400, Canute Bigler wrote:
> I had not thought of setting the metatable __index to a C function that 
> would then lookup the id.  That would probably be the fastest as far as 
> script execution time would be concerned.  Any good ideas how I could 
> handle nested tables which are not known at run time?  I guess I could 
> receive the field name as a period delimited string ala 
> "record.subrecord1.subrecord2.field" and pay the cost to parse and 
> lookup the field at runtime.  I don't know if that would be optimal though.
> My largest concern across the board is that of speed and not size.  For 
> the application for which this would be a part of, the start up time has 
> been a major focus.
The obvious way to improve start-up speed is to do almost nothing
on startup, so you clearly don't want to load the full hash.
A simple alternative to the structures suggested by Norman is a cdb
(D. J. Bernstein's constant database).

For a starter try the plain
Startup cost is nothing more than opening the file.

> I would gladly trade size for start-up speed in 
> just about any way, shape, or form.  Run-time speed is less of an issue.
cdb happily deals with pretty large DBs and long keys,
so you may just want to spend a couple of MB to replicate the full path
"record.subrecord1.subrecord2.field" a hundred thousand times.
It is going to take quite a number of lookups before the
penalty for increased memory usage kicks in.
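
To illustrate the flat-key idea, here is a minimal sketch of how the full
dotted path could be assembled before the lookup, for instance by an __index
handler that remembers the path of its parent table. The helper name
flat_key and its signature are made up for this example; they are not part
of any cdb library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: join n path components with '.' into out, which
 * holds cap bytes. Returns the key length (without the trailing NUL),
 * or 0 if the joined key would not fit. */
static size_t flat_key(char *out, size_t cap,
                       const char *const *parts, size_t n)
{
    size_t len = 0;

    assert(out != NULL && cap > 0);
    for (size_t i = 0; i < n; i++) {
        size_t plen = strlen(parts[i]);
        size_t need = plen + (i > 0 ? 1 : 0);  /* separator after first part */

        if (need >= cap - len)                 /* keep room for the NUL */
            return 0;
        if (i > 0)
            out[len++] = '.';
        memcpy(out + len, parts[i], plen);
        len += plen;
    }
    out[len] = '\0';
    return len;
}
```

The resulting string is used as the cdb key as-is, so nested tables that are
not known in advance cost nothing at startup, only one string join per lookup.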

For faster run-time access use an mmap-based implementation
(ask me for one or roll your own; the structure is dead simple), which
should be almost on par with other internal structures and the Lua hash.
Especially for a large DB with a small number of lookups it has the big
advantage of examining only a small number of memory/file locations.
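
As a rough sketch of why so few locations get touched, this is what a lookup
over a cdb image in memory can look like. It follows the published cdb
layout (a 2048-byte header of 256 table pointers, then records, then hash
tables, all integers 32-bit little-endian). The function names are my own,
and the code takes a plain pointer and size, so it does not care whether the
bytes come from mmap() or from an array linked into the executable. Treat it
as an illustration, not as a tested replacement for a real cdb library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian integer. */
static uint32_t cdb_u32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* The cdb hash: start at 5381, then h = (h * 33) ^ c for every byte. */
static uint32_t cdb_hash(const void *key, size_t len)
{
    const unsigned char *k = key;
    uint32_t h = 5381;

    while (len--)
        h = (h + (h << 5)) ^ *k++;
    return h;
}

/* Find key in the cdb image base[0..size). Returns a pointer to the data
 * and stores its length in *dlen, or returns NULL if the key is absent
 * or the image is inconsistent. */
static const unsigned char *cdb_find(const unsigned char *base, size_t size,
                                     const void *key, size_t klen,
                                     uint32_t *dlen)
{
    assert(base != NULL && dlen != NULL);
    if (size < 2048)
        return NULL;

    uint32_t h = cdb_hash(key, klen);
    const unsigned char *hdr = base + 8 * (h & 255); /* 1st location: header */
    uint32_t tpos = cdb_u32(hdr);                    /* hash table offset */
    uint32_t nslots = cdb_u32(hdr + 4);              /* slots in that table */

    if (nslots == 0 || tpos > size || (size - tpos) / 8 < nslots)
        return NULL;

    uint32_t slot = (h >> 8) % nslots;
    for (uint32_t i = 0; i < nslots; i++) {          /* linear probing */
        const unsigned char *s = base + tpos + 8 * (size_t)slot;
        uint32_t rpos = cdb_u32(s + 4);

        if (rpos == 0)                               /* empty slot: not found */
            return NULL;
        if (cdb_u32(s) == h && rpos <= size && size - rpos >= 8) {
            uint32_t rk = cdb_u32(base + rpos);      /* record key length */
            uint32_t rd = cdb_u32(base + rpos + 4);  /* record data length */
            size_t avail = size - rpos - 8;

            if (rk == klen && rk <= avail && rd <= avail - rk &&
                memcmp(base + rpos + 8, key, klen) == 0) {
                *dlen = rd;
                return base + rpos + 8 + klen;
            }
        }
        if (++slot == nslots)
            slot = 0;
    }
    return NULL;
}
```

A successful lookup without hash collisions reads one header entry, one hash
slot and one record, no matter how large the file is.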

Incorporating the full cdb directly into your code is also
straightforward, if you prefer a bigger executable over
an external file.