- Subject: Re: Fast loading of very large "symbol" tables
- From: Klaus Ripke <paul-lua@...>
- Date: Mon, 13 Oct 2008 10:00:58 +0200
hi
On Sun, Oct 12, 2008 at 07:57:46PM -0400, Canute Bigler wrote:
> I had not thought of setting the metatable __index to a C function that
> would then lookup the id. That would probably be the fastest as far as
> script execution time would be concerned. Any good ideas how I could
> handle nested tables which are not known at run time? I guess I could
> receive the field name as a period delimited string ala
> "record.subrecord1.subrecord2.field" and pay the cost to parse and
> lookup the field at runtime. I don't know if that would be optimal though.
>
> My largest concern across the board is that of speed and not size. For
> the application for which this would be a part of, the start up time has
> been a major focus.
The obvious way to improve start-up speed is to do almost nothing
on startup, so you clearly don't want to load the full hash.
A simple alternative to the structures suggested by Norman is a cdb
(constant database).
To start, try the plain http://www3.telus.net/taj_khattra/luacdb.html
Startup cost is nothing more than opening the file.
> I would gladly trade size for start-up speed in
> just about any way, shape, or form. Run-time speed is less of an issue.
cdb happily deals with pretty large DBs and long keys,
so you may just want to spend a couple of MB to store the full path
"record.subrecord1.subrecord2.field" as the key, a hundred thousand times over.
It is going to take quite a number of lookups before the
penalty for increased memory usage kicks in.
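
To make the "replicate the full path" idea concrete, here is a minimal
sketch in Python (chosen only for brevity; the same walk is easy in Lua).
The nested table and the helper name flatten are made up for illustration.
It turns a nested structure into flat dotted-path keys, so each field
access becomes one key lookup with no parsing at run time:

```python
def flatten(table, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf of a nested dict."""
    for name, value in table.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            # Descend, carrying the path built so far.
            yield from flatten(value, path)
        else:
            yield path, value


# Hypothetical symbol table, used only to show the key layout.
symbols = {
    "record": {
        "id": 3,
        "subrecord1": {"subrecord2": {"field": 17}},
    }
}

flat = dict(flatten(symbols))
```

The pairs in flat are what you would feed to whatever tool builds the cdb,
done once, offline, so none of this cost lands on start-up.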
For faster run-time access use a mmap-based implementation
(ask me for one or roll your own; the structure is dead simple, see
http://cr.yp.to/cdb/cdb.txt), which should be almost on par
with other internal structures and the Lua hash.
Especially for a large DB with a small number of lookups it has the big
advantage of examining only a small number of memory/file locations.
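
As a rough illustration of how little a lookup touches, here is a sketch
of a mmap-based reader following the layout in cdb.txt, again in Python
for brevity. This is not my implementation and not luacdb; the names
cdb_hash, write_cdb and Cdb are invented for this example, and write_cdb
exists only so the sketch can build a small file to read. A real version
would be the same few steps in C behind an __index metamethod.

```python
import mmap
import os
import struct
import tempfile


def cdb_hash(key: bytes) -> int:
    """cdb hash: start at 5381, then h = ((h << 5) + h) ^ c, mod 2**32."""
    h = 5381
    for c in key:
        h = (((h << 5) + h) ^ c) & 0xFFFFFFFF
    return h


def write_cdb(path, items):
    """Illustrative writer: records first, then 256 hash tables, then header."""
    buckets = [[] for _ in range(256)]
    with open(path, "wb") as f:
        f.write(b"\0" * 2048)  # placeholder for the 256 (pos, nslots) pointers
        pos = 2048
        for key, data in items.items():
            f.write(struct.pack("<II", len(key), len(data)) + key + data)
            h = cdb_hash(key)
            buckets[h & 255].append((h, pos))
            pos += 8 + len(key) + len(data)
        header = b""
        for entries in buckets:
            nslots = 2 * len(entries)  # half-empty tables keep probes short
            slots = [(0, 0)] * nslots
            for h, p in entries:
                i = (h >> 8) % nslots
                while slots[i][1]:  # linear probing past occupied slots
                    i = (i + 1) % nslots
                slots[i] = (h, p)
            header += struct.pack("<II", pos, nslots)
            for h, p in slots:
                f.write(struct.pack("<II", h, p))
            pos += 8 * nslots
        f.seek(0)
        f.write(header)


class Cdb:
    """Read-only lookups; opening maps the file and reads nothing else."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)

    def get(self, key: bytes):
        h = cdb_hash(key)
        # Touch 1: one of the 256 header pointers.
        table_pos, nslots = struct.unpack_from("<II", self._map, 8 * (h & 255))
        if nslots == 0:
            return None
        start = (h >> 8) % nslots
        for n in range(nslots):
            # Touch 2: a slot in that hash table.
            slot = table_pos + 8 * ((start + n) % nslots)
            slot_hash, rec_pos = struct.unpack_from("<II", self._map, slot)
            if rec_pos == 0:  # empty slot: the key is not in the database
                return None
            if slot_hash == h:
                # Touch 3: the record itself.
                klen, dlen = struct.unpack_from("<II", self._map, rec_pos)
                kstart = rec_pos + 8
                if self._map[kstart:kstart + klen] == key:
                    return self._map[kstart + klen:kstart + klen + dlen]
        return None

    def close(self):
        self._map.close()
        self._file.close()


# Build a tiny database in a temporary directory and query it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "symbols.cdb")
    write_cdb(path, {
        b"record.subrecord1.subrecord2.field": b"17",
        b"record.id": b"3",
    })
    db = Cdb(path)
    found = db.get(b"record.subrecord1.subrecord2.field")
    missing = db.get(b"record.nope")
    db.close()
```

A successful lookup typically reads one header entry, one or two hash
slots and one record, regardless of how large the file is.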
Incorporating the full cdb directly into your code is also
straightforward, if you prefer a bigger executable over
an external file.
HTH
Klaus