- Subject: Re: ANN: Fast String Patch
- From: Mike Pall <mikelu-0506@...>
- Date: Sun, 5 Jun 2005 03:43:30 +0200
Javier Guerra wrote:
> > This is an experimental patch against Lua 5.1-work6 to add support
> > for "fast strings" to the Lua VM.
> how would these fast strings behave as table keys? i'm guessing it'll be
> similar to already-interned strings, am i right?
Sort of. Fast strings need to be hashed only when they are used
as table keys (but not as table values or when they never leave
the stack). Even then, a fast bulk hash can be used because the
size of the tagged value slot is fixed.
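
To illustrate the "fixed slot size" point, here is a rough sketch in C.
It is not code from the patch: the FastStr layout, the 11-byte capacity
and the mixing constants are all assumptions made up for this example.
It only shows why a fixed-size slot permits a constant-time hash that is
computed on demand instead of at creation.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

#define FAST_MAX 11   /* hypothetical capacity; the real patch may differ */

/* Hypothetical fixed-size value slot holding a short string inline. */
typedef struct FastStr {
    uint8_t len;              /* 0..FAST_MAX */
    char bytes[FAST_MAX];     /* unused tail is kept zeroed */
} FastStr;

static_assert(sizeof(FastStr) == 12, "slot must be exactly three words");

/* Build a fast string; returns 0 if the input is too long.
   Note that nothing is hashed here. */
static int fast_make(FastStr *out, const char *s, size_t len) {
    if (len > FAST_MAX) return 0;
    memset(out, 0, sizeof *out);  /* zero padding makes whole-slot hashing valid */
    out->len = (uint8_t)len;
    memcpy(out->bytes, s, len);
    return 1;
}

/* Hash the whole slot as three 32-bit words: fixed work, no per-byte loop.
   Only needed when the value is actually used as a table key. */
static uint32_t fast_hash(const FastStr *f) {
    uint32_t w[3];
    memcpy(w, f, sizeof w);       /* avoids alignment and aliasing problems */
    uint32_t h = w[0] * 2654435761u;
    h = (h ^ w[1]) * 2246822519u;
    h = (h ^ w[2]) * 3266489917u;
    return h ^ (h >> 15);
}
```

Because the padding is always zeroed, two slots holding the same bytes
hash identically, and the cost does not depend on the string length.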
Regular strings are always hashed when they are created, i.e.
both on initial allocation and on any subsequent object
creation from plain C strings (e.g. when being returned from
read() or string.gsub()). However, the hash value does not
need to be recomputed for table indexing.
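
The trade-off for regular strings can be sketched with a toy intern
table in C. This is a simplified illustration, not Lua's actual
implementation: the IStr type, the bucket count, the FNV-1a hash and the
hash_calls counter are all assumptions chosen for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical interned string object: the hash is stored at creation. */
typedef struct IStr {
    struct IStr *next;    /* chain within one bucket */
    uint32_t hash;        /* computed once, in intern() */
    size_t len;
    char data[];          /* string bytes, NUL-terminated */
} IStr;

#define NBUCKETS 64       /* fixed size keeps the sketch short */
static IStr *buckets[NBUCKETS];
static unsigned long hash_calls;  /* counts how often bytes are hashed */

/* FNV-1a over all bytes (a stand-in, not Lua's hash function). */
static uint32_t hash_bytes(const char *s, size_t len) {
    uint32_t h = 2166136261u;
    hash_calls++;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;
    }
    return h;
}

/* Return the unique object for these bytes, creating it if needed.
   Every call hashes the input, even if the string already exists. */
static IStr *intern(const char *s, size_t len) {
    uint32_t h = hash_bytes(s, len);
    IStr **head = &buckets[h % NBUCKETS];
    for (IStr *p = *head; p != NULL; p = p->next)
        if (p->hash == h && p->len == len && memcmp(p->data, s, len) == 0)
            return p;             /* already interned */
    IStr *n = malloc(sizeof *n + len + 1);
    assert(n != NULL);
    n->hash = h;
    n->len = len;
    memcpy(n->data, s, len);
    n->data[len] = '\0';
    n->next = *head;
    *head = n;
    return n;
}

/* Table indexing reuses the stored hash; the bytes are not touched. */
static size_t table_slot(const IStr *key, size_t table_size) {
    return key->hash % table_size;
}
```

Creating the same string twice costs two hash computations but yields
one object, so key comparison becomes a pointer comparison, and
table_slot() never hashes again.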
So fast strings are most useful when you have _many_ _different_
_short_ strings that are either not yet interned or are not
always used as table keys.
Passing around short temporary strings (I/O) and concatenating
or splitting them (text processing, regexps) are good examples.
Lookup tables with a high miss rate on semi-random short strings
(keyword matching) and tables storing semi-random short string
values (natural language words) will definitely benefit.
Plain lookup tables (with strings as keys) with a high hit ratio
(e.g. a small set of protocol commands) won't see any benefit.
Method dispatch tables usually fall into this category, too.
One should probably note that many applications are tuned to
avoid creating many short strings, because this is a well-known
performance bottleneck for most VMs with immutable strings
(standard Lua, Java, Python, ...).
> a related question: on a fetch like table.key, is the 'key'
> string interned at compile time, or runtime?
It's compiled into a constant as part of the function prototype.
Constants are created when the prototype is created. Either
directly by the compiler (when loading a source file) or when
loading (undumping) a pre-compiled chunk.
> what about table["key"]? (with "key" a constant string).
> in Xavante there are several of these queries, and i'd like to know
> if there's a performance advantage in either one.
table.key and table["key"] generate identical code.