Hi,

This is an experimental patch against Lua 5.1-work6 to add support
for "fast strings" to the Lua VM.

This is based on an idea by Rici Lake (briefly mentioned in the
'Lua strings as "atoms"' mailing list thread in May 2005). It was
further discussed and refined on irc:#lua. I decided to do a
prototype implementation to test the validity of the concept.

This patch is intended to gain feedback on the usefulness of fast
strings for Lua applications. Any and all comments are welcome.
Especially needed are benchmark comparisons for real applications
that use many (different) small strings. Thank you!

To avoid any misunderstanding: the patch is *NOT* intended to be
merged into the Lua 5.1 standard distribution, nor would I
recommend doing so (at this time). I leave it entirely up to the
judgement of the Lua authors whether anything from this patch
is worth including in any future Lua version.

Documentation and download is available from the Wiki:
  http://lua-users.org/wiki/FastStringPatch

Here are some excerpts from the documentation:
--------------------------------------------------------------------

The basic idea is to store fast strings (i.e. short strings of
up to 11 or 15 bytes) directly in tagged value slots instead
of allocating them on the heap.

This means that fast strings behave more like numbers in that
they do not need to be allocated, garbage collected or freed.
This significantly reduces VM overhead.
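
As a rough illustration of the idea (not the patch's actual code), a
tagged value slot that stores short strings inline might look like the
sketch below. All names here (Value, FAST_MAX, make_string and so on)
are hypothetical, the 11-byte limit is simply the smaller of the two
sizes mentioned above, and the real Lua value layout differs. The
sketch also compares strings bytewise, which is an assumption made for
brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical inline capacity, taken from the "11 or 15 bytes" above. */
#define FAST_MAX 11

typedef enum { T_NUMBER, T_FASTSTR, T_HEAPSTR } Tag;

/* A simplified tagged value; NOT the real Lua TValue layout. */
typedef struct {
    Tag tag;
    union {
        double num;
        struct { unsigned char len; char buf[FAST_MAX]; } fast; /* inline */
        struct { size_t len; char *ptr; } heap;                 /* malloc'd */
    } u;
} Value;

/* Build a string value: short strings are copied into the slot itself,
 * longer ones are allocated on the heap. */
static Value make_string(const char *s, size_t len) {
    Value v;
    memset(&v, 0, sizeof v);
    if (len <= FAST_MAX) {
        v.tag = T_FASTSTR;
        v.u.fast.len = (unsigned char)len;
        memcpy(v.u.fast.buf, s, len);
    } else {
        v.tag = T_HEAPSTR;
        v.u.heap.len = len;
        v.u.heap.ptr = malloc(len + 1);
        if (v.u.heap.ptr == NULL) abort();
        memcpy(v.u.heap.ptr, s, len);
        v.u.heap.ptr[len] = '\0';
    }
    return v;
}

static size_t string_len(const Value *v) {
    assert(v->tag != T_NUMBER);
    return v->tag == T_FASTSTR ? v->u.fast.len : v->u.heap.len;
}

static const char *string_data(const Value *v) {
    assert(v->tag != T_NUMBER);
    return v->tag == T_FASTSTR ? v->u.fast.buf : v->u.heap.ptr;
}

/* Bytewise equality; works for both representations. */
static int string_eq(const Value *a, const Value *b) {
    size_t n = string_len(a);
    return n == string_len(b) &&
           memcmp(string_data(a), string_data(b), n) == 0;
}

/* Only heap strings own memory; fast strings need no cleanup at all. */
static void free_value(Value *v) {
    if (v->tag == T_HEAPSTR) free(v->u.heap.ptr);
}
```

In this sketch, make_string("hello", 5) never touches the allocator and
free_value() is a no-op for the result, which is the property that
makes fast strings behave like numbers.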


Neither the Lua API nor the Lua/C API has been changed. There
is no need to change anything in your Lua or C modules.

Fast strings behave and look exactly like regular strings outside
the Lua core. You do not need any special provisions for using
them and there is no externally visible difference.


Obviously one can prove almost anything by selecting the proper
benchmark. So I left out some targeted benchmarks like:
  lua -e 'for i in io.lines() do end' </usr/share/dict/words
or excessive use of string.gfind() or string.gsub().

These show tremendous speedups (anywhere from +60% to +200%
depending on the architecture), but probably do not exercise
typical application behaviour.

Conversely, you won't see any difference on standard
benchmarks that do not use short strings extensively (e.g.
Fibonacci numbers or matrix multiplication).

So I've just taken the obvious candidates from the "great
computer language shootout" benchmarks and compared them on
different architectures:

Benchmark   |   x86      x64     PPC32    PPC64
------------+----------------------------------
hash        | +38.2%   +55.8%   +40.9%   +38.7%
spellcheck  | +23.6%   +45.8%   +17.7%   +25.2%
reversefile |  +6.2%   +22.7%   +10.2%   +13.6%
wordfreq    |  +2.2%    +8.3%   +10.6%    +8.6%

Well ... not bad, huh?

But of course it would be best to see some benchmark results
from real applications. Please share any insights you may have
with the mailing list. Thank you!

--------------------------------------------------------------------

Bye,
     Mike