lua-users home
lua-l archive

 On 15/09/10 00:19, David Given wrote:
I'm working on a new (yet another...) programming language, and I'm
about to start working on the string system. I'm looking with interest
at Lua's string handling, as it works very well.

I'm already sold on having immutable strings, as there are lots of
advantages with regard to sharing of string data etc, but what are the
advantages to having atomised strings (i.e., each string has one and
precisely one copy in memory)? Is it to allow strings to be compared for
equality by just comparing their pointers? My language isn't based
around key/value pairs the way Lua is, so that may not be as important
to me; are there any other benefits?

Also, what are the performance characteristics of using atomised strings?
In particular, I'm wondering what the amortised performance of adding a
new string to the system is like; when using atomised strings you need
to do a lookup of the string table to see if the string is there, plus
an optional memory allocation if it turns out that it's not. A
non-atomised implementation doesn't do the lookup but has to always do
the memory allocation.

What hash function does Lua use for strings

Having only one copy of each string is called string interning. In Lua I assume (I haven't looked!) that, as well as making string comparison a pointer comparison, it means that looking up a string key in a table requires only a modulus of the atom (the string's address), rather than a hash of the whole string content, to find the mainposition.
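
To make that concrete, here is a rough sketch of an intern table in Python. It is not taken from Lua's source: the names Atom, InternTable and intern are made up for illustration, and Lua's C implementation may differ in detail (for instance, it could equally cache the content hash inside the string object and reuse that instead of the address). It only shows the cost model described above: one content lookup when a string enters the system, an allocation only if the string is new, and identity-based comparison and table lookup afterwards.

```python
class Atom:
    """Hypothetical interned string. It defines no __eq__ or __hash__,
    so Python falls back to identity for both, which plays the role of
    comparing and hashing the pointer."""
    __slots__ = ("text",)

    def __init__(self, text):
        self.text = text


class InternTable:
    """Toy string table: at most one Atom per distinct content."""

    def __init__(self):
        self._atoms = {}       # content -> its single Atom
        self.allocations = 0   # number of Atoms actually created

    def intern(self, text):
        atom = self._atoms.get(text)   # lookup by content
        if atom is None:               # only new content costs an allocation
            atom = Atom(text)
            self._atoms[text] = atom
            self.allocations += 1
        return atom


strings = InternTable()
a = strings.intern("hello")
b = strings.intern("".join(["hel", "lo"]))  # same content, built separately
c = strings.intern("world")

# Equal content gives the very same atom, so equality is an identity check.
assert a is b
assert a is not c
assert strings.allocations == 2

# A table keyed by atoms never needs to look at the characters again.
fields = {a: 1, c: 2}
assert fields[b] == 1
```

The trade-off is visible in intern: every new string pays for a lookup by content up front, but repeated strings skip the allocation and every later comparison or key lookup avoids touching the characters.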