> Besides reserved words, Lua also creates the metamethod names (__add,
> etc.). In all, there would be ~50 strings (22 reserved words + 24
> metamethods + _ENV + "not enough memory") in bare Lua.
Ah yes, I forgot about those!
>> Out of interest why does string data need to be strongly aligned? I
>> can't think of a situation offhand where it would ever be needed?
> Because you can store binary data in strings. E.g., you could move
> an entire struct to a string and then type-cast the string back
> to the struct. But of course the idea would be not to waste space
> with that.
Hmm, I suppose you could do that with the Lua C API. It wouldn't have occurred to me that anyone would want to, though, given that's what userdata is for! I guess it's one of those situations where I'd happily restrict it if it were just for me, but in base Lua you can't risk making such assumptions.
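To make the alignment point concrete, here is a minimal sketch in plain C (no Lua headers, so it compiles on its own). The Sample struct and the sample_* helpers are made up for illustration; in real code the bytes would go in through lua_pushlstring and come back out of lua_tolstring. Copying the bytes out with memcpy works whatever the alignment; only the direct cast back to the struct needs the string data to be suitably aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct, purely for illustration. */
typedef struct Sample {
  double value;
  int32_t id;
} Sample;

/* Pack a struct into a raw byte buffer, as one might before handing
   the bytes to lua_pushlstring(). */
static void sample_pack(const Sample *s, unsigned char *buf) {
  memcpy(buf, s, sizeof *s);
}

/* Safe for any buffer alignment: copy the bytes back into a struct. */
static Sample sample_unpack(const unsigned char *buf) {
  Sample s;
  memcpy(&s, buf, sizeof s);
  return s;
}

/* Nonzero if casting buf directly to (const Sample *) would give a
   correctly aligned pointer. */
static int sample_cast_is_aligned(const void *buf) {
  return ((uintptr_t)buf % _Alignof(Sample)) == 0;
}
```

With weaker alignment for string data, sample_unpack would still work, but any code relying on the direct cast could break on targets that fault on misaligned access.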
> For small strings, we want to internalize them so that we can use
> simple pointer equality when comparing strings (good for a fast
> hashing). Moreover, the internal string would need 4 bytes anyway (for a
> pointer to the external string), so the gains would not be too big.
I was thinking you'd still internalise them into a struct similar to TString and keep the pointer-comparison and fast-hashing benefits, but that this TShortString would shrink the len field to a single byte (there's one spare byte between extra and hash, if I've understood the struct layout right). That way you could fit everything, including the 4-byte pointer to the actual data, into 16 bytes. Short strings would be limited to 255 chars or so, but I don't suppose too many people change the current max length?
It's not a huge gain, it's true, but 8 bytes multiplied by 100 or 200 strings is pretty good: maybe 1 or 2 percent of the entire memory footprint of a barebones Lua runtime, depending on how barebones you make it.
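As a rough sketch of the layout I have in mind, assuming a 32-bit target: TShortString is hypothetical, the field names are modelled on my reading of TString, and uint32_t stands in for a 4-byte pointer so that the size checks hold on any host. The bytes_saved helper is only there to spell out the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical short-string header for a 32-bit target.
   uint32_t models a 4-byte pointer; this is not Lua's actual struct. */
typedef struct TShortString {
  uint32_t next;    /* GC list link (a pointer on the target) */
  uint8_t  tt;      /* type tag */
  uint8_t  marked;  /* GC mark bits */
  uint8_t  extra;   /* reserved-word index */
  uint8_t  len;     /* length 0..255, stored in the spare byte */
  uint32_t hash;    /* string hash */
  uint32_t data;    /* pointer to the characters, possibly in ROM */
} TShortString;

/* Compile-time checks on the assumed layout. */
static_assert(sizeof(TShortString) == 16, "header should fit in 16 bytes");
static_assert(offsetof(TShortString, hash) == 8, "no padding before hash");

/* Total bytes saved if each of n strings shrinks by per_string bytes. */
static size_t bytes_saved(size_t n, size_t per_string) {
  return n * per_string;
}
```

With the figures above, bytes_saved(100, 8) is 800 and bytes_saved(200, 8) is 1600, so somewhere between roughly 0.8 and 1.6 KB.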
> For large strings, however, we are considering a good API to allow the use
> of external strings.
Interesting. It's not quite the same problem, but I'd be interested to know what you decide. In my situation, though, it's very much about the potential for optimising literals, large or small, because code is executed in place from ROM and thus constdata literals don't have any impact on RAM usage.