on 9/21/06 1:28 PM, Rici Lake at lua@ricilake.net wrote:

> My problem with operating over integers is that it limits the size of
> a bit sequence to a smallish and possibly too small value (it might
> be 23, for example) without providing any alternative if that value
> is too small for the application. So I'd vote for strings, whose
> size is more or less unlimited and not affected by an unrelated
> compile-time configuration (the definition of what a lua_Number is).

Together with the short-string optimization of putting short strings into
TValue structures, this might be reasonably viable.

Of course, putting short strings into TValue structures, while legal within
the existing Lua API, is almost certain to reveal bugs in code that removes
strings from the stack while counting on the pointers to remain valid. I'm
sure we've got some code like that. Perhaps the easiest way to deal with
that from a programming API standpoint -- though it still takes code
cleanup -- is to provide a version of lua_tostring that takes a
short-string-sized buffer into which it can optionally copy a short string
if it so chooses. (Of course, then someone will start depending on it doing
the copy.) The reason for doing this is that so long as Lua has a non-moving
GC -- which doesn't seem likely to change without other C API changes as
well -- and the string is still referenced from the table, one can do
something like the following perfectly safely:

        lua_getfield( L, tindex, "key" );
        const char* s = lua_tostring( L, -1 );
        lua_pop( L, 1 );

And we need an equivalent for short strings such as:

        lua_getfield( L, tindex, "key" );
        lua_ShortStringBuffer buffer;
        const char* s = lua_tostringwithbuffer( L, -1, &buffer );
        lua_pop( L, 1 );
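
To make the idea concrete, here is a rough toy model of it in plain C. It
does not use Lua at all: ToyValue, ToyShortStringBuffer, toy_make and
toy_tostring_with_buffer are all hypothetical names invented for this sketch,
and the 7-byte inline capacity is an arbitrary assumption. It only shows why
the caller-supplied buffer keeps the result alive after the slot is reused.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical names, not Lua's real internals. */
enum { TOY_SHORT_MAX = 7 };  /* assumed inline capacity, excluding the NUL */

typedef struct {
    int is_short;                            /* 1: bytes live inside the value */
    union {
        char inline_buf[TOY_SHORT_MAX + 1];  /* short string stored in the slot */
        const char *heap;                    /* long string, stands in for GC-owned memory */
    } u;
} ToyValue;

/* Caller-owned storage, large enough for any short string. */
typedef struct {
    char data[TOY_SHORT_MAX + 1];
} ToyShortStringBuffer;

/* Build a value; long strings are only referenced, never copied. */
static ToyValue toy_make(const char *s)
{
    ToyValue v;
    memset(&v, 0, sizeof v);
    if (strlen(s) <= TOY_SHORT_MAX) {
        v.is_short = 1;
        strcpy(v.u.inline_buf, s);
    } else {
        v.u.heap = s;
    }
    return v;
}

/* Return a pointer that stays valid after the slot is overwritten:
   short strings are copied into buf, long strings are returned as-is. */
static const char *toy_tostring_with_buffer(const ToyValue *v,
                                            ToyShortStringBuffer *buf)
{
    assert(v != NULL && buf != NULL);
    if (v->is_short) {
        memcpy(buf->data, v->u.inline_buf, sizeof buf->data);
        return buf->data;
    }
    return v->u.heap;
}
```

In this model, a pointer taken directly into inline_buf would go stale as
soon as the slot is reused, which is the toy equivalent of popping the value;
the pointer into the caller's buffer does not.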

Though even with that, it becomes impossible to write something like the
following, which fetches values out of tables and hands back a bare string
pointer with no buffer behind it:

        getfieldsf( L, tindex, "{ key = %s, number = %f, integer = %d }",
            &myStringPtr,
            &myNumber,
            &myInteger );

So, on the one hand, I think short strings are useful for a variety of
things -- I could see them making a noticeable improvement in GC speed --
and with respect to this thread they do a lot to enable a general purpose
bitset library. On the other hand, shifting the implementation to use short
strings makes the C API noticeably harder to code against in practice, even
if the spec essentially allows it.

Mark