- Subject: Re: Suggestion : Use unique string type instead of two (short and long)
- From: Roberto Ierusalimschy <roberto@...>
- Date: Mon, 17 Jun 2019 16:42:42 -0300
> My idea is to use a fixed-size hash cache. For example, say there are 10 slots.
>
> During the mark stage, we put the highest id and lowest id into the slots.
>
> For example
>
> high: 80 81 92 93 74 95 86 97 98 89
> low: 00 11 02 03 14 05 16 27 28 09
>
> It means the highest ids still in use are 89-98, and 01, 04, and 06-08 can be reused. We can always find the highest id still in use, and a few unused ids.
>
> That way, we can recycle a few ids in each GC cycle.
I don't think recycling "a few" ids each GC cycle is enough to keep
the ids from growing forever. After these few are reused, we need more
new ids for new strings. We cannot run a new GC cycle every time a few
new strings are created.
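
To make the proposed bookkeeping concrete, here is a minimal Python sketch of the slot table. It assumes slots are indexed by `id % 10` (the numbers in the example are consistent with that, but it is an assumption) and uses a made-up set of live ids; `mark_slots` and `reusable_ids` are hypothetical names. It reproduces the high/low rows from the example and shows that, in this reading, at most one id per slot is recovered per cycle.

```python
N_SLOTS = 10  # slot count taken from the quoted example

def mark_slots(live_ids, n_slots=N_SLOTS):
    """Record the highest and lowest live id seen in each slot (id % n_slots)."""
    high = [None] * n_slots
    low = [None] * n_slots
    for i in live_ids:
        s = i % n_slots
        high[s] = i if high[s] is None else max(high[s], i)
        low[s] = i if low[s] is None else min(low[s], i)
    return high, low

def reusable_ids(low):
    """One candidate per slot: the slot's smallest possible id, if no live string uses it."""
    return [s for s, lowest in enumerate(low) if lowest != s]

# Hypothetical live set, chosen only so that it matches the example's rows.
live = [0, 80, 11, 81, 2, 92, 3, 93, 14, 74, 5, 95, 16, 86, 27, 97, 28, 98, 9, 89]
high, low = mark_slots(live)
print(high)
print(low)
print(reusable_ids(low))
```

With ten slots, a cycle hands back no more than ten ids, however many strings are created before the next cycle.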
> And then, we can remap 92, 93, 95, 97, 98 to 01, 04, 06, 07, 08 during the sweep stage.
I don't think we can do that. We remap one string from 92 to 01, but
there may be other equal strings also numbered 92, which were not
changed yet. (Remember that the collector is incremental.) Then, we
compare this first string with one still with the original number, and it
will go back to 92.
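
The race can be sketched in a few lines of Python. The merge rule used here (on a successful content comparison, the first operand adopts the second operand's id) is an assumption picked only to reproduce the behaviour described above; `Str` and `str_equal` are hypothetical names.

```python
class Str:
    """Toy string object: content plus a numeric id used as a comparison shortcut."""
    def __init__(self, text, sid):
        self.text = text
        self.sid = sid

def str_equal(a, b):
    """Compare by id first; on a content match, unify ids (ASSUMED rule: a adopts b's id)."""
    if a.sid == b.sid:
        return True
    if a.text == b.text:
        a.sid = b.sid
        return True
    return False

# Two equal strings that both carry id 92 before the sweep.
s1 = Str("hello", 92)
s2 = Str("hello", 92)

s1.sid = 1  # the incremental sweep has remapped s1 but has not reached s2 yet
assert str_equal(s1, s2)
print(s1.sid, s2.sid)  # prints "92 92": the remap of s1 has been undone
```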
> For the length, we can still store 1 byte for a small string's length; if the string is longer than 255 bytes, store 0xff and an additional size_t.
But then we are back with two kinds of strings again. Several of the
gains you suggested (e.g., a faster lua_getfield) go away.
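
A minimal Python sketch of the proposed length encoding shows where the second kind of string comes back. It assumes an 8-byte little-endian field as a stand-in for `size_t`, and it treats 0xff as a reserved marker, so only lengths up to 254 fit in the single byte (the quoted message leaves the 255 case open). `encode_len` and `decode_len` are hypothetical names.

```python
import struct

LONG_MARKER = 0xFF  # reserved byte value meaning "the real length follows"

def encode_len(n):
    """Encode a length as 1 byte, or as 0xff plus an 8-byte field (stand-in for size_t)."""
    if n < LONG_MARKER:
        return bytes([n])
    return bytes([LONG_MARKER]) + struct.pack("<Q", n)

def decode_len(buf):
    """Return (length, header_size). This branch is the short/long split again."""
    if buf[0] != LONG_MARKER:
        return buf[0], 1
    (n,) = struct.unpack_from("<Q", buf, 1)
    return n, 9

# Round-trip a few lengths on both sides of the boundary.
for n in (10, 254, 255, 100000):
    header = encode_len(n)
    assert decode_len(header) == (n, len(header))
```

Every reader of the length has to take that branch, so code that wants the length still distinguishes two layouts.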
-- Roberto