[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: Lua 5.4-work1 with first class arrays (implementation & benchmarks)
- From: Petri Häkkinen <petrih3@...>
- Date: Wed, 28 Mar 2018 08:00:18 +0300
Thanks to all for the answers! My responses inlined below:
On 28 Mar 2018, at 0.02, Dibyendu Majumdar <mobile@majumdar.org.uk> wrote:
> The table as the sole data structure is a
> defining feature of Lua; not only that, the semantics associated with
> table iteration and nils, makes it impossible to introduce such a
> subtype without break compatibility with existing programs.
Tables are a defining feature of Lua... until it is no more so. Similarly single numeric type was a defining feature of Lua, until integers were added. Mind you, “arrays” are still internally tables, just without the hash part and with an explicit length. Conceptually the leap from tables to arrays is small.
I think I've shown how an array subtype can be added without breaking compatibility with existing programs. Can you show how the patch breaks an existing program, please?
On 28 Mar 2018, at 0.56, Duane Leslie <parakleta@darkreality.org> wrote:
> The issue of length is a quirk of the language but I can't see any generic way to keep a more accurate version without cost
The original email outlined how this would work. The patch implements # in constant time for arrays (note that this does not negatively affect the performance of tables — see benchmarks). Once you have arrays you no longer need # operator for tables. So # for tables could be first deprecated and ultimately removed. At this point you have # which works for arrays, is extremely fast, and does not have issues with holes. Problem solved!
(Naturally ‘#’ could still be implemented using metatables for tables, if so desired.)
Also, I think you are underestimating the value of a fast # operator. For example, a benchmark doing "t[#t+1] = value" runs about 5 times faster with arrays and constant time length operator. And the # operator alone runs about 16 times faster on arrays (on a big dataset; the speed of ‘#’ for tables varies depending on the table size) . That’s a sizable speed difference that should show on real programs, not just benchmarks.
On 28 Mar 2018, at 0.56, Doug Currie <doug.currie@gmail.com> wrote:
> The drawback I noticed was increasing the size of the Table structure. Since the hash part is not used,
> perhaps there's a way to share the hashtable control fields and "Sizeused" to avoid growing the structure?
> I suspect the assessment of the impact of sharing fields on code complexity and performance may be a big job.
In practice, the impact of increasing the size of Table data structure by 4 or 8 bytes (depending on 32/64 bitness of the app) was discussed in the article. The benchmarks show that it does not negatively affect performance and the increase in memory consumption is very small. For example, a single element in a table needs two TValues for storage (key & value), i.e. 32 bytes per key-value. So the added "sizeused" field increases memory consumption as much as a 1/4th of a key-value.
To get high performance "sizeused" needs to be a separate field in Table. Hashing it into other fields would be almost impossible without destroying the performance, I’m afraid.
Petri