Hey Soni!

Ah, yes... I was led astray by your comment[1]. Heh. I was just prototyping so I could profile different implementations, so it's not a big deal to throw it all away. I'm kinda sad to dump it though, because it actually is a tiny bit faster on vanilla Lua, and I think the code I ended up writing is really pretty. My language does support stack objects as a datatype, so perhaps I'll normalize the two implementations and pick the best one at runtime? Maybe not worth the hassle...

I just tested a pure varargs version of the core stack routines, and they are basically an order of magnitude faster than the next fastest contender under both Lua and LuaJIT, so I'm happy there. Also, I have a few ideas for how to adjust my threading model to fit this new stack pattern that I'm pretty excited about. I think the end result is going to be much smaller, simpler, more elegant, and (bonus!) way faster to boot. I'll check in here with progress if anyone is interested.
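For anyone following along, here is a minimal sketch of what "pure varargs" stack routines can look like. It is not the actual Firth code; the word names below are made up for illustration, and the only assumption is that the data stack is carried as the argument list, top of stack first.

```lua
-- Hypothetical stack words: the stack is the argument list, top of stack first.
-- Each word takes the current stack and returns the new one.
local function push(v, ...) return v, ... end        -- ( -- v )
local function dup(a, ...)  return a, a, ... end     -- ( a -- a a )
local function swap(a, b, ...) return b, a, ... end  -- ( b a -- a b )
local function drop(_, ...) return ... end           -- ( a -- )
local function add(a, b, ...) return b + a, ... end  -- ( b a -- b+a )

-- Forth-style "2 3 + dup", written as nested calls.
local x, y = dup(add(push(3, push(2))))
print(x, y)
```

No table is allocated and nothing is indexed; every word is just a function call that shuffles its arguments.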

The takeaway here seems to be: if you're writing a Forth-like language in Lua, don't think like a C++ programmer!


[1] https://github.com/IonoclastBrigham/firth/issues/1#issuecomment-98295913

On Thu, May 7, 2015 at 3:46 PM, Soni L. <fakedme@gmail.com> wrote:


On 07/05/15 12:38 AM, Brigham Toskin wrote:
Greetings.

I apologize if this is kinda long; I'll try to compress. First, my situation:

I've been working on a stack-based language, implemented in Lua. Being a C++ programmer, my first impulse was to wrap an array table with some ADT metamethods. After some futzing, my stack is relatively fast, but fairly complicated for what it ultimately is: a LIFO.

Recently, someone pointed out that you could do it with much less code by wrapping Lua's call stack in a coroutine, manipulating and returning values in response to external inputs. The simplicity of the design and the realization that the Lua devs must have implemented a much faster stack than I ever could led me to explore this space. To my surprise, my first prototype was actually 50% slower than the original, under Lua 5.2.3. After thinking about what was going on and profiling several iterations, I have what is still a simple and (I think) very clean and elegant solution, utilizing a handful of mutually-tail-recursive continuations inside a coroutine, and it's about 13% faster than the ADT!
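To make the idea concrete, here is a rough sketch of a coroutine-backed stack built from mutually tail-recursive functions. It is an illustration only, not the prototype described above; the names `await`, `handlers`, and `new_stack`, and the string commands, are all hypothetical.

```lua
-- Hypothetical sketch: the stack contents live in the coroutine's argument
-- list (top first) and are threaded through tail calls between yields.
local yield = coroutine.yield
local handlers = {}

-- Hand `result` back to the caller, wait for the next command, then
-- tail-call its handler with the current stack.
local function await(result, ...)
  local op, arg = yield(result)
  return handlers[op](arg, ...)
end

function handlers.push(v, ...)      return await(nil, v, ...)   end
function handlers.pop(_, top, ...)  return await(top, ...)      end
function handlers.peek(_, top, ...) return await(top, top, ...) end

local function new_stack()
  local co = coroutine.wrap(function() return await(nil) end)
  co()  -- run up to the first yield so the stack is ready for commands
  return co
end

local s = new_stack()
s("push", 1)
s("push", 2)
local first = s("pop")
local second = s("pop")
print(first, second)
```

Because every handler ends in a tail call, the number of active call frames inside the coroutine stays constant no matter how many commands are processed; only the argument list grows and shrinks.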

Sounds like a win, right? If I run the tests in LuaJIT (2.0.2 or 2.0.3), the newest prototype is even five times faster than under vanilla Lua. But, it's two orders of magnitude slower than the jit'ed ADT-style code. Now, this is still an improvement over the first prototype, which was *three* orders of magnitude slower than the jit'ed ADT code, but it still ain't great. I very strongly suspect (after looking at the -jdump) that the heavy use of switching coroutine contexts is foiling the compiler's ability to trace (and thus, optimize) the code, and I don't see a fix.

The very specific question: Do we see a workaround, optimization, or perhaps an alternative implementation, which circumvents what I think is a limitation of how LuaJIT analyzes Lua code? I can provide github links to different versions of my code, if anyone thinks it will help, but I'm pretty sure "it's a coroutine" is a good starting place.

The more general question: Where do we draw the line between writing simple code, and performance? Or phrased another way, how slow is too slow, for the sake of an elegant design? When I optimized the ADT code, it got uglier and more complex. When I optimized the coroutine prototype, it got simpler and more elegant.

--
Brigham Toskin

Looks like you're talking about me...

First of all you should AVOID AT ALL COSTS using coroutines in Lua(JIT): they're slow. As can be seen here[1], I don't use them, so you too can avoid them.

Second, there's no arguing against KISS. This is how I call Lua functions[2]. To make a wrapper around a Lua function you just write `function(word, i, arg1, arg2, arg3, etc, ...) local v1, v2, v3 = f(arg1, arg2, arg3, etc) return word, i, v1, v2, v3, ... end`, and that's it!
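As a concrete instance of that pattern, here is a small sketch for the fixed case of a two-argument, one-result function. The helper name `wrap2` is hypothetical and is not taken from VM.lua; `word` and `i` stand for the interpreter state that is passed through untouched.

```lua
-- Hypothetical helper: turn a plain 2-argument, 1-result Lua function into a
-- word that consumes two stack items and leaves one, passing `word` and `i`
-- (the interpreter state) straight through.
local function wrap2(f)
  return function(word, i, a, b, ...)
    -- f(a, b) sits in the middle of the list, so it is truncated to 1 value
    return word, i, f(a, b), ...
  end
end

local add = wrap2(function(a, b) return a + b end)

-- State is ("+", 1), stack is 2 3 4 (top first); the result has 5 on top of 4.
print(add("+", 1, 2, 3, 4))
```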

Third... avoid `select(var, ...)`, LuaJIT doesn't like it.
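A small sketch of what avoiding it can look like: name the leading arguments instead of indexing into the varargs. Both functions below are hypothetical examples that return the third item of the stack; whether the second form is actually faster on a given LuaJIT version is something to measure, not something this snippet shows.

```lua
-- With select: index into the varargs at runtime.
local function third_select(...)
  return (select(3, ...))  -- parentheses truncate to a single value
end

-- Without select: bind the leading items to named parameters.
local function third_named(a, b, c)
  return c
end

print(third_select(10, 20, 30, 40), third_named(10, 20, 30, 40))
```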

[1]: https://github.com/SoniEx2/Stuff/blob/master/lua/Forth/VM.lua#L3-L21
[2]: https://github.com/SoniEx2/Stuff/blob/master/lua/Forth/VM.lua#L13

--
Disclaimer: these emails are public and can be accessed from <TODO: get a non-DHCP IP and put it here>. If you do not agree with this, DO NOT REPLY.





--
Brigham Toskin