Hello,

I'm new to Lua and (like many before) am grappling with understanding
what can and can't be done in the language. My background: been
programming in C since the late 80's and involved in open source since
the '90's. I may not be good, but I'm old :)

I have a couple of initial observations, plus an idea I'd like to
throw out there... apologies in advance for the long post. Hopefully
this has a logical flow to it.

OBSERVATIONS

I could be misunderstanding things below - if so please correct me.

First, Lua is designed for embedability and for that reason it adheres
to strict ANSI C. This is a laudable goal but it's also necessarily
pretty restrictive, especially for "plain" Lua (i.e., Lua without any
other assumptions beyond ANSI C).

The most profound restriction is the limitation to a single C call
stack (you can't assume pthreads), and so Lua must longjmp() every
time there is a coroutine context switch. In other words, the stack
frames of any C functions in the call stack of a coroutine have to be
tossed when that coroutine is switched out.

For "plain" Lua, restriction to ANSI C creates these problems:

    Problem #1. Lua is single-threaded and can't take advantage of
multi-core machines
    Problem #2. If Lua invokes any blocking C function, then the
entire VM blocks (all coroutines). In other words, coroutine blocking
is serialized.
    Problem #3. C functions that yield must be written so they return
immediately and provide a resume() callback (relatively minor
inconvenience)

and these advantages:

    Advantage #1. Lua is extremely portable
    Advantage #2. Lua coroutines can pretend all code that runs
between two yield()'s executes atomically

Problem #1 speaks for itself and is easy to understand. You can
address this by creating multiple Lua contexts, but then you have to
handle the resulting isolation: there are no shared globals or
upvalues, and so all communication between contexts requires
serialization of some kind.

Within a single Lua context, it's not possible to address Problem #1
in plain Lua. But even beyond plain Lua, it's hard to imagine how you
could address this, while also preserving Advantage #2, without adding
a bunch of locking overhead or creating a weird hybrid.

In any case, let's ignore Problem #1 for now.

As for Problem #3, it's more of an inconvenience than a major problem.

Let's talk about Problem #2!

In my view, Problem #2 is the most serious problem. It eliminates a
huge class of applications, namely, any application supporting
simultaneous blocking operations. For example, a web server! I'm
assuming here that a web server that serializes every blocking
I/O call - which is all you can do in plain Lua - does not count as a
real web server :)

For many applications (e.g., web servers) this is too restrictive, and
so people become willing to compromise on Advantage #1 (i.e.,
sacrifice portability) in order to get a solution to Problem #2.

For the sake of argument, let's assume we are willing to make that trade-off...

I'm still trying to sort through all the ways people have tried to
address Problem #2.

But let's take the "luv" module (event library based on non-blocking
file descriptors) as an example of a reasonable and popular solution.

OK so far so good...  BUT - there's still a larger problem with "luv"
or any other solution to Problem #2 I've run across so far: none of
them are COMPOSABLE with other, unrelated Lua modules.

Here's what I mean: take "luv" for example. It only works if all file
descriptors are set to non-blocking mode. Now suppose you pull some
random mysql module off the shelf that communicates over a network
socket to your SQL server. And suppose that mysql library uses
LuaSocket to create a network connection to the database.

Unless that mysql module is a priori designed to work with "luv", as
soon as you start using it, it will make some blocking call, and your
entire application will block - oops, here comes Problem #2 again.

Even if you could somehow access the socket that the mysql module uses
and set it to non-blocking mode, that still wouldn't work because the
mysql module wouldn't know what to do with the new EWOULDBLOCK error
code it would get back.
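
To make the EWOULDBLOCK point concrete, here is a minimal C sketch. It is not taken from luv or from any real mysql module, and the helper name read_would_block() is hypothetical. It puts the read end of an empty pipe into non-blocking mode and shows that read() then fails right away with EAGAIN/EWOULDBLOCK instead of waiting, which is the return value a module written for blocking sockets has no code path for.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (illustration only).
 * Returns 1 if read() on an empty non-blocking pipe fails with
 * EAGAIN/EWOULDBLOCK, 0 if it behaves otherwise, -1 on setup failure. */
int read_would_block(void)
{
    int fds[2];
    char buf[16];

    if (pipe(fds) != 0)
        return -1;

    /* Switch the read end to non-blocking mode, the way an event
     * library would before handing the descriptor to its loop. */
    int flags = fcntl(fds[0], F_GETFL, 0);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    assert(fcntl(fds[0], F_GETFL, 0) & O_NONBLOCK);

    /* Nothing has been written yet. A blocking read() would wait here;
     * the non-blocking one returns -1 immediately. */
    errno = 0;
    ssize_t n = read(fds[0], buf, sizeof buf);
    int saved_errno = errno;

    close(fds[0]);
    close(fds[1]);

    return (n == -1 && (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK))
               ? 1 : 0;
}
```

A module that only distinguishes "got data" from "connection failed" would treat this -1 as a fatal error, so flipping the descriptor to non-blocking behind its back does not help.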

In other words, there exist "solutions" to Problem #2, but if you use
them, then EVERY other module you use that makes blocking calls needs
to also be designed to work with that solution, or else you haven't
really solved the problem.

Another example: suppose you want your application to fetch an HTTPS
URL like "https://foobar.com/" without blocking the entire VM, using
some simple "request" object API. Then you have to find a module that
contains ALL THREE of (a) non-blocking I/O, (b) HTTP client support,
and (c) SSL support... or else you have to find three separate modules
for (a), (b), and (c) that were all written to work together
(unlikely).

In other words, because there is no standard solution to Problem #2,
you can't compose arbitrary modules and solve Problem #2 at the same
time.

In my opinion this problem is actually the biggest problem with the
Lua "ecosystem". As evidence look at the discussion going on in the
"batteries" thread right now.

OK, end of observations.

PROPOSAL

I tried to think about whether there is a way to address Problem #2
directly and transparently, i.e., in a way that doesn't require other
modules to know anything about it. So that, using the above example,
if you started using some random mysql module that makes blocking
network calls, it wouldn't lock up the entire VM while it waits.

How could you do this? Here was my first idea:

    1. Give each coroutine its own pthread and C stack.
    2. Disable preemption (i.e., context switch only on blocking call
or yield())
    3. Disable parallel execution to preserve Advantage #2 (e.g., lock
all pthreads to one CPU)

The key here is that the combination of #2 and #3 means only one
coroutine runs at a time, and it runs without being preempted until it
either (a) yields, or (b) blocks. In effect, a blocking call is
treated like a yield(), with the corresponding resume() occurring when
the file descriptor becomes readable (or whatever).

Can this be done? Here's how you can actually do this on Linux:

    1. You get separate C stacks by using a separate pthread for each
coroutine, or using setcontext() etc.
    2. You can disable preemption via pthread_attr_setschedpolicy(3)
with SCHED_FIFO
    3. You can lock all pthreads to one CPU using sched_setaffinity(2)
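
As a rough illustration of the setcontext() route for step 1, here is a small C sketch assuming Linux/glibc and <ucontext.h>. All names in it (task, run_tasks, yield_to_main, the trace buffer) are hypothetical. Each task gets its own C stack, and control only changes hands at an explicit yield, so there is no preemption and no parallelism by construction. It does not show how a blocking call would be turned into a yield; that part would still need a scheduler watching file descriptors.

```c
#include <assert.h>
#include <string.h>
#include <ucontext.h>

#define NTASKS     2
#define STACK_SIZE (64 * 1024)

static ucontext_t main_ctx, task_ctx[NTASKS];
static char stacks[NTASKS][STACK_SIZE];  /* one private C stack per task */
static char trace[32];                   /* records the order code ran in */
static int current;                      /* index of the running task */

static void record(char c)
{
    size_t n = strlen(trace);
    assert(n + 1 < sizeof trace);
    trace[n] = c;
    trace[n + 1] = '\0';
}

/* Cooperative yield: save this task's context, resume the scheduler. */
static void yield_to_main(void)
{
    swapcontext(&task_ctx[current], &main_ctx);
}

static void task(int id)
{
    char tag = (char)('A' + id);   /* lives on this task's own stack */
    record(tag);
    yield_to_main();               /* the C frame survives the switch */
    record((char)(tag - 'A' + 'a'));
}

/* Round-robin scheduler: resumes each task twice and returns the trace. */
const char *run_tasks(void)
{
    trace[0] = '\0';
    for (int i = 0; i < NTASKS; i++) {
        getcontext(&task_ctx[i]);
        task_ctx[i].uc_stack.ss_sp = stacks[i];
        task_ctx[i].uc_stack.ss_size = STACK_SIZE;
        task_ctx[i].uc_link = &main_ctx;  /* return here when task ends */
        makecontext(&task_ctx[i], (void (*)(void))task, 1, i);
    }
    for (int round = 0; round < 2; round++) {
        for (int i = 0; i < NTASKS; i++) {
            current = i;
            swapcontext(&main_ctx, &task_ctx[i]);
        }
    }
    return trace;
}
```

run_tasks() returns "ABab": each task runs up to its yield, then later resumes with its local variable intact, which is the property a longjmp()-based switch gives up for C frames.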

Unfortunately on Linux step #2 requires root permission, and moreover
makes your process "real-time" and therefore a CPU hog.

More generally, the pthreads specification provides no portable way to
simply turn off preemption. So that is a major problem with this idea.

Too bad. But let's complete the thought experiment...

This type of solution preserves Advantage #2 (and provides a simple
solution to Problem #3) with no additional locking. There is only one
small downside compared to "plain" Lua: blocking calls now behave as
if they could yield() internally. For example, this code could behave
differently:

    counter = 1;
    function mycoroutine(client)
        print(counter)
        print(counter)     -- counter must be the same here
        client:receive()   -- this is a blocking call
        print(counter)     -- counter could have changed here!
    end

In plain Lua the counter could not have changed in the last print() statement.

But having blocking calls appear to yield() is a very natural change -
and in any case the same thing is already happening with "luv" and all
other solutions to Problem #2, which suggests that it won't be a big
problem.

However, compared to those existing solutions, this solution has the
key advantage that it doesn't break composability.

Now, back to the issue of how to implement it... is there any hope?

In other words, let's summarize the situation:

  1. Problem #2 is a major problem with Lua that everyone wants to solve
  2. The existing solutions work but they are not composable with
arbitrary other Lua modules
  3. We could make them composable if we could do the following:
      (a) Give coroutines (or at least those that need to block) their
own C stacks
      (b) Disable preemption
      (c) Disable parallel execution

This would have to be done in such a way that existing C modules could
be adapted with minimal change, AND so that the same C module code
could still work on "plain" Lua as before.

Since we can't do it with pthreads, what about GNU pth? It is
specifically designed for non-preemptive multitasking and gives you
3(a), 3(b), and 3(c) by design. Moreover, pth is very portable and
stable (I used it back in the 90's).

(If people barf on the GPL, it would be easy to write the same thing
from scratch under a looser license.)

What would be required of existing C modules to support this? => Any
blocking calls would have to be changed so they invoke pth's version.
For example, read() becomes pth_read(), sleep() becomes pth_sleep(),
etc.

Does this mean C modules would end up being littered with #ifdef's and
incompatible with "plain" Lua?

No. Instead, this addition to the Lua API would define a Lua wrapper
function for each blocking function.

So the C module would change read() into lua_read(), sleep() into
lua_sleep(), etc., after including some appropriate Lua header.

But these wrapper functions could work either way: the same C module
would compile and work on either "plain" Lua or on this new, optional
"nonblock" version of Lua: on platforms supporting "nonblock" the
wrappers would redirect to non-blocking versions; on "plain" Lua
platforms, the wrappers would just fall back to the original versions
- so you get exactly what you did before.
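
Here is a sketch of what such a wrapper header might look like, written out inline so the example is self-contained. The build flag LUA_NONBLOCK and the helper echo_through_pipe() are hypothetical; lua_read(), lua_write() and lua_sleep() are the proposed names, not part of any existing Lua API. The pth_read(), pth_write() and pth_sleep() calls are the GNU pth counterparts, and the sketch assumes the runtime has already called pth_init() in the "nonblock" build.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* --- what a hypothetical "lua_nonblock.h" could contain --- */
#ifdef LUA_NONBLOCK                 /* hypothetical build flag */
#include <pth.h>
/* "nonblock" build: let the cooperative scheduler switch tasks
 * while this one waits. */
static inline ssize_t lua_read(int fd, void *buf, size_t n)
{ return pth_read(fd, buf, n); }
static inline ssize_t lua_write(int fd, const void *buf, size_t n)
{ return pth_write(fd, buf, n); }
static inline unsigned int lua_sleep(unsigned int sec)
{ return pth_sleep(sec); }
#else
/* "plain" build: fall straight through to the ordinary calls. */
static inline ssize_t lua_read(int fd, void *buf, size_t n)
{ return read(fd, buf, n); }
static inline ssize_t lua_write(int fd, const void *buf, size_t n)
{ return write(fd, buf, n); }
static inline unsigned int lua_sleep(unsigned int sec)
{ return sleep(sec); }
#endif

/* --- module code: the same source compiles in both builds --- */
/* Hypothetical helper: writes msg into a pipe and reads it back.
 * Returns the number of bytes read, or -1 on failure. */
ssize_t echo_through_pipe(const char *msg, size_t len,
                          char *out, size_t outlen)
{
    int fds[2];
    assert(msg != NULL && out != NULL);

    if (pipe(fds) != 0)
        return -1;

    ssize_t written = lua_write(fds[1], msg, len);
    ssize_t got = (written > 0) ? lua_read(fds[0], out, outlen) : -1;

    close(fds[0]);
    close(fds[1]);
    return got;
}
```

Compiled without the flag, echo_through_pipe("ping", 4, out, sizeof out) behaves exactly like plain read()/write(); the module source contains no #ifdef of its own, which is the point of pushing the switch into the header.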

To summarize this idea:

  1. Add new, optional "nonblock" support to Lua on platforms that can
handle it, based on cooperative threading (GNU pth or equivalent)
  2. There is a new header file #include "lua_nonblock.h" defining
wrappers for blocking C calls: lua_read(), lua_write(), lua_sleep(),
etc.
  3. C modules written to support "nonblock" continue to work when
compiled with "plain" Lua, but now can also context switch if compiled
on "nonblock" versions of Lua
  4. coroutine.create() gets a new optional parameter, where true
means "use a separate C stack (if supported)"

To me this would be a major improvement over the status quo and would
improve the Lua ecosystem dramatically by allowing all of the modules out
there to be composable.

The only cost to module developers would be a search & replace: read() ->
lua_read(), etc.

Thoughts?

-Archie

--
Archie L. Cobbs