In my head, session_data is a Lua table, and so are all the other
aspects of processing except for sockets. (Socket model below.)

Let's imagine the code started like this:

function web_server:handle_login_html(req)
   local session_data = this:find_session(req) or this:die()
   session_data:handle_login(req)
end

In a non-preemptive responder there is no need to reestablish the
freshness of session_data between the two statements. handle_login may
be documented to require fresh and valid session_data. Depending on
where the abstraction barriers are it's possible handle_login has no
ability to revalidate everything.

With an evocative name like "handle_login", full
authentication-and-authorization paranoia should be in effect, so if I
actually saw this code in production I'd probably re-assert()
everything I could at the top of :handle_login. That name choice was a
distracting mistake in this example as the consistency problems could
happen with any process step.
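
As a rough sketch of what "re-assert() everything" could look like,
assuming a session object with made-up fields (user, expires, revoked;
none of these come from a real library):

```lua
-- Hypothetical session object; the field names are invented for illustration.
local Session = {}
Session.__index = Session

function Session.new(user, ttl)
   return setmetatable(
      { user = user, expires = os.time() + ttl, revoked = false },
      Session)
end

function Session:handle_login(req)
   -- Re-check the documented preconditions instead of trusting the caller.
   assert(not self.revoked, "session revoked")
   assert(os.time() < self.expires, "session expired")
   assert(req.user == self.user, "request does not match session")
   return "welcome, " .. self.user
end

local s = Session.new("alice", 60)
print(s:handle_login{ user = "alice" })

-- Simulate the session going bad behind the caller's back.
s.revoked = true
print(pcall(s.handle_login, s, { user = "alice" }))  -- false plus an error message
```

This only covers what the session object can see for itself; anything
on the far side of an abstraction barrier still has to be trusted.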

So anyway, next in the software development saga, somebody changes this to:

function web_server:handle_login_html(req)
   local session_data = this:find_session(req) or this:die()
   print(os.date(), req.path, req.session)
   session_data:handle_login(req)
end

and again, a non-preemptive responder has no reason to revalidate;
print() is synchronous and so is os.date().

Now somebody refactors out the print()s into a real logging function,
no doubt on a base object several source files away. It's still
print(), just nicer.

Finally, somebody changes log() to allow logging to a socket. This is
the proximate cause of the disaster, because it broke the contract
that log() does not ever cause rescheduling--does not allow non-local
control flow. The log()-to-socket functionality may have been
cut-and-pasted from a program with blocking socket calls. But in
environments that emulate blocking calls by rescheduling whenever I/O
would block, log() is broken in a subtle way even though it looks
correct.
Congratulations, you are now the proud owner of a Heisenbug. Probably.
:-)
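
To make the failure concrete, here is a minimal sketch in pure Lua.
Everything in it is invented for illustration (the sessions table, the
handler names, the hand-rolled schedule); the yield inside log() stands
in for a socket write that would block.

```lua
-- Hypothetical toy model: all names here are invented for illustration.
local sessions = { alice = { valid = true } }
local results = {}

-- Stand-in for a log() that writes to a socket. The yield models
-- "the socket would block, so suspend this coroutine for now".
local function log(msg)
   coroutine.yield("waiting on log socket")
end

local function handle_login_html(user)
   local session_data = sessions[user]
   assert(session_data and session_data.valid)  -- fresh here...
   log("login " .. user)                        -- ...but control may leave here
   -- Nothing re-checks session_data, so this acts on whatever it is now.
   results[#results + 1] =
      session_data.valid and "login ok" or "login on STALE session"
end

local function handle_logout(user)
   sessions[user].valid = false
end

-- Hand-rolled schedule: request A starts and blocks in log(),
-- request B runs to completion, then A is resumed.
local a = coroutine.create(handle_login_html)
assert(coroutine.resume(a, "alice"))  -- runs up to the yield inside log()
handle_logout("alice")                -- another request slips in
assert(coroutine.resume(a))           -- A continues with stale data

print(results[1])  -- prints "login on STALE session"
```

If log() never yields, the same handler reports "login ok", which is
why the bug only shows up when the log socket is busy.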

I'm sticking to "non-preemptive responder" because I need a short
example of unexpected concurrency. Justification: perhaps a toy web
server for management embedded in a bigger app. Or the web server does
non-blocking/preemptive reads until it has a complete request, figures
out what to do with it in synchronous mode, then allows non-blocking
writes to stream out results. Come to think of it, that's not such a
terrible design. (But you probably do not want to write your own web
server.)

Anyway, here are the non-Lua primitives I think I need to implement a
"real" web server. All of the coroutine stuff is pure Lua.

  count = readsocket:read_nonblocking(maxsize)
  count = writesocket:write_nonblocking(s)
  serversocket = Socket.listen(port [, address])
  readsocket,writesocket = serversocket:accept_nonblocking()
  socket:close_nonblocking()
  possibly_changed_list =
      wait_for_condition_change{
          read={socket,socket,...},
          write={socket,socket,...},
          accept={serversocket},
          timeout=10}

possibly_changed_list is only an optimization; on every wake you
*could* go through all your sockets and try again. No,
wait_for_condition_change is not an optimization. Busy-waiting is UN!
AC! CEPT! ABLE!
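
Here is a rough sketch of how the pure-Lua coroutine layer might sit
on top of those primitives. The socket and wait_for_condition_change
below are mocks I made up so the example runs standalone; a real
wait_for_condition_change would presumably wrap something like
select() or poll(). write_all and run are hypothetical names too, and
the scheduler assumes at most one waiting coroutine per socket.

```lua
-- Mock socket: accepts at most `room` bytes per readiness notification.
local MockSocket = {}
MockSocket.__index = MockSocket

local function new_socket(room)
   return setmetatable({ room = room, budget = room, sent = {} }, MockSocket)
end

function MockSocket:write_nonblocking(s)
   local n = math.min(#s, self.budget)
   self.budget = self.budget - n
   if n > 0 then self.sent[#self.sent + 1] = s:sub(1, n) end
   return n  -- number of bytes actually accepted, possibly 0
end

-- Mock primitive: pretends every socket we wait on became writable.
local function wait_for_condition_change(spec)
   local changed = {}
   for _, sock in ipairs(spec.write or {}) do
      sock.budget = sock.room
      changed[#changed + 1] = sock
   end
   return changed
end

-- Pure Lua from here on. A blocking-looking write for use in a coroutine.
local function write_all(sock, s)
   while #s > 0 do
      local count = sock:write_nonblocking(s)
      s = s:sub(count + 1)
      if #s > 0 then coroutine.yield(sock) end  -- would block: suspend
   end
end

-- Scheduler: run each task until it blocks, then sleep (no busy-waiting)
-- until the primitive reports that something may have changed.
local function run(tasks)
   local waiting = {}  -- socket -> suspended coroutine
   local function step(co)
      local ok, sock = coroutine.resume(co)
      assert(ok, sock)
      if coroutine.status(co) ~= "dead" then waiting[sock] = co end
   end
   for _, fn in ipairs(tasks) do step(coroutine.create(fn)) end
   while next(waiting) do
      local write = {}
      for sock in pairs(waiting) do write[#write + 1] = sock end
      local changed = wait_for_condition_change{ write = write, timeout = 10 }
      for _, sock in ipairs(changed) do
         local co = waiting[sock]
         waiting[sock] = nil
         if co then step(co) end
      end
   end
end

local out = new_socket(4)
run{ function() write_all(out, "hello, world") end }
print(table.concat(out.sent))  -- prints "hello, world"
```

Note that write_all is exactly the kind of call that can yield out
from under a caller who thinks it is synchronous.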

Jay

On Fri, May 11, 2012 at 3:51 PM, Graham Wakefield
<wakefield@mat.ucsb.edu> wrote:
>
> Even if no yield/resume is used, since session_data encapsulates some data which are external to Lua, its contents could have been modified between any two ordinary Lua instructions; session_data:handle_login() should be implemented to throw an error if the session_data contents have become invalid. The same applies to a callback-based system; the session_data object still needs to be checked for consistency whenever it is used.
>
> Where the implicit coroutine yield approach can be problematic is when the user expects a non-local pure Lua object to contain the same values before and after this:log(). It can be a big gotcha, but the clarity of exposition and preservation of the stack might be worth it.
>
> On May 11, 2012, at 12:21 PM, Jay Carlson wrote:
>
>> On Fri, May 11, 2012 at 2:27 PM, Tim Caswell <tim@creationix.com> wrote:
>>> I mean I don't want to call `sleep(10)` and suddenly be suspended till 10
>>> seconds later when the timeout occurs.  The problem with this is any
>>> function that I call could in turn call something else that implicitly
>>> suspends me.  Your wrapper is plenty explicit.
>>
>> Here's the problem case:
>>
>> function web_server:handle_login_html(req)
>>   -- Grab the current session and ensure it's valid
>>   local session_data = this:find_session(req) or this:die()
>>   -- Log the request
>>   this:log(req)
>>   -- delegate to session
>>   session_data:handle_login(req)
>> end
>>
>> The problem is "this:log(req)". Logging could be implemented by
>> writing to a socket--and you can't tell by looking at this code. If it
>> writes to a socket, the socket may want to block. In a simple socket
>> coroutine system, operations which would block instead are suspended
>> until the socket can be read or written. Waiting for this:log() could
>> take arbitrarily long. The session_data we grabbed at the top may be
>> stale by then.
>>
>> Adding to the fun, *most* of the time the log socket will probably be
>> ready to write; only under load will two identical requests end up
>> going separate ways.
>>
>> I hadn't thought about how the coroutine style preserves the stack,
>> which is nice. But transparently converting blocking operations to
>> coroutine dispatch can give you back many of the concurrency problems
>> when you aren't expecting them. I guess neither should be surprising;
>> it's non-preemptive green threads.
>>
>> Jay
>>
>
>