On Nov 26, 2007 12:37 PM, Roberto Ierusalimschy <roberto@inf.puc-rio.br> wrote:
>
> > The recursive coroutine/stack overflow bug that's listed on the Lua
> > website [http://www.lua.org/bugs.html#5.1.2-4] currently doesn't have
> > a patch listed and instead suggests that the right thing to do might
> > be moving the nCcalls counter to the global state.  It seems there was
> > discussion about there being issues with that in a multi-threaded
> > environment.
> >
> > I'm currently using the following patch that approaches the problem in
> > a slightly different manner, and I'm wondering if there are any
> > obvious reasons why this isn't a good way to patch the issue (or if
> > there's something that's been missed entirely).
> >
> > The patch can be found here: http://pastebin.ca/797346
> >
> > The author would appreciate any feedback you can provide.
>
> Would you mind explaining your patch?

From the author:

*Symptom:*
Deep coroutine recursion overflows the C stack without Lua detecting it:

a = function(a) coroutine.wrap(a)(a) end return a(a)

*Cause:*
While Lua tracks the number of nested C calls within a thread (nCcalls),
there is no similar count for the number of nested coroutine executions,
yet each of these results in a nested call to luaV_execute on the new
state. This consumes C stack outside of the count Lua is maintaining.
In normal code this is rarely an issue, but pathological cases such as
the example above rapidly result in failure.

*Solution:*
Maintain a count in each state representing the 'depth' of coroutine
resumptions (nresumes) that are active at that moment, and use the sum
of resumes and C calls for stack depth tests. The count is incremented
and tested before the nested luaV_execute, and decremented after it.
The value is propagated from state to state at the same time execution
moves into the coroutine's thread.
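
The bookkeeping described above can be sketched as a small stand-alone C
model. This is not the patch itself and does not use Lua's real
lua_State: ToyState, toy_resume, TOY_MAX_DEPTH and the other toy_ names
are hypothetical, the body callback merely stands in for the nested
luaV_execute, and each coroutine's nCcalls is left at zero for
simplicity. Exactly where the real patch increments and decrements the
counter is an assumption here.

```c
#include <assert.h>

#define TOY_MAX_DEPTH 200        /* hypothetical combined depth limit */
#define TOY_OK 0
#define TOY_ERR_OVERFLOW (-1)

/* Hypothetical stand-in for the two per-state counters. */
typedef struct ToyState {
    int nCcalls;   /* nested C calls within this thread */
    int nresumes;  /* coroutine resumptions active when this thread runs */
} ToyState;

typedef int (*ToyBody)(ToyState *co);

int toy_max_seen = 0;  /* deepest nresumes observed, for inspection only */

/* Resume `co` from `from`: propagate the depth, test the sum, run, undo. */
int toy_resume(const ToyState *from, ToyState *co, ToyBody body) {
    int status;
    co->nresumes = from->nresumes + 1;       /* propagate and increment */
    if (co->nCcalls + co->nresumes >= TOY_MAX_DEPTH) {
        co->nresumes--;                      /* undo before reporting */
        return TOY_ERR_OVERFLOW;
    }
    status = body(co);                       /* nested execution goes here */
    assert(co->nresumes > 0);
    co->nresumes--;                          /* decrement after it returns */
    return status;
}

/* Models the pathological script: every coroutine resumes a fresh one. */
int toy_recursive_body(ToyState *co) {
    ToyState child = {0, 0};
    if (co->nresumes > toy_max_seen)
        toy_max_seen = co->nresumes;
    return toy_resume(co, &child, toy_recursive_body);
}

/* Drives the model from a fresh main state. */
int toy_run_pathological(void) {
    ToyState main_state = {0, 0};
    ToyState first = {0, 0};
    toy_max_seen = 0;
    return toy_resume(&main_state, &first, toy_recursive_body);
}
```

In this model, toy_run_pathological() comes back with TOY_ERR_OVERFLOW
once the combined count reaches the limit, instead of recursing until
the C stack is exhausted.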

*Possible simplifications:*
One could instead maintain separate limits for coroutines and C calls,
so that the sum of the two becomes the old C call limit. This would
reduce the footprint of the patch: the only new test occurs when a
resume happens, and the old C call depth tests are unaffected. However,
it would introduce another configuration setting and is thus less
compatible with the 5.1 codebase.
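
A rough C model of this separate-limits variant, for comparison. All
sep_ and SEP_ names are hypothetical, and the 150/50 split is an
arbitrary illustration of dividing one budget in two, not a value taken
from the patch or from Lua.

```c
#include <assert.h>

#define SEP_MAX_CCALLS   150  /* hypothetical share kept for C calls */
#define SEP_MAX_RESUMES   50  /* hypothetical new configuration setting */
#define SEP_OK 0
#define SEP_ERR_OVERFLOW (-1)

typedef struct SepState {
    int nCcalls;   /* nested C calls within this thread */
    int nresumes;  /* coroutine resumptions active when this thread runs */
} SepState;

/* The existing C call test stays as it was: it looks only at nCcalls. */
int sep_enter_ccall(SepState *L) {
    if (L->nCcalls >= SEP_MAX_CCALLS)
        return SEP_ERR_OVERFLOW;
    L->nCcalls++;
    return SEP_OK;
}

void sep_leave_ccall(SepState *L) {
    assert(L->nCcalls > 0);
    L->nCcalls--;
}

/* The only new test sits at resume time and looks only at nresumes.
   Returns how many nested resumes succeed before the limit refuses one. */
int sep_nested_resumes(const SepState *from) {
    SepState co = {0, 0};
    co.nresumes = from->nresumes + 1;
    if (co.nresumes > SEP_MAX_RESUMES)
        return 0;                        /* this resume is refused */
    return 1 + sep_nested_resumes(&co);  /* coroutine resumes another one */
}

/* Helper: start a resume chain from a main state holding `ccalls` C calls. */
int sep_demo(int ccalls) {
    SepState main_state = {0, 0};
    main_state.nCcalls = ccalls;
    return sep_nested_resumes(&main_state);
}

/* Helper: count how many nested C calls the unchanged test admits. */
int sep_count_ccalls(void) {
    SepState L = {0, 0};
    int n = 0;
    while (sep_enter_ccall(&L) == SEP_OK)
        n++;
    while (L.nCcalls > 0)
        sep_leave_ccall(&L);
    return n;
}
```

Because the two tests never consult each other's counter, the number of
resumes admitted is the same whether the resuming state is idle or
already deep in C calls; the cost is the extra setting noted above.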