I am sorry guys, but for some reason I can no longer reproduce the issue.
I have been struggling for the last 3 days to make it happen again, and I have been through almost
all the latest commits, but nothing.
Possibly something supposedly unrelated? Some combination of other issues? Maybe I changed
something in the script, and now the issue is not triggered any more... I don't know.
I will have to stop searching for this problem for the time being, and check whether it appears again.
Either way, thanks for the help!
As per your recommendations / questions:
>Have you checked that after each GC call the stack is back where it
>started (basic correct C behavior)?
No, but I will if I have the opportunity again.
>- What allocator are you using? Does it fully support realloc?
It is a custom implementation. It does support realloc, and as per
my latest tests, it works OK.
>-You could try tracing allocations by inserting a tracer between Lua
>allocation calls and the actual allocator, also doing a stack check
>(and possibly other checks) before and after each call, to get a few
>more clues about what is happening.
I tried to, but to be honest I have trouble understanding the internals
of Lua. So even if there was anything wrong, I may not have noticed it...
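
For reference, a tracer of the kind suggested could look roughly like the sketch below. It is only an illustration under assumptions: `alloc_fn` is declared locally with the same shape as Lua's `lua_Alloc` so that it builds without lua.h, `stdlib_alloc` is a stand-in for the custom allocator, and `alloc_tracer` with its counters is a hypothetical name. The tracer only looks at the sizes Lua passes in, so no knowledge of Lua internals is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Same shape as Lua's lua_Alloc, declared locally (assumption: no lua.h here). */
typedef void *(*alloc_fn)(void *ud, void *ptr, size_t osize, size_t nsize);

/* Hypothetical tracer state: wraps the real allocator and keeps counters. */
typedef struct {
    alloc_fn real;       /* the allocator being traced */
    void    *real_ud;    /* user data for the real allocator */
    size_t   calls;      /* total calls seen */
    size_t   frees;      /* calls with nsize == 0 */
    size_t   failures;   /* nsize != 0 but NULL was returned */
    size_t   live_bytes; /* bytes currently held, going by the sizes passed in */
} alloc_tracer;

/* Stand-in for the custom allocator; replace with the real one. */
static void *stdlib_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
{
    (void)ud;
    (void)osize;
    if (nsize == 0) {
        free(ptr);
        return NULL;
    }
    return realloc(ptr, nsize);
}

static void *traced_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
{
    alloc_tracer *t = ud;
    /* With ptr == NULL, osize is not a block size (newer Lua versions pass
       an object type tag there), so treat the old size as 0. */
    size_t old = ptr ? osize : 0;
    void *res;

    /* A block larger than everything we handed out points to bad accounting. */
    assert(t->live_bytes >= old);

    /* Extra checks (stack pointer, heap validation) could go here ... */
    res = t->real(t->real_ud, ptr, osize, nsize);
    /* ... and here, after the real allocator returns. */

    t->calls++;
    if (nsize == 0) {
        t->frees++;
        t->live_bytes -= old;
    } else if (res == NULL) {
        t->failures++;           /* the old block is still valid */
    } else {
        t->live_bytes = t->live_bytes - old + nsize;
    }
    fprintf(stderr, "alloc %p (%zu) -> %p (%zu), live %zu\n",
            ptr, old, res, nsize, t->live_bytes);
    return res;
}
```

The idea would be to pass `traced_alloc` and a pointer to an initialised `alloc_tracer` wherever the allocator is currently registered with Lua (for example in the `lua_newstate` call). If `live_bytes` does not return to zero after the state is closed, or the assertion fires, the sizes Lua reports and the blocks actually held have diverged.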
>- How does Your memory layout looks like? (linker script). Maybe stack space and heap space overlaps?
The stacks and the heap are located in different, non-contiguous RAM banks. I see no
way for the heap to corrupt the stack (a hardware exception would be raised).
Apart from this, there are also lots of other checks, so I believe it is very
improbable that something unrelated to Lua (and thus running on another thread
with an isolated stack) would cause any such problem:
1. Stacks are prefilled with a known pattern. Their usage level is monitored.
2. The stack pointer is checked during context switches.
3. The heap blocks are padded with a known pattern. Any overflow will be caught.
4. Heap blocks are chained in a linked list. On every malloc, realloc
or clear, this list is validated. Any corruption of the block headers will break the
chain and will be caught.
5. Lua runs on a dedicated thread. Thus it has a dedicated stack.
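
To illustrate checks 3 and 4, here is a minimal sketch of guard padding plus chain validation. It is not the actual allocator: the header layout, `BLOCK_MAGIC`, `GUARD_BYTE`, `GUARD_LEN` and the function names are all assumptions made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUARD_BYTE  0xA5u        /* assumed fill pattern for the padding */
#define GUARD_LEN   8u           /* assumed padding bytes after each payload */
#define BLOCK_MAGIC 0xB10CB10Cu  /* assumed header marker */

/* Hypothetical block header; the real layout will differ. */
typedef struct block {
    uint32_t      magic;  /* must equal BLOCK_MAGIC while the header is intact */
    size_t        size;   /* payload size in bytes */
    struct block *next;   /* next block in the chain, NULL at the end */
} block;

/* The payload starts right after the header. */
static unsigned char *payload(block *b)
{
    return (unsigned char *)(b + 1);
}

/* Set up a block in caller-provided memory and link it in front of `head`.
   `buf` must be aligned for `block` and hold at least
   sizeof(block) + size + GUARD_LEN bytes. */
static block *block_init(void *buf, size_t size, block *head)
{
    block *b = buf;

    assert(buf != NULL);
    b->magic = BLOCK_MAGIC;
    b->size  = size;
    b->next  = head;
    memset(payload(b) + size, GUARD_BYTE, GUARD_LEN);
    return b;
}

/* Walk the chain. Returns 0 if every header and guard is intact, otherwise
   the 1-based position of the first damaged block. */
static int heap_check(block *head)
{
    int pos = 1;

    for (block *b = head; b != NULL; b = b->next, pos++) {
        const unsigned char *guard;

        if (b->magic != BLOCK_MAGIC)
            return pos;                  /* header overwritten */
        guard = payload(b) + b->size;
        for (size_t k = 0; k < GUARD_LEN; k++) {
            if (guard[k] != GUARD_BYTE)
                return pos;              /* payload overflowed into the guard */
        }
    }
    return 0;
}
```

Calling something like `heap_check` on every malloc, realloc or clear narrows the time window in which a corruption can go unnoticed to a single allocator call. A write that lands inside a payload without touching a guard or a header would still pass this kind of check.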