- Subject: "Fix" for crash in newkey (SIGBUS on Mac OS X)
- From: Kay Röpke <kay@...>
- Date: Tue, 29 Jan 2008 20:37:33 +0100
Hi *!
Over the last couple of days we have had problems with a SIGBUS in
newkey (ltable.c) on Mac OS X (Leopard).
Please bear with me as I explain what happened:
I've been using Apple's GCC (i686-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1
(Apple Inc. build 5465)) on a Core 2 Duo, generating 32-bit x86 code.
Our C code set up a bunch of tables/metatables in preparation for a
setfenv call, but nothing really fancy. All attempts to reduce it to a
manageable test case failed, though.
The relevant part of the stacktrace in question was:
#0 0x00053964 in newkey (L=0x50e030, t=0x50f3d0, key=0x510040) at ltable.c:425
#1 0x00053d05 in luaH_set (L=0x50e030, t=0x50f3d0, key=0x510040) at ltable.c:503
#2 0x0005520b in luaV_settable (L=0x50e030, t=0x813eb4, key=0x510040, val=0x813ec0) at lvm.c:142
#3 0x0005672f in luaV_execute (L=0x50e030, nexeccalls=1) at lvm.c:456
#4 0x00049131 in luaD_call (L=0x50e030, func=0x80d26c, nResults=1) at ldo.c:377
#5 0x000438b5 in lua_call (L=0x50e030, nargs=1, nresults=1) at lapi.c:778
#6 0x00064626 in ll_require (L=0x50e030) at loadlib.c:484
#7 0x004914a1 in luaD_precall (L=0x50e030, func=0x80d224, nresults=0) at ldo.c:319
#8 0x004a02d7 in luaV_execute (L=0x50e030, nexeccalls=1) at lvm.c:589
#9 0x00491701 in luaD_call (L=0x50e030, func=0x80d218, nResults=0) at ldo.c:377
#10 0x0048bed9 in f_call (L=0x50e030, ud=0xbfffe92c) at lapi.c:796
#11 0x0049096e in luaD_rawrunprotected (L=0x50e030, f=0x48beaf <f_call>, ud=0xbfffe92c) at ldo.c:116
#12 0x00491a50 in luaD_pcall (L=0x50e030, func=0x48beaf <f_call>, u=0xbfffe92c, old_top=24, ef=0) at ldo.c:461
#13 0x0048bf76 in lua_pcall (L=0x50e030, nargs=0, nresults=0, errfunc=0) at lapi.c:817
#14 0x004838e4 in lua_register_callback (con=0x50cb20) at plugin.c:1387
The Lua code that was loaded just set an entry in a table we prepared
in the C code in lua_register_callback (our own code) prior to the
lua_pcall.
After several lengthy gdb sessions and barking up entirely wrong
trees, it became apparent that 'gkey(mp)' in
gkey(mp)->value = key->value; gkey(mp)->tt = key->tt;
was actually referring to dummynode (static const Node dummynode_ in
ltable.c:75) which was improperly aligned.
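For anyone who wants to check a similar suspicion without a long gdb session, here is a minimal sketch of an alignment check on a static const object. Everything in it is an assumption for illustration: DemoNode is a hypothetical stand-in and not Lua's actual Node layout, and demo_dummynode, addr_is_aligned and demo_dummynode_is_aligned are made-up names. It only shows the general technique of comparing an object's address against its type's alignment requirement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a table node; NOT Lua's real Node struct. */
typedef struct DemoNode {
  union { void *p; double n; } value;
  int tt;
} DemoNode;

/* Pre-C11 way to obtain the alignment requirement of DemoNode:
   the offset of a DemoNode member placed right after a single char. */
struct DemoNodeAlignProbe { char c; DemoNode n; };
#define DEMO_NODE_ALIGN offsetof(struct DemoNodeAlignProbe, n)

/* Static const object, analogous in spirit to a shared dummy node. */
static const DemoNode demo_dummynode = { { NULL }, 0 };

/* Returns 1 if addr is a multiple of align, 0 otherwise. */
static int addr_is_aligned(uintptr_t addr, size_t align) {
  assert(align != 0);
  return (addr % align) == 0;
}

/* Returns 1 if the static dummy object sits at a properly aligned address. */
static int demo_dummynode_is_aligned(void) {
  return addr_is_aligned((uintptr_t)&demo_dummynode, DEMO_NODE_ALIGN);
}
```

Calling demo_dummynode_is_aligned() from a debug build (or evaluating the same expression in gdb) should return 1 with a correctly behaving compiler and linker; a 0 would point in the same direction as what we saw.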
(Aside: Can someone comment on the issue that mp actually is dummynode
at this point? Would that be correct at all? Just so I know next time
when I'm in there...)
The "fix" we made was to change the definition of dummynode_ to
static volatile Node dummynode_ = {
which made the SIGBUS disappear.
Discussing this on #lua, we agreed that this smells like a compiler
bug in Apple's gcc, and I will report it to them.
The volatile-"fix" surely isn't the correct approach, but at least it
allows further development with Lua on OS X x86, so it might be
helpful for some people out there.
My next steps will be to try to compile the whole thing with LLVM [1]
to see whether or not it gets it right. If this problem is present on
pre-Leopard versions of OS X, it might be worthwhile to try gcc-3.3
there (I can't do this on Leopard, though).
Sadly, I'm not familiar enough with the Lua internals to say where the
misalignment of &dummynode_ makes things go wrong; maybe someone else
can comment on it.
Just for people who find this in the archives, this probably is
related: http://lua-users.org/lists/lua-l/2008-01/msg00415.html
cheers,
-k
[1] http://llvm.org
--
Kay Roepke, Software Engineer
MySQL AB, www.mysql.com