lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


On Fri, Jul 3, 2009 at 8:42 PM, Edgar Toernig <froese@gmx.de> wrote:
Fabio Mascarenhas wrote:
>
> I think mkstemp and symlink are supposed to
> be atomic, so where is the race condition?

The symlink is fine.  The race is when the lock is broken:

       /////
       while (symlink("foobar", lockfile) == -1) {
               if (errno == EEXIST)
                 if (lstat(lockfile, &st) == 0)
                   if (time(NULL) - st.st_mtime > expire) {
                     unlink(lockfile)
                     continue;
                   }
               return ERROR
       }
       return SUCCESS;
       /////

       Process A                       Process B

 T     symlink -> EEXIST               symlink -> EEXIST
 i     lstat -> OK                     lstat -> OK
 m     expired                         expired
 e     unlink
       symlink -> OK
       return SUCCESS
                                       unlink
                                       symlink -> OK
                                       return SUCCESS

I stand corrected, and do not know how I missed this; as a weak defense I notice that the NetBSD implementation of shlock that I linked in the previous message has the same race condition... I guess I have been spoiled by lock-free threading models. :-)
 

Generally, breaking locks without appropriate rollback is
seldom a good idea.  The lock holder tried to perform some
non-atomic changes (else it wouldn't need the lock) and
for whatever reason didn't finish with that.  Breaking the
lock now just means: I don't care if the data is in an
inconsistant state.  So, why using a lock in the first
place? *g*

Yes, the only way a lock can be orphaned forever is if there is a segfault before releasing it, as the lock's finalizer also releases it. I will remove the staleness check (and the race condition that goes with it) while I think this over. 


I was wrong about the busy loop, sorry.  When I read the code,
I thought the "while" loops until it gets the lock.

The memory and fd leak happens when lua_newuserdata fails.

I will move the lua_newuserdata call to the beginning of the function, thanks. I am so used to not checking the return value of lua_newuserdata, as it raises a Lua error in case it has problems, that I forget that you should allocate the userdata for the source *before* acquiring it...
 

Regarding that "strange algorithm": I have no idea what the
mkstemp file is good for.  Afaics, it has no function.
Its name is used as the contents of the symlink and it is
kept open until unlock.  What for?

I thought symlink cared about where you are linking from (as link), but I now see in the POSIX documentation (buried the NOTES section, sigh) that it can be anything. 
 

> If the lock is taken lock_dir returns an error, how to wait until trying
> again is up to the user.

Well, here is a global problem with lfs: error reporting.
You only get a random (system and locale dependant) error
string.  The caller has no chance to check what kind of
error happened (i.e. why was the lock not taken? ENOENT?
EACCESS? EEXIST?) and take appropriate action.

Yeah, lfs error reporting needs an overhaul, it should at least return the errno of what caused the error, too... but this problem is not restricted to lfs, Lua functions that raise errors should move away from just error messages to more structured error objects.

On the flip side, overhauling the error reporting will probably break a lot of clients. :-)
 

Btw, errno is often read too late in lfs, intervening calls
may have modified it.

Ciao, ET.

Many thanks for the feedback!

--
Fabio Mascarenhas