On Sat, Apr 04, 2015 at 04:55:25PM -0700, raksoras lua wrote:
> I understand that C code generally should not store the pointer to a
> string returned by lua_tostring beyond the lifetime of the C function
> call. (PIL says "The lua_tostring function returns a pointer to an
> internal copy of the string. Lua ensures that this pointer is valid as
> long as the corresponding value is in the stack. When a C function
> returns, Lua clears its stack; therefore, as a rule, you should never
> store pointers to Lua strings outside the function that got them.")
> 
> However, is this restriction strictly true if I can guarantee - from
> my code flow - that the original string in Lua will continue to be
> reachable (that is, it is not garbage collected) when my C function
> call returns?

AFAIK, yes, as long as you can guarantee that the string is anchored.
 
> Here is my actual scenario: Luaw HTTP server uses Lua coroutines to
> handle multiple connections simultaneously. While generating response
> a Lua coroutine calls a C function with Lua string that the C function
> is eventually supposed to write to socket. But since the socket write
> may block, the C function actually uses async write API (libuv to be
> specific) to write to the socket. The function then returns
> immediately and the Lua coroutine that called the C function yields to
> suspend itself. When the socket is ready for the actual write, libuv's
> async API invokes a C callback that writes the string on the socket
> and then resumes the suspended coroutine.
 
> Now in classical model I should make a copy of the string passed into
> the first C function as the function will return before the callback
> has a chance to write the string onto the socket. However I want to
> avoid the data copy (memcpy) if I can get away with it for performance
> reasons. My rationale is even though the original C function has
> returned the Lua coroutine that called the C function gets suspended
> as soon as C function call returns. So the original string that was
> passed in the C call is still reference-able from the Lua coroutine
> and hence should not be garbage collected.

This is only true if the string is stored in, or reachable through, a local
variable. But what if I do something like this:

	socket:write(myobject:createstring())

Here the string is a temporary; it isn't stored in any variable. The C
function's stack frame is cleared when it returns, before the coroutine
yields (see luaD_poscall in Lua 5.3). After that nothing anchors the string,
so the GC can, and very well might, collect it.

> So I should be able to just store the pointer to the internal Lua string
> returned by lua_tostring() without making a copy of it. When the write
> callback is called it first calls a C function - kind of a continuation of
> first C call - that writes the string onto the socket and then resumes Lua
> coroutine. This way I can guarantee that the original Lua string's
> lifetime is more than the two C calls - original call and then the
> callback invoked by libuv - involved.
> 
> Is it safe to not make a copy of a lua string in C function in this case?

In order to make your optimization work, you need to be sure that the string
is stored in a local for the duration of the yield. The only way to
guarantee that is to add an extra call with a function written by yourself
that you know stores the string in a local variable. For example:

	function lib:write(str)
		return self:uvwrite(str)
	end

The problem is that Lua will turn this into a tail call and optimize away
our str local. So we have to do something like:

	function lib:write(str)
		local retval = self:uvwrite(str)
		return retval
	end

It's possible Lua might optimize away the str local here, too. Certainly
LuaJIT is likely to optimize it away. You might need to add some function
call or other crazy call after :uvwrite and before the return statement.

If you're using Lua 5.2 or 5.3 you can use an intermediate C function to
preserve the stack frame. C function code and its stack can't be optimized
away. Using the 5.3 API you can do roughly something like the following:

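	/* assumption: lib_uvwrite produces one result; adjust to your binding */
	#define NRET 1

	static int lib_uvwrite(lua_State *L);
	static int lib_poswrite(lua_State *L, int status, lua_KContext ctx);

	static int lib_write(lua_State *L) {
		int nargs = lua_gettop(L);
		int i;

		/*
		 * push our real libuv write binding. this would be quicker
		 * if it were cached as an upvalue.
		 */
		lua_pushcfunction(L, &lib_uvwrite);

		/*
		 * copy arguments so originals remain alive on our stack
		 * frame for the duration of the call.
		 */
		for (i = 1; i <= nargs; i++)
			lua_pushvalue(L, i);

		/*
		 * lua_callk returns void. if lib_uvwrite yields, Lua calls
		 * lib_poswrite for us on resume; otherwise we call it here.
		 */
		lua_callk(L, nargs, NRET, 0, &lib_poswrite);
		return lib_poswrite(L, LUA_OK, 0);
	}

	/* return from our call to lib_uvwrite */
	static int lib_poswrite(lua_State *L, int status, lua_KContext ctx) {
		(void)ctx;
		/* LUA_YIELD when Lua resumes us, LUA_OK when called directly */
		assert(status == LUA_YIELD || status == LUA_OK);
		assert(lua_gettop(L) >= NRET);
		return NRET;
	}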

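	static int lib_uvwrite(lua_State *L) {
		/*
		 * here's your original binding that called into libuv. it has
		 * to yield from inside this call so that lib_write's frame
		 * (and the string on it) stays alive until the coroutine is
		 * resumed. the bare yield below is only a placeholder.
		 */
		return lua_yield(L, 0);
	}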