It was thus said that the Great Coroutines once stated:
> On Fri, Aug 22, 2014 at 2:27 AM, Dirk Laurie <dirk.laurie@gmail.com> wrote:
> 
> >
> > Except that userdata is advanced, whereas strings are very,
> > very basic indeed, there are languages like SNOBOL that have
> > nothing but.
> >
> > Think of pooling strings as Lua's way of dispensing with the
> > need to write "const" zillions of times as in C++ (or even C, for
> > that matter). It gives me great comfort that I can write
> >
> >    if line=="rather long string that appears only once"
> 
> I am aware of what immutable, pooled strings affords me -- I simply
> argue that it can be useful to have both facilities at your disposal.
> Mutable, potentially duplicated strings and immutable, pooled strings.
> 
> I would rather we stop treating 'string' like a distinct type when it
> really carried on the back of userdata.  I'd much rather see
> type('cat') == 'userdata'.  We have the ability to create subtypes of
> userdata in Lua, we could even manage string pooling directly in a
> lookup table.  In my opinion, it might be a big "win" to drop the
> responsibility of handling pooled, immutable strings in C -- as
> designed by upstream.
> 
> I play with networking.  

  So do I [1].  Both for my own stuff, and for work.

> While doing so, it's always been a loss to use Lua's
> standard string type when I want to fetch data from a socket and pass
> it to the user to be worked on easily with the string library or
> third-party libraries.  It's *expensive* to create long, immutable
> strings but the libraries I've come across for unpacking data deal
> mostly with strings and not userdata.  A lot of the time it's come
> down to just passing data from a connection directly through to a
> string parser for human-readable protocols.

  First suggestion:  write code as if your idea already existed.  How
different is it?  From Lua:

	if type(x) == 'userdata' then
	  -- what?  Do I have a file?  A socket?
	  -- A network address?  An LPeg expression?
	end

  Okay, that suggests to me that type(), as defined, isn't sufficient. 
Okay, let's assume we have typeof() then [2] which expects a userdata and
returns the type of userdata.  So now we have:

	if type(x) == 'userdata' then
	  if typeof(x) == ... um ... what now?

  An easy answer is the tname argument to luaL_checkudata() (from C).  

	if type(x) == 'userdata' then
	  if typeof(x) == 'FILE*' then
	    -- we have a file
	  elseif typeof(x) == 'string' then
	    -- we have Coroutines' string thingy ...
	  end
	end

  But now you are exposing what has been an implementation detail.  The
tname just has to be unique, it doesn't have to have any meaning (so I guess
the days of #define MY_TYPE "\200\201\202\203\344\377\300" are over) and we
run headlong into the namespace problem we (potentially) have with modules,
but now with types.

  I'm not shooting down your idea---I'm just pointing out that there are
unintended consequences to design decisions.

  Other unintended consequences---why did I create a new function typeof()? 
Why not extend type() to return a second value?  Because there might be
code that expects type() to only return a single value (calling type()
within the definition of an array)?  

  Okay, let's go with what we have so far.  type() and typeof() (which
returns the formerly opaque tname).  And I want to use this new string type. 
Right now, my socket API looks like:

	remote_addr,packet_data,error = sock:recv()

	-- remote_addr == userdata of type org.conman.net:addr
	-- packet_data == userdata of Coroutines' string
	-- error       == number

  Okay, I can add a parameter to that:

	remote_addr,packet_data,error = sock:recv(buffer)

  And the C code (because this *has* to be in C, unless I'm doing this with
LuaJIT):

	static int socklua_recv(lua_State *L)
	{
	  sock__t *sock   = luaL_checkudata(L,1,TYPE_SOCK);
	  lua_String *buf = luaL_checkmutablestring(L,2);
	  
	  ... um ...
	  
	}
	
Hmmm ... this is bringing up an interesting problem---do we need to check
for mutable strings (buffers?) and immutable strings?  What happens to C
code that does:

	const char *name = lua_tostring(L,idx);

Right now, we get back an immutable string (or a number converted to an
immutable string, but that's a different argument).  I suppose it could
check behind the covers and if it gets a mutable string, just return the
internal pointer (as a const char *) so that doesn't have to change.  Okay,
good, but that does lead to another issue---documentation.  Right now, I
just have to say that a Lua function takes a string and it's over.  With
mutable strings, we now need to make a distinction between mutable and
immutable.  Does that mean we have implicit conversion between mutable and
immutable?  Is that an error now?  (or up to the programmer to decide?)
Should we, as a convenience, convert a number to a mutable string, like we
now convert numbers to immutable strings?
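
  To make the "check behind the covers" idea concrete, here is a minimal
sketch in plain C.  None of this is Lua's real API: the value struct, the
V_* tags and value_tostring() are hypothetical stand-ins for whatever
lua_tostring() would do internally if mutable strings existed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tags; real Lua has no mutable string type. */
typedef enum { V_ISTRING, V_MSTRING, V_OTHER } vkind;

typedef struct value
{
  vkind kind;
  union
  {
    const char *istr; /* immutable, pooled string           */
    char       *mstr; /* mutable buffer, NUL terminated     */
  } u;
} value;

/* Either flavor of string comes back as a const char *, so
   existing callers would not have to change.  Anything else
   yields NULL. */
static const char *value_tostring(const value *v)
{
  assert(v != NULL);
  switch(v->kind)
  {
    case V_ISTRING: return v->u.istr;
    case V_MSTRING: return v->u.mstr;
    default:        return NULL;
  }
}
```

  One thing the sketch makes visible:  the const only stops the caller from
writing through the pointer.  If the value behind it is a mutable string,
the bytes can still change after the call returns, which is one more thing
the documentation would have to spell out.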

  Mutable strings also have two properties---the overall size (space set
aside to hold the mutable string) and the actual usage.  How does space
reallocation work?  And here I'm talking about the C side of things, where I
still do quite a bit of work.  Right now, the code for sock:recv() does:

	char buffer[65536uL];
	bytes = recvfrom(sock->fh,buffer,sizeof(buffer),0,&remaddr->sa,&remsize);
	lua_pushlstring(L,buffer,bytes);

(Okay, an aside---my networking API works more on the packet level than as a
stream.  An IPv4 packet can be up to 65,535 bytes in size, although
realistically you might see at most 65,507 bytes of UDP data (20 bytes of
overhead for the IP header and 8 bytes for the UDP header) or 65,495 bytes of
TCP data (waaay less likely, but that's 20 bytes for the IP header and 20
bytes for the TCP header).  I'm not making any assumptions here about the
packet size (because my network API can also do raw socket operations), so
I'm going with a buffer big enough for the largest IP packet I could
receive.)
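
  The arithmetic in that aside can be checked from the 16-bit IPv4 Total
Length field, which caps a packet at 65,535 bytes; the 65536-byte buffer
above has a byte to spare.  The constant names below are mine, for
illustration only, and the header sizes assume no IP or TCP options.

```c
#include <assert.h>

enum
{
  IPV4_MAX_PACKET = 65535, /* largest value of the 16-bit Total Length */
  IPV4_MIN_HEADER = 20,    /* IP header without options                */
  UDP_HEADER      = 8,
  TCP_MIN_HEADER  = 20,    /* TCP header without options               */

  /* What is left for data once the headers are subtracted. */
  UDP_MAX_PAYLOAD = IPV4_MAX_PACKET - IPV4_MIN_HEADER - UDP_HEADER,
  TCP_MAX_PAYLOAD = IPV4_MAX_PACKET - IPV4_MIN_HEADER - TCP_MIN_HEADER
};
```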

  Now, I have to check the minimum size, and if it's too low, grow it.  Or I
could just accept it (in which case, the rest of the packet is tossed out by
the kernel) or error out.  But in any case, it's a bit more to think about
now.
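
  As a rough sketch of that extra bookkeeping, assuming a mutable string is
nothing more than storage plus a size and a usage count:  the mbuf type and
its functions are hypothetical (nothing like them exists in Lua), and
mbuf_fill() merely stands in for recvfrom() so the example needs no socket.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mutable string; not part of Lua's API. */
typedef struct mbuf
{
  char   *data; /* storage owned by the buffer       */
  size_t  size; /* space set aside                   */
  size_t  used; /* bytes actually holding data       */
} mbuf;

/* Set aside `size` bytes (at least 1).  0 on success, -1 on failure. */
static int mbuf_init(mbuf *b,size_t size)
{
  assert(b != NULL);
  if (size == 0) size = 1;
  b->data = malloc(size);
  if (b->data == NULL) return -1;
  b->size = size;
  b->used = 0;
  return 0;
}

/* Grow the buffer if fewer than `need` bytes are set aside; contents
   are preserved.  0 on success, -1 on failure (buffer left unchanged). */
static int mbuf_reserve(mbuf *b,size_t need)
{
  assert(b != NULL);
  if (need <= b->size) return 0;
  char *p = realloc(b->data,need);
  if (p == NULL) return -1;
  b->data = p;
  b->size = need;
  return 0;
}

/* Stand-in for recvfrom():  store a packet, dropping whatever does
   not fit, as the kernel does for a datagram.  Returns bytes kept. */
static size_t mbuf_fill(mbuf *b,const void *pkt,size_t len)
{
  assert(b != NULL && b->data != NULL);
  size_t n = len < b->size ? len : b->size;
  memcpy(b->data,pkt,n);
  b->used = n;
  return n;
}

static void mbuf_free(mbuf *b)
{
  free(b->data);
  b->data = NULL;
  b->size = b->used = 0;
}
```

  With something like this, sock:recv(buffer) has three choices when handed a
small buffer:  call mbuf_reserve() first, fill it as-is and lose the tail of
the packet, or raise an error.  Which one is right is exactly the sort of
thing that now has to be decided and documented.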

  Okay, where was I?  

> Sorry if this is a bit confused, I have a lot of ideas in different
> directions :\  At minimum I wish more of the standard library were
> written in pure Lua, so things like string.match() not accepting
> patterns with NULs in them can't happen again.  

  Okay, nothing stopping anybody from doing that.  But I suspect only a
small subset of the standard Lua library could be done in pure Lua.  

> PS: String are userdata :(  I don't want to treat them like a special
> type when they *are* just userdata with added magic.  We could do it
> in Lua.  userdata used for a buffer type would be much more applicable
> ~everywhere~, I feel like I have to turn to C to make the best of it
> and that frustrates me.  I like working only in Lua if I can.

  -spc (By the way, how much C programming do you do?)

[1]	https://github.com/spc476/lua-conmanorg/blob/master/src/net.c