lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


It was thus said that the Great Coroutines once stated:
> On Thu, Mar 27, 2014 at 1:30 AM, Oliver Kroth <oliver.kroth@nec-i.de> wrote:
> > It would use one single buffer, possibly with precomputed sufficient size
> > (if, e.g. all operands are strings).
> > The .. operators use a buffer each plus the memory for the (intermediate)
> > result strings.
> > So string.concat() could save memory allocation operations and the multiple
> > copy operations from one buffer into the next.
> >
> > At least this is my understanding, correct me if I am wrong...
> >
> > -- Oliver
> >
> > Am 27.03.2014 00:26, schrieb Luiz Henrique de Figueiredo:
> >>>
> >>> Why not create such a function as a member of the string library?
> >>> string.concat( "there ", "I", " would look ",1, "st")
> >>
> >> How is that different from just this?
> >>         "there " .. "I" .. " would look ",1 .. "st"
> >>
> >
> >
> 
> I don't know within what sort of structure Lua strings eventually wind
> up (where is the hash stored?) but this is the luaL_Buffer structure:
> http://www.lua.org/source/5.2/lauxlib.h.html#luaL_Buffer
> 
> I kind of wish another member were added for referencing a string
> after this one.  So when you concatenate strings (especially long
> strings) they instead form a linked list that can be walked invisibly
> later, rather than actually moving the strings together in memory.
> The hash would be recomputed from the starting segment to include all
> segments it can iterate to (the string in total).  If intermediate
> string segments become unreferenced (collectable) they then get
> realloc'd together with the nearest segment before that is still
> reachable/referenced during a GC collection (for little gain though --
> all you'd be doing away with is the "header" of the string structure.
> 
> If the 'next' ptr is NULL the string ends at that segment, if not you
> continue forward.  A linked list of Lua strings has the potential to
> contain many smaller, intermediate strings and avoid moving strings in
> memory.

  I think the overhead of tracking these partial strings will swamp any
savings in ... what?  

  Take, for instance, the function here [1]

static int strcore_wrap(lua_State *const L)
{
  const char *src;
  size_t      ssz;
  size_t      margin;
  size_t      breakp;
  const char *lead;
  size_t      lsize;
  luaL_Buffer buf;

  src    = luaL_checklstring(L,1,&ssz);
  margin = luaL_optinteger(L,2,DEF_MARGIN);
  lead   = lua_tolstring(L,3,&lsize);
  breakp = margin;

  luaL_buffinit(L,&buf);

  while(true)
  {
    if (ssz < breakp)
      break;

    if (find_break_point(&breakp,src))
    {
      if (lead) luaL_addlstring(&buf,lead,lsize);
      luaL_addlstring(&buf,src,breakp - 1);
      luaL_addchar(&buf,'\n');
      src    += breakp;
      ssz    -= breakp;
      breakp  = margin;
    }
    else
      break;
  }

  if (lead) luaL_addlstring(&buf,lead,lsize);
  luaL_addlstring(&buf,src,ssz);
  luaL_addchar(&buf,'\n');
  luaL_pushresult(&buf);
  return 1;
}

What this does is take a potentially long string, and inserts line breaks at
around the given margin point (it finds the closest bit of whitespace at the
margin to break at) and adds an optional string to begin each new line.  So,
for instance, 

	x = "The quick brown fox jumps over the lazy dog"
	y = wrap(x,1,"> ")
	print(y)

	The
	> quick
	> brown
	> fox
	> jumps
	> over
	> the
	> lazy
	> dog

(I had a very particular use case for this)

So, using your method, the resulting ... um ... thing, would look I guess
something like:

	x = "The quick brown fox jumps over the lazy dog"
	pre = "> "
	nl  = "\n"

	y = 
	[ &x[0]:3 ]	-- pointer to x[0], length of 3
	[ &nl:1 ]
	[ &pre:2 ]
	[ &x[4]:5 ]	-- quick
	[ &nl:1 ]
	[ &pre:2 ]
	[ &x[10]:5 ]	-- brown
	[ &nl:1 ]
	[ &pre:2 ]
	[ &x[16]:3 ]	-- fox
	[ &nl:1 ]
	[ &pre:2 ]
	[ &x[20]:5 ]	-- jumps
	[ &nl:1 ]
	[ &pre:2 ]
	[ &x[26]:3 ]	-- the
	[ &nl:1 ]
	[ &pre:2 ]
	[ &x[31]:4 ]	-- lazy
	[ &nl:1 ]
	[ &pre:2 ]
	[ &x[36]:3 ]	-- dog
	[ &nl:1 ]

Even assuming a minimal overhead of a next pointer, string pointer and
length, on a 32 bit system that's 276 bytes of overhead (ignoring any
"string structure" overhead inherent in Lua) vs. 53 bytes (if I added
correctly) for the string copy (ignoring any "string structure" overhead
inherent in Lua).  And the discrepency only gets worse on a 64-bit system
(552 bytes vs 53 bytes).

  You will have to have a lot of very long strings for your method to win
over the current method.

  -spc

[1] from https://github.com/spc476/lua-conmanorg/blob/master/src/strcore.c