Hi Alex,

I think it's a problem of belief, and the cure is simply trying it out.

I'd be happy if anybody felt like supplying a JIT-tuned example to support Alex' assumption.

As I said, I couldn't make it a match for Fleece but I am no pro with LuaJIT.

I posted an article about how the speed is achieved for Fleece here:

http://www.eonblast.com/blog/fleece-performance/

It's very much about string buffers, memory copying and number conversion. Not necessarily only about Lua.

For example, you can use sprintf() or write something faster. That's where Fleece pulls out all the stops. Those things could be applied to an encoder in any language.
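To illustrate the sprintf() point, here is a minimal sketch of a hand-written integer-to-decimal routine that writes straight into an output buffer. It is not taken from Fleece; the name append_int and its interface are assumptions made up for this example. The idea is only that skipping format-string parsing and the terminating NUL can save work when the conversion runs once per number.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not Fleece code: write the decimal form of v at dst
 * and return the number of characters written. No terminating NUL is added,
 * so the caller can keep appending right behind it. dst must have room for
 * at least 21 characters. */
static size_t append_int(char *dst, long v)
{
    char tmp[24];                    /* enough for a 64-bit long plus sign */
    char *p = tmp + sizeof tmp;      /* fill the scratch area back to front */
    /* take the magnitude in unsigned arithmetic so LONG_MIN is safe */
    unsigned long u = (v < 0) ? 0UL - (unsigned long)v : (unsigned long)v;
    size_t n;

    assert(dst != NULL);

    do {                             /* emit digits, least significant first */
        *--p = (char)('0' + (u % 10));
        u /= 10;
    } while (u != 0);
    if (v < 0)
        *--p = '-';

    n = (size_t)(tmp + sizeof tmp - p);
    memcpy(dst, p, n);               /* one short copy into the real buffer */
    return n;
}
```

Called as n = append_int(buf, -42), it writes the three characters -42 into buf and returns 3. Whether this actually beats sprintf() on a given platform is something to measure, not assume.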

(post continued below)

On 3/20/11 10:50 PM, Alexander Gladysh wrote:
On Sun, Mar 20, 2011 at 22:28, Henning Diedrich <hd2010@eonblast.com> wrote:
On 3/20/11 1:49 AM, Alexander Gladysh wrote:
On Sun, Mar 20, 2011 at 03:33, Henning Diedrich <hd2010@eonblast.com> wrote:
Fleece is a ~ 10 times faster Lua to JSON converter.

I would very much like to see comparison of this module performance
with fine-tuned LuaJIT2/FFI implementation.
Totally seconded. I tried and didn't get too far, e.g. with JIT+Yajl,
JIT+JSON4.
Um. No, I mean plain Lua code under JIT, without bindings — assuming
"ideal" compiler you do not need any if you're only generating JSON,
not parsing it.

Well, again: because it is very much faster.

It's a byproduct of an actual problem, and its solution, so let's leave the theorizing + principles level behind.

If you simply can't believe it, the right thing is to try it out - but insisting on disbelief doesn't change reality. Well, at least not that of others.

So if you would like to discuss, you are invited to try it out. That might make all further talk idle.

And I will grant at any time the possibility that I am wrong.

I am still not sure, though, that you have actually looked closely at Fleece yet. If you had, you would have found it described (and could have tried out) that the last 10% of performance comes from hand-written assembler.

The fact that assembler only gets another 10% tells you where the C source is at.

JITs can get faster than C; breaking that sound barrier is their real magic. But I don't know that they can get faster than manually tuned assembler. And it's not like gcc is stupid either.

Regarding whether bindings are needed: you could of course build a firewall from a plain packet filter, which would cover a couple of use cases. It's definitely not what we needed.

And the existence of packet filters is not a sufficient argument against the existence of firewalls.

But still, if you give me any simple snippet of source that generates JSON, I'll be happy to benchmark it for you. Unless it's simply 'print("[1,2,3]")', you might be surprised.

I did that before starting Fleece and the results were not convincing. Here is a recent experiment with LuaJIT, using source that I think Roberto wrote originally:

-----------------------------------------------------------------------
-- mini json stringify for output
-----------------------------------------------------------------------
-- from http://lua-users.org/wiki/TableUtils

function table.val_to_str ( v )
  if "string" == type( v ) then
    v = string.gsub( v, "\n", "\\n" )
    if string.match( string.gsub(v,"[^'\"]",""), '^"+$' ) then
      return "'" .. v .. "'"
    end
    return '"' .. string.gsub(v,'"', '\\"' ) .. '"'
  else
    return "table" == type( v ) and table.tostring( v ) or
      tostring( v )
  end
end

function table.key_to_str ( k )
  if "string" == type( k ) and string.match( k, "^[_%a][_%a%d]*$" ) then
    return k
  else
    return "[" .. table.val_to_str( k ) .. "]"
  end
end

function table.tostring( tbl )
  local result, done = {}, {}
  for k, v in ipairs( tbl ) do
    table.insert( result, table.val_to_str( v ) )
    done[ k ] = true
  end
  for k, v in pairs( tbl ) do
    if not done[ k ] then
      table.insert( result,
        table.key_to_str( k ) .. "=" .. table.val_to_str( v ) )
    end
  end
  return "{" .. table.concat( result, "," ) .. "}"
end

There must be much room to improve the performance of this snippet, and then to tune it right with LuaJIT.

But then it also lacks some detail before it can be said to produce actual JSON.

What I saw: as it is, Fleece is ~ 40 times faster than this snippet on Lua 5.1 and ~ 20 times faster than this snippet run on LuaJIT 2.

If someone knows how to tune it for LuaJIT to get it to fly, I'd be happy to learn about it.

LuaJIT seems to speed JSON4 up by around 4 times, as I would expect. I am sure this can be increased. If you or somebody else would like to give it a try, it would be very welcome.

BTW, there is also another luajson:
https://github.com/harningt/luajson which you did not cover with
benchmarks (or is it this one
http://luaforge.net/projects/luajsonlib/? looks like it is not).

No, it's native Lua. You could use this instead of JSON4.

As I tried to convey in other posts in this thread, I consider the
programming practice when one accesses private internals of a complex
third party system to be unacceptable in all, except the most
exceptional cases.

You're entitled to your opinion but let's look at facts.

There was a use case we needed this for, so it makes little sense to argue that it's useless.

There was a clear priority that guided the development. It makes less sense to state that priorities can be different. They sure can.

Also, Lua is not that complex, and its releases are sufficiently infrequent, as a matter of explicit policy.

To give some idea, here is the part where Lua 5.1 and LuaJIT 2 must differ: Lua 5.1 first, then LuaJIT.

It's ~50 lines net per VM, out of a couple of thousand (200 both + comments).

All the rest is the same.

/*****************************************************************************\
***                                                                         ***
***                            HANDLING TABLES                              ***
***                                                                         ***
\*****************************************************************************/

/*---------------------------------------------------------------------------*\
 ** Stringify the array part of a table                                     **
 *---------------------------------------------------------------------------*
 * A table, internally, has a hash & an array part                           *
\*---------------------------------------------------------------------------*/

#ifdef LUA_5_1
/* This is Lua 5.1.4 specific.                                               */
void stringify_array_part (insp_ctrl *ctrl, const Table *t, size_t *count, int *pure, int *tried, int force)
{
    int lg;
    int ttlg;  /* 2^lg */
    int i = 1;  /* count to traverse all array keys */
    int try = 0; /* 1 := started out well to stay w/o a key = pure */
    int pu = 1;
    for (lg=0, ttlg=1; lg<=MAXBITS; lg++, ttlg*=2) {  /* for each slice */
        int lim = ttlg;
        if (lim > t->sizearray) {
            lim = t->sizearray;  /* adjust upper limit */
            if (i > lim)
                break;  /* no more elements */
        }
        /* elements in range (2^(lg-1), 2^lg] */
        for (; i <= lim; i++) {
            TValue * v = &t->array[i-1];
            if(!ttisnil(v)) {
 
                /* key ----------------------------------------------------- */
                if(force || *count != i) {
                    table_array_key(ctrl, i);
                    if(try) { (*count)++; *pure = 0; *tried = 1; return; } /* cancel and redo */
                    pu = 0;
                } else {
                    try = 1; /* we try to stay w/o key = pure */
                }
                (*count)++; // TODO: local
                /* value --------------------------------------------------- */
                stringify_value(ctrl, v);
                /* this must NOT be done entirely blindly. But it is taken care of here (*) */
                buffer_add_char_blindly(ctrl, ','); 
 
            }
        }
    }
    *pure = pu;
    *tried = try;
 
}

/*---------------------------------------------------------------------------*\
 * Stringify the hash part of a table                                        *
 *---------------------------------------------------------------------------*
 * A table, internally, has a hash & an array part                           *
\*---------------------------------------------------------------------------*/

/* This is Lua 5.1.4 specific.                                               */
void stringify_hash_part (insp_ctrl *ctrl, const Table *t, size_t *count, int *ppure)
{
    int i = sizenode(t);
    int pure = *ppure;

 
    while (i--) {
        Node *node = &t->node[i];
        if(!ttisnil(key2tval(node)) && !ttisnil(gval(node))) {
 
            table_hash(ctrl, node)
 
        }
    }
    *ppure = pure;
}
#endif

#ifdef JIT_2

/*---------------------------------------------------------------------------*\
 ** Stringify the array part of a table                                     **
 *---------------------------------------------------------------------------*
 * This is LuaJIT 2 specific.                                                 *
 * A table, internally, has a hash & an array part                           *
\*---------------------------------------------------------------------------*/

/* This is LuaJIT 2 specific.                                                */
void stringify_array_part (insp_ctrl *ctrl, const Table *t, size_t *count, int *pure, int *tried, int force)
{
    int try = 0; /* 1 := started out well to stay w/o a key = pure */
    int pu = 1;
    uint32_t i, b;
    if (t->asize == 0) return;
    for (i = b = 0; b < LJ_MAX_ABITS; b++) {
        uint32_t n, top = 2u << b;
        TValue *array;
        if (top >= t->asize) {
            top = t->asize-1;
            if (i > top)
                break;
        }
        array = tvref(t->array);
        for (n = 0; i <= top; i++) {
            //- printf("Loop array part (count %zd)\n", *count);
            TValue *v = &array[i]; /* i, not i-1, as in the Lua 5.1 part  */
            if (!tvisnil(v)) {
                /* key ----------------------------------------------------- */
                /* >> cloned (diff 2x: i->i+1) */
                if(force || *count != i) { 
                    table_array_key(ctrl, i);
                    if(try) { (*count)++; *pure = 0; *tried = 1; return; } /* cancel and redo */
                    pu = 0;
                } else {
                    try = 1; /* we try to stay w/o key = pure */
                }
                (*count)++; // TODO: local
                /* value --------------------------------------------------- */
                stringify_value(ctrl, v);
                /* this must NOT be done entirely blindly. But it is taken care of here (*) */
                buffer_add_char_blindly(ctrl, ','); 
 
 
            }
        }
    }
    *pure = pu;
    *tried = try;
    
}


/*---------------------------------------------------------------------------*\
 * Stringify the hash part of a table                                        *
 *---------------------------------------------------------------------------*
 * A table, internally, has a hash & an array part                           *
\*---------------------------------------------------------------------------*/

/* This is LuaJIT 2 specific, adapted from counthash(), lj_tab.c             */
void stringify_hash_part (insp_ctrl *ctrl, const Table *t, size_t *count, int *ppure)
{
  uint32_t i, hmask = t->hmask;
  int pure = *ppure;
  Node *node = noderef(t->node);
  for (i = 0; i <= hmask; i++) {
 
    Node *n = &node[i];
    if (!tvisnil(&n->val) && !tvisnil(&n->key)) {
 
            table_hash(ctrl, n)

    }
  }
  *ppure = pure;
}
#endif
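The buffer_add_char_blindly(ctrl, ',') calls above append a comma after every value, and the comment next to them says the unchecked write is taken care of elsewhere. One common way to make that pattern safe is sketched below, as an assumption and not as Fleece's actual code: reserve space before the blind writes, then overwrite the trailing comma with the closing bracket. The outbuf type and all function names here are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable output buffer; names are illustrative only. */
typedef struct {
    char  *data;
    size_t len;
    size_t cap;
} outbuf;

/* Make sure at least `extra` more bytes fit. Returns 0 if realloc fails. */
static int outbuf_reserve(outbuf *b, size_t extra)
{
    if (b->len + extra > b->cap) {
        size_t ncap = b->cap ? b->cap : 64;
        char *p;
        while (ncap < b->len + extra)
            ncap *= 2;
        p = realloc(b->data, ncap);
        if (p == NULL)
            return 0;
        b->data = p;
        b->cap = ncap;
    }
    return 1;
}

/* "Blind" append: no run-time bounds check, the caller reserved the space.
 * The assert documents that contract and vanishes under NDEBUG. */
static void outbuf_add_char_blindly(outbuf *b, char c)
{
    assert(b->len < b->cap);
    b->data[b->len++] = c;
}

/* Encode an int array as a JSON array. Returns 0 if allocation fails. */
static int encode_int_array(outbuf *b, const int *v, size_t n)
{
    size_t i;
    if (!outbuf_reserve(b, 2))       /* room for '[' and a possible ']' */
        return 0;
    outbuf_add_char_blindly(b, '[');
    for (i = 0; i < n; i++) {
        /* 16 bytes cover the longest 32-bit int, sprintf's NUL, and ',' */
        if (!outbuf_reserve(b, 16))
            return 0;
        b->len += (size_t)sprintf(b->data + b->len, "%d", v[i]);
        outbuf_add_char_blindly(b, ',');   /* overwrites sprintf's NUL */
    }
    if (n > 0)
        b->data[b->len - 1] = ']';   /* replace the trailing comma */
    else
        outbuf_add_char_blindly(b, ']');
    return 1;
}
```

Starting from an empty buffer (outbuf b = {0}), encoding the array {1, 2, 3} leaves the seven bytes [1,2,3] in b.data, and an empty array produces []. The point of the design is that the per-element path has one capacity check and no "is this the last element" branch.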


This is the thing that violates my "code of engineering" in a
hard way.

But the golden rule can be overridden for tools by people going by the initials M.P.? Or how are the exceptions consistent?

I just want to know, why did you do this in that way?

Speed. If you get something twice as fast, or even five times faster, it allows for new things.

Specifically for us: reducing shards and empowering game designers.

I think the problem is just that you don't believe how much of a bottleneck this was, and that it's possible to speed things up so much.

Cheers,
Henning