lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Am 27.03.2013 22:54 schröbte Philipp Janda:

[...]

There are some options (with varying degrees of speed and safety)
mentioned in this[1] thread.

   [1]: http://lua-users.org/lists/lua-l/2012-01/msg00396.html

Another possibility (that I think isn't mentioned in the above thread,
but I only took a brief look) could be to put all your string constants
into upvalues and use the Lua API (lua_upvalueindex,
lua_compare/lua_rawequal) to compare those upvalues to the argument
strings on the stack.

But then again, libc implementers and compiler writers have had some
time to optimize strcmp.

Time for some micro-benchmarks ...
I've tested 6 possible implementations:

sample1: using memcmp
  size_t len = 0;
  char const* s = lua_tolstring( L, 1, &len );
#define KEY "mymethod"
  if( len == sizeof( KEY )-1 && !memcmp( s, KEY, sizeof( KEY )-1 ) ) )
#undef KEY
    /* ... */;
  else if( ... ) ...


sample2: using strcmp
  char const* s = lua_tostring( L, 1 );
  if( !strcmp( s, "mymethod" ) )
    /* ... */;
  else if( ... ) ...


sample3: using ragel to get a compiled state machine (see linked thread)
  size_t len = 0;
  char const* s = lua_tolstring( L, 1, &len );
  switch( str2id( s, len ) ) {
    case ID_MYMETHOD:
      /* ... */;
      break;
    case ...


sample4: using lua_rawequal and lua_upvalueindex
  if( lua_rawequal( L, 1, lua_upvalueindex( 1 ) ) )
    /* ... */;
  else if( ... ) ...


sample5: using gperf (see linked thread)
  switch( str2id( s, len ) ) {
    case 0:
      /* ... */;
      break;
    case ...


sample6: lua_upvalueindex, lua_tostring and pointer comparison
  char const* s = lua_tostring( L, 1 );
  if( s == lua_tostring( L, lua_upvalueindex( 1 ) ) )
    /* ... */;
  else if( ... ) ...


Some interesting facts before the results:
- sample1 does actually "call" memcmp (using -O3) which was surprising for me. I suspected inlining and loop unrolling, since two of the three arguments are compile-time constants
- sample2 uses a special assembler instruction for strcmp
  'if( strcmp( s, "mymethod" ) )' basically becomes:
        movl    $.LC0, %edi     # string constant "mymethod"
        movl    $9, %ecx        # length of "mymethod"
        movq    %rax, %rsi
        repz cmpsb
        je      .L18

And here are the relative timings for
- 10 valid keys (mean length: 6.3)
- 2/3 lookups of valid keys, 1/3 invalid lookups:

sample1
time elapsed:	1.15
sample2
time elapsed:	2.19
sample3
time elapsed:	1.2
sample4
time elapsed:	2.58
sample5
time elapsed:	1.26
sample6
time elapsed:	2.2


Philipp


#!/usr/bin/lua5.1

local n = 1000000

local keys = {
  -- those are valid keys:
  "mymethod", "mykey", "foo", "bar", "baz", "helloworld", "terminal", "imoutofideas", "ubuntu", "linux",
  -- those are nonexistent:
  "abcedefg", "12343567", "blablabla", "non", "enough",
}

local funcs = {
  require( "sample1" ), -- memcmp
  require( "sample2" ), -- strcmp
  require( "sample3" ), -- ragel state machine
  require( "sample4" ), -- lua_rawequal, lua_upvalueindex
  require( "sample5" ), -- gperf perfect hashing function
  require( "sample6" ), -- s1 == s2, lua_upvalueindex
}

for i = 1, #funcs do
  local f = funcs[ i ]
  print( "sample" .. i )
  local start = os.clock()
  for j = 1, n do
    for k = 1, #keys do
      f( keys[ k ] )
    end
  end
  local stop = os.clock()
  print( "time elapsed:", stop-start )
end