
Thanks for the background, so I'm kind of comparing apples and oranges?

Anyways, I had done the benchmarking, and up until the last yards Lua was indeed always faster. With use of qr// in Perl, it wasn't.

Which led me to wonder whether anything could be done. In general terms, I too would prefer Lua over anything, but it also needs to be "faster or the same speed" as the competitor. Otherwise, a transition would certainly, and justifiably, be questioned.

The actual issue is CSV parsing (comma separated values). Just the simple "%d+,%d+,[^,]," kind. Any ideas for a better approach in Lua than string.match? (Making a C module just for cutting out the fields would be fast, but is not really an alternative, at least not in a comparison with other languages.)
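
A minimal sketch of two pure-Lua options for this kind of line, assuming fields never contain quoted or escaped commas. The names parse3 and split are made up for illustration, and the third field is written as [^,]* here on the assumption that it may hold more than one character. Which variant is faster would have to be measured on real data.

```lua
-- Sketch only: simple CSV lines, no quoting or escaped commas.
local find, sub, match = string.find, string.sub, string.match

-- 1) One anchored pattern with captures, for a fixed field layout.
--    Returns the first three fields as strings, or nil if the line
--    does not have the expected shape.
local function parse3(line)
  return match(line, "^(%d+),(%d+),([^,]*),")
end

-- 2) Plain-text search for each comma (4th argument `true` turns off
--    pattern matching), for any number of fields.
local function split(line)
  local fields, n, start = {}, 0, 1
  while true do
    local pos = find(line, ",", start, true)
    n = n + 1
    if not pos then
      fields[n] = sub(line, start)   -- last field, may be empty
      break
    end
    fields[n] = sub(line, start, pos - 1)
    start = pos + 1
  end
  return fields
end
```

The anchored pattern fails fast on lines that do not fit, while the plain find avoids the pattern matcher altogether at the cost of one call per field.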

-asko



On Thu, 9 Nov 2006 23:23:12 +0100
 Karel Tuma <ktuma@email.cz> wrote:
just benchmark it :)
lua of course wins, but not always.

the thing with compiling regexes: they become FSA (finite state automata), basically a graph of states where nodes can point to any other nodes. an FSA compiler is considered non-trivial in terms of implementation size.

but lua patterns are sort of NFA (nondeterministic): they're used as they are, with no separate compile step. due to that, referring back to some state or sub-expression is very limited, making lua patterns "less powerful" than pcre's, but enough for the usual daywork, and when you need something more,
you have to code the logic all by yourself.
to be exact, an FSA is slower for "simple" expressions, whereas an NFA is slower on the complex ones. some pcre implementations even use both and choose
between them depending on the expression (!).

hey, this is lua, we want the simple and fast, right?

add to this perl's suckiness on pretty much everything else and lua is the winner
(i'm using lua for parsing 1Gb+ logs using mmap(), look at
http://blua.leet.cz/sep/STRHOOK_PATCH.patch to get the idea)

On Thu, Nov 09, 2006 at 09:58:22PM +0100, Asko Kauppi wrote:

I didn't find any earlier discussion of precompiled regular expressions in Lua.

Some background:

In huge log file handling, Lua loses to Perl (not by much!), seemingly because it has no concept of precompiling regular expression patterns and then applying them.

In Perl, one can:

    my $re = qr/^\d+\s/;
    $var =~ $re;         # $re is a precompiled, optimized regex, applied to $var
or:
    $var =~ /^\d+\s/o;   # 'o' for once: compile once, cache, reuse

Lua:
string.match( var, "%d+%s" ) -- is "%d+%s" analyzed anew each time?
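
A rough timing sketch for the Lua side, assuming the stock string library. As far as I know, it has no separate compile step and walks the pattern text directly on every call, so there is nothing to cache. The input lines are synthetic, count_matches is a made-up helper name, and the pattern is anchored with ^ to mirror the Perl regex above.

```lua
-- Timing sketch: apply one pattern to many lines and count the matches.
local match = string.match   -- local alias avoids a table lookup per call
local clock = os.clock

local function count_matches(lines, pattern)
  local hits = 0
  for i = 1, #lines do
    if match(lines[i], pattern) then hits = hits + 1 end
  end
  return hits
end

-- Synthetic input (hypothetical data, not real log lines):
-- even entries start with digits and a space, odd ones do not.
local lines = {}
for i = 1, 100000 do
  lines[i] = (i % 2 == 0) and (i .. " request ok") or ("warn " .. i)
end

local t0 = clock()
local hits = count_matches(lines, "^%d+%s")
print(hits, clock() - t0)    -- match count and elapsed CPU seconds
```

The ^ anchor matters for speed as well as for equivalence: without it, a failed match is retried from every position in the line.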


Is Lua losing in speed because it lacks this concept, or have the authors been so clever that it's already "there", just invisible? :) We're talking BIG files, say 1 million lines or more.

-asko