- Subject: Re: yet another pattern-matching library for Lua
- From: Mike Pall <mikelu-0701@...>
- Date: Sat, 6 Jan 2007 21:33:21 +0100
> Doing some testing with very simple grammars I found them to be about
> twice as slow as the equivalent regular expression in Perl.
I cannot confirm this. I've translated the regex-dna benchmark to
PEGs and compared it with the existing Lua, C, Perl and Tcl
programs (Tcl was chosen because it wins on this particular
benchmark). Here's a performance comparison for N=500000,
normalized to the timing for regexdna.lua:
  2.7x  Tcl (embedding Henry Spencer's regex library)
  1.3x  C (PCRE)
  1.2x  Lua LPeg (*)
  1.0x  Lua regex
As you can see, Perl is slower in this particular benchmark. Of
course your settings may differ, but I suggest checking these points:
- Compile LPeg and Lua with the maximum C compiler optimization.
- Construct your patterns _once_ and only then use them with
  lpeg.match repeatedly (construction is slow, matching is fast).
- Optimize your PExpressions: combine common prefixes, add fast
  prechecks, use charsets, ...
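
A minimal sketch of the last two points, assuming a current LPeg
release (lpeg.C and lpeg.Ct may not match the 0.1/0.2 API exactly).
The DNA-like patterns, the sample subjects and the searcher() helper
are made up for illustration; they are not the ones from the benchmark.

```lua
local lpeg = require("lpeg")
local P, S, C, Ct = lpeg.P, lpeg.S, lpeg.C, lpeg.Ct

-- Naive form: three alternatives that differ in a single character.
local naive = P("aaggtaaa") + P("acggtaaa") + P("atggtaaa")

-- Same language, with the shared prefix and suffix factored out and
-- the varying character expressed as a charset.
local optimized = P("a") * S("act") * P("ggtaaa")

-- Hypothetical helper (not part of LPeg): wrap a pattern so that it
-- collects every non-overlapping match in the subject into a table.
-- At each position it tries patt, otherwise it skips one character.
local function searcher(patt)
  return Ct((C(patt) + 1)^0)
end

-- Construction happens once, up front.
local find_naive = searcher(naive)
local find_optimized = searcher(optimized)

-- Matching reuses the prebuilt patterns; nothing is built in the loop.
local subjects = { "ttaaggtaaacc", "acggtaaaatggtaaa", "gggg" }
for _, s in ipairs(subjects) do
  local a = lpeg.match(find_naive, s)
  local b = lpeg.match(find_optimized, s)
  print(s, #a, #b)
end
```

Both searchers should report the same match counts for every subject.
Whether the factored form is measurably faster depends on the LPeg
version and the input, so time it on your own data.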
(*) This was for LPeg 0.1. Unfortunately LPeg 0.2 lost some steam
because it's now binary-clean (and can't rely on NUL-checks).
I've already sent some suggestions to Roberto on how to get back
most of the performance.
> I know this is still very early, can we expect performance
> improvements or is this about as fast as it gets?
The PEVM interpreter in LPeg 0.2 isn't fully optimized (yet). I'm
pretty sure Roberto is aware of this. But it doesn't make sense
to start optimizing the VM before the functionality is complete.
I.e. wait for version 1.0.
But ... I've toyed with the idea of writing a JIT compiler for the
PEVM. This should be pretty easy since it's a really tiny VM and
most opcodes translate to just a handful of machine instructions.
I've prototyped this by handcrafting the optimal machine code for
the patterns used in the above benchmark and got a whopping
20x-30x speedup ... so, yes, there's quite a bit of headroom. :-)
[ Caveat: this is just a theoretical upper limit. Don't expect to
get this kind of speedup with JIT compiler generated code or on
other (more branchy) benchmarks. I currently have no plans to
write such a compiler and certainly won't start before LPeg has
reached 1.0. ]