[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: Looking for ideas to lower binding overhead for lua-re2. in plain-text)
- From: William Ahern <william@...>
- Date: Sat, 24 Aug 2013 19:46:56 -0700
On Fri, Aug 23, 2013 at 05:42:30PM -0700, lua.greatwolf@mamber.net wrote:
> >
> >You should be using luaL_checklstring() here. You can pass a pointer _and_
> >the length to the StringPiece constructor:
> >
> > size_t n;
> > const char *p = luaL_checklstring(L, 2, &n);
> > StringPiece subject(p, (int)n);
> >
> >Here's a snippet from the RE2 bindings I've written. These routines
> >implement the gmatch iterator:
> >
> >...
> Thank you for the feedback. Yes in my actual code I am going to be using
> "luaL_checklstring". My "re_countmatch" is really just a throwaway
> function that I'm using to help isolate possible performance
> bottlenecks. I'm finding that trying to shave off those last couple
> 100ms is proving to be very difficult.
>
> Did you happen to do any performance benchmarking and tuning on your
> bindings to see how it compares with C/C++ usage or another language
> binding to re2?
No. It was enough for us that RE2 was several times faster than the PCRE
library. Plus, there were other issues we ran into, like memory usage of
RE2, that distracted us. Perhaps we'll end up using your module.
I'm actually currently working on a library that will utilize PCRE, RE2, and
Ragel. Ragel can be an order of magnitude faster than even RE2, but many
PCRE expressions cannot be converted, and certain subpatterns (e.g.
repetitions of large character classes) cannot be compiled in a reasonable
amount of time when in a large union. So we ended up writing a regular
expression parser that analyzes and rewrites our Perl expressions, and
assigns them to be run either as PCRE, RE2, or Ragel. We've improved
throughput over 10x, and still have a ways to go.
<snip>
> - I originally used the iterator approach like what you've presented in
> your code and benchmarked that. Here's a sample output of the benchmark.
> Each line has the re2 dna pattern to match followed by the number of
> matches found in the provided dna seq string from stdin.
>
> agggtaaa|tttaccct 356
> [cgt]gggtaaa|tttaccc[acg] 1250
> a[act]ggtaaa|tttacc[agt]t 4252
> ag[act]gtaaa|tttac[agt]ct 2894
> agg[act]taaa|ttta[agt]cct 5435
> aggg[acg]aaa|ttt[cgt]ccct 1537
> agggt[cgt]aa|tt[acg]accct 1431
> agggta[cgt]a|t[acg]taccct 1608
> agggtaa[cgt]|[acg]ttaccct 2178
>
> Those numbers also meant how many times the code had to go from lua -> C
> and back again because 'luare2_nextmatch' had to be called that number
> of times before finally returning nil. I found this to be almost 400ms
> slower on my machine per iteration than if I simply constructed and
> returned a table containing the captures.
I'm not surprised. But I still prefer using iterators because it makes the
Lua code cleaner, and the :gmatch style interface is well docmented in the
manual. I habitually default to interators unless someone complains directly
to me, or it's clearly more appropriate to return a table.
Some of the engineers on my team dislike the way I make them use iterators,
but it makes their code more concise IMO. Plus, they tend to overemphasize
CPU performance over memory pressure. We do network scanning, and a smart
module will exit early after deciding what to do. It doesn't make sense to
load a table with thousands of elements if, after processing the hundreth,
the rest are dropped on the floor. And if filling an entire table is faster
overall, there's nothing stopping them from doing that--I don't care if they
hack on my Lua bindings, and I wish they'd do it more often.
<snip>
> On another note, have you considered releasing your luare2 bindings to
> the public?
I'm not very experienced with C++. I'd need someone to carefully re-examine
my C++ bindings before I'd ever publish the code. I just ran into a bug last
month in my bindings because a C construct I was using behaved completely
differently in C++, and it was never caught before because our C++ engineers
didn't understand what the C construct was doing--and they believed the
pernicious lie that C is just a subset of C++.
> I'm also curious how 'luare2_captures' is defined?
I'll send that to you offline.