- Subject: Re: destructive iteration patterns
- From: Josh Haberman <jhaberman@...>
- Date: Tue, 12 Apr 2011 21:46:45 +0000 (UTC)
Sean Conner <sean <at> conman.org> writes:
> It was thus said that the Great Josh Haberman once stated:
> > I ultimately want to avoid having to do a per-entry allocation.
> > malloc() is *expensive* when performed this often.
>
> In other words, don't worry about calling malloc() unless you have proof
> it's a performance issue.
If I had a nickel for every time someone tried to convince me that my
performance concerns are invalid...
(I've had this argument before about C++ virtual functions [0] and
about sub-optimal compiler-generated code [1]. In both cases
benchmarks and/or profiles showed that yes, these things can make
a significant difference).
I'll see your profile and raise you a benchmark. Here are numbers for
my protobuf parser parsing into a data structure. I have optimizations
that avoid malloc()/free() and memcpy() for the strings contained in the
protobuf, but I can tweak my code a little bit to turn these optimizations
off and benchmark the results:
malloc/free  memcpy  MB/s
-----------  ------  ----
     -         -      712
     -         X      643
     X         -      465
     X         X      436
As you can see, avoiding the malloc/free and memcpy sped up my
benchmark by >60%. You might expect then that malloc/free/memcpy
would show up as 60% of the profile in the slowest case, but this is
actually not the case -- they show up as more like 20%. I'm guessing
that this is due to reduced locality: reallocating the strings every time
means they jump around in memory and can't stay in caches as easily.
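
As a quick check on the figures in the table, the helpers below (hypothetical names, not taken from the parser) turn two throughput numbers into a speedup and into the share of run time saved. A sketch, assuming both runs process the same amount of data:

```c
#include <assert.h>

/* Speedup in percent when throughput rises from slow_mbps to fast_mbps. */
double speedup_percent(double slow_mbps, double fast_mbps) {
    assert(slow_mbps > 0.0 && fast_mbps > 0.0);
    return (fast_mbps / slow_mbps - 1.0) * 100.0;
}

/* Share of the slow run's wall time, in percent, that the fast run no
 * longer spends. Time for a fixed amount of data is proportional to
 * 1/throughput, so the saved share is 1 - slow/fast. */
double time_saved_percent(double slow_mbps, double fast_mbps) {
    assert(slow_mbps > 0.0 && fast_mbps > 0.0);
    return (1.0 - slow_mbps / fast_mbps) * 100.0;
}
```

For 436 and 712 MB/s, speedup_percent() gives about 63 and time_saved_percent() gives about 39. So the share a profile would be expected to attribute to malloc/free/memcpy in the slowest case is closer to 39% than 60%, which is still roughly twice the 20% or so that actually shows up.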
In other words: yes, malloc()/free() can be expensive if you're doing it
often enough and the rest of your code is fast enough. Also, a profile
can't always capture the effects of the memory hierarchy.
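
The message does not say how the parser avoids the per-string malloc()/free() and memcpy(). One common way to get that effect, shown here purely as an assumption, is to hand out a pointer and length that reference the input buffer instead of copying out of it. The names strview, field_copy and field_ref are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical view type: points into the input buffer, owns nothing. */
typedef struct {
    const char *data;
    size_t len;
} strview;

/* Copying variant: one malloc() and one memcpy() per string field.
 * The caller must free() the result. Returns NULL if malloc() fails. */
char *field_copy(const char *buf, size_t off, size_t len) {
    assert(buf != NULL);
    char *s = malloc(len + 1);
    if (s == NULL) return NULL;
    memcpy(s, buf + off, len);
    s[len] = '\0';
    return s;
}

/* Referencing variant: no allocation and no copy. The view is only
 * valid while the input buffer is alive and unmodified, and it is
 * not NUL-terminated. */
strview field_ref(const char *buf, size_t off, size_t len) {
    assert(buf != NULL);
    strview v = { buf + off, len };
    return v;
}
```

With buf holding "hello world", field_ref(buf, 0, 5) yields a view whose data pointer is buf itself, while field_copy(buf, 6, 5) returns a fresh heap string "world" that must be freed. The trade-off is lifetime: the view is cheap but ties the parsed data to the input buffer.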
Josh
[0] http://stackoverflow.com/questions/5099967/how-to-obtain-a-pointer-out-of-a-c-vtable
[1] http://blog.reverberate.org/2011/03/19/when-a-compilers-slow-code-actually-bites-you/