lua-l archive


It was thus said that the Great Josh Haberman once stated:
> Sam Roberts <vieuxtech <at> gmail.com> writes:
> 
> > Why not use userdata, so if they keep an entry in a global, they can
> > keep it, and if they don't assign it to a global it will get garbage
> > collected?
> 
> I ultimately want to avoid having to do a per-entry allocation.
> malloc() is *expensive* when performed this often.

  I thought that too until I actually profiled some code.  My email API
allocates around 6k-8k per message (total; there's at least one call to
malloc() per header).  On my 2.4GHz dual-core Pentium system the output of
the profiler showed this (I wrapped the calls to malloc(), realloc() and
free() so they would show up in this output):

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 71.10      0.73     0.73   756091     0.00     0.00  header_parse
  7.85      0.81     0.08                             hmap_search_cmp
  6.86      0.88     0.07    81694     0.00     0.00  address_fix
  1.47      0.89     0.02   884790     0.00     0.00  alloc_value
  0.98      0.90     0.01  1666702     0.00     0.00  my_free	*****
  0.98      0.91     0.01   735819     0.00     0.00  fix_header
  0.98      0.92     0.01   585962     0.00     0.00  trim_space
  0.98      0.93     0.01   283824     0.00     0.00  a_free
  0.98      0.94     0.01   140962     0.00     0.00  state_outside
  0.98      0.95     0.01   101330     0.00     0.00  split_line
  0.98      0.96     0.01    58484     0.00     0.00  state_address
  0.98      0.97     0.01    40544     0.00     0.00  r_free
  0.98      0.98     0.01    20272     0.00     0.00  email_free
  0.98      0.99     0.01    20272     0.00     0.00  fix_return_path
  0.98      1.00     0.01    20271     0.00     0.00  fix_list_archive
  0.98      1.01     0.01    20271     0.00     0.00  fix_reply_to
  0.98      1.02     0.01    20208     0.00     0.00  fix_mime_version
  0.00      1.02     0.00   987510     0.00     0.00  my_malloc	*****
  0.00      1.02     0.00   405448     0.00     0.00  s_free
  0.00      1.02     0.00   284498     0.00     0.00  my_realloc *****
	[ snip ]

  I've marked the wrapped calls with '*****' and as you can see, between
them they account for about 0.01 of the 1.02 seconds sampled.  I thought
about removing the calls, but after profiling the code I can see it's not
really worth the time [1].

  In other words, don't worry about calling malloc() unless you have proof
it's a performance issue.
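
  For anyone wanting to repeat the measurement, here is a minimal sketch of
what such wrappers might look like.  The names my_malloc(), my_realloc() and
my_free() are taken from the profile above; their bodies, the call counters
and copy_header_value() are assumptions made for illustration, not the
actual code.  The idea is only that a call-graph profiler lists functions it
has instrumented, so routing every allocation through thin functions of your
own (compiled with profiling enabled and not inlined) gives the allocator
its own rows in the output.

```c
#include <stdlib.h>
#include <string.h>

/* Call counters, for illustration only (an assumption, not from the
 * original code); a profiler such as gprof counts calls by itself. */
static unsigned long malloc_calls, realloc_calls, free_calls;

/* Thin wrapper: shows up under its own name in the profile. */
void *my_malloc(size_t size)
{
    malloc_calls++;
    return malloc(size);
}

void *my_realloc(void *ptr, size_t size)
{
    realloc_calls++;
    return realloc(ptr, size);
}

void my_free(void *ptr)
{
    free_calls++;
    free(ptr);
}

/* Hypothetical caller: one allocation per header value, routed
 * through the wrapper instead of calling malloc() directly. */
char *copy_header_value(const char *value)
{
    size_t len  = strlen(value) + 1;   /* include the terminating NUL */
    char  *copy = my_malloc(len);

    if (copy != NULL)
        memcpy(copy, value, len);
    return copy;
}
```

  The rest of the program then calls my_malloc() and friends everywhere it
would otherwise call the C library directly, and the wrappers' call counts
and times can be read straight off the flat profile.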

  -spc (There's not much I can do about header_parse() though ... )

[1]	"Okay, but what about multithreaded apps?  Perhaps the contention
	would make things worse." Okay, I linked against pthreads and the
	actual times for malloc(), realloc() and free() *DROPPED*.  Go
	figure.