Re: [reSIProcate] The Million User Dilemma
Hi Kennard,
In addition to what I stated yesterday, I've done quite a bit of
profiling to identify hot spots in resiprocate.
The first step was to cover the networking-related APIs. The next
would be to deal with the billions of calls to the allocator. It
doesn't matter whether you're using the best allocator known to man (in
this case tbbmalloc); because of all of those vectors of pointers, the
application spends more time making calls to the allocator than doing
anything else.
In addition to this, a great deal of heap fragmentation can still
occur, because the average SipMessage instance and its combined
components can take up well over 4K of memory, and each memory access
on it can cause cache misses. This is true even though the average SIP
message, as raw data, is 1K or less. I think if we
could implement an allocator at the SipMessage level (e.g. something it
holds on to, and is passed down the food chain to all the containers),
something simple like a rolling page allocator or such, a significant
improvement in performance could be made.
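
To make the per-message allocator idea concrete, here is a minimal
sketch, not resiprocate code. It assumes a C++17 compiler and uses
std::pmr::monotonic_buffer_resource as a stand-in for the rolling page
allocator; ArenaMessage, addHeader and ownsAddress are hypothetical
names made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory_resource>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed SIP message. This is NOT the real
// resip::SipMessage; it only illustrates one arena per message.
struct ArenaMessage {
    // One inline 4 KiB page. The resource bumps a pointer through it and
    // only falls back to the upstream resource once the page is used up.
    alignas(std::max_align_t) std::byte page[4096];
    std::pmr::monotonic_buffer_resource arena{
        page, sizeof(page), std::pmr::new_delete_resource()};

    // The container and the strings it holds all draw from the same arena.
    std::pmr::vector<std::pmr::string> headers{&arena};

    void addHeader(const char* text) {
        assert(text != nullptr);
        // Uses-allocator construction passes the arena on to the new string.
        headers.emplace_back(text);
    }

    // True if p points into this message's inline page.
    bool ownsAddress(const void* p) const {
        auto addr = reinterpret_cast<std::uintptr_t>(p);
        auto first = reinterpret_cast<std::uintptr_t>(page);
        return addr >= first && addr < first + sizeof(page);
    }
};
```

The point of the sketch is that the vector storage and the header text
end up next to each other in one page, and everything is released at
once when the message is destroyed, with no per-element calls to the
general-purpose allocator.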
Profiling has also shown me that serialization of SipMessages is one
of the slowest parts of the application. Using something like
boost::spirit::karma for the generation step could greatly improve its
speed.
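
As a rough illustration of the kind of generation I mean, here is a
small sketch. Note that it is not Karma: to keep it self-contained it
swaps in std::to_chars from the C++17 standard library, which shares the
idea of writing straight into a caller-supplied buffer without streams
or temporary strings. appendText and writeContentLength are hypothetical
helpers, not anything in resiprocate.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstring>
#include <string_view>
#include <system_error>

// Hypothetical helper: copy text into [out, end). Returns the new write
// position, or nullptr if the text does not fit.
inline char* appendText(char* out, char* end, std::string_view text) {
    assert(out != nullptr && out <= end);
    if (static_cast<std::size_t>(end - out) < text.size()) return nullptr;
    std::memcpy(out, text.data(), text.size());
    return out + text.size();
}

// Hypothetical helper: write "Content-Length: <n>\r\n" into [out, end).
// Returns one past the last byte written, or nullptr on overflow.
inline char* writeContentLength(char* out, char* end, unsigned length) {
    out = appendText(out, end, "Content-Length: ");
    if (out == nullptr) return nullptr;
    // to_chars formats the integer in place: no locale, no allocation.
    std::to_chars_result result = std::to_chars(out, end, length);
    if (result.ec != std::errc{}) return nullptr;
    return appendText(result.ptr, end, "\r\n");
}
```

Whether Karma or something hand-rolled like this is the better fit
would need measuring; the sketch only shows the shape of the approach.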
Of course, all of this is still without the stack being threaded.
I forked the resiprocate main branch not long ago and demonstrated
a significant improvement in performance by threading the transaction
layer of the stack. It should be in svn, and you can find it here:
https://svn.resiprocate.org/rep/resiprocate/branches/b-dw-repro-c++0ximprovements/
In this case, I also threaded the testStack app. I'm interested to see
how much better it performs using epoll now.
Dan