
Re: [reSIProcate] The Million User Dilemma


Hi,

Regarding serializing SipMessages: have you done any testing with
resipfaststreams.hxx? It's basically a wrapper around the standard library
conversion functions (ltoa, etc.). The performance of ltoa, for example, is
on par with karma for at least the conversion portion of the generation
process
(http://www.boost.org/doc/libs/1_45_0/libs/spirit/doc/html/spirit/karma/performance_measurements/numeric_performance/int_performance.html).

Karma is slightly faster (about 17% faster than ltoa), and along with
enhanced data structures (fusion?) the performance of the generation process
could possibly be improved further. I'm all for that :-).
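
For reference, here is a minimal, untested sketch of what the two conversion
paths look like (the function names are mine, and snprintf stands in for
ltoa since ltoa isn't standard C++):

#include <boost/spirit/include/karma.hpp>
#include <cstddef>
#include <cstdio>

namespace karma = boost::spirit::karma;

// Plain C-library conversion, standing in for ltoa/_ltoa.
inline std::size_t toTextStdio(long value, char* buf, std::size_t len)
{
    return static_cast<std::size_t>(std::snprintf(buf, len, "%ld", value));
}

// The same conversion through karma's integer generator; it writes raw
// digits (no null terminator) and returns the number of characters produced.
inline std::size_t toTextKarma(long value, char* buf)
{
    char* out = buf;
    karma::generate(out, karma::long_, value);
    return static_cast<std::size_t>(out - buf);
}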

Faststreams.hxx would at least be a good place to start to confirm that
there are significant gains to be had from modifying the generation process.
That was the case for my application (Win32), even though it has more modest
performance requirements (max ~100 cps, ~2000 active calls).

-justin

-----Original Message-----
From: resiprocate-devel-bounces@xxxxxxxxxxxxxxx
[mailto:resiprocate-devel-bounces@xxxxxxxxxxxxxxx] On Behalf Of Dan Weber
Sent: Monday, January 31, 2011 12:28 PM
To: resiprocate-devel@xxxxxxxxxxxxxxx
Subject: Re: [reSIProcate] The Million User Dilemma

Hi there,


I personally believe that the reactor design is the most scalable approach,
especially when combined with the likes of epoll.  It can be made concurrent
very easily in the case of the stack inside resiprocate: simply use
hash-based concurrency at the transaction layer, distributing work across
threads based on transaction IDs.  This requires no locks.
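
As a rough, hypothetical sketch of what I mean (the dispatcher below is a
placeholder I made up, not an existing resip class), the thread selection
could be as simple as:

#include <cstddef>
#include <functional>
#include <string>

// Hypothetical sketch: pin each SIP transaction to one worker thread by
// hashing its transaction id (e.g. the Via branch parameter).  Every message
// belonging to a transaction then lands on the same thread, so transaction
// state never needs a lock.
struct TransactionDispatcher
{
    explicit TransactionDispatcher(std::size_t numWorkers)
        : mNumWorkers(numWorkers) {}

    // Index of the worker thread that should process this transaction.
    std::size_t workerFor(const std::string& transactionId) const
    {
        return std::hash<std::string>{}(transactionId) % mNumWorkers;
    }

    std::size_t mNumWorkers;  // each worker would own its own queue and loop
};

Only the hand-off queues between the transport thread(s) and the workers
would need to be thread-safe; the transaction objects themselves would not.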

As for making DUM more parallel, I've been working with someone who is an
expert in Software Transactional Memory (the author of TBoost.STM) to help
build data structures that support the needs of DUM and to build an
in-memory database.  Along with hash-based concurrency, this could greatly
improve the performance of DUM.


Likewise, let's not forget that the two largest costs in resiprocate right
now are still memory consumption (with fragmentation) and serializing
messages to streams.  I believe we can greatly improve performance in those
areas using something like boost::spirit::karma together with some enhanced
data structures and a per-SIP-message allocator.
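
To make the per-message allocator part concrete, here is a minimal,
hypothetical bump-allocator sketch (nothing like this exists in resip today;
the names and sizes are mine).  The idea is that every string and header
copied while parsing or generating one SipMessage comes out of a single
block that is released in one shot with the message:

#include <cstddef>
#include <new>
#include <vector>

// Hypothetical per-message arena: allocations are a pointer bump, and the
// whole arena is freed at once when the message goes away, which sidesteps
// per-field heap traffic and the fragmentation it causes.
// (Alignment handling is omitted for brevity.)
class MessageArena
{
public:
    explicit MessageArena(std::size_t capacity = 4096)
        : mBuffer(capacity), mUsed(0) {}

    void* allocate(std::size_t bytes)
    {
        if (mUsed + bytes > mBuffer.size())
            throw std::bad_alloc();  // a real version would chain new blocks
        void* p = mBuffer.data() + mUsed;
        mUsed += bytes;
        return p;
    }

    void reset() { mUsed = 0; }  // everything freed in one shot

private:
    std::vector<char> mBuffer;
    std::size_t mUsed;
};

Combined with karma for the actual digit/token generation, something like
this could attack both costs at once.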

Dan


On 01/29/2011 09:52 PM, Francis Joanis wrote:
> Hi guys,
> 
> I've been using resip + dum for a while, but since I was focusing more 
> on building UAs (and not proxies, ...) with it, I've had no performance 
> issues so far. In fact, I found that it actually performed better 
> than some other SIP application layers, especially when handling 
> multiple SIP events at the same time.
> 
> The reason it performed better is that the DUM (application 
> level) generally uses a single thread for all SIP events, rather than 
> using one thread per event or event type (imagine what one thread per 
> session would do... :( ).
> 
> I see the current resip threading code as being like the reactor 
> design pattern, where only a single thread is used to "select" then 
> synchronously process events. From my experience, one main advantage 
> of this approach is that the stack's general behaviour is "predictable"
> with regards to its performance and the flow of events (i.e. 
> processing a single call VS processing 100+ incoming calls).
> 
> However, one downside of the reactor is that it doesn't scale well on 
> multicore CPUs since it only has a single thread. To really leverage 
> multicore, programs need to become more and more concurrent (that is 
> truly concurrent - i.e. without mutexes and locking) in order to get 
> faster. This is probably nothing new for most of us, but it is 
> something that I've been realizing practically more and more since 
> I've been exposed to concurrent languages like Erlang.
> 
> I think that investing time into making resip (at least the stack and 
> DUM parts) multicore aware would be a great way to future-proof it.
> 
> To add to what was already said in this thread:
> 
> - It does make a lot of sense to leverage libevent or asio or ... to 
> ensure the best performance on all platforms. This is a long term goal, 
> but maybe we could start some prep work now (like decoupling stuff and 
> laying down foundations). The alternative would be to try to implement 
> the best select() substitute for each supported platform, but we 
> might then end up rewriting libevent ourselves.
> - Regarding the reactor design pattern, there is also the proactor one 
> which uses (unless I'm mistaken) OS-level asynchronous IO (resip 
> currently uses synchronous IO). The idea is that the resip transport 
> thread would be able to service multiple IO operations at the same 
> time through the kernel. I think this is similar to what Kennard 
> mentioned as a post-notified system and this would not be an easy change.
> - Adding more threads where it makes sense (like one per transport or
> ...) might not be good enough if those threads still rely on 
> locks to communicate with each other. I've done a bit of googling 
> about lock-free data structures and it is quite interesting. I might 
> try it one day to see how much faster it could get just between the 
> stack and the DUM.
> - It also makes sense to look into code profiling and ensuring 
> that the code is not "wasting cycles".
> 
> Anyway, I think this is a great idea and I would be happy to help :)
> 
> Regards,
> Francis
> 

_______________________________________________
resiprocate-devel mailing list
resiprocate-devel@xxxxxxxxxxxxxxx
https://list.resiprocate.org/mailman/listinfo/resiprocate-devel