
[reSIProcate] STL stream performance enhancement for SipMessage encoding


After integrating the resip stack into our application, initial profiling showed that we could improve performance significantly by replacing the STL stream classes that are used for encoding SipMessage objects.

 

The code is checked in under branches/b-jmatthewsr-streamperf.  Note that at this time only the default settings for ./configure should be used and repro is currently excluded from the build.

 

The data below shows, on win32, a 30-40% overall CPU reduction when running our application with the STL alternative, and a 5-6x improvement when running the stand-alone encoder test.  Note that although the CPU usage in these runs is relatively low, the benefit of the STL alternative becomes greater under heavier call load.

 

I would really appreciate some feedback from running the stand-alone test apps (STL vs. alternative) on additional Windows hardware as well as on true Linux (not running under a virtual machine), Unix, and OS X machines.  I have binaries for win32 and for Linux (Fedora 5) that I can send on request.  To build the test app in the branch, define or undefine RESIP_USE_STL_STREAMS in rutil/resipfaststreams.h.  The test app is resip/stack/test/testSipMsgEncode.cxx.
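
For readers who have not looked at the branch, here is a minimal sketch of how a compile-time switch of this kind might look.  Only the macro name RESIP_USE_STL_STREAMS comes from the branch; FastEncodeStream, EncodeStream, encodeRequestLine and encodeContentLength are hypothetical names made up for illustration, and this is not the actual contents of rutil/resipfaststreams.h.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Uncomment to select the STL path, mirroring the define/undefine switch.
// #define RESIP_USE_STL_STREAMS

#ifdef RESIP_USE_STL_STREAMS
// STL path: encode through a std::ostringstream.
typedef std::ostringstream EncodeStream;
#else
// Hypothetical lightweight alternative: appends directly to a std::string.
class FastEncodeStream
{
   public:
      explicit FastEncodeStream(std::size_t reserve = 512) { mBuf.reserve(reserve); }

      FastEncodeStream& operator<<(const char* s)
      {
         assert(s != 0); // a null pointer here is a caller bug
         mBuf.append(s);
         return *this;
      }
      FastEncodeStream& operator<<(const std::string& s) { mBuf.append(s); return *this; }
      FastEncodeStream& operator<<(char c) { mBuf.push_back(c); return *this; }
      FastEncodeStream& operator<<(unsigned int v)
      {
         // Write decimal digits in reverse, then append them in order.
         char tmp[16];
         int n = 0;
         do
         {
            tmp[n++] = static_cast<char>('0' + v % 10);
            v /= 10;
         } while (v != 0);
         while (n > 0) mBuf.push_back(tmp[--n]);
         return *this;
      }
      std::string str() const { return mBuf; }

   private:
      std::string mBuf;
};
typedef FastEncodeStream EncodeStream;
#endif

// Encoding code is written once against EncodeStream and compiles either way.
inline std::string encodeRequestLine(const std::string& method, const std::string& uri)
{
   EncodeStream strm;
   strm << method << ' ' << uri << " SIP/2.0\r\n";
   return strm.str();
}

inline std::string encodeContentLength(unsigned int len)
{
   EncodeStream strm;
   strm << "Content-Length: " << len << "\r\n";
   return strm.str();
}
```

With either setting of the macro, encodeRequestLine("INVITE", "sip:bob@example.com") builds the same request line; only the stream type behind EncodeStream changes, which is what lets the same test app be timed both ways.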

 

Thanks,

 

-Justin

 

Running the test app with args:  test.exe -r 500000

Test Scenario 1

Description: Running testSipMsgEncode.cxx in branches/b-jmatthewsr-streamperf/resip/stack/test
Arguments:   -r 500000

Test platform                                             STL Encode (sec)   NO STL Encode (sec)   Factor
Win32 Pentium D 2.8 GHz                                   8.9                1.7                   5.24
Win32 Pentium 4 3.0 GHz                                   9.8                1.5                   6.53
Linux-nodebug Fedora 5 under VMware on Pentium D 2.8 GHz  6.4                5.9                   1.08

Test Scenario 2

Description: B2BUA, test setup running 120 lines (simulator)

Test platform: Win32-Release Pentium D 2.8 GHz @ approx 1.5 calls/sec, 500 total calls
   STL Average CPU %:      3.43
   NO STL Average CPU %:   2.34
   Avg CPU % Change:       31.78%
   Max CPU STL:            44
   Max CPU NO STL:         30

Test platform: Win32-Release Pentium 4 3.0 GHz @ approx .63 calls/sec, 231 total calls
   STL Average CPU %:      3.25
   NO STL Average CPU %:   1.93
   Avg CPU % Change:       40.62%
   Max CPU STL:            30
   Max CPU NO STL:         18
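
The derived columns in both scenarios (Factor and Avg CPU % Change) follow from the measured values by simple arithmetic.  A small sketch, with hypothetical helper names that are not part of the test app:

```cpp
#include <cassert>

// Scenario 1 "Factor": STL encode time divided by non-STL encode time.
inline double speedupFactor(double stlSec, double noStlSec)
{
   assert(noStlSec > 0.0);
   return stlSec / noStlSec;
}

// Scenario 2 "Avg CPU % Change": reduction relative to the STL average.
inline double cpuReductionPercent(double stlCpu, double noStlCpu)
{
   assert(stlCpu > 0.0);
   return (stlCpu - noStlCpu) / stlCpu * 100.0;
}
```

For example, speedupFactor(8.9, 1.7) gives roughly 5.24 and cpuReductionPercent(3.43, 2.34) gives roughly 31.78, matching the Pentium D rows.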
