
Re: [reSIProcate] NULL Pointer crash with resip 1.3.3


Hi Aron,

 

I made an additional fix for 4xx-level retry errors that happen mid-dialog. The code used to create a new dialog (i.e., makeNewSubscription), which doesn't seem right to me, so I modified the code to just call requestRefresh instead. New complete diff/patch attached.
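A minimal sketch of the retry choice described above, assuming a simplified model rather than the real ClientSubscription class. `SubscriptionSketch`, `retryAfterFailure`, and the counters are hypothetical names used only for illustration; the counters stand in for the requestRefresh and makeNewSubscription calls mentioned in the message.

```cpp
#include <cassert>

// Illustrative model only; this is not the reSIProcate API.
struct SubscriptionSketch {
    bool dialogEstablished = false;  // true once the subscription dialog exists
    int refreshesSent = 0;           // stands in for calls to requestRefresh
    int newSubscriptionsSent = 0;    // stands in for calls to makeNewSubscription

    // Retry a failed SUBSCRIBE: stay inside the existing dialog when there
    // is one, and only start a new subscription when no dialog exists yet.
    void retryAfterFailure() {
        if (dialogEstablished) {
            ++refreshesSent;
        } else {
            ++newSubscriptionsSent;
        }
    }
};

// Usage: a mid-dialog failure produces a refresh, not a new dialog.
void demoRetryChoice() {
    SubscriptionSketch midDialog;
    midDialog.dialogEstablished = true;
    midDialog.retryAfterFailure();
    assert(midDialog.refreshesSent == 1);
    assert(midDialog.newSubscriptionsSent == 0);
}
```

The point of the sketch is only the branch: a retry triggered mid-dialog reuses the dialog, so the dialog identifiers stay stable across the retry.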

 

If I don't hear from anyone on why mRemoteTarget (i.e., the remote Contact header) was used on retries, I will commit this fix, which also removes that usage.

 

To all: please review this patch and provide feedback.

 

Scott

 

From: Aron Rosenberg [mailto:arosenberg@xxxxxxxxxxxxxx]
Sent: Thursday, July 24, 2008 1:27 PM
To: Scott Godin; resiprocate-devel
Subject: RE: [reSIProcate] NULL Pointer crash with resip 1.3.3

 

Scott,

 

I tested your patch combined with my patch and I no longer get the crash or the issue with empty to/from values. Attached is the combined patch against ClientSubscription.cxx

 

-Aron

 

From: Scott Godin [mailto:slgodin@xxxxxxxxxxxx]
Sent: Thursday, July 24, 2008 7:56 AM
To: Aron Rosenberg; resiprocate-devel
Subject: RE: [reSIProcate] NULL Pointer crash with resip 1.3.3

 

In this case onNewSubscription was never called, so there shouldn't be an onTerminated callback.

 

Scott

 

From: Aron Rosenberg [mailto:arosenberg@xxxxxxxxxxxxxx]
Sent: Wednesday, July 23, 2008 9:23 PM
To: Scott Godin; resiprocate-devel
Subject: RE: [reSIProcate] NULL Pointer crash with resip 1.3.3

 

I was testing and came across a potential pathway which might need an onTerminated callback, but I am not sure:

 

1.       ClientSubscription is created

2.       onRequestRetry is called due to initial/local 408/503/etc and >0 is returned.

3.       DUM timer is added for DumTimeout::SubscriptionRetry

4.       ClientSubscription::dispatch(const DumTimeout& timer) gets called and mOnNewSubscriptionCalled is false

5.       Do we need to add an onTerminated callback, since it appears this pathway doesn't fire one?
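The pairing rule at stake in this pathway (an onTerminated callback is only expected after a matching onNewSubscription, as stated elsewhere in this thread) can be sketched as a small guard. This is a hypothetical model: `CallbackLog`, `SubscriptionCallbacks`, and both `notify*` methods are invented for illustration and are not part of reSIProcate.

```cpp
#include <cassert>

// Records which application callbacks were fired (illustrative only).
struct CallbackLog {
    int newSubscription = 0;
    int terminated = 0;
};

class SubscriptionCallbacks {
public:
    explicit SubscriptionCallbacks(CallbackLog& log) : mLog(log) {}

    // Models the point where the application first learns of the subscription.
    void notifyNewSubscription() {
        mOnNewSubscriptionCalled = true;
        ++mLog.newSubscription;
    }

    // Fires the terminated callback only if the application was previously
    // told about the subscription. Returns true if the callback fired.
    bool notifyTerminated() {
        if (!mOnNewSubscriptionCalled) {
            return false;  // nothing to pair with, so stay silent
        }
        ++mLog.terminated;
        return true;
    }

private:
    CallbackLog& mLog;
    bool mOnNewSubscriptionCalled = false;
};

// Usage: the retry-timer pathway above never announced the subscription,
// so no terminated callback is delivered.
void demoCallbackPairing() {
    CallbackLog log;
    SubscriptionCallbacks callbacks(log);
    assert(!callbacks.notifyTerminated());
    assert(log.terminated == 0);
}
```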

 

-Aron

 

From: Scott Godin [mailto:slgodin@xxxxxxxxxxxx]
Sent: Wednesday, July 23, 2008 9:06 AM
To: Aron Rosenberg; 'resiprocate-devel'
Subject: RE: [reSIProcate] NULL Pointer crash with resip 1.3.3

 

I'm not sure I fully understand yet why the crash is happening. The logs seem to indicate that the ClientSubscription is handling two 408 responses? For some reason the two 408s have the same TID but a different To tag, which is very strange.

 

SIP: [.\ClientSubscription.cxx:59] ClientSubscription::dispatch SipResp: 408 tid=151dd73cc5746622 cseq=SUBSCRIBE / 2 from(wire)

SIP: [.\ClientSubscription.cxx:114] processing client subscription response

SIP: [.\ClientSubscription.cxx:168] Received 408 to SUBSCRIBE <sip:kristie.lomond@xxxxxxxxxxxxxxxxxx>;tag=10.11374.1216071735.678374

SIP: [.\ClientSubscription.cxx:59] ClientSubscription::dispatch SipResp: 408 tid=151dd73cc5746622 cseq=SUBSCRIBE / 2 from(wire)

SIP: [.\ClientSubscription.cxx:114] processing client subscription response

SIP: [.\ClientSubscription.cxx:168] Received 408 to SUBSCRIBE <sip:kristie.lomond@xxxxxxxxxxxxxxxxxx>;tag=10.11385.1216071758.680732

 

 

But I did find another issue with ClientSubscriptions, and the fix for this may end up stopping the crash as well.

 

The problem is that when ClientSubscription::end is called, an un-subscribe message is sent out. Normally, if this is successful, we wait for a NOTIFY to arrive with Subscription-State terminated before destroying the ClientSubscription. However, if this un-subscribe request fails, the whole AppDialogSet reuse and ClientSubscription retry logic kicks in, and the retry request it generates is not an un-subscribe request but a new subscription. I've attached a patch that just terminates the ClientSubscription if we call end() and then receive an error response to the un-subscribe.
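The decision described above can be sketched as follows. This is a simplified model and not the attached patch: `Outcome`, `SubscriptionModel`, and `onSubscribeError` are hypothetical names. The retry value mirrors what this thread says about onRequestRetry (0 retries immediately, a positive value schedules a timer); treating a negative value as "do not retry" is an assumption of the sketch.

```cpp
#include <cassert>

// Illustrative model only; names are not taken from reSIProcate.
enum class Outcome { Terminated, RetryNow, RetryLater, GaveUp };

struct SubscriptionModel {
    bool endCalled = false;  // set once the application has called end()
};

// appRetryDelay models the value returned by the application's retry
// callback: 0 = retry immediately, >0 = retry after that many seconds,
// <0 = do not retry (assumed here).
Outcome onSubscribeError(const SubscriptionModel& sub, int appRetryDelay) {
    if (sub.endCalled) {
        // The failed request was an un-subscribe. Skip the retry path
        // entirely, since it would send a fresh SUBSCRIBE instead.
        return Outcome::Terminated;
    }
    if (appRetryDelay < 0) {
        return Outcome::GaveUp;
    }
    return appRetryDelay == 0 ? Outcome::RetryNow : Outcome::RetryLater;
}

// Usage: after end(), an error response terminates the subscription even
// if the application asks for an immediate retry.
void demoEndThenError() {
    SubscriptionModel sub;
    sub.endCalled = true;
    assert(onSubscribeError(sub, 0) == Outcome::Terminated);
}
```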

 

It seems that the whole ClientSubscription usage needs a makeover.  : )

 

Scott

 

From: resiprocate-devel-bounces@xxxxxxxxxxxxxxx [mailto:resiprocate-devel-bounces@xxxxxxxxxxxxxxx] On Behalf Of Aron Rosenberg
Sent: July 14, 2008 7:20 PM
To: resiprocate-devel
Subject: Re: [reSIProcate] NULL Pointer crash with resip 1.3.3

 

I was finally able to get a working pcap, resip log and debug crash at the same time. Here is what is going on:

 

1.       Client makes subscription

2.       Client ends the subscription by invoking end() on the handle

3.       This end results in a local 408 error, which calls onRequestRetry

4.       Our code returns 0 from onRequestRetry(ClientSubscriptionHandle) to retry the request immediately, since we want the server to know we ended the sub

5.       "Application requested immediate retry on Retry-After" is printed to log

6.       Crash happens in the else statement in ClientSubscription.cxx:198 when trying to call getAppDialogSet()->reuse().

 

I have a full log (over 100MB of resip data), which I can send to a developer who wants to look at it, along with the matching pcap error file.

 

-Aron

 

 

From: resiprocate-devel-bounces@xxxxxxxxxxxxxxx [mailto:resiprocate-devel-bounces@xxxxxxxxxxxxxxx] On Behalf Of Aron Rosenberg
Sent: Monday, July 14, 2008 2:17 PM
To: resiprocate-devel
Subject: Re: [reSIProcate] NULL Pointer crash with resip 1.3.3

 

Here is a little bit more information gleaned from a pcap trace.

 

The stack seems to be crashing when dealing with a 400 error where the “From:” header looks like this

 

“From: <sip:>;tag=5b461e50”

 

I was able to find the outbound SUBSCRIBE request and it also has an empty From address so something strange is going on in the stack. Still working on getting the resip logs.

 

-Aron

 

From: resiprocate-devel-bounces@xxxxxxxxxxxxxxx [mailto:resiprocate-devel-bounces@xxxxxxxxxxxxxxx] On Behalf Of Aron Rosenberg
Sent: Monday, July 14, 2008 11:50 AM
To: resiprocate-devel
Subject: [reSIProcate] NULL Pointer crash with resip 1.3.3

 

Resip ver: SVN rev 8128 on 1.3 branch

 

Call Stack:

resip::AppDialogSet::getHandle() Line 22 + 0x3 bytes C++
resip::DialogUsage::getAppDialogSet() Line 38 + 0x18 bytes C++
resip::ClientSubscription::processResponse(const resip::SipMessage & msg={...}) Line 198 + 0x12 bytes C++
resip::ClientSubscription::dispatch(const resip::SipMessage & msg={...}) Line 117 C++
resip::Dialog::dispatch(const resip::SipMessage & msg={...}) Line 651 + 0x1a bytes C++
resip::DialogSet::dispatchToAllDialogs(const resip::SipMessage & msg={...}) Line 1028 C++
resip::DialogSet::dispatch(const resip::SipMessage & msg={...}) Line 608 C++
resip::DialogUsageManager::processResponse(const resip::SipMessage & response={...}) Line 1810 C++
resip::DialogUsageManager::incomingProcess(std::auto_ptr<resip::Message> msg=auto_ptr {tu=??? }) Line 1363 C++
resip::DialogUsageManager::internalProcess(std::auto_ptr<resip::Message> msg=auto_ptr {tu=??? }) Line 1190 C++
resip::DialogUsageManager::process(resip::RWMutex * mutex=0x00000000) Line 1390 + 0x49 bytes C++
SipEP::run() Line 3408 + 0xa bytes C++

 

The crash is because the appDialogSet returned in DialogUsage::getAppDialogSet() is NULL.
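To illustrate the failure mode only, here is a minimal sketch of a defensive check around the reuse call. `AppDialogSetModel`, `UsageModel`, and `tryReuse` are hypothetical stand-ins, not reSIProcate classes, and a guard like this would only mask the symptom rather than explain why the pointer is NULL.

```cpp
#include <cassert>

// Illustrative stand-ins for the objects named in the call stack.
struct AppDialogSetModel {
    int reuseCount = 0;
    void reuse() { ++reuseCount; }
};

struct UsageModel {
    AppDialogSetModel* appDialogSet = nullptr;  // may legitimately be null here
    AppDialogSetModel* getAppDialogSet() const { return appDialogSet; }
};

// Calls reuse() only when a dialog set is available.
// Returns true if reuse() was invoked, false if the pointer was null.
bool tryReuse(UsageModel& usage) {
    AppDialogSetModel* dialogSet = usage.getAppDialogSet();
    if (dialogSet == nullptr) {
        return false;  // dereferencing here is what the crash corresponds to
    }
    dialogSet->reuse();
    return true;
}

// Usage: a usage without a dialog set is reported, not dereferenced.
void demoNullGuard() {
    UsageModel orphan;
    assert(!tryReuse(orphan));
}
```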

 

It came from our production client and is reasonably repeatable, so I am working on getting the resip logs that would go with it.

 

-Aron

 

 

---------------------------------------------

Aron Rosenberg

Founder and CTO

SightSpeed - http://www.sightspeed.com/

 

918 Parker St, Suite A14

Berkeley, CA 94710

 

Email: arosenberg@xxxxxxxxxxxxxx

Phone: 510-665-2920

Cell: 510-847-7389

Fax: 510-649-9569

SightSpeed Video Link: http://aron.sightspeed.com

 

 

 

Attachment: sub2.patch