[reSIProcate] NULL pointer crash in DialogSet::dispatch withmAppDialogSet

Byron Campen bcampen at estacado.net
Mon Oct 1 17:30:44 CDT 2007


	No comments? So, I take this to mean that nobody would be sore at me  
if I were to remove every use of AppDialogSet::reuse() that looks  
suspicious to me (ie, all of them)? Pipe up!

Best regards,
Byron Campen



> 	So, I've given this a look, and I have a theory. Aron said that  
> the crash happened sometime after the onRequestRetry() callback  
> returned 0...
>
> From ClientSubscription.cxx, starting at line 165
> *snip*
>       if (msg.header(h_StatusLine).statusCode() == 408)
>       {
>          InfoLog (<< "Received 408 to SUBSCRIBE "
>                   << mLastRequest->header(h_To));
>          retry = handler->onRequestRetry(getHandle(), 0, msg);
>       }
>       else
>       {
>          InfoLog (<< "Received non-408 retriable to SUBSCRIBE "
>                   << mLastRequest->header(h_To));
>          retry = handler->onRequestRetry(getHandle(), msg.header 
> (h_RetryAfter).value(), msg);
>       }
>
>       if (retry < 0)
>       {
>          DebugLog(<< "Application requested failure on Retry-After");
>          mEnded = true;
>          handler->onTerminated(getHandle(), msg);
>          delete this;
>          return;
>       }
>       else if (retry == 0)
>       {
>          DebugLog(<< "Application requested immediate retry on  
> Retry-After");
>          //!dcm! -- why isn't this just a refresh--retry after  
> might be
>          //middle element and not indicate dialog destruction
>          if (mDialog.mRemoteTarget.uri().host().empty())
>          {
>             SharedPtr<SipMessage> sub = mDum.makeSubscription 
> (mLastRequest->header(h_To), getEventType());
>             mDum.send(sub);
>          }
>          else
>          {
>             SharedPtr<SipMessage> sub = mDum.makeSubscription 
> (mDialog.mRemoteTarget, getEventType(), getAppDialogSet()->reuse());
>             mDum.send(sub);
>          }
>          //!dcm! -- new sub created above, when does this usage get  
> destroyed?
>          //return;
>       }
>       else
>       {
>          // leave the usage around until the timeout
>          // !dlb! would be nice to set the state to something dead,  
> but not used
>          mDum.addTimer(DumTimeout::SubscriptionRetry,
>                        retry,
>                        getBaseHandle(),
>                        ++mTimerSeq);
>          // leave the usage around until the timeout
>          return;
>       }
>
>       delete this;
>       return;
>
> *snip*
>
> 	So, returning 0 on the onRequestRetry() callback causes the  
> AppDialogSet* to be unset (See impl of AppDialogSet::reuse()), and  
> the ClientSubscription to be deleted. Then, in ~DialogUsage() (from  
> which ClientSubscription is derived)...
>
> *snip*
> DialogUsage::~DialogUsage()
> {
>    mDialog.possiblyDie();
> }
> *snip*
>
> When the Dialog gets this call, it calls DialogUsageManager::destroy 
> () on itself, like so:
>
> *snip*
> void Dialog::possiblyDie()
> {
>    // !slg! Note:  dialogs should really stick around for 32s, in  
> order to ensure that all 2xx retransmissions get Ack'd, then BYE'd  
> correctly
>    if (!mDestroying)
>    {
>       if (mClientSubscriptions.empty() &&
>           mServerSubscriptions.empty() &&
>           !mInviteSession)
>       {
>          mDestroying = true;
>          mDum.destroy(this);
>       }
>    }
> }
> *snip*
>
> Which in turn causes the following to happen:
>
> *snip*
> void
> DialogUsageManager::destroy(Dialog* d)
> {
>    if (mShutdownState != Destroying)
>    {
>       post(new DestroyUsage(d));
>    }
>    else
>    {
>       InfoLog(<< "DialogUsageManager::destroy() not posting to  
> stack");
>    }
> }
> *snip*
>
> 	So, we asynchronously schedule the destruction of the Dialog. The  
> DialogSet is still around. If some messaging comes in, and beats  
> the DestroyUsage message to the front of the queue (not an unlikely  
> scenario at all, and made much more likely by sitting at a  
> breakpoint), the messaging will be sent to the DialogSet, which  
> will then core in exactly the manner you're seeing.
>
> 	Ultimately, if the AppDialogSet is reused, we have to be  
> completely sure that the DialogSet will be gone before any further  
> messaging can arrive.
>
> 	On top of this, we have the issue of what happens when we have  
> more than one Dialog in this DialogSet (say, the SUBSCRIBE got  
> forked). If one of the forks decides it needs to start a new  
> subscription in the manner above, the other subscriptions in the  
> DialogSet will find themselves in trouble very shortly. This isn't  
> even a race. It's basically guaranteed to fail.
>
> 	My inclination is to turn AppDialogSet::reuse() into a clone(), or  
> into a call to the AppDialogSetFactory. (Or, we need to just be  
> tolerant if the AppDialogSet is null. But I imagine we have opened  
> a rather large can or worms here, and there are probably other  
> nasty race-conditions that we'll encounter as a result of the non- 
> atomicity of DialogSet teardown.)
>
> Best regards,
> Byron Campen
>
>> It’s possible you’ve exposed a race condition by using the  
>> debugger – I can’t think of what it would be offhand though. :  
>> (  The Debug log should provide more insight.
>>
>>
>>
>> Scott
>>
>>
>>
>> From: Aron Rosenberg [mailto:arosenberg at sightspeed.com]
>> Sent: Friday, July 27, 2007 2:48 PM
>> To: Scott Godin; resiprocate-devel at list.resiprocate.org
>> Subject: RE: [reSIProcate] NULL pointer crash in  
>> DialogSet::dispatch withmAppDialogSet
>>
>>
>>
>> One thing that might be going on is that when I sit in a  
>> breakpoint, a 408 transaction is created by the resip stack  
>> because of the “timeout”. Could this because the reason for the  
>> crash, onRemoved gets called, but a 408 is still active…
>>
>>
>>
>> -Aron
>>
>>
>>
>> From: Scott Godin [mailto:slgodin at icescape.com]
>> Sent: Friday, July 27, 2007 10:24 AM
>> To: Aron Rosenberg; resiprocate-devel at list.resiprocate.org
>> Subject: RE: [reSIProcate] NULL pointer crash in  
>> DialogSet::dispatch withmAppDialogSet
>>
>>
>>
>> Are you calling all DUM API’s from the same thread (including the  
>> process loop)?
>>
>>
>>
>> From: Aron Rosenberg [mailto:arosenberg at sightspeed.com]
>> Sent: Friday, July 27, 2007 12:39 PM
>> To: Scott Godin; resiprocate-devel at list.resiprocate.org
>> Subject: RE: [reSIProcate] NULL pointer crash in  
>> DialogSet::dispatch withmAppDialogSet
>>
>>
>>
>> DUM controls everything. This crash only happens if I have hit a  
>> breakpoint and let it sit for a couple seconds before continuing.  
>> For example, If I have a breakpoint in the onTerminated or  
>> onRequestRetry for ClientSubscriptionHandler implementation. It  
>> will crash after the continue. This never happens except in debug  
>> mode and only if breakpoints were used on the resiprocate thread.
>>
>>
>>
>> -Aron
>>
>>
>>
>> From: Scott Godin [mailto:slgodin at icescape.com]
>> Sent: Friday, July 27, 2007 6:48 AM
>> To: Aron Rosenberg; resiprocate-devel at list.resiprocate.org
>> Subject: RE: [reSIProcate] NULL pointer crash in  
>> DialogSet::dispatch withmAppDialogSet
>>
>>
>>
>> If the DialogSet is still around the AppDialogSet should also be  
>> around.  Are you deleting your AppDialogSet instances or letting  
>> DUM control their lifetime?
>>
>>
>>
>> Scott
>>
>>
>>
>> From: resiprocate-devel-bounces at list.resiprocate.org  
>> [mailto:resiprocate-devel-bounces at list.resiprocate.org] On Behalf  
>> Of Aron Rosenberg
>> Sent: Thursday, July 26, 2007 12:20 PM
>> To: resiprocate-devel at list.resiprocate.org
>> Subject: [reSIProcate] NULL pointer crash in DialogSet::dispatch  
>> withmAppDialogSet
>>
>>
>>
>> So I have been debugging some resip issues with  
>> ClientSubscriptions and keep getting a NULL pointer crash after I  
>> have been sitting at some breakpoints in the code. The crash  
>> occurs in:
>>
>>
>>
>> DialogSet.cxx line 775
>>
>>       DebugLog ( << "### Calling CreateAppDialog ###: " <<  
>> std::endl << std::endl <<msg);
>>
>> --->  AppDialog* appDialog = mAppDialogSet->createAppDialog(msg);
>>
>>       dialog->mAppDialog = appDialog;
>>
>>       appDialog->mDialog = dialog;
>>
>>       dialog->dispatch(msg);
>>
>>
>>
>> mAppDialogSet is NULL. The call stack for this crash is as follows:
>>
>>
>>
>> >          resip::DialogSet::dispatch(const resip::SipMessage &  
>> msg={...})  Line 775 + 0x1c  C++
>>
>>             resip::DialogUsageManager::processRequest(const  
>> resip::SipMessage & request={...})  Line 1560   C++
>>
>>             resip::DialogUsageManager::incomingProcess 
>> (std::auto_ptr<resip::Message> msg={...})  Line 1281 C++
>>
>>             resip::DialogUsageManager::internalProcess 
>> (std::auto_ptr<resip::Message> msg={...})  Line 1134   C++
>>
>>             resip::DialogUsageManager::process()  Line 1300 +  
>> 0x49  C++
>>
>>             SipEP::run()  Line 2404 + 0xb     C++ // Is our thread  
>> loop
>>
>>
>>
>>
>>
>> mState == Established
>>
>> msg is a NOTIFY
>>
>>
>>
>> The crash happens after I have requested the ->end() of the  
>> ClientSubscription and a NOTIFY comes in from the wire and I call  
>> rejectUpdate(404) since the user has “ended” the Client  
>> Subscription. Then I usually get a call to onRequestRetry where we  
>> return 0. The crash generally occurs after that. I will try to get  
>> a resip debug log up to the crash.
>>
>>
>>
>> Does this appear to be a resip issue or are we improperly  
>> implementing our own AppDialog and AppDialogSet instances.
>>
>>
>>
>> -Aron
>>
>>
>>
>>
>>
>> ---------------------------------------------
>>
>> Aron Rosenberg
>>
>> Founder and CTO
>>
>> SightSpeed - http://www.sightspeed.com/
>>
>>
>>
>> 918 Parker St, Suite A14
>>
>> Berkeley, CA 94710
>>
>>
>>
>> Email: arosenberg at sightspeed.com
>>
>> Phone: 510-665-2920
>>
>> Cell: 510-847-7389
>>
>> Fax: 510-649-9569
>>
>> SightSpeed Video Link: http://aron.sightspeed.com
>>
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> resiprocate-devel mailing list
>> resiprocate-devel at list.resiprocate.org
>> https://list.resiprocate.org/mailman/listinfo/resiprocate-devel
>
> _______________________________________________
> resiprocate-devel mailing list
> resiprocate-devel at resiprocate.org
> https://list.resiprocate.org/mailman/listinfo/resiprocate-devel

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://list.resiprocate.org/pipermail/resiprocate-devel/attachments/20071001/a4cc2ddf/attachment.htm>


More information about the resiprocate-devel mailing list