[reSIProcate] NULL pointer crash in DialogSet::dispatch withmAppDialogSet
Byron Campen
bcampen at estacado.net
Mon Oct 1 17:30:44 CDT 2007
No comments? So, I take this to mean that nobody would be sore at me
if I were to remove every use of AppDialogSet::reuse() that looks
suspicious to me (ie, all of them)? Pipe up!
Best regards,
Byron Campen
> So, I've given this a look, and I have a theory. Aron said that
> the crash happened sometime after the onRequestRetry() callback
> returned 0...
>
> From ClientSubscription.cxx, starting at line 165
> *snip*
> if (msg.header(h_StatusLine).statusCode() == 408)
> {
> InfoLog (<< "Received 408 to SUBSCRIBE "
> << mLastRequest->header(h_To));
> retry = handler->onRequestRetry(getHandle(), 0, msg);
> }
> else
> {
> InfoLog (<< "Received non-408 retriable to SUBSCRIBE "
> << mLastRequest->header(h_To));
> retry = handler->onRequestRetry(getHandle(), msg.header
> (h_RetryAfter).value(), msg);
> }
>
> if (retry < 0)
> {
> DebugLog(<< "Application requested failure on Retry-After");
> mEnded = true;
> handler->onTerminated(getHandle(), msg);
> delete this;
> return;
> }
> else if (retry == 0)
> {
> DebugLog(<< "Application requested immediate retry on
> Retry-After");
> //!dcm! -- why isn't this just a refresh--retry after
> might be
> //middle element and not indicate dialog destruction
> if (mDialog.mRemoteTarget.uri().host().empty())
> {
> SharedPtr<SipMessage> sub = mDum.makeSubscription
> (mLastRequest->header(h_To), getEventType());
> mDum.send(sub);
> }
> else
> {
> SharedPtr<SipMessage> sub = mDum.makeSubscription
> (mDialog.mRemoteTarget, getEventType(), getAppDialogSet()->reuse());
> mDum.send(sub);
> }
> //!dcm! -- new sub created above, when does this usage get
> destroyed?
> //return;
> }
> else
> {
> // leave the usage around until the timeout
> // !dlb! would be nice to set the state to something dead,
> but not used
> mDum.addTimer(DumTimeout::SubscriptionRetry,
> retry,
> getBaseHandle(),
> ++mTimerSeq);
> // leave the usage around until the timeout
> return;
> }
>
> delete this;
> return;
>
> *snip*
>
> So, returning 0 on the onRequestRetry() callback causes the
> AppDialogSet* to be unset (See impl of AppDialogSet::reuse()), and
> the ClientSubscription to be deleted. Then, in ~DialogUsage() (from
> which ClientSubscription is derived)...
>
> *snip*
> DialogUsage::~DialogUsage()
> {
> mDialog.possiblyDie();
> }
> *snip*
>
> When the Dialog gets this call, it calls DialogUsageManager::destroy
> () on itself, like so:
>
> *snip*
> void Dialog::possiblyDie()
> {
> // !slg! Note: dialogs should really stick around for 32s, in
> order to ensure that all 2xx retransmissions get Ack'd, then BYE'd
> correctly
> if (!mDestroying)
> {
> if (mClientSubscriptions.empty() &&
> mServerSubscriptions.empty() &&
> !mInviteSession)
> {
> mDestroying = true;
> mDum.destroy(this);
> }
> }
> }
> *snip*
>
> Which in turn causes the following to happen:
>
> *snip*
> void
> DialogUsageManager::destroy(Dialog* d)
> {
> if (mShutdownState != Destroying)
> {
> post(new DestroyUsage(d));
> }
> else
> {
> InfoLog(<< "DialogUsageManager::destroy() not posting to
> stack");
> }
> }
> *snip*
>
> So, we asynchronously schedule the destruction of the Dialog. The
> DialogSet is still around. If some messaging comes in, and beats
> the DestroyUsage message to the front of the queue (not an unlikely
> scenario at all, and made much more likely by sitting at a
> breakpoint), the messaging will be sent to the DialogSet, which
> will then core in exactly the manner you're seeing.
>
> Ultimately, if the AppDialogSet is reused, we have to be
> completely sure that the DialogSet will be gone before any further
> messaging can arrive.
>
> On top of this, we have the issue of what happens when we have
> more than one Dialog in this DialogSet (say, the SUBSCRIBE got
> forked). If one of the forks decides it needs to start a new
> subscription in the manner above, the other subscriptions in the
> DialogSet will find themselves in trouble very shortly. This isn't
> even a race. It's basically guaranteed to fail.
>
> My inclination is to turn AppDialogSet::reuse() into a clone(), or
> into a call to the AppDialogSetFactory. (Or, we need to just be
> tolerant if the AppDialogSet is null. But I imagine we have opened
> a rather large can or worms here, and there are probably other
> nasty race-conditions that we'll encounter as a result of the non-
> atomicity of DialogSet teardown.)
>
> Best regards,
> Byron Campen
>
>> It’s possible you’ve exposed a race condition by using the
>> debugger – I can’t think of what it would be offhand though. :
>> ( The Debug log should provide more insight.
>>
>>
>>
>> Scott
>>
>>
>>
>> From: Aron Rosenberg [mailto:arosenberg at sightspeed.com]
>> Sent: Friday, July 27, 2007 2:48 PM
>> To: Scott Godin; resiprocate-devel at list.resiprocate.org
>> Subject: RE: [reSIProcate] NULL pointer crash in
>> DialogSet::dispatch withmAppDialogSet
>>
>>
>>
>> One thing that might be going on is that when I sit in a
>> breakpoint, a 408 transaction is created by the resip stack
>> because of the “timeout”. Could this because the reason for the
>> crash, onRemoved gets called, but a 408 is still active…
>>
>>
>>
>> -Aron
>>
>>
>>
>> From: Scott Godin [mailto:slgodin at icescape.com]
>> Sent: Friday, July 27, 2007 10:24 AM
>> To: Aron Rosenberg; resiprocate-devel at list.resiprocate.org
>> Subject: RE: [reSIProcate] NULL pointer crash in
>> DialogSet::dispatch withmAppDialogSet
>>
>>
>>
>> Are you calling all DUM API’s from the same thread (including the
>> process loop)?
>>
>>
>>
>> From: Aron Rosenberg [mailto:arosenberg at sightspeed.com]
>> Sent: Friday, July 27, 2007 12:39 PM
>> To: Scott Godin; resiprocate-devel at list.resiprocate.org
>> Subject: RE: [reSIProcate] NULL pointer crash in
>> DialogSet::dispatch withmAppDialogSet
>>
>>
>>
>> DUM controls everything. This crash only happens if I have hit a
>> breakpoint and let it sit for a couple seconds before continuing.
>> For example, If I have a breakpoint in the onTerminated or
>> onRequestRetry for ClientSubscriptionHandler implementation. It
>> will crash after the continue. This never happens except in debug
>> mode and only if breakpoints were used on the resiprocate thread.
>>
>>
>>
>> -Aron
>>
>>
>>
>> From: Scott Godin [mailto:slgodin at icescape.com]
>> Sent: Friday, July 27, 2007 6:48 AM
>> To: Aron Rosenberg; resiprocate-devel at list.resiprocate.org
>> Subject: RE: [reSIProcate] NULL pointer crash in
>> DialogSet::dispatch withmAppDialogSet
>>
>>
>>
>> If the DialogSet is still around the AppDialogSet should also be
>> around. Are you deleting your AppDialogSet instances or letting
>> DUM control their lifetime?
>>
>>
>>
>> Scott
>>
>>
>>
>> From: resiprocate-devel-bounces at list.resiprocate.org
>> [mailto:resiprocate-devel-bounces at list.resiprocate.org] On Behalf
>> Of Aron Rosenberg
>> Sent: Thursday, July 26, 2007 12:20 PM
>> To: resiprocate-devel at list.resiprocate.org
>> Subject: [reSIProcate] NULL pointer crash in DialogSet::dispatch
>> withmAppDialogSet
>>
>>
>>
>> So I have been debugging some resip issues with
>> ClientSubscriptions and keep getting a NULL pointer crash after I
>> have been sitting at some breakpoints in the code. The crash
>> occurs in:
>>
>>
>>
>> DialogSet.cxx line 775
>>
>> DebugLog ( << "### Calling CreateAppDialog ###: " <<
>> std::endl << std::endl <<msg);
>>
>> ---> AppDialog* appDialog = mAppDialogSet->createAppDialog(msg);
>>
>> dialog->mAppDialog = appDialog;
>>
>> appDialog->mDialog = dialog;
>>
>> dialog->dispatch(msg);
>>
>>
>>
>> mAppDialogSet is NULL. The call stack for this crash is as follows:
>>
>>
>>
>> > resip::DialogSet::dispatch(const resip::SipMessage &
>> msg={...}) Line 775 + 0x1c C++
>>
>> resip::DialogUsageManager::processRequest(const
>> resip::SipMessage & request={...}) Line 1560 C++
>>
>> resip::DialogUsageManager::incomingProcess
>> (std::auto_ptr<resip::Message> msg={...}) Line 1281 C++
>>
>> resip::DialogUsageManager::internalProcess
>> (std::auto_ptr<resip::Message> msg={...}) Line 1134 C++
>>
>> resip::DialogUsageManager::process() Line 1300 +
>> 0x49 C++
>>
>> SipEP::run() Line 2404 + 0xb C++ // Is our thread
>> loop
>>
>>
>>
>>
>>
>> mState == Established
>>
>> msg is a NOTIFY
>>
>>
>>
>> The crash happens after I have requested the ->end() of the
>> ClientSubscription and a NOTIFY comes in from the wire and I call
>> rejectUpdate(404) since the user has “ended” the Client
>> Subscription. Then I usually get a call to onRequestRetry where we
>> return 0. The crash generally occurs after that. I will try to get
>> a resip debug log up to the crash.
>>
>>
>>
>> Does this appear to be a resip issue or are we improperly
>> implementing our own AppDialog and AppDialogSet instances.
>>
>>
>>
>> -Aron
>>
>>
>>
>>
>>
>> ---------------------------------------------
>>
>> Aron Rosenberg
>>
>> Founder and CTO
>>
>> SightSpeed - http://www.sightspeed.com/
>>
>>
>>
>> 918 Parker St, Suite A14
>>
>> Berkeley, CA 94710
>>
>>
>>
>> Email: arosenberg at sightspeed.com
>>
>> Phone: 510-665-2920
>>
>> Cell: 510-847-7389
>>
>> Fax: 510-649-9569
>>
>> SightSpeed Video Link: http://aron.sightspeed.com
>>
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> resiprocate-devel mailing list
>> resiprocate-devel at list.resiprocate.org
>> https://list.resiprocate.org/mailman/listinfo/resiprocate-devel
>
> _______________________________________________
> resiprocate-devel mailing list
> resiprocate-devel at resiprocate.org
> https://list.resiprocate.org/mailman/listinfo/resiprocate-devel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://list.resiprocate.org/pipermail/resiprocate-devel/attachments/20071001/a4cc2ddf/attachment.htm>
More information about the resiprocate-devel
mailing list