[reSIProcate] NULL pointer crash in DialogSet::dispatch withmAppDialogSet
Scott Godin
slgodin at icescape.com
Fri Oct 5 16:30:54 CDT 2007
Sorry for the late response....
I think the fix you made is a good bandaid for now. You are right we
have a real issue with AppDialogSet::reuse() and DialogSets that have
multiple Dialogs. We should look at a more permanent solution in a
future release. Currently Reuse() is only used for ClientSubscriptions
- I think we should revisit if it is even required.
Scott
From: Byron Campen [mailto:bcampen at estacado.net]
Sent: Wednesday, September 26, 2007 5:04 PM
To: Scott Godin
Cc: Aron Rosenberg; resiprocate-devel at list.resiprocate.org
Subject: Re: [reSIProcate] NULL pointer crash in DialogSet::dispatch
withmAppDialogSet
So, I've given this a look, and I have a theory. Aron said
that the crash happened sometime after the onRequestRetry() callback
returned 0...
>From ClientSubscription.cxx, starting at line 165
*snip*
if (msg.header(h_StatusLine).statusCode() == 408)
{
InfoLog (<< "Received 408 to SUBSCRIBE "
<< mLastRequest->header(h_To));
retry = handler->onRequestRetry(getHandle(), 0, msg);
}
else
{
InfoLog (<< "Received non-408 retriable to SUBSCRIBE "
<< mLastRequest->header(h_To));
retry = handler->onRequestRetry(getHandle(),
msg.header(h_RetryAfter).value(), msg);
}
if (retry < 0)
{
DebugLog(<< "Application requested failure on Retry-After");
mEnded = true;
handler->onTerminated(getHandle(), msg);
delete this;
return;
}
else if (retry == 0)
{
DebugLog(<< "Application requested immediate retry on
Retry-After");
//!dcm! -- why isn't this just a refresh--retry after might be
//middle element and not indicate dialog destruction
if (mDialog.mRemoteTarget.uri().host().empty())
{
SharedPtr<SipMessage> sub =
mDum.makeSubscription(mLastRequest->header(h_To), getEventType());
mDum.send(sub);
}
else
{
SharedPtr<SipMessage> sub =
mDum.makeSubscription(mDialog.mRemoteTarget, getEventType(),
getAppDialogSet()->reuse());
mDum.send(sub);
}
//!dcm! -- new sub created above, when does this usage get
destroyed?
//return;
}
else
{
// leave the usage around until the timeout
// !dlb! would be nice to set the state to something dead, but
not used
mDum.addTimer(DumTimeout::SubscriptionRetry,
retry,
getBaseHandle(),
++mTimerSeq);
// leave the usage around until the timeout
return;
}
delete this;
return;
*snip*
So, returning 0 on the onRequestRetry() callback causes the
AppDialogSet* to be unset (See impl of AppDialogSet::reuse()), and the
ClientSubscription to be deleted. Then, in ~DialogUsage() (from which
ClientSubscription is derived)...
*snip*
DialogUsage::~DialogUsage()
{
mDialog.possiblyDie();
}
*snip*
When the Dialog gets this call, it calls DialogUsageManager::destroy()
on itself, like so:
*snip*
void Dialog::possiblyDie()
{
// !slg! Note: dialogs should really stick around for 32s, in order
to ensure that all 2xx retransmissions get Ack'd, then BYE'd correctly
if (!mDestroying)
{
if (mClientSubscriptions.empty() &&
mServerSubscriptions.empty() &&
!mInviteSession)
{
mDestroying = true;
mDum.destroy(this);
}
}
}
*snip*
Which in turn causes the following to happen:
*snip*
void
DialogUsageManager::destroy(Dialog* d)
{
if (mShutdownState != Destroying)
{
post(new DestroyUsage(d));
}
else
{
InfoLog(<< "DialogUsageManager::destroy() not posting to stack");
}
}
*snip*
So, we asynchronously schedule the destruction of the Dialog. The
DialogSet is still around. If some messaging comes in, and beats the
DestroyUsage message to the front of the queue (not an unlikely scenario
at all, and made much more likely by sitting at a breakpoint), the
messaging will be sent to the DialogSet, which will then core in exactly
the manner you're seeing.
Ultimately, if the AppDialogSet is reused, we have to be completely
sure that the DialogSet will be gone before any further messaging can
arrive.
On top of this, we have the issue of what happens when we have more
than one Dialog in this DialogSet (say, the SUBSCRIBE got forked). If
one of the forks decides it needs to start a new subscription in the
manner above, the other subscriptions in the DialogSet will find
themselves in trouble very shortly. This isn't even a race. It's
basically guaranteed to fail.
My inclination is to turn AppDialogSet::reuse() into a clone(), or
into a call to the AppDialogSetFactory. (Or, we need to just be tolerant
if the AppDialogSet is null. But I imagine we have opened a rather large
can or worms here, and there are probably other nasty race-conditions
that we'll encounter as a result of the non-atomicity of DialogSet
teardown.)
Best regards,
Byron Campen
It's possible you've exposed a race condition by using the debugger - I
can't think of what it would be offhand though. : ( The Debug log
should provide more insight.
Scott
From: Aron Rosenberg [mailto:arosenberg at sightspeed.com]
Sent: Friday, July 27, 2007 2:48 PM
To: Scott Godin; resiprocate-devel at list.resiprocate.org
Subject: RE: [reSIProcate] NULL pointer crash in DialogSet::dispatch
withmAppDialogSet
One thing that might be going on is that when I sit in a breakpoint, a
408 transaction is created by the resip stack because of the "timeout".
Could this because the reason for the crash, onRemoved gets called, but
a 408 is still active...
-Aron
________________________________
From: Scott Godin [mailto:slgodin at icescape.com]
Sent: Friday, July 27, 2007 10:24 AM
To: Aron Rosenberg; resiprocate-devel at list.resiprocate.org
Subject: RE: [reSIProcate] NULL pointer crash in DialogSet::dispatch
withmAppDialogSet
Are you calling all DUM API's from the same thread (including the
process loop)?
From: Aron Rosenberg [mailto:arosenberg at sightspeed.com]
Sent: Friday, July 27, 2007 12:39 PM
To: Scott Godin; resiprocate-devel at list.resiprocate.org
Subject: RE: [reSIProcate] NULL pointer crash in DialogSet::dispatch
withmAppDialogSet
DUM controls everything. This crash only happens if I have hit a
breakpoint and let it sit for a couple seconds before continuing. For
example, If I have a breakpoint in the onTerminated or onRequestRetry
for ClientSubscriptionHandler implementation. It will crash after the
continue. This never happens except in debug mode and only if
breakpoints were used on the resiprocate thread.
-Aron
________________________________
From: Scott Godin [mailto:slgodin at icescape.com]
Sent: Friday, July 27, 2007 6:48 AM
To: Aron Rosenberg; resiprocate-devel at list.resiprocate.org
Subject: RE: [reSIProcate] NULL pointer crash in DialogSet::dispatch
withmAppDialogSet
If the DialogSet is still around the AppDialogSet should also be around.
Are you deleting your AppDialogSet instances or letting DUM control
their lifetime?
Scott
From: resiprocate-devel-bounces at list.resiprocate.org
[mailto:resiprocate-devel-bounces at list.resiprocate.org] On Behalf Of
Aron Rosenberg
Sent: Thursday, July 26, 2007 12:20 PM
To: resiprocate-devel at list.resiprocate.org
Subject: [reSIProcate] NULL pointer crash in DialogSet::dispatch
withmAppDialogSet
So I have been debugging some resip issues with ClientSubscriptions and
keep getting a NULL pointer crash after I have been sitting at some
breakpoints in the code. The crash occurs in:
DialogSet.cxx line 775
DebugLog ( << "### Calling CreateAppDialog ###: " << std::endl <<
std::endl <<msg);
---> AppDialog* appDialog = mAppDialogSet->createAppDialog(msg);
dialog->mAppDialog = appDialog;
appDialog->mDialog = dialog;
dialog->dispatch(msg);
mAppDialogSet is NULL. The call stack for this crash is as follows:
> resip::DialogSet::dispatch(const resip::SipMessage &
msg={...}) Line 775 + 0x1c C++
resip::DialogUsageManager::processRequest(const
resip::SipMessage & request={...}) Line 1560 C++
resip::DialogUsageManager::incomingProcess(std::auto_ptr<resip::Message>
msg={...}) Line 1281 C++
resip::DialogUsageManager::internalProcess(std::auto_ptr<resip::Message>
msg={...}) Line 1134 C++
resip::DialogUsageManager::process() Line 1300 + 0x49 C++
SipEP::run() Line 2404 + 0xb C++ // Is our thread loop
mState == Established
msg is a NOTIFY
The crash happens after I have requested the ->end() of the
ClientSubscription and a NOTIFY comes in from the wire and I call
rejectUpdate(404) since the user has "ended" the Client Subscription.
Then I usually get a call to onRequestRetry where we return 0. The crash
generally occurs after that. I will try to get a resip debug log up to
the crash.
Does this appear to be a resip issue or are we improperly implementing
our own AppDialog and AppDialogSet instances.
-Aron
---------------------------------------------
Aron Rosenberg
Founder and CTO
SightSpeed - http://www.sightspeed.com/
918 Parker St, Suite A14
Berkeley, CA 94710
Email: arosenberg at sightspeed.com
Phone: 510-665-2920
Cell: 510-847-7389
Fax: 510-649-9569
SightSpeed Video Link: http://aron.sightspeed.com
<http://aron.sightspeed.com/>
_______________________________________________
resiprocate-devel mailing list
resiprocate-devel at list.resiprocate.org
https://list.resiprocate.org/mailman/listinfo/resiprocate-devel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://list.resiprocate.org/pipermail/resiprocate-devel/attachments/20071005/9c7a31b0/attachment.htm>
More information about the resiprocate-devel
mailing list