[reSIProcate] NULL pointer crash in DialogSet::dispatch withmAppDialogSet

Scott Godin slgodin at icescape.com
Mon Oct 1 21:27:33 CDT 2007


I given this a read through once, but I need more time to understand what
the issue is.  Hopefully I will get some time this week to think about it.
: )

 

From: Byron Campen [mailto:bcampen at estacado.net] 
Sent: October 1, 2007 6:31 PM
To: resiprocate-devel
Cc: Scott Godin
Subject: Re: [reSIProcate] NULL pointer crash in DialogSet::dispatch
withmAppDialogSet

 

            No comments? So, I take this to mean that nobody would be sore
at me if I were to remove every use of AppDialogSet::reuse() that looks
suspicious to me (ie, all of them)? Pipe up!

 

Best regards,

Byron Campen

 

 





            So, I've given this a look, and I have a theory. Aron said that
the crash happened sometime after the onRequestRetry() callback returned
0...

 

>From ClientSubscription.cxx, starting at line 165

*snip*

      if (msg.header(h_StatusLine).statusCode() == 408)

      {

         InfoLog (<< "Received 408 to SUBSCRIBE "

                  << mLastRequest->header(h_To));

         retry = handler->onRequestRetry(getHandle(), 0, msg);

      }

      else

      {

         InfoLog (<< "Received non-408 retriable to SUBSCRIBE "

                  << mLastRequest->header(h_To));

         retry = handler->onRequestRetry(getHandle(),
msg.header(h_RetryAfter).value(), msg);

      }

 

      if (retry < 0)

      {

         DebugLog(<< "Application requested failure on Retry-After");

         mEnded = true;

         handler->onTerminated(getHandle(), msg);

         delete this;

         return;

      }

      else if (retry == 0)

      {

         DebugLog(<< "Application requested immediate retry on
Retry-After");

         //!dcm! -- why isn't this just a refresh--retry after might be

         //middle element and not indicate dialog destruction       

         if (mDialog.mRemoteTarget.uri().host().empty())

         {

            SharedPtr<SipMessage> sub =
mDum.makeSubscription(mLastRequest->header(h_To), getEventType());

            mDum.send(sub);

         }

         else

         {

            SharedPtr<SipMessage> sub =
mDum.makeSubscription(mDialog.mRemoteTarget, getEventType(),
getAppDialogSet()->reuse());

            mDum.send(sub);

         }

         //!dcm! -- new sub created above, when does this usage get
destroyed?

         //return;

      }

      else 

      {

         // leave the usage around until the timeout

         // !dlb! would be nice to set the state to something dead, but not
used

         mDum.addTimer(DumTimeout::SubscriptionRetry, 

                       retry, 

                       getBaseHandle(),

                       ++mTimerSeq);

         // leave the usage around until the timeout

         return;

      }

            

      delete this;

      return;

 

*snip*

 

     So, returning 0 on the onRequestRetry() callback causes the
AppDialogSet* to be unset (See impl of AppDialogSet::reuse()), and the
ClientSubscription to be deleted. Then, in ~DialogUsage() (from which
ClientSubscription is derived)...

 

*snip*

DialogUsage::~DialogUsage()

{

   mDialog.possiblyDie();

}

*snip*

 

When the Dialog gets this call, it calls DialogUsageManager::destroy() on
itself, like so:

 

*snip*

void Dialog::possiblyDie()

{

   // !slg! Note:  dialogs should really stick around for 32s, in order to
ensure that all 2xx retransmissions get Ack'd, then BYE'd correctly

   if (!mDestroying)

   {

      if (mClientSubscriptions.empty() &&

          mServerSubscriptions.empty() &&

          !mInviteSession)

      {

         mDestroying = true;

         mDum.destroy(this);

      }

   }

}

*snip*

 

Which in turn causes the following to happen:

 

*snip*

void

DialogUsageManager::destroy(Dialog* d)

{

   if (mShutdownState != Destroying)

   {

      post(new DestroyUsage(d));

   }

   else

   {

      InfoLog(<< "DialogUsageManager::destroy() not posting to stack");

   }

}

*snip*

 

     So, we asynchronously schedule the destruction of the Dialog. The
DialogSet is still around. If some messaging comes in, and beats the
DestroyUsage message to the front of the queue (not an unlikely scenario at
all, and made much more likely by sitting at a breakpoint), the messaging
will be sent to the DialogSet, which will then core in exactly the manner
you're seeing.

 

     Ultimately, if the AppDialogSet is reused, we have to be completely
sure that the DialogSet will be gone before any further messaging can
arrive.

 

     On top of this, we have the issue of what happens when we have more
than one Dialog in this DialogSet (say, the SUBSCRIBE got forked). If one of
the forks decides it needs to start a new subscription in the manner above,
the other subscriptions in the DialogSet will find themselves in trouble
very shortly. This isn't even a race. It's basically guaranteed to fail.

 

     My inclination is to turn AppDialogSet::reuse() into a clone(), or into
a call to the AppDialogSetFactory. (Or, we need to just be tolerant if the
AppDialogSet is null. But I imagine we have opened a rather large can or
worms here, and there are probably other nasty race-conditions that we'll
encounter as a result of the non-atomicity of DialogSet teardown.)

 

Best regards,

Byron Campen





It's possible you've exposed a race condition by using the debugger - I
can't think of what it would be offhand though. : (  The Debug log should
provide more insight.

 

Scott

 

From: Aron Rosenberg [mailto:arosenberg at sightspeed.com] 
Sent: Friday, July 27, 2007 2:48 PM
To: Scott Godin; resiprocate-devel at list.resiprocate.org
Subject: RE: [reSIProcate] NULL pointer crash in DialogSet::dispatch
withmAppDialogSet

 

One thing that might be going on is that when I sit in a breakpoint, a 408
transaction is created by the resip stack because of the "timeout". Could
this because the reason for the crash, onRemoved gets called, but a 408 is
still active.

 

-Aron

 

  _____  

From: Scott Godin [mailto:slgodin at icescape.com] 
Sent: Friday, July 27, 2007 10:24 AM
To: Aron Rosenberg; resiprocate-devel at list.resiprocate.org
Subject: RE: [reSIProcate] NULL pointer crash in DialogSet::dispatch
withmAppDialogSet

 

Are you calling all DUM API's from the same thread (including the process
loop)?

 

From: Aron Rosenberg [mailto:arosenberg at sightspeed.com] 
Sent: Friday, July 27, 2007 12:39 PM
To: Scott Godin; resiprocate-devel at list.resiprocate.org
Subject: RE: [reSIProcate] NULL pointer crash in DialogSet::dispatch
withmAppDialogSet

 

DUM controls everything. This crash only happens if I have hit a breakpoint
and let it sit for a couple seconds before continuing. For example, If I
have a breakpoint in the onTerminated or onRequestRetry for
ClientSubscriptionHandler implementation. It will crash after the continue.
This never happens except in debug mode and only if breakpoints were used on
the resiprocate thread.

 

-Aron

 

  _____  

From: Scott Godin [mailto:slgodin at icescape.com] 
Sent: Friday, July 27, 2007 6:48 AM
To: Aron Rosenberg; resiprocate-devel at list.resiprocate.org
Subject: RE: [reSIProcate] NULL pointer crash in DialogSet::dispatch
withmAppDialogSet

 

If the DialogSet is still around the AppDialogSet should also be around.
Are you deleting your AppDialogSet instances or letting DUM control their
lifetime?

 

Scott

 

From: resiprocate-devel-bounces at list.resiprocate.org
[mailto:resiprocate-devel-bounces at list.resiprocate.org] On Behalf Of Aron
Rosenberg
Sent: Thursday, July 26, 2007 12:20 PM
To: resiprocate-devel at list.resiprocate.org
Subject: [reSIProcate] NULL pointer crash in DialogSet::dispatch
withmAppDialogSet

 

So I have been debugging some resip issues with ClientSubscriptions and keep
getting a NULL pointer crash after I have been sitting at some breakpoints
in the code. The crash occurs in:

 

DialogSet.cxx line 775

      DebugLog ( << "### Calling CreateAppDialog ###: " << std::endl <<
std::endl <<msg);

--->  AppDialog* appDialog = mAppDialogSet->createAppDialog(msg);

      dialog->mAppDialog = appDialog;

      appDialog->mDialog = dialog;

      dialog->dispatch(msg);

 

mAppDialogSet is NULL. The call stack for this crash is as follows:

 

>          resip::DialogSet::dispatch(const resip::SipMessage & msg={...})
Line 775 + 0x1c  C++

            resip::DialogUsageManager::processRequest(const
resip::SipMessage & request={...})  Line 1560   C++

 
resip::DialogUsageManager::incomingProcess(std::auto_ptr<resip::Message>
msg={...})  Line 1281 C++

 
resip::DialogUsageManager::internalProcess(std::auto_ptr<resip::Message>
msg={...})  Line 1134   C++

            resip::DialogUsageManager::process()  Line 1300 + 0x49  C++

            SipEP::run()  Line 2404 + 0xb     C++ // Is our thread loop

 

 

mState == Established

msg is a NOTIFY

 

The crash happens after I have requested the ->end() of the
ClientSubscription and a NOTIFY comes in from the wire and I call
rejectUpdate(404) since the user has "ended" the Client Subscription. Then I
usually get a call to onRequestRetry where we return 0. The crash generally
occurs after that. I will try to get a resip debug log up to the crash.

 

Does this appear to be a resip issue or are we improperly implementing our
own AppDialog and AppDialogSet instances.

 

-Aron

 

 

---------------------------------------------

Aron Rosenberg

Founder and CTO

SightSpeed - http://www.sightspeed.com/

 

918 Parker St, Suite A14

Berkeley, CA 94710

 

Email: arosenberg at sightspeed.com

Phone: 510-665-2920

Cell: 510-847-7389

Fax: 510-649-9569

SightSpeed Video Link: http://aron.sightspeed.com
<http://aron.sightspeed.com/> 

 

 

 

_______________________________________________

resiprocate-devel mailing list

resiprocate-devel at list.resiprocate.org

https://list.resiprocate.org/mailman/listinfo/resiprocate-devel

 

_______________________________________________

resiprocate-devel mailing list

resiprocate-devel at resiprocate.org

https://list.resiprocate.org/mailman/listinfo/resiprocate-devel

 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://list.resiprocate.org/pipermail/resiprocate-devel/attachments/20071001/2fb910ac/attachment.htm>


More information about the resiprocate-devel mailing list