[reSIProcate] race condition in AppDialogSet::end()
Hello,
When running stress tests on my application a race condition happens
after several minutes clearing a call from a different thread (stack
trace follows):
__________________________________________________________
signalgw.exe!std::_Tree<std::_Tmap_traits<resip::DialogId,resip::Dialog
*,std::less<resip::DialogId>,std::allocator<std::pair<resip::DialogId
const ,resip::Dialog *> >,0> >::const_iterator::_Inc() Line 370 + 0x15
bytes C++
signalgw.exe!std::_Tree<std::_Tmap_traits<resip::DialogId,resip::Dialog
*,std::less<resip::DialogId>,std::allocator<std::pair<resip::DialogId
const ,resip::Dialog *> >,0> >::const_iterator::operator++() Line 264
C++
signalgw.exe!std::_Tree<std::_Tmap_traits<resip::DialogId,resip::Dialog
*,std::less<resip::DialogId>,std::allocator<std::pair<resip::DialogId
const ,resip::Dialog *> >,0> >::iterator::operator++() Line 462 C++
signalgw.exe!resip::DialogSet::end() Line 901 + 0x2c bytes C++
signalgw.exe!resip::AppDialogSet::end() Line 38 C++
signalgw.exe!TestAppDialogSet::end() Line 53 C++ (from here
downwards is my application code)
signalgw.exe!tSessionInfo::endCall() Line 158 + 0x25 bytes C++
signalgw.exe!UserAgent::clearCall(const
std::basic_string<char,std::char_traits<char>,std::allocator<char> > &
strCallToken="YzcwMDE5MWViZDNkOTdmOTk1MmQyN2YxOTNmNDMyNWE.", int
nQ931=224) Line 473 C++
__________________________________________________________
I get a debug error - the following is the cause ( in DialogSet.cxx,
line 901 )
case Established:
{
-> for (DialogMap::iterator it = mDialogs.begin(); it !=
mDialogs.end(); ++it)
{
try
{
it->second->end();
}
catch(UsageUseException& e)
{
InfoLog (<< "Caught: " << e);
}
}
mState = Terminating;
break;
}
debug error ( "map/set iterator not incrementable" )
I assumed this was because there is a race condition on mDialogs and the
dialog was removed by the resiprocate thread in Dialog.cxx (in the
Dialog destructor, at, mDialogSet.mDialogs.erase(this->getId()); ) just
when the for loop above was started. I searched a bit and found no
obvious method to inform the other thread or protect from calling the
dialog end method. At first I thought of using the virtual destructor of
my AppDialog derived class but the call order
in the the destructor is
mDialogSet.mDialogs.erase(this->getId());
delete mAppDialog;
So the virtual destructor code is reached after the damage is already
done, when it is too late to take any action.
To solve this problem I recompiled the DUM library after switching the
order of the two lines of code above:
delete mAppDialog;
mDialogSet.mDialogs.erase(this->getId());
So now in my AppDialog virtual destructor code, I set a flag so that
getAppDialogSet()->end() will not be called in my application after the
Dialog destructor is invoked. This change seemed to to the trick and
with it the application remains stable in stress tests. If there is a
better way to solve this race condition (i.e a relatively simple method
that does not involve changing the DUM code) please let me know.
Thanks
Amnon