[reSIProcate] AbstractFifo bug?

david Butcher david at purplecomm.com
Tue May 24 16:39:35 CDT 2005


Both getNext and getNext(int) look wrong.

Being woken up by the condition is not a guarantee that there is a message
available -- some other thread may have grabbed the element.

Need to check again after the condition returns and loop around the condition.

I'll try and fix it and test...

david



Quoting Scott Godin <slgodin at icescape.com>:

> I?ve noticed this problem too.   I?m pretty sure it?s Windows specific.
> There is relatively new code in the Condition class for Windows.  Also
> AbstractFifo.getNext(timeout) with a timeout hasn?t been used much inside
> the stack itself.
>
>
>
> Note:  The while (mFifo.empty()) line you are referencing is from getNext()
> with no parameters ? but the assertion happens in getNext(timeout).
>
>
>
> The temporary fix I?ve been using is to change line 44 from
>
> If (mFifo.empty())  to
>
> While (mFifo.empty())
>
> This more closely matches what getNext() without a timeout is doing.
> Although I?m not entirely sure why getNext is even using a while statement.
> Can anyone explain that?
>
>
>
> My best guess it that there is a bug a with the windows Condition ? such
> that it will signal, even though nothing has been added to the Fifo.
>
>
>
> I plan to



look at this closer this week.
>
>
>
> Note:  The mSize counter is always adjusted within a Mutex ? so its value
> should be properly maintained and protected.
>
>
>
> Scott
>
>
>
>  _____
>
> From: Micky Kaufmann [mailto:micky at proxy.co.il]
> Sent: Sunday, May 22, 2005 5:16 AM
> To: Derek MacDonald; resiprocate-devel at list.sipfoundry.org
> Subject: RE: [reSIProcate] AbstractFifo bug?
>
>
>
> I?m working on windows, visual studio 7.1   (Maybe the problem is with the
> Mutex being recursive?)
>
> Still what troubles me in the AbstractFifo is the fact that the while
> condition is ? while (mFifo.empty()) ? but the assert is not on mFifo. So it
> seems that the while condition should be ? while (mSize!=0) ?.
>
> The bug appears when a client tries to reach a non registered contact - the
> server tries to skip remaining stages of a transaction and terminate it but
> it fails...
>
>
>
> Here?s the stack trace:
>
>>          repro.exe!_NMSG_WRITE(int rterrnum=10)  Line 195      C
>
>            repro.exe!abort()  Line 44 + 0x7  C
>
>            repro.exe!_assert(const char * expr=0x00a652b0, const char *
> filename=0x00a65298, unsigned int lineno=57)  Line 306           C
>
>            repro.exe!resip::AbstractFifo::getNext(int ms=100)  Line 57 +
> 0x1a            C++
>
>            repro.exe!resip::TimeLimitFifo<resip::Message>::getNext(int
> ms=100)  Line 143 + 0xc            C++
>
>            repro.exe!repro::Proxy::thread()  Line 67 + 0xd    C++
>
>            repro.exe!threadWrapper(void * threadParm=0x000e06dc)  Line 34 +
> 0xd C++
>
>            kernel32.dll!7c80b50b()
>
>            ntdll.dll!7c96e0d4()
>
>            kernel32.dll!7c8399f3()
>
>
>
> And here is the debug info produces by repro:
>
>
>
> INFO | 20050522-110727.926 | c:\sip\repro\Debug\repro.exe | REPRO:APP | 1724
> | proxy.cxx:133 | Inserting new RequestContext
> tid=28CA0A6D1F57474F96CCF80478461139 -> RequestContext:  identity=
> candidates=[] count=1 final=0
>
> INFO | 20050522-110727.941 | c:\sip\repro\Debug\repro.exe | REPRO:APP | 1724
> | locationserver.cxx:50 | LocationServer monkey : no registered target for
> sip:33 at mickyxp.proxy.local send 480
>
> INFO | 20050522-110732.705 | c:\sip\repro\Debug\repro.exe |
> RESIP:TRANSACTION | 3356 | TransactionState.cxx:357 | discarding stray
> response: SipResp: 200 tid=c0a800190131c9b142904bd40000751c0000000a
> cseq=OPTIONS / 10 from(wire)
>
> INFO | 20050522-110732.986 | c:\sip\repro\Debug\repro.exe | REPRO:APP | 1724
> | requestcontext.cxx:49 | RequestContext::process(TransactionTerminated)
> 28CA0A6D1F57474F96CCF80478461139 : RequestContext:  identity= candidates=[]
> count=1 final=1
>
> Assertion failed: mSize != 0, file .\os\AbstractFifo.cxx, line 57
>
> INFO | 20050522-110752.699 | c:\sip\repro\Debug\repro.exe |
> RESIP:TRANSACTION | 3356 | TransactionState.cxx:357 | discarding stray
> response: SipResp: 200 tid=c0a800190131c9b142904be800006a170000000b
> cseq=OPTIONS / 11 from(wire)
>
> INFO | 20050522-110812.724 | c:\sip\repro\Debug\repro.exe |
> RESIP:TRANSACTION | 3356 | TransactionState.cxx:357 | discarding stray
> response: SipResp: 200 tid=c0a800190131c9b142904bfc000056860000000c
> cseq=OPTIONS / 12 from(wire)
>
>
>
>  _____
>
> From: Derek MacDonald [mailto:derek at xten.com]
> Sent: ? 19 ??? 2005 20:20
> To: Micky Kaufmann; resiprocate-devel at list.sipfoundry.org
> Subject: RE: [reSIProcate] AbstractFifo bug?
>
>
>
> Can you get a stack trace this? AbstractFifo is used a lot in resip, but it
> does have several subclasses.  The size is incremeneted in the add methods
> of the subclasses, which lock before adding.
>
>
>
> Also, run resiprocate/test/testFifo.cxx on your configuration.
>
>
>
> --Derek
>
>
>
> PS -- What is your platform/compiler?
>
>
>
> CONFIDENTIALITY NOTICE
>
> This email and any files transmitted with it contains proprietary
> information and, unless expressly stated otherwise, all contents and
> attachments are confidential. This email is intended for the addressee(s)
> only and access by anyone else is unauthorized. If you are not an addressee,
> any disclosure, distribution, printing or copying of the contents of this
> email or its attachments, or any action taken in reliance on it, is
> unauthorized and may be unlawful. If you are not an addressee, please inform
> the sender immediately and then delete this email and any copies of it.
> Thank you for your co-operation.
>
>  _____
>
> From: resiprocate-devel-bounces at list.sipfoundry.org
> [mailto:resiprocate-devel-bounces at list.sipfoundry.org] On Behalf Of Micky
> Kaufmann
> Sent: Thursday, May 19, 2005 6:31 AM
> To: resiprocate-devel at list.sipfoundry.org
> Subject: [reSIProcate] AbstractFifo bug?
>
>
>
> Hi. All
>
>
>
> While running repro I go an assert error from the AbstractFifo::getNext
> method:
>
>
>
> void* AbstractFifo::getNext() {
>
>   Lock lock(mMutex); (void)lock;
>
>
>
>   while (mFifo.empty())
>
>   {
>
>      mCondition.wait(mMutex);
>
>   }
>
>   void* firstMessage = mFifo.front();
>
>   mFifo.pop_front();
>
>   assert(mSize != 0);
>
>   mSize--;
>
>   return firstMessage;
>
> }
>
>
>
> As I see it the problem is caused probably because non thread safe adding of
> elements to the Fifo. Meaning mSize++ is missing somewhere and from what I
> investigated the only reason for using mSize instead of using
> std::deque::size is because we want to make sure that mSize < mMaxSize.
>
> Is there another reason?
>
>
>
> Is it ok to replace the lines:
>
>   assert(mSize != 0);
>
>   mSize--;
>
> with
>
>   mSize = mFifo.size();
>
>
>
> Is it ok to replace the line:
>
>   while (mFifo.empty())
>
> with
>
>   while (mSize==0)  //this.empty()
>
>
>
> Any other solution?
>
>





More information about the resiprocate-devel mailing list