[reSIProcate] resiprocate stack memory leak?

FrankYuan frankyuan at emergent-netsolutions.com
Fri Sep 22 09:42:21 CDT 2006


I found a way to trace leaked TIDs: I added a timestamp to the 
TransactionState class locally and print an error message if a 
transaction still exists after more than two minutes.
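
A minimal sketch of that idea, kept outside the stack so it stands on its 
own. LeakTracker, onCreate, onDestroy and findStale are hypothetical names 
used only for illustration and are not part of reSIProcate. The assumption 
is that the creation and destruction hooks are called from wherever the 
transaction state is constructed and destroyed.

```cpp
#include <chrono>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper, not a reSIProcate class: it records when each
// transaction id (tid) was created and reports the ones that outlive
// a configured maximum age.
using Clock = std::chrono::steady_clock;

class LeakTracker
{
   public:
      explicit LeakTracker(std::chrono::seconds maxAge) : mMaxAge(maxAge) {}

      // Call where a transaction state is created.
      void onCreate(const std::string& tid, Clock::time_point now)
      {
         mCreated[tid] = now;
      }

      // Call where a transaction state is destroyed.
      void onDestroy(const std::string& tid)
      {
         mCreated.erase(tid);
      }

      // Return the tids that are older than maxAge as of 'now'.
      std::vector<std::string> findStale(Clock::time_point now) const
      {
         std::vector<std::string> stale;
         for (const auto& entry : mCreated)
         {
            if (now - entry.second > mMaxAge)
            {
               stale.push_back(entry.first);
            }
         }
         return stale;
      }

   private:
      std::chrono::seconds mMaxAge;
      std::unordered_map<std::string, Clock::time_point> mCreated;
};
```

With a maximum age of 120 seconds, calling findStale(Clock::now()) 
periodically and logging each returned tid gives the "still exists after 
two minutes" report described above.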

With this I can figure out why the problem occurs and fix my code later on.

Deeply appreciate your help.

Frank Yuan




Byron Campen wrote:
> Okay, I will see about adding this feature, but the more pressing 
> concern I have right now is the (apparent) leakage of client 
> TransactionStates. Client transactions have a fixed lifetime, as 
> opposed to server transactions. If no response is received in a client 
> transaction, that transaction is supposed to die after 32s (regardless 
> of what the TU does). There is only one exception that I know of; when 
> we have sent an INVITE, and received a provisional response, there are 
> only two things that will cause the client transaction to be torn down:
>
> 1. We receive a final response (in this case, the transaction is 
> cleaned up after 64*T1, or 32s)
> 2. The TU sends a CANCEL to the stack (this will cause a 128*T1, or 
> 64s, timer to be set on the INVITE transaction).
>
> So, if we are not getting final responses, it is up to the TU to 
> decide when it no longer wants to pursue the call. In proxies, this is 
> done with Timer C (See RFC 3261 Sec 16.6 bullet 11, Sec 16.7 bullet 2, 
> and Sec 16.8). In endpoints, it is assumed the human user will 
> eventually give up on the call. I am not aware of any defined 
> mechanism for B2BUAs.
>
> Since you are running under heavy load, it is very likely stuff is 
> getting dropped by the kernel, so it seems likely that we are winding 
> up in a situation where the only thing that can keep us from leaking 
> is CANCEL from the TU.
>
> Best regards,
> Byron Campen
>
>> Yes, very useful.
>>
>> Could you add a debug message about the potential leak of a TID if 
>> there is no response from the TU when the timer expires?
>>
>> Could you list the conditions where TIDs may get leaked in 
>> Transaction Map?
>>
>> Case 1: for non-INVITE requests (internal and external), no 
>> response from the remote side or from its own TU, etc.
>> case 2: ...............
>> .............................
>>                   
>>
>> Frank Yuan
>> Emergent-Netsolutions.com
>> 972-359-6600
>>
>>
>> Byron Campen wrote:
>>> There is not (at present) any way to detect if particular 
>>> transactions have been up for an unusually long time. We could add a 
>>> debug feature that would have the stack start a timer when it sends 
>>> a request to the TU, and if that timer fires and finds that the 
>>> transaction is still in its initial state (ie, it hasn't gotten 
>>> anything from the TU), log information about that transaction, and 
>>> maybe even poke the TU. Would this be useful?
>>>
>>> Best regards,
>>> Byron Campen
>>>
>>>
>>>> Your input is very helpful.
>>>>
>>>> Are there ways for TU to detect TID leaks and free the lost TIDs?
>>>>
>>>> If yes, how? If no, it is tough to use this stateful stack.
>>>>
>>>> Could the stateless resip stack be used to get rid of the TID 
>>>> issue and the huge memory holding (32 to 64 seconds of holding time 
>>>> for each request)?
>>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Frank Yuan
>>>> Emergent-Netsolutions.com
>>>> 972-359-6600
>>>>
>>>>
>>>> Byron Campen wrote:
>>>>> I disagree with your first point. If the message gets to the TU, 
>>>>> it is the TU's responsibility to handle it. I have in the past 
>>>>> proposed something like a SipStack::defer(const resip::Data& tid, 
>>>>> MethodType method) that would tell the stack to clean up state for 
>>>>> that tid, but this idea did not catch on.
>>>>>
>>>>> As for the call volume question, you can set your TU's fifo to 
>>>>> have size and/or time-depth limitations (by re-initializing your 
>>>>> TU's fifo in the c'tor). If the stack detects that your fifo is 
>>>>> overfull, it will not forward new requests to you, but will 
>>>>> instead statelessly respond with 503s. Responses and ACKs will 
>>>>> still make it through, but if your TU ignores these, it could end 
>>>>> up leaking memory itself (since it never saw the response for the 
>>>>> request it is holding onto state for).
>>>>>
>>>>> However, there comes a point when the stack's ability to send out 
>>>>> 503s fast enough is overwhelmed, and really bad stuff starts 
>>>>> happening then. Also, if we are statelessly sending 503s, any ACKs 
>>>>> to those 503s end up being indistinguishable from ACK/200 (since 
>>>>> we never remembered the tid), and therefore get sent up to the TU, 
>>>>> further compounding the problem.
>>>>>
>>>>> Best regards,
>>>>> Byron Campen
>>>>>
>>>>>> Case 1: if the call volume is very high, some messages may get 
>>>>>> lost or dropped, and the SIP stack should have self-protection to 
>>>>>> prevent this problem.
>>>>>> Alternatively: are there mechanisms for the application to identify 
>>>>>> dangling TIDs and free them? Does the resip stack have function 
>>>>>> interfaces for the application to use?
>>>>>>
>>>>>> Question: if the call volume is too high, is there any mechanism 
>>>>>> for resip stack to detect it and discard any new request messages 
>>>>>> if the computer cannot handle it?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Frank Yuan
>>>>>> Emergent-Netsolutions.com
>>>>>> 972-359-6600
>>>>>>
>>>>>>
>>>>>> Byron Campen wrote:
>>>>>>>> TU summary: 0 TRANSPORT 0 TRANSACTION 0 *CLIENTTX 1998 SERVERTX 10690* TIMERS 0
>>>>>>>> Transaction summary: reqi 1225266 reqo 1200525 rspi 955555 rspo 1229993
>>>>>>>> Details: INVi 383069/S322324/F36145 INVo 344348/S322414/F0 ACKi 312462 ACKo 322360
>>>>>>>> BYEi 507150/S507141/F0 BYEo 307875/S307111/F0 CANi 22517/S507141/F0 CANo 2223/S1550/F593
>>>>>>>> MSGi 0/S0/F0 MSGo 0/S0/F0 OPTi 0/S0/F0 OPTo 0/S0/F0 REGi 68/S64/F0 REGo 0/S0/F0
>>>>>>>> PUBi 0/S0/F0 PUBo 0/S0/F0 SUBi 0/S0/F0 SUBo 0/S0/F0 NOTi 0/S0/F0 NOTo 0/S0/F0
>>>>>>>>
>>>>>>>
>>>>>>> Ok, the CLIENTTX and SERVERTX fields in the above logging 
>>>>>>> statement indicate that there are lots and lots of 
>>>>>>> TransactionStates lying around. Further, there are no timers 
>>>>>>> left in the TimerQueue, so we aren't likely to clean any of 
>>>>>>> these up. Let's talk about the server transactions first. There 
>>>>>>> are a couple of likely possibilities:
>>>>>>>
>>>>>>> 1. The TU is failing to respond to some of the requests that the 
>>>>>>> stack passes it; the stack will wait indefinitely for a response 
>>>>>>> from the TU. It is the TU's responsibility to respond to EVERY 
>>>>>>> request that is passed to it, no matter how malformed the 
>>>>>>> request might be. The TU should *never* elect to "quietly" drop 
>>>>>>> a request. Doing so is guaranteed to leak exactly one server 
>>>>>>> TransactionState.
>>>>>>>
>>>>>>> 2. High load conditions (note the number of retransmissions) 
>>>>>>> have caused the stack to leak transactions (I will take a closer 
>>>>>>> look at this)
>>>>>>>
>>>>>>> As for the client TransactionStates, this worries me more. There 
>>>>>>> are fewer things that the TU can do wrong that will cause the 
>>>>>>> stack to leak client TransactionStates. I will try to figure out 
>>>>>>> what might be happening here.
>>>>>>>
>>>>>>>
>>>>>>> So, are you using your own TU? If so, try putting a simple 
>>>>>>> counter that gets incremented for each request that comes from 
>>>>>>> the stack (excepting ACKs), and decremented for every 
>>>>>>> *final* response sent to the stack. If this counter ends up 
>>>>>>> being non-zero, you have a bug in your TU.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Byron Campen
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sep 21, 2006, at 1:36 PM, FrankYuan wrote:
>>>>>>>
>>>>>>>> Ten minutes after the call generator stopped, the resip statistics 
>>>>>>>> showed no problem with the FIFO queues. So I created a core file and 
>>>>>>>> printed the size of the transaction maps. There are still a lot of TIDs 
>>>>>>>> in the transaction maps, so they are at least part of what is holding 
>>>>>>>> memory. Should there be garbage collection to free these lost TIDs?
>>>>>>>>
>>>>>>>> Here are the log files:
>>>>>>>>
>>>>>>>> 20060921-125408.091 | TuSelector.cxx:71 | Stats message
>>>>>>>> 20060921-125408.091 | StatisticsMessage.cxx:153 | RESIP:TRANSACTION
>>>>>>>> TU summary: 0 TRANSPORT 0 TRANSACTION 0 CLIENTTX 1998 SERVERTX 10690 TIMERS 0
>>>>>>>> Transaction summary: reqi 1225266 reqo 1200525 rspi 955555 rspo 1229993
>>>>>>>> Details: INVi 383069/S322324/F36145 INVo 344348/S322414/F0 ACKi 312462 ACKo 322360
>>>>>>>> BYEi 507150/S507141/F0 BYEo 307875/S307111/F0 CANi 22517/S507141/F0 CANo 2223/S1550/F593
>>>>>>>> MSGi 0/S0/F0 MSGo 0/S0/F0 OPTi 0/S0/F0 OPTo 0/S0/F0 REGi 68/S64/F0 REGo 0/S0/F0
>>>>>>>> PUBi 0/S0/F0 PUBo 0/S0/F0 SUBi 0/S0/F0 SUBo 0/S0/F0 NOTi 0/S0/F0 NOTo 0/S0/F0
>>>>>>>> Retransmissions: INVx 116463 BYEx 105757 CANx 1499 MSGx 0 OPTx 0 REGx 0 finx 0 nonx 0
>>>>>>>> PUBx 0 SUBx 0 NOTx 0
>>>>>>>> 20060921-125708.084 | TuSelector.cxx:71 | Stats message
>>>>>>>> 20060921-125708.084 | StatisticsMessage.cxx:153 | RESIP:TRANSACTION
>>>>>>>> TU summary: 0 TRANSPORT 0 TRANSACTION 0 CLIENTTX 1998 SERVERTX 10690 TIMERS 0
>>>>>>>> Transaction summary: reqi 1225268 reqo 1200525 rspi 955555 rspo 1229995
>>>>>>>> Details: INVi 383069/S322324/F36145 INVo 344348/S322414/F0 ACKi 312462 ACKo 322360
>>>>>>>> BYEi 507150/S507141/F0 BYEo 307875/S307111/F0 CANi 22517/S507141/F0 CANo 2223/S1550/F593
>>>>>>>> MSGi 0/S0/F0 MSGo 0/S0/F0 OPTi 0/S0/F0 OPTo 0/S0/F0 REGi 70/S66/F0 REGo 0/S0/F0
>>>>>>>> PUBi 0/S0/F0 PUBo 0/S0/F0 SUBi 0/S0/F0 SUBo 0/S0/F0 NOTi 0/S0/F0 NOTo 0/S0/F0
>>>>>>>> Retransmissions: INVx 116463 BYEx 105757 CANx 1499 MSGx 0 OPTx 0 REGx 0 finx 0 nonx 0
>>>>>>>> PUBx 0 SUBx 0 NOTx 0
>>>>>>>> 20060921-130008.078 | TuSelector.cxx:71 | Stats message
>>>>>>>> 20060921-130008.085 | StatisticsMessage.cxx:153 | RESIP:TRANSACTION
>>>>>>>> TU summary: 0 TRANSPORT 0 TRANSACTION 0 CLIENTTX 1998 SERVERTX 10690 TIMERS 0
>>>>>>>> Transaction summary: reqi 1225270 reqo 1200525 rspi 955555 rspo 1229997
>>>>>>>> Details: INVi 383069/S322324/F36145 INVo 344348/S322414/F0 ACKi 312462 ACKo 322360
>>>>>>>> BYEi 507150/S507141/F0 BYEo 307875/S307111/F0 CANi 22517/S507141/F0 CANo 2223/S1550/F593
>>>>>>>> MSGi 0/S0/F0 MSGo 0/S0/F0 OPTi 0/S0/F0 OPTo 0/S0/F0 REGi 72/S68/F0 REGo 0/S0/F0
>>>>>>>> PUBi 0/S0/F0 PUBo 0/S0/F0 SUBi 0/S0/F0 SUBo 0/S0/F0 NOTi 0/S0/F0 NOTo 0/S0/F0
>>>>>>>> Retransmissions: INVx 116463 BYEx 105757 CANx 1499 MSGx 0 OPTx 0 REGx 0 finx 0 nonx 0
>>>>>>>> PUBx 0 SUBx 0 NOTx 0
>>>>>>>>
>>>>>>>>
>>>>>>>> (gdb) p (EnSipStack->myStack->mTransactionController->mClientTransactionMap)
>>>>>>>> warning: can't find class named `resip::SipStack', as given by C++ RTTI
>>>>>>>> $1 = {mMap = {_M_ht = {_M_node_allocator = {<No data fields>},
>>>>>>>>       _M_hash = {<No data fields>},
>>>>>>>>       _M_equals = {<binary_function<resip::Data,resip::Data,bool>> = {<No data fields>}, <No data fields>},
>>>>>>>>       _M_get_key = {<unary_function<std::pair<const resip::Data, resip::TransactionState*>,const resip::Data>> = {<No data fields>}, <No data fields>},
>>>>>>>>       _M_buckets = {<_Vector_base<__gnu_cxx::_Hashtable_node<std::pair<const resip::Data, resip::TransactionState*> >*,std::allocator<resip::TransactionState*> >> = {<_Vector_alloc_base<__gnu_cxx::_Hashtable_node<std::pair<const resip::Data, resip::TransactionState*> >*,std::allocator<resip::TransactionState*>,true>> =
>>>>>>>>  {_M_start = 0x920bdd10, _M_finish = 0x920c9d14,
>>>>>>>>             _M_end_of_storage = 0x920c9d14}, <No data fields>}, <No data fields>}, *_M_num_elements = 1998*}}}
>>>>>>>> (gdb) p (EnSipStack->myStack->mTransactionController->mServerTransactionMap)
>>>>>>>> warning: can't find class named `resip::SipStack', as given by C++ RTTI
>>>>>>>> $2 = {mMap = {_M_ht = {_M_node_allocator = {<No data fields>},
>>>>>>>>       _M_hash = {<No data fields>},
>>>>>>>>       _M_equals = {<binary_function<resip::Data,resip::Data,bool>> = {<No data fields>}, <No data fields>},
>>>>>>>>       _M_get_key = {<unary_function<std::pair<const resip::Data, resip::TransactionState*>,const resip::Data>> = {<No data fields>}, <No data fields>},
>>>>>>>>       _M_buckets = {<_Vector_base<__gnu_cxx::_Hashtable_node<std::pair<const resip::Data, resip::TransactionState*> >*,std::allocator<resip::TransactionState*> >> = {<_Vector_alloc_base<__gnu_cxx::_Hashtable_node<std::pair<const resip::Data, resip::TransactionState*> >*,std::allocator<resip::TransactionState*>,true>> =
>>>>>>>>  {_M_start = 0x8cc3e790, _M_finish = 0x8cc567d4,
>>>>>>>>             _M_end_of_storage = 0x8cc567d4}, <No data fields>}, <No data fields>}, *_M_num_elements = 10691*}}}
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> Frank Yuan
>>>>>>>> Emergent-Netsolutions.com
>>>>>>>> 972-359-6600
>>>>>>>>
>>>>>>>>
>>>>>>>> FrankYuan wrote:
>>>>>>>>> I am still working on it and will let you know as soon as I find 
>>>>>>>>> anything related.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> Frank Yuan
>>>>>>>>> Emergent-Netsolutions.com
>>>>>>>>> 972-359-6600
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Byron Campen wrote:
>>>>>>>>>   
>>>>>>>>>>     This code was written long before my time here at resiprocate, so 
>>>>>>>>>> I do not know. To those who are in the know, is this a relic that can 
>>>>>>>>>> be safely done away with?
>>>>>>>>>>
>>>>>>>>>> Did you verify whether or not you had a genuine memory leak (this is 
>>>>>>>>>> something I am very interested to know)?
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Byron Campen
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>     
>>>>>>>>>>> My question is why NoSize (0U-1) is used for mSize when the clear 
>>>>>>>>>>> function is called.
>>>>>>>>>>>
>>>>>>>>>>> mStateMachineFifo.size() may return either 0 or NoSize if the queue 
>>>>>>>>>>> is empty.
>>>>>>>>>>>
>>>>>>>>>>> It should always return 0 if the queue is empty, and NoSize should 
>>>>>>>>>>> not be used.
>>>>>>>>>>>
>>>>>>>>>>> NoSize causes confusion and is error prone.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>>
>>>>>>>>>>> Frank Yuan
>>>>>>>>>>> Emergent-Netsolutions.com
>>>>>>>>>>> 972-359-6600
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Jason Fischl wrote:
>>>>>>>>>>>       
>>>>>>>>>>>> On 9/20/06, Byron Campen <bcampen at estacado.net> wrote:
>>>>>>>>>>>>         
>>>>>>>>>>>>>         As for your questions about AbstractFifo, I am unsure why 
>>>>>>>>>>>>> mSize is
>>>>>>>>>>>>> needed. Can anyone answer this (or, answer why clear is a no-op)?
>>>>>>>>>>>>>
>>>>>>>>>>>>>           
>>>>>>>>>>>> The clear method is virtual and gets defined in the subclasses.
>>>>>>>>>>>>
>>>>>>>>>>>> I believe that mSize is there as an optimization.
>>>>>>>>>>>>
>>>>>>>>>>>>         
>>>>>>>>> _______________________________________________
>>>>>>>>> resiprocate-devel mailing list
>>>>>>>>> resiprocate-devel at list.sipfoundry.org
>>>>>>>>> https://list.sipfoundry.org/mailman/listinfo/resiprocate-devel