[reSIProcate] resiprocate stack memory leak?
FrankYuan
frankyuan at emergent-netsolutions.com
Fri Sep 22 09:42:21 CDT 2006
I found a way to trace leaked TIDs by adding a timestamp to the
TransactionState class locally and printing an error message if a
transaction still exists after more than two minutes.
I can now figure out why the problem occurs and fix my code later on.
I deeply appreciate your help.
Frank Yuan
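
A minimal sketch of the kind of age tracking described above, assuming a
separate bookkeeping map rather than a field inside TransactionState.
TransactionAgeTracker and its methods are hypothetical names for
illustration only; they are not part of reSIProcate.

```cpp
#include <chrono>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper, not part of reSIProcate: remembers when each
// transaction id (tid) was created so that long-lived ones can be reported.
class TransactionAgeTracker
{
   public:
      using Clock = std::chrono::steady_clock;

      // Call when a transaction is created. Calling it again for the same
      // tid overwrites the stored creation time.
      void add(const std::string& tid, Clock::time_point now = Clock::now())
      {
         mCreated[tid] = now;
      }

      // Call when a transaction is destroyed.
      void remove(const std::string& tid)
      {
         mCreated.erase(tid);
      }

      // Return the tids that have existed for longer than maxAge.
      std::vector<std::string> stale(Clock::time_point now,
                                     std::chrono::seconds maxAge) const
      {
         std::vector<std::string> result;
         for (const auto& entry : mCreated)
         {
            if (now - entry.second > maxAge)
            {
               result.push_back(entry.first);
            }
         }
         return result;
      }

   private:
      std::unordered_map<std::string, Clock::time_point> mCreated;
};
```

Calling stale(Clock::now(), std::chrono::seconds(120)) periodically and
logging the result would correspond to the two-minute check mentioned above.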
Byron Campen wrote:
> Okay, I will see about adding this feature, but the more pressing
> concern I have right now is the (apparent) leakage of client
> TransactionStates. Client transactions have a fixed lifetime, as
> opposed to server transactions. If no response is received in a client
> transaction, that transaction is supposed to die after 32s (regardless
> of what the TU does). There is only one exception that I know of; when
> we have sent an INVITE, and received a provisional response, there are
> only two things that will cause the client transaction to be torn down:
>
> 1. We receive a final response (in this case, the transaction is
> cleaned up after 64*T1, or 32s)
> 2. The TU sends a CANCEL to the stack (this will cause a 128*T1, or
> 64s, timer to be set on the INVITE transaction).
>
> So, if we are not getting final responses, it is up to the TU to
> decide when it no longer wants to pursue the call. In proxies, this is
> done with Timer C (See RFC 3261 Sec 16.6 bullet 11, Sec 16.7 bullet 2,
> and Sec 16.8). In endpoints, it is assumed the human user will
> eventually give up on the call. I am not aware of any defined
> mechanism for B2BUAs.
>
> Since you are running under heavy load, it is very likely stuff is
> getting dropped by the kernel, so it seems likely that we are winding
> up in a situation where the only thing that can keep us from leaking
> is CANCEL from the TU.
>
> Best regards,
> Byron Campen
>
>> Yes, very useful.
>>
>> Could you add a debug message concerning the potential leak of a TID
>> if there is no response from the TU when the timer expires?
>>
>> Could you list the conditions under which TIDs may get leaked in the
>> transaction map?
>>
>> case 1: For non-INVITE requests (internal and external), no
>> response from the remote side or from its own TU, etc.
>> case 2: ...............
>> .............................
>>
>>
>> Frank Yuan
>> Emergent-Netsolutions.com
>> 972-359-6600
>>
>>
>> Byron Campen wrote:
>>> There is not (at present) any way to detect if particular
>>> transactions have been up for an unusually long time. We could add a
>>> debug feature that would have the stack start a timer when it sends
>>> a request to the TU, and if that timer fires and finds that the
>>> transaction is still in its initial state (ie, it hasn't gotten
>>> anything from the TU), log information about that transaction, and
>>> maybe even poke the TU. Would this be useful?
>>>
>>> Best regards,
>>> Byron Campen
>>>
>>>
>>>> Your input is very helpful.
>>>>
>>>> Are there ways for the TU to detect TID leaks and free the lost TIDs?
>>>>
>>>> If yes, how? If no, it is tough to use this stateful stack.
>>>>
>>>> Could the resip stack be used statelessly in order to get rid of the
>>>> TID issue and the huge memory holding (32 to 64 seconds of holding
>>>> time for each request)?
>>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Frank Yuan
>>>> Emergent-Netsolutions.com
>>>> 972-359-6600
>>>>
>>>>
>>>> Byron Campen wrote:
>>>>> I disagree with your first point. If the message gets to the TU,
>>>>> it is the TU's responsibility to handle it. I have in the past
>>>>> proposed something like a SipStack::defer(const resip::Data& tid,
>>>>> MethodType method) that would tell the stack to clean up state for
>>>>> that tid, but this idea did not take on.
>>>>>
>>>>> As for the call volume question, you can set your TU's fifo to
>>>>> have size and/or time-depth limitations (by re-initializing your
>>>>> TU's fifo in the c'tor). If the stack detects that your fifo is
>>>>> overfull, it will not forward new requests to you, but will
>>>>> instead statelessly respond with 503s. Responses and ACKs will
>>>>> still make it through, but if your TU ignores these, it could end
>>>>> up leaking memory itself (since it never saw the response for the
>>>>> request it is holding onto state for).
>>>>>
>>>>> However, there comes a point when the stack's ability to send out
>>>>> 503s fast enough is overwhelmed, and really bad stuff starts
>>>>> happening then. Also, if we are statelessly sending 503s, any ACKs
>>>>> to those 503s end up being indistinguishable from ACK/200 (since
>>>>> we never remembered the tid), and therefore get sent up to the TU,
>>>>> further compounding the problem.
>>>>>
>>>>> Best regards,
>>>>> Byron Campen
>>>>>
>>>>>> Case 1: if the call volume is very high, some messages may get
>>>>>> lost or dropped, and the SIP stack should have self-protection to
>>>>>> prevent this problem.
>>>>>> Alternatively: are there mechanisms for the application to
>>>>>> identify dangling TIDs and free them? Does the resip stack have
>>>>>> function interfaces for the application to use?
>>>>>>
>>>>>> Question: if the call volume is too high, is there any mechanism
>>>>>> for the resip stack to detect it and discard any new request
>>>>>> messages if the computer cannot handle it?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Frank Yuan
>>>>>> Emergent-Netsolutions.com
>>>>>> 972-359-6600
>>>>>>
>>>>>>
>>>>>> Byron Campen wrote:
>>>>>>>> TU summary: 0 TRANSPORT 0 TRANSACTION 0 *CLIENTTX 1998 SERVERTX 10690* TIMERS 0
>>>>>>>> Transaction summary: reqi 1225266 reqo 1200525 rspi 955555 rspo 1229993
>>>>>>>> Details: INVi 383069/S322324/F36145 INVo 344348/S322414/F0
>>>>>>>> ACKi 312462 ACKo 322360
>>>>>>>> BYEi 507150/S507141/F0 BYEo 307875/S307111/F0
>>>>>>>> CANi 22517/S507141/F0 CANo 2223/S1550/F593
>>>>>>>> MSGi 0/S0/F0 MSGo 0/S0/F0 OPTi 0/S0/F0 OPTo 0/S0/F0
>>>>>>>> REGi 68/S64/F0 REGo 0/S0/F0 PUBi 0/S0/F0 PUBo 0/S0/F0
>>>>>>>> SUBi 0/S0/F0 SUBo 0/S0/F0 NOTi 0/S0/F0 NOTo 0/S0/F0
>>>>>>>>
>>>>>>>
>>>>>>> Ok, the CLIENTTX and SERVERTX fields in the above logging
>>>>>>> statement indicate that there are lots and lots of
>>>>>>> TransactionStates lying around. Further, there are no timers
>>>>>>> left in the TimerQueue, so we aren't likely to clean any of
>>>>>>> these up. Let's talk about the server transactions first. There
>>>>>>> are a couple of likely possibilities:
>>>>>>>
>>>>>>> 1. The TU is failing to respond to some of the requests that the
>>>>>>> stack passes it; the stack will wait indefinitely for a response
>>>>>>> from the TU. It is the TU's responsibility to respond to EVERY
>>>>>>> request that is passed to it, no matter how malformed the
>>>>>>> request might be. The TU should *never* elect to "quietly" drop
>>>>>>> a request. Doing so is guaranteed to leak exactly one server
>>>>>>> TransactionState.
>>>>>>>
>>>>>>> 2. High load conditions (note the number of retransmissions)
>>>>>>> have caused the stack to leak transactions (I will take a closer
>>>>>>> look at this)
>>>>>>>
>>>>>>> As for the client TransactionStates, this worries me more. There
>>>>>>> are fewer things that the TU can do wrong that will cause the
>>>>>>> stack to leak client TransactionStates. I will try to figure out
>>>>>>> what might be happening here.
>>>>>>>
>>>>>>>
>>>>>>> So, are you using your own TU? If so, try putting a simple
>>>>>>> counter that gets incremented for each request that comes from
>>>>>>> the stack (excepting ACKs), and decremented for every
>>>>>>> *final* response sent to the stack. If this counter ends up
>>>>>>> being non-zero, you have a bug in your TU.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Byron Campen
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sep 21, 2006, at 1:36 PM, FrankYuan wrote:
>>>>>>>
>>>>>>>> After the call generator had been stopped for 10 minutes, the
>>>>>>>> resip statistics showed no problem with the FIFO queues. So I
>>>>>>>> created a core file and printed the size of the transaction maps.
>>>>>>>> There are still a lot of TIDs in the transaction maps, so they are
>>>>>>>> at least part of what is holding memory.
>>>>>>>> Should there be garbage collection to free these lost TIDs?
>>>>>>>>
>>>>>>>> Here are the log files:
>>>>>>>>
>>>>>>>> 20060921-125408.091 | TuSelector.cxx:71 | Stats message
>>>>>>>> 20060921-125408.091 | StatisticsMessage.cxx:153 | RESIP:TRANSACTION
>>>>>>>> TU summary: 0 TRANSPORT 0 TRANSACTION 0 CLIENTTX 1998 SERVERTX 10690 TIMERS 0
>>>>>>>> Transaction summary: reqi 1225266 reqo 1200525 rspi 955555 rspo 1229993
>>>>>>>> Details: INVi 383069/S322324/F36145 INVo 344348/S322414/F0
>>>>>>>> ACKi 312462 ACKo 322360
>>>>>>>> BYEi 507150/S507141/F0 BYEo 307875/S307111/F0
>>>>>>>> CANi 22517/S507141/F0 CANo 2223/S1550/F593
>>>>>>>> MSGi 0/S0/F0 MSGo 0/S0/F0 OPTi 0/S0/F0 OPTo 0/S0/F0
>>>>>>>> REGi 68/S64/F0 REGo 0/S0/F0 PUBi 0/S0/F0 PUBo 0/S0/F0
>>>>>>>> SUBi 0/S0/F0 SUBo 0/S0/F0 NOTi 0/S0/F0 NOTo 0/S0/F0
>>>>>>>> Retransmissions: INVx 116463 BYEx 105757 CANx 1499 MSGx 0 OPTx 0
>>>>>>>> REGx 0 finx 0 nonx 0 PUBx 0 SUBx 0 NOTx 0
>>>>>>>> 20060921-125708.084 | TuSelector.cxx:71 | Stats message
>>>>>>>> 20060921-125708.084 | StatisticsMessage.cxx:153 | RESIP:TRANSACTION
>>>>>>>> TU summary: 0 TRANSPORT 0 TRANSACTION 0 CLIENTTX 1998 SERVERTX 10690 TIMERS 0
>>>>>>>> Transaction summary: reqi 1225268 reqo 1200525 rspi 955555 rspo 1229995
>>>>>>>> Details: INVi 383069/S322324/F36145 INVo 344348/S322414/F0
>>>>>>>> ACKi 312462 ACKo 322360
>>>>>>>> BYEi 507150/S507141/F0 BYEo 307875/S307111/F0
>>>>>>>> CANi 22517/S507141/F0 CANo 2223/S1550/F593
>>>>>>>> MSGi 0/S0/F0 MSGo 0/S0/F0 OPTi 0/S0/F0 OPTo 0/S0/F0
>>>>>>>> REGi 70/S66/F0 REGo 0/S0/F0 PUBi 0/S0/F0 PUBo 0/S0/F0
>>>>>>>> SUBi 0/S0/F0 SUBo 0/S0/F0 NOTi 0/S0/F0 NOTo 0/S0/F0
>>>>>>>> Retransmissions: INVx 116463 BYEx 105757 CANx 1499 MSGx 0 OPTx 0
>>>>>>>> REGx 0 finx 0 nonx 0 PUBx 0 SUBx 0 NOTx 0
>>>>>>>> 20060921-130008.078 | TuSelector.cxx:71 | Stats message
>>>>>>>> 20060921-130008.085 | StatisticsMessage.cxx:153 | RESIP:TRANSACTION
>>>>>>>> TU summary: 0 TRANSPORT 0 TRANSACTION 0 CLIENTTX 1998 SERVERTX 10690 TIMERS 0
>>>>>>>> Transaction summary: reqi 1225270 reqo 1200525 rspi 955555 rspo 1229997
>>>>>>>> Details: INVi 383069/S322324/F36145 INVo 344348/S322414/F0
>>>>>>>> ACKi 312462 ACKo 322360
>>>>>>>> BYEi 507150/S507141/F0 BYEo 307875/S307111/F0
>>>>>>>> CANi 22517/S507141/F0 CANo 2223/S1550/F593
>>>>>>>> MSGi 0/S0/F0 MSGo 0/S0/F0 OPTi 0/S0/F0 OPTo 0/S0/F0
>>>>>>>> REGi 72/S68/F0 REGo 0/S0/F0 PUBi 0/S0/F0 PUBo 0/S0/F0
>>>>>>>> SUBi 0/S0/F0 SUBo 0/S0/F0 NOTi 0/S0/F0 NOTo 0/S0/F0
>>>>>>>> Retransmissions: INVx 116463 BYEx 105757 CANx 1499 MSGx 0 OPTx 0
>>>>>>>> REGx 0 finx 0 nonx 0 PUBx 0 SUBx 0 NOTx 0
>>>>>>>>
>>>>>>>>
>>>>>>>> (gdb) p (EnSipStack->myStack->mTransactionController->mClientTransactionMap)
>>>>>>>> warning: can't find class named `resip::SipStack', as given by C++ RTTI
>>>>>>>> $1 = {mMap = {_M_ht = {_M_node_allocator = {<No data fields>},
>>>>>>>> _M_hash = {<No data fields>},
>>>>>>>> _M_equals = {<binary_function<resip::Data,resip::Data,bool>> = {<No data fields>}, <No data fields>},
>>>>>>>> _M_get_key = {<unary_function<std::pair<const resip::Data, resip::TransactionState*>,const resip::Data>> = {<No data fields>}, <No data fields>},
>>>>>>>> _M_buckets = {<_Vector_base<__gnu_cxx::_Hashtable_node<std::pair<const resip::Data, resip::TransactionState*> >*,std::allocator<resip::TransactionState*> >> =
>>>>>>>> {<_Vector_alloc_base<__gnu_cxx::_Hashtable_node<std::pair<const resip::Data, resip::TransactionState*> >*,std::allocator<resip::TransactionState*>,true>> =
>>>>>>>> {_M_start = 0x920bdd10, _M_finish = 0x920c9d14, _M_end_of_storage = 0x920c9d14}, <No data fields>},
>>>>>>>> <No data fields>}, *_M_num_elements = 1998*}}}
>>>>>>>> (gdb) p (EnSipStack->myStack->mTransactionController->mServerTransactionMap)
>>>>>>>> warning: can't find class named `resip::SipStack', as given by C++ RTTI
>>>>>>>> $2 = {mMap = {_M_ht = {_M_node_allocator = {<No data fields>},
>>>>>>>> _M_hash = {<No data fields>},
>>>>>>>> _M_equals = {<binary_function<resip::Data,resip::Data,bool>> = {<No data fields>}, <No data fields>},
>>>>>>>> _M_get_key = {<unary_function<std::pair<const resip::Data, resip::TransactionState*>,const resip::Data>> = {<No data fields>}, <No data fields>},
>>>>>>>> _M_buckets = {<_Vector_base<__gnu_cxx::_Hashtable_node<std::pair<const resip::Data, resip::TransactionState*> >*,std::allocator<resip::TransactionState*> >> =
>>>>>>>> {<_Vector_alloc_base<__gnu_cxx::_Hashtable_node<std::pair<const resip::Data, resip::TransactionState*> >*,std::allocator<resip::TransactionState*>,true>> =
>>>>>>>> {_M_start = 0x8cc3e790, _M_finish = 0x8cc567d4, _M_end_of_storage = 0x8cc567d4}, <No data fields>},
>>>>>>>> <No data fields>}, *_M_num_elements = 10691*}}}
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> Frank Yuan
>>>>>>>> Emergent-Netsolutions.com
>>>>>>>> 972-359-6600
>>>>>>>>
>>>>>>>>
>>>>>>>> FrankYuan wrote:
>>>>>>>>> I am still working on it and will let you know as soon as I find
>>>>>>>>> anything related.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> Frank Yuan
>>>>>>>>> Emergent-Netsolutions.com
>>>>>>>>> 972-359-6600
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Byron Campen wrote:
>>>>>>>>>
>>>>>>>>>> This code was written long before my time here at resiprocate, so
>>>>>>>>>> I do not know. To those who are in the know, is this a relic that can
>>>>>>>>>> be safely done away with?
>>>>>>>>>>
>>>>>>>>>> Did you verify whether or not you had a genuine memory leak (this is
>>>>>>>>>> something I am very interested to know)?
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Byron Campen
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> My question is why NoSize (0U-1) is used for mSize when the clear
>>>>>>>>>>> function is called.
>>>>>>>>>>>
>>>>>>>>>>> mStateMachineFifo.size() may return either 0 or NoSize if the queue
>>>>>>>>>>> is empty.
>>>>>>>>>>>
>>>>>>>>>>> It should always return 0 if the queue is empty, and NoSize should
>>>>>>>>>>> not be used.
>>>>>>>>>>>
>>>>>>>>>>> NoSize causes confusion and is error-prone.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>>
>>>>>>>>>>> Frank Yuan
>>>>>>>>>>> Emergent-Netsolutions.com
>>>>>>>>>>> 972-359-6600
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Jason Fischl wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On 9/20/06, Byron Campen <bcampen at estacado.net> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> As for your questions about AbstractFifo, I am unsure why
>>>>>>>>>>>>> mSize is
>>>>>>>>>>>>> needed. Can anyone answer this (or, answer why clear is a no-op)?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> The clear method is virtual and gets defined in the subclasses.
>>>>>>>>>>>>
>>>>>>>>>>>> I believe that mSize is there as an optimization.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> resiprocate-devel mailing list
>>>>>>>>> resiprocate-devel at list.sipfoundry.org
>>>>>>>>> https://list.sipfoundry.org/mailman/listinfo/resiprocate-devel
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>
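
The request/response counter suggested in the quoted thread (incremented
for each non-ACK request from the stack, decremented for each final
response sent back) could be sketched roughly as follows. The class
OutstandingRequestCounter and its method names are hypothetical and not
part of reSIProcate, and treating any status code of 200 or above as
final is the usual SIP convention rather than something stated in the
thread.

```cpp
#include <string>

// Hypothetical TU-side bookkeeping, not a reSIProcate API.
class OutstandingRequestCounter
{
   public:
      // Call for every request the stack hands to the TU.
      // ACKs are not counted because they are never answered.
      void onRequestFromStack(const std::string& method)
      {
         if (method != "ACK")
         {
            ++mOutstanding;
         }
      }

      // Call for every response the TU sends to the stack.
      // Only final responses (status code >= 200) close a request;
      // provisional (1xx) responses leave the count unchanged.
      void onResponseToStack(int statusCode)
      {
         if (statusCode >= 200)
         {
            --mOutstanding;
         }
      }

      // Number of requests that have not yet received a final response.
      long outstanding() const
      {
         return mOutstanding;
      }

   private:
      long mOutstanding = 0;
};
```

If outstanding() stays above zero after traffic has stopped, the TU has
left at least that many requests unanswered.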