< Previous by Date Date Index Next by Date >
< Previous in Thread Thread Index Next in Thread >

Re: [reSIProcate] repro crash with Jitsi


On 29/01/13 17:32, Byron Campen wrote:
Uh, that's odd; a "[UBLISH" request? I would suspect that something is going wrong in our garbage-checking. This should probably be rejected with a 400, since this probably should not be considered a retransmission. Is the original request similarly formed?


I just did a grep of the log to see if this happens regularly, this is what I found:

$ cat crash1.log | grep 'bytes \['
STACK | 20130127-160847.807 | repro | RESIP:TRANSPORT | 140422323181312 | ssl/TlsConnection.cxx:317 | SSL_read returned -1 bytes []
STACK | 20130127-160854.895 | repro | RESIP:TRANSPORT | 140422323181312 | ssl/TlsConnection.cxx:317 | SSL_read returned 1 bytes [P]
STACK | 20130127-160854.895 | repro | RESIP:TRANSPORT | 140422323181312 | ssl/TlsConnection.cxx:317 | SSL_read returned 1352 bytes [UBLISH sip:1001@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx SIP/2.0
STACK | 20130127-160854.896 | repro | RESIP:TRANSPORT | 140422323181312 | ssl/TlsConnection.cxx:317 | SSL_read returned -1 bytes []

Notice a 1 byte read with [P]?  I'm not sure if this could be the way Jitsi is sending stuff over TLS, or a problem with the way resip/repro is receiving it?



Best regards,
Byron Campen


On Mon, Jan 28, 2013 at 11:32 PM, Daniel Pocock <daniel@xxxxxxxxxxxxx> wrote:


On 29/01/13 00:26, Byron Campen wrote:
> We should be tossing out the "full" response (mNextTransmission), but
> retaining the serialized retransmit buffer (mMsgToRetransmit). Did we ever
> successfully serialize/transmit this response? Perhaps there is a bug that



The 100 Trying response was sent about 3 seconds before the new PUBLISH
request

STACK | 20130127-160851.209 | repro | RESIP:TRANSPORT | 140422323181312
| ssl/TlsConnection.cxx:470 | Did TLS write 339 339 [[SIP/2.0 100 Trying
Via: SIP/2.0/TLS
192.168.1.100:49269;branch=z9hG4bK-373139-05d032d540e9bca18e3af732e0c948df
To: "1001" <sip:1001@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
From: "1001" <sip:1001@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>;tag=c72de72a
Call-ID: 9e82664c30f04d33d1ce879ab846a60b@0:0:0:0:0:0:0:0
CSeq: 2 PUBLISH
Content-Length: 0

]]
STACK | 20130127-160854.895 | repro | RESIP:TRANSPORT | 140422323181312
| ssl/TlsConnection.cxx:317 | SSL_read returned 1 bytes [P]
STACK | 20130127-160854.895 | repro | RESIP:TRANSPORT | 140422323181312
| ssl/TlsConnection.cxx:369 | SSL bytesRead=1
DEBUG | 20130127-160854.895 | repro | RESIP:TRANSPORT | 140422323181312
| ConnectionBase.cxx:115 | In State: NewMessage
DEBUG | 20130127-160854.895 | repro | RESIP:TRANSPORT | 140422323181312
| ConnectionBase.cxx:161 | ConnectionBase::process setting source [ V4
192.168.1.100:49269 TLS target domain=unspecified mFlowKey=31 ]
DEBUG | 20130127-160854.895 | repro | RESIP:TRANSPORT | 140422323181312
| Connection.cxx:298 | Connection::performReads()  read=1
STACK | 20130127-160854.895 | repro | RESIP:TRANSPORT | 140422323181312
| ssl/TlsConnection.cxx:317 | SSL_read returned 1352 bytes [UBLISH
sip:1001@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx SIP/2.0
Call-ID: 9e82664c30f04d33d1ce879ab846a60b@0:0:0:0:0:0:0:0
CSeq: 2 PUBLISH
From: "1001" <sip:1001@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>;tag=c72de72a
To: "1001" <sip:1001@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Max-Forwards: 70
Content-Type: application/pidf+xml
User-Agent: Jitsi1.1.4400.10268Windows 7
Expires: 3600
Event: presence
Via: SIP/2.0/TLS
192.168.1.100:49269;branch=z9hG4bK-373139-05d032d540e9bca18e3af732e0c948df
Proxy-Authorization: Digest
username="1001",realm="srv1.office.readytechnology.co.uk",nonce="1359299327:1ba2dffdf6697b06053efcfe4736f581",uri="sip:1001@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",response="71efd5b4af3d25d2405c0e84d9c8e4eb",algorithm=MD5,qop=auth,cnonce="xyz",nc=00000001
Content-Length: 452







> broke the serialization? Been a very long time since I've looked at this...
>
> Best regards,
> Byron Campen
>
>
> On Mon, Jan 28, 2013 at 10:00 AM, Scott Godin <sgodin@xxxxxxxxxxxxxxx>wrote:
>
>> The assert is indicating a bug so it is valid.
>>
>> After a quick look at the offending code, my guess is that we have
>> received a re-transmission of a non-invite request (ie PUBLISH) after we
>> have formed and sent our response, and we are supposed to be re-responding
>> with the same response as last time.  However one of the code changes made
>> in the resip-b-TKLC-perf_work branch merge was to throw away the
>> mNextTransmission if it is a response.  I suspect the purpose was to try
>> and free up some memory as soon as possible.  However I see at least two
>> places in the code where re-transmitting a stored response is required.
>> 1.  Appears to be what you have found - Server Non-Invite transaction
>> after receiving a re-transmitted requested for which we have already
>> responded.
>> 2.  When TimerG expires (Server Invite transaction) and we need to re
>> transmit an Invite response when we don't see the resulting ACK.
>>
>> I am inclined to just comment out the code block starting on line 2567 in
>> order to fix this:
>>          // !bwc! If mNextTransmission is a non-ACK request, we need to
>> save the
>>          // initial request in case we need to send a simulated 408 or a
>> 503 to
>>          // the TU (at least, until we get a response back)
>>          if(!mNextTransmission->isRequest() ||
>> mNextTransmission->method()==ACK)
>>          {
>>             delete mNextTransmission;
>>             mNextTransmission=0;
>>          }
>>
>> However I'd like for Byron to chime in here, since these were originally
>> his changes.
>>
>> Regards,
>> Scott
>>
>> On Sun, Jan 27, 2013 at 4:44 PM, Daniel Pocock <daniel@xxxxxxxxxxxxx>wrote:
>>
>>>
>>>
>>>
>>> I had heard about this in a private email, and have been `lucky' enough
>>> to observe it myself now, with a full log at STACK level too
>>>
>>> The UA (sip:1001@srv1) is Jitsi v1.1.4400.10268 and repro is r9959 (main)
>>>
>>> The packet from Jitsi is below - has anyone seen similar issues with
>>> presence requests or Jitsi?  Should the assert(0) be triggered there,
>>> does it actually indicate something is wrong, or should I just compile
>>> with NDEBUG and ignore it?
>>>
>>>
>>>
>>> The code in question is
>>>
>>> void
>>> TransactionState::sendCurrentToWire()
>>> {
>>>  if(!mMsgToRetransmit.empty())
>>>    {
>>>       if(mController.mStack.statisticsManagerEnabled())
>>>       {
>>>          mController.mStatsManager.retransmitted(mCurrentMethodType,
>>>                                                    isClient(),
>>>                                                    mCurrentResponseCode);
>>>       }
>>>
>>>       mController.mTransportSelector.retransmit(mMsgToRetransmit);
>>>    }
>>>    else if(mNextTransmission) // initial transmission; need to determine
>>> target
>>>    {
>>> ... snip ...
>>>    }
>>>    else
>>>    {                         // line 2578
>>>       assert(0);
>>>    }
>>>
>>>
>>>
>>> DEBUG | 20130127-160854.896 | repro | RESIP:TRANSPORT | 140422323181312
>>> | ConnectionBase.cxx:418 | ##Connection: CONN_BASE: 0x7fb6a8234460 [ V4
>>> 192.168.1.100:49269 TLS target domain=unspecified mFlowKey=31 ]
>>> received: PUBLISH sip:1001@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx SIP/2.0
>>> Via: SIP/2.0/TLS
>>> 192.168.1.100:49269
>>> ;branch=z9hG4bK-373139-05d032d540e9bca18e3af732e0c948df
>>> Max-Forwards: 70
>>> Contact: "1001"
>>> <sip:1001@192.168.1.100:49269
>>> ;transport=tls;registering_acc=srv1_office_readytechnology_co_uk>
>>> To: "1001" <sip:1001@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
>>> From: "1001" <sip:1001@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>;tag=c72de72a
>>> Call-ID: 9e82664c30f04d33d1ce879ab846a60b@0:0:0:0:0:0:0:0
>>> CSeq: 2 PUBLISH
>>> Expires: 3600
>>> Content-Type: application/pidf+xml
>>> Proxy-Authorization: Digest
>>> username="1001",realm="srv1.office.readytechnology.co.uk
>>> ",nonce="1359299327:1ba2dffdf6697b06053efcfe4736f581",uri="
>>> sip:1001@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>> ",response="71efd5b4af3d25d2405c0e84d9c8e4eb",algorithm=MD5,qop=auth,cnonce="xyz",nc=00000001
>>> User-Agent: Jitsi1.1.4400.10268Windows 7
>>> Event: presence
>>> Content-Length: 452
>>>
>>> <?xml version="1.0" encoding="UTF-8" standalone="no"?><presence
>>> xmlns="urn:ietf:params:xml:ns:pidf"
>>> xmlns:dm="urn:ietf:params:xml:ns:pidf:data-model"
>>> xmlns:rpid="urn:ietf:params:xml:ns:pidf:rpid"
>>> entity="sip:1001@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"><dm:person
>>> id="p556"><rpid:activities/></dm:person><tuple
>>> id="t6311"><status><basic>open</basic></status><contact>
>>> sip:1001@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>> </contact><note>Online</note></tuple></presence>
>>> DEBUG | 20130127-160854.896 | repro | RESIP:TRANSPORT | 140422323181312
>>> | Connection.cxx:298 | Connection::performReads()  read=1352
>>> DEBUG | 20130127-160854.896 | repro | RESIP:TRANSPORT | 140422323181312
>>> | ConnectionBase.cxx:661 | Creating buffer for CONN_BASE: 0x7fb6a8234460
>>> [ V4 192.168.1.100:49269 TLS target domain=unspecified mFlowKey=31 ]
>>> STACK | 20130127-160854.896 | repro | RESIP:TRANSPORT | 140422323181312
>>> | ssl/TlsConnection.cxx:317 | SSL_read returned -1 bytes []
>>> STACK | 20130127-160854.896 | repro | RESIP:TRANSPORT | 140422323181312
>>> | ssl/TlsConnection.cxx:354 | Got TLS read got condition of 2
>>> STACK | 20130127-160854.896 | repro | RESIP:TRANSACTION |
>>> 140422331574016 | TransactionState.cxx:706 | Found matching transaction
>>> for SipReq:  PUBLISH 1001@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>> tid=-373139-05d032d540e9bca18e3af732e0c948df cseq=2 PUBLISH
>>> contact=1001@192.168.1.100:49269 / 2 from(wire)
>>> tlsd=srv1.office.readytechnology.co.uk ->
>>> tid=-373139-05d032d540e9bca18e3af732e0c948df [
>>> ServerNonInvite/Proceeding reliable target=[ V4 192.168.1.100:49269 TLS
>>> target domain=unspecified mFlowKey=31 ]]
>>> STACK | 20130127-160854.896 | repro | RESIP:TRANSACTION |
>>> 140422331574016 | TransactionState.cxx:1438 |
>>> TransactionState::processServerNonInvite: SipReq:  PUBLISH
>>> 1001@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>> tid=-373139-05d032d540e9bca18e3af732e0c948df cseq=2 PUBLISH
>>> contact=1001@192.168.1.100:49269 / 2 from(wire)
>>> tlsd=srv1.office.readytechnology.co.uk
>>> repro: TransactionState.cxx:2578: void
>>> resip::TransactionState::sendCurrentToWire(): Assertion `0' failed.
>>> _______________________________________________
>>> resiprocate-devel mailing list
>>> resiprocate-devel@xxxxxxxxxxxxxxx
>>> https://list.resiprocate.org/mailman/listinfo/resiprocate-devel
>>>
>>
>>
>> _______________________________________________
>> resiprocate-devel mailing list
>> resiprocate-devel@xxxxxxxxxxxxxxx
>> https://list.resiprocate.org/mailman/listinfo/resiprocate-devel
>>
>