
Re: [reSIProcate-users] Controlling TCP connection reuse


I *could* debug to determine why the INVITE isn't retransmitted immediately 
upon finding that the underlying socket is bad.  But now that I've got the 
AfterSocketCreation callback working for TCP, I will probably go back to the 
TCP keepalive path.  It turns out that one can programmatically change the idle 
time before the first probe, the number of probes, and the probe interval on a 
per-socket basis.

PK
________________________________________
From: slgodin@xxxxxxxxx [mailto:slgodin@xxxxxxxxx] On Behalf Of Scott Godin
Sent: Tuesday, May 19, 2009 4:26 PM
To: Paul Kurmas
Cc: resiprocate-users@xxxxxxxxxxxxxxx
Subject: Re: [reSIProcate-users] Controlling TCP connection reuse

> But as soon as the last session in that NetworkAssociation is eliminated, the 
> KeepAlive for that connection is cancelled.

Good point.  In this case the next message sent will start the process to 
determine that the connection is dead, but the transaction will likely time out 
before the socket can determine that it's down.

I originally thought the SO_KEEPALIVE option would help here, but it appears 
the default idle time is 2 hours and the time can only be modified system-wide, 
not on a per-connection basis:
http://dev.fyicenter.com/Interview-Questions/Socket-4/Why_does_it_take_so_long_to_detect_that_the_peer.html

The only options I can think of to help this situation are to:
1.  build a keepalive mechanism into the stack itself, rather than using the 
NetworkAssociation functionality in DUM, or 
2.  use the KeepAliveManager::add method directly from your app, so that the 
keepalives continue even though no dialog exists.
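
Either option amounts to a keepalive schedule whose lifetime is tied to the 
connection target rather than to a dialog.  A rough, library-independent 
sketch of that idea follows.  KeepaliveSchedule and its methods are 
hypothetical names, not reSIProcate classes, and the actual sending of the 
keepalive (for example through KeepAliveManager::add or the stack) is left to 
the caller.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <vector>

// Hypothetical application-side registry (not a reSIProcate class).
// Targets stay registered until the application removes them, so keepalives
// continue even when no dialog currently uses the connection.
class KeepaliveSchedule
{
public:
    using Clock = std::chrono::steady_clock;

    // Start (or restart) keepalives for a target such as "192.0.2.10:5060".
    void add(const std::string& target, std::chrono::seconds interval,
             Clock::time_point now)
    {
        mEntries[target] = Entry{interval, now + interval};
    }

    // Stop keepalives, e.g. once the connection is known to be closed.
    void remove(const std::string& target) { mEntries.erase(target); }

    // Return the targets whose keepalive is due at `now` and schedule the
    // next keepalive for each of them.
    std::vector<std::string> takeDue(Clock::time_point now)
    {
        std::vector<std::string> due;
        for (auto& kv : mEntries)
        {
            if (kv.second.next <= now)
            {
                due.push_back(kv.first);
                kv.second.next = now + kv.second.interval;
            }
        }
        return due;
    }

private:
    struct Entry
    {
        std::chrono::seconds interval;
        Clock::time_point next;
    };
    std::map<std::string, Entry> mEntries;
};
```

The application would call takeDue() from its main loop or a timer and send 
one keepalive per returned target.  Passing the current time in as a 
parameter keeps the class easy to test.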

Scott

On Tue, May 19, 2009 at 4:04 PM, Paul Kurmas <pkurmas@xxxxxxxxxxxxx> wrote:
I understand the rationale behind the TCP connection reuse & won't make any 
argument against it.  The DUM KeepAliveManager seems to work fine when there is 
an active session.  But as soon as the last session in that NetworkAssociation 
is eliminated, the KeepAlive for that connection is cancelled.  This leaves me 
exposed to the problem still - an open socket connection that fails isn't 
detected.  Is the argument that the next write on that socket should expose 
that failure & cause the socket to be reopened?  That's not what happens for my 
application... our application stalls until the INVITE that is sent expires.

PK
________________________________________
From: slgodin@xxxxxxxxx [mailto:slgodin@xxxxxxxxx] On Behalf Of Scott Godin
Sent: Sunday, May 17, 2009 12:05 PM
To: Paul Kurmas
Cc: resiprocate-users@xxxxxxxxxxxxxxx
Subject: Re: [reSIProcate-users] Controlling TCP connection reuse

Right now the stack will not automatically close any TCP connections, unless it 
gets an error sending or receiving, or the OS has run out of TCP socket 
descriptors (TcpBaseTransport.cxx line 153).  Dead TCP connections are not 
normally detected until you try to send data on them.  Using the 
KeepAliveManager will help to clean up dead connections, since it will ensure 
there is some data sent on each connection periodically.  However, on some 
OSes it can still take up to 2 minutes to discover that the connection is dead 
after attempting to send data on it.

Closing the TCP connection after each transaction does not sound like a good 
way to go.  It would be difficult to ensure that each TCP connection is only 
used for one transaction at a time, TCP connections are reasonably expensive 
to set up, and the suggestion appears to go against RFC 3261:
o In RFC 2543, closure of a TCP connection was made equivalent to a CANCEL. 
This was nearly impossible to implement (and wrong) for TCP connections between 
proxies. This has been eliminated, so that there is no coupling between TCP 
connection state and SIP processing.

I'm not sure that there is a better/faster way to recover from dead TCP 
connections.  Does anyone else have any ideas?

Scott
On Fri, May 15, 2009 at 11:16 AM, Paul Kurmas <pkurmas@xxxxxxxxxxxxx> wrote:
I'm chasing an issue with stale connections to a remote endpoint that
was shut down incorrectly.  The TCP socket remains open, and there are no
keep-alives (either TCP or application, via DUM's KeepAliveManager).
When the remote endpoint restarts, the first INVITE is sent over that
stale connection, and the remote endpoint stack returns a TCP RST.  On
the local endpoint there is no immediate reaction -- the application
stalls until the INVITE expires.  The next request works fine because a
new connection must be opened.

I have activated DUM's KeepAliveManager and it does seem to clear the
connection after some time.  That's good, but I'd prefer something more
responsive.  It seems to me the best solution is to close the connection
after a much shorter period of time.  This could be an immediate closure
of the TCP connection after a transaction is complete or a pathetically
low value for the aging of the cached connections.

I'd appreciate any feedback you could provide.  By the way, we're
running reSIProcate v1.3.4 at this time.
PK