Re: [reSIProcate-users] Controlling TCP connection reuse
I *could* debug to determine why the INVITE isn't retransmitted immediately
upon finding that the underlying socket is bad. But now that I've got the
AfterSocketCreation callback working for TCP, I will probably go back to the
TCP keepalive path. It turns out that one can programmatically change the wait
time, number of probes, and probe interval on a per-socket basis.
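For the curious, the per-socket tuning looks roughly like this (a minimal
sketch: the option names are Linux-specific, the callback signature is assumed
to match AfterSocketCreationFuncPtr in rutil/Socket.hxx, and the timer values
are just examples):

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>

    // Signature assumed to match AfterSocketCreationFuncPtr; the last three
    // parameters are unused here.
    static void tuneTcpKeepalive(int fd, int transportType,
                                 const char* file, int line)
    {
       int on = 1;     // enable keepalive probes on this socket
       int idle = 30;  // seconds of inactivity before the first probe
       int intvl = 5;  // seconds between successive probes
       int count = 3;  // failed probes before the connection is declared dead

       setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
       setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle));
       setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl));
       setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count));
    }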
PK
________________________________________
From: slgodin@xxxxxxxxx [mailto:slgodin@xxxxxxxxx] On Behalf Of Scott Godin
Sent: Tuesday, May 19, 2009 4:26 PM
To: Paul Kurmas
Cc: resiprocate-users@xxxxxxxxxxxxxxx
Subject: Re: [reSIProcate-users] Controlling TCP connection reuse
> But as soon as the last session in that NetworkAssociation is eliminated, the
> KeepAlive for that connection is cancelled.
Good point. In this case the next message sent will start the process of
determining that the connection is dead, but the transaction will likely time
out before the socket layer detects the failure.
I originally thought the SO_KEEPALIVE option would help here, but it appears
the default time is 2 hours and the time can only be modified system-wide and
not on a per-connection basis:
http://dev.fyicenter.com/Interview-Questions/Socket-4/Why_does_it_take_so_long_to_detect_that_the_peer.html
The only things I can think of to help this situation are to:
1. build a keep-alive mechanism into the stack itself rather than using the
NetworkAssociation functionality in DUM, or
2. use the KeepAliveManager::add method directly from your app, so that the
keepalives continue even though no dialog exists (see the sketch after this
list).
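For option 2, something along these lines, assuming you keep a reference to
the KeepAliveManager instance installed on DUM (a hypothetical sketch: the
target address and interval are made up, and the add() parameters should be
verified against resip/dum/KeepAliveManager.hxx in your version):

    #include "resip/dum/KeepAliveManager.hxx"
    #include "resip/stack/Tuple.hxx"

    // Hypothetical sketch -- verify add() against your KeepAliveManager.hxx;
    // some versions may take additional arguments.
    void pinProxyKeepalive(resip::KeepAliveManager& ka)
    {
       // Example target; substitute your proxy's address. Keepalives for this
       // target continue regardless of whether any dialog exists.
       resip::Tuple proxy("10.0.0.1", 5060, resip::TCP);
       ka.add(proxy, 30 /* seconds between keepalives */);
    }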
Scott
On Tue, May 19, 2009 at 4:04 PM, Paul Kurmas <pkurmas@xxxxxxxxxxxxx> wrote:
I understand the rationale behind the TCP connection reuse and won't make any
argument against it. The DUM KeepAliveManager seems to work fine when there is
an active session. But as soon as the last session in that NetworkAssociation
is eliminated, the KeepAlive for that connection is cancelled. This still
leaves me exposed to the problem: an open socket connection that fails isn't
detected. Is the argument that the next write on that socket should expose
that failure and cause the socket to be reopened? That's not what happens for
us: our application stalls until the INVITE that was sent times out.
PK
________________________________________
From: slgodin@xxxxxxxxx [mailto:slgodin@xxxxxxxxx] On Behalf Of Scott Godin
Sent: Sunday, May 17, 2009 12:05 PM
To: Paul Kurmas
Cc: resiprocate-users@xxxxxxxxxxxxxxx
Subject: Re: [reSIProcate-users] Controlling TCP connection reuse
Right now the stack will not automatically close any TCP connections, unless it
gets an error sending or receiving, or the OS has run out of TCP socket
descriptors (TcpBaseTransport.cxx line 153). Dead TCP connections are not
normally detected until you try to send data on them. Using the
KeepAliveManager will help to clean up dead connections, since it ensures that
some data is sent on each connection periodically. However, on some OSes it can
still take up to 2 minutes after attempting to send data to discover that the
connection is dead.
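For reference, wiring this up looks roughly like the following (a sketch only:
the profile setter names are assumed from resip/dum/Profile.hxx and may differ
in older trees, and the intervals are example values):

    #include <memory>
    #include "resip/dum/DialogUsageManager.hxx"
    #include "resip/dum/KeepAliveManager.hxx"
    #include "resip/dum/MasterProfile.hxx"

    void enableKeepalives(resip::DialogUsageManager& dum,
                          resip::MasterProfile& profile)
    {
       // Install the manager that periodically sends keepalives on each
       // connection in use.
       dum.setKeepAliveManager(
          std::auto_ptr<resip::KeepAliveManager>(new resip::KeepAliveManager));

       // Keepalive intervals in seconds (example values).
       profile.setKeepAliveTimeForStream(30);    // TCP/TLS connections
       profile.setKeepAliveTimeForDatagram(30);  // UDP
    }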
Closing the TCP connection after each transaction does not sound like a good
way to go, since it will be difficult to ensure that each TCP connection is
only used for one transaction at a time, TCP connections are reasonably
expensive, and this suggestion appears to go against RFC 3261:
o In RFC 2543, closure of a TCP connection was made equivalent to a CANCEL.
This was nearly impossible to implement (and wrong) for TCP connections between
proxies. This has been eliminated, so that there is no coupling between TCP
connection state and SIP processing.
I'm not sure that there is a better/faster way to recover from dead TCP
connections. Does anyone else have any ideas?
Scott
On Fri, May 15, 2009 at 11:16 AM, Paul Kurmas <pkurmas@xxxxxxxxxxxxx> wrote:
I'm chasing an issue with stale connections to a remote endpoint that
was shut down incorrectly. The TCP socket remains open, and there are no
keep-alives (either TCP or application-level, via DUM's KeepAliveManager).
When the remote endpoint restarts, the first INVITE is sent over that
stale connection, and the remote endpoint's stack returns a TCP RST. On
the local endpoint there is no immediate reaction: the application
stalls until the INVITE expires. The next request works fine because a
new connection must be opened.
I have activated DUM's KeepAliveManager and it does seem to clear the
connection after some time. That's good, but I'd prefer something more
responsive. It seems to me the best solution is to close the connection
after a much shorter period of time. This could be an immediate closure
of the TCP connection after a transaction is complete or a pathetically
low value for the aging of the cached connections.
I'd appreciate any feedback you could provide. By the way, we're
running reSIProcate v1.3.4 at this time.
PK
_______________________________________________
resiprocate-users mailing list
resiprocate-users@xxxxxxxxxxxxxxx
List Archive: http://list.resiprocate.org/archive/resiprocate-users/