
RE: [reSIProcate] TCP Behavior - on Windows at least...


OK - I think I finally understand what's happening (somewhat).  I needed to
figure out exactly how the whole buildFdSet / process loop works.  Let me
try to explain - please let me know if I'm misunderstanding somewhere:

BuildFdSet 
- iterates through all socket endpoints
  - Dns Server connections
  - UDP transports
  - TCP transports and active connections
  - etc.
- For each socket endpoint 
  - adds a read FD to the FdSet
  - adds a write FD to the FdSet (only if there is data to transmit)
  - adds an except FD to the FdSet (if socket is a TCP listener)
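As a rough illustration of the BuildFdSet steps above, here is a minimal
POSIX sketch.  SketchFdSet, SketchEndpoint and buildFdSet are hypothetical
names made up for this example; they are not the reSIProcate classes, and the
real code may differ in detail.

```cpp
#include <sys/select.h>
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the stack's fd-set wrapper (illustration only).
struct SketchFdSet
{
   fd_set read, write, except;
   int size = 0;   // highest descriptor + 1, as select() expects

   SketchFdSet() { FD_ZERO(&read); FD_ZERO(&write); FD_ZERO(&except); }

   void setRead(int fd)   { check(fd); FD_SET(fd, &read);   grow(fd); }
   void setWrite(int fd)  { check(fd); FD_SET(fd, &write);  grow(fd); }
   void setExcept(int fd) { check(fd); FD_SET(fd, &except); grow(fd); }

   // Waits up to ms milliseconds; returns the number of ready descriptors.
   // On return, select() has left only the ready descriptors in each set.
   int selectMilliSeconds(long ms)
   {
      timeval tv;
      tv.tv_sec = ms / 1000;
      tv.tv_usec = (ms % 1000) * 1000;
      return ::select(size, &read, &write, &except, &tv);
   }

   // Only tests the bit; it does not ask the socket whether data is waiting.
   bool readyToRead(int fd) { return FD_ISSET(fd, &read) != 0; }

 private:
   static void check(int fd) { assert(fd >= 0 && fd < FD_SETSIZE); }
   void grow(int fd) { size = std::max(size, fd + 1); }
};

// Hypothetical description of one socket endpoint.
struct SketchEndpoint
{
   int fd;
   bool hasDataToSend;
   bool isTcpListener;
};

// Mirrors the three rules listed above for a single endpoint.
void buildFdSet(SketchFdSet& fdset, const SketchEndpoint& ep)
{
   fdset.setRead(ep.fd);                          // always interested in reads
   if (ep.hasDataToSend) fdset.setWrite(ep.fd);   // only if data is queued
   if (ep.isTcpListener) fdset.setExcept(ep.fd);  // only for TCP listeners
}
```

Note that readyToRead() here is just an FD_ISSET test, so it reports whatever
bits are currently in the set.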

For TCP Transports - when process is called (TcpBaseTransport)
- ProcessAllWriteRequests
  - for each message in the TxFifo 
    - find or create a new socket connection
    - set the connection as writable and queue message on Outstanding 
      Send Queue for the connection
- ProcessSomeWrites
  - Get next connection to write from connection manager
  - if fdset contains write fd for connection then perform write
- ProcessSomeReads
  - Get next connection to read from connection manager
  - if fdset contains read fd for connection then perform read

**** This is where the problem lies.  If we are calling process() on the
transport because of some other event (i.e. something other than select
unblocking because of this read FD), then we might actually end up calling
read on a socket that doesn't have any data.  The readyToRead() function
only calls FD_ISSET - it only checks for the existence of the read FD in the
set, not whether there is actually data to read.  Therefore EWOULDBLOCK
is a very likely result from the socket read.

Note:  In the UDP Transport - EWOULDBLOCK is expected and ignored.
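One way the TCP read path could treat EWOULDBLOCK the same way (as "nothing
to read yet" rather than a reason to close) is sketched below.  This is a
POSIX sketch only; ReadOutcome, setNonBlocking and readSome are hypothetical
names for this example and not part of resip.  On Winsock the equivalent
check would presumably use WSAGetLastError() and WSAEWOULDBLOCK instead of
errno.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical result type for this sketch.
enum class ReadOutcome { Data, NoDataYet, PeerClosed, Error };

// Put a descriptor into non-blocking mode (POSIX fcntl).
bool setNonBlocking(int fd)
{
   int flags = ::fcntl(fd, F_GETFL, 0);
   return flags != -1 && ::fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Reads once and classifies the result instead of treating every
// negative return value as fatal.
ReadOutcome readSome(int fd, char* buf, std::size_t len, std::size_t& got)
{
   assert(buf != nullptr && len > 0);
   got = 0;
   ssize_t n = ::read(fd, buf, len);
   if (n > 0)
   {
      got = static_cast<std::size_t>(n);
      return ReadOutcome::Data;
   }
   if (n == 0)
   {
      return ReadOutcome::PeerClosed;      // orderly shutdown by the far end
   }
   if (errno == EAGAIN || errno == EWOULDBLOCK)
   {
      return ReadOutcome::NoDataYet;       // not an error: keep the connection
   }
   return ReadOutcome::Error;              // a real failure: caller may close
}
```

With something like this, the caller would only delete the connection on
PeerClosed or Error, and simply move on when the outcome is NoDataYet.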

I don't really understand why this doesn't happen on other platforms though -
unless I'm totally messed up with all this.  The only thing I can figure is
that on other platforms the result from calling socket read is not
EWOULDBLOCK.  

There is a function TcpConnection::hasDataToRead() - but it just always
returns false.

Something else I don't understand.  Let's say we unblock from select because
there is data to read on one of many TCP connections.  When we call process
on the TCP Transport and ProcessSomeReads gets called - there is no
guarantee that we will actually try to read from the same connection/socket
that triggered the select.  This means the entire process call could be a
no-op.  Then we need to go through BuildFdSet/select/process again, this time
hoping the connection manager will return the right connection (it uses
round robin reading).  This doesn't seem right - is this the way it was
intended to work?  It seems incredibly inefficient when there are lots of
TCP connections.  It would be good if the Connection Manager could actually
return a connection that has data waiting.
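To make that last suggestion concrete, here is a small sketch of a manager
that keeps the round-robin order but skips connections that are not ready.
SketchConnectionManager and getNextReady are hypothetical names for this
example (the real ConnectionManager interface is not shown in this thread),
and connections are represented by plain descriptors to keep it short.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical manager, for illustration only.
class SketchConnectionManager
{
 public:
   void add(int fd) { mFds.push_back(fd); }

   // Starting at the round-robin position, return the first descriptor for
   // which isReady(fd) is true, or -1 if none is ready.  The position is
   // advanced past the returned descriptor so busy connections cannot
   // starve the others.
   int getNextReady(const std::function<bool(int)>& isReady)
   {
      for (std::size_t i = 0; i < mFds.size(); ++i)
      {
         std::size_t idx = (mNext + i) % mFds.size();
         if (isReady(mFds[idx]))
         {
            mNext = (idx + 1) % mFds.size();
            return mFds[idx];
         }
      }
      return -1;   // nothing ready (or no connections at all)
   }

 private:
   std::vector<int> mFds;
   std::size_t mNext = 0;
};
```

The isReady callback would be something like the fd-set readiness check, so
one process() call does useful work whenever any connection is flagged, at
the cost of a linear scan over the connections.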


  

-----Original Message-----
From: Derek MacDonald [mailto:derek@xxxxxxxx] 
Sent: Thursday, April 21, 2005 1:34 PM
To: 'Scott Godin'; 'Alan Hawrylyshen'
Cc: resiprocate-devel@xxxxxxxxxxxxxxxxxxx
Subject: RE: [reSIProcate] TCP Behavior - on Windows at least...

We ran into this bug in ares, albeit for UDP. Windows sometimes gives a
false positive from select.  The approach is to do an FD_CLR when
WSAEWOULDBLOCK is returned. See line 289 of ares_process for an example.
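A sketch of that idea, not the actual ares code: readOrClear is a
hypothetical helper, and the read itself is passed in as a callable so the
logic can be shown without real sockets.  It is written against POSIX errno;
on Windows the check would presumably be against WSAGetLastError() and
WSAEWOULDBLOCK.

```cpp
#include <sys/select.h>
#include <cassert>
#include <cerrno>

// doRead is any callable returning bytes read, 0 on close, or -1 with
// errno set.  Returns false when the caller should close the connection.
template <typename ReadFn>
bool readOrClear(int fd, fd_set& readFds, ReadFn doRead, long& got)
{
   assert(fd >= 0 && fd < FD_SETSIZE);
   got = 0;
   if (!FD_ISSET(fd, &readFds))
   {
      return true;                 // not flagged, nothing to do
   }
   long n = doRead();
   if (n > 0)
   {
      got = n;
      return true;
   }
   if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
   {
      FD_CLR(fd, &readFds);        // treat as a false positive from select
      return true;                 // keep the connection open
   }
   return false;                   // orderly close (0) or a real error
}
```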

--Derek

> -----Original Message-----
> From: resiprocate-devel-bounces@xxxxxxxxxxxxxxxxxxx [mailto:resiprocate-
> devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Scott Godin
> Sent: Thursday, April 21, 2005 10:19 AM
> To: 'Alan Hawrylyshen'; Scott Godin
> Cc: resiprocate-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [reSIProcate] TCP Behavior - on Windows at least...
> 
> Thanks for responding Alan!
> 
> I'm using SVN head.  I can see resip is closing via Ethereal (ACK-FIN).
> After digging some more, I can also see in the resip logs....
> 
> The problem is triggered in the following code:
> void
> TcpBaseTransport::processSomeReads(FdSet& fdset)
> {
>    Connection* currConnection = mConnectionManager.getNextRead();
>    if (currConnection)
>    {
>       if ( fdset.readyToRead(currConnection->getSocket()) ||
>            currConnection->hasDataToRead() )
>       {
>          DebugLog (<< "TcpBaseTransport::processSomeReads() "
>                    << *currConnection);
>          fdset.clear(currConnection->getSocket());
> 
>          int bytesRead = currConnection->read(mStateMachineFifo);
>          DebugLog (<< "TcpBaseTransport::processSomeReads() "
>                    << *currConnection << " read=" << bytesRead);
>          if (bytesRead < 0)
>          {
>             DebugLog (<< "Closing connection bytesRead=" << bytesRead);
>             delete currConnection;
>          }
>       }
> ...
> 
> For some reason fdset.readyToRead is returning TRUE, but then
> currConnection->read is returning WSAEWOULDBLOCK.  Therefore resip tears
> down the socket connection.  I suspect the problem is Windows related.
> Are you using Windows at all?
> 
> I'm still digging (trying to understand why readyToRead is returning
> true), so any help or suggestions would be appreciated.
> 
> Thanks,
> 
> Scott
> 
> 
> -----Original Message-----
> From: Alan Hawrylyshen [mailto:alan@xxxxxxxxxx]
> Sent: Thursday, April 21, 2005 1:12 PM
> To: Scott Godin
> Subject: Re: [reSIProcate] TCP Behavior - on Windows at least...
> 
> 
> On Apr 21, 2005, at 09.36, Scott Godin wrote:
> 
> > I've been noticing the following behavior when using TCP transport (on
> > Windows):
> >
> >     1       Resip disconnects socket connections immediately after
> > receiving SIP requests from the far end.
> >     2       For UAC Invites - resip closes the socket after receiving
> > the 200 and reopens a new one for sending the ACK.
> >     3       For UAS Invites - resip does not close the socket after
> > sending a 200, but waits for the ACK before closing.
> >
> > For 1 - is this correct?
> > For 2 and 3 - which of these are correct?
> > Should resip be closing socket connections that it did not create?
> >
> > Any plans to implement connection re-use?
> >
> > Thanks,
> >
> > Scott
> > ___
> 
> We aren't seeing this. What repository revision is showing you this,
> and can you see the TCP-level RST / FIN bits?
> IOW -- are you SURE resip is doing the closing?
> A
> 
> a l a n a t j a s o m i d o t c o m