
Re: [reSIProcate] Transport TxFifo draining: timing


Hi Kennard,

I like the idea of keeping a list/queue of transports that have data to write in order to avoid the O(n) walk of the transports every process loop.

At one point the stack transports were able to run in their own threads. This has since been deprecated due to a lack of performance gain, but it likely explains why things are done the way they are today.  I don't see any issues with trying to write data out immediately if possible, but it would be good to get feedback from the original authors on this.  : )

Scott

On Fri, Nov 19, 2010 at 6:56 PM, Kennard White <kennard_white@xxxxxxxxxxxx> wrote:
Hi,

One of the key issues I'm facing in extending resip/repro to handle many connections is the way the stack Transports drain their TxFifo queues. The flow today is:
  • When data is ready to send from transaction layer, the message is added to the Transport's TxFifo queue.
  • Prior to calling select(), the hasDataToSend() query is run on all internal Transports. This is O(#transports).
  • After select returns (with zero timeout), process() is called on all internal Transports. Each transport checks its TxFifo queue.
With the epoll() patch in place, these two O(N) traversals for the TxFifo appear to be driving factors. Aside from performance issues, this is another obstacle to libevent migration. I can think of a few things to do about this:
  • Keep a list/queue of Transports with data to transmit. As data is added to TxFifo, append the Transport to the queue. This would solve the O(N) problem. The work is non-trivial low-level queue stuff with a reasonable chance of bugs, especially if implemented with performance in mind.
  • InternalTransport::transmit(), which posts data to the queue, could also call a method of Transport to do "something" with the data. There are two somethings that could happen:
    • Request callback on socket writable, and in that callback (from epoll, etc.) would then try draining the queue. This is what I ended up doing with UdpTransport: see checkTransmitQueue().
    • Read from the queue immediately and try writing the message onto the wire. If the write fails, then need to stash this "working" message and request writable callback. This approach is the preferred style when working with edge-triggered epoll.
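
To make the first option above concrete, here is a rough C++ sketch of a ready list. Everything in it (SketchTransport, ReadyList, post(), processReady()) is a hypothetical stand-in invented for illustration, not the resip Transport API, and the "write" always succeeds. It only shows the bookkeeping: a per-transport flag prevents duplicate entries, and the process loop visits just the transports that have pending data.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical stand-in for a transport; not the resip Transport class.
class SketchTransport {
public:
    bool queuedForWrite = false;      // true while on the ready list
    std::deque<std::string> txFifo;   // pending outbound messages
    std::size_t sent = 0;             // messages "written" so far

    // Pretend every write succeeds; a real transport would call send().
    void drain() {
        while (!txFifo.empty()) {
            txFifo.pop_front();
            ++sent;
        }
    }
};

// Tracks only the transports that currently have data to transmit.
class ReadyList {
public:
    // Called when a message is posted: O(1), and no duplicate entries.
    void post(SketchTransport& t, std::string msg) {
        t.txFifo.push_back(std::move(msg));
        if (!t.queuedForWrite) {
            t.queuedForWrite = true;
            mReady.push_back(&t);
        }
    }

    // Called once per process loop: cost is O(#ready), not O(#transports).
    std::size_t processReady() {
        std::size_t visited = 0;
        // Swap first so posts made during drain() land in the next pass.
        std::deque<SketchTransport*> batch;
        batch.swap(mReady);
        for (SketchTransport* t : batch) {
            assert(t->queuedForWrite);
            t->queuedForWrite = false;
            t->drain();
            ++visited;
        }
        return visited;
    }

    bool empty() const { return mReady.empty(); }

private:
    std::deque<SketchTransport*> mReady;
};
```

With three transports of which two have posted data, processReady() touches only those two. A real version would also have to handle a transport being destroyed while still on the list, which is likely where the "reasonable chance of bugs" comes in.
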
Is there any particular reason not to try writing the data immediately when posted to the queue? Issues I can think of:
  • Threads. I'm concerned only with the InternalTransports which all run within the same thread as the stack itself. As far as I know, no other threads post to the transmit queue. Is that true?
  • Fairness. Deferring the write until after returning to the main loop would allow some sort of prioritization to take place. But as far as I can tell there isn't any code in place to enforce fairness or prioritization.
  • Recursion. Some transports post messages to their own TxFifo, so we would need to be careful about recursion and ideally avoid it. I suspect recursion safety is relatively simple to implement.
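
As a sketch of the immediate-write approach with a simple recursion guard, assuming a single thread: the class below is hypothetical (ImmediateTransport, wireBudget, tryWrite() and onWritable() are invented names, and the "socket" is faked with a counter). transmit() queues the message and drains at once. A failed write leaves the message at the front of the queue as the stashed "working" message and sets a flag standing in for a writable-callback request. A re-entrant transmit() made during a drain just queues and returns, so the outer loop picks it up without recursing.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical transport that tries to write as soon as data is posted.
class ImmediateTransport {
public:
    std::size_t wireBudget = 0;      // writes the fake socket will accept
    std::deque<std::string> wire;    // what "reached the wire", in order
    bool wantWritable = false;       // stands in for a writable-callback request

    // Post to the queue, then try to drain right away.
    void transmit(std::string msg) {
        mTxFifo.push_back(std::move(msg));
        if (mDraining) {
            return;   // re-entrant post: the outer drain loop will send it
        }
        drain();
    }

    // Called by the event loop when the socket becomes writable again.
    void onWritable() {
        wantWritable = false;
        drain();
    }

    std::size_t pending() const { return mTxFifo.size(); }

private:
    // Stand-in for send(); returning false models EWOULDBLOCK.
    bool tryWrite(const std::string& msg) {
        if (wireBudget == 0) {
            return false;
        }
        --wireBudget;
        wire.push_back(msg);
        if (msg == "ping") {
            transmit("pong");   // models a transport posting to its own queue
        }
        return true;
    }

    void drain() {
        assert(!mDraining);   // the guard in transmit() keeps this true
        mDraining = true;
        while (!mTxFifo.empty()) {
            // The front element stays queued until its write succeeds.
            if (!tryWrite(mTxFifo.front())) {
                wantWritable = true;
                break;
            }
            mTxFifo.pop_front();
        }
        mDraining = false;
    }

    std::deque<std::string> mTxFifo;
    bool mDraining = false;
};
```

Note that this drains until the socket refuses a write, which is what edge-triggered notification expects, but it does nothing about fairness between transports.
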

Any other approaches? Preferences? Risks?

Thanks,
Kennard

_______________________________________________
resiprocate-devel mailing list
resiprocate-devel@xxxxxxxxxxxxxxx
https://list.resiprocate.org/mailman/listinfo/resiprocate-devel