File:    resip/stack/doc/resip-epoll-notes.txt
Author:  Kennard White
Created: Nov 10, 2010
Updated: Nov 10, 2010

This memo describes the epoll() support within resiprocate.
The support is a prototype only.

Overview
--------
The primary purpose of the first version of epoll() support is to
allow resiprocate to have more than 1024 sockets concurrently open.
With a select()-based system, the fd_set is compile-time limited to
1024 file descriptors (starting at fd zero). Some systems (which?)
allow this to be increased at compile time, but there are several
warnings in the code not to try this. So I didn't :-).

For the curious, my immediate application for needing many concurrent
sockets is a server-side instance of repro that is handling TCP
connections in "outbound" mode; i.e., the connections stay open
indefinitely.

By default, epoll() support is disabled. A runtime call must be made
to turn on epoll support. Right now, epoll is always compiled in, which
will lead to compilation problems on platforms without epoll. I need
to figure out how to disable this.

This version uses epoll() internally, but doesn't change any external
interfaces. E.g., StackThread doesn't change at all. This is possible
for two reasons:
* epoll may be used hierarchically/recursively.
* select works fine for a handful of descriptors whose values are
  small (below 1024).

In particular, it is possible to select() on the epoll
file descriptor(s). The two main Socket-using subsystems
(ares, Transport) open separate internal epoll descriptors and manage
their file descriptors using epoll. They pass the epoll descriptor up
the chain to the owning thread, which can then use select. All the
descriptors seen by select are opened early and are less than 1024.

Performance
-----------
The primary purpose of the change is to support a large number of
transports/connections, something that is just not possible with
select(). The epoll support, in its current form, actually makes
things slower!
This is because (I assume) there are two levels of system calls
involved: the select() on the epoll descriptor, and then the epoll
call itself.

Changes closely related to epoll
--------------------------------
Added rutil/FdPoll files and related classes. This implements the
system calls for epoll. Note that this ONLY supports epoll, unlike
Poll.hxx which ONLY supports select and poll. (Aside from the system
calls used, the purpose and interfaces are different.)

Changes to contrib/ares and rutil/dns/AresDns. The challenge here is
that while ares only uses a handful of concurrent sockets, they are
dynamically opened. In particular, they can be opened after we have
all of our transport sockets open, and thus ares will get fds >= 1024
and the fdset stuff will fail. Thus I needed to make ares work with
epoll. Added a new callback system from contrib/ares into the AresDns
class, which then handles the epoll interface on its behalf. This is
the trickiest set of changes.

Plumb the useInternalPoll flag from SipStack, ExternalDns, DnsStub, to
AresDns.

For Transport, when using InternalPoll, work previously performed by
process() is now performed by a combination of processPollEvent() and
processTransmmitQueue(). The former is called by the FdPoll
dispatcher, while the latter is called by TransportSelector.

Restructure Connection objects to pass received messages up to the
owning transport rather than using a fifo passed in via process().

resip/stack/test/testStack extended to have an --epoll option to
enable it. This doesn't test very much yet, but it is something.

Other Changes
-------------
Changed UdpTransport::hasDataToSend() to always return false. We use
the socket-writability callback (fdset or poll) to drive draining the
queue. Returning true here makes the select() timeout zero, which
isn't what we want. I.e., we will wait for writability, even if
process() is immediately invoked.

Fix some test programs (port selection & log levels).
Fix the ares set-non-blocking function. Added a lot of diags (now commented out) to find this problem.
Add getSocketError() function to rutil/Socket.cxx.
Clarify the multi-platform Socket support in rutil/Socket.hxx.
Fix compiler warning in OpenSSLInit.cxx.
Fix int vs long int warnings in ares.

Questions
---------
Does any code use rutil/Poll.hxx?

How to determine which platforms have epoll? This would be used to
conditionally compile FdPoll.
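
On the last question, one possible approach (an assumption here, not
something the patch does) is to key the build off a platform macro
such as __linux__. The sketch below does that, and also illustrates the
hierarchical use described in the Overview: one descriptor is
registered with an internal epoll instance, and the caller select()s
only on the epoll descriptor. The names HAVE_EPOLL_SKETCH,
waitViaSelectOnEpoll and demoWithPipe are hypothetical and do not come
from the resiprocate sources; a pipe stands in for a transport socket.

```cpp
// Sketch only. HAVE_EPOLL_SKETCH, waitViaSelectOnEpoll and demoWithPipe
// are hypothetical names, not part of resiprocate or the patch.
#include <sys/select.h>
#include <unistd.h>

#if defined(__linux__)
#  include <sys/epoll.h>
#  define HAVE_EPOLL_SKETCH 1
#else
#  define HAVE_EPOLL_SKETCH 0   // assumption: treat other platforms as "no epoll"
#endif

// Registers watchedFd with a private epoll instance, then select()s on the
// epoll descriptor alone. Returns the number of ready events behind the
// epoll descriptor, 0 on timeout, or -1 on error or when epoll is compiled out.
int waitViaSelectOnEpoll(int watchedFd, int timeoutMs)
{
#if HAVE_EPOLL_SKETCH
   int epfd = epoll_create1(0);
   if (epfd < 0) return -1;

   epoll_event ev{};
   ev.events = EPOLLIN;
   ev.data.fd = watchedFd;
   if (epoll_ctl(epfd, EPOLL_CTL_ADD, watchedFd, &ev) != 0 ||
       epfd >= FD_SETSIZE)   // select() can only handle small fd values
   {
      close(epfd);
      return -1;
   }

   // Outer level: select() sees only the epoll descriptor.
   fd_set readSet;
   FD_ZERO(&readSet);
   FD_SET(epfd, &readSet);
   timeval tv;
   tv.tv_sec = timeoutMs / 1000;
   tv.tv_usec = (timeoutMs % 1000) * 1000;
   int ready = select(epfd + 1, &readSet, nullptr, nullptr, &tv);

   // Inner level: ask epoll which descriptors are actually ready.
   int count = 0;
   if (ready < 0)
   {
      count = -1;
   }
   else if (ready > 0 && FD_ISSET(epfd, &readSet))
   {
      epoll_event out[8];
      count = epoll_wait(epfd, out, 8, 0);   // zero timeout: do not block again
   }
   close(epfd);
   return count;
#else
   (void)watchedFd;
   (void)timeoutMs;
   return -1;
#endif
}

// Drives the function above with a pipe standing in for a socket.
// If writeFirst is true, one byte is made readable before waiting.
int demoWithPipe(bool writeFirst)
{
   int fds[2];
   if (pipe(fds) != 0) return -1;
   int result = -1;
   char c = 'x';
   if (!writeFirst || write(fds[1], &c, 1) == 1)
   {
      result = waitViaSelectOnEpoll(fds[0], 50);
   }
   close(fds[0]);
   close(fds[1]);
   return result;
}
```

Each wakeup in this arrangement costs a select() followed by an
epoll_wait(), which matches the two levels of system calls mentioned
under Performance.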
Attachment:
resip-epoll-contrib.patch
Description: Binary data
Attachment:
resip-epoll-stack.patch
Description: Binary data