Re: [reSIProcate] DNS performance tuning
That makes sense. I tried a load test to see if the response time would
speed up if the pipeline remains full. I think that helps. Instead of 25
or 30 ms total for a lookup, it drops down to around 15 ms on average.
Sometimes you still get longer ones, other times there will be a lookup
complete in 2 or 3 ms but these are unusual.
I think the behavior you describe reduces the delay at the
fdset.selectMilliSeconds, but the other delay at the top of the loop at
waitForShutdown remains, and is usually in the 10 to 15 ms range.
I wonder if there is a way to redesign that loop so that in addition to
waiting for shutdown, it also waits on a signal that work is ready to
process. I'm not sure why the delay is so long if the argument is 1 ms, but
maybe it would help if we could remove the "timeout and look for work" logic
and just use an interrupt.
Regards,
Dave
FYI, this code is replacing a gethostbyname call. Those usually run in less
the 1 ms. This goal may not be realistic, but I'd like to get the
performance in the same ballpark. If I cant get near there I may need to do
my own caching scheme above the resip layer. :( I think the key to that
would be letting resip manage the TTL, so somehow my layer would need access
to that.
-----Original Message-----
From: Byron Campen [mailto:bcampen@xxxxxxxxxxxx]
Sent: Monday, January 22, 2007 3:43 PM
To: Dave Mason
Cc: resiprocate-devel@xxxxxxxxxxxxxxxxxxxx
Subject: Re: [reSIProcate] DNS performance tuning
This isn't really a poll; this relies on select, which will suspend
execution until something occurs on one of the sockets in fdset. The integer
argument to selectMilliSeconds is a maximum wait-time, after which execution
will resume, regardless of whether anything worth noting happened in fdset
(this is to keep non-socket-related processing from starving). The 10 ms
delay is probably network- related; the point of waiting on the FdSet is to
allow network stuff to get ready. So, when you call dns->lookup(), no
network traffic happens, we just queue up some work to be done, and wait
until we get the go-ahead from the kernel (by calling select). Once we are
ready to send, we do so, and select again to wait for the reply. Once you
have lots of queries in flight, you end up handling lots of them per
iteration, but early on you haven't gotten stuff pipelined yet.
Best regards,
Byron Campen
> I think I'm getting close, I'm pretty sure the problem comes from the
> way my DnsInterface::thread works. Mine is nearly identical to the
> one in testDns.cxx. I tightened the delay down to 1 ms but that
> doesn't fix the problem. Here's what I've got:
>
> class ResipDnsInterface : public DnsInterface, public ThreadIf {
> public:
> void thread()
> {
> while (!waitForShutdown(1))
> {
> DebugLog (<< "+++ResipDnsInterface::thread - top of
> loop");
> FdSet fdset;
> buildFdSet(fdset);
> mStub.buildFdSet(fdset);
> fdset.selectMilliSeconds(1);
> process(fdset);
> DebugLog (<< "+++ResipDnsInterface::thread - before
> mStub.process");
> mStub.process(fdset);
> DebugLog (<< "+++ResipDnsInterface::thread - after
> mStub.process");
> }
> }
> }
>
> I find it can take as much as 10 ms between dns->lookup and when "top
> of loop" happens. From there, it always takes 10 ms until "before
> mStub.process" appears. The actual query is fast, the "after
> mStub.process"
> log appears within 1 ms.
>
> Instead of a delay loop with polling, can you suggest a way to use a
> condition signal to trigger the thread to execute? Do I really need
> all the fdset stuff each time through the loop, or could I just do it
> once?
>
> Dave
>
> -----Original Message-----
> From: Dave Mason
> Sent: Monday, January 22, 2007 1:51 PM
> To: 'Byron Campen'
> Cc: resiprocate-devel@xxxxxxxxxxxxxxxxxxxx
> Subject: RE: [reSIProcate] DNS performance tuning
>
> As part of another thing I was checking, I added a line to the
> constructor of my singleton wrapper class to set the time to live:
> RRCache::instance()->setTTL(5). This should instantiate the cache
> before the first lookup is ever done. However, the same delay happens
> on that first lookup and on every lookup that follows. I checked the
> cache times again, and I see that the SRV cache works fine too like
> the A record. The delay only happens at the very beginning before the
> query is attempted, sometime between dns->lookup(res, query.uri) and
> stub.query(...).
>
> Thanks for the help,
> Dave
>
>
> -----Original Message-----
> From: Byron Campen [mailto:bcampen@xxxxxxxxxxxx]
> Sent: Monday, January 22, 2007 1:08 PM
> To: Dave Mason
> Cc: resiprocate-devel@xxxxxxxxxxxxxxxxxxxx
> Subject: Re: [reSIProcate] DNS performance tuning
>
> I imagine you could be running into lazy-initialization delay. I
know
> that at least the RRCache isn't initialized until the first access is
> attempted. I can look through the code when I get a free moment
> (unless someone else knows off the top of their head).
>
> Best regards,
> Byron Campen
>
>> Answering my own question a bit, I put new logs at the beginning of
>> DnsStub::query and the DnsStub::Query::Query constructor. As you
>> would expect the constructor runs near instantaneously, but the delay
>> occurs before DnsStub::query is called, 29 ms in this case.
>>
>> DEBUG | 20070122-122348.333 | dcmlaptop | SmSipMgr | RESIP:APP |
>> 2961 |
>> 182419 | app.cpp:250 | ResipDns::lookup - waiting for results
>> +++DnsStub::query: enter 1169490228.362918
>> STACK | 20070122-122348.362 | dcmlaptop | SmSipMgr | RESIP:DNS |
>> 2961 |
>> 158850 | dns/DnsStub.cxx:363 | Query ctor enter STACK |
>> 20070122-122348.363 | dcmlaptop | SmSipMgr | RESIP:DNS |
>> 2961 |
>> 158850 | dns/DnsStub.cxx:376 | DnsStub::Query::go:
>> _sip._udp.flammajobby.com
>> type(enum): 33 proto: 1
>>
>>
>> -----Original Message-----
>> From: Dave Mason
>> Sent: Monday, January 22, 2007 11:33 AM
>> To: 'Byron Campen'
>> Cc: resiprocate-devel@xxxxxxxxxxxxxxxxxxxx
>> Subject: RE: [reSIProcate] DNS performance tuning
>>
>> Oops, meant to point out in my last mail that the log in
>> DnsStub::Query::go is at the beginning of the method, so the delay
>> has already happened before that method starts. My guess is there's
>> some kind of overhead involved, which hopefully I can move or deal
>> with.
>>
>> Regards,
>> Dave
>>
>> -----Original Message-----
>> From: Byron Campen [mailto:bcampen@xxxxxxxxxxxx]
>> Sent: Monday, January 22, 2007 11:11 AM
>> To: Dave Mason
>> Cc: resiprocate-devel@xxxxxxxxxxxxxxxxxxxx
>> Subject: Re: [reSIProcate] DNS performance tuning
>>
>> Well, it looks like you don't have any SRV records in your DNS,
> which
>> is what the resolver tries looking for first. When that fails, it
>> attempts an A query. So, you're having to wait for two DNS queries to
>> finish in sequence, which would explain some amount of delay. How
>> much delay are you seeing?
>>
>> Best regards,
>> Byron Campen
>>
>>> Hi,
>>> I picked up a development build from svn last week and the RRCache
>>> code works great, thanks for the help on that one. However, I'm
>>> still seeing a big delay when DnsStub::Query::go: is first called.
>>> Is this because I'm specifying "transport=udp"? Is there anything I
>>> can do to reduce or eliminate this? I keep a DnsInterface thread
>>> running continuously for all my lookups so I don't need to restart
>>> it each time. I cant think of any problems with my mutex/condition
>>> that would prevent Query::go from starting.
>>>
>>> Regards,
>>> Dave
>>>
>>> Application code
>>> ----------------
>>> Constructor for singleton wrapper class:
>>> dns = new ResipDnsInterface(*stub);
>>> dns->run();
>>>
>>> Lookup code:
>>> Lock lock(lookupCompleteMutex);
>>> Condition dnsCond;
>>> DebugLog (<< "ResipDns::lookup - Looking up");
>>> dns->lookup(res, query.uri);
>>> DebugLog (<< "ResipDns::lookup - waiting for results");
>>> dnsCond.wait(lookupCompleteMutex); // signal sent from end of
>>> DnsHandler::handle
>>> InfoLog (<< "ResipDns::lookup - DNS results for " << query.uri);
>>>
>>> Stack trace
>>> -----------
>>> DEBUG | 20070119-163503.126 | dcmlaptop | SmSipMgr | RESIP:APP |
>>> 17423 |
>>> 190616 | app.cpp:227 | ResipDns::lookup - Creating Uri:
>>> sip:flammajobby.com;transport=udp
>>> DEBUG | 20070119-163503.126 | dcmlaptop | SmSipMgr | RESIP:APP |
>>> 17423 |
>>> 190616 | app.cpp:233 | ResipDns::lookup - Creating DnsResult DEBUG |
>>> 20070119-163503.126 | dcmlaptop | SmSipMgr | RESIP:APP |
>>> 17423 |
>>> 190616 | app.cpp:236 | ResipDns::lookup - Looking up DEBUG |
>>> 20070119-163503.126 | dcmlaptop | SmSipMgr | RESIP:DNS |
>>> 17423 |
>>> 190616 | DnsResult.cxx:198 | DnsResult::lookup
>>> sip:flammajobby.com;transport=udp STACK | 20070119-163503.126 |
>>> dcmlaptop | SmSipMgr | RESIP:DNS |
>>> 17423 |
>>> 190616 | DnsResult.cxx:321 | Doing SRV lookup of
>>> _sip._udp.flammajobby.com DEBUG | 20070119-163503.126 | dcmlaptop |
>>> SmSipMgr | RESIP:APP |
>>> 17423 |
>>> 190616 | app.cpp:250 | ResipDns::lookup - waiting for results STACK
>>> |
>>> 20070119-163503.154 | dcmlaptop | SmSipMgr | RESIP:DNS |
>>> 17423 |
>>> 162926 | dns/DnsStub.cxx:375 | DnsStub::Query::go:
>>> _sip._udp.flammajobby.com
>>> type(enum): 33 proto: 1
>>> STACK | 20070119-163503.155 | dcmlaptop | SmSipMgr | RESIP:DNS |
>>> 17423 |
>>> 162926 | DnsResult.cxx:912 | Received SRV result for:
>>> flammajobby.com
>>> STACK | 20070119-163503.155 | dcmlaptop | SmSipMgr | RESIP:DNS |
>>> 17423 |
>>> 162926 | DnsResult.cxx:915 | DnsResult::onDnsResult() 0 status=4
>>> STACK
>>> | 20070119-163503.155 | dcmlaptop | SmSipMgr | RESIP:DNS |
>>> 17423 |
>>> 162926 | DnsResult.cxx:978 | SRV lookup failed:
>>> _sip._udp.flammajobby.com 4
>>> STACK | 20070119-163503.155 | dcmlaptop | SmSipMgr | RESIP:DNS |
>>> 17423 |
>>> 162926 | DnsResult.cxx:1024 | No SRV records for flammajobby.com.
>>> Trying A
>>> records
>>> STACK | 20070119-163503.155 | dcmlaptop | SmSipMgr | RESIP:DNS |
>>> 17423 |
>>> 162926 | dns/DnsStub.cxx:375 | DnsStub::Query::go: flammajobby.com
>>> type(enum): 1 proto: 1
>>> STACK | 20070119-163503.155 | dcmlaptop | SmSipMgr | RESIP:DNS |
>>> 17423 |
>>> 162926 | DnsResult.cxx:749 | Received dns result for:
>>> flammajobby.com
>>> STACK | 20070119-163503.155 | dcmlaptop | SmSipMgr | RESIP:DNS |
>>> 17423 |
>>> 162926 | DnsResult.cxx:750 | DnsResult::onDnsResult() 0 STACK |
>>> 20070119-163503.155 | dcmlaptop | SmSipMgr | RESIP:DNS |
>>> 17423 |
>>> 162926 | DnsResult.cxx:771 | Adding [ V4 10.4.5.249:5060 UDP target
>>> domain=flammajobby.com connectionId=0 ] to result set DEBUG |
>>> 20070119-163503.155 | dcmlaptop | SmSipMgr | RESIP:APP |
>>> 17423 |
>>> 162926 | app.cpp:66 | ResipDnsHandler::handle - received
>>> flammajobby.com STACK | 20070119-163503.155 | dcmlaptop | SmSipMgr |
>>> RESIP:DNS |
>>> 17423 |
>>> 162926 | DnsResult.cxx:178 | Returning next dns entry: [ V4
>>> 10.4.5.249:5060 UDP target domain=flammajobby.com connectionId=0 ]
>>> DEBUG | 20070119-163503.155 | dcmlaptop | SmSipMgr | RESIP:APP |
>>> 17423 |
>>> 162926 | app.cpp:74 | flammajobby.com -> [ V4 10.4.5.249:5060 UDP
>>> target domain=flammajobby.com connectionId=0 ] STACK |
>>> 20070119-163503.155 | dcmlaptop | SmSipMgr | RESIP:DNS |
>>> 17423 |
>>> 162926 | DnsResult.cxx:485 | Priming [] DEBUG |
>>> 20070119-163503.155 |
>>> dcmlaptop | SmSipMgr | RESIP:APP |
>>> 17423 |
>>> 162926 | app.cpp:78 | ResipDnsHandler::handle - complete INFO |
>>> 20070119-163503.155 | dcmlaptop | SmSipMgr | RESIP:APP |
>>> 17423 |
>>> 190616 | app.cpp:253 | ResipDns::lookup - DNS results for
>>> sip:flammajobby.com;transport=udp INFO | 20070119-163503.156 |
>>> dcmlaptop | SmSipMgr | RESIP:APP |
>>> 17423 |
>>> 190616 | app.cpp:260 | [ V4 10.4.5.249:5060 UDP target
>>> domain=flammajobby.com connectionId=0 ]
>>>
>>> _______________________________________________
>>> resiprocate-devel mailing list
>>> resiprocate-devel@xxxxxxxxxxxxxxxxxxxx
>>> https://list.resiprocate.org/mailman/listinfo/resiprocate-devel
>>
>