Re: [repro-users] DNS SRV failover


Hi Scott,

Thanks for the follow-up!

Just to confirm: the 32-second transaction time is derived from T1*64
in the stack, right?
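
For reference, the arithmetic behind that number, assuming the RFC 3261 default of T1 = 500 ms (Timer B for INVITE and Timer F for non-INVITE transactions both fire at 64*T1). The names below are only illustrative and are not taken from the resiprocate source:

```python
# Sketch of the RFC 3261 transaction timeout arithmetic.
# T1_MS is the RFC 3261 default; a given stack build may configure it differently.
T1_MS = 500


def transaction_timeout_ms(t1_ms: int = T1_MS) -> int:
    """Duration of Timer B / Timer F in milliseconds: 64 * T1."""
    return 64 * t1_ms


print(transaction_timeout_ms() / 1000)  # 32.0
```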

On 02/12/16 17:52, Scott Godin wrote:
> Hi Nikolay,
> 
> I know it's a year later, but I've finally had a chance to look into this
> and I understand why we are not trying the next DNS entry.  I will be
> committing a solution, likely sometime next week.
> 
> As for a standard 408 error that occurs after 32 seconds (typically over
> UDP), it doesn't make sense for the stack to try another DNS entry,
> since the 32-second transaction time has already expired.  Stack users
> expect some form of response within 32 seconds of issuing a request.
> For requests that 408 after 32 seconds, the application will need to be
> responsible for re-issuing the request if desired.
> 
> With my upcoming changes, as long as the TCP connection timeout occurs before
> the 32-second transaction timeout, the stack will try the next DNS entry.
> 
> Best Regards,
> Scott
> 
> On Tue, Dec 1, 2015 at 5:57 AM, Nikolay Shopik <shopik@xxxxxxxxxx> wrote:
> 
>> Hi Scott,
>>
>> Any chance you were able to look into this? My further testing confirms
>> that it is not related to the introduction of the tcpconnecttimeout
>> option; the issue existed before it. Nobody was able to reproduce it
>> because nobody was actually waiting for the 32-second timeouts before then.
>>
>> tcpconnecttimeout is awesome, but as a global setting it doesn't fit every
>> situation; a per-transport option would be much better. So I hope
>> this can eventually be implemented too.
>>
>> Thanks
>>
>> On 09/10/15 17:27, Scott Godin wrote:
>>> Thanks for reporting this.  I did not specifically test out DNS failover
>>> when I added the TCP connect timeout.  This will need to be investigated.
>>> Unfortunately I'm travelling for the next 1.5 weeks and probably won't have
>>> any time in the short term to take a look.  Please let us know if you are
>>> able to troubleshoot this further.
>>>
>>> Thanks,
>>> Scott
>>>
>>> On Fri, Oct 9, 2015 at 10:14 AM, Nikolay Shopik <shopik@xxxxxxxxxx> wrote:
>>>
>>>> This is a continuation of this thread:
>>>> http://list.resiprocate.org/archive/repro-users/msg00875.html
>>>>
>>>> I'm trying out the tcpconnecttimeout option (thanks, Scott, for adding
>>>> it). I have 3 SRV TCP records with different priorities, and the
>>>> highest-priority (lowest value) peer is always down.
>>>>
>>>> But my tcpdump shows that after the tcpconnecttimeout timer expires, I
>>>> get a 408 Request Timeout, without the next peer even being tried.
>>>>
>>>> So I set tcpconnecttimeout to 0, waited for 32 seconds, and still got
>>>> 408 Request Timeout after the first peer failed.
>>>>
>>>> There is one thing, though: the first call always fails for me, but if I
>>>> redial almost immediately, it gets through via the second DNS SRV record.
>>>>
>>>> This is repro 1.10, and I've tested on 1.9.7 too with the same results, so
>>>> this doesn't look like a regression.
>>>>
>>>> debug output
>>>> https://gist.github.com/nshopik/8ef091d2e329336227e8
>>>
>>
>
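
The behaviour discussed in this thread (try the next SRV entry only while the transaction timer still has time left) can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the resiprocate implementation: `pick_peer`, `try_connect`, `fake_connect`, and the example hostnames are hypothetical, SRV weights are ignored, and the 32-second budget assumes T1 = 500 ms.

```python
from typing import Callable, Iterable, Optional, Tuple

Target = Tuple[int, str]  # (SRV priority, host); weight omitted for brevity


def pick_peer(
    targets: Iterable[Target],
    try_connect: Callable[[str, float], Tuple[bool, float]],
    connect_timeout_s: float,
    transaction_timeout_s: float = 32.0,
) -> Optional[str]:
    """Return the first reachable host, or None (the caller would then report 408).

    try_connect(host, timeout) is a hypothetical hook that returns
    (connected, seconds_spent).
    """
    remaining = transaction_timeout_s
    for _priority, host in sorted(targets):  # lower priority value is tried first
        if remaining <= 0:
            break  # transaction budget used up; no point trying further entries
        timeout = min(connect_timeout_s, remaining)
        connected, spent = try_connect(host, timeout)
        if connected:
            return host
        remaining -= spent
    return None


def fake_connect(host: str, timeout: float) -> Tuple[bool, float]:
    # Simulated network: "a.example" never answers, every other host connects at once.
    return (False, timeout) if host == "a.example" else (True, 0.0)


peers = [(10, "a.example"), (20, "b.example"), (30, "c.example")]
print(pick_peer(peers, fake_connect, connect_timeout_s=5.0))   # b.example
print(pick_peer(peers, fake_connect, connect_timeout_s=32.0))  # None
```

With a 5-second connect timeout, the dead first peer costs 5 seconds and the second entry is still tried. With a connect timeout equal to the whole transaction timeout, nothing is left for the second entry and the request ends in a timeout, which matches the 408 case described above.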