curl-library
Re: Changes to connection timeout policy when multiple DNS records are present
Date: Tue, 11 Nov 2014 11:09:40 +0100 (CET)
On Mon, 10 Nov 2014, Ryan Braud wrote:
> After doing some testing with libcurl lately I have noticed curl's connect 
> behavior on timeouts has changed sometime between 7.22 and 7.39.
You're taking on three years of development there. More than 4200 commits.
[7.22]
> In this version, curl tries to connect to each IP and declares it as timed
> out if there was no response in 2 seconds.
Right, we used [timeout]/N seconds for each of the N hosts. That turned out to
be a rather silly algorithm, since if a site suddenly added more addresses to
its name, it would decrease the time curl would spend trying each address. Not
really what users expect.
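
To make the old scheme concrete, here is a minimal sketch, assuming the
per-address budget was simply the total timeout divided by the number of
resolved addresses. The function name equal_split_ms is made up for
illustration; this is not code from libcurl.

```c
#include <stdio.h>

/* Hypothetical helper, not libcurl code: the old-style per-address budget,
   assuming the total timeout is divided evenly over all resolved addresses. */
long equal_split_ms(long total_ms, int num_addresses)
{
  if(num_addresses <= 0)
    return total_ms; /* nothing to split over, keep the whole budget */
  return total_ms / num_addresses;
}

/* Print how the per-address budget shrinks as a name gets more addresses. */
void print_equal_split(long total_ms, int max_addresses)
{
  int n;
  for(n = 1; n <= max_addresses; n++)
    printf("%d address(es): %ld ms each\n", n, equal_split_ms(total_ms, n));
}
```

With a 10000 ms timeout and five addresses this gives 2000 ms per address,
which matches the 2 seconds reported above, and the figure drops to 1000 ms if
the name grows to ten addresses.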
Also, if you ask for a 10 second timeout, it isn't fair to assume you want to
try all possible IP addresses within that time, but rather that you're
basically willing to spend 10 seconds trying to get it done. I'm not totally
convinced the current algorithm is perfect yet either, since it cuts the
timeout period down too much after a few attempts.
> * Rebuilt URL to: www.google.com:45/
> * Hostname was NOT found in DNS cache
> *   Trying 74.125.239.51...
> * After 4998ms connect time, move on!
> Now curl seems to allocate half its time to the first connection, half of 
> that time to the second, and so on.
Yes, that's the new algorithm. It is meant to favour the first addresses more
and not just split the total time into equal fractions.
> CURLINFO_PRIMARY_IP now returns the empty string after these fetches.
Yes, but is that really wrong? What do you think the primary IP is on a failed 
connect attempt to N different IP addresses? I see that it used to report the 
last tried IP in the past, but that's not exactly how it is documented.
I can see how we could restore the former behaviour, but then we should also
update the docs accordingly.
> 7.39:
> * After 196ms connect time, move on!
> * connect to 74.125.239.51 port 45 failed: Connection timed out
> *   Trying 74.125.239.50...
> * Connection timed out after 10000 milliseconds
> * Closing connection 0
>
> The strategy here seems mostly the same as in 7.36, except the values don't
> make as much sense.  If you add up the times it spent on each individual
> connection, you end up well short of 10000 ms, even though the wallclock
> time of the program is very close to 10 seconds.  CURLINFO_PRIMARY_IP is
> also missing here.
The times allowed seem to be roughly the same ones as used in 7.36. It halves
the remaining maximum time for each IP tried. So 5 seconds for the first, 2.5
for the next and so on, which gives the fifth IP a mere 312 milliseconds
(adjusted somewhat since time is lost here and there, so the last one actually
only got 196 ms).
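
The arithmetic can be sketched as follows. This is an idealized illustration,
assuming each attempt gets exactly half of what the previous one got and that
no time is lost between attempts. The function names are hypothetical and this
is not the actual libcurl implementation.

```c
#include <stdio.h>

/* Hypothetical helper, not libcurl code: idealized budget for the k-th
   address tried (attempt starts at 1), halving the budget every attempt. */
long halved_share_ms(long total_ms, int attempt)
{
  long share = total_ms;
  int i;
  for(i = 0; i < attempt; i++)
    share /= 2; /* integer division, so 625 / 2 becomes 312 */
  return share;
}

/* Sum of the idealized budgets for the first 'attempts' addresses. */
long halved_total_ms(long total_ms, int attempts)
{
  long sum = 0;
  int k;
  for(k = 1; k <= attempts; k++)
    sum += halved_share_ms(total_ms, k);
  return sum;
}

/* Print the idealized schedule, one line per attempt. */
void print_halved_schedule(long total_ms, int attempts)
{
  int k;
  for(k = 1; k <= attempts; k++)
    printf("attempt %d: %ld ms\n", k, halved_share_ms(total_ms, k));
}
```

For a 10000 ms timeout the first five idealized budgets are 5000, 2500, 1250,
625 and 312 ms, which add up to 9687 ms. Real attempts get somewhat less, since
the budget is taken from whatever time actually remains.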
> So I have a few questions:
> 1)  When did the retry behavior change between 7.22 and 7.36?  I don't see
> anything in the changelog relating to retries or timeouts on connections.
I couldn't find it right now.
> 2)  Was it intentional to remove CURLINFO_PRIMARY_IP when a connection was 
> not established?  I was relying on this value before as long as the DNS 
> resolution was successful and now it is mysteriously not there.
I can't recall that we removed it intentionally, but I also think that it was 
kind of there unintentionally to begin with as I mentioned already.
> 3)  What happened between 7.36 and 7.39 to make the timings "strange" in the 
> current version?
I don't think they're that strange; it just looks like something eats up some
more time before the next address is used. That could of course be worth
investigating.
-- 
 / daniel.haxx.se
-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette:  http://curl.haxx.se/mail/etiquette.html
Received on 2014-11-11