Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Random delay when trying the next IP after a connect timeout #11920

Closed
hwti opened this issue Sep 22, 2023 · 5 comments
Closed

Random delay when trying the next IP after a connect timeout #11920

hwti opened this issue Sep 22, 2023 · 5 comments
Assignees

Comments

@hwti
Copy link
Contributor

hwti commented Sep 22, 2023

I did this

I tested the connect timeout handling, by connecting to 2 non-responding IPs :

~ curl -v --trace-config all --connect-timeout 1 --resolve test.com:80:192.0.2.0,192.0.2.1 test.com:80
18:46:22.229745 [0-x] * Added test.com:80:192.0.2.0,192.0.2.1 to DNS cache
18:46:22.229783 [0-0] * Hostname test.com was found in DNS cache
18:46:22.229799 [0-0] * [SETUP] added
18:46:22.229818 [0-0] * [HAPPY-EYEBALLS] created ipv4 (timeout 500ms)
18:46:22.229833 [0-0] * [HAPPY-EYEBALLS] ipv4 starting (timeout=500ms)
18:46:22.229854 [0-0] *   Trying 192.0.2.0:80...
18:46:22.229869 [0-0] * [TCP] cf_socket_open() -> 0, fd=5
18:46:22.229927 [0-0] * [HAPPY-EYEBALLS] ipv4 connect -> 0, connected=0
18:46:22.429240 [0-0] * [TCP] not connected yet
18:46:22.429284 [0-0] * [HAPPY-EYEBALLS] ipv4 connect -> 0, connected=0
18:46:22.430385 [0-0] * [TCP] not connected yet
18:46:22.430412 [0-0] * [HAPPY-EYEBALLS] ipv4 connect -> 0, connected=0
18:46:22.729805 [0-0] * [TCP] not connected yet
18:46:22.729852 [0-0] * [HAPPY-EYEBALLS] ipv4 connect -> 0, connected=0
18:46:22.729899 [0-0] * [TCP] not connected yet
18:46:22.729948 [0-0] * ipv4 connect timeout after 500ms, move on!
18:46:22.729995 [0-0] * [HAPPY-EYEBALLS] ipv4 connect -> 28, connected=0
18:46:22.730065 [0-0] * [TCP] cf_socket_close(5, not active)
18:46:22.730122 [0-0] * [TCP] destroy
18:46:22.730162 [0-0] * [HAPPY-EYEBALLS] ipv4 trying next
18:46:23.230852 [0-0] *   Trying 192.0.2.1:80...
18:46:23.230923 [0-0] * [TCP] cf_socket_open() -> 0, fd=5
18:46:23.231057 [0-0] * ipv4 connect timeout after 500ms, move on!
18:46:23.231090 [0-0] * [HAPPY-EYEBALLS] ipv4 connect -> 28, connected=0
18:46:23.231126 [0-0] * [HAPPY-EYEBALLS] ipv4 done
18:46:23.231163 [0-0] * [HAPPY-EYEBALLS] all eyeballers failed
18:46:23.231206 [0-0] * [HAPPY-EYEBALLS] ipv4 assess started=1, result=28
18:46:23.231260 [0-0] * Failed to connect to test.com port 80 after 1001 ms: Timeout was reached
18:46:23.231319 [0-0] * Closing connection
18:46:23.231368 [0-0] * [SETUP] close
18:46:23.231416 [0-0] * [HAPPY-EYEBALLS] close
18:46:23.231468 [0-0] * [TCP] cf_socket_close(5, not active)
18:46:23.231537 [0-0] * [TCP] destroy
18:46:23.231594 [0-0] * [HAPPY-EYEBALLS] destroy
18:46:23.231649 [0-0] * [SETUP] destroy
curl: (28) Failed to connect to test.com port 80 after 1001 ms: Timeout was reached

I expected the following

There should be no delay between [HAPPY-EYEBALLS] ipv4 trying next and Trying / cf_socket_open.

Sometimes it works correctly, but sometimes it seems to sleep for the connect timeout delay before opening the socket (then it timeouts immediately).

curl/libcurl version

Issue seen with 8.3.0 and on master.

curl 8.4.0-DEV (x86_64-pc-linux-gnu) libcurl/8.4.0-DEV zlib/1.2.13 brotli/1.0.9 zstd/1.5.5 libidn2/2.3.4 libpsl/0.21.2 (+libidn2/2.3.4) nghttp2/1.52.0
Release-Date: [unreleased]
Protocols: dict file ftp gopher http imap mqtt pop3 rtsp smtp telnet tftp
Features: alt-svc AsynchDNS brotli HTTP2 IDN IPv6 Largefile libz PSL threadsafe UnixSockets zstd

operating system

Linux (Fedora 38, kernel 6.4.15), but also seen on kernel 5.4 on an embedded target.

@bagder
Copy link
Member

bagder commented Sep 23, 2023

The delay does not seem to be random. It appears to be the timeout time/2, which in this case is 500ms. When I try the same command line but change the connect time-out to 2 seconds, it seems the delay is about 1000ms.

@hwti
Copy link
Contributor Author

hwti commented Sep 23, 2023

What I meant is that the issue is random, I either get the time/2, or 0.

bagder added a commit that referenced this issue Sep 25, 2023
... so that it gets called again immediately and can continue trying
addresses to connect to. Otherwise it might unnecessarily wait for a
while there.

Fixes #11920
Reported-by: Loïc Yhuel
@hwti
Copy link
Contributor Author

hwti commented Sep 25, 2023

It seems that when it works properly, it's due to another issue, sometimes curl_multi_perform is called twice for the same timer :

  • multi_timeout uses Curl_timediff which rounds down
  • curl_multi_perform is called slightly too early (less than 1ms) for the 500ms timeout
  • the connect timeout logic takes a new timestamp (enough to declare the timeout)
  • curl_multi_perfom doesn't remove the timer, since it uses the initial now value
  • multi_timeout returns 0
  • curl_multi_perform is called again, which tries the next IP address

So multi_timeout should probably use a new variant of Curl_timediff which rounds up ( ffd08df handled it for 0, but it can happen for any value).

@bagder
Copy link
Member

bagder commented Sep 25, 2023

Yes, that could be an optimization. I have just never bothered about it...

@hwti
Copy link
Contributor Author

hwti commented Sep 25, 2023

Yes, that could be an optimization. I have just never bothered about it...

Done in #11938 (since it helps reproducing this issue).
The happy eyeballs timeout can also be removed when there is only one family : #11939.

bagder added a commit that referenced this issue Sep 25, 2023
... so that it gets called again immediately and can continue trying
addresses to connect to. Otherwise it might unnecessarily wait for a
while there.

Fixes #11920
Reported-by: Loïc Yhuel
@bagder bagder closed this as completed in 01d8473 Sep 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

3 participants