Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Websockets - 150 seconds to connect to the endpoint #10157

Closed
mahisnghrwt opened this issue Dec 25, 2022 · 3 comments
Closed

Websockets - 150 seconds to connect to the endpoint #10157

mahisnghrwt opened this issue Dec 25, 2022 · 3 comments

Comments

@mahisnghrwt
Copy link

mahisnghrwt commented Dec 25, 2022

I did this

I am using the multi_socket interface with libev to connect to a WebSocket endpoint. I have observed an intermittent bug where curl gets stuck CURLMstate::MSTATE_CONNECTING stage for around 150 seconds because the underlying socket wasn't writable(is_connected done out parameter is false) when curl checked. And, nearest timeout set is for EXPIRE_DNS_PER_NAME which defaults to 200 seconds. But, every single time, eventually, curl connects to the endpoint.

I expected the following

I'd expect curl to routinely check the underlying socket until it becomes writable.

This might help in reproducing the bug

  • Websocket endpoint - wss://fstream.binance.com/stream?streams=btcusdt@depth5
  • Set CURLOPT_HAPPY_EYEBALLS_TIMEOUT_MS to 100(ms) (Not sure how this is related.)
  • Only add a single easy handle to the multi interface.
  • Log
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    [Detaching after vfork from child process 7672]
    OpenSSL 1.1.1s  1 Nov 2022
    libcurl version: 7.87.0
    * STATE: INIT => CONNECT handle 0x1346808; line 1909 (connection #-5000)
    * Added connection 0. The cache now contains 1 members
    [New Thread 0x7ffff76a1700 (LWP 7674)]
    * STATE: CONNECT => RESOLVING handle 0x1346808; line 1955 (connection #0)
    * [CONN-0-0] setup, init filter chain
    * [CONN-0-0] cf_add(filter=SOCKET)
    * [CONN-0-0] cf_add(filter=SSL)
    * [CONN-0-0][CF-SOCKET] setup(remotehost=fstream.binance.com)
    * STATE: RESOLVING => CONNECTING handle 0x1346808; line 2029 (connection #0)
    * family0 == v4, family1 == v6
    *   Trying 3.113.111.236:443...
    [Thread 0x7ffff76a1700 (LWP 7674) exited]
    * [CONN-0-0] connect(block=0)-> 0, done=0
    * [CONN-0-0] connect(block=0)-> 0, done=0
    * [CONN-0-0] connect(block=0)-> 0, done=0
    * [CONN-0-0] connect(block=0)-> 0, done=0
    [New Thread 0x7ffff76a1700 (LWP 7677)]
    // Stuck for 150 seconds
    
  • Log from gdb
    [Detaching after vfork from child process 13557]
    OpenSSL 1.1.1s  1 Nov 2022
    libcurl version: 7.87.0
    
    Breakpoint 1, Curl_expire (data=0x1346808, milli=0, id=EXPIRE_RUN_NOW) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 0
    id = EXPIRE_RUN_NOW
    * STATE: INIT => CONNECT handle 0x1346808; line 1909 (connection #-5000)
    * Added connection 0. The cache now contains 1 members
    [New Thread 0x7ffff76a1700 (LWP 13559)]
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=1, id=EXPIRE_ASYNC_NAME) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 1
    id = EXPIRE_ASYNC_NAME
    * STATE: CONNECT => RESOLVING handle 0x1346808; line 1955 (connection #0)
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=2, id=EXPIRE_ASYNC_NAME) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 2
    id = EXPIRE_ASYNC_NAME
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=4, id=EXPIRE_ASYNC_NAME) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 4
    id = EXPIRE_ASYNC_NAME
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=8, id=EXPIRE_ASYNC_NAME) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 8
    id = EXPIRE_ASYNC_NAME
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=16, id=EXPIRE_ASYNC_NAME) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 16
    id = EXPIRE_ASYNC_NAME
    [Thread 0x7ffff76a1700 (LWP 13559) exited]
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=0, id=EXPIRE_RUN_NOW) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 0
    id = EXPIRE_RUN_NOW
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=7, id=EXPIRE_ASYNC_NAME) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 7
    id = EXPIRE_ASYNC_NAME
    * [CONN-0-0] setup, init filter chain
    * [CONN-0-0] cf_add(filter=SOCKET)
    * [CONN-0-0] cf_add(filter=SSL)
    * [CONN-0-0][CF-SOCKET] setup(remotehost=fstream.binance.com)
    * STATE: RESOLVING => CONNECTING handle 0x1346808; line 2029 (connection #0)
    * family0 == v4, family1 == v6
    *   Trying 52.197.50.89:443...
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=149988, id=EXPIRE_DNS_PER_NAME) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 149988
    id = EXPIRE_DNS_PER_NAME
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=149988, id=EXPIRE_DNS_PER_NAME2) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 149988
    id = EXPIRE_DNS_PER_NAME2
    
    Thread 1 "url_test.bin" hit Breakpoint 1, Curl_expire (data=0x1346808, milli=100, id=EXPIRE_HAPPY_EYEBALLS) at multi.c:3517
    3517	in multi.c
    data = 0x1346808
    milli = 100
    id = EXPIRE_HAPPY_EYEBALLS
    * [CONN-0-0] connect(block=0)-> 0, done=0
    * [CONN-0-0] connect(block=0)-> 0, done=0
    * [CONN-0-0] connect(block=0)-> 0, done=0
    [New Thread 0x7ffff76a1700 (LWP 13560)]
    
    

curl/libcurl version

curl 7.87.0 (x86_64-pc-linux-gnu) libcurl/7.87.0 OpenSSL/1.1.1s zlib/1.2.11
Release-Date: 2022-12-21
Protocols: dict file ftp ftps gopher gophers http https imap imaps mqtt pop3 pop3s rtsp smb smbs smtp smtps telnet tftp ws wss
Features: alt-svc AsynchDNS Debug HSTS HTTPS-proxy IPv6 Largefile libz NTLM NTLM_WB SSL threadsafe TLS-SRP TrackMemory UnixSockets

operating system

Linux mahi 5.13.0-30-generic #33~20.04.1-Ubuntu SMP Mon Feb 7 14:25:10 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
@mahisnghrwt
Copy link
Author

I would like to point out that this issue was also present in 7.86.0.

@bagder
Copy link
Member

bagder commented Dec 25, 2022

@icing isn't this a bug? In this case, there's a socket filter and an SSL filter added, and when Curl_conn_get_select_socks() is called the first time when the multi is in MSTATE_CONNECTING ends up calling the SSL version Curl_ssl_get_select_socks...

In that function, the state is neither of the two checked for, so it returns blank. This, while the socket is not yet connected. Surely it calls the wrong getsock() here prematurely?

bagder added a commit that referenced this issue Dec 26, 2022
Since there are filters addded for both socket and SSL, the code
previously checked the SSL sockets during connect when it *should* first
check the socket layer until that has connected.

Fixes #10157
@bagder bagder closed this as completed in 728400f Dec 27, 2022
@mahisnghrwt
Copy link
Author

Thank you @bagder for quickly fixing it.

matthewsot pushed a commit to ygina/curl that referenced this issue Jan 19, 2023
When there are filters addded for both socket and SSL, the code
previously checked the SSL sockets during connect when it *should* first
check the socket layer until that has connected.

Fixes curl#10157
Fixes curl#10146
Closes curl#10160

Reviewed-by: Stefan Eissing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

2 participants