
waiting for name resolver threads before quitting can cause delays even with --max-time #2975

Closed
crvv opened this issue Sep 11, 2018 · 10 comments

Comments


crvv commented Sep 11, 2018

I did this

  1. Edit /etc/resolv.conf to use a bad DNS server.
  2. time curl https://github.com/ --max-time 1 -v
     curl exits after 5 seconds. The output is:
time curl https://github.com/ --max-time 1 -v
* Resolving timed out after 1000 milliseconds
* Could not resolve host: github.com
* stopped the pause stream!
* Closing connection 0
curl: (28) Resolving timed out after 1000 milliseconds
curl https://github.com/ --max-time 1 -v  0.02s user 0.02s system 0% cpu 5.042 total

I expected the following

curl exits after 1 second.
The documentation at https://ec.haxx.se/usingcurl-timeouts.html says "When the set time has elapsed, curl will exit no matter what is going on at that moment", so I think this is a bug.

curl/libcurl version

curl 7.61.0 (x86_64-pc-linux-gnu) libcurl/7.61.0 OpenSSL/1.1.0i zlib/1.2.11 libidn2/2.0.5 libpsl/0.20.2 (+libidn2/2.0.4) nghttp2/1.32.0
Release-Date: 2018-07-11
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy PSL

operating system

Arch Linux


jay commented Sep 11, 2018

Yes, in your case it calls getaddrinfo, which blocks in a separate thread. However, it waits in multi_done for the resolve to complete:

curl/lib/multi.c

Lines 536 to 539 in 432eb5f

if(data->mstate == CURLM_STATE_WAITRESOLVE) {
  /* still waiting for the resolve to complete */
  (void)Curl_resolver_wait_resolv(conn, NULL);
}

I'm not sure why that is necessary, because libcurl's threaded resolver was designed to clean up resolves in the background. That would happen later on, when multi_done cancels via Curl_resolver_cancel, which in this case would call asyn-thread's destroy_async_data:

curl/lib/asyn-thread.c

Lines 351 to 362 in 432eb5f

/*
 * if the thread is still blocking in the resolve syscall, detach it and
 * let the thread do the cleanup...
 */
Curl_mutex_acquire(td->tsd.mtx);
done = td->tsd.done;
td->tsd.done = 1;
Curl_mutex_release(td->tsd.mtx);
if(!done) {
  Curl_thread_destroy(td->thread_hnd);
}
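
To make that concrete, here is a standalone sketch of the same detach handshake with plain pthreads (the struct and function names are mine for illustration, not curl's): both sides flip a shared done flag under a mutex, and whichever side flips it second owns the cleanup, so the parent never has to sit on a blocked getaddrinfo().

/* Standalone sketch of the detach pattern (illustrative names, not curl's):
 * the parent and the resolver thread share a 'done' flag; whichever side
 * flips it second is responsible for freeing the shared state. */
#include <pthread.h>
#include <stdlib.h>
#include <netdb.h>

struct shared {
  pthread_mutex_t mtx;
  int done;             /* set once by whichever side finishes/abandons first */
  struct addrinfo *res; /* resolve result, freed by the owning side */
};

static void *resolver_thread(void *arg)
{
  struct shared *s = arg;
  struct addrinfo *res = NULL;
  getaddrinfo("example.com", NULL, NULL, &res); /* may block for seconds */

  pthread_mutex_lock(&s->mtx);
  int abandoned = s->done; /* did the parent already give up on us? */
  s->done = 1;
  s->res = res;
  pthread_mutex_unlock(&s->mtx);

  if(abandoned) { /* parent detached us: the thread does the cleanup */
    if(res)
      freeaddrinfo(res);
    pthread_mutex_destroy(&s->mtx);
    free(s);
  }
  return NULL;
}

/* called when the transfer is torn down before the resolve completed */
static void abandon_resolve(struct shared *s, pthread_t th)
{
  pthread_mutex_lock(&s->mtx);
  int finished = s->done;
  s->done = 1; /* tell the thread it has been abandoned */
  pthread_mutex_unlock(&s->mtx);

  if(finished) { /* thread already done: join it and clean up here */
    pthread_join(th, NULL);
    if(s->res)
      freeaddrinfo(s->res);
    pthread_mutex_destroy(&s->mtx);
    free(s);
  }
  else
    pthread_detach(th); /* thread cleans up itself when getaddrinfo returns */
}

The wait in multi_done quoted above defeats exactly this design: instead of taking the detach branch, it blocks until the resolver thread returns.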


jay commented Sep 11, 2018

git blame puts it at ac9a179

When the application just started the transfer and then stops it while
the name resolve in the background thread hasn't completed, we need to
wait for the resolve to complete and then cleanup data accordingly.

Enabled test 1553 again and added test 1590 to also check when the host
name resolves successfully.

Detected by OSS-fuzz.
Closes #1968

I think we should just let it leak


bagder commented Sep 11, 2018

This isn't easy.

Leaking is a problem for tools that trigger on this (like OSS-Fuzz did) and for applications that would do this several times. I don't think we can allow that to happen unconditionally. But not responding sooner to this sort of failure is also disturbing...

@bagder bagder added the name lookup DNS and related tech label Sep 11, 2018

jay commented Sep 11, 2018

If the thread can be abandoned with ownership of all its resources, which it later cleans up itself (I'm assuming here), then I don't see what benefit there is to waiting for it other than to please OSS-Fuzz. I think the fuzzer is wrong in this case.

@alexeip0

Perhaps we should track the number of detached threads and have Curl_resolver_global_cleanup wait for the count to drop to 0 before returning?
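
A rough sketch of how that could look (detached_count, the helper functions and the condition variable here are hypothetical, not existing libcurl internals):

/* Hypothetical sketch of a global detached-thread counter: abandoned
 * resolver threads are registered, signal on exit, and a global cleanup
 * can block until none are left. */
#include <pthread.h>

static pthread_mutex_t detached_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t detached_cnd = PTHREAD_COND_INITIALIZER;
static unsigned int detached_count; /* abandoned resolver threads still alive */

void detached_register(void)    /* call when abandoning a resolver thread */
{
  pthread_mutex_lock(&detached_mtx);
  detached_count++;
  pthread_mutex_unlock(&detached_mtx);
}

void detached_done(void)        /* call from the thread, just before it exits */
{
  pthread_mutex_lock(&detached_mtx);
  if(--detached_count == 0)
    pthread_cond_signal(&detached_cnd);
  pthread_mutex_unlock(&detached_mtx);
}

void detached_wait_all(void)    /* call from global cleanup before returning */
{
  pthread_mutex_lock(&detached_mtx);
  while(detached_count)
    pthread_cond_wait(&detached_cnd, &detached_mtx);
  pthread_mutex_unlock(&detached_mtx);
}

Note this only moves the blocking from every transfer's teardown to the final global cleanup: a slow getaddrinfo() would still delay that one call.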


alexeip0 commented Sep 14, 2018

A smarter fix would track detached threads on a per-multi-handle basis and do a similar wait in curl_multi_cleanup.

This would have to be rather intricate though, since curl_multi_cleanup is called from the multi thread as well and thus cannot do a blocking wait.
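
One illustrative way to avoid the blocking wait is to refcount the state shared between a multi handle and its detached threads, and let whichever side drops the last reference free it (again, not actual curl code):

/* Illustrative refcount pattern: the multi handle holds one reference and
 * each detached resolver thread holds one; the last unref frees the block,
 * so curl_multi_cleanup can return immediately. */
#include <pthread.h>
#include <stdlib.h>

struct multi_shared {
  pthread_mutex_t mtx;
  unsigned int refs; /* 1 for the multi handle + 1 per detached thread */
};

void multi_shared_unref(struct multi_shared *s)
{
  pthread_mutex_lock(&s->mtx);
  unsigned int left = --s->refs;
  pthread_mutex_unlock(&s->mtx);
  if(left == 0) { /* we were the last owner */
    pthread_mutex_destroy(&s->mtx);
    free(s);
  }
}
/* curl_multi_cleanup would call multi_shared_unref() and return at once;
 * each detached thread calls it as its very last action. */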


bagder commented Sep 16, 2018

Any wait for the pending thread(s) risks blocking a call, which will affect some users. We can only really fix this issue by removing the wait completely. If the threads don't leak any memory by this, I think it can be done. We just have to keep logic to make sure that OSS-Fuzz and possibly other tests don't consider that a leak.
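
For the "don't consider that a leak" part, one possibility, assuming the leak checker in play is LeakSanitizer, would be to explicitly exempt the abandoned allocations through LSan's __lsan_ignore_object() hook; a sketch:

/* Sketch: exempt intentionally abandoned allocations from leak reports in
 * AddressSanitizer/LeakSanitizer builds; a no-op everywhere else. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif
#if defined(__SANITIZE_ADDRESS__) || __has_feature(address_sanitizer)
#include <sanitizer/lsan_interface.h>
#define IGNORE_LEAK(ptr) __lsan_ignore_object(ptr)
#else
#define IGNORE_LEAK(ptr) (void)(ptr)
#endif

/* hypothetical use when abandoning the resolver thread's state:
 *   IGNORE_LEAK(abandoned_state);
 */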

@bagder bagder changed the title option --max-time is ineffective when DNS timeout waiting for name resolver threads before quitting can cause delays even with --max-time Oct 18, 2018
@Alexander--

If the threads don't leak any memory by this, I think it can be done.

Threads leak their stacks.

This issue can't be fully fixed in libcurl, but users of libcurl (e.g. the command-line curl tool) should be able to skip the cleanup by invoking exit() instead of curl_multi_cleanup(). The GNU cp command does exactly this when it isn't built for Valgrind.
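
A sketch of that pattern for an application (the DEBUG_LEAK_CHECKING flag is made up for illustration; a real project would use its build system's equivalent):

/* Sketch: on a normal run, exit() and let the OS reclaim memory and thread
 * stacks instantly; only do the full teardown in leak-checking builds. */
#include <curl/curl.h>
#include <stdlib.h>

int main(void)
{
  curl_global_init(CURL_GLOBAL_DEFAULT);
  CURLM *multi = curl_multi_init();

  /* ... add easy handles, drive curl_multi_perform() ... */

#ifdef DEBUG_LEAK_CHECKING /* e.g. a valgrind/CI build */
  curl_multi_cleanup(multi); /* may block on an abandoned resolver thread */
  curl_global_cleanup();
#else
  (void)multi;
  exit(0); /* skip cleanup; the kernel frees everything, with no wait */
#endif
  return 0;
}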


stale bot commented Oct 10, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Oct 10, 2019
@stale stale bot closed this as completed Oct 24, 2019
@nicolashohm

I also encountered this issue.
I'm building a PHP application that accesses an external API while the user requests the page. In order not to slow down the page in case of problems with the API, I want to skip the API call after a certain timeout. For that I set CURLOPT_TIMEOUT_MS to 150. Unfortunately, this didn't work reliably. Since we faced issues with the external API, or rather with the DNS resolving, I often got this error: "Resolving timed out after 252 milliseconds". I have no idea where the 252 ms comes from, but not only is 252 ms already too much, the requests in the end took even longer. For example, when the average request took 500 ms, some requests took 1200 ms or even 1500 ms, which should never happen if the timeout were honored.

My guess: even though there seems to be a 252 ms timeout for the DNS resolving, that timeout is ignored. This would fit with what I found using dig: the DNS resolution sometimes really took 500 ms or longer.

Since I put the domain in /etc/hosts, the page speed is much more stable.

I hope this story is helpful for someone!

curl 7.38.0 (x86_64-pc-linux-gnu) libcurl/7.38.0

The php script was similar to this:

<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, '...');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_TIMEOUT_MS, 150); // cap the whole transfer at 150 ms
curl_setopt($ch, CURLOPT_PROXY, null);
curl_exec($ch);
var_dump(curl_error($ch));

@lock lock bot locked as resolved and limited conversation to collaborators Feb 9, 2020
ferrieux added a commit to ferrieux/curl that referenced this issue Jul 12, 2022
ferrieux added a commit to ferrieux/curl that referenced this issue Sep 29, 2022
ferrieux added a commit to ferrieux/curl that referenced this issue Nov 6, 2022
ferrieux added a commit to ferrieux/curl that referenced this issue Nov 17, 2022
bagder pushed a commit that referenced this issue Nov 17, 2022