Description
I use CURL to perform upload/download operations via HTTP from an openstack Swift server (www.hubic.com).
I get random errors "Operation timed out after 0 milliseconds with 0 out of 0 bytes received" both on HTTP PUT and GET operations. After several retries it will eventually work.
The message seems to be incorrect as the timeout is set to 20 ms.
Can someone confirm if this timeout occurs in the SSL handshake? If yes, are there any other curl flags I can use to increase connection reliability?
The code excerpts and some debug data can be found below.
Code extras
curl_easy_setopt(curl, CURLOPT_URL, url);
curl_easy_setopt(curl, CURLOPT_HEADER, 0);
curl_easy_setopt(curl, CURLOPT_NOSIGNAL, 1);
curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0);
curl_easy_setopt(curl, CURLOPT_USERAGENT, USER_AGENT);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 1);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 1);
curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, 20);
curl_easy_setopt(curl, CURLOPT_VERBOSE, 1);
curl_easy_setopt(curl, CURLOPT_TCP_NODELAY, 1);
curl_easy_setopt(curl, CURLOPT_UPLOAD, 1);
curl_easy_perform(curl);
Debug output
CURL version=7.38.0 ssl=OpenSSL/1.0.1k
[2016-01-27 23:18:37.]:26085==adding extra header: X-Copy-From: /default_fuse_segments/default/test/t2_segments/medium.txt/1453929407.883449/00000003
[2016-01-27 23:18:37.]:26085==adding extra header: Content-Length: 0
- Hostname was found in DNS cache
- Trying 46.105.173.106...
- TCP_NODELAY set
- Connected to lb1.hubic.ovh.net (46.105.173.106) port 443 (Change cnonce values more frequently #50)
- successfully set certificate verify locations:
- CAfile: none
CApath: /etc/ssl/certs - SSL re-using session ID
- Operation timed out after 0 milliseconds with 0 out of 0 bytes received
- Closing connection 50
Activity
jay commentedon Jan 27, 2016
Though this report seems to have some merit for the incorrect timeout message please note this is an issue tracker and in any other case your questions should be directed to the curl-library mailing list where you will get a wider audience.
I can confirm sometimes the connect timeouts may show as incorrect and show an error message of
Operation timed out after 0 milliseconds
or some small ms like 10ms. I'm using:Can you give us your
curl -V
?I don't know what is causing it yet I'm having trouble debugging it because when I add 'when hit/continue on execution' bps in visual studio on
Curl_timeleft
it slows down the timeout functions and I get acurl: (28) SSL connection timeout
. data->mstate is always <= CURLM_STATE_WAITDO. But if I use a straight breakpoint right when it shows the warning I can see thenow
time is the same as the progress start timeprogress.t_startsingle
. Not that helpful though.Regarding the rest of what you wrote:
You should use exactly the types specified, for example most values must be
long
so you would use(long)
or the integer literal suffixL
as inCURLOPT_CONNECTTIMEOUT, 20L
. Also you say 20ms but the connect timeout is actually specified in seconds so 20 seconds. Does the problem always occur with a content-length of 0 (why set the header to that?) and SSL session ID reuse?When curl uses OpenSSL for its backend if there's session re-use you should see an abbreviated handshake. I don't know why the Client hello isn't sent. You may want to check in Wireshark to see if it actually is being sent. I tried session re-use on that server and it appears to work fine (using libcurl/7.47.0-DEV OpenSSL/1.0.2e).
dan-cristian commentedon Jan 30, 2016
Thanks for the response. Indeed the main reason for reporting this is the incorrect error message.
curl -V gives this:
curl 7.38.0 (x86_64-pc-linux-gnu) libcurl/7.38.0 OpenSSL/1.0.1k zlib/1.2.8 libidn/1.29 libssh2/1.4.3 librtmp/2.3
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp scp sftp smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile GSS-API SPNEGO NTLM NTLM_WB SSL libz TLS-SRP
Timeout is indeed set to 20s. I changed it to 20L now.
SSL sesion ID reuse is enabled as I use a thread pool and recycle connections. I have not tried yet to test without session reuse.
Content length is set to 0 for some files that represent manifest objects in swift storage, but the issue does not occur only for 0 lenght files, as I mention I also get the error while performing GET requests.
I'l try to debug with Wireshark and comeback with the findings.
bagder commentedon Feb 6, 2016
Any news? We can't reproduce this problem based on this little data.
dan-cristian commentedon Feb 6, 2016
I was not able yet to reproduce, I think the issue appears when the target
server is very loaded, but in the last days it worked without any single
incident.
I will keep trying to reproduce.
On Feb 7, 2016 01:25, "Daniel Stenberg" notifications@github.com wrote:
bagder commentedon Feb 16, 2016
As its not reproducible and we have no clues on where to start looking for a problem that hit once, I close this. If we get more/new details we can always re-open or create a new bug then.
jay commentedon Feb 17, 2016
I didn't have time to explore this further, here is more detail in case you have any ideas.
I haven't reproduced his specific use case because I would need more information on that. However I can reproduce that error message. I had some problem debugging because any wait affects the path so when I break at almost any point it messes up the reproduction.
The way I can reproduce is if https and a ClientHello is sent but no ServerHello is received and a connect timeout is set. But I don't see that same thing in the reporter's output.
To reproduce listen on a port using netcat or socat but don't use a ssl/tls layer, like
nc -l localhost 8000
. Then connect to that port via https and after five seconds it times out.Sometimes it doesn't show that error message, and will instead just say
curl: (28) SSL connection timeout
. I'm not sure why it's inconsistent like that but I assume it's something to do with the timing and my debugger (Visual Studio).A possible fix for the 0ms might be change k->now in this line in multi to just the actual now time.
Unless there's something wrong with k->now, maybe it's supposed to be updated and wasn't. I didn't have more time to look into that.
jay commentedon Feb 18, 2016
I took another look at this and
k->now
(akadata->req.now
) is set initially to the start time (data->req.start
) during stateCURLM_STATE_CONNECT
byCurl_connect
(create_conn --> Curl_init_do) and is not set again until stateCURLM_STATE_PERFORM
byCurl_readwrite
.data->progress.t_startsingle
is set initially inCURLM_STATE_CONNECT
as well.Sometimes there is a few milliseconds of difference between the initial
data->req.now
anddata->progress.t_startsingle
because they are set separately:So when we get the difference between
data->req.now
anddata->progress.t_startsingle
before CURLM_STATE_PERFORM it is either 0ms or something short like 16ms.We could either update
data->req.now
at each state, or at the very top of the loop, or maybe we could change to using thenow
time like belowhieuvo commentedon Feb 22, 2016
I'm suffering the same thing, that error just randomly happens:
jay commentedon Feb 22, 2016
@hieuvo any information you can give us on when it's happening would be helpful. Can you enable verbose output and let us know what you see before the error occurs? Are the URLs it happens with all of one type? For example, are they all https..
hieuvo commentedon Feb 22, 2016
yes, they are all https. It happens on our live production servers.
I don't have permissions to make changes and get verbose about on those ones.
hieuvo commentedon Feb 22, 2016
The curl's results are mixed of:
Although, most of urls and data format are the same
9 remaining items