Description
I did this
- enable HTTP/2.0 multiplexing
- send GET request that is answered after 5s
- also send POST request, send data infinitely in read_callback()
- if GET request completes, re-add GET handle to multi handle immediately
- read_callback() is maybe called one more time after re-adding the GET handle, but never again after that
The problem appears when adding the GET handle.
When adding the GET handle again is delayed, the read_callback() is called until shortly after the curl_multi_add_handle().
I expected the following
read_callback() is still called continuously, data can be send continuously.
With HTTP/2.0 multiplexing disabled, everything works as expected.
curl/libcurl version
build from source a3f3853
curl 7.60.0-DEV (x86_64-pc-linux-gnu) libcurl/7.60.0-DEV OpenSSL/1.0.2g zlib/1.2.8 nghttp2/1.31.0
Release-Date: [unreleased]
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS Debug TrackMemory IPv6 Largefile NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy
Also observed with curl 7.56 on ARM
operating system
Ubuntu 16.04
Servlet source and pcap on request.
Activity
geekodour commentedon Apr 25, 2018
@bagder is this a beginner friendly issue? I am exploring the codebase now, if it's a low complexity issue then would like to work on it.
bagder commentedon Apr 25, 2018
Thanks for considering this @geekodour, but while I haven't yet investigated this issue very closely I think I can say that this is not a simple issue. It involves multiple HTTP/2 streams and something not being right in the data handling in (one of) them.
To give you a better picture, let me describe roughly how I imagine I'd proceed to work on this bug:
bagder commentedon Apr 26, 2018
My first attempt to reproduce this issue has failed. My test application (that only changed the two URLs in the source posted above) that downloads a 4KB file while doing the POST, managed to get it 5000 times without problems by the time I stopped it...
I'm using libcurl from current git master.
steini2000 commentedon Apr 27, 2018
I'm also using git master now. And I've updated to Tomcat 9.0.7.
What I've found out so far:
read_callback() is not called because in Curl_readwrite(), select_res does not have the out bit sit.
The out bit is not set because for Curl_socket_check(), fd_write is -1.
fd_write is -1 because conn->writesockfd is -1.
conn->writesockfd is -1 because curl_setup_transfer() is called from Curl_http() in default case (GET request) with writesockindex -1 (no postdata).
I also changed the size of data for the GET request from a few bytes to ~4k , but I can still reproduce the issue.
I'll investigate further and report back next week.
steini2000 commentedon Apr 27, 2018
The servlet I'm using for testing:
https://github.com/steini2000/curl-2520
steini2000 commentedon Apr 27, 2018
I hope someone else can reproduce the issue.
This really doesn't look like a simple issue.
The problem is somewhere here:
curl/lib/http.c
Lines 2793 to 2808 in f84139f
Curl_setup_transfer() is called for the GET request. As there is no postdata, the connections writesockfd is nuked. But the connection is still used for the POST request. But the writesockfd is -1 now, the read_callback() is not called.
Please be patient, I won't be able to work on this for some days.
bagder commentedon Apr 27, 2018
I'm not convinced the problem is there. That part of the code sets up the request for this transfer. A transfer is pretty much a 1-1 mapping with a
Curl_easy
struct. Several (or even many) transfers may share a single connection (struct connectdata
) for transfers, like when doing HTTP/2 over it.The function that controls what gets waited for, per transfer, is the
multi_getsock
function. Your application callscurl_multi_fdset
which iterates over all current transfers and callsmulti_getsock
for each.steini2000 commentedon May 2, 2018
Did a little bit more research.
The problem might be solved somewhere else, but the problem can be seen there.
This part sets up exactly one GET request:
curl/lib/http.c
Lines 2792 to 2809 in 1621aed
It calls this function to setup the transfer:
curl/lib/transfer.c
Lines 1989 to 2017 in 1621aed
In setup_transfer() it sets conn->writesockfd (writesockfd of the connection !!!) to CURL_BAD_SOCK if writesockindex parameter is -1.
And writesockindex is -1 for a GET request (no postdata for this request).
So a GET requests always sets writesockfd of the connection (!!!) to CURL_BAD_SOCK.
For testing, I changed in setup_transfer():
conn->sockfd = sockindex == -1 ?
CURL_SOCKET_BAD : conn->sock[sockindex];
conn->writesockfd = writesockindex == -1 ?
CURL_SOCKET_BAD:conn->sock[writesockindex];
to
if (sockindex != -1) {
conn->sockfd = conn->sock[sockindex];
}
if (writesockindex != -1) {
conn->writesockfd = conn->sock[writesockindex];
}
and
if(conn->writesockfd != CURL_SOCKET_BAD) {
to
if((conn->writesockfd != CURL_SOCKET_BAD) && (writesockindex != -1)) {
With this changes, the read_callback() of the application is called.
But there are sometimes gaps of about 1 sec where the read_callback() is not called, so there are more problems...
bagder commentedon May 4, 2018
Your observation seems correct and it's curious this hasn't caused more problems. I think that's because
conn->writesockfd
isn't used everywhere. http2.c for example always accesses the socket withconn->sock[FIRSTSOCKET]
.So, while you've identified a problem correctly I unfortunately don't think that's the major one that causes your problems.
My proposed fix for this issue (and I'll file a separate PR in a sec) looks likes this:
bagder commentedon May 4, 2018
Hm, no that was a bit too quick. Doesn't work properly...
transfer: don't unset writesockfd on setup of multiplexed conns
bagder commentedon May 4, 2018
I polished it slightly and created #2549. But again: while I think this fixes a bug, I'm not convinced this is the fix for you. Let me know what you find out!
transfer: don't unset writesockfd on setup of multiplexed conns
11 remaining items