Skip to content

HTTP/2.0 multiplexing: broken POST request when re-adding GET request #2520

Closed
@steini2000

Description

@steini2000

I did this

  • enable HTTP/2.0 multiplexing
  • send GET request that is answered after 5s
  • also send POST request, send data infinitely in read_callback()
  • if GET request completes, re-add GET handle to multi handle immediately
  • read_callback() is maybe called one more time after re-adding the GET handle, but never again after that

The problem appears when adding the GET handle.
When adding the GET handle again is delayed, the read_callback() is called until shortly after the curl_multi_add_handle().

I expected the following

read_callback() is still called continuously, data can be send continuously.
With HTTP/2.0 multiplexing disabled, everything works as expected.

curl/libcurl version

build from source a3f3853
curl 7.60.0-DEV (x86_64-pc-linux-gnu) libcurl/7.60.0-DEV OpenSSL/1.0.2g zlib/1.2.8 nghttp2/1.31.0
Release-Date: [unreleased]
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS Debug TrackMemory IPv6 Largefile NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy

Also observed with curl 7.56 on ARM

operating system

Ubuntu 16.04

Servlet source and pcap on request.

http2-mpx.c.txt
http2-mpx.log.zip

Activity

geekodour

geekodour commented on Apr 25, 2018

@geekodour

@bagder is this a beginner friendly issue? I am exploring the codebase now, if it's a low complexity issue then would like to work on it.

bagder

bagder commented on Apr 25, 2018

@bagder
Member

Thanks for considering this @geekodour, but while I haven't yet investigated this issue very closely I think I can say that this is not a simple issue. It involves multiple HTTP/2 streams and something not being right in the data handling in (one of) them.

To give you a better picture, let me describe roughly how I imagine I'd proceed to work on this bug:

  1. Make use of @steini2000's test source and make it reproducible using a local HTTP server. I'd probably use my local apache server and just make sure h2c is enabled in it.
  2. Make sure that I can keep sending data to a URL on the server to do the POST stream as described.
  3. Run the app and verify that it shows the described symptoms
  4. Now, assuming that step (3) is successful, put some logging/printf code around the logic that calls the read callback for the POST stream to figure out what conditions that make it not call the read callback
  5. If the wrong conditions or failed check can be figured out, the next step is to understand why it ends up in that state. It would probably involve making sure we figure out exactly when the problem appears (as in when the other stream stops or when it is started again or something else).
  6. Single-step through the functions involved in triggering the bad condition and see what makes it go wrong
  7. log/printf more
  8. try changing the wrong code with something that makes sense
  9. rerun the test app and make sure it works, or go back to step 5 or 6 or 7.
  10. when it works in (9), run the entire test suite and make sure nothing else broke
  11. clean up the code change, leave only the relevant fixes. Maybe add a comment?
  12. commit the change in a new local branch
  13. create a PR on github and have the CI tests verify the code even further
bagder

bagder commented on Apr 26, 2018

@bagder
Member

My first attempt to reproduce this issue has failed. My test application (that only changed the two URLs in the source posted above) that downloads a 4KB file while doing the POST, managed to get it 5000 times without problems by the time I stopped it...

I'm using libcurl from current git master.

steini2000

steini2000 commented on Apr 27, 2018

@steini2000
ContributorAuthor

I'm also using git master now. And I've updated to Tomcat 9.0.7.

What I've found out so far:
read_callback() is not called because in Curl_readwrite(), select_res does not have the out bit sit.
The out bit is not set because for Curl_socket_check(), fd_write is -1.
fd_write is -1 because conn->writesockfd is -1.
conn->writesockfd is -1 because curl_setup_transfer() is called from Curl_http() in default case (GET request) with writesockindex -1 (no postdata).

I also changed the size of data for the GET request from a few bytes to ~4k , but I can still reproduce the issue.

I'll investigate further and report back next week.

steini2000

steini2000 commented on Apr 27, 2018

@steini2000
ContributorAuthor

The servlet I'm using for testing:
https://github.com/steini2000/curl-2520

steini2000

steini2000 commented on Apr 27, 2018

@steini2000
ContributorAuthor

I hope someone else can reproduce the issue.
This really doesn't look like a simple issue.

The problem is somewhere here:

curl/lib/http.c

Lines 2793 to 2808 in f84139f

default:
result = Curl_add_buffer(req_buffer, "\r\n", 2);
if(result)
return result;
/* issue the request */
result = Curl_add_buffer_send(req_buffer, conn,
&data->info.request_size, 0, FIRSTSOCKET);
if(result)
failf(data, "Failed sending HTTP request");
else
/* HTTP GET/HEAD download: */
Curl_setup_transfer(conn, FIRSTSOCKET, -1, TRUE, &http->readbytecount,
http->postdata?FIRSTSOCKET:-1,
http->postdata?&http->writebytecount:NULL);

Curl_setup_transfer() is called for the GET request. As there is no postdata, the connections writesockfd is nuked. But the connection is still used for the POST request. But the writesockfd is -1 now, the read_callback() is not called.

Please be patient, I won't be able to work on this for some days.

bagder

bagder commented on Apr 27, 2018

@bagder
Member

I'm not convinced the problem is there. That part of the code sets up the request for this transfer. A transfer is pretty much a 1-1 mapping with a Curl_easy struct. Several (or even many) transfers may share a single connection (struct connectdata) for transfers, like when doing HTTP/2 over it.

The function that controls what gets waited for, per transfer, is the multi_getsock function. Your application calls curl_multi_fdset which iterates over all current transfers and calls multi_getsock for each.

steini2000

steini2000 commented on May 2, 2018

@steini2000
ContributorAuthor

Did a little bit more research.

The problem might be solved somewhere else, but the problem can be seen there.
This part sets up exactly one GET request:

curl/lib/http.c

Lines 2792 to 2809 in 1621aed

default:
result = Curl_add_buffer(req_buffer, "\r\n", 2);
if(result)
return result;
/* issue the request */
result = Curl_add_buffer_send(req_buffer, conn,
&data->info.request_size, 0, FIRSTSOCKET);
if(result)
failf(data, "Failed sending HTTP request");
else
/* HTTP GET/HEAD download: */
Curl_setup_transfer(conn, FIRSTSOCKET, -1, TRUE, &http->readbytecount,
http->postdata?FIRSTSOCKET:-1,
http->postdata?&http->writebytecount:NULL);
}

It calls this function to setup the transfer:

curl/lib/transfer.c

Lines 1989 to 2017 in 1621aed

void
Curl_setup_transfer(
struct connectdata *conn, /* connection data */
int sockindex, /* socket index to read from or -1 */
curl_off_t size, /* -1 if unknown at this point */
bool getheader, /* TRUE if header parsing is wanted */
curl_off_t *bytecountp, /* return number of bytes read or NULL */
int writesockindex, /* socket index to write to, it may very well be
the same we read from. -1 disables */
curl_off_t *writecountp /* return number of bytes written or NULL */
)
{
struct Curl_easy *data;
struct SingleRequest *k;
DEBUGASSERT(conn != NULL);
data = conn->data;
k = &data->req;
DEBUGASSERT((sockindex <= 1) && (sockindex >= -1));
/* now copy all input parameters */
conn->sockfd = sockindex == -1 ?
CURL_SOCKET_BAD : conn->sock[sockindex];
conn->writesockfd = writesockindex == -1 ?
CURL_SOCKET_BAD:conn->sock[writesockindex];
k->getheader = getheader;

In setup_transfer() it sets conn->writesockfd (writesockfd of the connection !!!) to CURL_BAD_SOCK if writesockindex parameter is -1.
And writesockindex is -1 for a GET request (no postdata for this request).

So a GET requests always sets writesockfd of the connection (!!!) to CURL_BAD_SOCK.

For testing, I changed in setup_transfer():
conn->sockfd = sockindex == -1 ?
CURL_SOCKET_BAD : conn->sock[sockindex];
conn->writesockfd = writesockindex == -1 ?
CURL_SOCKET_BAD:conn->sock[writesockindex];
to
if (sockindex != -1) {
conn->sockfd = conn->sock[sockindex];
}
if (writesockindex != -1) {
conn->writesockfd = conn->sock[writesockindex];
}

and
if(conn->writesockfd != CURL_SOCKET_BAD) {
to
if((conn->writesockfd != CURL_SOCKET_BAD) && (writesockindex != -1)) {

With this changes, the read_callback() of the application is called.
But there are sometimes gaps of about 1 sec where the read_callback() is not called, so there are more problems...

bagder

bagder commented on May 4, 2018

@bagder
Member

Your observation seems correct and it's curious this hasn't caused more problems. I think that's because conn->writesockfd isn't used everywhere. http2.c for example always accesses the socket with conn->sock[FIRSTSOCKET].

So, while you've identified a problem correctly I unfortunately don't think that's the major one that causes your problems.

My proposed fix for this issue (and I'll file a separate PR in a sec) looks likes this:

diff --git a/lib/transfer.c b/lib/transfer.c
index 5c8eb31d3..db08a6b4d 100644
--- a/lib/transfer.c
+++ b/lib/transfer.c
@@ -2016,15 +2016,22 @@ Curl_setup_transfer(
   data = conn->data;
   k = &data->req;
 
   DEBUGASSERT((sockindex <= 1) && (sockindex >= -1));
 
-  /* now copy all input parameters */
-  conn->sockfd = sockindex == -1 ?
+  if(conn->bits.multiplex) {
+    /* when multiplexing, the read/write sockets need to be the same! */
+    conn->sockfd = sockindex == -1 ?
+      conn->sock[writesockindex] : conn->sock[sockindex];
+    conn->writesockfd = conn->sockfd;
+  }
+  else {
+    conn->sockfd = sockindex == -1 ?
       CURL_SOCKET_BAD : conn->sock[sockindex];
-  conn->writesockfd = writesockindex == -1 ?
+    conn->writesockfd = writesockindex == -1 ?
       CURL_SOCKET_BAD:conn->sock[writesockindex];
+  }
   k->getheader = getheader;
 
   k->size = size;
   k->bytecountp = bytecountp;
   k->writebytecountp = writecountp;
bagder

bagder commented on May 4, 2018

@bagder
Member

Hm, no that was a bit too quick. Doesn't work properly...

bagder

bagder commented on May 4, 2018

@bagder
Member

I polished it slightly and created #2549. But again: while I think this fixes a bug, I'm not convinced this is the fix for you. Let me know what you find out!

11 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @bagder@geekodour@steini2000

        Issue actions

          HTTP/2.0 multiplexing: broken POST request when re-adding GET request · Issue #2520 · curl/curl