Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empty Accept-Encoding: Header doesn't override CURLOPT_ACCEPT_ENCODING on 2nd request with reused handle #785

Closed
rcanavan opened this issue Apr 29, 2016 · 15 comments

Comments

@rcanavan
Copy link

The test program below (modified from examples/httpcustomheader.c) attempts to send 2 requests to a host, and reuse the curl handle. On the second request, it attempts to disable CURLOPT_ACCEPT_ENCODING by explicitly setting the Accept-Encoding:-header without a value.

This works as expected (i.e. no Accept-Encoding header is sent in the request) only if a "fresh" handle is used or no previous request with that handle has ever actually sent a request with the Accept-Encoding header set via CURLOPT_ACCEPT_ENCODING.

#include <stdio.h>
#include <curl/curl.h>
CURL *curl;

void request(char * url, int compress)
{
  CURLcode res;

  if(curl) {
    struct curl_slist *chunk = NULL;
    curl_easy_setopt(curl, CURLOPT_URL, url);
    res = curl_easy_setopt(curl, CURLOPT_ACCEPT_ENCODING, "");
    if (compress == 0) {
        chunk = curl_slist_append(chunk, "Accept-Encoding: ");
    }
    res = curl_easy_setopt(curl, CURLOPT_HTTPHEADER, chunk);
    //curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);

    res = curl_easy_perform(curl);
    if(res != CURLE_OK) fprintf(stderr, "curl_easy_perform() failed: %s\n", curl_easy_strerror(res));

    curl_easy_reset(curl);
    curl_slist_free_all(chunk);
  }
}

int main(void)
{
  curl_global_init(CURL_GLOBAL_ALL);
  curl = curl_easy_init();
  request("http://localhost:14080/headers.php", 1);
  request("http://localhost:14080/headers.php", 0);
  curl_easy_cleanup(curl);
  return 0;
}
@bagder bagder added the HTTP label Apr 29, 2016
@bagder
Copy link
Member

bagder commented Apr 29, 2016

The problem here is that CURLOPT_ACCEPT_ENCODING set to an empty string has a meaning other than please send a blank header. The man page says:

If a zero-length string is set like "", then an Accept-Encoding: header containing all built-in supported encodings is sent.

So you're really asking for two mutually exclusive things in that request and it's not immediately clear which of the two instructions that should override the other...

@bagder
Copy link
Member

bagder commented May 1, 2016

Could we perhaps make this clearer in the documentation?

@iboukris
Copy link
Contributor

iboukris commented May 1, 2016

Perhaps it would help to mention that one may set CURLOPT_ACCEPT_ENCODING to a NULL pointer in order to disable it (both sending the header and auto decoding) in case it was previously enabled on the given easy handle.

BTW, I find the statement "identity, which does nothing" to be a bit misleading, as setting 'identity' does in fact enable auto decoding (e.g. if the header was overridden, or if the server sends gzip despite what the client has specified) and regardless, sending the header with 'identity' only isn't exactly nothing.
Also worth mentioning that the content length header received may not match the actual length of the data in case auto decoding is enabled (I reached the above understanding by troubleshooting a customer issue with our app).

bagder added a commit that referenced this issue May 1, 2016
@bagder
Copy link
Member

bagder commented May 1, 2016

Ok, I committed an update to this man page, have a look at CURLOPT_ACCEPT_ENCODING for the new version.

Better? Anything lacking now?

iboukris added a commit to iboukris/curl that referenced this issue May 1, 2016
Clarify that CURLOPT_ACCEPT_ENCODING has to be explicitly disabled
only if it was previously enabled (as the default is NULL).

Mention possible content-length mismatch with sum of bytes reported
by write callbacks when auto decoding is enabled.

See curl#785
@iboukris
Copy link
Contributor

iboukris commented May 1, 2016

Yea, but take a look at little adjustments: iboukris@7a329b8

bagder pushed a commit that referenced this issue May 1, 2016
Mention possible content-length mismatch with sum of bytes reported
by write callbacks when auto decoding is enabled.

See #785
@bagder
Copy link
Member

bagder commented May 1, 2016

Thanks! I merged the second part of that change. I'm not comfortable with adding the "if it was set to something else before you can disable it again with" text. That is sort of implied if you need to disable it, and I would rather not add that extra text for every man page description we have that tells how to switch an option back to default state.

@iboukris
Copy link
Contributor

iboukris commented May 2, 2016

Make sense, thanks!

@bagder
Copy link
Member

bagder commented May 2, 2016

@rcanavan, you happy with this explanation or are you looking for something else?

@iboukris
Copy link
Contributor

iboukris commented May 2, 2016

Hmm, I've looked again at @rcanavan's example and gave it a quick run, and the behavior actually seem strange indeed.
It looks like if a connection is reused then setting empty "Accept-Encoding:" does not disable sending of the header (if I forbid reused it'd work as expected).
I'll try to take a closer look later on.

@bagder
Copy link
Member

bagder commented May 2, 2016

But again, you can't both set CURLOPT_ACCEPT_ENCODING (to non-NULL) and ask for disabling the header.

@iboukris
Copy link
Contributor

iboukris commented May 2, 2016

But the doc of CURLOPT_HTTPHEADER says:
If you add a header that is otherwise generated and used by libcurl internally, your added one will be used instead. If you add a header with no content as in 'Accept:' (no data on the right side of the colon), the internally used header will get disabled.

And it does work as one would expect if a connection is not reused.

@bagder
Copy link
Member

bagder commented May 2, 2016

Perhaps something like this makes it behave more deterministic?

diff --git a/lib/http.c b/lib/http.c
index f3805cc..2a7280d 100644
--- a/lib/http.c
+++ b/lib/http.c
@@ -1915,10 +1915,14 @@ CURLcode Curl_http(struct connectdata *conn, bool *done)
     conn->allocptr.accept_encoding =
       aprintf("Accept-Encoding: %s\r\n", data->set.str[STRING_ENCODING]);
     if(!conn->allocptr.accept_encoding)
       return CURLE_OUT_OF_MEMORY;
   }
+  else {
+    Curl_safefree(conn->allocptr.accept_encoding);
+    conn->allocptr.accept_encoding = NULL;
+  }

 #ifdef HAVE_LIBZ
   /* we only consider transfer-encoding magic if libz support is built-in */

   if(!Curl_checkheaders(conn, "TE:") &&

@iboukris
Copy link
Contributor

iboukris commented May 2, 2016

This works, and it looks correct as well.

@rcanavan
Copy link
Author

rcanavan commented May 2, 2016

That patch does indeed make libcurl behave consistently, not only accross reused connections, but also when adding values to Accept-Encoding: as opposed to trying to unset it. And it is exactly this consistency that I was looking for, and the assurance that other headers, when set first with a value via CURLOPT_HTTPHEADER and then unset, behave just as consistently when reusing a handle.

@bagder bagder closed this as completed in 96eb9a8 May 2, 2016
@bagder
Copy link
Member

bagder commented May 2, 2016

Yeah sorry for overseeing the consistency argument originally. I've just merged this.

Thanks both of you for your help!

@lock lock bot locked as resolved and limited conversation to collaborators May 7, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Development

No branches or pull requests

3 participants