Re: CURLOPT_ACCEPT_ENCODING not working (help)

From: Ray Satiro via curl-library <curl-library_at_cool.haxx.se>
Date: Tue, 9 Feb 2021 15:32:10 -0500

On 2/9/2021 2:32 PM, Adrián Gimeno Balaguer via curl-library wrote:
>
> First of all, I split the search paths into per-module parts instead
> of using the main installation prefixes because of a permission issue
> in my CI/CD setup, which does not allow me to publish the resulting
> OpenSSL/ZLib install paths directly (both modules have a separate
> build pipeline for reusability). On the other hand, following your
> suggestion, I’ve confirmed that the correct dependent libraries are
> used at runtime by comparing the following output with the versions
> in the headers:
>
> libcurl/7.74.0-DEV OpenSSL/1.1.1i zlib/1.2.11
>
> Note the program prevents system libraries from leaking in by having
> LD_LIBRARY_PATH set pointing to a ‘bin’ subdirectory where the
> aforementioned libraries go. Also, setting CURLOPT_ACCEPT_ENCODING to
> “” doesn’t make a difference.
>
> So, recently I ran the following for a trial web service which returns
> gzipped data:
>
> CURL* pCURL;
> pCURL = curl_easy_init();
> curl_easy_setopt(pCURL, CURLOPT_URL, "http://httpbin.org/gzip");
> curl_easy_setopt(pCURL, CURLOPT_ACCEPT_ENCODING, "");
> curl_easy_setopt(pCURL, CURLOPT_VERBOSE, 1L);
> curl_easy_perform(pCURL);
> curl_easy_cleanup(pCURL);
>
> The response body is supposed to contain a demo JSON and I see that
> the above sample correctly displays the original data in the standard
> output:
>
> {
>   "gzipped": true,
>   "headers": {
>     "Accept": "*/*",
>     "Accept-Encoding": "deflate, gzip",
>     "Host": "httpbin.org",
>     "X-Amzn-Trace-Id": "Root=1-6022d6d8-6e2f080d3d5c1e001d076a2a"
>   },
>   "method": "GET",
>   "origin": "81.202.236.251"
> }
>
> * Connection #0 to host httpbin.org left intact
>
> So I believe my libcurl is correct. On the other hand, the equivalent
> output from the desired server (which is a REST API managed by a third
> party), where I’ll also include the relevant response headers, is as
> follows:
>
> HTTP/1.1 200 OK
> < Date: Tue, 09 Feb 2021 18:39:21 GMT
> < Content-Encoding: gzip
> < X-Powered-By: Undertow/1
> < Content-Type: application/xml;charset=UTF-8
> < Content-Length: 758
> < Set-Cookie: TS0171c831=0147f0636e20bb9c874b5ab4aad7726d9800ce6aafa5b7e8da21a7f47ea8dd54ee8da97abd26a11019badc63800405b90d7d95c801; Path=/
> <
> * Connection #0 to host the.domain.com left intact
>
> As you can see, the Content-Length is positive, while the body
> appears to be missing. In contrast, when I print the response gathered
> through my default program logic, which includes the CURLOPT_WRITEDATA
> and CURLOPT_WRITEFUNCTION setup and the like, I see only a single
> initial byte of the gzipped content, the next byte being a null
> terminator. I’m then able to decompress the data manually through the
> ZLib library and get the correct output (an XML file), passing in the
> stored response, which is wrapped in a std::string and should have
> exactly Content-Length bytes.
>
> If relevant, by any chance, their request also expects a certain
> gzipped XML file and my lines involving related headers setup are like
> the following:
>
> // Required per specifications (unlike "application/XML")
> curl_slist* pHTTPHeaders = curl_slist_append(NULL, "Content-Type: application/octet-stream");
> pHTTPHeaders = curl_slist_append(pHTTPHeaders, "Content-Encoding: gzip");
> curl_easy_setopt(pCURL, CURLOPT_ACCEPT_ENCODING, "gzip");
>
> Finally, the managers of that service confidently claim that they
> don’t send the data doubly compressed, but instead “my system probably
> compresses it upon receiving it”. I don’t know how this could make any
> sense at all. I haven’t yet tried to analyze the network packets to
> determine the right data nature.
>
> Thanks for any response in advance.
>
> Regards,
>
> Adrián
>
> *From: *Ray Satiro via curl-library <mailto:curl-library_at_cool.haxx.se>
> *Sent: *Friday, February 5, 2021 21:52
> *To: *curl-library_at_cool.haxx.se <mailto:curl-library_at_cool.haxx.se>
> *CC: *Ray Satiro <mailto:raysatiro_at_yahoo.com>
> *Subject: *Re: CURLOPT_ACCEPT_ENCODING not working (help)
>
> On 2/4/2021 10:07 AM, Adrián Gimeno Balaguer via curl-library wrote:
>
> I’m using a self-built libcurl shared library for embedded use in
> a C++ application. I’m trying to have it automatically decompress
> response data from a remote server of interest by using a line like
> the following in my request setup:
>
> curl_easy_setopt(mpCURL, CURLOPT_ACCEPT_ENCODING, "gzip");
>
> The returned data seems to remain compressed. I enabled the
> CURLOPT_HEADER option and can see that the server returns positive
> Content-Lengths, with few non-human readable characters in the
> body content. To be clear, removing the CURLOPT_ACCEPT_ENCODING
> option doesn’t make any difference.
>
> The libcurl library compilation is done in a custom automated
> CI/CD in the cloud which also compiles OpenSSL and ZLib as shared
> libraries in separate pipelines, with libcurl pipeline pulling
> from the master branch (from official libcurl’s repo) and a
> compilation script like the following:
>
> cmake -Bbuild -DBUILD_SHARED_LIBS=OFF
> -DCMAKE_BUILD_TYPE=MinSizeRel -DCMAKE_POSITION_INDEPENDENT_CODE=ON
> -DCURL_DISABLE_COOKIES=ON -DCURL_DISABLE_CRYPTO_AUTH=ON
> -DCURL_DISABLE_LDAP=ON -DCURL_DISABLE_PROXY=ON -DENABLE_IPV6=OFF
> -DENABLE_UNIX_SOCKETS=OFF -DHTTP_ONLY=ON
> -DOPENSSL_CRYPTO_LIBRARY=$(Pipeline.Workspace)/ssl/lib/libcrypto.so.1.1
> -DOPENSSL_INCLUDE_DIR=$(Pipeline.Workspace)/ssl/include
> -DOPENSSL_SSL_LIBRARY=$(Pipeline.Workspace)/ssl/lib/libssl.so.1.1
> -DZLIB_INCLUDE_DIR=$(Pipeline.Workspace)/zlib/include
> -DZLIB_LIBRARY=$(Pipeline.Workspace)/zlib/lib/libz.so.1 . && cmake
> --build build
>
> I can see the following possible relevant build output lines (not
> contiguous) that may indicate ZLib gets correctly integrated:
>
> Found ZLIB: /home/vsts/work/1/zlib/lib/libz.so.1 (found version
> "1.2.11")
>
> Enabled features: SSL libz AsynchDNS alt-svc HTTPS-proxy
>
> However, I’ve seen the following answer:
> https://stackoverflow.com/a/29966893. Looking at my generated
> libcurl headers, I can’t find any match for “HAVE_LIBZ” (nor in my
> build output).
>
> The proper way to specify OpenSSL and zlib locations is by using
> OPENSSL_ROOT_DIR [1] and ZLIB_ROOT variables [2], because we call
> cmake's find_package and that's what they use. For example:
>
> -DCMAKE_USE_OPENSSL=ON -DOPENSSL_ROOT_DIR=C:\somewhere -DCURL_ZLIB=ON
> -DZLIB_ROOT=C:\somewhere
>
> This assumes you've already installed openssl and zlib to those
> locations and they have lib,bin,etc. To set the install location when
> configuring zlib via cmake you can use -DCMAKE_INSTALL_PREFIX:PATH=.
> To set the install location when configuring openssl via Configure you
> can use --prefix= and --openssldir=.
>
> Setting ACCEPT_ENCODING to a specific string is almost never used
> correctly. What you should do instead is set it to an empty string ""
> and libcurl will only send the encodings it actually supports [3].
>
> Use printf("%s\n", curl_version()) to make sure your program is
> actually using the libcurl you built and not some other one in the path.
>
> [1]: https://github.com/Kitware/CMake/blob/master/Modules/FindOpenSSL.cmake
> [2]: https://github.com/Kitware/CMake/blob/master/Modules/FindZLIB.cmake
> [3]: https://curl.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
>

Please do not top-post; it makes the conversation hard to follow. [1]

It looks like you have solved the first problem and your build is able
to decode compressed content. To reiterate: don't set ACCEPT_ENCODING to
specific encodings. Set it to an empty string so libcurl only sends the
encodings it actually supports.

Your second problem seems to be with a specific server: you request gzip
encoding (Accept-Encoding) and it returns gzip encoding
(Content-Encoding). Does curl_easy_perform return an error? If not, then
curl should have received and decoded the data.
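
One rough way to check this from the application side is sketched
below. It is only an illustration under assumptions: struct body,
collect_body and looks_like_gzip are hypothetical names, not part of
libcurl. collect_body has the signature libcurl expects for a
CURLOPT_WRITEFUNCTION callback and appends by explicit length, so
embedded NUL bytes are kept (printing the buffer with %s would stop at
the first one). looks_like_gzip tests for the gzip header bytes defined
in RFC 1952. If curl_easy_perform returns CURLE_OK and the collected
body still starts with those bytes, the data probably reached the
application undecoded, or it was compressed twice by the sender.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; not part of libcurl. */
struct body {
    unsigned char *data;
    size_t len;
};

/* Same signature libcurl expects for CURLOPT_WRITEFUNCTION.
 * Appends exactly size * nmemb bytes, so binary data containing
 * NUL bytes is stored intact. */
static size_t collect_body(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    struct body *b = userdata;
    size_t n = size * nmemb;
    unsigned char *grown;

    if (n == 0)
        return 0; /* nothing to store */

    grown = realloc(b->data, b->len + n);
    if (!grown)
        return 0; /* returning less than n makes libcurl abort the transfer */

    memcpy(grown + b->len, ptr, n);
    b->data = grown;
    b->len += n;
    return n;
}

/* Returns 1 if the buffer starts with a gzip header as described in
 * RFC 1952: ID1 = 0x1f, ID2 = 0x8b, CM = 8 (deflate). */
static int looks_like_gzip(const unsigned char *buf, size_t len)
{
    return len >= 3 && buf[0] == 0x1f && buf[1] == 0x8b && buf[2] == 0x08;
}
```

With libcurl, the callback would be registered through
curl_easy_setopt(handle, CURLOPT_WRITEFUNCTION, collect_body) and
curl_easy_setopt(handle, CURLOPT_WRITEDATA, &b), where b is a
zero-initialized struct body. After the transfer, compare the return
value of curl_easy_perform with CURLE_OK, then call
looks_like_gzip(b.data, b.len) and free(b.data).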


[1]: https://curl.se/mail/etiquette.html#Do_Not_Top_Post




Received on 2021-02-09