Re: Curl might not be decompressing the body if the server sends it compressed without being asked to do so

From: Isaac Boukris <iboukris_at_gmail.com>
Date: Sat, 2 Jul 2016 12:28:56 +0300

On Fri, Jul 1, 2016 at 11:53 PM, Dan Fandrich <dan_at_coneharvesters.com> wrote:
> On Fri, Jul 01, 2016 at 09:52:07PM +0200, Daniel Stenberg wrote:
>> On Fri, 1 Jul 2016, Attila-Mihaly Balazs wrote:
>>
>> >I can't reliably reproduce this, but I think that if the server sends the
>> >response compressed (ie. "Content-Encoding: gzip") without curl asking for
>> >this (ie. the "--compressed" flag was not specified), curl doesn't
>> >decompress the body.
>>
>> Correct, and it never has. "Content-Encoding: gzip" is in fact just a signal
>> from the server that the data is compressed, it isn't telling the user-agent
>> to automatically decompress it - even if it has turned out to become that a
>> lot of the time...
>>
>> So it is by design. Maybe we should reconsider that design decision...
>
> If I want to download a compressed tarball (for example) I don't want curl
> uncompressing it behind my back without my asking for it. If I do want it
> uncompressed, I'll give it the --compressed option. There may be an argument to
> make --compressed the default so a tarball download would need --no-compressed
> to be given, but that's a pretty big compatibility break.

To my understanding a tar.gz file should not be served with
"Content-Encoding: gzip", and if you do not advertise
"Accept-Encoding: gzip" the server should not respond with
"Content-Encoding: gzip" anyway (instead it should use "Content-Type:
application/x-gzip" to denote the media type).

See the quote below from RFC 7231, Section 3.1.2.2:
   If the media type includes an inherent encoding, such as a data
   format that is always compressed, then that encoding would not be
   restated in Content-Encoding even if it happens to be the same
   algorithm as one of the content codings. Such a content coding would
   only be listed if, for some bizarre reason, it is applied a second
   time to form the representation. Likewise, an origin server might
   choose to publish the same data as multiple representations that
   differ only in whether the coding is defined as part of Content-Type
   or Content-Encoding, since some user agents will behave differently
   in their handling of each response (e.g., open a "Save as ..." dialog
   instead of automatic decompression and rendering of content).
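
To make the distinction concrete, here is a minimal sketch of a server-side policy that follows the reading given above: a resource whose media type is already gzip is sent untouched with no "Content-Encoding" header, and other content is gzip-coded only when the request lists gzip in "Accept-Encoding". The function name build_response and its parameters are hypothetical, made up for this illustration; they do not come from curl or from any real server. The Accept-Encoding parsing is deliberately crude and ignores q-values.

```python
import gzip

# Media types that are inherently gzip-compressed (assumed list for this sketch).
GZIP_MEDIA_TYPES = ("application/x-gzip", "application/gzip")


def build_response(body, media_type, accept_encoding=""):
    """Hypothetical helper: choose headers and payload for a reply.

    body            -- bytes of the stored resource
    media_type      -- value to send in Content-Type
    accept_encoding -- raw Accept-Encoding request header ("" if absent)
    """
    headers = {"Content-Type": media_type}

    # Crude parse: split on commas and drop any ";q=..." parameter.
    # A real server would also honour q=0, which this sketch does not.
    offered = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
        if token.strip()
    }

    already_compressed = media_type in GZIP_MEDIA_TYPES
    if "gzip" in offered and not already_compressed:
        # gzip is applied as a content coding, so it is announced as one.
        headers["Content-Encoding"] = "gzip"
        body = gzip.compress(body)

    return headers, body


# A tarball is stored compressed; its compression is part of the media type.
tarball = gzip.compress(b"pretend this is a tar archive")
tar_headers, tar_payload = build_response(tarball, "application/x-gzip", "gzip")

# A page requested with and without gzip advertised by the client.
page = b"<html>hello</html>"
coded_headers, coded_payload = build_response(page, "text/html", "gzip, deflate")
plain_headers, plain_payload = build_response(page, "text/html")
```

With this policy a client that strips content codings never unpacks the tarball by accident, because the tarball response carries no "Content-Encoding" header at all.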

However, I agree that there is no reason to change the default here.
If the server only serves encoded content, the client can choose to
support the encoding.
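
As a rough illustration of that opt-in behaviour, the sketch below models a client that hands the body over exactly as received unless the caller asked for decoding, similar in spirit to passing or omitting --compressed. This is not curl's implementation; receive_body and want_decoding are hypothetical names, and the headers are assumed to be a plain dict with the exact key "Content-Encoding".

```python
import gzip


def receive_body(raw, headers, want_decoding=False):
    """Hypothetical client helper: decode a gzip content coding only on request.

    raw           -- bytes as they arrived on the wire
    headers       -- response headers as a dict (exact-case keys assumed)
    want_decoding -- True when the caller opted in to decoding
    """
    coding = headers.get("Content-Encoding", "").strip().lower()
    if want_decoding and coding == "gzip":
        return gzip.decompress(raw)
    # No opt-in, or no gzip coding announced: deliver the bytes untouched.
    return raw


# A server that sends gzip-coded content without being asked to.
original = b"hello, world\n"
wire_bytes = gzip.compress(original)
response_headers = {"Content-Type": "text/plain", "Content-Encoding": "gzip"}

as_received = receive_body(wire_bytes, response_headers)
decoded = receive_body(wire_bytes, response_headers, want_decoding=True)
```

Without the opt-in, as_received still holds the compressed bytes, which matches the behaviour reported at the start of this thread.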
Received on 2016-07-02