curl-library

Re: Support for GZIP Content-Encoding in HTTP responses

From: Dan Fandrich <dan_at_coneharvesters.com>
Date: Thu, 24 Apr 2003 00:18:15 -0700

On Thu, Apr 24, 2003 at 08:31:38AM +0200, Daniel Stenberg wrote:
> On Wed, 23 Apr 2003, Dan Fandrich wrote:
> > While working on a gzip test script, I discovered that I forgot to update
> > the curl front-end to support gzip as well. That highlighted two
> > weaknesses in curl's content-encoding support. First, the CURLOPT_ENCODING
> > semantics requires that the application know which encodings are supported
> > by the curl library; there's no way to say "request all supported
> > content-encodings". That's easily changed by specifying that an empty (not
> > NULL) string set for CURLOPT_ENCODING means to request all supported
> > encodings. I can't think of any unexpected side effects of this change.
>
> I can. We need to be able to tell libcurl to disable the requesting for
> encoded content, so setting NULL would probably go back to how it was before.

That's how it works now and how I suggest we leave it--setting NULL means not
to send Accept-Encoding and to ignore a received Content-Encoding.

> Instead I suggest we add a "magic" string that means "all supported
> encodings" and I suggest we do that by recommending the use of a particular
> define, like this:
>
> curl_easy_setopt(handle, CURLOPT_ENCODING, CURLENCODE_ALL_SUPPORTED);
>
> ... with the CURLENCODE_ALL_SUPPORTED being set to something that is not
> likely to ever be confused with a true and actual encoding name.

My suggestion is to set CURLENCODE_ALL_SUPPORTED to "", i.e. an empty string.
That would be replaced inside the library with the set of Accept-Encodings
supported. I can't think of a reason anyone would want to send an
Accept-Encoding: line with no argument, although it appears to be legal. If
someone *really* wants to do that, presumably he could use CURLOPT_HTTPHEADER.

Your suggestion would work if CURLENCODE_ALL_SUPPORTED were set to "@" or
"<all>" or something similar that is illegal HTTP syntax and so will never
appear in real life. I don't care too much either way, but "" has a certain
felicity...
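
To make the proposal concrete, here's roughly what a caller would look like
if CURLENCODE_ALL_SUPPORTED were defined as "" -- the define itself is of
course just the suggestion above and not in curl.h, and the URL is only a
placeholder:

#include <curl/curl.h>

/* The proposed define -- not in curl.h, shown only to illustrate the
 * suggestion that it expand to an empty string. */
#define CURLENCODE_ALL_SUPPORTED ""

int main(void)
{
  CURL *handle = curl_easy_init();
  if(!handle)
    return 1;

  curl_easy_setopt(handle, CURLOPT_URL, "http://example.com/");

  /* "" would mean "send an Accept-Encoding: header listing everything
   * libcurl supports and decode whatever comes back". Passing NULL
   * keeps today's behaviour: no Accept-Encoding sent, and a received
   * Content-Encoding left alone. */
  curl_easy_setopt(handle, CURLOPT_ENCODING, CURLENCODE_ALL_SUPPORTED);

  curl_easy_perform(handle);
  curl_easy_cleanup(handle);
  return 0;
}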

>
> > Secondly, there's no way to see what Content-Encoding the server actually
> > sent on an CURLE_BAD_CONTENT_ENCODING return code or if CURLOPT_ENCODING
> > isn't set. In the more general case, I don't see a way to look at
> > arbitrary HTTP headers except by enabling CURLOPT_HEADER and parsing them
> > myself. This is a less severe problem than the first.
>
> I'm not sure I understand the implications of this problem. What are the
> drawbacks you see from this limitation?

It's just a bit more work for the programmer who wants digested headers.
It would be useful to have a function that returns all the headers in a
struct curl_slist or some other preprocessed form. It's not something
I need right now, but it's something that could be used to elegantly
bloat libcurl if people think it's not big enough yet ;-)
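
In the meantime, the workaround looks something like this: install a
CURLOPT_HEADERFUNCTION callback and fish the Content-Encoding line out by
hand. This is only a sketch -- the callback name, buffer size and URL are
made up, and the header matching is deliberately simplistic (a real
application would compare case-insensitively and trim the value):

#include <stdio.h>
#include <string.h>
#include <curl/curl.h>

/* Called once per response header line; copy the Content-Encoding value
 * (if any) into the buffer handed to us via CURLOPT_WRITEHEADER. The copy
 * still carries its leading space and trailing CRLF. */
static size_t header_cb(void *ptr, size_t size, size_t nmemb, void *userdata)
{
  size_t len = size * nmemb;
  char *line = (char *)ptr;
  char *encoding = (char *)userdata;

  if(len > 17 && strncmp(line, "Content-Encoding:", 17) == 0) {
    size_t n = len - 17;
    if(n > 63)
      n = 63;
    memcpy(encoding, line + 17, n);
    encoding[n] = '\0';
  }
  return len; /* tell libcurl the whole line was handled */
}

int main(void)
{
  char encoding[64] = "";
  CURL *handle = curl_easy_init();
  if(!handle)
    return 1;

  curl_easy_setopt(handle, CURLOPT_URL, "http://example.com/");
  curl_easy_setopt(handle, CURLOPT_ENCODING, "gzip"); /* or "" per the proposal */
  curl_easy_setopt(handle, CURLOPT_HEADERFUNCTION, header_cb);
  curl_easy_setopt(handle, CURLOPT_WRITEHEADER, encoding);

  curl_easy_perform(handle);
  printf("Server sent Content-Encoding:%s\n", encoding);

  curl_easy_cleanup(handle);
  return 0;
}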

Dan

-- 
http://www.MoveAnnouncer.com              The web change of address service
          Let webmasters know that your web site has moved