curl-library

Re: Problem with CONTENT-ENCODING

From: Michael Wallner <mike_at_iworks.at>
Date: Wed, 14 Dec 2005 13:26:53 +0100

Hi all,

Sorry for jumping in so late, but I have actually had to cope with that
issue over the last few days too (though not within curl directly).

To me there seem to be two issues. The first is a deflate
Content-Encoding header with a gzipped body (I'll leave that out for now,
because I haven't seen it so far). The second is deflated/compressed
bodies with and without zlib header bytes.

Yeah, some research revealed that compress and deflate are actually the
same, and the difference can, but need not, be how deflateInit2() was
called: with MAX_WBITS, which gives you the zlib header bytes at the
beginning of the encoded data, or with -MAX_WBITS, which makes deflate()
leave them out.
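
To illustrate the difference, here is a minimal sketch using Python's
zlib module, which wraps the same library and passes its wbits argument
on as the windowBits value. The deflate() helper and the payload are
made up for the example; only the zlib calls are real.

```python
import zlib

def deflate(data: bytes, wbits: int) -> bytes:
    """Hypothetical helper: compress data with the given window bits."""
    c = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, wbits)
    return c.compress(data) + c.flush()

payload = b"hello hello hello hello"

# Positive window bits: zlib format (2 header bytes, Adler-32 trailer).
wrapped = deflate(payload, zlib.MAX_WBITS)

# Negative window bits: raw deflate stream, no header and no trailer.
raw = deflate(payload, -zlib.MAX_WBITS)

# The low nibble of the first zlib header byte is the compression method.
print(hex(wrapped[0]), wrapped[0] & 0x0F == zlib.DEFLATED)

# The wrapped form is the raw stream plus 2 + 4 framing bytes.
print(len(wrapped) - len(raw))
```

A raw stream has no such header, and nothing stops its first byte from
having the same low nibble by coincidence, which is why a first-byte
check alone cannot tell the two forms apart.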

As curl only needs to inflate, I'll only talk about that and how I solved
the issue. Deflated/compressed data may arrive with or without those
zlib header bytes, and checking whether the first byte is Z_DEFLATED looks
rather unreliable to me, so I'd suggest the following:

- inline everything (except the gzip checks) into one function
- make the z_stream only live within that function
- try to inflate the stream first with inflateInit2(&z, -MAX_WBITS ...)
- if zlib returns Z_DATA_ERROR try with inflateInit2(&z, MAX_WBITS ...)
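
The steps above could be sketched roughly as follows, again with Python's
zlib module rather than the C API. inflate_either() is a hypothetical
name, and like http_encoding_inflate() this version decodes the whole
buffer at once instead of working chunk by chunk as curl would have to.
Python reports every zlib failure as zlib.error, so the sketch retries on
any error, not only on Z_DATA_ERROR.

```python
import zlib

def inflate_either(data: bytes) -> bytes:
    """Hypothetical helper: inflate raw or zlib-wrapped deflate data."""
    try:
        # First attempt: raw deflate, i.e. negative window bits.
        return zlib.decompress(data, -zlib.MAX_WBITS)
    except zlib.error:
        # Fallback: expect the zlib header and Adler-32 trailer.
        return zlib.decompress(data, zlib.MAX_WBITS)

payload = b"hello hello hello hello"

# zlib-wrapped input, as produced with positive window bits.
wrapped = zlib.compress(payload)

# Raw input, as produced with negative window bits.
c = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED,
                     -zlib.MAX_WBITS)
raw = c.compress(payload) + c.flush()

print(inflate_either(wrapped) == payload)
print(inflate_either(raw) == payload)
```

Each call to zlib.decompress() sets up and tears down its own stream
state, which matches the idea of keeping the z_stream local to one
function.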

This should work fine with the inner part of gzip data as well as deflated
or compressed data and also make things easier in content_encoding.c.

You can look at http_encoding_inflate() here, but note that it tries to
decode the data as a whole, rather than in several chunks as curl does:
http://cvs.php.net/viewcvs.cgi/pecl/http/http_encoding_api.c?view=markup

Regards,

-- 
Michael - <mike(@)php.net> http://dev.iworks.at/ext-http/http-functions.html.gz
Received on 2005-12-14