curl-library
Working around server bugs in deflate encoding
Date: Tue, 02 Nov 2004 00:16:54 +0530
Hi,
I've been having some issues with deflate-encoded content. Some servers
don't send the zlib header in front of the encoded data, and zlib then
refuses to decode it.
Here is a URL that (currently) exhibits the bug:
http://timesofindia.indiatimes.com/cms.dll/html/comp//articleshow/856302.cms
Apparently other folks have run into the same issue before.
See thread starting here: http://curl.haxx.se/mail/lib-2003-08/0204.html
The attached patch (diffed against 7.12.1) works around this: if the
first inflate() call fails with Z_DATA_ERROR, it resets the stream, feeds
zlib a dummy two-byte header, and then retries with the received data.
My patch is based on equivalent code from mozilla which can be found here:
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/netwerk/streamconv/converters/nsHTTPCompressConv.cpp&mark=255-280#255
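For reference, the two dummy bytes in the patch work out to 0x78 0x01, a
minimal zlib (RFC 1950) header: deflate method, 32 KB window, and check
bits chosen so that the 16-bit header value is a multiple of 31. The
sketch below only reproduces that arithmetic. make_zlib_header() and
zlib_header_ok() are hypothetical names made up for illustration; they are
not part of libcurl or zlib.

```c
#include <assert.h>

/* Hypothetical helper (not a libcurl or zlib function): build a 2-byte
   zlib header for the given CMF byte, with FDICT and FLEVEL left at 0. */
static void make_zlib_header(unsigned char cmf, unsigned char out[2])
{
  /* CMF: low nibble is the compression method (8 = deflate),
     high nibble is log2(window size) - 8 (7 = 32 KB window). */
  unsigned int base = (unsigned int)cmf * 0x100;

  /* Round up to the next multiple of 31. This adds at most 30, so only
     the five FCHECK bits of the second byte are touched. */
  unsigned int word = (base + 30) / 31 * 31;

  assert(word % 31 == 0);
  out[0] = cmf;
  out[1] = (unsigned char)(word & 0xFF);
}

/* Hypothetical helper: returns 1 if the two bytes look like a valid
   zlib header (deflate method, window <= 32 KB, multiple of 31). */
static int zlib_header_ok(const unsigned char h[2])
{
  return (h[0] & 0x0F) == 8 &&
         (h[0] >> 4) <= 7 &&
         ((h[0] * 256u + h[1]) % 31) == 0;
}
```

Called with cmf = 0x8 + 0x7 * 0x10 (that is, 0x78), make_zlib_header()
produces the same two bytes as the dummy_head initializer in the patch.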
Since other user agents have such workarounds, and since this seems to
affect several different server implementations of deflate encoding,
I think it makes sense to have this hack in libcurl too.
Harshal
Index: content_encoding.c
===================================================================
RCS file: /cvs/DiscoveryServer/main/newsrc/External/curl/lib/content_encoding.c,v
retrieving revision 1.2
diff -u -r1.2 content_encoding.c
--- content_encoding.c 13 Aug 2004 14:10:36 -0000 1.2
+++ content_encoding.c 1 Nov 2004 18:35:08 -0000
@@ -104,6 +104,26 @@
z->avail_out = DSIZ;
status = inflate(z, Z_SYNC_FLUSH);
+ if (status == Z_DATA_ERROR) {
+ /* some servers (notably Apache with mod_deflate) don't generate
+ zlib headers; insert a dummy header and try again */
+ static char dummy_head[2] =
+ {
+ 0x8 + 0x7 * 0x10,
+ (((0x8 + 0x7 * 0x10) * 0x100 + 30) / 31 * 31) & 0xFF,
+ };
+ inflateReset(z);
+ z->next_in = (Bytef*) dummy_head;
+ z->avail_in = sizeof(dummy_head);
+
+ status = inflate(z, Z_NO_FLUSH);
+ if (status == Z_OK) {
+ z->next_in = (Bytef *)k->str;
+ z->avail_in = (uInt)nread;
+ status = inflate(z, Z_SYNC_FLUSH);
+ }
+ }
+
if (status == Z_OK || status == Z_STREAM_END) {
if (DSIZ - z->avail_out) {
result = Curl_client_write(data, CLIENTWRITE_BODY, decomp,
Received on 2004-11-01