curl-library
how does max-filesize work over http?
Date: Sat, 20 Sep 2008 14:22:34 +0400
Hi, all
I wanted to patch the curl PHP extension to support the max-filesize option,
but I found that I can't get this option to work with HTTP at all :)
I have a search-like crawler which downloads pages via HTTP, so I wanted to
limit the download size to some reasonable value.
However, even with CURLOPT_MAXFILESIZE set properly, the limit is not enforced.
I started to play with the command-line version, but found that I can't make
it work from the command line either! :)
The following command completes the download even if the HTTP response is
much larger than the given limit of 1024 bytes:
curl --max-filesize 1024 anyhost
Can anybody explain to me where to dig in? Perhaps I just misunderstand how
this option works?
Yes, there is a NOTE saying:
The file size is not always known prior to download, and for
such files this option has no effect even if the file transfer ends up
being larger than this given limit. This concerns both FTP and
HTTP transfers
But from the "physical" point of view I can't understand why we can't just
stop downloading once the HTTP response grows larger than the limit. Is there
any way to do this?
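To make the idea concrete, here is a minimal sketch of "stop once the limit is exceeded", assuming the byte counting is done in a write callback. The callback has the same signature libcurl expects for CURLOPT_WRITEFUNCTION, and libcurl aborts a transfer when the callback returns a count different from the number of bytes it was handed. The names limit_state, limited_write and simulate_transfer are made up for illustration, and simulate_transfer is only a stand-in for the real transfer loop so that the sketch needs no network or libcurl.

```c
#include <stddef.h>

/* Hypothetical per-transfer state: bytes accepted so far and the cap.
 * Assumes received <= max_bytes when the transfer starts. */
struct limit_state {
    size_t received;
    size_t max_bytes;
};

/* Same signature as a libcurl write callback (CURLOPT_WRITEFUNCTION).
 * Returning anything other than size * nmemb signals "abort". */
static size_t limited_write(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    struct limit_state *st = userdata;
    size_t chunk = size * nmemb;

    (void)ptr; /* a real crawler would store or parse the data here */

    /* Reject the chunk if it would push the total past the cap. */
    if (chunk > st->max_bytes - st->received)
        return 0;

    st->received += chunk;
    return chunk;
}

/* Stand-in for the transfer loop (not libcurl): feeds `total` bytes to the
 * callback in pieces of at most `chunk_size` bytes.
 * Returns 1 if everything was accepted, 0 if the callback aborted. */
static int simulate_transfer(size_t total, size_t chunk_size,
                             struct limit_state *st)
{
    static char buf[4096];
    size_t sent = 0;

    if (chunk_size == 0 || chunk_size > sizeof buf)
        return 0;

    while (sent < total) {
        size_t n = total - sent < chunk_size ? total - sent : chunk_size;
        if (limited_write(buf, 1, n, st) != n)
            return 0; /* callback refused the data: transfer is cut off */
        sent += n;
    }
    return 1;
}
```

With real libcurl, the callback would be registered with curl_easy_setopt(handle, CURLOPT_WRITEFUNCTION, limited_write) and the state passed through CURLOPT_WRITEDATA; curl_easy_perform() should then fail with CURLE_WRITE_ERROR once the cap is hit. This works regardless of whether the server announces the size up front, at the cost of having already downloaded up to max_bytes.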
Thanx!
curl --version
curl 7.16.4 (x86_64-suse-linux-gnu) libcurl/7.16.4 OpenSSL/0.9.8e zlib/1.2.3
libidn/1.0
Protocols: tftp ftp telnet dict ldap http file https ftps
Features: IDN IPv6 Largefile NTLM SSL libz
-- wbr, fisher
Received on 2008-09-20