curl-library
Re: how does max-filesize work over http?
Date: Sat, 20 Sep 2008 16:54:41 +0400
On Sat, Sep 20, 2008 at 3:29 PM, Richard Atterer
<richard_at_2008.atterer.net> wrote:
> On Sat, Sep 20, 2008 at 02:22:34PM +0400, Alexey A. Rybak wrote:
> > curl --max-filesize 1024 anyhost
>
> Works here! - for servers that actually send a Content-Length header.
>
I see. The problem is that you do not receive this header when the HTTP
response uses chunked transfer encoding.
> > But from the "physical" point of view I can't understand why we just
> > can't stop downloading when the size of HTTP response becomes greater
> > than the limit? Is there any possibility to do this?
> You can easily do this with CURLOPT_RANGE, i.e. sending a Range
> header with your request. Some servers may ignore that header, so you might
> also have to count how many bytes your callback function has received so
> far, and abort the download once you have received enough data. (The Range
> header, if recognized, has the advantage that your crawler can re-use the
> HTTP connection after the data was received.)
I'm afraid that Range will not work in a lot of cases either.
But you mentioned a callback, and that is what I hadn't thought about!
CURLOPT_WRITEFUNCTION seems to work fine for this, thank you!
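
A minimal sketch of such a byte-counting callback, for illustration only. The names limit_state and limit_write_cb are hypothetical and not part of libcurl; the function only follows the write-callback prototype libcurl expects. It relies on the documented behaviour that returning a value different from size * nmemb makes libcurl abort the transfer.

```c
#include <stddef.h>

/* Hypothetical per-transfer state; these names are illustrative,
 * not part of libcurl. */
struct limit_state {
    size_t received;  /* body bytes accepted so far */
    size_t limit;     /* maximum number of body bytes to accept */
};

/* Has the prototype of a libcurl write callback. Returns the number of
 * bytes consumed; any value other than size * nmemb tells libcurl to
 * abort the transfer. */
size_t limit_write_cb(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    struct limit_state *st = userdata;
    size_t nbytes = size * nmemb;

    (void)ptr;  /* a real callback would store or parse the data here */

    /* received never exceeds limit, so the subtraction cannot wrap. */
    if (nbytes > st->limit - st->received)
        return 0;  /* over the limit: signal an error to stop the download */

    st->received += nbytes;
    return nbytes;
}
```

To use it, one would pass limit_write_cb via CURLOPT_WRITEFUNCTION and a pointer to a limit_state via CURLOPT_WRITEDATA. When the callback stops the download, curl_easy_perform() should return CURLE_WRITE_ERROR, and that connection will not be re-used.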
-- wbr, fisher
Received on 2008-09-20