curl-library

Re: replacing libwww's webbot with libcurl

From: Dan Fandrich <dan_at_coneharvesters.com>
Date: Sat, 13 Mar 2004 18:55:02 -0800

On Sat, Mar 13, 2004 at 06:38:08PM -0800, jmzorko_at_mac.com wrote:
> Hello, all ...
>
> I looked for this in the archives, but didn't find an answer.
> I want to replace libwww's webbot with my own app using libcurl (webbot
> has problems with hanging forever on certain URLs). What I want to
> know is:
>
> How do I set up libcurl to crawl a website, i.e. not download any data,
> just report back the URLs found under a root URL given to it? For example,
> given www.apple.com, how do I make libcurl report back URLs under that,
> like www.apple.com/ipod, www.apple.com/macosx, etc.? Is there a callback
> option for this, i.e. something like CURLOPT_WRITEFUNCTION,
> CURLOPT_WRITEDATA, etc.?

curl is limited to transferring data from one point to another; interpreting
that data is out of its scope. Take a look at
http://curl.haxx.se/curlprograms.html to see how other programs handle the
problem (e.g. recursiveftpget.pl) or
http://curl.haxx.se/libcurl/relatedlibs.html for some libraries that may be
useful.
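
To make the division of labour concrete, here is a minimal sketch of the
application side, assuming the page body has to be fetched anyway for its
links to be found. The names collect_body, extract_links, print_link and
struct page are hypothetical, not part of libcurl. collect_body only has the
signature that CURLOPT_WRITEFUNCTION expects; the href scan is a naive
substring search, not a real HTML parser, and it does not resolve relative
URLs.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer that accumulates the response body. */
struct page {
    char *data;
    size_t len;
};

/* Same signature as a CURLOPT_WRITEFUNCTION callback: appends size*nmemb
 * bytes to the struct page that would be passed via CURLOPT_WRITEDATA. */
size_t collect_body(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    struct page *pg = userdata;
    size_t n = size * nmemb;
    char *grown = realloc(pg->data, pg->len + n + 1);

    if (grown == NULL)
        return 0; /* a short count tells libcurl to abort the transfer */
    memcpy(grown + pg->len, ptr, n);
    pg->data = grown;
    pg->len += n;
    pg->data[pg->len] = '\0'; /* keep the buffer usable as a C string */
    return n;
}

/* Called once per link found; url is NOT NUL-terminated, use len. */
typedef void (*link_cb)(const char *url, size_t len, void *ctx);

/* Naive scan for href="..." attributes. Returns the number of links seen.
 * Assumption: double-quoted, lowercase href; anything else is missed. */
size_t extract_links(const char *html, link_cb on_link, void *ctx)
{
    static const char key[] = "href=\"";
    const char *p = html;
    size_t count = 0;

    assert(html != NULL);
    while ((p = strstr(p, key)) != NULL) {
        const char *start = p + sizeof key - 1;
        const char *end = strchr(start, '"');

        if (end == NULL)
            break; /* unterminated attribute: stop scanning */
        if (on_link != NULL)
            on_link(start, (size_t)(end - start), ctx);
        count++;
        p = end + 1;
    }
    return count;
}

/* Example callback: print each link on its own line. */
void print_link(const char *url, size_t len, void *ctx)
{
    (void)ctx;
    printf("%.*s\n", (int)len, url);
}
```

In a real program you would include <curl/curl.h>, set collect_body with
curl_easy_setopt(handle, CURLOPT_WRITEFUNCTION, collect_body), pass the
address of a zero-initialised struct page via CURLOPT_WRITEDATA, call
curl_easy_perform(handle), and then run extract_links(pg.data, print_link,
NULL) on the result. A proper HTML parser would likely be more robust than
the substring search shown here.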

>>> Dan

-- 
http://www.MoveAnnouncer.com              The web change of address service
          Let webmasters know that your web site has moved
Received on 2004-03-14