curl-library

Re: callback functions to support robots.txt

From: <man_at_tfhs.net>
Date: Fri, 10 Feb 2006 14:44:35 -0000

On Fri, Feb 10, 2006, Andy Curtis <acurtis_at_askjeeves.com> said:

> It seems that it would be nice to have a callback function which is
> called prior to connecting to a server, passing the URL that is about
> to be fetched. This should exist for the URL originally requested as
> well as for any redirects. With such a callback, one could make
> another request to the same server, or to some other server, for the
> robots.txt file.
>
> For example,
>
> on_url_request( const char * URL, ... ) {
>     // check for robots.txt;
>     // if not found, change the URL to the robots.txt file and/or
>     // make a request to some cache service.
>     // have some way of knowing to fetch the original URL after the
>     // success or failure of the given robots.txt.
> };
>
> on_url_response( ... ) {
>     // if the response is robots.txt, do one thing;
>     // if the response is something else, do another.
> }
>
> I guess my question is: why aren't there more callback hooks that can
> be tied to the curl_easy handle?

Because no one has offered to write and maintain that code. Do you feel
like volunteering? :)

In all seriousness, you could just manage the redirects yourself (which
you were going to do anyway because of the meta refresh tag), and curl
would not have to be changed at all.

allan

-- 
m. allan noah
IT Director, TfHS.net
ph# (804) 355-5489
tf# (866) 724-9722
fx# (804) 355-0477
Received on 2006-02-10