curl-users

Re: Fetching only the destination URLs

From: Dan Fandrich <dan_at_coneharvesters.com>
Date: Tue, 10 Aug 2010 23:57:46 -0700

On Wed, Aug 11, 2010 at 09:36:12AM +0800, Phoenix wrote:
> I'm now discovering that some cretins give us URLs that are actually
> forwarded to other URLs which are in turn forwarded to other URLs.
>
> So I'm trying to write a script that tells me the *final* destination URL.
>
> I think I can do this with CURL, right? My humble code for now:
>
>
> $url = 'http://example.com';
> $ch = curl_init();
> curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
> curl_setopt($ch, CURLOPT_URL, $url);
> curl_setopt($ch, CURLOPT_HEADER, 0);
> curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
> curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
> curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
> curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookies.txt');
> curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookies.txt');
> curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
> curl_setopt($ch, CURLOPT_TIMEOUT, 30);
> curl_setopt($ch, CURLOPT_HEADER, true);
> $url = curl_exec($ch);
> curl_close($ch);
> echo $url;

This looks like PHP/CURL code. There's a separate mailing list for
questions about that. Note that you set CURLOPT_HEADER twice in your
code, first to 0 and then to true. The later call wins, so the response
headers are included in the output.

> All I need is the final URL. Don't need any content etc. I just want
> to "FOLLOWLOCATION" as many times as needed, and just report the
> ultimate destination URL. But the above code prints out the html code
> of the final URL.
>
> What should I do so I only get the final URL -- and in the fastest way
> possible, because I'm not interested in the content at all.

The C binding has a function, curl_easy_getinfo(), that you can call with
the CURLINFO_EFFECTIVE_URL parameter to get the final URL used once the
transfer has completed. There's probably something similar in PHP/CURL.

If you don't want the content, the C binding gives you the CURLOPT_NOBODY
option to do a HEAD request, but that won't work with every URL, since some
servers don't answer HEAD the same way they answer GET. You'll probably
want to set a write callback function that returns an error code to stop
the transfer as soon as body data arrives (if that's allowed in PHP/CURL).

If you're allowing arbitrary URLs to be entered by untrusted users, then
you need to read http://curl.haxx.se/libcurl/c/libcurl-tutorial.html#Security

>>> Dan
Received on 2010-08-11