curl-and-php
CURLE_COULDNT_CONNECT on some valid sites
Date: Thu, 18 Jun 2009 11:22:50 -0500
I've searched the mailing list and haven't been able to find a
solution or a reason for my problem. I've also tried a laundry list of curlopt
settings to get a result, with no luck.
So, the problem is: I have curl installed on about 25 servers and have
tested from all of them with the same results, so I assume it's a problem
with how specific target domains are set up, combined with my curl options.
An example of one of the servers:
Server: Apache/1.3.37 (Unix) PHP/5.1.6 mod_auth_passthrough/1.8
mod_log_bytes/1.2 mod_bwlimited/1.4 FrontPage/5.0.2.2635.SR1.2
mod_ssl/2.8.28 OpenSSL/0.9.7a
libcurl/7.15.3 zlib/1.2.1.2
Basically what it comes down to is this: I have a list of about 75,000 sites
whose status I need to check.
To do this, I send a curl request to each site's index page, grab the
HTTP status (ideally 200), and cache an imprint of the page.
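A minimal sketch of the check described above, not the exact production code. The names check_site() and summarize_response() are hypothetical, and treating the "imprint" as an md5() hash of the body is an assumption made only for illustration:

```php
<?php
// Hypothetical helper: reduce one response to the values worth storing.
// Assumption: the "imprint" is an md5 hash of the response body.
function summarize_response($status, $body)
{
    return array(
        'status'  => (int) $status,
        'ok'      => ((int) $status === 200),
        'imprint' => md5($body),
    );
}

// Hypothetical wrapper around the curl request for a single site.
function check_site($url, $timeout = 10)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);

    $body = curl_exec($ch);
    if ($body === false) {
        // Transfer failed: report curl's error number and message instead.
        return array(
            'status' => 0,
            'ok'     => false,
            'errno'  => curl_errno($ch),
            'error'  => curl_error($ch),
        );
    }

    return summarize_response(curl_getinfo($ch, CURLINFO_HTTP_CODE), $body);
}
```

Calling check_site('http://www.domain.com') would return either the status and imprint, or the curl error number and message when the transfer itself fails.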
This works fine in most cases, but for a couple of domains I need
to extend the timeout from 5s to 10s, otherwise the request times out, and
with the longer timeout I then get a CURLE_COULDNT_CONNECT error.
Now, if I visit the site in a browser, I connect with no problems and see the
page.
Also, the page is http://www.domain.com, and I connect with curl to the exact
same page: no https, and no redirection that I am aware of.
I am setting the following options:
$header = array
(
"Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5",
"Cache-Control: max-age=0",
"Connection: keep-alive",
"Keep-Alive: 300",
"Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7",
"Accept-Language: en-us,en;q=0.5",
"Pragma: ", // browsers keep this blank.
);
curl_setopt( $ch, CURLOPT_URL, 'http://www.domain.com' ); // http, no trailing slash
curl_setopt( $ch, CURLOPT_HTTPHEADER, $header );
curl_setopt( $ch, CURLOPT_USERAGENT, 'Googlebot/2.1 (+http://www.google.com/bot.html)' );
curl_setopt( $ch, CURLOPT_REFERER, 'http://www.google.com' );
curl_setopt( $ch, CURLOPT_ENCODING, 'gzip,deflate' );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
curl_setopt( $ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_0 );
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, 1 );
curl_setopt( $ch, CURLOPT_MAXREDIRS, 10 );
curl_setopt( $ch, CURLOPT_AUTOREFERER, 1 );
curl_setopt( $ch, CURLOPT_SSL_VERIFYPEER, FALSE );
curl_setopt( $ch, CURLOPT_TIMEOUT, 10 );
As mentioned, this works fine for 99% of the sites I try, but it fails on
a few, even though I have no problems reaching them with a web browser.
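One way to narrow down where the failing requests break is to limit the connect phase separately from the whole transfer and record curl's own error number and timings. The sketch below is only an illustration: probe_site() and failure_phase() are hypothetical names, and the 5s/10s values simply mirror the timeouts mentioned earlier. The error numbers are libcurl's documented codes.

```php
<?php
// Map a few libcurl error numbers to the phase that failed.
// 6 = CURLE_COULDNT_RESOLVE_HOST, 7 = CURLE_COULDNT_CONNECT,
// 28 = CURLE_OPERATION_TIMEDOUT. Anything else is reported as 'other'.
function failure_phase($errno)
{
    switch ((int) $errno) {
        case 0:
            return 'none';
        case 6:
            return 'dns';
        case 7:
            return 'connect';
        case 28:
            return 'timeout';
        default:
            return 'other';
    }
}

// Hypothetical probe: limits the connect phase separately from the
// whole transfer and returns the error details plus phase timings.
function probe_site($url, $connectTimeout = 5, $totalTimeout = 10)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $connectTimeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $totalTimeout);

    curl_exec($ch);
    $errno = curl_errno($ch);

    return array(
        'phase'   => failure_phase($errno),
        'errno'   => $errno,
        'error'   => curl_error($ch),
        'dns_s'   => curl_getinfo($ch, CURLINFO_NAMELOOKUP_TIME),
        'conn_s'  => curl_getinfo($ch, CURLINFO_CONNECT_TIME),
        'total_s' => curl_getinfo($ch, CURLINFO_TOTAL_TIME),
    );
}
```

Running print_r(probe_site('http://www.domain.com')) for one of the failing domains would show whether the time goes into name lookup or the connect itself, along with the text from curl_error().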
Any ideas on possible causes, or on curlopts I should change or add to get
these requests to succeed?
Received on 2009-06-18