curl-and-php
Weird 302 error loop
Date: Tue, 21 Aug 2007 11:48:00 -0500
Greetings all,
I've set up a PHP script to scrape some data from a site that requires user
authentication. However, when I try to log in to the site using curl, I do
not get the same headers I get when I log in myself in a browser. Instead I
get redirected endlessly with "302 Moved Temporarily" responses.
I set up my PHP script as follows [domains, users, and passwords have been
censored]:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://fakewebsite.com");
// Follow Location: headers and set the Referer automatically on each hop.
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)");
// Send my browser's login cookies with the request.
curl_setopt($ch, CURLOPT_COOKIE, "user_id=XXXXXX; auto_log=1; password=YYYYY");
// Include the response headers in the output.
curl_setopt($ch, CURLOPT_HEADER, true);
curl_exec($ch);
curl_close($ch);
What this produces is a long loop of 302 redirects. I've trimmed the
responses I get from about 20 to just 3 for this example:
HTTP/1.0 302 Moved Temporarily
Date: Tue, 21 Aug 2007 16:15:30 GMT
Server: Apache/2.0.55 (Unix) PHP/5.2.1
X-Powered-By: PHP/5.2.1
Set-Cookie: PHPSESSID=24981136b66335297ff38104c97e9581; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Location: ?main_action=do_login&auth=1
Content-Length: 0
Content-Type: text/html; charset=iso-8859-1
X-Cache: MISS from wc02.inet.mesa1.secureserver.net
Connection: close
HTTP/1.0 302 Moved Temporarily
Date: Tue, 21 Aug 2007 16:15:30 GMT
Server: Apache/2.0.55 (Unix) PHP/5.2.1
X-Powered-By: PHP/5.2.1
Set-Cookie: PHPSESSID=5042bac8c3b12f653aff671169fb012c; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: ExpiredPost=deleted; expires=Mon, 21-Aug-2006 16:15:29 GMT
Location: /app/ptm/
Content-Length: 0
Content-Type: text/html; charset=iso-8859-1
X-Cache: MISS from wc02.inet.mesa1.secureserver.net
Connection: close
HTTP/1.0 302 Moved Temporarily
Date: Tue, 21 Aug 2007 16:15:30 GMT
Server: Apache/2.0.55 (Unix) PHP/5.2.1
X-Powered-By: PHP/5.2.1
Set-Cookie: PHPSESSID=d4a24ac3e674e925eedfc09d21339510; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: ExpiredPost=a%3A0%3A%7B%7D
Location: /index.php?redir=/app/ptm/index.php&
Content-Length: 0
Content-Type: text/html; charset=iso-8859-1
X-Cache: MISS from wc02.inet.mesa1.secureserver.net
Connection: close
This goes on for about 20 more responses. Each time the location is
different, but it does eventually loop and start hitting the same locations
over again. The final response is a 302 with a Location header, but for
whatever reason cURL doesn't follow it. Perhaps this is a preventative
feature of cURL?
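
One detail in the trace above may be relevant: every response sets a new
PHPSESSID, which suggests the session cookie from one hop is not being sent
back on the next. CURLOPT_COOKIE only adds a fixed Cookie header; it does not
turn on cURL's cookie engine, so cookies received during the redirect chain
are not stored. What follows is a minimal sketch, not a confirmed fix,
assuming the loop comes from the lost session cookie. The helper names
build_session_options() and fetch_with_session(), the jar path, and the
redirect cap of 5 are hypothetical and not part of the original script.

```php
<?php
// Hypothetical helper: builds the option set for a request that keeps
// cookies between redirects. Nothing here talks to the network.
function build_session_options($url, $cookieHeader, $jarPath, $maxRedirects = 5)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_AUTOREFERER    => true,
        // Explicit cap so a redirect loop fails fast instead of running on.
        CURLOPT_MAXREDIRS      => $maxRedirects,
        // Fixed cookies copied from the browser, as in the original script.
        CURLOPT_COOKIE         => $cookieHeader,
        // Setting COOKIEFILE enables the cookie engine, so Set-Cookie
        // headers (e.g. PHPSESSID) are remembered and sent on later hops.
        CURLOPT_COOKIEFILE     => $jarPath,
        // COOKIEJAR writes the collected cookies to disk on curl_close().
        CURLOPT_COOKIEJAR      => $jarPath,
        CURLOPT_HEADER         => true,
        CURLOPT_RETURNTRANSFER => true,
    );
}

// Hypothetical wrapper: performs the request and returns either the
// headers plus body, or a short error description.
function fetch_with_session($url, $cookieHeader, $jarPath)
{
    $ch = curl_init();
    curl_setopt_array($ch, build_session_options($url, $cookieHeader, $jarPath));
    $response = curl_exec($ch);
    if ($response === false) {
        // Error 47 (CURLE_TOO_MANY_REDIRECTS) means the cap was reached.
        $response = 'cURL error ' . curl_errno($ch) . ': ' . curl_error($ch);
    }
    curl_close($ch);
    return $response;
}
```

It could be called as fetch_with_session("http://fakewebsite.com",
"user_id=XXXXXX; auto_log=1; password=YYYYY", "/tmp/cookiejar.txt"). When a
redirect cap is reached, cURL stops and reports an error rather than
following the next Location header, which might explain the unfollowed final
302.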
Now, when I go to this site in a web browser (with my cookies), I get two
302 responses before getting a 200 response:
+++RESP 1+++
HTTP/1.1 302 Found
Date: Tue, 21 Aug 2007 05:03:50 GMT
Server: Apache/2.0.55 (Unix) PHP/5.2.1
X-Powered-By: PHP/5.2.1
Set-Cookie: PHPSESSID=6ea7cbb52f349d0bdc49b19104562953; path=/
Pragma: no-cache
Location: ?main_action=do_login&auth=1
Content-Length: 0
Content-Type: text/html; charset=iso-8859-1
+++CLOSE 1+++
+++RESP 2+++
HTTP/1.1 302 Found
Date: Tue, 21 Aug 2007 05:03:50 GMT
Server: Apache/2.0.55 (Unix) PHP/5.2.1
X-Powered-By: PHP/5.2.1
Pragma: no-cache
Set-Cookie: ExpiredPost=deleted; expires=Mon, 21-Aug-2006 05:03:49 GMT
Location: /app/ptm/
Content-Length: 0
Content-Type: text/html; charset=iso-8859-1
+++CLOSE 2+++
+++RESP 3+++
HTTP/1.1 200 OK
Date: Tue, 21 Aug 2007 05:03:50 GMT
Server: Apache/2.0.55 (Unix) PHP/5.2.1
X-Powered-By: PHP/5.2.1
Pragma: no-cache
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1
+++CLOSE 3+++
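
The two traces differ in one visible way: the browser is given a PHPSESSID
once, while the cURL run is given a new one in every response. A quick way to
check a captured header dump for that pattern is sketched below. The function
name session_ids_in_dump() is hypothetical, and the sample dump is cut down
from the first two cURL responses shown earlier.

```php
<?php
// Hypothetical helper: returns every value the server assigned to the
// given cookie (PHPSESSID by default) in a raw dump of response headers.
function session_ids_in_dump($rawHeaders, $cookieName = 'PHPSESSID')
{
    // Match "Set-Cookie: NAME=value" at the start of any line and capture
    // the value up to the first semicolon or whitespace.
    $pattern = '/^Set-Cookie:\s*' . preg_quote($cookieName, '/') . '=([^;\s]+)/mi';
    preg_match_all($pattern, $rawHeaders, $matches);
    return $matches[1];
}

// Shortened copy of the first two responses from the cURL run.
$dump = "HTTP/1.0 302 Moved Temporarily\r\n"
      . "Set-Cookie: PHPSESSID=24981136b66335297ff38104c97e9581; path=/\r\n"
      . "Location: ?main_action=do_login&auth=1\r\n\r\n"
      . "HTTP/1.0 302 Moved Temporarily\r\n"
      . "Set-Cookie: PHPSESSID=5042bac8c3b12f653aff671169fb012c; path=/\r\n"
      . "Set-Cookie: ExpiredPost=deleted; expires=Mon, 21-Aug-2006 16:15:29 GMT\r\n"
      . "Location: /app/ptm/\r\n\r\n";

$ids = session_ids_in_dump($dump);

// More than one distinct ID means the server kept starting new sessions,
// i.e. it never saw the previous session cookie come back.
echo count(array_unique($ids)), " distinct session IDs in ",
     count($ids), " responses that set one\n";
```

If the count of distinct IDs matches the number of responses, the client is
probably not returning the session cookie between hops.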
I've tried as best I can to make my cURL request look like one from a
normal person viewing the page in a browser. Does anyone see anything
obvious that I'm missing?
-Andy
Received on 2007-08-21