curl-and-php
Re: post then get help
Date: Tue, 8 Jul 2008 20:29:30 -0400
Ryan,
Just a few cents' worth here. I also use a server that is unable to
follow redirects. To get around it, I constructed a bunch of bots that
request each page by hand (one bot has 13 curl sessions). The basic
layout, after setting the file defaults (web agent, user ids, opening
a curl session, etc.), is to configure the curl setopts for the page
to be fetched, do whatever parsing is needed on the return (you may
not need the body), then set up a new $target and $ref with the
appropriate curl setopts and do it all again and again. It's tedious
and amounts to MANY lines of php/curl code, but it works. Don't close
the curl session until the end, and reuse the handle; that way you
don't have to configure EVERY curl setopt for each page you're
getting. Some pages are POSTs, while most are GETs for the redirects.
Also, a redirect isn't always indicated by a 301 or 302. Frames, for
example, can force you to GET multiple pages, with a corresponding
curl execute for each.
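As a rough sketch of the parsing step described above (not the code from my bots), the two helpers below pull the next URL out of a response: one reads the Location: header, the other collects frame sources from the body. The function names find_location_header() and find_frame_sources() are made up for illustration, and the regular expressions are deliberately simple.

```php
<?php
// Hypothetical helpers; the names are illustrative only.

// Return the value of the last Location: header found in a raw header
// block, or null when the response carries no redirect target.
function find_location_header(string $raw_headers): ?string
{
    if (preg_match_all('/^Location:[ \t]*(\S+)/mi', $raw_headers, $m)) {
        return end($m[1]);
    }
    return null;
}

// Return the src attribute of every <frame> or <iframe> tag in the HTML.
// A regex is enough for a sketch; a real parser is more robust.
function find_frame_sources(string $html): array
{
    preg_match_all('/<i?frame\b[^>]*\bsrc\s*=\s*["\']([^"\']+)["\']/i', $html, $m);
    return $m[1];
}

// Usage sketch with a reused handle. $ch is assumed to be configured
// already, with CURLOPT_HEADER and CURLOPT_RETURNTRANSFER set to TRUE:
//
//   $response = curl_exec($ch);
//   $headers  = substr($response, 0, curl_getinfo($ch, CURLINFO_HEADER_SIZE));
//   $next     = find_location_header($headers);
//   if ($next !== null) {
//       curl_setopt($ch, CURLOPT_URL, $next);      // only the URL changes
//       curl_setopt($ch, CURLOPT_HTTPGET, TRUE);   // redirects are fetched with GET
//       $response = curl_exec($ch);
//   }
```

Note that a Location: value may be relative, in which case it still has to be resolved against the current $target before it is handed to CURLOPT_URL.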
Hope that makes sense. Maybe this will help. I've included an example
of a first and second page request below. My files run to 600-800
lines of code with debugging, DB access, and such.
#**********************************************************************
# 1) REQUEST the Login Page
#**********************************************************************
$target = "url";
$ref = "";
$cookie = "WASReqURL=/mtravel/app/jci.jsp; JSESSIONID=00001pAfiyf46fsenLtjmclh2pV:-1; WASReqURL=/mtravel/app/jci.jsp";
$header_array[] = "Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header_array[] = "Accept-Language: en-us,en;q=0.5";
$header_array[] = "Accept-Encoding: gzip,deflate";
$header_array[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header_array[] = "Keep-Alive: 300";
$header_array[] = "Connection: keep-alive";
$ch1 = curl_init();
curl_setopt($ch1, CURLOPT_MAXCONNECTS, 6);
curl_setopt($ch1, CURLOPT_USERAGENT, $WEBBOT_NAME); // See above; go stealthy
curl_setopt($ch1, CURLOPT_SSL_VERIFYPEER, FALSE); // Needed for https: & no certificate
curl_setopt($ch1, CURLOPT_URL, $target); // Define target site
# curl_setopt($ch1, CURLOPT_REFERER, $ref); // Define referring page
curl_setopt($ch1, CURLOPT_VERBOSE, TRUE);
curl_setopt($ch1, CURLOPT_STDERR, $fp);
curl_setopt($ch1, CURLOPT_NOPROGRESS, FALSE);
curl_setopt($ch1, CURLOPT_HTTPHEADER, $header_array); // Send Accept: header values
curl_setopt($ch1, CURLOPT_RETURNTRANSFER, TRUE); // Return page in string
# curl_setopt($ch1, CURLOPT_COOKIESESSION, TRUE);
# curl_setopt($ch1, CURLOPT_COOKIEJAR, $cookie_file); // Where to WRITE cookies
# curl_setopt($ch1, CURLOPT_COOKIEFILE, $cookie_file); // Where to READ cookies FROM
curl_setopt($ch1, CURLOPT_COOKIE, $cookie); // Send specific cookie
curl_setopt($ch1, CURLOPT_HEADER, TRUE);
curl_setopt($ch1, CURLOPT_NOBODY, TRUE);
curl_setopt($ch1, CURLOPT_HTTPGET, TRUE); // Use GET method
# curl_setopt($ch1, CURLOPT_POST, FALSE); // Use GET method
curl_setopt($ch1, CURLOPT_FRESH_CONNECT, TRUE); // Force a new connection instead of a cached one
# curl_setopt($ch1, CURLOPT_FOLLOWLOCATION, TRUE);
$page1_array_hdr['FILE'] = curl_exec($ch1);
$page1_array_hdr['ERROR'] = curl_error($ch1);
$page1_array_hdr['STATUS'] = curl_getinfo($ch1);
#**********************************************************************
# 2) POST the Login Page
#**********************************************************************
$target = "url";
$ref = "www.abc.com";
$form_data = "j_username=$user&j_password=$j_pass&action=Login";
$cookie = "WASReqURL=/mtravel/app/jci.jsp; " . $new_jsessionid . "; WASReqURL=/mtravel/app/jci.jsp";
curl_setopt($ch1, CURLOPT_URL, $target); // Define target site
curl_setopt($ch1, CURLOPT_REFERER, $ref); // Define referring page
curl_setopt($ch1, CURLOPT_HTTPHEADER, $header_array); // Resend Accept: header values
curl_setopt($ch1, CURLOPT_POST, TRUE); // Use POST method
curl_setopt($ch1, CURLOPT_POSTFIELDS, $form_data);
curl_setopt($ch1, CURLOPT_COOKIESESSION, TRUE);
# curl_setopt($ch1, CURLOPT_COOKIEJAR, $cookie_file); // Where to WRITE cookies
# curl_setopt($ch1, CURLOPT_COOKIEFILE, $cookie_file); // Where to READ cookies FROM
curl_setopt($ch1, CURLOPT_COOKIE, $cookie); // Send specific cookie
$page2_array_hdr['FILE'] = curl_exec($ch1);
$page2_array_hdr['ERROR'] = curl_error($ch1);
$page2_array_hdr['STATUS'] = curl_getinfo($ch1);
............>
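The example doesn't show where $new_jsessionid in step 2 comes from. One way to fill it, sketched here as an assumption rather than as the code from my bot, is to parse the Set-Cookie header out of the step 1 response (CURLOPT_HEADER is TRUE there, so the headers are in the returned string). The function name extract_jsessionid() is made up for illustration.

```php
<?php
// Hypothetical helper: return "JSESSIONID=value" from the Set-Cookie
// headers of a raw response, or null if the server did not set one.
// The result is a name=value pair, which is the form the step 2 cookie
// string concatenates.
function extract_jsessionid(string $raw_headers): ?string
{
    if (preg_match('/^Set-Cookie:[ \t]*(JSESSIONID=[^;\r\n]+)/mi', $raw_headers, $m)) {
        return $m[1];
    }
    return null;
}

// Usage sketch, assuming step 1 has already run:
//
//   $new_jsessionid = extract_jsessionid($page1_array_hdr['FILE']);
//   if ($new_jsessionid === null) {
//       // No session cookie came back; check $page1_array_hdr['ERROR'].
//   }
```

Letting curl manage cookies through CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE (commented out in the example) is the other option; parsing by hand just makes it explicit which cookie goes out on each request.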
Good luck,
David
_______________________________________________
http://cool.haxx.se/cgi-bin/mailman/listinfo/curl-and-php
Received on 2008-07-09