curl-users

Re: Problems downloading aspx pages with curl

From: Daniel Stenberg <daniel-curl_at_haxx.se>
Date: Mon, 13 Dec 2004 20:38:58 +0100 (CET)

On Mon, 13 Dec 2004, bobs_at_www.computilizer.com wrote:

> This is the form on the site's home page, which I am trying to simulate
> sending using curl.
>
> There is no command line; I am using libcurl with PHP. What I have pasted
> below is a view-source of the site's home page with all non-form elements
> removed, which I used to determine what fields to send to their server.

You do know there is a separate mailing list for PHP/curl issues, right?

BTW, you should _always_ get a working command line first and then
convert that to PHP, because the PHP binding doesn't offer very good
debug/trace options, while the command line tool does.

> Now, from what I can tell the __VIEWSTATE input changes on each page load,
> so what I do to simulate a user actually viewing the home page and
> submitting a request, is open the [home] page ( with libcurl ), capture any
> cookie information and the value of this __VIEWSTATE field and post it back
> with the search request.

Yeps, sounds like it. Unless there's some javascript as well that fiddles with
the cookies or other stuff.

As always, LiveHTTPheaders is the tool you want that can give you all the
answers.

I added a "Debug" chapter to "The Art Of Scripting HTTP Requests Using Curl"
the other day: http://curl.haxx.se/docs/httpscripting.html

> that's when I get the error page redirect

Which probably means you missed one of the items I mention in that chapter.

> Really, I understand what you're saying, that it doesn't matter what kind of
> server it is, a post is a post so to speak, but I don't think that's the
> issue. I've done quite a few of these types of projects, and am good at
> duplicating what data needs to be posted. The reason I bring up the aspx is
> that I believe the issue is more in convincing the server's session/state
> management that it is a valid post from their own pages.

Yes, you need to convince the server of exactly that, but a browser doesn't
have very many ways of doing that either. Make sure that your request
resembles the browser's as closely as possible, since that gives you the best
chance of succeeding.

I therefore recommend that you use LiveHTTPheaders to capture a full manual
session (one that works). Then you work on repeating the exact sequence with
libcurl. It may involve parsing the HTML form to extract hidden field contents
and it may involve decoding javascript to understand what the browser would
do.
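
As a rough illustration of the "extract hidden field contents" step, here is
a minimal sketch in PHP. The function name extract_hidden_fields() is made up
for this example, and it assumes the form uses plain <input type="hidden">
tags (as __VIEWSTATE normally is) and that PHP's DOM extension is available.
It does not run any javascript, so fields set by scripts will be missed.

```php
<?php
// Hypothetical helper (not part of libcurl or the PHP binding):
// returns name => value for every hidden <input> found in $html.
function extract_hidden_fields(string $html): array
{
    $fields = [];
    if ($html === '') {
        return $fields; // nothing to parse
    }

    // Real-world HTML is rarely valid; keep libxml warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '' && strtolower($input->getAttribute('type')) === 'hidden') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}
```

The returned array can be merged with your own search fields and passed
through http_build_query() to build the POST body, so that __VIEWSTATE is
sent back exactly as the server issued it.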

> if ( $method=='GET' ) {
> $url.=$params;
> curl_setopt($ch, CURLOPT_POST,0);
> curl_setopt($ch, CURLOPT_POSTFIELDS,"");
> }

This is not the correct way to enforce a GET. There is no "opposite" of POST.
CURLOPT_HTTPGET is the option you want:
http://curl.haxx.se/libcurl/c/curl_easy_setopt.html#CURLOPTHTTPGET
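
A minimal sketch of how the quoted snippet could look with that option,
assuming PHP's curl extension. The wrapper function apply_http_method() and
its arguments are invented for this example, and it assumes $params is an
already URL-encoded query string without a leading "?".

```php
<?php
// Hypothetical helper: select GET or POST on an existing cURL handle.
// Returns the URL that was finally set on the handle.
function apply_http_method($ch, string $method, string $url, string $params): string
{
    if (strtoupper($method) === 'GET') {
        if ($params !== '') {
            // Append the query string with the right separator.
            $url .= (strpos($url, '?') === false ? '?' : '&') . $params;
        }
        // Explicitly switch the handle (back) to GET instead of
        // trying to "turn off" POST.
        curl_setopt($ch, CURLOPT_HTTPGET, true);
    } else {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $params);
    }
    curl_setopt($ch, CURLOPT_URL, $url);
    return $url;
}
```

This matters most when the same handle is reused for a POST followed by a
GET, which is the sequence described earlier in this thread.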

> libcurl 7.9.8 (OpenSSL 0.9.7a) (ipv6 enabled)

Oh lord. This is almost a million years old. You are doomed to experience bugs
and weird things we fixed ages ago. I strongly suggest you upgrade.

-- 
      Daniel Stenberg -- http://curl.haxx.se -- http://daniel.haxx.se
       Dedicated custom curl help for hire: http://haxx.se/curl.html
Received on 2004-12-13