Re: POST data goes into second packet.

From: Ralph Mitchell <rmitchell_at_eds.com>
Date: Tue, 13 May 2003 00:30:42 -0500

Daniel Stenberg wrote:

> On Mon, 12 May 2003, Kaye-Smith Adam wrote:
>
> > I am trying to replicate what occurs when I log on to a website through a
> > browser.
> >
> > With curl I am using -d to post the variables and -H to change header
> > values, and this works as expected. But when I monitor which packets are
> > exchanged (via the ethereal network analyser) when the browser provides its
> > posted values, the browser puts its variables in a packet that is separate
> > from the initial POST packet, i.e. the Host:, User-agent:, Accept: headers
> > are in the first HTTP packet and the next HTTP packet has just the post data
> > and the headers Type: & Size. I am not sure if this is critical to why my
> > curl-based form entry is not working, but how do I get curl to emulate the
> > browser by defining exactly what goes in which packets in the initial POST
> > sequence (which includes a second packet as stated above)?
>
> I doubt very much that this is the reason for your script's failure.
>
> You can't emulate the browser's behavior to that level with curl. The only
> valid reason I can think of to want it would be if the remote server is
> utterly stupidly implemented and can't be fixed.
>
> One way to try it out would be to set up a proxy in between and let curl
> operate through that one, as it'll change how the packets are sent out from
> the proxy to the remote server.

The browser may also be doing several other things *before* you get the login
page. Did you try starting ethereal before pointing your browser to the page
you're trying to fetch?

In my experience, pages that have POSTable login forms generally use cookies (or
sometimes embedded hidden text fields) to track the login, and it's usually
necessary to follow the whole cookie trail. I've had pages that have gone
through several redirects (back to themselves...) setting a different cookie
each time.

This has happened to me often enough that I have a standard set of options that
I use for every script:

    CURLOPTS="-s -S -L -b cookies -c cookies -m 90 --connect-timeout 45"

and then I add the output filename, and proxy info where necessary, on the
actual command line:

    curl -o xxxx.html $CURLOPTS http://host.domain.com
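
To make the cookie trail and the hidden fields a bit more concrete, here is a
rough sketch of how a scripted login might look with those options. It is not
taken from any real site: the /login path, the form field names (user, pass,
token), the output file names and the helper functions are all made up for
illustration, and the sed pattern assumes the hidden input sits on one line
with name= before value=. The connect timeout is written out in full as
--connect-timeout.

```shell
#!/bin/sh
# Sketch only: the /login path, the field names user/pass/token and the
# file names login.html/result.html are hypothetical placeholders.

CURLOPTS="-s -S -L -b cookies -c cookies -m 90 --connect-timeout 45"

# Print the value of one hidden <input> field from HTML read on stdin.
# Assumes the tag is on a single line with attributes in the order
# name="..." ... value="..."; real pages may need a proper HTML parser.
hidden_value() {
    sed -n "s/.*<input[^>]*name=\"$1\"[^>]*value=\"\([^\"]*\)\".*/\1/p" | head -n 1
}

# Step 1: fetch the login page first so its cookies land in the jar
#         (-L follows any redirects, -c saves what they set).
# Step 2: POST the credentials plus the hidden field, sending the jar back.
login() {
    base=$1 user=$2 pass=$3
    curl -o login.html $CURLOPTS "$base/login" || return 1
    token=$(hidden_value token < login.html)
    curl -o result.html $CURLOPTS \
        --data-urlencode "user=$user" \
        --data-urlencode "pass=$pass" \
        --data-urlencode "token=$token" \
        "$base/login"
}
```

Nothing runs until login is called, for example as
"login http://host.domain.com myuser mypass". Fetching the form page before
posting is the important part, since that is where the first cookies and any
hidden values usually come from.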

Ralph Mitchell

Received on 2003-05-13