Re: Newbie's question on cURL usage

From: Ralph Mitchell <rmitchell_at_eds.com>
Date: Thu, 28 Feb 2002 05:21:20 -0600

It looks to me like you might be posting to the wrong URL... And
possibly starting from the wrong location...

Try this:

=========
    # Start with an empty cookie file
    cat /dev/null > cookies

    # Start by trying to get the final target - provoke MSFT into making you log in
    curl -A "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)" -b cookies -s \
     -c cookies -L -o money0.html \
     --url "http://Moneycentral.msn.com/investor/quotes/pprtq.asp?Page=RTQ&Symbol=orcl"

    # Pull the login url out of the file - yep, really the last one, who knows why
    url=`grep -i action money0.html | tail -1 | sed -e 's/^.*action="//' -e 's/".*//'`

    # Login with all the bells and whistles from the previous page
    curl -A "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)" -b cookies -s \
     -c cookies -L -o money1.html \
     -d "notinframe=1&login=llp_gapper_at_yahoo.com&passwd=ladder&sec=rem&submit1=+Sign+In+&mspp_shared=1" \
     --url "$url"

    # Pull the META REFRESH tag and fetch that
    url=`grep URL money1.html | sed -e 's/^.*URL=//' -e 's/".*//'`

    curl -A "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)" -b cookies -s \
     -c cookies -L -o money2.html \
     --url "$url"

    # Pull another META REFRESH tag and fetch that too
    url=`grep url money2.html | sed -e 's/^.*url=//' -e 's/".*//'`

    curl -A "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)" -b cookies -s \
     -i -c cookies -L -o money3.html \
     --url "$url"

    # money3.html should contain the target page...
=================
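
To see what the grep/sed step pulls out without touching the network, here is a small offline sketch. The sample HTML and the login.example.com URL are made up for illustration; the real login page will look different.

```shell
# Made-up stand-in for money0.html; the real page has more markup than this.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<form name="search" action="/search.asp" method="get">
<form name="login" action="https://login.example.com/ppsecure/post.srf?id=229" method="post">
EOF

# Same pipeline as in the script: keep the last line mentioning "action",
# strip everything up to and including action=", then strip from the
# next double quote to the end of the line.
url=$(grep -i action "$sample" | tail -1 | sed -e 's/^.*action="//' -e 's/".*//')
echo "$url"
rm -f "$sample"
```

This only works while the action attribute sits on one line and uses double quotes, so it is worth checking "$url" by eye before posting a password to it.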

Works for me most of the time... :) Sometimes I get a Location header
in money3.html that contains "?Error=TooManyResets" instead of the stock
quote page. Dunno why curl didn't follow that Location header...

Seems like when it fails, the money3.html file comes out at around 563
bytes. Running the script again generally gets the proper result, which
is over 17 KB.
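
Since the failures are easy to spot by size, the retry could be automated. A minimal sketch follows; the helper name looks_failed, the 1000-byte threshold and the getquote.sh script name are all assumptions, picked only because the bad results are around 563 bytes and the good ones over 17 KB.

```shell
# Hypothetical helper: succeeds (returns 0) if the saved page looks like a
# failed fetch. The 1000-byte cut-off is a guess, not a measured limit.
looks_failed() {
    file=$1
    [ -s "$file" ] || return 0                  # missing or empty file
    size=$(( $(wc -c < "$file") ))              # arithmetic strips padding
    [ "$size" -lt 1000 ] && return 0            # suspiciously small result
    grep -q 'Error=TooManyResets' "$file" && return 0
    return 1
}

# Usage sketch, assuming the script above is saved as ./getquote.sh:
#   tries=0
#   while looks_failed money3.html && [ "$tries" -lt 3 ]; do
#       ./getquote.sh
#       tries=$((tries + 1))
#       sleep 5
#   done
```

Capping the number of tries keeps the loop from hammering the login server if the failure turns out to be permanent.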

Ralph Mitchell
curl 7.9.5-pre4 (i686-pc-linux-gnu) libcurl 7.9.5-pre4 (OpenSSL 0.9.6c)

Yanhui Liu wrote:

> At 05:25 PM 2/27/02 +0100, you wrote:
>
>> On Wed, 27 Feb 2002, Yanhui Liu wrote:
>>
>> > I am trying to access stock quote page on msn.com site using cURL, however
>> > I could not get it to work. Could you help me out? I am using curl 7.9.4
>> > (i686-pc-linux-gnu) libcurl 7.9.4 (OpenSSL 0.9.6) on Redhat Linux 7.1.
>>
>> 7.9.4 has an SSL read bug that will make it unreliable for SSL downloads. You
>> should get a 7.9.5 pre-release instead, as it will work better.
>>
>> It would help a lot if you first of all upgraded and retried this, then come
>> back if it still doesn't work.
>
>
> I upgraded to 7.9.5-pre4, however I still could not get the target
> page. Test results are attached at the end.
>
>
>> I would also appreciate if you would be able to cut out a few issues at a
>> time and ask specifically about them. It is very hard, and breath-taking to
>> only parse through such a huge mail with many complex command lines involved.
>
>
> Sorry about the lengthy mail, I just want to present all relevant
> information for the problem. For me, I could not cut the problem into
> pieces, I am totally lost.
>
>
>> > 1. Is it possible for curl to use Netscape's cookie? So we can get to the
>> > client state using Netscape as a tool.
>>
>> Yes, curl can read Netscape's cookies. Use -b for reading and -c can even
>> write them back in Netscape format.
>
>
> Wonderful. Does it mean curl can use netscape's cookie to retrieve
> pages? For example, I ran the curl test using netscape's cookie, which
> was generated after browsing the content.
>
> $ cp ~/.netscape/cookies .
> $ curl -A "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)" \
>     -b cookies -L -v -i -s -o junk.DATA \
>     --url "http://moneycentral.msn.com/investor/quotes/pprtq.asp?Page=RTQ&Symbol=orcl"
>
> * Connected to moneycentral.com (207.46.189.14)
> > GET /investor/quotes/pprtq.asp?Page=RTQ&Symbol=orcl HTTP/1.1
> User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
> Host: moneycentral.msn.com
> Pragma: no-cache
> Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
> Cookie: QUAUTH=66504305614a51425d57755e515248404e585a5478525906280e40640357114307215e07164b43455009; MC1=GUID=00D22A5A9CA043E6A8730ABFDF1F3195
>
> * Follow to new URL:
> /pplogin.asp?Page=http://moneycentral.msn.com/investor/quotes/pprtq.asp&Query=Page%3DRTQ%26Symbol%3Dorcl%26REQUEST%5FMETHOD%3DGET&AuthTime=43200&ForceLogin=False
>
> * Closing connection #0
> * Follows Location: to new URL:
> 'http://moneycentral.msn.com/pplogin.asp?Page=http://moneycentral.msn.com/investor/quotes/pprtq.asp&Query=Page%3DRTQ%26Symbol%3Dorcl%26REQUEST%5FMETHOD%3DGET&AuthTime=43200&ForceLogin=False'
>
> * Disables POST, goes with GET
> * Connected to moneycentral.com (207.46.189.14)
> > GET /pplogin.asp?Page=http://moneycentral.msn.com/investor/quotes/pprtq.asp&Query=Page%3DRTQ%26Symbol%3Dorcl%26REQUEST%5FMETHOD%3DGET&AuthTime=43200&ForceLogin=False HTTP/1.1
> User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
> Host: moneycentral.msn.com
> Pragma: no-cache
> Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
> Cookie: QUAUTH=66504305614a51425d57755e515248404e585a5478525906280e40640357114307215e07164b43455009; MC1=GUID=00D22A5A9CA043E6A8730ABFDF1F3195
>
> * Follow to new URL:
> http://login.passport.com/login.srf?lc=1033&id=229&ru=http%3A%2F%2Fmoneycentral%2Emsn%2Ecom%2Fpploggedin%2Easp%3FPage%3Dhttp%253A%252F%252Fmoneycentral%252Emsn%252Ecom%252Finvestor%252Fquotes%252Fpprtq%252Easp%26Query%3DPage%253DRTQ%2526Symbol%253Dorcl%2526REQUEST%255FMETHOD%253DGET&tw=43200&kv=2&ct=1014877476&ver=2.0.0248.1&tpf=5b566e78f24697f753fdd1b608fd10b3
>
> * Closing connection #0
> * Follows Location: to new URL:
> 'http://login.passport.com/login.srf?lc=1033&id=229&ru=http%3A%2F%2Fmoneycentral%2Emsn%2Ecom%2Fpploggedin%2Easp%3FPage%3Dhttp%253A%252F%252Fmoneycentral%252Emsn%252Ecom%252Finvestor%252Fquotes%252Fpprtq%252Easp%26Query%3DPage%253DRTQ%2526Symbol%253Dorcl%2526REQUEST%255FMETHOD%253DGET&tw=43200&kv=2&ct=1014877476&ver=2.0.0248.1&tpf=5b566e78f24697f753fdd1b608fd10b3'
>
> * Disables POST, goes with GET
> * Connected to login.passport.com (64.4.60.254)
> > GET /login.srf?lc=1033&id=229&ru=http%3A%2F%2Fmoneycentral%2Emsn%2Ecom%2Fpploggedin%2Easp%3FPage%3Dhttp%253A%252F%252Fmoneycentral%252Emsn%252Ecom%252Finvestor%252Fquotes%252Fpprtq%252Easp%26Query%3DPage%253DRTQ%2526Symbol%253Dorcl%2526REQUEST%255FMETHOD%253DGET&tw=43200&kv=2&ct=1014877476&ver=2.0.0248.1&tpf=5b566e78f24697f753fdd1b608fd10b3 HTTP/1.1
> User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
> Host: login.passport.com
> Pragma: no-cache
> Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
> Cookie: MSPPre=llp_gapper_at_yahoo.com
>
> * Closing connection #0
>
> I still ended up at the login page. The content of the cookie file
> from netscape was shown in my first posting. So curl does not pass the
> cookie back to server correctly.
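
The trace above does show Cookie: headers going out (QUAUTH and MC1 to moneycentral.msn.com, MSPPre to login.passport.com), so it may help to list which entries in the cookie file apply to each host. A rough sketch, assuming the usual Netscape cookie file layout of seven tab-separated fields (domain, subdomain flag, path, secure flag, expiry, name, value). The helper name cookies_for_host is made up, and the suffix match is a simplification of real cookie domain matching.

```shell
# Hypothetical helper: print the names of cookies in a Netscape-format
# cookie file ($1) whose domain field matches the given host name ($2).
cookies_for_host() {
    awk -F '\t' -v host="$2" '
        /^#/ || NF < 7 { next }             # skip comments and short lines
        {
            d = $1
            sub(/^\./, "", d)               # ".msn.com" becomes "msn.com"
            # Crude suffix match; real cookie matching has more rules
            # (path, secure flag, expiry) that are ignored here.
            if (host == d || substr(host, length(host) - length(d)) == ("." d))
                print $6
        }
    ' "$1"
}

# Usage sketch:
#   cookies_for_host cookies moneycentral.msn.com
#   cookies_for_host cookies login.passport.com
```

If a cookie the browser session depended on never shows up for login.passport.com, that would point at the cookie file contents and not at curl's handling of it.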
Received on 2002-02-28