curl-users
Re: Using curl to get Yahoo mail
Date: Thu, 28 Mar 2002 06:53:51 -0700
OK, much further along now. I was making a dumb mistake: my grep for one of
the parameters matched two lines, so my resulting URL was messed up. But I
have succeeded in retrieving something out of the trash on my account
(nothing in the inbox right now). It has a whole lot of chaff I have to
figure out how to strip off, but logging in and staying logged in seems to
have been my main problem. I guess all the nice cookie stuff was added after
my version; I'm using 7.8.1. But this seems to mostly work.
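One way to guard against a grep that matches more than one line is to keep only the first match before handing it to sed. A minimal sketch, assuming the form puts name and value on the same line; form_value is a hypothetical helper name, not anything that ships with curl:

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first <input> named $2 in file $1.
# Assumes name="..." and value="..." sit on the same line, name before value.
form_value() {
    # -F treats the name as a fixed string, so the leading dot is literal
    grep -iF "name=\"$2\"" "$1" | head -n 1 |
        sed -e 's/^.*value="//' -e 's/".*//'
}

# Possible usage against the saved login page:
#   CHALLENGE=`form_value yahoo0.html .challenge`
```

Taking only the first match is a guess at which line is the right one; if the second match is the one that matters, the grep pattern needs tightening instead.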
There is one thing that I'm not sure about. All of the links in the
following pages are relative paths with no host. Right now I have just
hardcoded the host it seems to go to all the time, but how do I determine
where I'm getting redirected to and save that off? Here is what I have so far:
#!/bin/sh -x
#get the initial login page
curl -A "Lynx/2.8.4rel.1 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.6b" \
  -D yahoo_header1 -s --cookie cookies -L --url "http://mail.yahoo.com" \
  > yahoo0.html
#pull out the form values needed to login
UVAL=`grep -i "\"\.u\"" yahoo0.html | sed -e 's/^.*value="//' -e 's/".*//'`
CHALLENGE=`grep -i "name=\"\.challenge\"" yahoo0.html |
  sed -e 's/^.*value="//' -e 's/".*//'`
USER='someuser'
PASSWD='secret'
#login
curl -v -A "Lynx/2.8.4rel.1 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.6b" \
  -D yahoo_header2 -s --cookie cookies -L --url \
  "login.yahoo.com/config/login?.tries=&.src=ym&.last=&promo=&.intl=us&.bypass=&.partner=&.u=${UVAL}&.v=0&.challenge=${CHALLENGE}&.emailCode=&hasMsgr=1&.chkP=Y&.done=&login=${USER}&passwd=${PASSWD}&.persistent=&.save=1&.hash=0&.js=0&.md5=0" \
  > yahoo1.html
#go to inbox
INBOX_URL=`grep "Check Mail" yahoo1.html |
  sed -e 's/^.*href="//' -e 's/".*//'`
MAIL_HOST='http://us.f148.mail.yahoo.com'
curl -v -A "Lynx/2.8.4rel.1 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.6b" \
  -L -b yahoo_header2 -D yahoo_header3 -s --url \
  "${MAIL_HOST}${INBOX_URL}" > yahoo2.html
#look in other folders
FOLDERS_URL=`grep "Folders" yahoo1.html |
  sed -e 's/^.*href="//' -e 's/".*//'`
curl -v -A "Lynx/2.8.4rel.1 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.6b" \
  -L -b yahoo_header2 -D yahoo_header4 -s --url \
  "${MAIL_HOST}${FOLDERS_URL}" > yahoo3.html
#look in trash
TRASH_URL=`grep "Trash" yahoo3.html |sed -e 's/^.*href="//' -e 's/".*//'`
curl -v -A "Lynx/2.8.4rel.1 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.6b" \
  -L -b yahoo_header2 -D yahoo_header5 -s --url \
  "${MAIL_HOST}${TRASH_URL}" > yahoo4.html
#get one of the letters from the trash, probably won't work if there is more
#than one message there
TRASH_LETTER_URL=`grep "ShowLetter" yahoo4.html |
  sed -e 's/^.*href="//' -e 's/".*//'`
curl -v -A "Lynx/2.8.4rel.1 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.6b" \
  -L -b yahoo_header2 -D yahoo_header6 -s --url \
  "${MAIL_HOST}${TRASH_LETTER_URL}" > yahoo5.html
Received on 2002-03-28