curl-users

curl question/problem...

From: mark douglas <badouglas_at_gmail.com>
Date: Sat, 9 Jan 2010 11:40:58 -0800

Hi.

I'm fairly new to curl. I'm trying to use curl from the command line
(Linux) to fetch pages from a college website (SJSU). I can more or
less get the 1st page, but the 2nd/3rd pages depend on referers and
cookies, and I think that's where things go wrong.

I'm also dealing with fetching data from within frames on the given
page. Between analyzing the page source and the LiveHTTPHeaders
capture, working out the frame URLs shouldn't be an issue.

I've tried using --cookie-jar as well as --cookie, in various
combinations with "-e" for the referer and "-d" for the POST data.
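
For reference, here is a minimal sketch of how I understand the cookie
and referer options are meant to combine. It is not verified against
the SJSU site. As far as I can tell from the man page, --cookie-jar
only writes cookies out when curl exits, while --cookie with a file
name is what reads them back and sends them, so the sketch passes both
on every call. It also passes -e the bare URL, not a "Referer:" header
line. The function name fetch_page and the JAR variable are made up
for illustration.

```shell
#!/bin/sh
# Hypothetical helper: one cookie file used for both reading and writing.
JAR="lcookie.lwp"

# fetch_page REFERER OUTFILE URL [extra curl options...]
# --cookie sends the cookies already stored in $JAR; --cookie-jar saves
# whatever the server sets, so the next call continues the same session.
fetch_page() {
    referer=$1
    outfile=$2
    url=$3
    shift 3
    curl -s -L \
        --cookie "$JAR" --cookie-jar "$JAR" \
        -e "$referer" \
        --output "$outfile" \
        "$@" "$url"
}
```

Each step would then call fetch_page with the previous page's URL as
the referer, and the POST step would append -d "..." as an extra option.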

Below are the LiveHTTPHeaders capture for the pages and the test
shell script I'm using. Any thoughts/pointers would be greatly
appreciated.

Thanks

-bruce

** This is the data displayed when I press the "submit" button on the
location/term page to get to the "class select" page. As I understand
the process, the curl command should more or less replicate this
request: the referer, the POST data, and the target URL.
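
To make that concrete, this is roughly the shape I believe the
replayed request should take. It is a sketch, not a tested command
against this server. The helper name post_form and its argument order
are invented for illustration. It relies on -d making curl send a POST
with Content-Type: application/x-www-form-urlencoded and working out
Content-Length by itself, so those headers from the capture are not
set by hand.

```shell
#!/bin/sh
# Hypothetical helper replaying a captured form submission.
# post_form REFERER POSTDATA URL OUTFILE
post_form() {
    # -d switches curl to POST and sends the string as the request body.
    # --cookie/--cookie-jar read and update the same cookie file.
    curl -s \
        --cookie lcookie.lwp --cookie-jar lcookie.lwp \
        -A "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.11) Gecko/2009061118 Fedora/3.0.11-1.fc9 Firefox/3.0.11" \
        -e "$1" \
        -d "$2" \
        --output "$4" \
        "$3"
}
```

The referer, POST data and target URL would be the three values shown
in the capture below, passed in that order, plus an output file name.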

livehttpheaders data:
========================================
https://cmshr.sjsu.edu/psc/HSJPRDF/EMPLOYEE/HSJPRD/c/COMMUNITY_ACCESS.CLASS_SEARCH.GBL

POST /psc/HSJPRDF/EMPLOYEE/HSJPRD/c/COMMUNITY_ACCESS.CLASS_SEARCH.GBL HTTP/1.1

Host: cmshr.sjsu.edu

User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.11) Gecko/2009061118 Fedora/3.0.11-1.fc9 Firefox/3.0.11

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8

Accept-Language: en-us,en;q=0.5

Accept-Encoding: gzip,deflate

Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7

Keep-Alive: 300

Connection: keep-alive

Referer: https://cmshr.sjsu.edu/psc/HSJPRDF/EMPLOYEE/HSJPRD/c/COMMUNITY_ACCESS.CLASS_SEARCH.GBL?FolderPath=PORTAL_ROOT_OBJECT.PA_HC_CLASS_SEARCH&PortalActualURL=https%3a%2f%2fcmshr.sjsu.edu%2fpsc%2fHSJPRDF%2fEMPLOYEE%2fHSJPRD%2fc%2fCOMMUNITY_ACCESS.CLASS_SEARCH.GBL&PortalRegistryName=EMPLOYEE&PortalServletURI=https%3a%2f%2fcmshr.sjsu.edu%2fpsp%2fHSJPRDF%2f&PortalURI=https%3a%2f%2fcmshr.sjsu.edu%2fpsc%2fHSJPRDF%2f&PortalHostNode=HRMS&NoCrumbs=yes

Cookie: I4Web_uuid=ece1faf2b59ace0ba40a4849c80e1d1b_at_e18f35cc4*0339838ad3db6ba9a32396e947e5d0d6_at_1418f55cc4*8bc80f3fe499835a27c78d5593811a3a_at_3818fc5cc4*166fa2936446e617426d945230e89e49_at_1e19045cc4*88210f324b634276bce6758d896eca38_at_8195a5cc4*d1da10f4207bca1bd889fea53873c471_at_3c1baf5cc4*d56e90391bc1ab51115f5c448ac2a124_at_71be65cc4*87ccd281d6940c53a49366c366796911_at_3e1bee5cc4*bb23f7cd74e303c96a11b4fdec5b7646_at_812a35cc6*84f727d9a8f7a11871756dad0c00d545_at_1a3b445cb4*;
cssln164HSJ1-84915=B1h9LG2L1yydzpj3TvqQrHBHKdbWpnl8!695030755;
ExpirePage=https://cmshr.sjsu.edu/psp/HSJPRDF/;
PS_LOGINLIST=https://cmshr.sjsu.edu/HSJPRDF;
PS_TOKENEXPIRE=9_Jan_2010_18:06:54_GMT; SignOnDefault=CMSPUBLIC;
cssln169HSJ1-84915=JpgTLL9RJVQPWjphl3Vybx1xnvW8zbMl!-1963651055;
cssln162HSJ1-84915=1LrQLH7b20GmB0kx50cf54GPGjj97WWM!813942549;
cssln118HSJ1-84915=xbDyLLFZZCTbpLFhJQpCNwNF2ppgjH4y!-923260117;
PS_TOKEN=AAAApgECAwQAAQAAAAACvAAAAAAAAAAsAARTaGRyAgBOcQgAOAAuADEAMBQLu6wR/++AI1n7RObpon/kagLJUAAAAGYABVNkYXRhWnicHclNDkBADIbhd4ZYWbkHYYKMrSF+gghxEPdzOJ3plz5tWuBVOopRSOnPm+HYuTl56NlYcAkjB1PKLPdVPhdDjaGkks7D9FqxoqEQjdiEvQtpJZYfnXoL+Q==

Content-Type: application/x-www-form-urlencoded

Content-Length: 339

ICType=Panel&ICElementNum=0&ICStateNum=2&ICAction=CLASS_SRCH_WRK2_SSR_PB_SRCH%2457%24&ICXPos=0&ICYPos=0&ICFocus=&ICSaveWarningFilter=0&ICChanged=-1&ICResubmit=0&ICSID=HpLTZLhQFp4p&CLASS_SRCH_WRK2_INSTITUTION%2445%24=SJ000&CLASS_SRCH_STRM1=2102&CLASS_SRCH_WRK2_SSR_CLS_SRCH_TYPE%2458%24=06&CLASS_SRCH_WRK2_SSR_CLS_SRCH_TYPE%2458%24%24rad=06

HTTP/1.x 200 OK

Cache-Control: no-cache

Date: Sat, 09 Jan 2010 18:07:47 GMT

Content-Length: 21137

Content-Type: text/html; CHARSET=UTF-8

IgnorePortalRegisteredURL: 1

PortalRegisteredURL: https://cmshr.sjsu.edu/psc/HSJPRDF/EMPLOYEE/HSJPRD/c/COMMUNITY_ACCESS.CLASS_SEARCH.GBL

UsesPortalRelativeURL: true

X-Powered-By: Servlet/2.4 JSP/2.0

Set-Cookie: PS_TOKENEXPIRE=9_Jan_2010_18:07:47_GMT; domain=.sjsu.edu; path=/; secure

test shell script:
========================================
#!/bin/sh -v
#
# test shell for curl..
#
#curl --cookie lcookie.lwp --cookie-jar lcookie.lwp --output "ctest.dat" -L "http://my.sjsu.edu/"

#foo="http://my.sjsu.edu/"

#curl --cookie lcookie.lwp --cookie-jar lcookie.lwp --output "ctest.dat" -L "$foo"

#exit
curl -v --cookie-jar lcookie.lwp --output "ctest2.dat" \
  "https://cmshr.sjsu.edu/psp/HSJPRDF/EMPLOYEE/HSJPRD/c/COMMUNITY_ACCESS.CLASS_SEARCH.GBL?FolderPath=PORTAL_ROOT_OBJECT.PA_HC_CLASS_SEARCH"
#exit
curl -v -A "Mozilla/4.73 [en] (X11; U; Linux 2.2.15 i686)" \
  --cookie-jar lcookie.lwp --output "ctest3.dat" \
  -e "https://cmshr.sjsu.edu/psp/HSJPRDF/EMPLOYEE/HSJPRD/c/COMMUNITY_ACCESS.CLASS_SEARCH.GBL?FolderPath=PORTAL_ROOT_OBJECT.PA_HC_CLASS_SEARCH" \
  -L "https://cmshr.sjsu.edu/psc/HSJPRDF/EMPLOYEE/HRMS/s/WEBLIB_PT_NAV.ISCRIPT1.FieldFormula.IScript_UniHeader_Frame?c=uA%2buCaKuiBh5DTZEFHMBvNKbD7XLjINl&FolderPath=PORTAL_ROOT_OBJECT.PA_HC_CLASS_SEARCH&PortalActualURL=https%3a%2f%2fcmshr.sjsu.edu%2fpsc%2fHSJPRDF%2fEMPLOYEE%2fHSJPRD%2fc%2fCOMMUNITY_ACCESS.CLASS_SEARCH.GBL&PortalRegistryName=EMPLOYEE&PortalServletURI=https%3a%2f%2fcmshr.sjsu.edu%2fpsp%2fHSJPRDF%2f&PortalURI=https%3a%2f%2fcmshr.sjsu.edu%2fpsc%2fHSJPRDF%2f&PortalHostNode=HRMS&PortalIsPagelet=true&NoCrumbs=yes"

#get the page with the term/location --- uses the get
curl -v -A "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.11) Gecko/2009061118 Fedora/3.0.11-1.fc9 Firefox/3.0.11" \
  --cookie-jar lcookie.lwp --output "ctest4.dat" \
  -e "https://cmshr.sjsu.edu/psp/HSJPRDF/EMPLOYEE/HSJPRD/c/COMMUNITY_ACCESS.CLASS_SEARCH.GBL?FolderPath=PORTAL_ROOT_OBJECT.PA_HC_CLASS_SEARCH" \
  -L "https://cmshr.sjsu.edu/psc/HSJPRDF/EMPLOYEE/HSJPRD/c/COMMUNITY_ACCESS.CLASS_SEARCH.GBL?FolderPath=PORTAL_ROOT_OBJECT.PA_HC_CLASS_SEARCH&PortalActualURL=https%3a%2f%2fcmshr.sjsu.edu%2fpsc%2fHSJPRDF%2fEMPLOYEE%2fHSJPRD%2fc%2fCOMMUNITY_ACCESS.CLASS_SEARCH.GBL&PortalRegistryName=EMPLOYEE&PortalServletURI=https%3a%2f%2fcmshr.sjsu.edu%2fpsp%2fHSJPRDF%2f&PortalURI=https%3a%2f%2fcmshr.sjsu.edu%2fpsc%2fHSJPRDF%2f&PortalHostNode=HRMS&NoCrumbs=yes"

#
# The following two commands are attempts to get the page with the
# class display... neither one works.
# Instead, the output files are simply the same as the above page,
# with the location/term menu.
#
# The result should be a page that lists a class schedule select menu.
#

#get the page with the search class menu... it's a post
curl -v -A "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.11) Gecko/2009061118 Fedora/3.0.11-1.fc9 Firefox/3.0.11" \
  --cookie-jar lcookie.lwp --output "ctest5.dat" \
  -e "Referer: https://cmshr.sjsu.edu/psc/HSJPRDF/EMPLOYEE/HSJPRD/c/COMMUNITY_ACCESS.CLASS_SEARCH.GBL?FolderPath=PORTAL_ROOT_OBJECT.PA_HC_CLASS_SEARCH&PortalActualURL=https%3a%2f%2fcmshr.sjsu.edu%2fpsc%2fHSJPRDF%2fEMPLOYEE%2fHSJPRD%2fc%2fCOMMUNITY_ACCESS.CLASS_SEARCH.GBL&PortalRegistryName=EMPLOYEE&PortalServletURI=https%3a%2f%2fcmshr.sjsu.edu%2fpsp%2fHSJPRDF%2f&PortalURI=https%3a%2f%2fcmshr.sjsu.edu%2fpsc%2fHSJPRDF%2f&PortalHostNode=HRMS&NoCrumbs=yes" \
  -d "ICType=Panel&ICElementNum=0&ICStateNum=2&ICAction=CLASS_SRCH_WRK2_SSR_PB_SRCH%2457%24&ICXPos=0&ICYPos=0&ICFocus=&ICSaveWarningFilter=0&ICChanged=-1&ICResubmit=0&ICSID=HpLTZLhQFp4p&CLASS_SRCH_WRK2_INSTITUTION%2445%24=SJ000&CLASS_SRCH_STRM1=2102&CLASS_SRCH_WRK2_SSR_CLS_SRCH_TYPE%2458%24=06&CLASS_SRCH_WRK2_SSR_CLS_SRCH_TYPE%2458%24%24rad=06" \
  -L "https://cmshr.sjsu.edu/psc/HSJPRDF/EMPLOYEE/HSJPRD/c/COMMUNITY_ACCESS.CLASS_SEARCH.GBL"

#get the page with the search class menu... it's a post
curl -v -A "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.11) Gecko/2009061118 Fedora/3.0.11-1.fc9 Firefox/3.0.11" \
  --cookie lcookie.lwp --output "ctest6.dat" \
  -e "Referer: https://cmshr.sjsu.edu/psc/HSJPRDF/EMPLOYEE/HSJPRD/c/COMMUNITY_ACCESS.CLASS_SEARCH.GBL" \
  -d "ICType=Panel&ICElementNum=0&ICStateNum=2&ICAction=CLASS_SRCH_WRK2_SSR_PB_SRCH%2457%24&ICXPos=0&ICYPos=0&ICFocus=&ICSaveWarningFilter=0&ICChanged=-1&ICResubmit=0&ICSID=HpLTZLhQFp4p&CLASS_SRCH_WRK2_INSTITUTION%2445%24=SJ000&CLASS_SRCH_STRM1=2102&CLASS_SRCH_WRK2_SSR_CLS_SRCH_TYPE%2458%24=06&CLASS_SRCH_WRK2_SSR_CLS_SRCH_TYPE%2458%24%24rad=06" \
  -L "https://cmshr.sjsu.edu/psc/HSJPRDF/EMPLOYEE/HSJPRD/c/COMMUNITY_ACCESS.CLASS_SEARCH.GBL"

=====================================
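
One thing I'm unsure about: the -d strings above hard-code
ICSID=HpLTZLhQFp4p and ICStateNum=2, copied from the browser session.
If the server issues those values per session (a guess on my part),
they would have to be read out of the page fetched in the previous
step, not pasted in. A rough sketch of that, assuming the page carries
them as hidden inputs written on a single line with the name attribute
before value, which I have not confirmed. The helper name hidden_value
is invented.

```shell
#!/bin/sh
# hidden_value NAME FILE
# Prints the value attribute of the first <input ... name="NAME" ... value="...">
# found in FILE. Assumes the whole tag sits on one line and that name
# appears before value; either quote style is accepted.
hidden_value() {
    sed -n "s/.*name=[\"']$1[\"'][^>]*value=[\"']\([^\"']*\)[\"'].*/\1/p" "$2" | head -n 1
}
```

Something like ICSID=$(hidden_value ICSID ctest4.dat) could then feed
the value into the -d string for the POST.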

thanks for any/all pointers...
Received on 2010-01-09