curl-users

Re: curl to download a file, not the web page

From: Guirong Wang <guirong2004_at_yahoo.com>
Date: Mon, 18 Jun 2007 11:03:48 -0700 (PDT)

 Thanks for your reply. I compared the headers from curl with the -v option against those from LiveHTTPHeaders.
  First, I compared the headers from the initial request to the https://somesite.com/retrieve.aspx page, listed below:
  Output from curl -v:
**********************************************************
   
 ...
 ...
 ...
  Pragma: no-cache
Accept: */*
Cookie: PD-S-SESSION-ID=2_R95MuNDSVzNlzbTD7RWJI7lAfIixBcGp1AbR3Jij25pOPE-4;
-> PD_STATEFUL_bd4058f4-c931-11db-a47c-000255ef1e40=%2Flni%2Fpeb_prod;
AMWEBJCT!%2Flni%2Fpeb_prod!ProviderWebFormsAuth=03A404742022E
  < HTTP/1.1 200 OK
< p3p: CP="NON CUR OTPi OUR NOR UNI"
< content-type: text/html; charset=utf-8
< cache-control: private
< date: Fri, 15 Jun 2007 22:11:40 GMT
< x-powered-by: ASP.NET
< transfer-encoding: chunked
< x-aspnet-version: 1.1.4322
< server: Microsoft-IIS/6.0
< x-old-content-length: 152533
 ...
 ...
 ...
  < Set-Cookie: AMWEBJCT!%2Flni%2Fpeb_prod!ProviderWebFormsAuth=042FF2C816A727B11A
 ; Path=/
* Added cookie PD-S-SESSION-ID="2_R95MuNDSVzNlzbTD7RWJI7lAfIixBcGp1AbR3Jij25pOPE
-4" for domain path /, expire 0
< Set-Cookie: PD-S-SESSION-ID=2_R95MuNDSVzNlzbTD7RWJI7lAfIixBcGp1AbR3Jij25pOPE-4
; Path=/; Secure
***********************************************************************
  Headers from LiveHTTPHeaders:
   ...
 ...
 ...
  Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us
<-> Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: https://somesite.com/
Cookie: PD-S-SESSION-ID=2_1UbZN0a34tMtuJL2znHcoiX7fKBn+Hmz2JVKpDdqn6FrQsIs;
-> PD_STATEFUL_bd4058f4-c931-11db-a47c-000255ef1e40=%2Flni%2Fpeb_prod;
-> PD_STATEFUL_c918d4a6-fb7c-11d8-a381-000255ef1e40=%2FmyAccess; AMWEBJCT!%2Flni%2Fpeb_prod!ProviderWebFormsAuth=B601486C2938CFD280E6D3B039 ; IV_JCT=%2Flni%2Fpeb_prod
  HTTP/1.x 200 OK
p3p: CP="NON CUR OTPi OUR NOR UNI"
Content-Type: text/html; charset=utf-8
Cache-Control: private
Date: Fri, 15 Jun 2007 19:19:31 GMT
X-Powered-By: ASP.NET
Content-Encoding: gzip
Transfer-Encoding: chunked
x-aspnet-version: 1.1.4322
Server: Microsoft-IIS/6.0
x-old-content-length: 152528
Set-Cookie: AMWEBJCT!%2Flni%2Fpeb_prod!ProviderWebFormsAuth=7D855E5CEB43F7A79720 ; Path=/
Set-Cookie: PD-S-SESSION-ID=2_1UbZN0a34tMtuJL2znHcoiX7fKBn+Hmz2JVKpDdqn6FrQsIs; Path=/; Secure
  ************************************************************************
  Differences between curl and LiveHTTPHeaders on the first request to the download page:
1. In LiveHTTPHeaders, there are two PD_STATEFUL cookies:
PD_STATEFUL_bd4058f4-c931-11db-a47c-000255ef1e40=%2Flni%2Fpeb_prod;
PD_STATEFUL_c918d4a6-fb7c-11d8-a381-000255ef1e40=%2FmyAccess;
(marked with -> at the beginning of the line), while only one PD_STATEFUL cookie is set in the curl -v output:
PD_STATEFUL_bd4058f4-c931-11db-a47c-000255ef1e40=%2Flni%2Fpeb_prod;
 
2. LiveHTTPHeaders shows the following line (marked with <->), which does not appear in the curl -v output:
Accept-Encoding: gzip,deflate
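
A minimal sketch of the cookie comparison, assuming the two Cookie request headers are pasted in as plain strings. The helper name cookie_names is hypothetical, and the values are copied from the captures above (the last value in each may be cut off by the terminal, so only the names are compared). Besides the second PD_STATEFUL cookie, it also reports IV_JCT, which is present in the LiveHTTPHeaders capture only.

```python
def cookie_names(cookie_header):
    """Return the set of cookie names in a Cookie request-header value."""
    names = set()
    for pair in cookie_header.split(";"):
        pair = pair.strip()
        if pair:
            # Split on the first "=" only; values may contain "=" themselves.
            names.add(pair.split("=", 1)[0])
    return names


# Cookie header sent by curl (from the curl -v capture).
curl_cookie = (
    "PD-S-SESSION-ID=2_R95MuNDSVzNlzbTD7RWJI7lAfIixBcGp1AbR3Jij25pOPE-4; "
    "PD_STATEFUL_bd4058f4-c931-11db-a47c-000255ef1e40=%2Flni%2Fpeb_prod; "
    "AMWEBJCT!%2Flni%2Fpeb_prod!ProviderWebFormsAuth=03A404742022E"
)

# Cookie header sent by the browser (from the LiveHTTPHeaders capture).
browser_cookie = (
    "PD-S-SESSION-ID=2_1UbZN0a34tMtuJL2znHcoiX7fKBn+Hmz2JVKpDdqn6FrQsIs; "
    "PD_STATEFUL_bd4058f4-c931-11db-a47c-000255ef1e40=%2Flni%2Fpeb_prod; "
    "PD_STATEFUL_c918d4a6-fb7c-11d8-a381-000255ef1e40=%2FmyAccess; "
    "AMWEBJCT!%2Flni%2Fpeb_prod!ProviderWebFormsAuth=B601486C2938CFD280E6D3B039 ; "
    "IV_JCT=%2Flni%2Fpeb_prod"
)

# Cookies the browser sends that curl does not.
missing_from_curl = sorted(cookie_names(browser_cookie) - cookie_names(curl_cookie))
for name in missing_from_curl:
    print(name)
```
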
   
   
  
After the form is filled out:
  ************************************************************************
The following is the output from curl with the -v option:
  ...
...
...
  Pragma: no-cache
Accept: */*
Cookie: PD-S-SESSION-ID=2_dNFvP+PAMy39X5MhOjDQPPfAtV7q7bgoB82R9ord2SG0UvxH;
AMWEBJCT!%2Flni%2Fpeb_prod!ProviderWebFormsAuth=E97521593A739AC20FE1CAABFD317858E00A
  Content-Length: 1623
Content-Type: application/x-www-form-urlencoded
  
< HTTP/1.1 200 OK
< p3p: CP="NON CUR OTPi OUR NOR UNI"
< content-type: text/html; charset=utf-8
< cache-control: private
< date: Fri, 15 Jun 2007 23:14:16 GMT
< x-powered-by: ASP.NET
< transfer-encoding: chunked
< x-aspnet-version: 1.1.4322
< server: Microsoft-IIS/6.0
< x-old-content-length: 152533
* Added cookie AMWEBJCT!%2Flni%2Fpeb_prod!ProviderWebFormsAuth="E85CAD13A3BE45F3
" for domain , path /, expire 0
< Set-Cookie: AMWEBJCT!%2Flni%2Fpeb_prod!ProviderWebFormsAuth=E85CAD13A3BE45F373
; Path=/
* Added cookie PD_STATEFUL_bd4058f4-c931-11db-a47c-000255ef1e40="%2Flni%2Fpeb_pr
od" for domain, path /, expire 0
< Set-Cookie: PD_STATEFUL_bd4058f4-c931-11db-a47c-000255ef1e40=%2Flni%2Fpeb_prod
; Path=/
* Added cookie PD-S-SESSION-ID="2_dNFvP+PAMy39X5MhOjDQPPfAtV7q7bgoB82R9ord2SG0Uv
xH" for domain , path /, expire 0
< Set-Cookie: PD-S-SESSION-ID=2_dNFvP+PAMy39X5MhOjDQPPfAtV7q7bgoB82R9ord2SG0UvxH
; Path=/; Secure
  ********************************************************************************
  Headers from LiveHTTPHeaders:
   
  ...
...
  
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: https://somesite.com/retrieve.aspx
Cookie:
PD-S-SESSION-ID=2_1UbZN0a34tMtuJL2znHcoiX7fKBn+Hmz2JVKpDdqn6FrQsIs; PD_STATEFUL_bd4058f4-c931-11db-a47c-000255ef1e40=%2Flni%2Fpeb_prod; PD_STATEFUL_c918d4a6-fb7c-11d8-a381-000255ef1e40=%2FmyAccess; AMWEBJCT!%2Flni%2Fpeb_prod!ProviderWebFormsAuth=7D855E5CEB43F7A797204A8F0F8B7EB64A5B5404EC ; IV_JCT=%2Flni%2Fpeb_prod
Content-Type: multipart/form-data; boundary=---------------------------236982958018339
Content-Length: 10382
  ...
  ...
  
HTTP/1.x 200 OK
p3p: CP="NON CUR OTPi OUR NOR UNI"
content-disposition: attachment; filename=EDI.txt
Content-Type: application/octet-stream; charset=utf-8
Cache-Control: private
Date: Fri, 15 Jun 2007 19:19:54 GMT
X-Powered-By: ASP.NET
Transfer-Encoding: chunked
x-aspnet-version: 1.1.4322
Server: Microsoft-IIS/6.0
Set-Cookie: AMWEBJCT!%2Flni%2Fpeb_prod!ProviderWebFormsAuth=515AF25147F07F180CED8E0B0D5E768DDBEFA8C0D8 ; Path=/
Set-Cookie: PD-S-SESSION-ID=2_1UbZN0a34tMtuJL2znHcoiX7fKBn+Hmz2JVKpDdqn6FrQsIs; Path=/; Secure
  ************************************************************************************
Differences between curl and LiveHTTPHeaders after the form is filled out for downloading:
  After the form is submitted, the LiveHTTPHeaders response contains the following two lines; the curl -v output has no content-disposition line, and its content-type is text/html instead:
content-disposition: attachment; filename=EDI.txt
Content-Type: application/octet-stream; charset=utf-8
In addition, as in the first part, LiveHTTPHeaders shows two PD_STATEFUL cookies while the curl -v output shows only one.
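
The request headers of the two POSTs above also differ in how the form body is encoded. A minimal sketch, assuming a subset of the request headers from the two captures is pasted in as text (parse_headers and media_type are hypothetical helper names). Whether this difference explains why the server returns the HTML page instead of EDI.txt is an assumption; the captures do not confirm it.

```python
def parse_headers(raw):
    """Parse 'Name: value' lines into a dict keyed by lower-cased name."""
    headers = {}
    for line in raw.strip().splitlines():
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def media_type(headers):
    """Return the Content-Type value without its parameters."""
    return headers["content-type"].split(";", 1)[0].strip()


# Subset of the POST request headers from the curl -v capture.
curl_post = parse_headers("""
Accept: */*
Content-Length: 1623
Content-Type: application/x-www-form-urlencoded
""")

# Subset of the POST request headers from the LiveHTTPHeaders capture.
browser_post = parse_headers("""
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Content-Type: multipart/form-data; boundary=---------------------------236982958018339
Content-Length: 10382
""")

# Headers present in both captures whose values differ.
differing = sorted(
    name
    for name in curl_post.keys() & browser_post.keys()
    if curl_post[name] != browser_post[name]
)
print(differing)
print("curl:   ", media_type(curl_post))
print("browser:", media_type(browser_post))
```

With curl, -d/--data sends an application/x-www-form-urlencoded body, while -F/--form sends multipart/form-data, so the two captures may not be posting the form in the same way.
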
  Could these differences be the cause of the incorrect download? Thanks.
  
 

       
Received on 2007-06-18