curl-users

Re: Extra bytes when downloading file

From: Jim Doutt <jdoutt_at_whoi.edu>
Date: Fri, 01 Aug 2003 12:06:52 -0400

Thanks for the suggestions. I have successfully eliminated the --get
and added the -L so now I get the file in one call. I've attached a
file of the newest results.

jim

Daniel Stenberg wrote:
>
> On Thu, 31 Jul 2003, Jim Doutt wrote:
>
> > I am interacting with a WEB server to generate & download data
> >
> > I can fill out the WEB page, and click on the "download data" button &
> > get a file that is 6721536 bytes long.
> >
> > It seems that with CURL, I need a 2-step process. The first appears to
> > generate the file on the WEB site
> >
> > curl -m 20 --trace trace.out --get
> > "http://128.128.xxx.yyy/RETRIEVE.HTM?SEED=L?Z+ELZZ50000000&BEG=03%2F06%2F28&END=03%2F06%2F29&FILE=aqqa&REQ=Download+Data&DONE=YES"
>
> You don't need --get in there. curl uses GET by default.
>
> If that page uses Location: to lead you to the file to download, you can do it
> in one step by using -L.
>
> > and I can retrieve it with
> >
> > curl -O -m 20 --get "http://128.128.xxx.yyy/aqqa"
> >
> > However the retrieved file is 6729728 bytes long
> >
> > ls -alrt aqqa
> > 6729728 Jul 31 10:50 aqqa
> > ls -alrt ~/aqas
> > 6721536 Jul 30 13:57 aqas
> >
> > If I do a cmp on the two files
> > cmp -l ~/aqas aqqa
> > cmp: EOF on /home/jdoutt/aqas
> >
> > So the file downloaded by CURL seems to have "stuff" appended to the end.
> >
> > Can someone help me with the correct syntax for doing this?
>
> Interesting. The syntax is correct, so there's something else that causes curl
> to do this.
>
> What curl version is this?
>
> Can you use -i (or -v) on that second command line to show us the headers the
> server returns?
>
> --
> Daniel Stenberg -- curl: been grokking URLs since 1998
>

I'm running on RedHat LINUX 7.3g

I'm testing on the above machine using
curl 7.10.4 (i386-redhat-linux-gnu) libcurl/7.10.4 OpenSSL/0.9.6b zlib/1.1.4

But I eventually need to move to a "Bitsy" StrongARM machine, and on that I am using
[root@Linux scripts]$ curl -V
curl 7.10.3 (arm-unknown-linux-gnu) libcurl/7.10.3

Downloading using Mozilla I get the smallest file size: 6721536 Jul 31 09:50 /home/jdoutt/ewqa
Using CURL on my Pentium laptop I get:                  6729728 Aug  1 11:42 abcd
On the StrongARM I get a file size of:                  7282688 Jan 16 22:07 aqqa

The data being downloaded is binary!

*******************************************************************
Here's a comparison of a file downloaded by Mozilla on my laptop with one downloaded by CURL (without the -i -v options!).

[jdoutt@localhost nootka]$ cmp -l ~/ewqa abcd
cmp: EOF on /home/jdoutt/ewqa
[jdoutt@localhost nootka]$

So the two files are identical up to the end of the shorter one; the CURL download just has extra bytes appended.
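
The prefix relationship that cmp reports here can also be checked explicitly. What follows is only a minimal sketch: it assumes `head -c` is available (it is on GNU and BSD systems, though POSIX does not require it), and the `is_prefix` helper and the small stand-in files are made up for illustration, not the real downloads.

```shell
#!/bin/sh
# Hypothetical helper: is_prefix SHORT LONG
# Succeeds when SHORT is byte-for-byte identical to the start of LONG.
is_prefix() {
    # Arithmetic expansion strips any padding that wc may print.
    short_size=$(( $(wc -c < "$1") ))
    # Compare only the first short_size bytes of LONG against SHORT.
    head -c "$short_size" "$2" | cmp -s - "$1"
}

# Demo with small stand-in files (not the real downloads).
tmp=$(mktemp -d)
printf 'DATA' > "$tmp/short"
printf 'DATAextra' > "$tmp/long"

if is_prefix "$tmp/short" "$tmp/long"; then
    extra=$(( $(wc -c < "$tmp/long") - $(wc -c < "$tmp/short") ))
    echo "prefix match, $extra extra bytes"
else
    echo "files differ before the end of the shorter one"
fi
rm -rf "$tmp"
```

With the real files this would be called as `is_prefix ~/ewqa abcd`.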

However, the still longer file from the Bitsy doesn't compare so well:
[jdoutt@localhost nootka]$ cmp -l abcd aqqa_from_bitsy | head -40
    16 114 105
    24 262 263
    25 26 23
    26 35 66
    27 17 32
    29 2 15
    30 267 67
    31 25 24

The BITSY file was transferred using FTP (& ensuring the binary option was used!)

*******************************************************************

**********************************************************************
running CURL with both "-i" and "-v" option for additional information
**********************************************************************

[jdoutt@localhost nootka]$ curl -i -v -m 20 -L -o abcd "http://128.128.21.131/RETRIEVE.HTM?SEED=L?Z+ELZZ50000000&BEG=03%2F06%2F28&END=03%2F06%2F29&FILE=aqqd&REQ=Download+Data&DONE=YES"
* About to connect() to 128.128.21.131:80
* Connected to 128.128.21.131 (128.128.21.131) port 80
> GET /RETRIEVE.HTM?SEED=L?Z+ELZZ50000000&BEG=03%2F06%2F28&END=03%2F06%2F29&FILE=aqqd&REQ=Download+Data&DONE=YES HTTP/1.1
User-Agent: curl/7.10.4 (i386-redhat-linux-gnu) libcurl/7.10.4 OpenSSL/0.9.6b zlib/1.1.4
Host: 128.128.21.131
Pragma: no-cache
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*

  % Total % Received % Xferd Average Speed Time Curr.
                                 Dload Upload Total Current Left Speed
  0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0* Follow to new URL: /aqqd
  0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
* Closing connection #0
* Follows Location: to new URL: 'http://128.128.21.131/aqqd'
* About to connect() to 128.128.21.131:80
* Connected to 128.128.21.131 (128.128.21.131) port 80
> GET /aqqd HTTP/1.1
User-Agent: curl/7.10.4 (i386-redhat-linux-gnu) libcurl/7.10.4 OpenSSL/0.9.6b zlib/1.1.4
Host: 128.128.21.131
Pragma: no-cache
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*

 92 6564k 92 6040k 0 0 330k 0 0:00:19 0:00:18 0:00:01 446k* transfer closed with -8192 bytes remaining to read
100 6564k 100 6572k 0 0 344k 0 0:00:19 0:00:19 0:00:00 580k
* Closing connection #0
curl: (18) transfer closed with -8192 bytes remaining to read
[jdoutt@localhost nootka]$ emacs output.oout&
[2] 3759
[jdoutt@localhost nootka]$ lr
total 20240
-rw-rw-r-- 1 jdoutt jdoutt 702 Feb 28 06:10 test.html

**********************************************************************
the beginning of what's in the data file "abcd"
**********************************************************************

HTTP/1.0 302 Found
Date: Fri, 01 Aug 2003 14:49:21 GMT
Location: /aqqd
Content-type: text/html

HTTP/1.0 200 OK
Server: Quanterra-Baler14/1.02
Content-type: application/octet-stream
Date: Fri, 01 Aug 2003 14:49:21 GMT
Expires: Fri, 01 Aug 2003 14:49:21 GMT
Content-Length: 6721536

000001D D08 LLZXI.......

************** ^^ binary data ^^ *********************
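
For what it's worth, the sizes are consistent with curl's message: the file curl wrote without -i (6729728 bytes) is 8192 bytes longer than the Content-Length the server announced (6721536), which matches the "-8192 bytes remaining" in error 18. Below is a rough sketch of that arithmetic, plus a hypothetical `trim_to_length` helper that cuts a download back to the announced length. It assumes the surplus really is trailing junk (the cmp result suggests so but does not prove it) and that the file was saved without -i, so it contains no headers.

```shell
#!/bin/sh
# Sizes quoted in this message.
content_length=6721536   # Content-Length announced by the server
received=6729728         # size of the file curl wrote (no -i)

surplus=$(( received - content_length ))
echo "surplus bytes: $surplus"

# Hypothetical helper: trim_to_length IN LENGTH OUT
# Copies only the first LENGTH bytes of IN to OUT.
trim_to_length() {
    head -c "$2" "$1" > "$3"
}

# Demo on a small stand-in file rather than the real download.
tmp=$(mktemp -d)
printf 'payloadJUNK' > "$tmp/in"
trim_to_length "$tmp/in" 7 "$tmp/out"
cat "$tmp/out"; echo
rm -rf "$tmp"
```

On a real download the call would look like `trim_to_length aqqa 6721536 aqqa.trimmed`, followed by a cmp against the Mozilla copy to confirm the result.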

Received on 2003-08-01