curl-library

Re: [PATCH/RFC] More flexible output filename (based on HTTP reply)

From: Daniel Stenberg <daniel_at_haxx.se>
Date: Mon, 16 Mar 2015 18:14:02 +0100 (CET)

On Mon, 16 Mar 2015, Leon Winter wrote:

> The basic problem is that curl (the binary) opens the file very early
> without any knowledge of the server response. This also leads to another
> bug/inconsistency. The effective URL does not need to match the effective
> filename since the filename is statically computed from the input URL and
> not from the effective URL (which can be different due to 3xx HTTP
> redirections).

Nice work Leon!

My first gut reaction: I don't think we are "allowed" to change what file name
-O uses unless you also use -J. I mean, your logic and work so far is great
but I think -O should get the file name from the initial URL while you're
allowed or even encouraged to figure out the "right" name with -J.

This is because with -J a user cannot fully know which file name will be
used, but with -O (without -J) the URL clearly tells which file name will
end up on disk after a successful transfer, and there are bound to be many
scripts out there relying on that behavior. Even with redirects.

Or am I being unreasonable?
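
To illustrate the distinction between the two names, here is a minimal
Python sketch (not curl's actual code). It assumes the name is simply the
last path segment of a URL; the helper name_from_url() and both example URLs
are made up for illustration, and curl's real handling of query strings and
empty paths may differ.

```python
import posixpath
from urllib.parse import urlsplit


def name_from_url(url: str) -> str:
    """Return the last path segment of a URL (simplified -O style naming)."""
    path = urlsplit(url).path
    return posixpath.basename(path)


# Hypothetical URLs: the one the user typed, and where a 3xx redirect ended up.
initial_url = "http://example.com/download/latest.tar.gz"
effective_url = "http://mirror.example.com/pub/pkg-1.2.3.tar.gz"

# With plain -O a script can predict this name before the transfer starts.
predictable_name = name_from_url(initial_url)

# This name is only known once the redirect chain has been followed.
late_name = name_from_url(effective_url)

print(predictable_name)
print(late_name)
```

The first name can be computed from the command line alone, which is what
existing scripts depend on; the second cannot.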

> However this patch is still work-in-progress as it does not solve the key
> problem of #81 which is to actually resume the download. The patch however
> gets the "right" file, yet does not resume the download. This is because
> when we do the initial first request we do not know the output file name and
> thus cannot determine the filesize of our local file.

I'm pretty sure we should just document this as a known limitation somewhere
(probably next to the -J documentation), as I believe chasing after a solution
to this is going to lure us into dark and scary places of guesses and
assumptions.
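
The ordering problem can be sketched roughly as follows. This is a
simplified Python illustration, not how curl is implemented: the helper
resume_range_header() is hypothetical, and it assumes a resume is requested
with a "Range: bytes=N-" header built from the size of the local file. When
the file name is not known before the request, there is nothing to measure.

```python
import os
import tempfile
from typing import Optional


def resume_range_header(local_path: Optional[str]) -> Optional[str]:
    """Build a Range header from a partial local file, if one is known."""
    if local_path is None or not os.path.isfile(local_path):
        return None  # name unknown or no partial file: cannot resume
    size = os.path.getsize(local_path)
    if size == 0:
        return None  # nothing downloaded yet, start from the beginning
    return f"Range: bytes={size}-"


with tempfile.TemporaryDirectory() as tmpdir:
    # Pretend an earlier transfer left 1000 bytes on disk.
    partial = os.path.join(tmpdir, "partial.bin")
    with open(partial, "wb") as f:
        f.write(b"\0" * 1000)

    # Name known up front (the -O case): the offset can be computed.
    header_known = resume_range_header(partial)

    # Name only known after the response (the -J case): no offset available.
    header_unknown = resume_range_header(None)

print(header_known)
print(header_unknown)
```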

> Regarding the code in the patch I am not too sure about my check for the
> "Location" header response field or whether one can perform this check in a
> better way.

I think that's a decent way. You can also just opt to parse incoming headers
and check the HTTP response code for relevant 3xx codes.
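
As a rough sketch of that alternative, in Python rather than the C of the
patch: the function names and the set of codes treated as "relevant" are my
assumptions here, not taken from the patch or from curl's source. The idea is
to read the status line first and only then look for a Location header.

```python
from typing import List, Optional

# Assumption: the 3xx codes treated as redirects that carry a Location.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def parse_status_code(status_line: str) -> Optional[int]:
    """Extract the numeric code from a line like 'HTTP/1.1 302 Found'."""
    parts = status_line.split()
    if len(parts) >= 2 and parts[0].startswith("HTTP/") and parts[1].isdigit():
        return int(parts[1])
    return None


def redirect_target(header_lines: List[str]) -> Optional[str]:
    """Return the Location value for a relevant 3xx response, else None."""
    if not header_lines:
        return None
    code = parse_status_code(header_lines[0])
    if code not in REDIRECT_CODES:
        return None
    for line in header_lines[1:]:
        name, sep, value = line.partition(":")
        # Header names are case-insensitive.
        if sep and name.strip().lower() == "location":
            return value.strip()
    return None


# Hypothetical response headers for illustration.
redirect_response = [
    "HTTP/1.1 302 Found",
    "Content-Length: 0",
    "location: http://mirror.example.com/pub/pkg-1.2.3.tar.gz",
]
plain_response = [
    "HTTP/1.1 200 OK",
    "Location: http://example.com/ignored",
]

print(redirect_target(redirect_response))
print(redirect_target(plain_response))
```

Checking the code first means a stray Location header on a non-3xx response
does not get mistaken for a redirect.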

> Also while testing I just noticed there are problems when doing a HEAD
> request (curl -I):

> I am not sure why there is an attempt to write to the file when I am just
> doing a HEAD request though.

Hm. I'll try to give that a closer look soon.

-- 
  / daniel.haxx.se
Received on 2015-03-16