
Re: --etag-compare vs. --time-cond

From: Timothe Litt <litt_at_acm.org>
Date: Mon, 7 Mar 2022 09:34:40 -0500

On 07-Mar-22 08:11, Paul Gilmartin via curl-users wrote:
> What happens if --etag-compare and --time-cond disagree about whether
> to fetch the target resource? Does Curl simply send a Conditional
> Request, as in <https://datatracker.ietf.org/doc/html/rfc7232>
> and let the server decide? (But if it's not HTTP?)

ETags apply only to HTTP.

At the server, ETags take precedence over Last-Modified. Specifying both
is rarely useful, though there are corner cases where it may help, e.g.
if you get an ETag directly from a server, but a subsequent request goes
to a cache that doesn't support ETags. Hopefully, such caches are
uncommon by now.
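
To make the precedence concrete, here is a minimal sketch of how a
server might evaluate the two conditions, following the RFC 7232 rule
that If-Modified-Since is ignored when If-None-Match is present. The
function name and arguments are hypothetical, timestamps are whole
seconds, and the tag comparison is a plain string match (no weak/strong
handling, and the naive comma split assumes no commas inside a tag).

```python
def should_send_body(current_etag, last_modified,
                     if_none_match=None, if_modified_since=None):
    """Return True for a full 200 response, False for 304 Not Modified.

    Hypothetical helper: last_modified and if_modified_since are
    whole-second UTC timestamps; entity tags are compared as strings.
    """
    if if_none_match is not None:
        # An entity-tag condition is present, so the date is ignored.
        candidates = [t.strip() for t in if_none_match.split(",")]
        return "*" not in candidates and current_etag not in candidates
    if if_modified_since is not None:
        # RFC semantics; some servers use an exact match instead.
        return last_modified > if_modified_since
    return True
```

So a client that sends both headers only gets the date check applied
when whatever answers the request does not look at the entity tag.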

Other protocols don't have a way to determine the remote server's clock;
some use local timestamps, others use UTC.  And whether they're
synchronized is unknown.  Plus, this varies with the protocol/server
version.

> Linux keeps file timestamps to fractions of a second. Does Curl
> ignore fractions? This could affect "If-Modified-Since".

The HTTP RFCs require that the If-Modified-Since time be evaluated based
on the origin server's clock.  The date format specified does not
include fractional seconds
(https://datatracker.ietf.org/doc/html/rfc7231#section-7.1.1.1).  If the
download sets the file date from the Last-Modified header, fractional
seconds will always be zero.  To avoid a number of edge cases, servers
often test for an exact match with Last-Modified rather than a
"later than" comparison.
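
As an illustration of the date format, here is a small sketch that
turns a local file timestamp into the HTTP date form. The helper name
is hypothetical, and this is not a description of what curl itself
does; it only shows that any sub-second part has nowhere to go.

```python
import math
from email.utils import formatdate

def http_date(mtime):
    """Format a POSIX timestamp as an HTTP date (IMF-fixdate).

    The format has one-second resolution, so the fractional part of
    mtime is dropped before formatting.
    """
    return formatdate(math.floor(mtime), usegmt=True)
```

Two files modified within the same second therefore produce the same
If-Modified-Since value.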

This is one of many reasons that ETags are considered more accurate;
they may be a hash of file contents, or a mixture of inode, mtime
(whatever the server's precision), and size [Apache does this] - or
anything else that enables the server to come up with a unique
validator.  Since they are opaque, the server can use as much or as
little information as it wishes, and even change what it uses, without
any impact on the client.
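
A rough sketch of the two styles of validator mentioned above. Both
function names and both output formats are made up for illustration;
in particular the second is not Apache's actual format, just the same
kind of inputs.

```python
import hashlib
import os

def content_etag(path):
    """Strong validator derived from a hash of the file contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return '"%s"' % digest.hexdigest()[:32]

def stat_etag(path):
    """Cheap validator built from inode, size and mtime (illustrative)."""
    st = os.stat(path)
    return '"%x-%x-%x"' % (st.st_ino, st.st_size, st.st_mtime_ns)
```

The client never looks inside either string; it only hands the tag
back, which is why the server is free to switch schemes.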

Other protocols differ - e.g. the latest FTP extensions allow fractional
seconds, but early versions had no time at all - unless you parsed
directory listings...

>
> May I assume that the --etag-compare <file> may contain multiple
> tags, comma-separated, on a single line?

I don't know.  I would hope so, and that they are parsed and compared
per the RFC, including weak vs. strong validators.
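
For reference, this is roughly what parsing and comparing per RFC 7232
(section 2.3.2) would look like. It is a sketch with hypothetical
names, not a statement about how curl reads the --etag-compare file.

```python
import re

# An entity tag is an optional weak marker W/ followed by a quoted string.
_ETAG = re.compile(r'(W/)?"([^"]*)"')

def parse_etags(header_value):
    """Return (is_weak, opaque_tag) pairs from a comma-separated list."""
    return [(bool(weak), tag) for weak, tag in _ETAG.findall(header_value)]

def strong_match(a, b):
    """Strong comparison: neither tag is weak and the opaque parts agree."""
    return not a[0] and not b[0] and a[1] == b[1]

def weak_match(a, b):
    """Weak comparison: opaque parts agree, weak markers are ignored."""
    return a[1] == b[1]
```

If-None-Match is specified to use the weak comparison, so W/"1" and
"1" count as a match there, while they do not under the strong one.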
>
> May I assume that --etag-compare <file> and --etag-save <file> may
> safely refer to the same file?

If you are doing a simple superseding download:

The --etag-save file can't be written until the complete output file has
been written.

The --etag-compare file shouldn't be kept if the output file is being
overwritten.

Otherwise, in the case of a crash, the saved tag wouldn't match the file
contents.

If you are trying for a safe download, you might compare the local
version with the remote, downloading to a new/temporary file on
mismatch.  In that case, you'd want the new tag saved in a different
place - until you do an atomic series of renames (and possibly a delete)
on success.
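
A minimal sketch of that final rename step, assuming the body and the
new tag were already written to temporary files on the same filesystem
as their destinations. The function and parameter names are
hypothetical.

```python
import os

def commit_download(tmp_body, tmp_etag, final_body, final_etag):
    """Move a finished download and its tag into place.

    The body is replaced first.  A crash between the two renames then
    leaves a new body with the old tag, which only costs a redundant
    re-download; the opposite order could leave a stale body that the
    server keeps reporting as unchanged.
    """
    os.replace(tmp_body, final_body)   # atomic on POSIX within one filesystem
    os.replace(tmp_etag, final_etag)
```

Each rename is atomic on its own, but the pair is not, which is why
the order matters.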

How well curl handles the various use and corner cases requires an
answer from someone else.  From what I can tell, it likely has rough edges.

As I said earlier, this is more complicated than it first appears.


Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed.


Received on 2022-03-07