
curl-users

Re: cURL --cookie-jar

From: Daniel Stenberg <daniel_at_haxx.se>
Date: Wed, 1 May 2002 16:58:11 +0200 (MET DST)

On Wed, 1 May 2002, Roth, Kevin P. wrote:

> > Actually, -c doesn't _read_ cookies. It only writes them.
>
> I stand corrected... I was under the impression -c had been added to
> eliminate having to use multiple cookie parameters to make cookies "just
> work". I know *personally*, when I use a real-life cookie jar, I never just
> remove (read) and add (write) cookies; I also consume (parse) them ;-)

Yes, and that's why -c also enables the cookie parser. Basically, if you have
no previous cookies lying around, these two command lines are equivalent:

 curl -b readnofileatall -c storecookieshere [URL]

and

 curl -c storecookieshere [URL]

[ I edited this question slightly: ]
> Is there any reason why -c shouldn't also read (and use) any pre-existing
> cookies?

There were no significant or important reasons, only a few minor ones:

 o There would be no way to tell curl to ignore *any* existing cookies and
   simply overwrite the already present cookie file.
 o I thought it played nicely with the fact that -b already reads cookies;
   we only needed an option for writing as well.

> If there *is* such a reason, then my problem is that it seems
> counter-intuitive to specify the cookie-jar file twice (once for -b, and
> once for -c). One simple fix might be to add a magic filename for -b, that
> simply tells curl to use whatever file is specified for -c. So, if the
> special "filename" is "@", you could do this:

> curl -b @ -c cookies.txt url://...

> which specifies a cookie-jar file, and also enables the cookie parsing,
> using the cookie-jar contents as initial input, but without inputting the
> cookie-jar filename twice...

While this certainly would be possible, I can only see problems with this
approach. It is harder to document, harder to describe to people and I guess
therefore more error-prone when people use it.

I haven't received any complaints (until now) from people about having to
give the file name twice when they want to both read and write cookies with
-b and -c. Most people doing this will do so from within scripts anyway.
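
For the record, such a command line is simple enough. A typical invocation
(file and URL names hypothetical) that reads and writes the same cookie jar
looks like:

 curl -b cookies.txt -c cookies.txt http://example.com/page

Existing cookies are loaded before the request is made, and the complete
resulting set is written back to the same file when curl is done.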

> Can someone draw (ascii is fine) a timeline of how the -c, -b and -D
> parameters work together? In other words, given this command line:
>
> curl -c cookies -b headers -D headers http://host/f1 http://host/f2
>
> when do each of the files get read and/or written (relative to each
> other and also to making these two requests), and what contents go
> into each? Also show what happens in memory, with the cookies
> coming back from the web pages.
> Assume /f1 sets a cookie named "c1" with some value, and /f2 sets
> a cookie named "c2" (and perhaps also "c1" if that's interesting).

Let me try (but keeping it plain text), please ask again if there's anything
in this picture that seems blurry!

I have not tried to describe what happens with the cookies on the network, as
these options mainly control what goes in and out between curl and the local
file system.

-b makes curl read a set of cookies to feed the internal cookie keeper.
   This file can be a simple HTTP-headers file (as -D saves) or it can be
   a netscape cookie file (as -c saves). If the file is empty or
   non-existing, this option only enables the cookie parser. The cookies
   read into the "cookie keeper" will be used in all subsequent requests
   for which they match (domain, path, etc common cookie stuff).
   This option causes no cookies, nor any headers, to be written to disk.

   If this option gets an argument using "name=contents" format, it will
   instead set that cookie in the upcoming request(s). (Thinking about it
   here and now, a possible different approach would be to actually add the
   given cookie to the "cookie keeper" instead).
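
   For example (cookie name and URL hypothetical), a cookie given directly
   on the command line:

    curl -b "sessionid=abc123" http://example.com/page

   This makes curl send "Cookie: sessionid=abc123" in the request, without
   reading any file at all.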

   This option enables the cookie parser. libcurl will read, parse and
   possibly send away cookies. Enabled cookie parser means that live incoming
   cookies will be received, understood and kept in memory.

-c will have all cookies stored in the given file, using the netscape file
   format. This does not happen until libcurl's curl_easy_cleanup() is
   called, which is done after all given URLs have been taken care of. That
   function stores *all* known cookies (that aren't expired) to the output
   file, including cookies that might have been previously read using the -b
   option. This option will not read any cookies to start with. If not used
   with any other option, no cookies will exist when the first request is
   made.

   This option enables the cookie parser. libcurl will read, parse and
   possibly send away cookies.
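
   For reference, a file written by -c uses the netscape cookie file format:
   one tab-separated line per cookie, roughly like this (example values):

    # Netscape HTTP Cookie File
    .example.com	TRUE	/	FALSE	1020254400	c1	somevalue

   The fields are: domain, whether subdomains match, path, secure flag,
   expiry time (unix seconds), cookie name and cookie value.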

-D is not aware of cookies at all. It simply makes received headers get
   stored in the specified file. Whether the headers contain cookies or not
   makes no difference.

   This option does not read anything, it just writes to the given file.
   (If there are no headers received, the output file will not have been
   touched, not even truncated or similar.)
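
   This also means the output of -D from one invocation can be fed back to
   a later one with -b, since -b understands headers-style files (file and
   host names hypothetical):

    curl -D headers.txt http://host/login
    curl -b headers.txt http://host/page

   The second command parses any Set-Cookie lines saved in headers.txt and
   sends the matching cookies along in its request.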

> I propose -j/--junk-session-cookies, which, when combined with -c or -b,
> would prevent session cookies from being read out of the cookie-jar or
> headers file. It would however, still store NEW session cookies in the
> cookie-jar file or the headers file, and it will also USE any new session
> cookies in subsequent requests (-L or multiple URLs).

Agreed, it would simply prevent session cookies from being read from the
given files.

> This way, there are two ways to "clear out" session cookies. One is to make
> a request for a non-existant file (e.g. /dev/null) at the *end* of a curl
> "session"

I don't understand how this would work. Can you elaborate?

> The other is to use this option on the *first* call at the start of a new
> curl session; in this scenario, you have to save any NEW session cookies
> which are generated, but you want to explicitly trash any old ones.

Right. -j would practically flush all existing session cookies, and you'd get
new ones.
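
Assuming a -j option as proposed here (file and host names hypothetical),
starting a fresh "session" from a script could then look like:

 curl -j -b cookies.txt -c cookies.txt http://host/start

This reads all non-session cookies from cookies.txt, ignores the old session
cookies, and writes the full resulting set (including any newly received
session cookies) back to the file when done.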

-- 
    Daniel Stenberg -- curl groks URLs -- http://curl.haxx.se/
Received on 2002-05-01