Re: multiple session cookie handling

From: Daniel Stenberg <daniel_at_haxx.se>
Date: Mon, 25 Jun 2001 18:11:47 +0200 (MET DST)

On Fri, 22 Jun 2001, Cris Bailiff wrote:

> > > curl should do an 'exclusive-lock, read, update, write, unlock' cycle
> > > on each update access, so that multiple curls can safely share the jar,
> > > even when end sites are setting their own cookies per-request.
> >
> > We *could* do that. Although it sounds like a thin path to walk, as you
> > wouldn't really know which curl would save its view of the cookies last,
> > and thus have the final "say" about things.
>
> The last curl to write - the point is that the atomic action is the
> 'add/update' of a single cookie - you load the jar, update relative to the
> loaded jar and write back inside the lock, so that any other updates
> can't 'race' you. There is only a race if two simultaneous curls update
> the same cookie (same name, host/domain, path triple) - then the last one
> wins, which is the same as if two simultaneous pages set the same cookie
> in a normal browser. Otherwise, non-colliding cookies all collect safely
> in the jar without there being any 'lost updates'.

Yeah, but that would also mean that the save procedure would be quite
complicated. It can't just save its internal state; it would need to "merge"
its internal state with the one currently saved in the cookie file.

Like:

1. get jar
2. use cookies
3. update cookies
  (might take a while if we communicate with a slow server or something)
4. save cookies in jar again

Now, if a number of curls have been running between phase 3 and 4, any number
of cookies in the file might have been updated or added. For phase 4 not to
ruin the newly added ones (which this session has no knowledge about), this
session needs to read the entire jar again so that new cookies can be saved.
For the cookies that are both in memory and read again from the jar, we need
to figure out whether they or the in-memory ones are to be kept (there's no
"date stamp" on cookies in the Netscape cookie file, which is why we'd
probably need some simple scheme where all cookies in memory that were
"touched" in this session count as newer, and all the rest get their data
from the jar file), and then save them all (while holding an exclusive lock).

I'm not saying this can't be done. I'm only identifying what needs to be
done.
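
Something along these lines, purely as a sketch of phase 4 with the merge
added in. None of this is existing curl code: the cookielist type and the
parse/merge/write helpers are made up, and it assumes a POSIX flock() for
the exclusive lock:

  /* sketch only -- hypothetical helpers, not existing curl code */
  #include <stdio.h>
  #include <sys/file.h>
  #include <unistd.h>

  struct cookielist;                        /* hypothetical in-memory jar */
  struct cookielist *cookies_parse_file(FILE *f);
  /* keep the "touched" in-memory cookies, take everything else from disk */
  void cookies_merge(struct cookielist *mem, struct cookielist *disk);
  void cookies_write(struct cookielist *mem, FILE *f);

  int save_jar(struct cookielist *mem, const char *path)
  {
    FILE *f = fopen(path, "r+");            /* phase 4 starts here */
    if(!f)
      return 1;

    if(flock(fileno(f), LOCK_EX)) {         /* exclusive lock */
      fclose(f);
      return 1;
    }

    /* re-read the jar so cookies added by other curls aren't lost */
    struct cookielist *disk = cookies_parse_file(f);
    cookies_merge(mem, disk);               /* touched-in-session wins */

    rewind(f);
    ftruncate(fileno(f), 0);                /* rewrite the whole file */
    cookies_write(mem, f);
    fflush(f);

    flock(fileno(f), LOCK_UN);              /* unlock */
    fclose(f);
    return 0;
  }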

> > This particular example could also be done by simply having the initial login
> > curl put its cookies in the jar, and then the download curls would use that
> > jar read-only (as -b works today). It would also make it a lot easier to
> > understand what state the cookie jar is in, at any given moment.
>
> Except maybe each subordinate curl needs to track new cookies/session
> state as it goes - for example, to talk 'hotmail', you need to pedal
> hard with cookies, redirects etc.

You're right. My thinking was a bit too narrow.

> CON: the header file format doesn't record which site the cookies came
> from, in the case of 'default domain' cookies - if the server doesn't set
> a domain attribute on the cookie, you don't know where it's from, which
> prevents you from storing multiple different site cookies in one file. The
> current parser logic assumes one file/header-set per site.

Yeah, you're right again. There's really no future in storing cookies as
headers.

[controlling which cookies get sent to the remote server and which don't,
and which cookies get saved]

> > Well, if the writing of the jar were done with a callback system where
> > libcurl passed the information to a custom function, then of course that
> > function could just not write whatever cookie it considers unfit for
> > writing.
>
> Yeah, that would work, but that means the client program has to work out
> where in the jar to keep the cookie. Does this imply that the client is
> responsible for deciding which cookies from the jar should be applied to
> each request? If not, then we need two callbacks - one to add cookies to
> the jar and one to pass the whole jar back in for use in the next
> request - which aren't very symmetrical. That's why I suggested this
> separately from the load/save jar calls...

Actually, when I think a bit more on this issue, I can think of many ways you
want to control how cookies are (re-)used by curl when it gets them from a
jar. There can hardly be any harm in saving a few cookies too many, but you
do want to be able to control which ones are used in requests. Or is
there a point in controlling both ways?

[From a library angle] The application should be able to cancel out cookies
that would otherwise get sent to the server. It is impossible for us to
figure out exactly on what grounds an application wants to cancel a
particular cookie, which is why we have to pass all information about every
cookie curl wants to
send to the application for acknowledgement (if that callback is used).
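
Just to make that concrete, such a callback could look something like the
below. This is not an existing libcurl interface, only an illustration with
made-up names:

  /* hypothetical interface -- not an existing libcurl API */
  #include <string.h>

  struct cookie_info {
    const char *name;
    const char *value;
    const char *domain;
    const char *path;
    int secure;
  };

  /* return 0 to send the cookie, non-zero to cancel it */
  typedef int (*cookie_ack_callback)(const struct cookie_info *cookie,
                                     void *userdata);

  /* example: an application refusing cookies from a domain it distrusts */
  static int my_cookie_filter(const struct cookie_info *cookie, void *userdata)
  {
    (void)userdata;
    if(strstr(cookie->domain, "untrusted.example"))  /* made-up rule */
      return 1;                                      /* cancel */
    return 0;                                        /* acknowledge */
  }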

[From a curl angle] We need to come up with an interface that would control
which cookies to (not) use. Assume we have a cookie jar and we know that
some of the saved cookies are not healthy to pass back to the server. We
might want to disable them based on domain, path, name, contents or
whatever... I imagine a set of rules with pattern matching. Like

  --cookie-jar cookie-jar.txt \
  --cookie-skip-if name="silly-cookie*" \
  --cookie-skip-if domain="ourdomain.com" \
  --cookie-skip-if path="/login"

... and several expressions would be treated with a logical OR, requiring
only one match to trigger the skip. Of course, it would only be a matter of
time before someone wants ANDed expressions, so maybe a different approach
will be required...

Would this be too complicated? It looks complicated already...
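
The matching itself should at least be cheap. A sketch of the OR behaviour
(made-up names, not actual curl code), using fnmatch() style patterns:

  /* sketch of the proposed OR semantics: skip a cookie as soon as any
     single --cookie-skip-if rule matches (made-up names, not curl code) */
  #include <fnmatch.h>
  #include <stddef.h>

  enum skipfield { SKIP_NAME, SKIP_DOMAIN, SKIP_PATH };

  struct skiprule {
    enum skipfield field;
    const char *pattern;     /* e.g. "silly-cookie*" */
  };

  /* returns 1 if the cookie should be skipped, 0 otherwise */
  static int skip_cookie(const char *name, const char *domain,
                         const char *path,
                         const struct skiprule *rules, size_t nrules)
  {
    for(size_t i = 0; i < nrules; i++) {
      const char *subject =
        (rules[i].field == SKIP_NAME)   ? name :
        (rules[i].field == SKIP_DOMAIN) ? domain : path;
      if(fnmatch(rules[i].pattern, subject, 0) == 0)
        return 1;            /* one match is enough -- logical OR */
    }
    return 0;
  }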

> I'm currently using the Perl HTTP::Cookies library to load and save my
> cookies, by grabbing the raw headers from Curl::easy into the jar for
> save, and making a fake 'HTTP::Request' for HTTP::Cookies to fill from
> the jar, and then copying from HTTP::Request to the curl_request for the
> load. I'd like to simplify/speed this up if I could...

I feel quite confident that we will.

-- 
     Daniel Stenberg -- curl dude -- http://curl.haxx.se/
Received on 2001-06-25